2 WAYS TO USE SIMILARITY SEARCH
1. Find similar labeled data across splits
This is useful when you find low-quality data (mislabeled, garbage, empty, etc.) and want to find other samples similar to it, so that you can take bulk action (remove, relabel, etc.). Galileo automatically assigns a smart threshold to give you the most similar data samples.
While surfacing similar samples, you can change how many of them are shown in both the dataset view and the embeddings visualization.
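To make the idea concrete, here is a minimal sketch of threshold-plus-count similarity search over embeddings. It is not Galileo's implementation: the function name `find_similar`, the use of cosine similarity, the fixed `threshold` value, and the toy 2-D embeddings are all assumptions chosen for illustration. Galileo picks its threshold automatically, and how it does so is not described here.

```python
import numpy as np

def find_similar(embeddings, query_index, top_k=5, threshold=0.8):
    """Return indices of up to top_k samples whose cosine similarity
    to the query sample is at least `threshold`, most similar first.

    Hypothetical helper for illustration only; not a Galileo API.
    """
    # Normalize each row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    # Cosine similarity of every sample to the query sample.
    sims = unit @ unit[query_index]

    # Exclude the query itself from its own results.
    sims[query_index] = -np.inf

    # Sort by descending similarity, keep those above the threshold,
    # then cap the result at top_k samples.
    order = np.argsort(-sims)
    keep = [int(i) for i in order if sims[i] >= threshold]
    return keep[:top_k]

# Toy embeddings (assumed data): sample 0 is the low-quality sample we start from.
embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.8, 0.3],
    [-1.0, 0.0],
])

similar = find_similar(embeddings, query_index=0, top_k=2, threshold=0.8)
print(similar)
```

In this sketch, `threshold` plays the role of the similarity cutoff and `top_k` plays the role of the adjustable number of similar samples shown. The returned indices could then be used to remove or relabel the matching samples in bulk.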