Galileo NLP Studio's Alerts feature detects and summarizes dataset issues such as mislabeling and class imbalance, so you can quickly see where to focus your data inspection. The alerts currently surfaced are listed below.
| Alert | Description |
| --- | --- |
| Likely Mislabeled | Leverages our Likely Mislabeled algorithm to surface the samples we believe were incorrectly labeled by your annotators |
| Misclassified | Surfaces mismatches between your Ground Truth labels and the model's predictions |
| Hard For The Model | Exposes the samples we believe were hard for your model to learn. These are samples with high Data Error Potential scores |
| Low Performing Classes | Classes that performed significantly worse than average (e.g. their F1 score was 1 standard deviation below the mean F1 score; see the sketch after this table) |
| Low Performing Metadata | Slices the data by different metadata values and shows any subsets of data that perform significantly worse than average |
| High Class Imbalance is Impacting Performance | Exposes classes that have a low relative class distribution in the training set and perform poorly in the validation/test set |
| High Class Overlap | Surfaces classes that our Class Overlap algorithm detected the model confuses with one another |
| Out Of Coverage | Surfaces samples in your validation/test split that are fundamentally different from the samples in your training set |
| PII | Identifies any Personally Identifiable Information in your data |
| Non-Primary Language | Exposes samples that are not in the primary language of your dataset |
| Semantic Cluster with High DEP | Surfaces semantic clusters of data, found through our Clustering algorithm, that have high Data Error Potential |
| High Uncertainty Samples | Surfaces samples that lie on the model's decision boundary |
| [Inference Only] Data Drift | Indicates that the data your model sees in this inference run has drifted from the data it was trained on |
| [Named Entity Recognition Only] High Frequency Problematic Word | Shows you words that the model struggles with (i.e. that have high Data Error Potential) more than 50% of the time |
| [Named Entity Recognition or Semantic Segmentation Only] False Positives | Spans or Segments predicted by the model for which the Ground Truth has no annotation |
| [Named Entity Recognition Only] False Negatives | Surfaces spans for which the Ground Truth had an annotation but the model didn't predict any |
| [Named Entity Recognition Only] Shifted Spans | Surfaces spans where the beginning and end locations are not aligned between the Ground Truth and the Prediction |
| [Object Detection Only] Background Confusion Errors | Surfaces predictions that don't overlap significantly with any Ground Truth (an IoU-based sketch follows this table) |
| [Object Detection Only] Localization Mistakes | Surfaces detected objects that overlap poorly with their corresponding Ground Truth |
| [Object Detection Only] Missed Predictions | Surfaces annotations the model failed to make predictions for |
| [Object Detection Only] Misclassified Predictions | Surfaces objects that were assigned a different label than their associated Ground Truth |
| [Object Detection Only] Duplicate Predictions | Surfaces instances where multiple duplicate predictions were made for the same object |
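To make the Low Performing Classes criterion concrete, here is a minimal sketch of the "F1 more than one standard deviation below the mean" heuristic described above. It is illustrative only, not Galileo's implementation; the per-class F1 values, the choice of standard-deviation estimator, and the function name `flag_low_performing_classes` are assumptions.

```python
import statistics

def flag_low_performing_classes(f1_by_class: dict[str, float], num_stds: float = 1.0) -> list[str]:
    """Return classes whose F1 falls more than `num_stds` standard deviations
    below the mean F1 across all classes (illustrative heuristic only)."""
    scores = list(f1_by_class.values())
    mean_f1 = statistics.mean(scores)
    std_f1 = statistics.pstdev(scores)  # population std; the estimator choice is an assumption
    threshold = mean_f1 - num_stds * std_f1
    return [cls for cls, f1 in f1_by_class.items() if f1 < threshold]

# Example: "refund" is flagged because its F1 sits well below the other classes.
print(flag_low_performing_classes({"billing": 0.91, "shipping": 0.88, "refund": 0.42, "account": 0.86}))
```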
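For the object detection alerts, a rough intuition is that each prediction is compared to the Ground Truth boxes by Intersection over Union (IoU): a prediction with essentially no overlap looks like background confusion, while one that overlaps a Ground Truth only weakly looks like a localization mistake. The sketch below illustrates that idea; the thresholds, box format, and function names are assumptions for illustration, not Galileo's error-assignment logic.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def categorize_prediction(pred_box, gt_boxes, low_iou=0.1, match_iou=0.5):
    """Illustrative bucketing only; the thresholds are assumptions, not Galileo's values."""
    best = max((iou(pred_box, gt) for gt in gt_boxes), default=0.0)
    if best < low_iou:
        return "background confusion"  # barely overlaps any Ground Truth
    if best < match_iou:
        return "localization mistake"  # overlaps a Ground Truth, but poorly
    return "matched"  # class and duplicate checks would happen after matching

# Example: a stray box far from every annotation is treated as background confusion.
print(categorize_prediction((0, 0, 10, 10), [(100, 100, 150, 150), (30, 30, 60, 60)]))
```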