- The model can generate a substantial quantity of boxes (several thousand for YOLO before NMS).
- Each box can be positioned at any location on the image, as long as it has integer coordinates.
The 6 Error Types
The initial stage in assigning error types to flawed boxes involves identifying the boxes that are not deemed correct. We will refer to inaccurate predictions as False Positives (FP) and erroneous annotations as False Negatives (FN). There are many ways in which a predicted box can turn into an FP, so we will classify them further in more granular buckets:
- Duplicate Error: the predicted box highly overlaps with an annotation that is already used
- Classification Error: the predicted box highly overlaps with an annotation of different label
- Localization Error: the predicted box slightly overlaps with an annotation of same label
- Classification and Localization Error: the predicted box slightly overlaps with an annotation of different label
- Background Error: the predicted box does not even slightly overlap with an annotation.
- Missed Error: the annotation was not used by any prediction (either used to declare a prediction a TP or used to bin a prediction in any of the above errors).

The 6 error types and Galileo
Count and Impact on mAP
In the Galileo Console, we surface two metrics for each of the 6 error types: their count and their impact on mAP. The count is simply the number of boxes tagged with that error type, and the impact on mAP is the amount by which mAP would increase if we were to fix all errors of that type.
Focus on a single Error Type to gain insight
Galileo allows you to focus on any of the error types in order to dig in and understand in each case whether the data quality is poor or the model is not well trained. For this you can either click on an error type in the above bar chart, or simply add the error type filter by clicking on Add Filters. Once a single error type is selected, Galileo will only display the boxes with that error type together with any other box that is necessary context in order to explain that error type. For example, a prediction is tagged as a classification error because it significantly overlaps with an annotation of different label. In this case, we will show this annotation and its label. We refer to the Technical deep dive below for more details on associated boxes.
Improve your data quality
Galileo offers the possibility to fix your annotations in a few clicks from the console. After adding a filter by error type, select the images with misannotated boxes either one-by-one, or by selecting them all and then unselecting any images whose annotations are correct.
Update your annotations in a few clicks from the console.
- Duplicate error: this is often a model error, and duplicates can be reduced by decreasing the IoU threshold in the NMS step. However, sometimes a duplicate box will have more accurate localization than both the TP prediction and the annotation, in which case we would overwrite the annotation with the duplicate box.
The inner prediction has higher confidence than the larger box, and is thus selected as a TP. The duplicated outer prediction is however a better bounding box than both the TP prediction and the annotation.
- Classification error: more often than not, classification errors in OD represent a mislabeled annotation. Correcting this error would simply relabel the annotation with the predicted label. Note that these errors overlap with the Likely Mislabeled feature.

Typical classification error where the annotation is mislabeled.
- Localization error: localization errors surface inaccuracies in the annotations' localization. Correcting this error would overwrite the annotation’s coordinates with the predicted ones. Note that this error is very sensitive to the IoU threshold chosen (the mAP threshold).
Localization error exhibiting an inaccurate annotation.
- Classification and Localization error: these errors are less predictable and can be due to various phenomena. We suggest going through these images one-by-one and taking action accordingly.
- Background error: more often than not a background error is due to a missed annotation. In this setting, the Overwrite Ground Truth button adds the missing annotation.
- Missed error: these errors are sometimes due to the model not predicting the appropriate box, and sometimes due to poor annotations. Some common scenarios include:
  - poor/gibberish annotations that do not represent an object or do not represent an object that we want to predict
The annotation does not represent any object.
  - multiple annotations for the same object. In this case, overwriting the ground truth means removing the bad annotation.
There are multiple annotations for the same object.
The 6 error types: Technical deep dive
In this section, we will elaborate on our methodology for determining the suitable error type associated with a box that fails to meet the criteria for correctness.
Coarse Errors: FPs and FNs
The first step is a coarser association: determining all wrong predictions (False Positives, FP) and all wrong annotations (False Negatives, FN). This algorithm is also used for calculating the main metric in Object Detection, the mean Average Precision (mAP). We summarize the steps necessary for finding our error types, and refer to a modern definition for more details:
- Pick a global IoU threshold. This is used to decide when two boxes overlap enough to be paired together.
- Loop over labels. For every label, only consider the predictions and annotations of that label.
- Sort all predictions by descending score and go through them one by one. At the beginning all annotations are unused.
- If a prediction overlaps enough with an unused annotation: call that prediction a True Positive (TP) and declare that annotation as used.
- If it doesn’t, call that prediction a FP.
- When all predictions are exhausted, all unused annotations become FNs.
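The matching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Galileo's implementation: the box format `(x1, y1, x2, y2)`, the function names `iou` and `match`, and the choice to pair a prediction with the unused annotation of highest IoU are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(predictions, annotations, iou_threshold=0.5):
    """Greedy TP/FP/FN assignment.

    predictions: list of (box, label, score); annotations: list of (box, label).
    Returns (tp, fp, fn): tp is a list of (prediction_idx, annotation_idx) pairs,
    fp a list of prediction indices, fn a list of annotation indices.
    """
    used = set()
    tp, fp = [], []
    # A single global sort by score gives the same result as looping over labels,
    # because only annotations of the same label are ever considered.
    order = sorted(range(len(predictions)), key=lambda i: predictions[i][2], reverse=True)
    for i in order:
        box, label, _ = predictions[i]
        best_j, best_iou = None, 0.0
        for j, (ann_box, ann_label) in enumerate(annotations):
            if ann_label != label or j in used:
                continue
            value = iou(box, ann_box)
            if value > best_iou:  # assumption: prefer the highest-IoU candidate
                best_j, best_iou = j, value
        if best_j is not None and best_iou >= iou_threshold:
            used.add(best_j)
            tp.append((i, best_j))
        else:
            fp.append(i)
    fn = [j for j in range(len(annotations)) if j not in used]
    return tp, fp, fn


# Made-up example data: two "cat" predictions on one "cat" annotation,
# and a "dog" prediction far away from the "dog" annotation.
preds = [((0, 0, 10, 10), "cat", 0.9), ((1, 1, 10, 10), "cat", 0.8), ((50, 50, 60, 60), "dog", 0.7)]
anns = [((0, 0, 10, 10), "cat"), ((100, 100, 110, 110), "dog")]
tp, fp, fn = match(preds, anns)
```

Here the first prediction takes the "cat" annotation, the second one finds it already used and becomes an FP, and the "dog" annotation is left unused and becomes an FN.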

Finer Errors: The 6 Error Types of TIDE
The 6 error types cited above were introduced in the TIDE toolbox paper, to which we refer for more details. For a concise definition, we will re-use the illustration posted above.
The [0,1] interval appearing below the image indicates the range (in orange) for the IoU between the predicted box (in red) and an annotated box (in yellow). Note that it contains two thresholds: the background threshold t_b and the foreground threshold t_f. Galileo sets the background threshold t_b at 0.1 and the foreground threshold t_f at the mAP threshold used to compute the mAP score. As an example, a predicted box overlapping with an annotation with IoU >= t_f will be given the classification error type if the class of the annotation doesn’t match that of the prediction.
With the above ambiguous definition, there are cases where a predicted box could be part of multiple error types. To avoid ambiguity, Galileo classifies the errors in the following order:
- Localization: has IoU with an annotation with same label in the range [t_b, t_f]
- Classification: has IoU with an annotation with different label in the range [t_f, 1]
- Duplicate: has IoU with an annotation already used, with same label in the range [t_f, 1]
- Background: has IoU < t_b with all annotations.
- Classification and Localization: has IoU in the range [t_b, t_f] with a box of different label.
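To make the ordering concrete, here is a minimal sketch of how an FP could be binned by checking the rules in that order and stopping at the first match. The function name `classify_fp`, the `(iou, same_label, used)` tuple layout, and the default `t_f=0.5` are assumptions for illustration; only `t_b = 0.1` comes from the text. Treating the last type as the fall-through case is also an assumption.

```python
T_B = 0.1  # background threshold stated in the text


def classify_fp(overlaps, t_f=0.5, t_b=T_B):
    """Return the error type of a false-positive prediction.

    overlaps: one (iou, same_label, used) tuple per annotation in the image, where
    `same_label` says whether the annotation has the prediction's label and `used`
    whether the annotation was already consumed by a TP.
    """
    if any(t_b <= v <= t_f and same for v, same, _ in overlaps):
        return "localization"
    if any(v >= t_f and not same for v, same, _ in overlaps):
        return "classification"
    if any(v >= t_f and same and used for v, same, used in overlaps):
        return "duplicate"
    if all(v < t_b for v, _, _ in overlaps):
        return "background"
    # Assumption: whatever remains (IoU in [t_b, t_f] with a different label)
    # is treated as the combined error.
    return "classification_and_localization"


examples = {
    "loc": classify_fp([(0.3, True, False)]),
    "cls": classify_fp([(0.8, False, False)]),
    "dup": classify_fp([(0.7, True, True)]),
    "bkg": classify_fp([(0.05, False, False)]),
    "both": classify_fp([(0.3, False, False)]),
    # Two rules apply here; the order makes localization win.
    "ordered": classify_fp([(0.3, True, False), (0.8, False, False)]),
}
```

The last example shows why the order matters: the prediction qualifies both as a localization error and as a classification error, and the first rule in the list decides.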
Finally, the Missed error type is given to any annotation that is already considered an FN, and that was not used in the above definition by either a Classification Error or a Localization Error. Note that Missed annotations can still overlap with predictions; for example, they can have IoU < t_b with a classification and localization error.
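As a small illustration of this last rule, the sketch below filters the FN annotations down to the Missed ones. It follows the deep-dive definition in this section (only Classification and Localization errors consume an annotation). The function name `missed_annotations` and the `(error_type, annotation_id)` pair layout are hypothetical.

```python
def missed_annotations(false_negatives, fp_links):
    """Return the FN annotations that count as Missed.

    false_negatives: annotation ids flagged as FN by the coarse matching.
    fp_links: (error_type, annotation_id) pairs, one per FP that was binned
    because of its IoU with that annotation.
    """
    consumed = {
        ann_id
        for error_type, ann_id in fp_links
        if error_type in {"classification", "localization"}
    }
    return [ann_id for ann_id in false_negatives if ann_id not in consumed]


# Made-up ids: annotation 1 explains a localization error, so it is not Missed;
# annotation 2 is only linked to a combined error, so it still counts as Missed.
fns = [1, 2, 3]
links = [("localization", 1), ("classification_and_localization", 2), ("classification", 7)]
missed = missed_annotations(fns, links)
```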
Associated boxes
The above definitions beg for better terminology. We will say that an annotation is associated with a prediction, or that a prediction links to an annotation, in any of the following cases:
- the prediction is a TP corresponding to the annotation
- the prediction is an FP (except background error), and the annotation is the one involved in the IoU deciding so.

The predicted box is a localization error. Without the context of the associated annotation, this would be confusing since the prediction looks correct. With the context, one can see that the annotation is inaccurate and should be updated.