Build your own conditions
A class to build custom conditions for DataFrame assertions and alerting.
A Condition is a class for building custom data quality checks. Create a condition, and it will be evaluated after the run is processed. Integrate with email or Slack to receive condition results as alerts via a Run Report. Use Conditions to answer questions such as “Is the average confidence for my training data below 0.25?” or “Has over 20% of my inference data drifted?”
What do I do with Conditions?
You can build a Run Report that evaluates all conditions after a run is processed.
You can also build and evaluate conditions by accessing the processed DataFrame.
How do I build a Condition?
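A Condition is defined by an aggregate function, a metric, an operator, and a threshold, with optional filters that narrow its scope.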
To gain an intuition for what can be accomplished, consider the following examples:
- Is the average confidence less than 0.3?
- Is the max DEP greater than or equal to 0.45?
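The two questions above can be sketched in plain Python to show how an aggregate function, a metric, an operator, and a threshold fit together. This is only an illustration of the semantics, not the library's API: the `evaluate` helper, the lookup tables, the list of row dictionaries standing in for the processed DataFrame, and the column name `data_error_potential` are all assumptions.

```python
import operator
from statistics import mean

# Hypothetical stand-ins for the aggregate functions and operators.
AGGREGATES = {"avg": mean, "min": min, "max": max}
OPERATORS = {"lt": operator.lt, "gte": operator.ge}


def evaluate(rows, agg, metric, op, threshold):
    """Aggregate one column, then compare the result to the threshold."""
    values = [row[metric] for row in rows]
    return OPERATORS[op](AGGREGATES[agg](values), threshold)


# A list of row dictionaries stands in for the processed DataFrame.
rows = [
    {"confidence": 0.2, "data_error_potential": 0.5},
    {"confidence": 0.3, "data_error_potential": 0.1},
    {"confidence": 0.1, "data_error_potential": 0.3},
]

# Is the average confidence less than 0.3?
avg_conf_low = evaluate(rows, "avg", "confidence", "lt", 0.3)

# Is the max DEP greater than or equal to 0.45?
max_dep_high = evaluate(rows, "max", "data_error_potential", "gte", 0.45)
```

Both checks hold for this sample data, since the average confidence is about 0.2 and the maximum DEP is 0.5.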
By adding filters, you can further narrow down the scope of the condition. If the aggregate function is “pct”, you don’t need to specify a metric, as the filters will determine the percentage of data.
- Alert if over 80% of the dataset has confidence under 0.1
- Alert if at least 20% of the dataset has drifted (Inference DataFrames only)
- Alert if 5% or more of the dataset contains PII
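The “pct” alerts above need no metric because the filters alone select the rows that are counted. A rough sketch of that idea follows, using a list of row dictionaries as a stand-in for the DataFrame. The helper `pct_matching`, the filter tuples, and the column names `is_drifted` and `has_pii` are assumptions made for illustration, not names taken from the library.

```python
import operator

# Hypothetical operator lookup for the filters.
OPS = {"lt": operator.lt, "gt": operator.gt, "gte": operator.ge, "eq": operator.eq}


def pct_matching(rows, filters):
    """Fraction of rows that pass every (column, op, value) filter."""
    if not rows:
        return 0.0
    matched = [
        row for row in rows
        if all(OPS[op](row[col], value) for col, op, value in filters)
    ]
    return len(matched) / len(rows)


rows = [
    {"confidence": 0.05, "is_drifted": False, "has_pii": False},
    {"confidence": 0.08, "is_drifted": True, "has_pii": False},
    {"confidence": 0.90, "is_drifted": False, "has_pii": True},
    {"confidence": 0.02, "is_drifted": False, "has_pii": False},
    {"confidence": 0.07, "is_drifted": False, "has_pii": False},
]

# Alert if over 80% of the dataset has confidence under 0.1.
low_conf_alert = pct_matching(rows, [("confidence", "lt", 0.1)]) > 0.8

# Alert if at least 20% of the dataset has drifted.
drift_alert = pct_matching(rows, [("is_drifted", "eq", True)]) >= 0.2

# Alert if 5% or more of the dataset contains PII.
pii_alert = pct_matching(rows, [("has_pii", "eq", True)]) >= 0.05
```

In this sample exactly 80% of rows have confidence under 0.1, so the first alert does not fire ("over 80%" is a strict comparison), while the drift and PII alerts do.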
Complex conditions can be built when the filter has a different metric than the metric used in the condition.
- Alert if the min confidence of drifted data is less than 0.15
- Alert if over 50% of high DEP (>=0.7) data contains PII
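To make the first of these concrete, the sketch below filters on one column (drift) and aggregates a different one (confidence). It is a plain-Python illustration under assumed names: `min_where` and the `is_drifted` column are hypothetical, and a list of row dictionaries stands in for the DataFrame.

```python
rows = [
    {"confidence": 0.90, "is_drifted": False},
    {"confidence": 0.12, "is_drifted": True},
    {"confidence": 0.40, "is_drifted": True},
]


def min_where(rows, metric, keep):
    """Minimum of `metric` over the rows for which `keep(row)` is true."""
    values = [row[metric] for row in rows if keep(row)]
    # No matching rows means there is nothing to aggregate.
    return min(values) if values else None


# Filter on drift, then aggregate confidence.
min_conf_drifted = min_where(rows, "confidence", lambda row: row["is_drifted"])

# Alert if the min confidence of drifted data is less than 0.15.
alert = min_conf_drifted is not None and min_conf_drifted < 0.15
```

The non-drifted row is ignored by the aggregate, so only the confidence values of the two drifted rows are compared against the threshold.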
You can also call a condition directly, which asserts that it holds for a given DataFrame.
- Assert that the average confidence is less than 0.3
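One way to picture this call-as-assertion behavior is a small callable object that raises `AssertionError` when the check fails. The `SketchCondition` class below is a hypothetical stand-in written for illustration and is not the library's Condition class. It operates on a list of row dictionaries rather than a real DataFrame.

```python
import operator
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class SketchCondition:
    """Hypothetical stand-in for a Condition; not the library's class."""

    agg: Callable
    metric: str
    compare: Callable
    threshold: float

    def __call__(self, rows):
        # Aggregate the metric column, then assert the comparison holds.
        value = self.agg([row[self.metric] for row in rows])
        assert self.compare(value, self.threshold), (
            f"{self.metric}: aggregated value {value} "
            f"failed the check against {self.threshold}"
        )


# Assert that the average confidence is less than 0.3.
avg_conf_below = SketchCondition(mean, "confidence", operator.lt, 0.3)
avg_conf_below([{"confidence": 0.1}, {"confidence": 0.2}])  # passes silently

try:
    avg_conf_below([{"confidence": 0.9}])
    failed = False
except AssertionError:
    failed = True
```

Raising on failure makes a condition usable as a guard in a script or test, where a violated check should stop execution.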
Aggregate Function
The available aggregate functions are:
Operator
The available operators are:
Metric & Threshold
The metric must be the name of a column in the DataFrame. Threshold is a numeric value for comparison in the Condition.
Alerting
Alerting via email and Slack is in development. Please reach out to Galileo at team@rungalileo.io for more information.