Safety and compliance metrics help you ensure your AI systems are safe, fair, and compliant with regulatory requirements. These metrics are essential for protecting users, preventing harmful outputs, and building trust in your AI applications. Use these metrics when you want to:
- Detect and prevent the exposure of sensitive or personally identifiable information (PII).
- Identify and filter out toxic, biased, or inappropriate content.
- Guard against prompt injection attacks and other security risks.
- Ensure your AI meets industry or legal compliance standards.
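To make the first goal concrete, here is a minimal sketch of rule-based PII detection and redaction. This is an illustration only, not how the PII / CPNI / PHI metric is implemented (production detection typically relies on trained models rather than regexes), and the pattern names are assumptions:

```python
import re

# Illustrative patterns only; real PII detection covers far more entity
# types and uses trained models, not regexes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect_pii(text: str) -> dict[str, list[str]]:
    """Return every PII-like match found in `text`, keyed by pattern name."""
    return {
        name: pattern.findall(text)
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    }

def redact_pii(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every PII-like match in `text` with a placeholder."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text
```

A redaction pass like this could run on conversation logs before they are stored, as in the healthcare chatbot example below.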
| Name | Description | Supported Nodes | When to Use | Example Use Case |
|---|---|---|---|---|
| PII / CPNI / PHI | Identifies personally identifiable or sensitive information in prompts and responses. | Trace (root input/output only) | When handling potentially sensitive data or in regulated industries. | A healthcare chatbot that must detect and redact patient information in conversation logs. |
| Prompt Injection | Detects attempts to manipulate the model through malicious prompts. | Trace (root input only) | When allowing user input to be processed directly by your AI system. | A public-facing AI assistant that needs protection from users trying to bypass content filters or extract sensitive information. |
| Sexism / Bias | Detects gender-based bias or discriminatory content. | Trace (root input/output only) | When ensuring AI outputs are free from bias and discrimination. | A resume screening assistant that must evaluate job candidates without gender or demographic bias. |
| Toxicity | Identifies harmful, offensive, or inappropriate content. | Trace (root input/output only) | When monitoring AI outputs for harmful content or implementing content filtering. | A social media content moderation system that must detect and flag potentially harmful user-generated content. |
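To illustrate the failure mode the Prompt Injection metric targets, here is a naive keyword heuristic. This is a sketch under stated assumptions, not the metric's actual implementation (which evaluates inputs with a model rather than pattern matching), and the phrasings below are illustrative:

```python
import re

# Common prompt-injection phrasings; an illustrative, incomplete list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|previous|prior) .*instructions", re.I),
    re.compile(r"disregard .*(rules|guidelines|instructions)", re.I),
    re.compile(r"reveal .*(system prompt|secret)", re.I),
]

def looks_like_injection(user_input: str) -> bool:
    """Flag inputs that resemble common prompt-injection phrasings."""
    return any(p.search(user_input) for p in INJECTION_PATTERNS)
```

A check like this might gate which user inputs reach the model directly; a model-based metric is far harder to evade than fixed patterns, which is why heuristics alone are not sufficient protection.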