Model confidence metrics help you gauge how certain your AI is about its answers. These metrics are useful for flagging uncertain responses, improving reliability, and knowing when to involve a human in the loop. Use these metrics when you want to:
- Identify responses where the model is unsure or likely to make mistakes.
- Improve user trust by surfacing confidence scores or warnings.
- Analyze which prompts or situations are most challenging for your AI.
| Name | Description | Supported Nodes | When to Use | Example Use Case |
|---|---|---|---|---|
| Prompt Perplexity | Evaluates how difficult or unusual the prompt is for the model to process. | LLM span | When you want to identify prompts that may confuse the model or lead to lower-quality responses. | Detecting outlier prompts in a customer support chatbot to improve prompt engineering. |
| Uncertainty | Measures the model’s confidence in its generated response. | LLM span | When you want to understand how certain the model is about its answers. | Flagging responses where the model is unsure, so a human can review them before sending to a user. |
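The table above describes Prompt Perplexity at a high level. The exact formula Galileo uses isn't specified here, but a common definition of perplexity is the exponential of the average negative log-probability the model assigns to each token. The sketch below illustrates that standard computation from per-token log-probabilities (the `prompt_perplexity` helper and example values are illustrative, not part of the Galileo API):

```python
import math

def prompt_perplexity(token_logprobs):
    """Perplexity from per-token log-probabilities (natural log).

    Lower values suggest the prompt is more familiar to the model;
    unusually high values can flag outlier prompts worth reviewing.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical per-token log-probs, as an LLM API might return them.
logprobs = [-0.5, -1.2, -0.3, -2.0]
print(round(prompt_perplexity(logprobs), 3))  # → 2.718
```

A prompt whose perplexity is far above your corpus baseline is a candidate for the kind of prompt-engineering review described in the example use case.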