Multimodal Quality metrics help you measure whether multimodal inputs and outputs (such as images and audio conversations) are usable and compliant for the task at hand. Use Multimodal Quality metrics when you want to:
- Validate that input images are clear enough for reliable task completion.
- Enforce explicit brand or content rules on generated images using only visible evidence.
- Detect turn-taking issues in audio-based conversations (overlap and barge-in).
| Name | Description | Supported Nodes | When to Use | Example Use Case |
|---|---|---|---|---|
| Visual Quality | Judges whether the quality of an input image or PDF is sufficient to reliably complete the task in the accompanying prompt. | LLM span | When user-supplied images might be blurry, occluded, cropped, or poorly lit, and those artifacts can make the task infeasible. | A document capture flow where you need to know if a photo is readable enough to extract a serial number. |
| Visual Fidelity | Checks whether a generated image satisfies every applicable provided brand rule, based only on visible evidence. | LLM span | When generated images must comply with explicit brand, style, layout, or content rules. | A marketing image generator where logos, colors, and prohibited elements must always comply with a brand rule set. |
| Interruption Detection | Detects turn-taking violations in audio conversations, including overlap and barge-in events. | Session (trace inputs/outputs only) | When evaluating voice agents where smooth turn-taking and endpointing are critical to user experience. | A voice assistant where the agent must avoid speaking over users or cutting them off mid-utterance. |
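
To make the overlap and barge-in terminology concrete, here is a minimal Python sketch of one way to flag overlapping speech from timestamped speaker segments. It is an illustration only, not how the Interruption Detection metric is implemented: the `Segment` and `Overlap` types, the `find_overlaps` function, the speaker labels, and the 0.2-second threshold are all hypothetical, and the sketch assumes the audio has already been split into per-speaker segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of speech by a single speaker (hypothetical input shape)."""
    speaker: str   # e.g. "user" or "agent"; labels are assumptions
    start: float   # seconds from the start of the conversation
    end: float


@dataclass(frozen=True)
class Overlap:
    """A span where two different speakers talk at the same time."""
    interrupter: str   # speaker who started talking second (the barge-in side)
    interrupted: str   # speaker who was already talking
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_overlaps(segments, min_overlap=0.2):
    """Return overlaps between different speakers lasting >= min_overlap seconds.

    min_overlap is an assumed tolerance so that very short overlaps
    (for example, brief back-channel sounds) are not reported.
    """
    ordered = sorted(segments, key=lambda s: (s.start, s.end))
    overlaps = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later.start >= earlier.end:
                break  # sorted by start, so nothing further overlaps `earlier`
            if later.speaker == earlier.speaker:
                continue
            start = later.start
            end = min(earlier.end, later.end)
            if end - start >= min_overlap:
                overlaps.append(
                    Overlap(later.speaker, earlier.speaker, start, end)
                )
    return overlaps


conversation = [
    Segment("user", 0.0, 3.0),
    Segment("agent", 2.5, 5.0),    # agent starts while the user is still talking
    Segment("user", 5.2, 7.0),
    Segment("agent", 6.95, 8.0),   # overlap shorter than the threshold, ignored
]

for event in find_overlaps(conversation):
    print(f"{event.interrupter} spoke over {event.interrupted} "
          f"for {event.duration:.2f}s starting at {event.start:.2f}s")
```

In this sketch the speaker who starts second is treated as the one barging in. A real evaluation would also need to decide how to handle simultaneous starts and acknowledgements such as "mm-hm", which this example does not attempt.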