Tool Error

Definition: Detects errors or failures during the execution of Tools. Calculation: Tool Errors is computed by sending additional requests to an LLM (e.g. OpenAI’s GPT4o-mini), using a carefully engineered chain-of-thought prompt that asks the model to judge whether or not the tools executed correctly. We also surface a generated explanation.

Note: This metric is computed by prompting an LLM.

Usefulness: This metric helps you detect whether your tools executed correctly. It’s most useful in Agentic Workflows where many Tools get called. It helps you detect and understand patterns in your Tool failures.

Tone Tool Selection Quality

Introduction

Evaluate

Observe

Protect

Galileo Guardrail Metrics

Fine Tune

Galileo NLP Studio