# Tool Selection Quality

> Evaluate tool selection quality in AI agents using Galileo Guardrail Metrics to ensure agents choose appropriate tools with correct parameters

Tool Selection Quality determines whether the agent selected the correct tool and, for each tool, the correct arguments.

This metric is particularly valuable for evaluating agentic AI systems, where the model must decide which tools to use and how to use them correctly. Poor tool selection can lead to ineffective or incorrect responses.

As a ratio of positive evaluations, the score ranges from 0 to 1. A lower score means more evaluations judged the tool selection inappropriate, which signals a higher risk of ineffective or incorrect responses.

## Calculation method

Tool Selection Quality is computed through a multi-step process:

1. Multiple evaluation requests are sent to an LLM evaluator (e.g., OpenAI's GPT-4o mini) to analyze the agent's tool selection decisions.
2. A carefully engineered chain-of-thought prompt guides the model to evaluate whether the selected tools and their parameters were appropriate for the task.
3. The system requests multiple distinct responses to this prompt to ensure robust evaluation through consensus.
4. Each evaluation generates both an explanation of the reasoning and a binary judgment (yes/no) on tool selection appropriateness.
5. The final Tool Selection Quality score is computed as the ratio of positive ('yes') responses to the total number of evaluation responses.

We also surface one of the generated explanations, always choosing one that aligns with the majority judgment among the responses.

This metric is computed by prompting an LLM multiple times, so it requires additional LLM calls, which may impact usage and billing.

## Understanding tool selection quality
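To make the score concrete, the aggregation described under Calculation method can be sketched as follows. This is a minimal illustration, not Galileo's implementation: the `Judgment` type and the `tool_selection_quality` function are hypothetical names, and resolving a tie as "no" is an assumption the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One evaluator response (hypothetical type): a yes/no verdict and its reasoning."""
    verdict: bool
    explanation: str


def tool_selection_quality(judgments: list[Judgment]) -> tuple[float, str]:
    """Return (score, explanation) for a non-empty list of evaluator judgments."""
    if not judgments:
        raise ValueError("at least one judgment is required")

    # Score: ratio of positive ('yes') responses to all responses.
    yes_count = sum(j.verdict for j in judgments)
    score = yes_count / len(judgments)

    # Majority verdict. Assumption: an exact tie is treated as 'no'.
    majority = yes_count * 2 > len(judgments)

    # Surface the first explanation that agrees with the majority verdict.
    explanation = next(j.explanation for j in judgments if j.verdict == majority)
    return score, explanation
```

For example, two "yes" judgments and one "no" judgment give a score of 2/3, and the surfaced explanation comes from one of the "yes" responses.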

### When tool selection is evaluated

Tool Selection Quality evaluates different scenarios:

- **No Tool Needed:** The assistant is not expected to call tools if there are no unanswered user queries, if no tools can help answer any query, or if all the information needed to answer is contained in the history.
- **Tool Needed:** When tools should be used, the turn is considered successful if the agent selected the correct tool and provided all required arguments with correct values.
- **Unsuccessful Selection:** If the agent calls tools when it shouldn't, or selects the wrong tool or arguments when it should call tools, the turn is considered unsuccessful.
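The three scenarios above can be made concrete with a small rule-based sketch. Note that the actual metric relies on an LLM evaluator rather than reference answers. This example assumes a hypothetical labeled `expected` call per turn, and the `ToolCall` type and `turn_is_successful` function are illustrative names only.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    """A tool invocation (hypothetical type): tool name plus argument values."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def turn_is_successful(expected: Optional[ToolCall], actual: Optional[ToolCall]) -> bool:
    """Judge one turn. `expected` is None when no tool call is needed."""
    if expected is None:
        # No Tool Needed: calling any tool here makes the turn unsuccessful.
        return actual is None
    if actual is None:
        # Assumption: skipping a needed tool call counts as unsuccessful.
        return False
    if actual.name != expected.name:
        # Wrong tool selected.
        return False
    # Tool Needed: every required argument must be present with the correct value.
    # Assumption: extra optional arguments are tolerated.
    return all(
        key in actual.arguments and actual.arguments[key] == value
        for key, value in expected.arguments.items()
    )
```

A check like this is only usable when ground-truth tool calls are available; the LLM-based evaluation exists precisely because such labels usually are not.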

## Optimizing your AI system

### Addressing low Tool Selection Quality

When a response has a low Tool Selection Quality score, consider these improvements:

- **Analyze error patterns:** Identify common mistakes in tool selection or parameter usage.
- **Improve tool descriptions:** Enhance tool documentation with clearer descriptions of when and how to use each tool.
- **Refine system prompts:** Update instructions to provide better guidance on tool selection criteria.
- **Consider model capabilities:** Some models may be better at tool selection than others.
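As one way to approach the error-pattern analysis above, the sketch below tallies low-scoring turns by tool and error type. The record fields (`tool`, `error_type`, `score`) and the 0.5 threshold are assumptions made for illustration. They are not a Galileo export format, and the `error_type` labels would have to come from your own review of the surfaced explanations.

```python
from collections import Counter


def summarize_failures(records: list[dict], threshold: float = 0.5) -> list[tuple[tuple[str, str], int]]:
    """Count low-scoring turns by (tool, error_type), most frequent first.

    Each record is a hypothetical dict with keys 'tool', 'error_type', and 'score'.
    """
    counts = Counter(
        (record["tool"], record["error_type"])
        for record in records
        if record["score"] < threshold  # keep only turns judged mostly unsuccessful
    )
    return counts.most_common()


# Illustrative, made-up records.
records = [
    {"tool": "search", "error_type": "wrong_argument", "score": 0.2},
    {"tool": "search", "error_type": "wrong_argument", "score": 0.4},
    {"tool": "calendar", "error_type": "unneeded_call", "score": 0.0},
    {"tool": "search", "error_type": "wrong_argument", "score": 1.0},
]
summary = summarize_failures(records)
```

The most frequent pairs are natural first candidates for a clearer tool description or a prompt adjustment.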

## Best practices

- Provide detailed descriptions for each tool, including when to use it and what parameters are required.
- Implement validation for tool parameters to prevent incorrect usage and provide helpful error messages.
- Track which tools are frequently misused to identify opportunities for improvement in tool design or documentation.
- Provide examples of correct tool usage in different scenarios to help the agent learn appropriate selection patterns.

Tool Selection Quality is most useful in agentic workflows, where an LLM decides the course of action to take by selecting a tool. This metric helps you detect whether the agent took the right course of action.
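The first two best practices (detailed tool descriptions and parameter validation) might look like the following in plain Python. This is a sketch under stated assumptions: the `get_weather` tool, its dictionary layout, and the `validate_arguments` helper are all hypothetical and do not correspond to any specific framework's schema.

```python
# Hypothetical tool definition: the description says when to use the tool
# and when not to, and each parameter documents its purpose.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Look up the current weather for a city. Use only when the user asks "
        "about present conditions; do not use for forecasts."
    ),
    "parameters": {
        "city": {"type": str, "required": True, "description": "City name, e.g. 'Paris'"},
        "units": {"type": str, "required": False, "description": "'metric' or 'imperial'"},
    },
}


def validate_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return human-readable error messages; an empty list means the call is valid."""
    errors = []
    params = tool["parameters"]
    # Every required parameter must be supplied.
    for name, spec in params.items():
        if spec["required"] and name not in arguments:
            errors.append(f"Missing required parameter '{name}': {spec['description']}")
    # Every supplied argument must be known and have the declared type.
    for name, value in arguments.items():
        if name not in params:
            errors.append(f"Unknown parameter '{name}' for tool '{tool['name']}'")
        elif not isinstance(value, params[name]["type"]):
            expected = params[name]["type"].__name__
            errors.append(f"Parameter '{name}' must be {expected}, got {type(value).__name__}")
    return errors
```

Returning the messages to the agent, rather than failing silently, gives it a chance to correct the call on the next turn.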