# Instruction Adherence

> Assess instruction adherence in AI outputs using Galileo Guardrail Metrics to ensure prompt-driven models generate precise and actionable results

Instruction Adherence measures whether a model followed the system or prompt instructions when generating a response.

## How it works

This metric is particularly valuable for uncovering hallucinations where the model ignores its instructions, which can lead to responses that don't meet user requirements or business rules.

Instruction Adherence is scored on a scale from 0 to 1. A low score means the model likely ignored its instructions, and a high score means the response followed them.

## Calculation method

Instruction Adherence is computed through a multi-step process:

1. The system sends multiple evaluation requests to OpenAI's GPT-4o model to analyze whether the response follows the provided instructions.
2. A specialized chain-of-thought prompt guides the model through a detailed evaluation of how well the response adheres to the specific instructions given.
3. The system requests and collects multiple distinct responses to ensure a robust evaluation through consensus.
4. Each evaluation produces both a detailed explanation of the reasoning and a binary judgment (yes/no) on instruction adherence.
5. The final score is computed as the ratio of positive ('yes') responses to the total number of evaluation responses.

We also surface one of the generated explanations, always choosing one that aligns with the majority judgment among the responses.

This metric is computed by prompting an LLM multiple times, so it requires additional LLM calls, which may affect usage and billing.

## Understanding instruction adherence
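To make the score concrete, the aggregation described under Calculation method (the ratio of 'yes' judgments to all judgments, plus one explanation that agrees with the majority) could be sketched roughly as follows. This is an illustrative sketch only, not Galileo's implementation: the `Judgment` class, the `instruction_adherence` function, and the tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One evaluation response: a yes/no verdict plus its reasoning (hypothetical type)."""
    adheres: bool
    explanation: str


def instruction_adherence(judgments: list[Judgment]) -> tuple[float, str]:
    """Return (score, explanation) for a non-empty list of judgments."""
    if not judgments:
        raise ValueError("at least one judgment is required")

    # Score = number of 'yes' verdicts divided by the total number of verdicts.
    yes_count = sum(j.adheres for j in judgments)
    score = yes_count / len(judgments)

    # Surface an explanation that agrees with the majority verdict.
    # Breaking ties toward 'yes' is an arbitrary choice made for this sketch.
    majority = yes_count * 2 >= len(judgments)
    explanation = next(j.explanation for j in judgments if j.adheres == majority)
    return score, explanation


judgments = [
    Judgment(True, "The response uses the requested JSON format."),
    Judgment(True, "All required fields are present."),
    Judgment(False, "The response exceeds the stated word limit."),
]
score, explanation = instruction_adherence(judgments)
print(round(score, 2), explanation)
```

With two 'yes' verdicts out of three, the score is 2/3 and the surfaced explanation comes from one of the 'yes' judgments.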

### Differentiating from Context Adherence

It's important to understand the distinction between related metrics:

- **Instruction Adherence**: Measures whether the response follows the instructions in your prompt template.
- **Context Adherence**: Measures whether the response adheres to the context provided (e.g., your retrieved documents).

## Optimizing your AI system

### Addressing Low Instruction Adherence

When a response has a low Instruction Adherence score, the model likely ignored its instructions. To improve your system:

- **Flag and examine non-compliant responses**: Identify patterns in responses that don't follow instructions.
- **Experiment with prompt engineering**: Test different prompt formulations to find versions the model is more likely to adhere to.
- **Implement guardrails**: Take precautionary measures to prevent non-compliant responses from reaching end users.
- **Consider model selection**: Some models may be better at following instructions than others.
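One simple form a guardrail could take is a score threshold: a response is passed to the user only if its Instruction Adherence score is high enough, and a fallback is returned otherwise. The sketch below assumes you already have a score between 0 and 1 for each response. The `apply_guardrail` function, the `FALLBACK_MESSAGE` text, and the default threshold of 0.5 are hypothetical choices for illustration, not part of any Galileo API.

```python
# Hypothetical fallback text shown when a response is blocked.
FALLBACK_MESSAGE = "Sorry, I can't help with that request right now."


def apply_guardrail(response: str, adherence_score: float, threshold: float = 0.5) -> str:
    """Return the response only if its adherence score meets the threshold."""
    if not 0.0 <= adherence_score <= 1.0:
        raise ValueError("adherence_score must be between 0 and 1")
    if adherence_score >= threshold:
        return response
    # Below the threshold: withhold the response and return the fallback instead.
    return FALLBACK_MESSAGE


print(apply_guardrail("Here is your summary in three bullet points.", 0.8))
print(apply_guardrail("Here is a long essay instead of a summary.", 0.2))
```

The right threshold depends on how costly a non-compliant response is in your application, so it is worth tuning against flagged examples rather than fixing it up front.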

## Best practices

- Write clear, specific instructions without ambiguity or contradictions to improve adherence rates.
- Place the most important instructions prominently in your prompt and consider repeating them for emphasis.
- Compare Instruction Adherence scores across different LLMs to identify which models best follow your specific instructions.
- Use low-adherence examples to refine your prompts and create test cases for future prompt iterations.

When optimizing for Instruction Adherence, balance strict adherence with allowing the model some flexibility. Overly rigid instructions may limit the model's ability to provide helpful responses in edge cases.
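As a rough illustration of comparing scores across LLMs, the sketch below averages per-response Instruction Adherence scores by model and ranks the models from highest to lowest. It assumes the scores have already been collected as `(model_name, score)` pairs. The function name `mean_adherence_by_model` and the model names and scores in the example are placeholders, not real results.

```python
from collections import defaultdict
from statistics import mean


def mean_adherence_by_model(records):
    """Average scores per model and rank models from highest to lowest mean.

    records: iterable of (model_name, score) pairs.
    """
    scores = defaultdict(list)
    for model, score in records:
        scores[model].append(score)
    averages = {model: mean(values) for model, values in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder data for illustration only.
records = [
    ("model-a", 1.0),
    ("model-a", 0.5),
    ("model-b", 0.25),
    ("model-b", 0.75),
    ("model-b", 0.5),
]
for model, avg in mean_adherence_by_model(records):
    print(f"{model}: {avg:.2f}")
```

Run the comparison on the same prompts and inputs for every model so that differences in the averages reflect the models rather than the test set.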