Galileo Observe allows you to monitor your Retrieval-Augmented Generation (RAG) application with out-of-the-box Tracing and Analytics.
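Context Relevance
Context Relevance compares the retrieved context with the user query using their embeddings ({context_embedding}, {query_embedding}).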
Context Relevance is a relative metric. High Context Relevance values indicate significant similarity or relevance. Low Context Relevance values are a sign that you need to augment your knowledge base or vector DB with additional documents, modify your retrieval strategy, or use better embeddings.
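To make the idea of embedding similarity concrete, here is a minimal sketch that scores a query embedding against two context embeddings. It assumes cosine similarity as the comparison function, which this page does not confirm; the vectors and the helper name `cosine_similarity` are hypothetical and only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; real ones come from your embedding model.
query_embedding = [1.0, 0.0, 1.0]
close_context_embedding = [0.9, 0.1, 1.1]
far_context_embedding = [0.0, 1.0, 0.0]

close_score = cosine_similarity(query_embedding, close_context_embedding)
far_score = cosine_similarity(query_embedding, far_context_embedding)
```

Because the metric is relative, scores like these are most useful when compared with each other (the close context should score higher than the far one) rather than read against a fixed threshold.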
Completeness
If Context Adherence is your precision metric for RAG, Completeness is your recall. In other words, it tries to answer the question: “Out of all the information in the context that’s pertinent to the question, how much was covered in the answer?”
Low Completeness values indicate that your context contains information relevant to the question that the model’s response left out.
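The recall analogy can be illustrated with a toy calculation. The sketch below is not how Galileo computes Completeness (this page does not describe that); it assumes the pertinent information has already been broken into discrete facts, and the function name `toy_completeness` is hypothetical.

```python
def toy_completeness(relevant_context_facts, answer_facts):
    """Recall-style ratio: share of pertinent context facts the answer covers."""
    relevant = set(relevant_context_facts)
    if not relevant:
        return 1.0  # assumption: nothing pertinent means nothing was missed
    covered = relevant & set(answer_facts)
    return len(covered) / len(relevant)

# Hypothetical example: four pertinent facts in the context, two appear in the answer.
context_facts = {"founded 2010", "based in Paris", "200 employees", "CEO is Ana"}
answer_facts = {"founded 2010", "based in Paris", "sells shoes"}

score = toy_completeness(context_facts, answer_facts)
```

Note that the extra fact in the answer ("sells shoes") does not lower the score; unsupported content is the concern of the precision-style metric, not the recall-style one.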
Chunk Attribution
Chunk Attribution is a chunk-level metric that denotes whether a chunk was or wasn’t used by the model in generating the response. Attribution helps you more quickly identify why the model said what it did, without needing to read over the whole context.
Additionally, Attribution helps you optimize your retrieval strategy.
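As a rough illustration of how chunk-level attribution could feed back into retrieval tuning, the sketch below summarizes a list of per-chunk attribution flags. The input format and the helper `attribution_summary` are assumptions made for this example, not part of the Galileo API.

```python
def attribution_summary(attributed_flags):
    """Summarize which retrieved chunks were attributed (used) in the response."""
    total = len(attributed_flags)
    used = [i for i, flag in enumerate(attributed_flags) if flag]
    unused = [i for i, flag in enumerate(attributed_flags) if not flag]
    rate = len(used) / total if total else 0.0
    return {"attribution_rate": rate, "used": used, "unused": unused}

# Hypothetical flags for four retrieved chunks, in retrieval order.
summary = attribution_summary([True, False, False, True])
```

A consistently low attribution rate across many queries might suggest that more chunks are being retrieved than the model needs, which is one signal to consider when adjusting retrieval settings.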
Chunk Utilization
Chunk Utilization measures how much of the text included in your chunk was used by the model to generate a response. Chunk Utilization helps you optimize your chunking strategy.
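The notion of "how much of the chunk was used" can be sketched as a simple ratio. This example assumes you already know which sentences of a chunk were used and measures utilization by character count; both the granularity and the function name `toy_chunk_utilization` are hypothetical choices, not Galileo's actual method.

```python
def toy_chunk_utilization(chunk_sentences, used_sentences):
    """Fraction of a chunk's characters that belong to sentences marked as used."""
    total_chars = sum(len(s) for s in chunk_sentences)
    if total_chars == 0:
        return 0.0
    used = set(used_sentences)
    used_chars = sum(len(s) for s in chunk_sentences if s in used)
    return used_chars / total_chars

# Hypothetical chunk split into three sentences; only the first was used.
chunk = ["aaaa", "bb", "cc"]
utilization = toy_chunk_utilization(chunk, ["aaaa"])
```

If chunks routinely show low utilization, that may indicate they are larger than necessary, which is the kind of signal that can guide a change in chunk size.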
Non-RAG-Specific Metrics
Other metrics such as Uncertainty and Correctness might be useful as well. If these don’t cover all your needs, you can always write custom metrics.