When implementing RAG systems, it’s crucial to properly handle document retrieval, context management, and response generation. This guide demonstrates a basic RAG implementation using Galileo’s observability features.
What you’ll need
- OpenAI API key
- Galileo API key
- Python environment with required packages
- Basic understanding of RAG concepts
Setup instructions
Set Up Your Environment
Create a .env file with your API keys:
# Your Galileo API key
GALILEO_API_KEY="your-galileo-api-key"
# Your Galileo project name
GALILEO_PROJECT="your-galileo-project-name"
# The name of the Log stream you want to use for logging
GALILEO_LOG_STREAM="your-galileo-log-stream"
# Provide the console URL below if you are using a custom
# deployment rather than the free tier at app.galileo.ai.
# This will look something like "console.galileo.yourcompany.com".
# GALILEO_CONSOLE_URL="your-galileo-console-url"
# OpenAI properties
OPENAI_API_KEY="your-openai-api-key"
# Optional. The base URL of your OpenAI deployment.
# Leave this commented out if you are using the default OpenAI API.
# OPENAI_BASE_URL="your-openai-base-url-here"
# Optional. Your OpenAI organization.
# OPENAI_ORGANIZATION="your-openai-organization-here"
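Before running anything, it can help to verify which of these variables are actually set. A minimal sketch (the helper name `missing_env_vars` is illustrative, not part of Galileo's SDK; the optional console URL and OpenAI overrides are deliberately excluded):

```python
import os

# Variables the demo relies on; GALILEO_CONSOLE_URL, OPENAI_BASE_URL, and
# OPENAI_ORGANIZATION are optional, so they are not listed here.
REQUIRED_VARS = ["GALILEO_API_KEY", "GALILEO_PROJECT", "GALILEO_LOG_STREAM", "OPENAI_API_KEY"]

def missing_env_vars(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example: only the OpenAI key is set
print(missing_env_vars({"OPENAI_API_KEY": "sk-..."}))
# → ['GALILEO_API_KEY', 'GALILEO_PROJECT', 'GALILEO_LOG_STREAM']
```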
Install Dependencies
Install the required dependencies:
galileo[openai]
python-dotenv
rich
questionary
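If you prefer pinning these in a file, the same list works as a requirements.txt, installed with `pip install -r requirements.txt`:

```text
galileo[openai]
python-dotenv
rich
questionary
```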
Running and Monitoring
Execute the application, then use Galileo to monitor:
- Document retrieval performance
- Chunk relevance
- Chunk attribution and utilization
- Completeness
- System performance metrics
Implementation guide
Let’s break down the implementation into manageable sections:
1. setting up the environment
First, we’ll set up our imports and initialize our environment:
import os
from dotenv import load_dotenv
from galileo import log
from galileo.openai import openai
from rich.console import Console
from rich.panel import Panel
from rich.markdown import Markdown
import questionary
import sys
load_dotenv()
# Initialize console for rich output
console = Console()
# Check if Galileo logging is enabled
logging_enabled = os.environ.get("GALILEO_API_KEY") is not None
# Initialize OpenAI client directly
client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
This section:
- Imports necessary libraries
- Loads environment variables
- Sets up rich console output
- Initializes the OpenAI client with Galileo integration
2. document retrieval system
The document retrieval function is decorated with Galileo’s logging:
@log(span_type="retriever")
def retrieve_documents(query: str):
    # TODO: Replace with actual RAG retrieval
    documents = [
        {
            "id": "doc1",
            "text": """
            Galileo is an observability platform for LLM applications. It helps developers
            monitor, debug, and improve their AI systems by tracking inputs, outputs,
            and performance metrics.""",
            "metadata": {
                "source": "galileo_docs",
                "category": "product_overview"
            }
        },
        {
            "id": "doc2",
            "text": """
            RAG (Retrieval-Augmented Generation) is a technique that enhances LLM responses
            by retrieving relevant information from external knowledge sources before
            generating an answer.""",
            "metadata": {
                "source": "ai_techniques",
                "category": "methodology"
            }
        },
        {
            "id": "doc3",
            "text": """
            Common RAG challenges include hallucinations, retrieval quality issues,
            and context window limitations. Proper evaluation metrics include relevance,
            faithfulness, and answer correctness.""",
            "metadata": {
                "source": "ai_techniques",
                "category": "challenges"
            }
        },
        {
            "id": "doc4",
            "text": """
            Vector databases like Pinecone, Weaviate, and Chroma are optimized for
            storing embeddings and performing similarity searches, making them
            ideal for RAG applications.""",
            "metadata": {
                "source": "tech_stack",
                "category": "databases"
            }
        },
        {
            "id": "doc5",
            "text": """
            Prompt engineering is crucial for RAG systems. Well-crafted prompts
            should instruct the model to use retrieved context, avoid making up
            information, and cite sources when possible.""",
            "metadata": {
                "source": "best_practices",
                "category": "prompting"
            }
        }
    ]
    return documents
Key points:
- Uses the @log decorator with the retriever span type
- Returns structured document objects
- Includes metadata for tracking sources
- Simulates a real document retrieval system
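In a real system, the stub above would be replaced by a ranking step that scores each document against the query. A minimal sketch using keyword overlap (the helper names `score` and `rank_documents` are illustrative only; production systems rank by embedding similarity instead):

```python
def score(query: str, text: str) -> int:
    """Count distinct query words that also appear in the document text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def rank_documents(query, documents, k=2):
    """Return the k documents with the highest keyword overlap."""
    return sorted(documents, key=lambda d: score(query, d["text"]), reverse=True)[:k]

docs = [
    {"id": "doc1", "text": "Galileo is an observability platform for LLM applications"},
    {"id": "doc2", "text": "RAG retrieves relevant information before generating an answer"},
]
print([d["id"] for d in rank_documents("what is galileo observability", docs, k=1)])
# → ['doc1']
```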
3. RAG pipeline implementation
The core RAG functionality:
def rag(query: str):
    documents = retrieve_documents(query)

    # Format documents for better readability in the prompt
    formatted_docs = ""
    for i, doc in enumerate(documents):
        formatted_docs += f"Document {i+1} (Source: {doc['metadata']['source']}):\n{doc['text']}\n\n"

    prompt = f"""
    Answer the following question based on the context provided.
    If the answer is not in the context, say you don't know.

    Question: {query}

    Context:
    {formatted_docs}
    """

    try:
        console.print("[bold blue]Generating answer...[/bold blue]")
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {
                    "role": "system",
                    "content": """
                    You are a helpful assistant that answers questions based only on the
                    provided context."""
                },
                {
                    "role": "user",
                    "content": prompt
                }
            ],
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error generating response: {str(e)}"
This section:
- Retrieves relevant documents
- Formats context for the LLM
- Constructs a clear prompt
- Handles API calls and errors
- Uses the gpt-4o model for responses
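The context-formatting step can be exercised in isolation, without an API key. This standalone sketch mirrors the loop in rag() and the document structure returned by retrieve_documents (the `format_documents` helper is pulled out here purely for illustration):

```python
def format_documents(documents):
    """Number each document and label it with its source, as rag() does."""
    formatted = ""
    for i, doc in enumerate(documents):
        formatted += f"Document {i+1} (Source: {doc['metadata']['source']}):\n{doc['text']}\n\n"
    return formatted

docs = [{
    "id": "doc1",
    "text": "Galileo is an observability platform.",
    "metadata": {"source": "galileo_docs", "category": "product_overview"},
}]
print(format_documents(docs).splitlines()[0])
# → Document 1 (Source: galileo_docs):
```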
4. interactive interface
The main application interface:
def main():
    console.print(Panel.fit(
        "RAG Demo\nThis demo uses a simulated RAG system to answer your questions.",
        title="Galileo RAG Terminal Demo",
        border_style="blue"
    ))

    # Check environment setup
    if logging_enabled:
        console.print("[green]✅ Galileo logging is enabled[/green]")
    else:
        console.print("[yellow]⚠️ Galileo logging is disabled[/yellow]")

    api_key = os.environ.get("OPENAI_API_KEY")
    if api_key:
        console.print("[green]✅ OpenAI API Key is set[/green]")
    else:
        console.print("[red]❌ OpenAI API Key is missing[/red]")
        sys.exit(1)

    # Main interaction loop
    while True:
        query = questionary.text(
            "Enter your question about Galileo, RAG, or AI techniques:",
            validate=lambda text: len(text) > 0
        ).ask()

        # questionary returns None if the prompt is cancelled (e.g. Ctrl+C)
        if query is None or query.lower() in ['exit', 'quit', 'q']:
            break

        try:
            result = rag(query)
            console.print("\n[bold green]Answer:[/bold green]")
            console.print(Panel(Markdown(result), border_style="green"))

            continue_session = questionary.confirm(
                "Do you want to ask another question?",
                default=True
            ).ask()
            if not continue_session:
                break
        except Exception as e:
            console.print(f"[bold red]Error:[/bold red] {str(e)}")

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        console.print("\n[bold]Exiting RAG Demo. Goodbye![/bold]")
This section provides:
- Environment validation
- Interactive question-answer loop
- Rich formatting for outputs
- Graceful error handling
- Clean exit handling
Key features
- Galileo Logging: Track document retrieval and LLM interactions
- Rich Console Interface: User-friendly terminal interface
- Error Handling: Graceful handling of API and runtime errors
- Context Management: Proper formatting of retrieved documents
- Interactive Experience: Easy-to-use question-answering interface
Next steps
- Implement real document retrieval using a vector database
- Add response streaming for better user experience
- Implement more sophisticated prompt engineering
- Add evaluation metrics for retrieval quality
- Integrate advanced Galileo logging features
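As a stepping stone toward a real vector database such as Pinecone, Weaviate, or Chroma, the add/query interface those stores expose can be sketched in memory. This toy class is illustrative only; a production store would persist embeddings and use approximate nearest-neighbor search rather than exact cosine similarity over a list:

```python
import math

class InMemoryVectorStore:
    """Toy stand-in for a vector database: stores (id, vector, text) and
    answers nearest-neighbor queries by exact cosine similarity."""

    def __init__(self):
        self._items = []  # list of (doc_id, vector, text)

    def add(self, doc_id, vector, text):
        self._items.append((doc_id, vector, text))

    def query(self, vector, n_results=2):
        def sim(v):
            dot = sum(a * b for a, b in zip(vector, v))
            norm = (math.sqrt(sum(a * a for a in vector))
                    * math.sqrt(sum(b * b for b in v)))
            return dot / norm if norm else 0.0
        ranked = sorted(self._items, key=lambda item: sim(item[1]), reverse=True)
        return [(doc_id, text) for doc_id, _, text in ranked[:n_results]]

store = InMemoryVectorStore()
store.add("doc1", [1.0, 0.0], "Galileo overview")
store.add("doc2", [0.0, 1.0], "RAG basics")
print(store.query([0.9, 0.1], n_results=1))  # → [('doc1', 'Galileo overview')]
```

Swapping this class for a real client then only changes how `add` and `query` are implemented; the rag() pipeline above stays the same.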