If your application uses a framework with built-in OpenTelemetry instrumentation, you can run experiments against it using Galileo’s GalileoSpanProcessor. The span processor captures OTel traces and routes them to the experiment automatically, so you do not need the Galileo log decorator or manual logging. This is useful when you are working with frameworks like Microsoft Agent Framework, Google ADK, Pydantic AI, or any other framework that emits OTel spans.

In this guide you will:
  1. Set up OpenTelemetry with Galileo
  2. Create an agent with framework instrumentation
  3. Expose an entry point for the experiment runner
  4. Run the experiment

How it works

When you run an experiment with a custom function, the following happens:
  1. Galileo’s experiment runner creates an experiment and sets the experiment_id in the Galileo context.
  2. For each row in the dataset, the runner calls your function and wraps the call with dataset context (input, ground truth, and metadata).
  3. The GalileoSpanProcessor reads this context and attaches it to every OTel span your framework creates.
  4. Spans are exported to Galileo’s OTLP endpoint with the experiment ID, which routes them to the experiment instead of a regular Log stream.
Because the experiment runner manages the trace lifecycle, you must disable Galileo’s native logger to avoid duplicate traces. Set GALILEO_LOGGING_DISABLED=true before importing your agent code.

Prerequisites

Install the Galileo SDK with OpenTelemetry support, plus any framework-specific packages:
Terminal
pip install "galileo[otel]" opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp

Set up environment variables

Configure your Galileo credentials and disable the native logger:
.env
# Galileo
GALILEO_API_KEY=your-galileo-api-key
GALILEO_PROJECT=your-project-name
GALILEO_CONSOLE_URL=https://app.galileo.ai  # Only needed for custom deployments
GALILEO_API_URL=https://api.galileo.ai      # Only needed for custom deployments

# Disable native Galileo logger — OTel handles tracing
GALILEO_LOGGING_DISABLED=true

# LLM provider
OPENAI_API_KEY=your-openai-api-key
| Variable | Required | Description |
| --- | --- | --- |
| GALILEO_API_KEY | Yes | Your Galileo API key |
| GALILEO_PROJECT | Yes | The project to log experiments to |
| GALILEO_CONSOLE_URL | No | Only needed for custom deployments. Defaults to https://app.galileo.ai |
| GALILEO_API_URL | No | Only needed for custom deployments. Defaults to https://api.galileo.ai. Used by the GalileoSpanProcessor to construct the OTLP endpoint |
| GALILEO_LOGGING_DISABLED | Yes | Set to true to disable the native logger. Required when using OTel for experiments |
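
A small pre-flight check can catch a missing variable before the experiment starts. The sketch below is one possible approach using only the standard library. The helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are hypothetical and simply mirror the rows marked as required in the table above. It does not cover provider keys such as OPENAI_API_KEY.

```python
import os
from typing import Mapping

# Variables marked "Yes" in the Required column above.
REQUIRED_VARS = ("GALILEO_API_KEY", "GALILEO_PROJECT", "GALILEO_LOGGING_DISABLED")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after loading your `.env` file and raising an error when the returned list is non-empty gives a clearer failure than an authentication error partway through a run.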

Set up OpenTelemetry with Galileo

Create a TracerProvider and attach the GalileoSpanProcessor. This processor automatically configures authentication and the OTLP endpoint using your environment variables.
agent.py
from galileo.otel import GalileoSpanProcessor, add_galileo_span_processor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Set up the OTel tracer provider with the Galileo span processor
tracer_provider = TracerProvider()
galileo_processor = GalileoSpanProcessor()
add_galileo_span_processor(tracer_provider, galileo_processor)
trace.set_tracer_provider(tracer_provider)
Then enable your framework’s instrumentation. For example, with Microsoft Agent Framework:
agent.py
from agent_framework.observability import enable_instrumentation

# Enable the framework's built-in OTel instrumentation
# Set enable_sensitive_data=True to capture LLM inputs and outputs
enable_instrumentation(enable_sensitive_data=True)
Set enable_sensitive_data=True to capture LLM inputs and outputs in your traces. If set to False, only span metadata (timing, token counts, etc.) is sent.

Create the agent

Set up your agent with your framework of choice. The key requirement is that the framework emits OTel spans through the registered TracerProvider.
agent.py
import asyncio
from typing import Any

from agent_framework import openai, tool
from agent_framework.observability import enable_instrumentation
from galileo.otel import GalileoSpanProcessor, add_galileo_span_processor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# ── OTel Setup ───────────────────────────────────────────────────────────────
tracer_provider = TracerProvider()
galileo_processor = GalileoSpanProcessor()
add_galileo_span_processor(tracer_provider, galileo_processor)
trace.set_tracer_provider(tracer_provider)

enable_instrumentation(enable_sensitive_data=True)

# ── Tools ────────────────────────────────────────────────────────────────────
@tool(approval_mode="never_require")
def lookup_account(account_id: str) -> str:
    """Look up account details by account ID."""
    accounts = {
        "ACCT-1001": {"name": "Alice", "plan": "Premium", "balance": 120.00},
        "ACCT-1002": {"name": "Bob", "plan": "Basic", "balance": 45.50},
    }
    account = accounts.get(account_id)
    if account:
        return f"Account {account_id}: {account}"
    return f"Account {account_id} not found."

# ── Agent ────────────────────────────────────────────────────────────────────
client = openai.OpenAIChatClient(model_id="gpt-4.1-mini")

agent = client.as_agent(
    name="BillingAgent",
    instructions="You are a helpful billing assistant. Use the lookup_account "
    "tool to find customer account details.",
    tools=[lookup_account],
)


# ── Experiment Entry Point ───────────────────────────────────────────────────
async def _run_agent_async(user_message: str) -> str:
    session = agent.create_session()
    result = await agent.run(user_message, session=session)
    return getattr(result, "text", str(result)).strip()


def run_agent(input_data: Any) -> str:
    """Called by Galileo's run_experiment for each dataset row."""
    if isinstance(input_data, str):
        user_message = input_data
    elif isinstance(input_data, dict):
        user_message = input_data.get("input", "")
    else:
        raise TypeError(f"Unsupported input type: {type(input_data)!r}")

    return asyncio.run(_run_agent_async(user_message))

Create the experiment entry point

The experiment runner calls your function once per dataset row. Your function receives the row data and must return a string result.
agent.py
async def _run_agent_async(user_message: str) -> str:
    session = agent.create_session()
    result = await agent.run(user_message, session=session)
    return getattr(result, "text", str(result)).strip()


def run_agent(input_data: Any) -> str:
    """Called by Galileo's run_experiment for each dataset row."""
    if isinstance(input_data, str):
        user_message = input_data
    elif isinstance(input_data, dict):
        user_message = input_data.get("input", "")
    else:
        raise TypeError(f"Unsupported input type: {type(input_data)!r}")

    return asyncio.run(_run_agent_async(user_message))
The function should handle both str and dict inputs, because the experiment runner passes the parsed row data, which can arrive in either format depending on your dataset structure.
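
Before running a full experiment, it can help to exercise the entry point locally with both input shapes. The sketch below swaps the real agent for a stub coroutine so it runs without credentials. `_stub_agent_async` and `run_agent_stub` are hypothetical names used only for this illustration, while the input handling mirrors `run_agent` above.

```python
import asyncio
from typing import Any


async def _stub_agent_async(user_message: str) -> str:
    """Hypothetical stand-in for the real agent call; it only echoes the message."""
    await asyncio.sleep(0)  # yield to the event loop, as a real async call would
    return f"stub reply to: {user_message}"


def run_agent_stub(input_data: Any) -> str:
    """Same input handling as run_agent, but backed by the stub coroutine."""
    if isinstance(input_data, str):
        user_message = input_data
    elif isinstance(input_data, dict):
        user_message = input_data.get("input", "")
    else:
        raise TypeError(f"Unsupported input type: {type(input_data)!r}")

    # asyncio.run starts a fresh event loop for each call.
    return asyncio.run(_stub_agent_async(user_message))
```

Calling `run_agent_stub("hello")` and `run_agent_stub({"input": "hello"})` should give the same reply, and any other type should raise `TypeError`. Note that `asyncio.run` raises a `RuntimeError` when called from a thread that already has a running event loop, so this pattern assumes the entry point is invoked from synchronous code.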

Run the experiment

Use run_experiment to iterate over the dataset, call your agent function, and evaluate the results with metrics.
main.py
import os

from dotenv import load_dotenv
from galileo.experiments import run_experiment
from galileo.schema.metrics import GalileoMetrics

load_dotenv()

# Disable the native Galileo logger — OTel handles tracing
os.environ["GALILEO_LOGGING_DISABLED"] = "true"

from agent import run_agent

DATASET = [
    {
        "input": "Can you look up the account details for ACCT-1001?",
        "ground_truth": "lookup_account(account_id='ACCT-1001')",
    },
    {
        "input": "What is the current balance for ACCT-1002?",
        "ground_truth": "lookup_account(account_id='ACCT-1002')",
    },
]

results = run_experiment(
    "my-otel-experiment",
    dataset=DATASET,
    function=run_agent,
    metrics=[
        GalileoMetrics.tool_error_rate,
        GalileoMetrics.instruction_adherence,
        GalileoMetrics.tool_selection_quality,
    ],
    project=os.environ["GALILEO_PROJECT"],
)
Set GALILEO_LOGGING_DISABLED=true before importing your agent module. The agent module initializes OTel on import, and the native logger must be disabled before that happens to avoid conflicts.

Key parameters

| Parameter | Description |
| --- | --- |
| experiment_name | Name shown in the Galileo console. If a duplicate name exists, a timestamp is appended automatically |
| dataset | A list of dictionaries with input and optional ground_truth fields, a Galileo Dataset object, or a dataset name |
| dataset_id | ID of an existing Galileo dataset. Alternative to dataset |
| dataset_name | Name of an existing Galileo dataset. Alternative to dataset |
| function | Your entry point function, called once per dataset row |
| metrics | List of metrics to evaluate. Supports built-in, custom LLM-as-a-judge, and local metrics |
| project | Project name (can also be set via the GALILEO_PROJECT environment variable) |
| project_id | Project ID. Alternative to project |
| experiment_tags | Dictionary of key-value pairs to tag the experiment for filtering and comparison |
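
When the dataset is passed as a list of dictionaries, a quick structural check before calling run_experiment can surface malformed rows early. The helper below is a hypothetical sketch, not part of the Galileo SDK. It assumes only what the table states: each row is a dictionary with an input field, and ground_truth is optional.

```python
from typing import Any


def find_invalid_rows(dataset: list[Any]) -> list[tuple[int, str]]:
    """Return (row index, problem) pairs for rows that lack the expected shape."""
    problems = []
    for index, row in enumerate(dataset):
        if not isinstance(row, dict):
            problems.append((index, "row is not a dictionary"))
        elif "input" not in row:
            problems.append((index, "missing 'input' field"))
    return problems
```

An empty result means every row has at least the input field. Rows without ground_truth pass the check, since that field is optional.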

Ground truth

When your dataset includes a ground_truth field, this value is attached to the OTel spans as ground truth. Metrics like Ground Truth Adherence use this to evaluate the agent’s response. The experiment runner uses galileo_dataset_context to attach the ground truth to the OTel context, so it is available to the GalileoSpanProcessor without any manual configuration.

Supported frameworks

Any framework with OpenTelemetry instrumentation works with this approach. See the framework-specific guides for OTel setup details:

Microsoft Agent Framework

Built-in OTel instrumentation, no extra packages needed.

Google ADK

Integrate Google Agent Development Kit with Galileo via OTel.

Pydantic AI

Use Pydantic AI’s OTel support with Galileo.

Strands Agents

Integrate Strands Agents with Galileo via OTel.

Next steps

Run experiments in code

Learn about all experiment approaches including prompt templates and custom functions.

OpenTelemetry overview

Learn more about Galileo’s OpenTelemetry integration for logging and monitoring.

Metrics reference

Explore the full list of available metrics for experiments.

Datasets

Learn how to create and manage datasets for experiments.