Evaluate
Overview
Stop experimenting in spreadsheets and notebooks. Use Evaluate’s powerful insights to build GenAI systems that just work.
Galileo Evaluate is a powerful bench for rapid, collaborative experimentation and evaluation of your LLM applications.
Core features
- Tracing and Visualizations - Track the end-to-end execution of your queries. See what happened along the way and where things went wrong.
- State-of-the-art Metrics - Combine our research-backed Guardrail Metrics with your own Custom Metrics to evaluate your system (see the sketch below the feature list).
- Experiment Management - Track all your experiments in one place. Find the best configuration for your system.
Figure: An Evaluation Run of a RAG Workflow
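As a rough illustration of combining Guardrail Metrics with Custom Metrics, the sketch below registers one built-in scorer alongside a custom scorer using promptquality. The `Scorers` enum member and the `CustomScorer` parameter names shown here are assumptions based on typical usage; check the current promptquality reference for the exact names in your installed version.

```python
import promptquality as pq

# Hypothetical custom metric: response length per row.
def response_length(row) -> float:
    # Score a single row by the length of the model's response.
    return float(len(row.response))

def aggregate_lengths(scores, indices) -> dict:
    # Roll per-row scores up into a single run-level value.
    return {"Average Response Length": sum(scores) / max(len(scores), 1)}

# Argument names (name/executor/aggregator) are assumptions; verify against the docs.
length_scorer = pq.CustomScorer(
    name="Response Length",
    executor=response_length,
    aggregator=aggregate_lengths,
)

# Combine a research-backed Guardrail metric with the custom one.
# The exact Scorers enum members may differ between versions.
scorers = [pq.Scorers.context_adherence, length_scorer]
```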
The Workflow
1. Log your runs: Integrate promptquality into your system, or test a prompt template and model combination through the Playground. Choose and register your metrics to define what success means for your use case. A minimal logging sketch follows these steps.
2. Analyze results: Identify poor performance, trace it to the broken step, and form a hypothesis about what is behind it.
3. Debug, Fix & Run another Eval: Tweak your system and try again until your quality bar is met.
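As a rough sketch of step 1, the snippet below runs a prompt template and a small dataset through promptquality with registered metrics. The console URL, project name, model alias, and dataset are placeholders, and the exact `pq.run` / `pq.Settings` parameters and `Scorers` members may differ across versions, so treat this as an outline rather than copy-paste code.

```python
import promptquality as pq

# Placeholder console URL; replace with your own Galileo console address.
pq.login("console.your-galileo-instance.com")

template = (
    "Answer the question using only the provided context.\n"
    "Context: {context}\n"
    "Question: {question}"
)

# Tiny illustrative dataset; in practice this is typically a CSV file or a larger dict of lists.
dataset = {
    "context": ["Galileo Evaluate tracks evaluation runs for LLM apps."],
    "question": ["What does Galileo Evaluate do?"],
}

# Register the metrics that define success for this use case.
# Enum member names are assumptions; check pq.Scorers in your installed version.
run = pq.run(
    project_name="rag-eval-demo",  # placeholder project name
    template=template,
    dataset=dataset,
    scorers=[pq.Scorers.context_adherence, pq.Scorers.correctness],
    settings=pq.Settings(model_alias="gpt-4o"),  # model alias is a placeholder
)
```

Once the run completes, the results appear in the Evaluate console, where the tracing views and metric scores from the steps above can be inspected.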
Getting Started