If your question isn’t answered here, check Troubleshooting for runtime issues.

General

Luna Studio is Galileo’s web app for fine-tuning custom evaluation metrics for LLM applications. You bring a small labelled test set, optionally generate a training set, fine-tune a Luna base model, and register the resulting metric back into the Galileo metrics store. See Welcome for the longer pitch.
ML and AI engineers who need evaluation metrics tailored to a specific domain (legal, healthcare, RAG over internal docs, etc.) and don’t want to write fine-tuning code from scratch.
Galileo is the broader platform — evaluation, observability, guardrails. Luna Studio is the metric-fine-tuning workspace inside Galileo. Metrics produced in Luna Studio are registered to the Galileo metrics store, where they’re usable across the rest of the platform.
Luna Studio is part of the enterprise tier of Galileo and is deployed by Galileo into your own cluster or cloud. See Availability and deployment, or contact us to get started.

Test sets and training sets

Use at least 300 hand-labelled rows when possible. Beyond a few hundred rows, you start paying inference cost without much added signal for most metrics. Exact requirements can vary by metric and are checked during validation.
No. The most common path is Generate from test set: Luna Studio synthesizes a 2,000-example training set from 20% of your test set. See Step 3. Upload your own training set when you have labelled production logs that better represent the distribution you want to evaluate.
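As a rough illustration of the numbers above, the sketch below works out how many test rows would seed generation and how many generated examples each seed row would account for on average. The function name `seed_budget` and the floor rounding are assumptions made for this example; how Luna Studio actually selects and rounds the 20% slice is not stated here.

```python
def seed_budget(test_rows: int, seed_percent: int = 20, generated: int = 2000) -> tuple[int, float]:
    """Return (seed rows, average generated examples per seed row).

    Hypothetical helper: the 20% and 2,000 figures come from this FAQ,
    but the floor rounding is an assumption.
    """
    seed_rows = test_rows * seed_percent // 100
    if seed_rows == 0:
        raise ValueError("test set too small to yield any seed rows")
    return seed_rows, generated / seed_rows


seed_rows, per_seed = seed_budget(300)
print(f"{seed_rows} seed rows, about {per_seed:.1f} generated examples each")
# prints: 60 seed rows, about 33.3 generated examples each
```

With a 300-row test set, roughly 60 rows would drive generation, so each seed row stands behind about 33 synthetic examples.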
Yes, for uploaded or imported logs. Luna Studio labels unlabelled logs with your LLM-as-judge prompt first, saves the labelled result as a training dataset, and then uses that labelled dataset for training. Generated training sets are always labelled.
Yes. Datasets are org-scoped, not project-scoped. Once you’ve added a test set, every project in your org can use it.
.csv and .jsonl. Both must include an input column; test sets and labelled training sets also need a label column. See Add a dataset.
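Before uploading, a quick local check can catch a missing column early. The sketch below is a minimal pre-flight check, not Luna Studio's own validation: the helper name `missing_columns` and the sample rows are hypothetical, and only the `input` and `label` column names come from the requirement above.

```python
import csv
import io
import json


def missing_columns(text: str, fmt: str, labelled: bool = True) -> set[str]:
    """Return the required column names absent from the data (empty set if none)."""
    required = {"input", "label"} if labelled else {"input"}
    if fmt == "csv":
        # The CSV header row defines the available columns.
        reader = csv.DictReader(io.StringIO(text))
        present = set(reader.fieldnames or [])
    elif fmt == "jsonl":
        # Keep only the keys that appear in every non-empty row.
        rows = [json.loads(line) for line in text.splitlines() if line.strip()]
        present = set.intersection(*(set(r) for r in rows)) if rows else set()
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return required - present


# Illustrative sample data; the label values shown are assumptions.
csv_text = "input,label\nIs this toxic?,false\n"
jsonl_text = '{"input": "Is this toxic?"}\n'

print(missing_columns(csv_text, "csv"))      # prints: set()
print(missing_columns(jsonl_text, "jsonl"))  # prints: {'label'}
```

Pass `labelled=False` for an unlabelled training set, which only needs the `input` column.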

Metrics

Predefined metrics use battle-tested LLM-as-judge prompts curated by Galileo (e.g. Toxicity, Context adherence). Custom metrics let you write your own prompt. Both fine-tune the same way.
Pick the simplest trainable type that captures what you need: Boolean for yes/no questions, or Categorical for picking from a fixed list. Other Galileo metric output types are not trainable in Luna Studio yet.
The step is the part of a Galileo trace that the metric runs against: a single LLM call (LLM span), a retrieval step (Retriever), an agent step (Agent span), or a trace-level input. See Custom metrics.
No. Once registered, the metric is snapshotted in the Galileo metrics store. To iterate, launch a new run with the same metric template and register it under a new name (or unregister the old one in Galileo first).

Base models

Use the base model configured for your organization. Luna Studio loads available base models from your deployment and shows the selected model in Step 4.
Depends on the base model and training set size. Most runs take a few hours, and larger models or larger datasets can take longer.
Not today. Once a run is Training, it runs to completion or failure. We’re tracking this for a future release.

Integrations

Luna Studio supports named hosted providers, Azure, Vertex AI, AWS-hosted models, custom model setups, and Galileo. See Integrations overview.
Org-wide. Once added, every project and every member of your org can use them.
Only if you want to import datasets from Galileo or register metrics to the Galileo metrics store. See Galileo integration.
For in-house models, OpenAI-compatible proxies, or providers that aren’t covered by the named integrations. See Custom models and proxies.

Onboarding

You can’t re-enter the onboarding wizard, but you can do the same things from the main app: add LLM providers from the Integrations page, and create projects from the Projects page.
Yes. Most teams have one project per metric or per application. See Projects overview.

Where to go next

Troubleshooting

Runtime errors and how to recover.

Quickstart

End-to-end walkthrough.