Test sets

A test set is the ground truth for a training run. After fine-tuning, Luna Studio scores the resulting metric against the test set and reports F1, AUC-ROC, and other performance KPIs.

What makes a good test set

Human-labelled. Don’t auto-generate test labels — they’re the tape measure for evaluating the run.
Representative of production data. Sample inputs from the same distribution your application sees in production.
Size. Luna Studio enforces at least 300 rows total and 100 rows per label. Aim for 1,000-3,000 representative rows when possible.
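
The size minimums above can be checked locally before uploading. The following is a minimal sketch, not Luna Studio's own validation: the function name `check_test_set_size` and the `label` column name are hypothetical, so substitute whichever label column your metric shape requires.

```python
from collections import Counter

MIN_TOTAL_ROWS = 300      # total-row minimum stated above
MIN_ROWS_PER_LABEL = 100  # per-label minimum stated above


def check_test_set_size(rows, label_column="label"):
    """Return a list of problems; an empty list means both size checks pass.

    `rows` is an iterable of dicts, e.g. list(csv.DictReader(f)).
    `label_column` is an assumed column name, not one defined by Luna Studio.
    """
    counts = Counter(row[label_column] for row in rows)
    total = sum(counts.values())
    problems = []
    if total < MIN_TOTAL_ROWS:
        problems.append(f"only {total} rows; need at least {MIN_TOTAL_ROWS}")
    for label, n in sorted(counts.items()):
        if n < MIN_ROWS_PER_LABEL:
            problems.append(
                f"label {label!r} has {n} rows; need at least {MIN_ROWS_PER_LABEL}"
            )
    return problems
```

To use it, load the CSV with `csv.DictReader`, pass the rows in, and print any problems that come back. A set with 250 rows of one label and 50 of another would meet the total minimum but fail the per-label one.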

Required schema

Check Prerequisites for the columns required by each metric shape.

File formats

CSV — the end-to-end format for run validation and evaluation. Headers are required.
JSONL — accepted by the source picker during ingestion, but current downstream processing reads CSV. Convert it to CSV before selecting it for a run.
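
One way to do the JSONL-to-CSV conversion is with the Python standard library. This is a sketch under stated assumptions: every JSONL line is a flat JSON object (no nested values), and the helper name `jsonl_to_csv` is illustrative rather than part of Luna Studio.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path):
    """Convert a JSONL file of flat objects to a CSV file with a header row.

    Returns the number of records written. Assumes each non-blank line is
    a flat JSON object; nested values would be written as their str() form.
    """
    with open(jsonl_path, encoding="utf-8") as src:
        records = [json.loads(line) for line in src if line.strip()]
    if not records:
        raise ValueError("no records found in JSONL file")
    # Header is the union of keys across all records, in first-seen order.
    fieldnames = list(dict.fromkeys(key for record in records for key in record))
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)  # keys missing from a record become empty cells
    return len(records)
```

The `csv` module handles quoting, so inputs that contain commas or newlines stay within a single cell.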

Browser uploads are limited to 20 MiB.
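
20 MiB is 20 × 1024 × 1024 = 20,971,520 bytes, slightly more than 20 MB (20,000,000 bytes). A quick local check, as a sketch: the helper name is hypothetical, and treating a file of exactly 20 MiB as acceptable is an assumption, since the text does not say whether the limit is inclusive.

```python
import os

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MiB = 20,971,520 bytes


def fits_browser_upload(path):
    """Return True if the file at `path` is at most 20 MiB.

    Assumption: the limit is inclusive; the documentation does not specify.
    """
    return os.path.getsize(path) <= MAX_UPLOAD_BYTES
```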

Add a test set

You can add a test set in three places:

The Datasets page → Add test set primary button.
Step 2 of the run creation flow → the dropdown’s Add new test set action.
(Indirectly) by importing from Galileo — see Galileo integration.

All three paths open the same Add test set modal — see Add a dataset for the modal reference.

Where to go next

Add a dataset

Walk through the Upload / URL / Galileo flows.

Validation

What Luna Studio checks and what to do when validation fails.

Training sets

The other dataset type — used to fine-tune the base model.
