The Datasets page (sidebar → Datasets) is where you manage every dataset in your org. Datasets are org-scoped: once added, they’re available across every project for any training run.

Test sets vs. training sets
Luna Studio splits datasets into two flavors, accessible via tabs on the page:
Test sets
Small, hand-labelled datasets used to evaluate fine-tuned metrics. Required for every run.
Training sets
Larger datasets used to fine-tune the base model. Often generated from a test set.
Datasets table
Both tabs use the same column layout:

| Column | What it shows |
|---|---|
| Dataset name | The dataset’s name. |
| Rows | Row count, with thousands separators. |
| Source | One of Galileo (Galileo glyph), Upload (upload icon), or URL (link icon). |
| Used in metric | Outline-style badges for each metric that uses this dataset. Empty if unused. |
| Created at | When the dataset was added. |
| Last updated at | When the dataset was most recently changed. |
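
As a rough illustration of how the Rows and Used in metric columns could be populated, the sketch below models runs as plain Python objects. The `Run` class, its field names, and both helper functions are hypothetical and are not part of any Luna Studio API. The sketch assumes only that a run names one metric, one test set, and one training set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one training run (not a Luna Studio type)."""
    metric: str
    test_set: str
    training_set: str


def format_rows(count: int) -> str:
    """Render a row count with thousands separators."""
    return f"{count:,}"


def used_in_metric(dataset: str, runs: list[Run]) -> list[str]:
    """Return the sorted, de-duplicated metrics whose runs reference the dataset."""
    return sorted(
        {r.metric for r in runs if dataset in (r.test_set, r.training_set)}
    )


# Illustrative data only; the dataset and metric names are made up.
runs = [
    Run("toxicity", test_set="tox-test", training_set="tox-train"),
    Run("toxicity", test_set="tox-test", training_set="tox-train-v2"),
    Run("pii", test_set="pii-test", training_set="tox-train"),
]

print(format_rows(12500))                 # 12,500
print(used_in_metric("tox-train", runs))  # ['pii', 'toxicity']
print(used_in_metric("unused", runs))     # []
```

An empty list here corresponds to an empty Used in metric cell for a dataset that no metric depends on.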
Top-bar actions
- Search — filter by dataset name.
- Add test set / Add training set — primary button. The label tracks the active tab. Opens the Add dataset modal — see Add a dataset.
How datasets relate to runs
Each training run consumes exactly one test set and one training set. The same dataset can be reused across many runs. The Used in metric column on the datasets table shows which metrics’ fine-tuning depends on a dataset, which is worth checking before you delete one.
Source types

| Source | What it means |
|---|---|
| Upload | You uploaded a .csv or .jsonl file from your machine. |
| URL | Luna fetched the dataset from an http/https/s3/gs URL. |
| Galileo | Luna pulled the dataset from a project in your connected Galileo workspace. |
| Generated | Produced by the Generate from test set flow during run creation. |
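
To make the Upload and URL distinction concrete, here is a minimal sketch that guesses which of those two source types a location string would fall under, using only the file extensions and URL schemes listed above. The function name `guess_source` is hypothetical, and Luna's actual checks may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

URL_SCHEMES = {"http", "https", "s3", "gs"}  # schemes listed for the URL source
UPLOAD_EXTENSIONS = {".csv", ".jsonl"}       # file types listed for the Upload source


def guess_source(location: str) -> str:
    """Classify a location string as 'URL' or 'Upload' (illustrative only)."""
    scheme = urlparse(location).scheme.lower()
    if scheme in URL_SCHEMES:
        return "URL"
    # No scheme: treat it as a local file and check its extension.
    if not scheme and PurePosixPath(location).suffix.lower() in UPLOAD_EXTENSIONS:
        return "Upload"
    raise ValueError(f"unsupported dataset location: {location!r}")


print(guess_source("https://example.com/data.csv"))  # URL
print(guess_source("s3://bucket/train.jsonl"))       # URL
print(guess_source("labels.csv"))                    # Upload
```

Anything else, such as an `ftp://` URL or a `.txt` file, raises `ValueError` in this sketch. The Galileo and Generated sources are not covered because they do not start from a file path or URL.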
Where to go next
Test sets
What test sets are, schema rules, and best practices.
Training sets
What training sets are and how to create or reuse them.
Add a dataset
Reference for the three dataset sources (Upload, URL, Galileo).
Dataset validation
What Luna checks when you add a dataset.