Start with prerequisites
Before running the pipeline, make sure you have:
- A high-quality human-labelled test set
- A reliable LLM-as-a-judge prompt for your metric
- The environment and extras required for your chosen providers
The end-to-end workflow
Creating a Luna metric with the SDK happens in two stages:
- Data generation creates or labels the training dataset for your metric.
- Training fine-tunes a LoRA adapter on the prepared dataset.
The two stages correspond to the data_generation and training sections.
Choose your starting point
Start from a test set
Generate synthetic labelled training data when you only have a high-quality labelled test set.
Label existing training data
Use label-only mode when you already have raw training data but still need labels.
Train from a labelled dataset
Skip data generation and go straight to fine-tuning when your labelled dataset is already ready.
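The choice between these three starting points can be sketched as a small decision helper. This is only an illustration and not part of the SDK: `StartingPoint`, `AvailableData`, `choose_starting_point`, and `stages_to_run` are hypothetical names. Only the stage names `data_generation` and `training` come from the workflow described above.

```python
from dataclasses import dataclass
from enum import Enum


class StartingPoint(Enum):
    # Hypothetical labels for the three paths described above.
    FROM_TEST_SET = "start_from_test_set"
    LABEL_ONLY = "label_existing_training_data"
    TRAIN_ONLY = "train_from_labelled_dataset"


@dataclass
class AvailableData:
    # Hypothetical flags describing what you already have on hand.
    has_labelled_test_set: bool = False
    has_raw_training_data: bool = False
    has_labelled_training_data: bool = False


def choose_starting_point(data: AvailableData) -> StartingPoint:
    """Pick the path that skips the most work given the data on hand."""
    if data.has_labelled_training_data:
        # Labels already exist, so data generation can be skipped.
        return StartingPoint.TRAIN_ONLY
    if data.has_raw_training_data:
        # Raw examples exist but still need labels.
        return StartingPoint.LABEL_ONLY
    if data.has_labelled_test_set:
        # Only a test set exists, so training data must be synthesized.
        return StartingPoint.FROM_TEST_SET
    raise ValueError("A labelled test set is the minimum starting point.")


def stages_to_run(start: StartingPoint) -> list[str]:
    """Return the pipeline stages implied by a starting point."""
    if start is StartingPoint.TRAIN_ONLY:
        return ["training"]
    return ["data_generation", "training"]


if __name__ == "__main__":
    start = choose_starting_point(AvailableData(has_raw_training_data=True))
    print(start.value, stages_to_run(start))
```

The helper checks the most complete data first, so a labelled training dataset always takes priority over running data generation again. It does not verify the prerequisites listed earlier.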
Hardware requirements
Data generation
Data generation is CPU-friendly. The main requirement is access to the LLM provider you want to use for synthetic example generation and labeling.
Training
Training requires a GPU. We recommend an H100 for production runs. For larger jobs, use a managed training environment when needed.
Where to go next
Data generation
Create or label the dataset you need before training.
Training
Fine-tune the metric and inspect the resulting artifacts.