The SDK reads a single YAML file that contains both data_generation and training settings. You typically run one or both of these steps from the same config:
run_data_generation(config_path=...)
run_training(config_path=...)
Run config structure
Your YAML file controls the overall workflow and includes top-level keys plus nested data_generation and training sections.
Top-level fields
run_steps
run_steps controls which parts of the workflow run.
Valid values are:
- data_generation
- training
List one or both to choose which steps run:
- ["data_generation", "training"]: run the full end-to-end workflow
- ["data_generation"]: generate or label data only
- ["training"]: skip data generation and train from an existing labelled dataset
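The three combinations above can be written as a top-level key like this. It is a minimal sketch that uses only the values listed in this section; the inline list syntax is an assumption about how the SDK expects the value to be written.

```yaml
# Pick one. The uncommented line runs the full workflow.
run_steps: ["data_generation", "training"]
# run_steps: ["data_generation"]   # generate or label data only
# run_steps: ["training"]          # train from an existing labelled dataset
```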
pipeline_provider
pipeline_provider controls where the workflow runs.
This selects the execution backend for your run. Because you are using the SDK, you can set this to "local".
metric_name
metric_name selects the packaged metric config to use for the run.
Supported values include:
- action_advancement
- action_completion
- context_adherence
- context_relevance
- prompt_injection
- sexism
- tone
- tool_error_rate
- tool_selection_quality
- toxicity
- custom
Use custom when your metric does not match one of the packaged presets.
Preset metric example: Using a preset metric
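A preset run might look roughly like the sketch below. It uses only the top-level fields described above. The choice of toxicity is illustrative, and the nested data_generation and training sections are left out because their fields are not covered in this part of the page.

```yaml
# Illustrative top-level settings for a packaged preset.
run_steps: ["data_generation", "training"]
pipeline_provider: "local"
metric_name: "toxicity"   # any of the supported preset names

# data_generation: ...    # nested settings omitted in this sketch
# training: ...           # nested settings omitted in this sketch
```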
Using custom
Set metric_name: "custom" when you want to define your own metric behavior.
With custom, you are expected to define the metric explicitly in your config, including:
- data_generation.metric.name
- data_generation.metric.description
- data_generation.metric.type
- data_generation.metric.input_format
- data_generation.metric.class_labels or data_generation.metric.llmaj_source_prompt
- data_generation.source_data.dataset.columns.features
- training.prompt_template
The custom template starts as a binary setup with boolean training output. Update those values to match your use case.
Custom metric example: Trace input / output only
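The sketch below shows where the custom fields listed above sit in the file. The key paths come from that list, but every value is either a placeholder or an assumption: the metric name and description are hypothetical, the spelling of the binary type and the class labels are guesses, and the feature column names are only a plausible reading of "trace input / output".

```yaml
# Hedged sketch: every value below is a placeholder or an assumption.
metric_name: "custom"

data_generation:
  metric:
    name: "politeness"                       # hypothetical metric name
    description: "Whether the response is polite."
    type: "binary"                           # assumed spelling of the binary type
    input_format: "<your input format>"      # placeholder; valid values not shown here
    class_labels: ["true", "false"]          # assumed labels; or set llmaj_source_prompt instead
  source_data:
    dataset:
      columns:
        features: ["input", "output"]        # assumed column names for trace input and output

training:
  prompt_template: "<your prompt template>"  # placeholder
```

Set class_labels or llmaj_source_prompt as the list above describes, and replace the placeholders before running either step.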
Nested config sections
data_generation
The data_generation section controls how source data is loaded, how synthetic or labelled examples are produced, and where the resulting dataset is written.
training
The training section controls how the model is fine-tuned, evaluated, and where training artifacts are saved.