Using Datasets
How to use datasets in Galileo
Datasets serve as inputs to an Evaluate run. Each column in a dataset represents a variable that can be used within a prompt template.
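As a rough illustration of how columns map to template variables, the sketch below fills a prompt template once per dataset row using Python's built-in string formatting. The column names, the template text, and the `{name}` placeholder syntax are assumptions made for illustration only; Galileo's own template syntax may differ.

```python
# Hypothetical dataset: each key is a column, each list holds that column's values.
dataset = {
    "topic": ["solar power", "tide pools"],
    "audience": ["engineers", "children"],
}

# Hypothetical template; the placeholders match the column names above.
template = "Explain {topic} to {audience}."


def render_prompts(template, columns):
    """Return one rendered prompt per dataset row."""
    names = list(columns)
    # zip(*...) turns the column lists into row tuples.
    rows = zip(*(columns[name] for name in names))
    return [template.format(**dict(zip(names, row))) for row in rows]


prompts = render_prompts(template, dataset)
for prompt in prompts:
    print(prompt)
```

Each row yields one prompt, so a dataset with two rows produces two prompts for the run.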
Using Datasets in the Galileo Console
Create a dataset
From the Datasets page, click the “Create Dataset” button.
You can upload a CSV, or enter data directly into the table.
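If you need to prepare a CSV for upload, the sketch below writes one with Python's standard `csv` module. It assumes the first row holds the column names, which is a common CSV convention rather than something this page specifies; the file name and column names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows; each dictionary is one dataset row.
rows = [
    {"topic": "solar power", "audience": "engineers"},
    {"topic": "tide pools", "audience": "children"},
]


def write_dataset_csv(path, rows):
    """Write rows to a CSV file with a header row of column names."""
    fieldnames = list(rows[0])
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


csv_path = Path(tempfile.gettempdir()) / "example_dataset.csv"
write_dataset_csv(csv_path, rows)
print(csv_path.read_text(encoding="utf-8"))
```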
Using a dataset in an evaluation run
When creating a new evaluation run, you can select a dataset to use as input.
Using Datasets in code
Prerequisites
For Python, install the promptquality library.
For TypeScript, install the @rungalileo/galileo package.
Create a dataset
You can create a new dataset by running:
These functions accept a few different formats for the dataset.
- A dictionary mapping column names to lists of values (as shown above).
- A list of dictionaries, where each dictionary represents a row in the dataset, e.g.
- A path to a file in either CSV, Feather, or JSONL format, e.g.
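To make the relationship between these formats concrete, the sketch below uses only the Python standard library to convert a column-oriented dictionary into a list of row dictionaries and then save those rows as a JSONL file. It does not call the Galileo client; the helper names, column names, and file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Format 1: a dictionary mapping column names to lists of values.
columns = {
    "topic": ["solar power", "tide pools"],
    "audience": ["engineers", "children"],
}


def columns_to_rows(columns):
    """Convert a column-oriented dict into a list of row dicts (format 2)."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


def write_jsonl(path, rows):
    """Write one JSON object per line (format 3, JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


rows = columns_to_rows(columns)
jsonl_path = Path(tempfile.gettempdir()) / "example_dataset.jsonl"
write_jsonl(jsonl_path, rows)
print(rows)
```

All three representations carry the same data, so the choice mostly depends on whether your data already lives in memory or in a file.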
Using a dataset in an evaluation run
To use the dataset in an evaluation run, provide the dataset ID to the run function (Python only).
Note that the TypeScript client does not currently support creating runs. However, you can use the dataset for logging workflows.
Getting the contents of a dataset
You can list the dataset’s contents like so: