# datasets

## Dataset

### add\_rows

```python theme={null}
def add_rows(self, row_data: list[dict[str, Any]]) -> 'Dataset'
```

Adds rows to the dataset.

**Arguments**

* `row_data` (`List[Dict[str, Any]]`): The rows to add to the dataset.

**Raises**

* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Dataset`: The updated dataset with the new rows.
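
As a rough illustration of the `row_data` shape, the sketch below builds a list of dictionaries from (input, output) pairs. The `build_rows` helper and the `input` and `output` column names are assumptions made for this example and are not part of the documented API. The final call appears only as a comment because it needs a live `Dataset` instance.

```python
from typing import Any


def build_rows(pairs: list[tuple[str, str]]) -> list[dict[str, Any]]:
    """Turn (input, output) pairs into a list of row dictionaries."""
    # The column names "input" and "output" are assumed for illustration.
    return [{"input": question, "output": answer} for question, answer in pairs]


rows = build_rows([
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
])

# With a Dataset instance in hand (hypothetical variable `dataset`):
# dataset = dataset.add_rows(rows)
```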

### get\_content

```python theme={null}
def get_content(self,
                starting_token: int=0,
                limit: int=MAX_DATASET_ROWS) -> Union[None, DatasetContent]
```

Gets and returns the content of the dataset.

Also refreshes the content of the local dataset instance.

**Arguments**

* `starting_token` (`int`): The position from which to start reading content. Default is 0.
* `limit` (`int`): The maximum number of rows to return. Default is `MAX_DATASET_ROWS`.

**Raises**

* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Union[None, DatasetContent]`: The content of the dataset.
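
One possible way to read a large dataset in pages with `starting_token` and `limit` is sketched below. It assumes that `starting_token` behaves like a row offset and that a page shorter than `limit` marks the end of the data; neither is confirmed by this reference. The page fetcher is passed in as a callable so the sketch runs against a stand-in instead of a live dataset.

```python
from typing import Any, Callable, Iterator


def iter_rows(fetch_page: Callable[[int, int], list[Any]],
              page_size: int = 100) -> Iterator[Any]:
    """Yield rows page by page until a short or empty page signals the end."""
    starting_token = 0
    while True:
        page = fetch_page(starting_token, page_size)
        yield from page
        if len(page) < page_size:
            break
        # Assumption: the token advances by the number of rows already read.
        starting_token += len(page)


# Stand-in for a real fetcher such as (hypothetical, `.rows` is assumed):
# lambda token, limit: dataset.get_content(starting_token=token, limit=limit).rows
fake_rows = list(range(250))


def fake_fetch(token: int, limit: int) -> list[int]:
    return fake_rows[token:token + limit]


all_rows = list(iter_rows(fake_fetch, page_size=100))
```

A real fetcher would also need to handle the `None` return value that the signature allows.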

### list\_projects

```python theme={null}
def list_projects(self, limit: Union[Unset, int]=100) -> list
```

Lists all projects that this dataset is associated with.

**Arguments**

* `limit` (`Union[Unset, int]`): The maximum number of projects to return. Default is 100.

**Raises**

* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[DatasetProject]`: A list of projects this dataset is used in.

## Datasets

### create

```python theme={null}
def create(self,
           name: str,
           content: DatasetType,
           *,
           project_id: Optional[str]=None,
           project_name: Optional[str]=None) -> Dataset
```

Creates a new dataset, optionally associating it with a project.

**Arguments**

* `name` (`str`): The name of the dataset.
* `content` (`DatasetType`): The content of the dataset.
* `project_id` (`str`): Associate the dataset with this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Associate the dataset with this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If both project\_id and project\_name are provided, or if the specified project does not exist.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Dataset`: The created dataset.

### delete

```python theme={null}
def delete(self,
           *,
           id: Optional[str]=None,
           name: Optional[str]=None,
           project_id: Optional[str]=None,
           project_name: Optional[str]=None) -> None
```

Deletes a dataset by id or name (exactly one of `id` or `name` must be provided).

Optionally validates that the dataset is used in a specific project before deletion.

**Arguments**

* `id` (`str`): The id of the dataset.
* `name` (`str`): The name of the dataset.
* `project_id` (`str`): Validate that the dataset is used in this project by ID before deletion.
  Mutually exclusive with project\_name.
* `project_name` (`str`): Validate that the dataset is used in this project by name before deletion.
  Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If neither or both of `id` and `name` are provided, if both `project_id` and
  `project_name` are provided, if the specified project does not exist, or if the dataset
  is not used in the specified project.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.
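
The argument rules listed under **Raises** can be mirrored in a small pre-flight check, sketched below. `check_delete_args` is a hypothetical helper written for this example; it only covers the rules that can be checked locally and does not reproduce the library's own validation or its project lookups.

```python
from typing import Optional


def check_delete_args(*,
                      id: Optional[str] = None,
                      name: Optional[str] = None,
                      project_id: Optional[str] = None,
                      project_name: Optional[str] = None) -> None:
    """Raise ValueError when the selector arguments are inconsistent."""
    # Exactly one of id / name: both missing or both present is an error.
    if (id is None) == (name is None):
        raise ValueError("Exactly one of 'id' or 'name' must be provided.")
    # The project filters are mutually exclusive.
    if project_id is not None and project_name is not None:
        raise ValueError("Only one of 'project_id' or 'project_name' may be provided.")


check_delete_args(name="qa-set", project_name="demo")  # passes silently
```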

### extend

```python theme={null}
def extend(self,
           *,
           prompt_settings: Optional[dict[str, Any]]=None,
           prompt: Optional[str]=None,
           instructions: Optional[str]=None,
           examples: Optional[builtins.list[str]]=None,
           data_types: Optional[builtins.list[str]]=None,
           count: int=10) -> builtins.list[DatasetRow]
```

Extends a dataset with synthetically generated data based on the provided parameters.

This method initiates a dataset extension job, waits for it to complete by polling its status,
and then returns the content of the extended dataset.

**Arguments**

* `prompt_settings` (`Dict[str, Any]`): Settings for the prompt generation. Should contain a `model_alias` key.
  Example: `{'model_alias': 'GPT-4o mini'}`
* `prompt` (`str`): A description of the assistant's role.
* `instructions` (`str`): Instructions for the assistant.
* `examples` (`List[str]`): Examples of user prompts.
* `data_types` (`List[str]`): The types of data to generate. Possible values are:
  'General Query', 'Prompt Injection', 'Off-Topic Query',
  'Toxic Content in Query', 'Multiple Questions in Query',
  'Sexist Content in Query'.
* `count` (`int`): The number of synthetic examples to generate. Default is 10.

**Raises**

* `DatasetAPIException`: If the request to extend the dataset fails.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[DatasetRow]`: A list of rows from the extended dataset.
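
The keyword arguments above could be assembled as in the sketch below. `build_extend_kwargs` is a hypothetical helper, and its local check of `data_types` against the values listed above is this example's own addition; whether the service rejects other values is not stated here. The actual call is left as a comment because it needs a configured client.

```python
from typing import Any

# Values copied from the `data_types` description above.
ALLOWED_DATA_TYPES = {
    "General Query",
    "Prompt Injection",
    "Off-Topic Query",
    "Toxic Content in Query",
    "Multiple Questions in Query",
    "Sexist Content in Query",
}


def build_extend_kwargs(prompt: str,
                        instructions: str,
                        examples: list[str],
                        data_types: list[str],
                        count: int = 10,
                        model_alias: str = "GPT-4o mini") -> dict[str, Any]:
    """Assemble keyword arguments for an extend call, checking data_types locally."""
    unknown = [t for t in data_types if t not in ALLOWED_DATA_TYPES]
    if unknown:
        raise ValueError(f"Unsupported data types: {unknown}")
    return {
        "prompt_settings": {"model_alias": model_alias},
        "prompt": prompt,
        "instructions": instructions,
        "examples": examples,
        "data_types": data_types,
        "count": count,
    }


kwargs = build_extend_kwargs(
    prompt="A support assistant for a banking app.",
    instructions="Answer questions about account features.",
    examples=["How do I reset my PIN?"],
    data_types=["General Query", "Prompt Injection"],
    count=5,
)

# With a Datasets instance in hand (hypothetical variable `datasets`):
# rows = datasets.extend(**kwargs)
```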

### get

```python theme={null}
def get(self,
        *,
        id: Optional[str]=None,
        name: Optional[str]=None,
        with_content: bool=False,
        project_id: Optional[str]=None,
        project_name: Optional[str]=None) -> Optional[Dataset]
```

Retrieves a dataset by id or name (exactly one of `id` or `name` must be provided).

Optionally validates that the dataset is used in a specific project.

**Arguments**

* `id` (`str`): The id of the dataset.
* `name` (`str`): The name of the dataset.
* `with_content` (`bool`): Whether to return the content of the dataset. Default is False.
* `project_id` (`str`): Validate that the dataset is used in this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Validate that the dataset is used in this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If neither or both of `id` and `name` are provided, if both `project_id` and
  `project_name` are provided, if the specified project does not exist, or if the dataset
  is not used in the specified project.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Optional[Dataset]`: The dataset.

### list

```python theme={null}
def list(self,
         limit: Union[Unset, int]=100,
         *,
         project_id: Optional[str]=None,
         project_name: Optional[str]=None) -> list[Dataset]
```

Lists all datasets, optionally filtered by project.

**Arguments**

* `limit` (`Union[Unset, int]`): The maximum number of datasets to return. Default is 100.
* `project_id` (`str`): Filter datasets used in this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Filter datasets used in this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If both project\_id and project\_name are provided, or if the specified project
  does not exist.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[Dataset]`: A list of datasets.

## convert\_dataset\_row\_to\_record

```python theme={null}
def convert_dataset_row_to_record(dataset_row: DatasetRow) -> DatasetRecord
```

Converts a DatasetRow to a DatasetRecord.

Supports both 'output' and 'ground\_truth' field names for backward compatibility.

**Arguments**

* `dataset_row` (`DatasetRow`): The dataset row to convert.

**Raises**

* `ValueError`: If the dataset row does not have an input field.

**Returns**

* `DatasetRecord`: The converted dataset record.
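
The backward-compatible field handling described above might look roughly like the sketch below, which uses plain dictionaries as stand-ins for `DatasetRow` and `DatasetRecord`. The `expected` key and the choice to check `ground_truth` before `output` are assumptions made for this example; the library's actual precedence and record fields are not specified in this reference.

```python
from typing import Any


def row_to_record(row: dict[str, Any]) -> dict[str, Any]:
    """Convert a row-like dict to a record-like dict (stand-in types)."""
    if row.get("input") is None:
        raise ValueError("Dataset row must have an input field.")
    # Assumption: prefer 'ground_truth' when present, else fall back to 'output'.
    if "ground_truth" in row:
        expected = row["ground_truth"]
    else:
        expected = row.get("output")
    return {"input": row["input"], "expected": expected}


legacy = row_to_record({"input": "2 + 2?", "output": "4"})
current = row_to_record({"input": "2 + 2?", "ground_truth": "4"})
```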

## create\_dataset

```python theme={null}
def create_dataset(name: str,
                   content: DatasetType,
                   *,
                   project_id: Optional[str]=None,
                   project_name: Optional[str]=None) -> Dataset
```

Creates a new dataset, optionally associating it with a project.

**Arguments**

* `name` (`str`): The name of the dataset.
* `content` (`DatasetType`): The content of the dataset.
* `project_id` (`str`): Associate the dataset with this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Associate the dataset with this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If both project\_id and project\_name are provided, or if the specified project does not exist.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Dataset`: The created dataset.

## delete\_dataset

```python theme={null}
def delete_dataset(*,
                   id: Optional[str]=None,
                   name: Optional[str]=None,
                   project_id: Optional[str]=None,
                   project_name: Optional[str]=None) -> None
```

Deletes a dataset by id or name (exactly one of `id` or `name` must be provided).

Optionally validates that the dataset is used in a specific project before deletion.

**Arguments**

* `id` (`str`): The id of the dataset.
* `name` (`str`): The name of the dataset.
* `project_id` (`str`): Validate that the dataset is used in this project by ID before deletion.
  Mutually exclusive with project\_name.
* `project_name` (`str`): Validate that the dataset is used in this project by name before deletion.
  Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If neither or both of `id` and `name` are provided, if both `project_id` and
  `project_name` are provided, if the specified project does not exist, or if the dataset
  is not used in the specified project.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

## extend\_dataset

```python theme={null}
def extend_dataset(*,
                   prompt_settings: Optional[dict[str, Any]]=None,
                   prompt: Optional[str]=None,
                   instructions: Optional[str]=None,
                   examples: Optional[list[str]]=None,
                   data_types: Optional[list[str]]=None,
                   count: int=10) -> list[DatasetRow]
```

Extends a dataset with synthetically generated data based on the provided parameters.

This function initiates a dataset extension job, waits for it to complete by polling its status,
and then returns the content of the extended dataset.

**Arguments**

* `prompt_settings` (`Dict[str, Any]`): Settings for the prompt generation. Should contain a `model_alias` key.
  Example: `{'model_alias': 'GPT-4o mini'}`
* `prompt` (`str`): A description of the assistant's role.
* `instructions` (`str`): Instructions for the assistant.
* `examples` (`List[str]`): Examples of user prompts.
* `data_types` (`List[str]`): The types of data to generate. Possible values are:
  'General Query', 'Prompt Injection', 'Off-Topic Query',
  'Toxic Content in Query', 'Multiple Questions in Query',
  'Sexist Content in Query'.
* `count` (`int`): The number of synthetic examples to generate. Default is 10.

**Raises**

* `DatasetAPIException`: If the request to extend the dataset fails.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[DatasetRow]`: A list of rows from the extended dataset.

## get\_dataset

```python theme={null}
def get_dataset(*,
                id: Optional[str]=None,
                name: Optional[str]=None,
                project_id: Optional[str]=None,
                project_name: Optional[str]=None) -> Optional[Dataset]
```

Retrieves a dataset by id or name (exactly one of `id` or `name` must be provided).

Optionally validates that the dataset is used in a specific project.

**Arguments**

* `id` (`str`): The id of the dataset.
* `name` (`str`): The name of the dataset.
* `project_id` (`str`): Validate that the dataset is used in this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Validate that the dataset is used in this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If neither or both of `id` and `name` are provided, if both `project_id` and
  `project_name` are provided, if the specified project does not exist, or if the dataset
  is not used in the specified project.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `Optional[Dataset]`: The dataset.

## get\_dataset\_version

```python theme={null}
def get_dataset_version(*,
                        version_index: int,
                        dataset_name: Optional[str]=None,
                        dataset_id: Optional[str]=None) -> Optional[DatasetContent]
```

Retrieves a dataset version by dataset name or dataset id.

**Arguments**

* `version_index` (`int`): The index of the dataset version to retrieve.
* `dataset_name` (`Optional[str]`): The name of the dataset.
* `dataset_id` (`Optional[str]`): The id of the dataset.

**Returns**

* `Optional[DatasetContent]`: The content of the requested dataset version.

## get\_dataset\_version\_history

```python theme={null}
def get_dataset_version_history(*,
                                dataset_name: Optional[str]=None,
                                dataset_id: Optional[str]=None) -> Optional[Union[HTTPValidationError, ListDatasetVersionResponse]]
```

Retrieves a dataset version history by dataset name or dataset id.

**Arguments**

* `dataset_name` (`str`): The name of the dataset.
* `dataset_id` (`str`): The id of the dataset.

**Raises**

* `HTTPValidationError`: If the request fails validation. Per the signature, this may be returned rather than raised.

**Returns**

* `ListDatasetVersionResponse`: The version history of the dataset.
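
Since the signature allows `None`, an `HTTPValidationError`, or a `ListDatasetVersionResponse`, callers may want to narrow the result before using it. The sketch below shows one way to do that. `FakeValidationError` and `FakeVersionHistory` are hypothetical stand-ins for the real types, and their attributes are invented for the example.

```python
from typing import Any, Optional


class FakeValidationError:
    """Stand-in for HTTPValidationError (hypothetical shape)."""

    def __init__(self, detail: str) -> None:
        self.detail = detail


class FakeVersionHistory:
    """Stand-in for ListDatasetVersionResponse (hypothetical shape)."""

    def __init__(self, versions: list[int]) -> None:
        self.versions = versions


def unwrap_history(result: Optional[Any], error_type: type) -> Any:
    """Return the history object, or raise if the call gave nothing or an error."""
    if result is None:
        raise LookupError("No version history was returned.")
    if isinstance(result, error_type):
        raise ValueError("The request failed validation.")
    return result


history = unwrap_history(FakeVersionHistory([1, 2, 3]), FakeValidationError)
```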

## list\_dataset\_projects

```python theme={null}
def list_dataset_projects(*,
                          dataset_id: Optional[str]=None,
                          dataset_name: Optional[str]=None,
                          limit: Union[Unset, int]=100) -> list
```

Lists all projects that a dataset is associated with.

**Arguments**

* `dataset_id` (`str`): The ID of the dataset.
* `dataset_name` (`str`): The name of the dataset.
* `limit` (`Union[Unset, int]`): The maximum number of projects to return. Default is 100.

**Raises**

* `ValueError`: If neither or both `dataset_id` and `dataset_name` are provided, or if the dataset does not exist.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[DatasetProject]`: A list of projects the dataset is used in.

## list\_datasets

```python theme={null}
def list_datasets(limit: Union[Unset, int]=100,
                  *,
                  project_id: Optional[str]=None,
                  project_name: Optional[str]=None) -> list[Dataset]
```

Lists all datasets, optionally filtered by project.

**Arguments**

* `limit` (`Union[Unset, int]`): The maximum number of datasets to return. Default is 100.
* `project_id` (`str`): Filter datasets used in this project by ID. Mutually exclusive with project\_name.
* `project_name` (`str`): Filter datasets used in this project by name. Mutually exclusive with project\_id.

**Raises**

* `ValueError`: If both project\_id and project\_name are provided, or if the specified project
  does not exist.
* `errors.UnexpectedStatus`: If the server returns an undocumented status code and Client.raise\_on\_unexpected\_status is True.
* `httpx.TimeoutException`: If the request takes longer than Client.timeout.

**Returns**

* `List[Dataset]`: A list of datasets.
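
Calls in this module can raise `httpx.TimeoutException`, so a caller might wrap them in a small retry loop. The sketch below is one such wrapper, not something the library provides. `call_with_retry` is a hypothetical helper, and the built-in `TimeoutError` stands in for `httpx.TimeoutException` so the example runs without network access.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T],
                    retry_on: type[BaseException],
                    attempts: int = 3,
                    delay: float = 0.0) -> T:
    """Call fn, retrying on the given exception type up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)
    raise AssertionError("unreachable")


# Stand-in that times out twice before succeeding.
calls = {"count": 0}


def flaky_list() -> list[str]:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return ["dataset-a", "dataset-b"]


datasets = call_with_retry(flaky_list, retry_on=TimeoutError, attempts=3)

# With the real function (requires httpx and a configured client):
# call_with_retry(lambda: list_datasets(limit=50), retry_on=httpx.TimeoutException)
```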
