Create Workflows Run
Create a new Evaluate run with workflows.
Use this endpoint to create a new Evaluate run with workflows. The request body should contain the workflows to be ingested and evaluated.
Additionally, specify the project_id or project_name of the project into which the workflows should be ingested. If the project does not exist, it will be created; if it exists, the workflows will be logged to it. If both project_id and project_name are provided, project_id takes precedence. The run_name is optional; if it is not provided, a timestamp-based name is generated automatically.
The body should also include the configuration for the scorers to be used in the evaluation. This configuration is used to evaluate the workflows and generate the results.
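As a minimal sketch of how such a body could be assembled in Python: the helper name build_run_payload is hypothetical and not part of any Galileo SDK, and the check that at least one project identifier is present is an assumption based on the description above. The key names are the ones that appear in this page's example request.

```python
import json
from typing import Any, Optional


def build_run_payload(
    workflows: list[dict[str, Any]],
    scorer_names: list[str],
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
    run_name: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body for the Create Workflows Run endpoint.

    Hypothetical helper: only the key names come from the documentation.
    """
    # Assumption: the endpoint needs at least one way to identify the project.
    if project_id is None and project_name is None:
        raise ValueError("provide project_id or project_name")

    payload: dict[str, Any] = {
        "workflows": workflows,
        "scorers": [{"name": name} for name in scorer_names],
    }
    # Both may be sent; the documentation says project_id takes precedence.
    if project_id is not None:
        payload["project_id"] = project_id
    if project_name is not None:
        payload["project_name"] = project_name
    # Leaving run_name out lets the server generate a timestamp-based name.
    if run_name is not None:
        payload["run_name"] = run_name
    return payload


payload = build_run_payload(
    workflows=[{"name": "llm", "type": "llm", "input": "who is a smart LLM?", "output": "I am!"}],
    scorer_names=["correctness", "output_pii"],
    project_name="my-evaluate-project",
)
print(json.dumps(payload, indent=2))
```

Omitting run_name instead of sending an empty string keeps the auto-generation behaviour on the server side.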
curl --request POST \
  --url https://api.acme.rungalileo.io/v1/evaluate/runs \
  --header 'Content-Type: application/json' \
  --header 'Galileo-API-Key: <api-key>' \
  --data '{
    "project_name": "my-evaluate-project",
    "run_name": "my-evaluate-run",
    "scorers": [
      {
        "name": "correctness"
      },
      {
        "name": "output_pii"
      }
    ],
    "workflows": [
      {
        "created_at_ns": 1744827344427819800,
        "duration_ns": 0,
        "input": "who is a smart LLM?",
        "metadata": {},
        "name": "llm",
        "output": "I am!",
        "type": "llm"
      }
    ]
  }'
{
  "message": "<string>",
  "project_id": "<string>",
  "project_name": "<string>",
  "run_id": "<string>",
  "run_name": "<string>",
  "workflows_count": 123,
  "records_count": 123
}
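The same call can be sketched with Python's standard library. The function names make_request and summarize_response are hypothetical; the URL, headers, and response keys are taken from the example on this page. The request is only built here, not sent, so the snippet runs without network access or a valid key.

```python
import json
import urllib.request
from typing import Any

# URL taken from the curl example on this page.
API_URL = "https://api.acme.rungalileo.io/v1/evaluate/runs"


def make_request(payload: dict[str, Any], api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Galileo-API-Key": api_key,
        },
        method="POST",
    )


def summarize_response(body: str) -> str:
    """Turn the JSON response body into a one-line summary."""
    result = json.loads(body)
    return (
        f"run {result['run_name']} ({result['run_id']}) in project "
        f"{result['project_name']}: {result['workflows_count']} workflows, "
        f"{result['records_count']} records"
    )


request = make_request({"project_name": "my-evaluate-project", "workflows": []}, "<api-key>")

# To actually send it (needs network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     print(summarize_response(response.read().decode("utf-8")))
```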
Authorizations
Body
workflows: List of workflows to include in the run. Each workflow is a step described by the fields below. Steps can be nested, and every nested step takes the same fields.
input: Input to the step.
type: Type of the step. By default, it is set to workflow. Allowed values: chain, chat, llm, retriever, tool, agent, workflow, trace.
output: Output of the step.
name: Name of the step.
created_at_ns: Timestamp of the step's creation, as nanoseconds since epoch.
duration_ns: Duration of the step in nanoseconds.
metadata: Metadata associated with this step.
Status code of the step. Used for logging failed/errored steps.
Ground truth expected output for the step.
Steps in the workflow.
Parent node of the current node. For internal use only.
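One way to produce a single step with real timing values is sketched below. The helper record_step is hypothetical; it uses only key names that appear in this page's example request and the step types listed in this reference. The creation time comes from time.time_ns() (nanoseconds since the Unix epoch) and the duration from a monotonic clock.

```python
import time
from typing import Any, Callable

# Step types listed in this reference.
STEP_TYPES = {"chain", "chat", "llm", "retriever", "tool", "agent", "workflow", "trace"}


def record_step(
    name: str,
    step_input: str,
    fn: Callable[[str], str],
    step_type: str = "workflow",
) -> dict[str, Any]:
    """Run fn(step_input) and describe the call as a step dictionary."""
    if step_type not in STEP_TYPES:
        raise ValueError(f"unknown step type: {step_type!r}")
    created_at_ns = time.time_ns()        # wall-clock nanoseconds since epoch
    started = time.perf_counter_ns()      # monotonic clock for the duration
    output = fn(step_input)
    duration_ns = time.perf_counter_ns() - started
    return {
        "name": name,
        "type": step_type,
        "input": step_input,
        "output": output,
        "created_at_ns": created_at_ns,
        "duration_ns": duration_ns,
        "metadata": {},
    }


# A stand-in for a real model call, used only for illustration.
step = record_step("llm", "who is a smart LLM?", lambda prompt: "I am!", step_type="llm")
```

The resulting dictionary can be placed directly in the workflows list of the request body.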
List of Galileo scorers to enable.
"agentic_workflow_success"
List of filters to apply to the scorer.
Filters on node names in scorer jobs.
eq, ne, contains
"node_name"
"string"
"plus"
Alias of the model to use for the scorer.
Number of judges for the scorer. Allowed range: 1 <= x <= 10
List of registered scorers to enable.
Name of the scorer to enable.
List of filters to apply to the scorer.
Filters on node names in scorer jobs.
eq, ne, contains
"node_name"
"string"
List of generated scorers to enable.
Name of the scorer to enable.
List of filters to apply to the scorer.
Filters on node names in scorer jobs.
eq, ne, contains
"node_name"
"string"
project_id: ID of the Evaluate project with which the run should be associated.
project_name: Name of the Evaluate project with which the run should be associated. If the project does not exist, it will be created.
run_name: Name of the run. If no name is provided, a timestamp-based name will be generated.
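The constraints listed for scorers (a judge count between 1 and 10, and the node-name filter values eq, ne, and contains) can be checked on the client before sending a request. The sketch below is only illustrative: validate_scorer_settings and its parameter names are hypothetical and do not correspond to documented request keys.

```python
# Values taken from this reference; the names below are local to this sketch.
FILTER_OPERATORS = {"eq", "ne", "contains"}
MIN_JUDGES, MAX_JUDGES = 1, 10


def validate_scorer_settings(num_judges: int, filter_operator: str) -> None:
    """Raise ValueError if a setting falls outside the documented limits."""
    if not MIN_JUDGES <= num_judges <= MAX_JUDGES:
        raise ValueError(
            f"number of judges must be between {MIN_JUDGES} and {MAX_JUDGES}, got {num_judges}"
        )
    if filter_operator not in FILTER_OPERATORS:
        raise ValueError(
            f"filter operator must be one of {sorted(FILTER_OPERATORS)}, got {filter_operator!r}"
        )


validate_scorer_settings(num_judges=3, filter_operator="contains")  # passes silently
```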