Galileo supports logging chains built with langchain. To log these chains, you'll need the callback from our Python client, promptquality.
For logging your data, first log in:
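A minimal sketch of the login step, assuming you import promptquality as pq; the console URL is a placeholder for your own deployment:

```python
import promptquality as pq

# Authenticate with your Galileo console.
# The URL below is a placeholder; use your own deployment's console URL.
pq.login("https://console.your-galileo-deployment.com")
```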
Next, set up the GalileoPromptCallback (a sketch follows the parameter notes below):
- project_name: each “run” will appear under this project. Choose a name that’ll help you identify what you’re evaluating.
- scorers: This is the list of metrics you want to evaluate your run over. Check out Galileo Guardrail Metrics and Custom Metrics for more information.
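A rough sketch of creating the callback; the project name and scorer names are illustrative, so check the Guardrail Metrics docs for the exact scorers available in your version:

```python
import promptquality as pq

# Create the callback handler that will log every node of the chain.
galileo_handler = pq.GalileoPromptCallback(
    project_name="my-rag-evaluation",      # runs are grouped under this project
    scorers=[
        pq.Scorers.context_adherence,      # illustrative scorer names; see the
        pq.Scorers.correctness,            # Galileo Guardrail Metrics docs
    ],
)
```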
Executing and Logging
Next, run your chain over your Evaluation set and log the results to Galileo. When you execute your chain (with run, invoke, or batch), just pass the callback instance created earlier in the callbacks:
If using .run():
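A sketch, where chain is your chain, inputs its input, and galileo_handler the callback created above:

```python
# Legacy chain interface: pass the handler through the callbacks argument.
chain.run(inputs, callbacks=[galileo_handler])
```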
If using .invoke():
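With the runnable interface the callbacks go inside the config dict (same assumptions as above):

```python
# Runnable / LCEL interface: callbacks are supplied via config.
chain.invoke(inputs, config=dict(callbacks=[galileo_handler]))
```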
If using .batch():
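Here list_of_inputs stands for a list of chain inputs, e.g. one per row of your Evaluation set:

```python
# Batch execution: the same config (and callback) applies to every input.
chain.batch(list_of_inputs, config=dict(callbacks=[galileo_handler]))
```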
Once you've run your chain over the full Evaluation set, call finish. The finish step uploads the run to Galileo and starts the execution of the scorers server-side. This step will also display the link you can use to interact with the run on the Galileo console.
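For example, assuming the finish step is exposed as a method on the callback handler created above:

```python
# Upload the run to Galileo and start scorer execution server-side.
# A link to the run in the Galileo console is displayed on completion.
galileo_handler.finish()
```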
A full example can be found here.
Note 1: Please make sure to set the callback at execution time, not at definition time, so that the callback is invoked for all nodes of the chain (see the sketch after these notes).
Note 2: We recommend using .invoke instead of .batch because langchain reports latencies for the entire batch instead of each individual chain execution.
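To illustrate Note 1, a rough sketch of the difference; the chain construction is hypothetical and galileo_handler is the callback created earlier:

```python
# Definition time (avoid): callbacks attached when the chain is constructed
# are scoped to that object and are not propagated to child runs, so
# intermediate nodes would go unlogged.
# chain = LLMChain(llm=llm, prompt=prompt, callbacks=[galileo_handler])

# Execution time (prefer): callbacks passed at invocation are propagated to
# every node of the chain.
chain.invoke(inputs, config=dict(callbacks=[galileo_handler]))
```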