Galileo lets you choose which metrics to use for your evaluation.

Check out Choose your Guardrail Metrics to understand which metrics or scorers apply to your use case.

Using scorers

To use scorers during a prompt run, sweep, or a more complex workflow, pass them in via the scorers argument:


import promptquality as pq

pq.run(..., scorers=[pq.Scorers.correctness, pq.Scorers.context_adherence])

Disabling default scorers

By default, we turn on a few scorers for you (PII, Toxicity, BLEU, ROUGE). If you want to disable a default scorer, pass in a ScorersConfiguration object:


pq.run(
    ...,
    scorers=[pq.Scorers.correctness, pq.Scorers.context_adherence],
    scorers_config=pq.ScorersConfiguration(latency=False),
)

You can also use the ScorersConfiguration to turn scorers on, instead of passing the scorers argument:

pq.run(..., scorers_config=pq.ScorersConfiguration(latency=False, groundedness=True))

Logging Workflows

If you’re logging workflows with EvaluateRun, you can add your scorers the same way:

evaluate_run = pq.EvaluateRun(
    run_name="my_run",
    project_name="my_project",
    scorers=[pq.Scorers.correctness, pq.Scorers.context_adherence],
)
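As a rough sketch of what a full logging flow might look like: after creating the run, you log workflows and then submit them so the selected scorers execute. The add_workflow and finish method names below are assumptions about the EvaluateRun interface, not confirmed API; check the SDK reference for exact signatures.

```python
import promptquality as pq

# Create the run with the scorers you want applied.
evaluate_run = pq.EvaluateRun(
    run_name="my_run",
    project_name="my_project",
    scorers=[pq.Scorers.correctness, pq.Scorers.context_adherence],
)

# Hypothetical: log one workflow's input and output.
evaluate_run.add_workflow(
    input="What is the capital of France?",
    output="Paris",
)

# Hypothetical: upload the logged workflows so the scorers run on them.
evaluate_run.finish()
```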

Customizing Plus Scorers

We allow customizing execution parameters for the Chainpoll-powered metrics from our Guardrail Store. See Customizing Chainpoll-powered Metrics for details.
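For illustration, a customized Chainpoll scorer might be configured and passed through the same scorers argument as the built-in scorers. The CustomizedChainPollScorer helper, CustomizedScorerName enum, and num_judges parameter below are assumptions drawn from the customization guide referenced above; verify them there before use.

```python
import promptquality as pq

# Hypothetical: tune Chainpoll execution parameters for a Plus scorer.
# The names used here are assumptions; see Customizing Chainpoll-powered
# Metrics for the actual API.
chainpoll_scorer = pq.CustomizedChainPollScorer(
    scorer_name=pq.CustomizedScorerName.context_adherence_plus,
    num_judges=7,  # more judges can stabilize scores at higher cost
)

# Customized scorers are passed in like any other scorer.
pq.run(..., scorers=[chainpoll_scorer])
```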