LLMs
Integrate large language models (LLMs) into Galileo Evaluate to assess performance, refine outputs, and enhance generative AI model capabilities.
This section only applies if you want to:
- Query your LLMs via the Galileo Playground or via promptquality.runs()
- Or leverage any of our metrics that are powered by OpenAI / Azure models.
If you have an application or prototype where you’re querying a model in code, you can integrate Galileo into your code. Jump to Evaluating and Optimizing Agents, Chains, or multi-stage workflows to learn more.
Galileo integrates with publicly accessible LLM APIs as well as Open Source LLMs (privately hosted). Before you start using Evaluate on your own LLMs, you need to set up your models on the system.
- Go to the ‘Galileo Home Page’.
- Click on your ‘Profile’ (bottom left).
- Click on ‘Settings & Permissions’.
- Click on ‘Integrations’.
You can set up and manage all your LLM API and Custom Model integrations from the ‘Integrations’ page.
Public APIs supported
OpenAI
We support both the Chat and Completions APIs from OpenAI, with all of the active models. This can be set up from the Galileo console or from the Python client.
Note: OpenAI Models power a few of Galileo’s Guardrail Metrics (e.g. Correctness, Context Adherence, Chunk Attribution, Chunk Utilization, Completeness). To improve your evaluation experience, we recommend setting up this integration even if the model you’re prompting or testing is a different one.
Azure OpenAI
If you use OpenAI models through Azure, you can set up your Azure integration. This can be set up from the Galileo console or from the Python client.
Google Vertex AI
To integrate with models served by Google via Vertex AI (like PaLM 2 and Gemini), we recommend setting up a Service Account within a Google Cloud project that has Vertex AI enabled. This service account requires, at minimum, the ‘Vertex AI User (roles/aiplatform.user)’ role.
Once the role is assigned, create a new key for this service account. The contents of the JSON file provided are what you’ll copy over into the Integrations page for Galileo.
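Before pasting the key into the Integrations page, it can help to confirm that the file is a complete service account key. The snippet below is a minimal sketch of such a check, not part of Galileo: the helper name `check_service_account_key` is hypothetical, the sample values are placeholders, and the field list covers only the fields that Google Cloud service account key files normally contain.

```python
import json
from typing import List

# Fields normally present in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw: str) -> List[str]:
    """Return a list of problems found in the key; an empty list means it looks complete."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems


# Placeholder key for illustration only; a real key comes from the downloaded JSON file.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "galileo@my-project.iam.gserviceaccount.com",
})

print(check_service_account_key(sample_key))
```

In practice you would read the downloaded file's contents and pass them to the helper in place of `sample_key`.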
AWS Bedrock
Add your AWS Bedrock integration on the Galileo Integrations page. A green light indicates a successful integration, after which the new Bedrock models show up in the Prompt Playground.
AWS Sagemaker
If you’re hosting models on AWS Sagemaker, you can query them via Galileo. Set up your AWS Sagemaker integration via the Integrations page.
You’ll need to enter your authentication credentials (either an access key and secret pair, or an AWS role that can be assumed) along with the AWS region in which your endpoints are hosted. For each endpoint, you can configure the name of the endpoint and an alias, along with the schema mapping in dpath notation.
Required parameters for each endpoint are:
- Prompt: Where to place the prompt in the payload.
- Response: Where to find the model’s output in the endpoint’s response.
Optional parameters, which are included in the payload if set, are:
- Temperature
- Max tokens
- Top K
- Top P
- Frequency penalty
- Presence penalty
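To make the schema mapping more concrete, here is a minimal sketch of how slash-separated, dpath-style paths could place the prompt and optional parameters in a request payload and pull the output from a response. Everything in it is an assumption for illustration: the helper functions, the `schema` keys, the path values, and the response shape are hypothetical and depend on how your own endpoint is built. It does not show Galileo’s internal implementation.

```python
from typing import Any, Dict


def set_path(payload: Dict[str, Any], path: str, value: Any) -> None:
    """Set `value` at a slash-separated path, creating nested dicts as needed."""
    keys = [k for k in path.split("/") if k]
    node = payload
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def get_path(obj: Any, path: str) -> Any:
    """Read the value at a slash-separated path; segments index into lists when the node is a list."""
    node = obj
    for key in (k for k in path.split("/") if k):
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node


# Hypothetical mapping for one endpoint: field name -> path in the payload or response.
schema = {
    "prompt": "inputs",
    "response": "0/generated_text",
    "temperature": "parameters/temperature",
    "max_tokens": "parameters/max_new_tokens",
}


def build_payload(prompt: str, **optional: Any) -> Dict[str, Any]:
    """Build a request payload; optional parameters are included only if set and mapped."""
    payload: Dict[str, Any] = {}
    set_path(payload, schema["prompt"], prompt)
    for name, value in optional.items():
        if value is not None and name in schema:
            set_path(payload, schema[name], value)
    return payload


payload = build_payload("Hello", temperature=0.2, max_tokens=None)

# Assumed response shape, for illustration only.
fake_response = [{"generated_text": "Hi there"}]
output_text = get_path(fake_response, schema["response"])

print(payload)
print(output_text)
```

Here `max_tokens` is left unset, so it is omitted from the payload, mirroring how optional parameters are included only when set.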
Check out this video for step-by-step instructions.
Other Custom Models
If you are prompting via Langchain, Galileo can use custom models through Langchain the same way you might use OpenAI in Langchain. Check out ‘Using Prompt with Chains or multi-step workflows’ for more details on how to integrate Galileo into your Langchain application.
To prompt your custom models through the Galileo UI, they need to be hosted on AWS Sagemaker (see above).