Defining Rules
A Rule is a condition you never want your application to break. It is composed of three ingredients:
- A metric
- An operator
- A target value
Your Rules should evaluate to False for the base case, and to True for unwanted scenarios.
In the example above, the requirement that "the input/output shall never contain PII" is encoded into a Rule like the one below:
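A minimal sketch of that Rule, assuming the SDK is imported as `gp` (e.g. `import galileo_protect as gp`) and exposes a `Rule` constructor that takes the three ingredients as keyword arguments; the exact constructor shape is an assumption based on the constants used throughout this page:

```python
import galileo_protect as gp  # assumed import alias, matching the gp.* constants below

# "The output shall never contain PII": evaluates to True (unwanted scenario)
# whenever any of these PII categories is detected, and False otherwise.
pii_rule = gp.Rule(
    metric=gp.RuleMetrics.pii,                       # the metric
    operator=gp.RuleOperator.any,                    # the operator
    target_value=["ssn", "email", "phone_number"],   # the target value
)
```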
Metrics and Operators supported
We support several metrics within Protect rules. Because each metric can produce different types of output values (e.g. float scores, categorical labels), the supported operators and target values differ by metric. Below is a list of all supported metrics and their available configurations:
Prompt Injection
Used to detect and stop prompt injections in the input.
Metric Constant: `gp.RuleMetrics.prompt_injection`
Payload Field: `input`
Potential Categories:
- impersonation
- obfuscation
- simple_instruction
- few_shot
- new_context
Operators and Target Value Supported:
| Operator | Target Value |
| --- | --- |
| Any (`gp.RuleOperator.any`) | A list of categories (e.g. ["obfuscation", "impersonation"]) |
| All (`gp.RuleOperator.all`) | A list of categories (e.g. ["obfuscation", "impersonation"]) |
| Contains (`gp.RuleOperator.contains`) | A single category (e.g. "impersonation") |
| Equal (`gp.RuleOperator.eq`) | A single category (e.g. "impersonation") |
| Not equal (`gp.RuleOperator.neq`) | A single category (e.g. "impersonation") |
| Empty (`gp.RuleOperator.empty`) | - |
| Not Empty (`gp.RuleOperator.not_empty`) | - |
Example:
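A sketch reusing the assumed `gp.Rule` constructor from above, flagging inputs classified into either of two injection categories (the chosen categories are illustrative):

```python
import galileo_protect as gp  # assumed import alias

# Triggers when the input falls into any of the listed prompt injection categories.
prompt_injection_rule = gp.Rule(
    metric=gp.RuleMetrics.prompt_injection,
    operator=gp.RuleOperator.any,
    target_value=["impersonation", "new_context"],
)
```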
PII (Personally Identifiable Information)
Used to detect and stop Personally Identifiable Information (PII). When applied to the input, it can be used to stop user or company PII from being included in API calls to external services. When applied to the output, it can be used to prevent data leakage or PII from being shown back to the user.
Metric Constants:
- `gp.RuleMetrics.pii` for output PII
- `gp.RuleMetrics.input_pii` for input PII
Payload Field: `input` (for input PII) or `output` (for output PII)
Potential Categories:
- address
- date
- email
- financial_info
- name
- phone_number
- ssn
- username_password
Operators and Target Value Supported:
| Operator | Target Value |
| --- | --- |
| Any (`gp.RuleOperator.any`) | A list of categories (e.g. ["ssn", "address"]) |
| All (`gp.RuleOperator.all`) | A list of categories (e.g. ["ssn", "address"]) |
| Contains (`gp.RuleOperator.contains`) | A single category (e.g. "ssn") |
| Equal (`gp.RuleOperator.eq`) | A single category (e.g. "ssn") |
| Not equal (`gp.RuleOperator.neq`) | A single category (e.g. "ssn") |
| Empty (`gp.RuleOperator.empty`) | - |
| Not Empty (`gp.RuleOperator.not_empty`) | - |
Example:
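A sketch using the same assumed `gp.Rule` constructor, showing one input-side and one output-side PII Rule; the chosen categories are illustrative:

```python
import galileo_protect as gp  # assumed import alias

# Input-side PII: flag requests whose input contains an SSN or financial information.
input_pii_rule = gp.Rule(
    metric=gp.RuleMetrics.input_pii,
    operator=gp.RuleOperator.any,
    target_value=["ssn", "financial_info"],
)

# Output-side PII: flag responses that contain an SSN.
output_pii_rule = gp.Rule(
    metric=gp.RuleMetrics.pii,
    operator=gp.RuleOperator.contains,
    target_value="ssn",
)
```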
Context Adherence
Measures whether your model’s response was purely based on the context provided. It can be used to stop hallucinations from reaching your end users. Powered by Context Adherence Luna.
Metric Constant: `gp.RuleMetrics.context_adherence_luna`
Payload Field: Both `input` and `output` must be included in the payload.
Potential Values: 0.00 to 1.00.
Generally, we see 0.1 as a good threshold below which we’re confident the response is not adhering to the context.
Operators Supported:
- Greater than (`gp.RuleOperator.gt`)
- Less than (`gp.RuleOperator.lt`)
- Greater than or equal (`gp.RuleOperator.gte`)
- Less than or equal (`gp.RuleOperator.lte`)
Example:
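A sketch with the same assumed constructor, flagging responses whose Context Adherence score falls below the 0.1 threshold suggested above:

```python
import galileo_protect as gp  # assumed import alias

# Both "input" and "output" must be present in the payload for this metric.
adherence_rule = gp.Rule(
    metric=gp.RuleMetrics.context_adherence_luna,
    operator=gp.RuleOperator.lt,  # score below the threshold -> likely not grounded in the context
    target_value=0.1,
)
```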
Toxicity
Used to detect and stop toxic or foul language in the input (user query) or output (response shown to the user).
Metric Constants:
- `gp.RuleMetrics.toxicity` for output Toxicity
- `gp.RuleMetrics.input_toxicity` for input Toxicity
Payload Field: `input` or `output`
Potential Values: 0.00 to 1.00 (higher values indicate higher toxicity)
Operators Supported:
- Greater than (`gp.RuleOperator.gt`)
- Less than (`gp.RuleOperator.lt`)
- Greater than or equal (`gp.RuleOperator.gte`)
- Less than or equal (`gp.RuleOperator.lte`)
Example:
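A sketch with the same assumed constructor; the 0.8 threshold is purely illustrative:

```python
import galileo_protect as gp  # assumed import alias

# Flag responses whose toxicity score exceeds 0.8.
toxicity_rule = gp.Rule(
    metric=gp.RuleMetrics.toxicity,
    operator=gp.RuleOperator.gt,
    target_value=0.8,
)

# The same idea applied to the user input.
input_toxicity_rule = gp.Rule(
    metric=gp.RuleMetrics.input_toxicity,
    operator=gp.RuleOperator.gt,
    target_value=0.8,
)
```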
Sexism
Used to detect sexist or biased language. When applied to the input, it can be used to detect sexist remarks in user queries. When applied to the output, it can be used to prevent your application from making biased or sexist comments in its responses.
Metric Constants:
- `gp.RuleMetrics.sexist` for output Sexism
- `gp.RuleMetrics.input_sexist` for input Sexism
Payload Field: `input` or `output`
Potential Values: 0.00 to 1.00 (higher values indicate more sexist content)
Operators Supported:
- Greater than (`gp.RuleOperator.gt`)
- Less than (`gp.RuleOperator.lt`)
- Greater than or equal (`gp.RuleOperator.gte`)
- Less than or equal (`gp.RuleOperator.lte`)
Example:
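A sketch with the same assumed constructor; the 0.8 threshold is illustrative:

```python
import galileo_protect as gp  # assumed import alias

# Flag responses whose sexism score exceeds 0.8.
sexism_rule = gp.Rule(
    metric=gp.RuleMetrics.sexist,
    operator=gp.RuleOperator.gt,
    target_value=0.8,
)
```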
Tone
Detects the primary tone of the text. When applied to the input, it can be used to detect negative tones in user queries. When applied to the output, it can be used to prevent your application from using an undesired tone in its responses.
Metric Constants:
- `gp.RuleMetrics.tone` for output Tone
- `gp.RuleMetrics.input_tone` for input Tone
Payload Field: `input` (for input Tone) or `output` (for output Tone)
Potential Categories:
- anger
- annoyance
- confusion
- fear
- joy
- love
- sadness
- surprise
- neutral
Operators and Target Value Supported:
| Operator | Target Value |
| --- | --- |
| Equal (`gp.RuleOperator.eq`) | A single category (e.g. "anger") |
| Not equal (`gp.RuleOperator.neq`) | A single category (e.g. "neutral") |
Example:
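A sketch with the same assumed constructor, flagging responses whose primary tone is anger:

```python
import galileo_protect as gp  # assumed import alias

# Flag responses whose detected primary tone is "anger".
tone_rule = gp.Rule(
    metric=gp.RuleMetrics.tone,
    operator=gp.RuleOperator.eq,
    target_value="anger",
)
```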
Registered Scorers
If you have a registered scorer, it can also be used in your Galileo Protect rulesets.
Example:
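A sketch under the assumption that a registered scorer is referenced by its registered name and returns a numeric score; the scorer name and the use of a plain string as the metric are assumptions, not confirmed API:

```python
import galileo_protect as gp  # assumed import alias

# Hypothetical registered scorer "my_custom_scorer" that returns a float in [0, 1].
custom_scorer_rule = gp.Rule(
    metric="my_custom_scorer",    # hypothetical scorer name
    operator=gp.RuleOperator.gt,
    target_value=0.5,             # illustrative threshold
)
```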
The operators and target values here should match the type of data that the registered scorer is expected to produce.