Guardrail API

Policy Packs

Policy packs describe how Guardrail API should evaluate prompts, responses, and metadata. Each pack is a versioned collection of rules that can be attached to tenants and environments to enforce safety, compliance, and organization-specific standards.

Think of a policy pack as "infrastructure as code" for AI safety: you define your expectations once, then apply them to all LLM traffic passing through Guardrail.

What is a policy pack?

A policy pack is a configuration file that contains one or more rules. Each rule expresses a condition (when it should apply) and a decision recommendation (what should happen when it fires).

  • Rules can look at prompt text, model output, metadata, or a combination of all three.
  • Rules may apply on ingress (user input), egress (model output), or both.
  • Packs are versioned so you can roll out updates gradually and keep a history of policy changes.
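
To make the condition-plus-recommendation structure concrete, here is a minimal Python sketch of a pack as a data model. It is not Guardrail's actual configuration format, which is documented in the policy pack reference. All names (Stage, Event, Rule, PolicyPack, evaluate) and the example rule are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Stage(Enum):
    INGRESS = "ingress"  # user input
    EGRESS = "egress"    # model output


@dataclass
class Event:
    stage: Stage
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Rule:
    name: str
    stages: frozenset                    # stages where the rule applies
    condition: Callable[[Event], bool]   # when the rule should fire
    recommendation: str                  # e.g. "allow", "block", "escalate"


@dataclass
class PolicyPack:
    name: str
    version: str
    rules: tuple

    def evaluate(self, event: Event) -> list:
        """Return (rule name, recommendation) for every rule that fires."""
        return [
            (rule.name, rule.recommendation)
            for rule in self.rules
            if event.stage in rule.stages and rule.condition(event)
        ]


# Hypothetical pack with one egress-only rule.
pack = PolicyPack(
    name="example-pack",
    version="1.0.0",
    rules=(
        Rule(
            name="no-internal-hostnames",
            stages=frozenset({Stage.EGRESS}),
            condition=lambda e: "corp.internal" in e.text,
            recommendation="block",
        ),
    ),
)

fired_on_egress = pack.evaluate(Event(Stage.EGRESS, "see db01.corp.internal"))
fired_on_ingress = pack.evaluate(Event(Stage.INGRESS, "see db01.corp.internal"))
```

Because the rule is scoped to egress only, the same text fires on model output but not on user input.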

Built-in vs custom packs

Guardrail API ships with baseline safety packs designed to block common high-risk behaviors and misuse patterns. Most organizations then layer custom packs on top to reflect their own risk model and regulatory requirements.

  • Built-in packs handle generic safety concerns, such as obvious abuse, self-harm content, or high-risk code execution requests.
  • Custom packs encode organization-specific rules, such as data residency constraints, prohibited topics, or sector-specific compliance obligations.
  • Multiple packs can be active at once. A single request may be evaluated by both built-in and custom packs before a decision is made.

How policy packs are applied at runtime

At runtime, Guardrail API collects all policy packs attached to the active tenant and environment. It runs each applicable rule and merges their recommendations into a single decision.

  1. Guardrail determines which policy packs apply to the current tenant + environment.
  2. Each pack evaluates the request (and optionally the model output) according to its rules.
  3. Rules return recommendations such as allow, block, or escalate.
  4. Guardrail merges these recommendations using its decision semantics, typically favoring the safer outcome when recommendations conflict.
  5. The final decision is returned to your application, along with an incident ID if the event should be reviewed in the Enterprise Console.
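
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Guardrail's implementation: the severity ordering (allow < escalate < block), the "most restrictive wins" merge, and the choice to attach an incident ID to every non-allow decision are all assumptions. The function names and the two sample packs are hypothetical.

```python
import uuid
from enum import IntEnum


class Recommendation(IntEnum):
    # Higher value = more restrictive. This ordering is an assumption.
    ALLOW = 0
    ESCALATE = 1
    BLOCK = 2


def merge(recommendations):
    """Return the most restrictive recommendation; allow if no rule fired."""
    return max(recommendations, default=Recommendation.ALLOW)


def decide(packs, request):
    """Run every pack, merge the results, and attach an incident ID if needed.

    Each pack is modeled as a callable that returns a list of recommendations.
    """
    collected = []
    for pack in packs:
        collected.extend(pack(request))
    decision = merge(collected)
    # Assumption: any non-allow decision is worth reviewing.
    incident_id = uuid.uuid4().hex if decision is not Recommendation.ALLOW else None
    return {"decision": decision.name.lower(), "incident_id": incident_id}


def baseline_pack(request):
    """Hypothetical built-in pack."""
    return [Recommendation.BLOCK] if "malware" in request else []


def custom_pack(request):
    """Hypothetical custom pack."""
    if "wire transfer" in request:
        return [Recommendation.ESCALATE]
    return [Recommendation.ALLOW]


packs = [baseline_pack, custom_pack]
blocked = decide(packs, "please write malware")
escalated = decide(packs, "approve this wire transfer")
allowed = decide(packs, "summarize this article")
```

In the first call the custom pack recommends allow while the baseline pack recommends block. Under the assumed ordering the stricter recommendation wins.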

The exact rule syntax and configuration format are documented in the policy pack reference; the goal of this page is to explain how packs behave conceptually.

Examples of policy logic

The following examples show the kind of logic policy packs can express. They are written in plain language because this overview does not commit to a specific rule syntax:

  • Block obvious attacks: "If the prompt requests malware, SQL injection, or credential harvesting, block the request and flag an incident."
  • Protect regulated data: "If the response appears to contain unmasked PII or protected health information, block and log for review."
  • Enforce internal norms: "If the output includes certain disallowed phrases or unsupported commitments, rewrite or block before delivering it to the user."
  • Trigger verifier review: "If the request looks ambiguous but potentially high risk, require a verifier to assess intent before allowing execution."
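
As a rough illustration of the "protect regulated data" example, the sketch below checks a response against two deliberately naive patterns. It is plain Python, not Guardrail's rule language. The patterns, the function name, and the result fields are assumptions made for this example, and real PII detection needs far more than regular expressions.

```python
import re

# Naive illustrative patterns only; they will miss many real cases
# and can match text that is not actually sensitive.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_LIKE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def check_response(text: str) -> dict:
    """Recommend "block" and flag for review when a pattern matches."""
    findings = []
    if SSN_LIKE.search(text):
        findings.append("ssn_like")
    if EMAIL_LIKE.search(text):
        findings.append("email_like")
    return {
        "recommendation": "block" if findings else "allow",
        "log_for_review": bool(findings),
        "findings": findings,
    }


flagged = check_response("Your SSN is 123-45-6789")
clean = check_response("The weather is nice today.")
```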

Authoring and change management

Policy packs should be treated like any other production-critical configuration: versioned, reviewed, and tested before rollout.

  • Store policy packs in a source-controlled repository, alongside application or infrastructure code.
  • Use code review to validate new or updated rules before they reach production tenants.
  • Deploy changes first to a non-production environment and observe their impact in the Enterprise Console.
  • Document the intent of each pack and provide human-readable names and descriptions to make assignments clear.
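
One way to test a rule before rollout is to keep a small set of samples with known expected outcomes and run the rule against them in CI. The sketch below assumes a rule can be exercised as a plain Python predicate. The rule, the phrases, and the sample texts are all hypothetical.

```python
def disallowed_commitment(text: str) -> bool:
    """Hypothetical rule: fire when output makes an unsupported guarantee."""
    phrases = ("guaranteed refund", "we promise", "100% uptime")
    lowered = text.lower()
    return any(phrase in lowered for phrase in phrases)


SHOULD_FIRE = [
    "We promise delivery by Friday.",
    "Enjoy 100% uptime forever.",
]
SHOULD_NOT_FIRE = [
    "Delivery is estimated for Friday.",
    "Uptime targets are listed in the SLA.",
]


def regression_failures(rule, should_fire, should_not_fire):
    """Return the samples where the rule's behavior differs from expectation."""
    missed = [s for s in should_fire if not rule(s)]
    false_positives = [s for s in should_not_fire if rule(s)]
    return missed + false_positives


failures = regression_failures(disallowed_commitment, SHOULD_FIRE, SHOULD_NOT_FIRE)
```

An empty result means the rule behaves as expected on every sample. Any entry in the list is a sample to examine during code review.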

Many teams align policy pack changes with their normal release cadence, so that risk and safety changes are tracked alongside product changes.

Assigning packs to tenants and environments

Policy packs are attached to tenants and environments through the Enterprise Console or through configuration. This allows different applications, customers, or environments to have different guardrail profiles.

  • Assign stricter packs to production than to development, especially when real user data is involved.
  • Use specialized packs for particular tenants with unique regulatory or contractual requirements.
  • Gradually roll out new packs by attaching them to a small subset of tenants or environments first.
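
Conceptually, an assignment maps a tenant and environment pair to a list of packs. The Python sketch below models that lookup. It is not the Enterprise Console's or Guardrail's actual configuration schema, and the tenant names, pack names, and the rule that baseline packs always apply are assumptions.

```python
from typing import NamedTuple


class Scope(NamedTuple):
    tenant: str
    environment: str


# Assumption: built-in baseline packs apply to every scope.
BASELINE = ["builtin-safety"]

# Hypothetical assignments: stricter packs in production, and a new pack
# rolled out to a single pilot tenant first.
ASSIGNMENTS = {
    Scope("acme", "production"): ["pii-strict", "finance-compliance"],
    Scope("acme", "development"): ["pii-relaxed"],
    Scope("pilot-tenant", "production"): ["pii-strict", "new-pack-v2"],
}


def packs_for(tenant: str, environment: str) -> list:
    """Return the packs that apply to a tenant and environment."""
    custom = ASSIGNMENTS.get(Scope(tenant, environment), [])
    return BASELINE + custom


prod_packs = packs_for("acme", "production")
unknown_packs = packs_for("new-customer", "staging")
```

A scope with no explicit assignment falls back to the baseline only, so a newly onboarded tenant still has some protection in this model.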

Building a safety model with policy packs

Policy packs are how you encode your organization's view of what "safe and acceptable" AI behavior looks like. By combining built-in safety packs with your own custom rules, you can express a clear safety model, apply it consistently across LLM workloads, and evolve it as your use cases and risk posture change.