Guardrail API — Frequently Asked Questions

This page explains Guardrail in plain language. It's written for people who need to make decisions about AI safety and risk, even if they don't live in code every day: executives, security leaders, product owners, and advisors.

For deeper technical documentation, see the documentation page, or use the enterprise contact if you need to talk through a specific environment.

For non-technical decision makers

What does the Guardrail API actually do?

Guardrail sits between your applications and your AI models. Every time an app asks a model a question, the request goes through Guardrail first. Guardrail checks the request against safety and policy rules and only lets safe, compliant traffic through. If something looks risky, Guardrail can block the request or ask for clarification before the model ever sees it.

Is this a firewall?

It's a firewall specifically for AI language models. Traditional firewalls look at network traffic like IP addresses and ports. Guardrail looks at the actual language going into and coming out of your models. It's focused on things like prompt injection, data leaks, policy violations, and abuse of tools or agents.

Does Guardrail replace our AI models or vendors?

No. Guardrail doesn't replace your models. You keep using OpenAI, Azure, Google, Anthropic, or your own internal models. Guardrail is a safety and governance layer that sits in front of them. It standardizes how you apply rules, how you log decisions, and how you see what's happening across all of your models.

Will this slow everything down?

Guardrail is designed to run in real time, in-band with your traffic. For normal, clearly safe requests, the overhead is small, similar to putting an API gateway or authentication layer in front of a service. When something is ambiguous, Guardrail can take longer because it may need to ask a verifier to review the intent. That is by design: suspicious traffic should get more attention than routine requests.

What kinds of problems does this actually prevent?

  • Prompt injection: users or external content trying to trick the model into ignoring your instructions.
  • Data exfiltration: prompts that try to make the model reveal sensitive or internal data.
  • Policy violations: requests or responses that would violate GDPR, HIPAA, internal usage policies, or sector-specific rules.
  • Unsafe tool use: attempts to get an AI agent to run dangerous actions by hiding instructions in text, files, or web content.

We already have security tools. Why do we need this?

Existing tools like firewalls, WAFs, and endpoint protection don't understand the content of AI prompts and responses. They see network traffic, not intent. Guardrail fills that gap. It's designed to work alongside your existing controls and plug into your logging and SIEM stack, not to replace them.

Is this meant for personal projects or only large enterprises?

Guardrail is designed to scale from a single developer to large, multi-team organizations. You can start by protecting a personal project or pilot application, then grow into an enterprise deployment without changing the core engine. The difference at larger scales is in how you deploy it: dedicated tenants, higher throughput, SIEM integration, and cloud-specific integration guides for platforms like AWS, Azure, and Google Cloud.

For security, risk, and compliance leaders

What do you mean by 'clarify-first' decisions?

Clarify-first means we don't guess about risky intent. If a prompt could be harmless or harmful depending on context, Guardrail treats it as unclear. Instead of silently blocking or allowing it, Guardrail asks for clarification or routes it to a verifier for intent assessment. Only once the intent is clear does the request move forward or get blocked.

How does Guardrail help with regulations like GDPR, HIPAA, or the EU AI Act?

Guardrail uses policy packs (structured rule sets) to encode regulatory and internal requirements. These packs can express rules like "don't output unredacted PII," "don't route this data outside a region," or "don't answer this class of questions for this user group." Over time, those packs can evolve as regulations change, without you having to re-implement the logic in every application.
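
As an illustration only, a single rule in a policy pack might be expressed as structured data along these lines. The field names below are hypothetical sketches, not Guardrail's actual schema:

    # Hypothetical sketch of one policy-pack rule; the fields are illustrative,
    # not Guardrail's documented schema.
    pii_egress_rule = {
        "id": "gdpr-pii-egress-001",
        "applies_to": "egress",                # check model output, not input
        "description": "Do not return unredacted PII to end users",
        "detectors": ["email_address", "national_id", "phone_number"],
        "action": "redact",                    # could also be "block" or "clarify"
        "scope": {"user_groups": ["external"], "regions": ["eu"]},
    }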

What is the 'dual-arm' model you mention?

Guardrail treats traffic into the model (ingress) and traffic coming out of the model (egress) as two separate arms. Each arm can enforce its own rules and remain effective even if the other arm has a problem. That means an issue in output filtering does not silently disable input checks, and the other way around. It is about reducing single points of failure in your AI safety layer.

What does the verifier do, and does it run untrusted code?

The verifier is a separate service that uses LLMs to assess intent and risk. It never executes untrusted code or tools. When the core runtime is unsure whether a request is safe, it sends a description of the situation to the verifier. The verifier's only job is to answer questions like "Is running this request likely to be harmful or violate policy?" not to carry out the request itself.

What kind of audit trail does Guardrail provide?

Every decision can be tagged with IDs, reasons, and headers that your logging and observability stack can ingest. That means you can answer questions like "Why was this request blocked?" or "Who changed this policy and when?" without digging through raw model logs. The goal is to make AI behavior explainable to auditors and risk teams in a way that fits existing processes.
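
For example, a single decision record forwarded to your logging stack might look roughly like this. The field names are illustrative, not Guardrail's documented log schema:

    # Hypothetical per-decision audit record; field names are illustrative.
    audit_record = {
        "decision_id": "dec_8f3a12",
        "direction": "ingress",                    # or "egress"
        "outcome": "blocked",                      # allowed / blocked / clarification_requested
        "reason": "policy:gdpr-pii-egress-001",    # which rule drove the decision
        "tenant": "acme-prod",
        "policy_pack_version": "2025-03-14",
        "timestamp": "2025-03-14T10:22:31Z",
    }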

Does Guardrail integrate with our SIEM or logging tools?

Yes. Guardrail Core and the Guardrail Enterprise Console generate structured, SIEM-ready audit logs for every decision, including ingress checks, egress checks, and incident events. These logs are available in current Guardrail Core (1.6.0 and later) and Enterprise Console (1.5.0 and later) releases.

Today, those logs can be exported from the Admin Console or via API and then forwarded into your existing observability pipeline, whether that's Splunk, Elastic, Datadog, CloudWatch, Azure Monitor, GCP Logging, or another platform that accepts JSON or NDJSON.
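
As a rough sketch of that export-and-forward step, the snippet below reads an NDJSON export and pushes each record to an HTTP collector. The file name and collector endpoint are placeholders for your own environment, not documented Guardrail values:

    # Minimal sketch: read an NDJSON audit export and forward each record to an
    # HTTP log collector. File name and endpoint are placeholders.
    import json
    import urllib.request

    COLLECTOR_URL = "https://logs.example.internal/ingest"

    with open("guardrail-audit-export.ndjson", encoding="utf-8") as export:
        for line in export:
            record = json.loads(line)
            request = urllib.request.Request(
                COLLECTOR_URL,
                data=json.dumps(record).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            urllib.request.urlopen(request)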

Native streaming integrations, where Guardrail automatically sends logs directly to services like CloudWatch, Splunk, Elastic, or Datadog without a manual export step, are Enterprise features currently under development and are not enabled yet. The roadmap is to let Guardrail plug cleanly into the monitoring and incident-response processes you already have, instead of creating a separate island of telemetry.

For engineers and architects

How does Guardrail fit into my architecture?

Guardrail sits between your applications and your model providers. Your app sends its prompt to Guardrail instead of calling an LLM endpoint (OpenAI, Azure, Claude, or an internal model) directly. Guardrail checks the request, applies policy, and only then forwards safe, policy-aligned traffic to the model. The response comes back through Guardrail so it can apply egress rules before your app sees it.

From an engineering view, this is a small change in where your code sends requests: Guardrail behaves like an API gateway focused on safety, compliance, and logging rather than just routing. It does not require you to redesign your system, change your models, or adopt new frameworks.

How does Guardrail work with LangChain, LlamaIndex, or other LLM tools?

Guardrail exposes an OpenAI-compatible API, which means most LLM tools already know how to talk to it. In LangChain, LlamaIndex, and similar frameworks, you point the OpenAI client to your Guardrail URL instead of the vendor URL. The rest of your code stays the same.

These tools treat Guardrail as the model endpoint. Guardrail checks the request, forwards it to the real model, checks the response, and returns a standard completion or chat response. You do not need a special SDK or wrapper.
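
For example, with the langchain-openai package the change is typically just the base URL and key. The Guardrail URL, key, and model name below are placeholders for your own deployment:

    # Sketch: point LangChain's OpenAI-compatible chat client at Guardrail
    # instead of the vendor endpoint. URL, key, and model are placeholders.
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        model="gpt-4o-mini",
        base_url="https://api.guardrailapi.com/v1",  # your Guardrail endpoint
        api_key="YOUR_GUARDRAIL_KEY",                # issued for Guardrail, not the vendor
    )

    print(llm.invoke("Summarize our data retention policy in two sentences.").content)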

What does integration look like in code?

Integration is usually a small change to where your app points its chat or completion requests. Instead of hitting api.provider.com, you hit api.guardrailapi.com (or your internal deployment). You include your Guardrail tenant identifiers and any metadata needed for policy decisions. The rest of your stack (model selection, prompts, tools) can be introduced progressively.
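
As a concrete sketch using the official OpenAI Python SDK, the URL, key, tenant header name, and model below are illustrative assumptions rather than documented Guardrail values:

    # Sketch: the only change from a normal OpenAI integration is where the
    # client points and any Guardrail-specific headers. Header and key names
    # here are illustrative, not documented values.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.guardrailapi.com/v1",            # or your internal deployment
        api_key="YOUR_GUARDRAIL_KEY",
        default_headers={"X-Guardrail-Tenant": "acme-prod"},   # hypothetical tenant header
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Draft a short status update for the team."}],
    )
    print(response.choices[0].message.content)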

Can it support multiple model providers?

Yes. Guardrail is designed to front multiple providers (for example, OpenAI, Azure, Google, Anthropic, or internal models). It's common to use Guardrail as a single enforcement and observability layer while the underlying models evolve over time.

Are you planning integrations for AWS Bedrock, Azure OpenAI, or other cloud providers?

Guardrail API already works with all major LLM platforms because the Core runtime is fully provider agnostic. You route prompts and model outputs through Guardrail, and the evaluation flow works the same whether you use OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Google Vertex AI, or another provider.

For enterprise teams, we provide Integration Packages that act as cloud-specific deployment guides. Each package includes ready-to-use examples, configuration templates, and notes on IAM, networking, and recommended deployment patterns for platforms like Amazon Bedrock, Azure OpenAI, Google Vertex AI, and the OpenAI API.

These packages do not change how the engine works. They remove friction so you can drop Guardrail into an existing cloud workflow, with a reference architecture you can adapt rather than starting from scratch. Additional providers can be added over time based on customer demand.
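
The sketch below is only meant to show the provider-agnostic idea: the Guardrail-mediated call stays identical, and only a configured model identifier differs per provider. How a given deployment actually names or routes to providers is configuration-specific, so the identifiers are placeholders:

    # Sketch of the provider-agnostic idea: the Guardrail-mediated call is the
    # same for every provider; only a configured model identifier changes.
    # Identifiers are placeholders, not a documented routing scheme.
    MODEL_BY_PROVIDER = {
        "openai": "gpt-4o-mini",
        "azure_openai": "my-gpt4o-deployment",
        "bedrock": "anthropic.claude-3-5-sonnet",
    }

    def ask(client, provider, prompt):
        """Send the same Guardrail-mediated chat request regardless of provider."""
        response = client.chat.completions.create(
            model=MODEL_BY_PROVIDER[provider],
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content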

What happens if Guardrail is unavailable?

In high-stakes environments, you typically run Guardrail as part of your core platform, with the same availability and redundancy expectations as your gateways and identity providers. Exact failure behavior (fail-open versus fail-closed) is a design choice for each deployment and is something we expect to discuss explicitly during architecture reviews.
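
As a minimal sketch of what a fail-closed choice can look like at the application edge (names and endpoints are placeholders, and many deployments enforce this at the gateway or infrastructure layer instead):

    # Minimal sketch of fail-closed behavior at the call site: if Guardrail is
    # unreachable, refuse the request rather than calling the provider directly.
    # Names and endpoints are placeholders.
    from openai import OpenAI, APIConnectionError

    guardrail = OpenAI(
        base_url="https://api.guardrailapi.com/v1",
        api_key="YOUR_GUARDRAIL_KEY",
    )

    def guarded_chat(prompt):
        try:
            response = guardrail.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except APIConnectionError:
            # Fail closed: no direct fallback to the model provider.
            raise RuntimeError("AI requests are paused while the safety layer is unreachable")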

Where can I find more technical detail?

The high-level concepts, pseudo-code examples, and ecosystem overview live on the documentation page. For enterprise deployments, we expect to provide deeper architecture guides, threat models, and reference configurations under NDA as the product matures.

Billing, Subscriptions, and Refunds

How are subscriptions billed?

Guardrail uses Stripe as our payment processor. When you start a paid plan, Stripe creates a subscription in your name and charges the card you provide. Charges appear on your statement with a clear Guardrail Labs or Stripe reference, and you receive receipts or invoices by email for your records.

What is our refund policy?

Hi, Dr. Wes here. My refund policy is very simple: if you decide you don't want it, cancel it. When you sign up under the paid tiers through Stripe, you'll receive a license code for your package and a confirmation email with links you can use to cancel at any time. Canceling turns off your subscription in Stripe.

Once that's done, I issue the refund manually, prorated for the remaining time depending on when you canceled. I think that's fair. If you use it for a week and decide it isn't right for you, that week's charges cover the processing fees I get hit with for setting all of this up.

And honestly, I never wanted to do all this. But it's been an eye-opening return to web and e-commerce development, something I hadn't touched in a few years. I'm a mad scientist. I want to develop cost-effective ways to limit the damage AI development has the potential to cause to our world and species, while also making AI safe enough to use so we can actually move forward and learn from the massive piles of data humanity has accumulated over the last twenty years.

I hate the business side of all this, and it actually feels good to say that out loud. You'd be correct to guess I used AI heavily. It's how I built in six months what would otherwise take teams of people a couple of years. But this section right here is just me typing.

What I'm hoping will happen is that most folks use the free, open-source version and join the Discord so I can build a community. This site represents the real work I did in Git developing the verifier, which I patented entirely to keep greedy buzzards from stealing it.

Tesla is a hero of mine, but if we learn from our heroes, what I learned from Tesla is to watch out when the bankers show up. What I learned from the Joker is that if you're good at something, never do it for free. The free version is the working core, and most people will be able to tell there whether this is useful to them. Each version above that adds specific capabilities based on the size and complexity of the model or agent system you need to monitor.

So what does this have to do with refunds? I'm hoping that by the time people reach the paid tiers, they already know they have a real use for this, and refunds aren't an issue. I also want serious users who get this far to know that I'm actively here, and that what I've built is a new way of approaching security for AI.

I've already done the theoretical legwork, and I plan to scale this to protect AI browsers and operating systems as well. People hate AI, often for good reason, but the biggest problem with AI right now is that small and mid-sized organizations are rightly afraid of creative deployments. The regulations emerging in places like Europe and California are tailored for the big players burning billions who can absorb the impact. Smaller organizations can't, which means the most valuable uses of AI haven't even been discovered yet.

Guardrail API is built to fill that gap. This isn't another gimmick designed to cash in on AI fear or hyper-spending. And yes, I'll sell out in an instant if the right offer lets me keep doing this work in a bigger sandbox.

Bottom line: refunds are prorated from activation to cancellation date and are issued manually after you cancel your subscription in Stripe using the links provided in your welcome email.

How long do refunds take to reach my card or bank account?

When we approve a refund, Stripe processes it right away on our side. Your bank is responsible for posting the credit back to your card or account. Most banks show the refund within three to five business days, but some can take up to ten business days. That timing is controlled by your bank, not by Guardrail Labs or Stripe.

Why do I still see the original charge after I received a refund confirmation?

It is common for the original charge to stay visible on your statement until your bank finishes posting the refund. In many cases, the bank will keep the original charge line and add a separate credit line for the refund. If you have a refund confirmation from Stripe or from Guardrail Labs and do not see the credit after ten business days, your bank is the best place to ask for an update on posting times.

Does Guardrail Labs ever pull money back from my account after a refund?

No. Once a refund is issued, we do not attempt to pull funds back from your bank or card. Stripe sends the refund credit to your bank and then adjusts our Stripe balance and future payouts. You will never see a second debit from us tied to the same refund.

Who can I contact if something on my bill does not look right?

If you ever see a charge or refund that you do not recognize, please reach out to us directly so we can review it with you. You can contact Guardrail Labs at billing@guardrailapi.com. We would rather clear up a small question early than have anyone worry that something is wrong with their account.