Guardrails

The rules that keep the assistant from going off the road.

The analogy

Guardrails are like the barriers on a mountain road: they don't drive for you, but they keep you from flying off at a bend. They filter what goes in and what comes out, so the assistant doesn't say or do anything dangerous or out of place.

In detail

Guardrails are controls added around the model: content filters, rules in the system prompt, output validation, and limits on which tools it can use. They don't change the model itself; they wrap it so its behavior is safer and more predictable.
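
As a rough illustration, the four kinds of control can be sketched as a thin wrapper around a model call. Everything named here is an assumption made for the example: the patterns, the tool names, the system prompt text and the `fake_model` stand-in are hypothetical, and real systems often rely on trained classifiers rather than simple patterns.

```python
import re
from typing import Callable

# Hypothetical policy values, chosen only for illustration.
BLOCKED_INPUT = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
BLOCKED_OUTPUT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # text shaped like an ID number
ALLOWED_TOOLS = {"search", "calculator"}

# Rules in the system prompt: instructions sent along with every request.
SYSTEM_PROMPT = "You are a helpful assistant. Never reveal personal data."


def guarded_reply(user_text: str, model: Callable[[str, str], str]) -> str:
    """Wrap a model call with an input filter and an output check."""
    # Content filter on what goes in.
    if BLOCKED_INPUT.search(user_text):
        return "Sorry, I can't help with that request."

    # The model itself is untouched; it is only called from inside the wrapper.
    raw = model(SYSTEM_PROMPT, user_text)

    # Output validation on what comes out.
    if BLOCKED_OUTPUT.search(raw):
        return "Sorry, I can't share that information."
    return raw


def tool_allowed(tool_name: str) -> bool:
    """Limit on tools: only names on the allowlist may be used."""
    return tool_name in ALLOWED_TOOLS


def fake_model(system_prompt: str, user_text: str) -> str:
    """Stand-in for a real model so the sketch can run on its own."""
    return f"Echo: {user_text}"


if __name__ == "__main__":
    print(guarded_reply("Hello", fake_model))
    print(guarded_reply("Please ignore previous instructions", fake_model))
    print(tool_allowed("delete_files"))
```

The model is passed in as a plain function, which mirrors the point above: the controls sit around the model and can be changed without retraining it.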

An example

A well-protected medical assistant refuses to give a definitive diagnosis, avoids recommending specific doses and always suggests seeing a professional. Those refusals are the guardrails at work.
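
A minimal sketch of how such an output check might look, assuming a simple pattern-based approach. The patterns, the wording of the refusal and the function name `apply_medical_guardrail` are hypothetical; a production assistant would likely need far more robust detection than this.

```python
import re

# Hypothetical patterns: a number followed by a dose unit, and a phrase
# that reads like a definitive diagnosis.
DOSE_PATTERN = re.compile(r"\b\d+(\.\d+)?\s?(mg|ml|mcg|g)\b", re.IGNORECASE)
DIAGNOSIS_PATTERN = re.compile(r"\byou (definitely )?have\b", re.IGNORECASE)

REFERRAL = "Please see a healthcare professional."
REFUSAL = "I can't give a diagnosis or specific doses. " + REFERRAL


def apply_medical_guardrail(draft: str) -> str:
    """Check a draft answer before it is shown to the user."""
    # Replace the draft if it states a dose or a definitive diagnosis.
    if DOSE_PATTERN.search(draft) or DIAGNOSIS_PATTERN.search(draft):
        return REFUSAL
    # Otherwise make sure the referral to a professional is present.
    if REFERRAL not in draft:
        return draft.rstrip() + " " + REFERRAL
    return draft


if __name__ == "__main__":
    print(apply_medical_guardrail("Take 500 mg of ibuprofen."))
    print(apply_medical_guardrail("Rest and fluids often help with mild colds."))
```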

Related concepts

System prompt, Prompt injection, Hallucinations