Glossary

AI Guardrails

Technical safeguards such as input/output filtering, rate limiting and prompt injection protection that keep AI systems within safe operating boundaries.

A

AI guardrails are the safety mechanisms that prevent your AI system from producing harmful, biased or off-topic outputs. For startups deploying LLM-based products, guardrails are essential — a single viral screenshot of your chatbot saying something inappropriate can cause serious reputational damage.

How to implement this:

  • Input filtering: Validate and sanitise user prompts before they reach your model. Block known prompt injection patterns and set maximum input lengths.
  • Output filtering: Screen AI responses for harmful content, PII leakage and off-topic answers before showing them to users. Use a classifier or a second LLM call as a safety layer.
  • Rate limiting: Cap requests per user to prevent abuse and control costs. Start with conservative limits and adjust based on real usage.
  • System prompts: Define clear behavioural boundaries in your system prompt — what the AI should and should not do, and how it should handle edge cases.
  • Monitoring: Log all interactions (respecting privacy) and set up alerts for anomalous patterns such as repeated jailbreak attempts or unusual output lengths.

The OWASP Top 10 for LLM Applications is a practical checklist of the most common vulnerabilities. Tidal Control helps you document your guardrails as controls and track their effectiveness over time.
