Prompt Injection Prevention

Tier 1 SECURE

What This Requires

Implement technical controls to prevent prompt injection: input validation, context isolation, privilege separation, and output filtering. Test defenses with adversarial prompts.

Why It Matters

Prompt injection is the AI equivalent of SQL injection. Attackers manipulate prompts to bypass restrictions, leak data, or gain unauthorized access. Defense-in-depth is essential.

How To Implement

Input Validation

Block known injection patterns: "Ignore previous instructions", "You are now DAN", excessive newlines. Use regex or ML-based classifier to detect attacks.

Context Isolation

Separate system instructions from user input using delimiters (e.g., XML tags, triple quotes). Use LLM features like OpenAI's message roles (system vs. user).

Privilege Separation

Limit agent permissions. If agent only needs read access, don't grant write. Enumerate allowed tools in system prompt.

Output Filtering

Monitor outputs for sensitive patterns (SSN, API keys). If detected, block response and alert security team. Use regex or DLP library.

Evidence & Audit

  • Input validation rules (regex, ML classifier config)
  • Sample prompts showing context isolation implementation
  • Agent privilege configuration (RBAC, tool allowlists)
  • Output filtering rules and test results
  • Adversarial testing reports

Related Controls