AI Incident Response Plan
Purpose
This template provides an AI-specific incident response plan covering detection, triage, containment, and post-incident review procedures.
1. Purpose
Define the plan's scope and activation criteria.
This plan establishes procedures for detecting, responding to, and recovering from incidents involving AI systems at [ORGANIZATION NAME]. It supplements the general IT incident response plan with AI-specific procedures.
Plan Owner: [ROLE TITLE], [DEPARTMENT]
Effective Date: [DATE]
Activation Criteria: Any event that disrupts AI system availability, compromises AI system integrity, exposes data through AI channels, or causes AI systems to produce harmful outputs.
2. Incident Type Definitions
Classify AI-specific incident types with severity levels.
| Incident Type | Category | Default Severity | Examples |
|---|---|---|---|
| Prompt Injection Attack | Security | High | System prompt extracted, unauthorized actions executed |
| Data Leakage via AI | Privacy | Critical | PII exposed in model outputs, training data extracted |
| Model Drift/Degradation | Quality | Medium | Accuracy drops below threshold, biased outputs detected |
| AI Service Outage | Availability | High | Model endpoint unresponsive, API provider outage |
| Harmful Output | Safety | High | Toxic, misleading, or dangerous content generated |
| Agent Runaway | Operational | Critical | Agent exceeding limits, unauthorized resource access, cascading failures |
| Supply Chain Compromise | Security | Critical | Compromised model, malicious dependency, vendor breach |
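The classification table above can be encoded as a lookup so that alerting and ticketing tools assign default severities consistently. The following is a minimal sketch in Python, not a prescribed implementation: the names `Severity`, `IncidentType`, `INCIDENT_TYPES`, and `default_severity` are hypothetical, and numbering Low as level 4 is an assumption (the plan numbers only Severity 1 to 3).

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Lower value means more urgent, matching 'Severity 1 (Critical)'."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4  # assumption: the plan does not number the Low level


@dataclass(frozen=True)
class IncidentType:
    name: str
    category: str
    default_severity: Severity


# Rows copied from the Incident Type Definitions table.
INCIDENT_TYPES = {
    t.name: t
    for t in [
        IncidentType("Prompt Injection Attack", "Security", Severity.HIGH),
        IncidentType("Data Leakage via AI", "Privacy", Severity.CRITICAL),
        IncidentType("Model Drift/Degradation", "Quality", Severity.MEDIUM),
        IncidentType("AI Service Outage", "Availability", Severity.HIGH),
        IncidentType("Harmful Output", "Safety", Severity.HIGH),
        IncidentType("Agent Runaway", "Operational", Severity.CRITICAL),
        IncidentType("Supply Chain Compromise", "Security", Severity.CRITICAL),
    ]
}


def default_severity(incident_type: str) -> Severity:
    """Return the table's default severity for a known incident type.

    Unknown types raise KeyError so they are triaged manually
    instead of being silently assigned a default.
    """
    return INCIDENT_TYPES[incident_type].default_severity
```

The default is only a starting point: during triage the Incident Commander may confirm or change the severity.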
3. Response Team Roles
Define the incident response team structure and responsibilities.
| Role | Responsibilities | Primary | Backup |
|---|---|---|---|
| Incident Commander | Overall coordination, decision authority, stakeholder communication | [NAME, PHONE] | [NAME, PHONE] |
| AI Technical Lead | AI-specific diagnosis, model behavior analysis, prompt forensics | [NAME, PHONE] | [NAME, PHONE] |
| Security Analyst | Attack analysis, evidence preservation, threat assessment | [NAME, PHONE] | [NAME, PHONE] |
| Operations | System health, log collection, service restoration | [NAME, PHONE] | [NAME, PHONE] |
| Communications | Internal/external communications, customer notification | [NAME, PHONE] | [NAME, PHONE] |
4. Response Procedures
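Step-by-step procedures for the Critical, High, and Medium severity levels. Low-severity incidents are handled through the escalation matrix in Section 5.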
Severity 1 (Critical) — Response within 15 minutes
- Detect: Alert received via monitoring, user report, or automated detection
- Triage: Incident Commander confirms severity, activates response team
- Contain: Immediately disable affected AI system or isolate from network
- Notify: Alert executive leadership, legal, and affected stakeholders
- Investigate: Preserve logs, analyze attack vector or failure mode
- Remediate: Fix root cause, apply patches, update controls
- Restore: Re-enable system with enhanced monitoring
- Review: Conduct post-incident review within 48 hours
Severity 2 (High) — Response within 1 hour
- Detect & Triage: Confirm severity and assign to response team
- Contain: Apply targeted mitigation (rate limit, input filter, output block)
- Investigate: Analyze scope and impact
- Remediate: Apply fix in staging, test, deploy
- Review: Post-incident review within 1 week
Severity 3 (Medium) — Response within 4 hours
- Detect & Triage: Log incident, assign owner
- Investigate: Determine root cause
- Remediate: Schedule fix in next sprint
- Review: Include in weekly security review
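The response and review targets above can be turned into concrete deadlines for tracking. The sketch below is illustrative only: the function and dictionary names are hypothetical, and it assumes that review windows are measured from the time the incident is resolved, which the plan does not state explicitly.

```python
from datetime import datetime, timedelta

# Response targets taken from the severity headings in this section.
RESPONSE_TARGET = {
    "Critical": timedelta(minutes=15),
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
}

# Post-incident review windows. Medium incidents are folded into the
# weekly security review, so no fixed deadline is computed for them.
REVIEW_WINDOW = {
    "Critical": timedelta(hours=48),
    "High": timedelta(weeks=1),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Latest time by which the initial response must begin."""
    return detected_at + RESPONSE_TARGET[severity]


def review_deadline(severity: str, resolved_at: datetime):
    """Latest time for the post-incident review, or None if the
    incident is handled in the recurring weekly review.

    Assumption: the window starts at resolution time.
    """
    window = REVIEW_WINDOW.get(severity)
    return resolved_at + window if window is not None else None


if __name__ == "__main__":
    detected = datetime(2025, 1, 6, 9, 0)
    print(response_deadline("Critical", detected))  # 2025-01-06 09:15:00
    print(review_deadline("Critical", detected))    # 2025-01-08 09:00:00
```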
5. Escalation Matrix
Define when and to whom incidents are escalated.
| Severity | Initial Response | Escalation (if unresolved) | Escalate After | Executive Notification |
|---|---|---|---|---|
| Critical | Incident Commander + Full Team | CISO → CTO → CEO | Immediate | Within 15 minutes |
| High | On-Call Engineer + AI Technical Lead | Incident Commander → CISO | 1 hour | Within 4 hours |
| Medium | On-Call Engineer | Team Lead → AI Program Lead | 4 hours | Weekly summary |
| Low | Assigned Engineer | Team Lead | Next business day | Monthly summary |
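The matrix above can be expressed as data so that paging tools look up the next escalation contact instead of relying on memory during an incident. This is a sketch under stated assumptions: `EscalationRule`, `ESCALATION_MATRIX`, and `next_escalation` are hypothetical names, and role names are spelled out as in the Response Team Roles table.

```python
from typing import NamedTuple, Optional, Tuple


class EscalationRule(NamedTuple):
    initial_response: str
    escalation_chain: Tuple[str, ...]  # ordered, first escalation first
    escalate_after: str                # time unresolved before escalating
    executive_notification: str


# One entry per row of the escalation matrix.
ESCALATION_MATRIX = {
    "Critical": EscalationRule(
        "Incident Commander + Full Team",
        ("CISO", "CTO", "CEO"),
        "Immediate",
        "Within 15 minutes",
    ),
    "High": EscalationRule(
        "On-Call Engineer + AI Technical Lead",
        ("Incident Commander", "CISO"),
        "1 hour",
        "Within 4 hours",
    ),
    "Medium": EscalationRule(
        "On-Call Engineer",
        ("Team Lead", "AI Program Lead"),
        "4 hours",
        "Weekly summary",
    ),
    "Low": EscalationRule(
        "Assigned Engineer",
        ("Team Lead",),
        "Next business day",
        "Monthly summary",
    ),
}


def next_escalation(severity: str, escalations_so_far: int) -> Optional[str]:
    """Return the next role in the chain, or None once it is exhausted."""
    chain = ESCALATION_MATRIX[severity].escalation_chain
    if escalations_so_far < len(chain):
        return chain[escalations_so_far]
    return None
```

For example, `next_escalation("High", 0)` returns `"Incident Commander"`, and `next_escalation("Low", 1)` returns `None` because the Low chain has a single step.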
6. Communication Templates
Pre-drafted communications for rapid incident notification.
Internal Notification (Severity 1-2)
Subject: [SEVERITY] AI Incident — [SYSTEM NAME] — [SHORT DESCRIPTION]
Status: Active / Contained / Resolved
Impact: [Description of user/business impact]
Current Actions: [What the team is doing right now]
Next Update: [TIME]
External/Customer Notification (if required)
Subject: Service Update — [SYSTEM NAME]
Body: We are aware of an issue affecting [SERVICE]. Our team is actively working to resolve the situation. [SPECIFIC IMPACT]. We expect to provide an update by [TIME]. We apologize for any inconvenience.
Post-Resolution Notification
Subject: Resolved — [SYSTEM NAME] Incident
Body: The incident affecting [SERVICE] has been resolved as of [TIME]. Root cause: [BRIEF SUMMARY]. A full post-incident review will be completed within [TIMEFRAME].
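To keep notifications consistent under time pressure, the templates above can be filled programmatically. The sketch below renders the Internal Notification using Python's standard `string.Template`. The `$`-style field names stand in for the bracketed placeholders and are assumptions, as is the function name `render_internal_notification`; the subject separator is written as a plain hyphen.

```python
from string import Template

# Status values listed in the Internal Notification template.
ALLOWED_STATUSES = {"Active", "Contained", "Resolved"}

INTERNAL_NOTIFICATION = Template(
    "Subject: $severity AI Incident - $system_name - $short_description\n"
    "Status: $status\n"
    "Impact: $impact\n"
    "Current Actions: $current_actions\n"
    "Next Update: $next_update"
)


def render_internal_notification(**fields: str) -> str:
    """Fill the internal notification template.

    Raises ValueError for a status outside the allowed set, and
    KeyError (from Template.substitute) if any field is missing,
    so a half-filled notification is never produced.
    """
    if fields.get("status") not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    return INTERNAL_NOTIFICATION.substitute(fields)


if __name__ == "__main__":
    # All values below are made-up sample data.
    print(render_internal_notification(
        severity="Critical",
        system_name="support-bot",
        short_description="PII in model outputs",
        status="Active",
        impact="Customer chat responses may include personal data",
        current_actions="Endpoint disabled; logs being preserved",
        next_update="14:30 UTC",
    ))
```

Using `substitute` rather than `safe_substitute` is deliberate here: a missing field fails loudly instead of sending a message with an unfilled placeholder.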
7. Plan Review History
Track plan reviews, tabletop exercises, and updates.
| Date | Activity | Participants | Findings | Changes Made |
|---|---|---|---|---|
| [DATE] | Plan Review | [NAMES] | [FINDINGS] | [CHANGES] |
| [DATE] | Tabletop Exercise | [NAMES] | [FINDINGS] | [CHANGES] |
| [DATE] | Live Incident Lessons | [NAMES] | [FINDINGS] | [CHANGES] |
Tabletop Exercise Schedule
- Frequency: Semi-annual (minimum)
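- Scenarios: Rotate through all incident types over a 2-year cycle; with seven incident types and four exercises per cycle at the minimum frequency, some exercises must cover two scenarios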
- Next Exercise: [DATE]
- Scenario: [PLANNED SCENARIO]
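The schedule above implies some arithmetic: semi-annual exercises give four sessions in a 2-year cycle, while Section 2 defines seven incident types, so at least one session has to cover more than one scenario. The following sketch shows one way to distribute the scenarios; `rotation_plan` and its parameters are hypothetical, and the ordering of scenarios is an arbitrary choice, not something the plan specifies.

```python
import math
from typing import List

# Incident types from Section 2, in table order.
INCIDENT_TYPES = [
    "Prompt Injection Attack",
    "Data Leakage via AI",
    "Model Drift/Degradation",
    "AI Service Outage",
    "Harmful Output",
    "Agent Runaway",
    "Supply Chain Compromise",
]


def rotation_plan(
    incident_types: List[str],
    exercises_per_year: int = 2,  # semi-annual minimum
    cycle_years: int = 2,
) -> List[List[str]]:
    """Split incident types into per-exercise scenario groups so that
    every type is covered once within the cycle."""
    slots = exercises_per_year * cycle_years
    per_exercise = math.ceil(len(incident_types) / slots)
    return [
        incident_types[i:i + per_exercise]
        for i in range(0, len(incident_types), per_exercise)
    ]


if __name__ == "__main__":
    for number, scenarios in enumerate(rotation_plan(INCIDENT_TYPES), start=1):
        print(f"Exercise {number}: {', '.join(scenarios)}")
```

With the defaults, this yields four exercises: three covering two scenarios each and a final one covering the remaining scenario.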