AI Incident Response Plan

Plan MONITOR

Purpose

An AI-specific incident response plan covering detection, triage, containment, and post-incident review procedures.

Related Controls

AI Incident Response

NIST MS-4

1. Purpose

Define the plan's scope and activation criteria.

This plan establishes procedures for detecting, responding to, and recovering from incidents involving AI systems at [ORGANIZATION NAME]. It supplements the general IT incident response plan with AI-specific procedures.

Plan Owner: [ROLE TITLE], [DEPARTMENT]

Effective Date: [DATE]

Activation Criteria: Any event that disrupts AI system availability, compromises AI system integrity, exposes data through AI channels, or causes AI systems to produce harmful outputs.
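
As a minimal sketch, the activation criteria can be read as a logical OR over four conditions. The `AIEvent` record and its field names are hypothetical and only illustrate the rule; they are not part of this plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIEvent:
    """Hypothetical event record; one flag per activation criterion."""
    disrupts_availability: bool = False
    compromises_integrity: bool = False
    exposes_data: bool = False
    produces_harmful_output: bool = False


def plan_activated(event: AIEvent) -> bool:
    """Return True if any one of the four activation criteria is met."""
    return any((
        event.disrupts_availability,
        event.compromises_integrity,
        event.exposes_data,
        event.produces_harmful_output,
    ))
```

A single matching condition is enough to activate the plan; severity is decided afterwards, during triage.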

2. Incident Type Definitions

Classify AI-specific incident types with severity levels.

Incident Type	Category	Default Severity	Examples
Prompt Injection Attack	Security	High	System prompt extracted, unauthorized actions executed
Data Leakage via AI	Privacy	Critical	PII exposed in model outputs, training data extracted
Model Drift/Degradation	Quality	Medium	Accuracy drops below threshold, biased outputs detected
AI Service Outage	Availability	High	Model endpoint unresponsive, API provider outage
Harmful Output	Safety	High	Toxic, misleading, or dangerous content generated
Agent Runaway	Operational	Critical	Agent exceeding limits, unauthorized resource access, cascading failures
Supply Chain Compromise	Security	Critical	Compromised model, malicious dependency, vendor breach
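
One way to make the table above machine-readable is a small lookup, sketched here under assumptions. The numeric values follow the Severity 1 to 3 numbering used in Section 4; `LOW = 4` is an assumption, since the plan names a Low severity only in the escalation matrix. `default_severity` is a hypothetical helper.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4  # assumption: the plan does not number the Low level


# (category, default severity) per incident type, copied from the table.
INCIDENT_TYPES = {
    "Prompt Injection Attack": ("Security", Severity.HIGH),
    "Data Leakage via AI": ("Privacy", Severity.CRITICAL),
    "Model Drift/Degradation": ("Quality", Severity.MEDIUM),
    "AI Service Outage": ("Availability", Severity.HIGH),
    "Harmful Output": ("Safety", Severity.HIGH),
    "Agent Runaway": ("Operational", Severity.CRITICAL),
    "Supply Chain Compromise": ("Security", Severity.CRITICAL),
}


def default_severity(incident_type: str) -> Severity:
    """Look up the default severity; triage may still raise or lower it."""
    try:
        return INCIDENT_TYPES[incident_type][1]
    except KeyError:
        raise ValueError(f"Unknown incident type: {incident_type!r}") from None
```

Raising on unknown types, rather than returning a fallback, keeps an unclassified incident from silently receiving a low severity.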

3. Response Team Roles

Define the incident response team structure and responsibilities.

Role	Responsibilities	Primary	Backup
Incident Commander	Overall coordination, decision authority, stakeholder communication	[NAME, PHONE]	[NAME, PHONE]
AI Technical Lead	AI-specific diagnosis, model behavior analysis, prompt forensics	[NAME, PHONE]	[NAME, PHONE]
Security Analyst	Attack analysis, evidence preservation, threat assessment	[NAME, PHONE]	[NAME, PHONE]
Operations	System health, log collection, service restoration	[NAME, PHONE]	[NAME, PHONE]
Communications	Internal/external communications, customer notification	[NAME, PHONE]	[NAME, PHONE]

4. Response Procedures

Step-by-step procedures for each severity level.

Severity 1 (Critical) — Response within 15 minutes

Detect: Alert received via monitoring / user report / automated detection
Triage: Incident Commander confirms severity, activates response team
Contain: Immediately disable affected AI system or isolate from network
Notify: Alert executive leadership, legal, and affected stakeholders
Investigate: Preserve logs, analyze attack vector or failure mode
Remediate: Fix root cause, apply patches, update controls
Restore: Re-enable system with enhanced monitoring
Review: Conduct post-incident review within 48 hours

Severity 2 (High) — Response within 1 hour

Detect & Triage: Confirm severity and assign to response team
Contain: Apply targeted mitigation (rate limit, input filter, output block)
Investigate: Analyze scope and impact
Remediate: Apply fix in staging, test, deploy
Review: Post-incident review within 1 week
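
The "rate limit" mitigation in the Contain step could look something like the sliding-window limiter below. This is an illustrative sketch, not a prescribed control: the class name, parameters, and in-memory design are assumptions, and a production deployment would more likely use the rate limiting built into its API gateway.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        """Record and permit the request if the window has capacity."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False
```

Passing `now` explicitly makes the limiter easy to exercise in a tabletop or unit test without waiting on the clock.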

Severity 3 (Medium) — Response within 4 hours

Detect & Triage: Log incident, assign owner
Investigate: Determine root cause
Remediate: Schedule fix in next sprint
Review: Include in weekly security review
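
The response and review targets above can be turned into concrete deadlines. The sketch below assumes the response clock starts at detection and the review clock starts at resolution; the plan does not state either starting point, so treat both as assumptions. Function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Response targets from the severity headings above.
RESPONSE_TARGET = {
    1: timedelta(minutes=15),  # Critical
    2: timedelta(hours=1),     # High
    3: timedelta(hours=4),     # Medium
}

# Post-incident review windows. Severity 3 has no fixed window because it
# is folded into the weekly security review.
REVIEW_WINDOW = {
    1: timedelta(hours=48),
    2: timedelta(weeks=1),
}


def response_deadline(severity: int, detected_at: datetime) -> datetime:
    """Latest time by which the response must begin (assumed from detection)."""
    return detected_at + RESPONSE_TARGET[severity]


def review_deadline(severity: int, resolved_at: datetime) -> Optional[datetime]:
    """Latest time for the post-incident review, or None if no fixed window."""
    window = REVIEW_WINDOW.get(severity)
    return None if window is None else resolved_at + window
```

Using timezone-aware `datetime` values, for example `datetime.now(timezone.utc)`, avoids ambiguity when responders work across time zones.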

5. Escalation Matrix

Define when and to whom incidents are escalated.

Severity	Initial Response	Escalation (if unresolved)	Timeframe	Executive Notification
Critical	Incident Commander + Full Team	CISO → CTO → CEO	Immediate	Within 15 minutes
High	On-Call + AI Tech Lead	Incident Commander → CISO	1 hour	Within 4 hours
Medium	On-Call Engineer	Team Lead → AI Program Lead	4 hours	Weekly summary
Low	Assigned Engineer	Team Lead	Next business day	Monthly summary
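
For tooling such as a paging bot, the matrix above might be encoded as a lookup like the following sketch. The values are copied from the table; `EscalationRule` and `next_escalation` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class EscalationRule(NamedTuple):
    initial_response: str
    escalation_chain: Tuple[str, ...]
    timeframe: str
    executive_notification: str


ESCALATION_MATRIX = {
    "Critical": EscalationRule(
        "Incident Commander + Full Team", ("CISO", "CTO", "CEO"),
        "Immediate", "Within 15 minutes"),
    "High": EscalationRule(
        "On-Call + AI Tech Lead", ("Incident Commander", "CISO"),
        "1 hour", "Within 4 hours"),
    "Medium": EscalationRule(
        "On-Call Engineer", ("Team Lead", "AI Program Lead"),
        "4 hours", "Weekly summary"),
    "Low": EscalationRule(
        "Assigned Engineer", ("Team Lead",),
        "Next business day", "Monthly summary"),
}


def next_escalation(severity: str, already_escalated: int) -> Optional[str]:
    """Return the next role in the chain, or None once the chain is exhausted."""
    chain = ESCALATION_MATRIX[severity].escalation_chain
    return chain[already_escalated] if already_escalated < len(chain) else None
```

Here `already_escalated` counts how many escalation steps have been taken so far, so `next_escalation("Critical", 0)` names the first role to contact if the incident remains unresolved.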

6. Communication Templates

Pre-drafted communications for rapid incident notification.

Internal Notification (Severity 1-2)

Subject: [SEVERITY] AI Incident — [SYSTEM NAME] — [SHORT DESCRIPTION]

Status: Active / Contained / Resolved

Impact: [Description of user/business impact]

Current Actions: [What the team is doing right now]

Next Update: [TIME]
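
The internal notification can be filled programmatically so that no bracketed placeholder is sent by mistake. The sketch below uses the standard library's `string.Template`; the field names and the `render_internal_notification` helper are assumptions chosen to mirror the placeholders above.

```python
from string import Template

EM_DASH = "\u2014"  # separator used in the subject line of the template

INTERNAL_NOTIFICATION = Template(
    f"Subject: $severity AI Incident {EM_DASH} $system_name {EM_DASH} $short_description\n"
    "\n"
    "Status: $status\n"
    "\n"
    "Impact: $impact\n"
    "\n"
    "Current Actions: $current_actions\n"
    "\n"
    "Next Update: $next_update\n"
)

VALID_STATUSES = {"Active", "Contained", "Resolved"}


def render_internal_notification(**fields: str) -> str:
    """Fill the template; a missing field raises KeyError instead of sending gaps."""
    if fields.get("status") not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")
    return INTERNAL_NOTIFICATION.substitute(fields)
```

`substitute` is used instead of `safe_substitute` on purpose: an incomplete notification fails loudly rather than reaching stakeholders with unfilled fields.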

External/Customer Notification (if required)

Subject: Service Update — [SYSTEM NAME]

Body: We are aware of an issue affecting [SERVICE]. Our team is actively working to resolve the situation. [SPECIFIC IMPACT]. We expect to provide an update by [TIME]. We apologize for any inconvenience.

Post-Resolution Notification

Subject: Resolved — [SYSTEM NAME] Incident

Body: The incident affecting [SERVICE] has been resolved as of [TIME]. Root cause: [BRIEF SUMMARY]. A full post-incident review will be completed within [TIMEFRAME].

7. Plan Review History

Track plan reviews, tabletop exercises, and updates.

Date	Activity	Participants	Findings	Changes Made
[DATE]	Plan Review	[NAMES]	[FINDINGS]	[CHANGES]
[DATE]	Tabletop Exercise	[NAMES]	[FINDINGS]	[CHANGES]
[DATE]	Live Incident Lessons	[NAMES]	[FINDINGS]	[CHANGES]

Tabletop Exercise Schedule

Frequency: Semi-annual (minimum)
Scenarios: Rotate through all incident types over a 2-year cycle (combine related types in one exercise where needed)
Next Exercise: [DATE]
Scenario: [PLANNED SCENARIO]
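
At the minimum semi-annual frequency, a 2-year cycle contains four exercises, which is fewer than the seven incident types defined in Section 2. The sketch below shows one possible way to cover every type by assigning more than one type to some exercises; the round-robin grouping and the `plan_rotation` helper are assumptions, not a requirement of this plan.

```python
from typing import List, Sequence

INCIDENT_TYPES = [
    "Prompt Injection Attack",
    "Data Leakage via AI",
    "Model Drift/Degradation",
    "AI Service Outage",
    "Harmful Output",
    "Agent Runaway",
    "Supply Chain Compromise",
]


def plan_rotation(
    incident_types: Sequence[str],
    exercises_per_year: int = 2,
    cycle_years: int = 2,
) -> List[List[str]]:
    """Spread incident types round-robin across the exercises in one cycle."""
    slots = exercises_per_year * cycle_years
    schedule: List[List[str]] = [[] for _ in range(slots)]
    for index, incident_type in enumerate(incident_types):
        schedule[index % slots].append(incident_type)
    return schedule


for number, scenario_types in enumerate(plan_rotation(INCIDENT_TYPES), start=1):
    print(f"Exercise {number}: {', '.join(scenario_types)}")
```

Raising `exercises_per_year` to 4 gives eight slots, enough for one incident type per exercise.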
