Post-Incident Review Report


Purpose

This template structures a post-incident review: incident summary, timeline, root cause analysis, contributing factors, corrective actions, and lessons learned.

Related Controls

ISO Clause 10

1. Incident Summary

Capture key facts about the incident for quick reference.

Incident ID: INC-[NNN]

Incident Title: [SHORT DESCRIPTION]

Date/Time Detected: [DATE TIME]

Date/Time Resolved: [DATE TIME]

Duration: [HOURS:MINUTES]

Severity: Critical / High / Medium / Low

Affected Systems: [SYSTEM NAMES]

Affected Users: [NUMBER / SCOPE]

Incident Commander: [NAME]

Report Author: [NAME], [ROLE TITLE]

Report Date: [DATE]

Impact Summary: [2-3 sentences describing the business and user impact]
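The Duration field can be derived from the two Date/Time fields rather than entered by hand. The following is a minimal sketch, assuming timestamps are recorded as "YYYY-MM-DD HH:MM" in a single time zone; the function name incident_duration and the sample timestamps are hypothetical and only illustrate the calculation.

```python
from datetime import datetime


def incident_duration(detected: str, resolved: str,
                      fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Return the elapsed time between two timestamps as HOURS:MINUTES."""
    start = datetime.strptime(detected, fmt)
    end = datetime.strptime(resolved, fmt)
    if end < start:
        raise ValueError("resolved time precedes detected time")
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}:{minutes:02d}"


# Illustrative timestamps only (assumed, not from a real incident).
print(incident_duration("2024-03-01 22:15", "2024-03-02 01:40"))  # 3:25
```

Hours are not capped at 24, so a multi-day incident is reported as, for example, 49:05 rather than wrapping around.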

2. Timeline of Events

Reconstruct the full incident timeline from detection to resolution.

Time | Event | Action Taken | By
[TIME] | First anomaly detected by monitoring | Alert triggered, on-call notified | Automated
[TIME] | On-call engineer acknowledges alert | Begins investigation | [NAME]
[TIME] | Root cause identified | Incident Commander notified, team assembled | [NAME]
[TIME] | Containment action taken | [SPECIFIC ACTION, e.g., AI service disabled] | [NAME]
[TIME] | Fix implemented and deployed | [SPECIFIC FIX] | [NAME]
[TIME] | Service restored and verified | Monitoring confirmed normal operation | [NAME]
[TIME] | All-clear communicated to stakeholders | Resolution notification sent | [NAME]
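When the timeline is assembled from several sources, entries often arrive out of order. The sketch below is one possible way to sort them and show how long each step took after the previous one. The TimelineEntry type, the elapsed_between helper, and all sample values are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class TimelineEntry(NamedTuple):
    time: datetime
    event: str
    action: str
    by: str


def elapsed_between(entries: list[TimelineEntry]) -> list[tuple[str, timedelta]]:
    """Sort entries by time; return each later event with the gap since the prior one."""
    ordered = sorted(entries, key=lambda e: e.time)
    return [(cur.event, cur.time - prev.time)
            for prev, cur in zip(ordered, ordered[1:])]


# Illustrative values only, deliberately listed out of order.
timeline = [
    TimelineEntry(datetime(2024, 3, 1, 22, 15), "First anomaly detected",
                  "Alert triggered", "Automated"),
    TimelineEntry(datetime(2024, 3, 1, 22, 50), "Containment action taken",
                  "Service disabled", "Engineer B"),
    TimelineEntry(datetime(2024, 3, 1, 22, 20), "Alert acknowledged",
                  "Begins investigation", "Engineer A"),
]

for event, gap in elapsed_between(timeline):
    print(f"{event}: +{gap}")
```

Large gaps between consecutive rows are natural candidates for the "What Didn't Go Well" section.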

3. Root Cause Analysis

Use the 5 Whys technique to identify the true root cause: ask why each answer occurred until you reach a cause that a process or system change can address.

5 Whys Analysis

Why 1: Why did the incident occur?

→ [ANSWER — e.g., "The AI agent executed an unauthorized API call"]

Why 2: Why did that happen?

→ [ANSWER — e.g., "The agent's permission boundary did not restrict that API endpoint"]

Why 3: Why was it not restricted?

→ [ANSWER — e.g., "The endpoint was added after the permission matrix was last reviewed"]

Why 4: Why wasn't the permission matrix updated?

→ [ANSWER — e.g., "No process to trigger permission review when new endpoints are added"]

Why 5: Why is there no trigger process?

→ [ANSWER — e.g., "Permission reviews are calendar-based (quarterly) not event-based"]

Root Cause Statement

[Clear, specific statement of the root cause — e.g., "Agent permission reviews are calendar-based rather than triggered by system changes, creating windows where new capabilities are not governed."]
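To make the example root cause concrete, an event-based review could run a check like the one below whenever endpoints change, instead of waiting for the quarterly review. This is only a sketch under stated assumptions: the permission matrix is modeled as a mapping from endpoint to a reviewed decision, and the function name and endpoint paths are hypothetical.

```python
def endpoints_needing_review(deployed: set[str],
                             permission_matrix: dict[str, str]) -> list[str]:
    """Return deployed endpoints that have no reviewed entry in the matrix."""
    return sorted(deployed - set(permission_matrix))


# Hypothetical data: the matrix was last reviewed before the refund endpoint shipped.
matrix = {"/v1/search": "allow", "/v1/admin/export": "deny"}
deployed = {"/v1/search", "/v1/admin/export", "/v1/payments/refund"}

print(endpoints_needing_review(deployed, matrix))  # ['/v1/payments/refund']
```

Running such a check as part of each deployment turns the review trigger from a calendar date into a system change, which is the gap the example root cause statement describes.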

4. Contributing Factors

Identify factors that contributed to the incident but are not the root cause.

Process Factors

  • [FACTOR — e.g., "Change management did not include permission review as a checklist item"]
  • [FACTOR — e.g., "No automated detection for agent behavior outside permitted boundaries"]

Technical Factors

  • [FACTOR — e.g., "Agent framework does not enforce permission boundaries at runtime"]
  • [FACTOR — e.g., "Monitoring did not alert on the unauthorized API call pattern"]

Human Factors

  • [FACTOR — e.g., "Team unfamiliar with the new API endpoint's sensitivity level"]
  • [FACTOR — e.g., "Alert fatigue caused initial monitoring signals to be dismissed"]

Note: This analysis is blameless. The goal is to improve systems and processes, not assign personal blame.

5. What Went Well

Acknowledge effective response elements to reinforce good practices.

  1. [POSITIVE — e.g., "Incident was detected within 5 minutes of occurrence"]
  2. [POSITIVE — e.g., "Response team assembled quickly and communicated effectively"]
  3. [POSITIVE — e.g., "Containment action prevented further data exposure"]
  4. [POSITIVE — e.g., "Rollback procedure worked as documented"]

6. What Didn't Go Well

Identify areas where the response could be improved.

  1. [NEGATIVE — e.g., "Took 30 minutes to identify root cause due to insufficient logging"]
  2. [NEGATIVE — e.g., "Customer communication was delayed by 1 hour"]
  3. [NEGATIVE — e.g., "Runbook for this scenario did not exist"]
  4. [NEGATIVE — e.g., "Monitoring alert was too noisy, causing initial dismissal"]

7. Corrective Actions

Define specific, measurable, and time-bound corrective actions.

Action ID | Corrective Action | Owner | Deadline | Status | Verification Method
CA-001 | [ACTION, e.g., "Add permission review to change management checklist"] | [NAME] | [DATE] | Open / In Progress / Complete | [HOW TO VERIFY]
CA-002 | [ACTION, e.g., "Implement runtime permission enforcement in agent framework"] | [NAME] | [DATE] | Open / In Progress / Complete | [HOW TO VERIFY]
CA-003 | [ACTION, e.g., "Add monitoring rule for unauthorized API call patterns"] | [NAME] | [DATE] | Open / In Progress / Complete | [HOW TO VERIFY]
CA-004 | [ACTION, e.g., "Create runbook for agent boundary violation incidents"] | [NAME] | [DATE] | Open / In Progress / Complete | [HOW TO VERIFY]
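Corrective actions are only useful if someone follows up on them. As a rough sketch, the table columns can be mirrored in a small record type so that open items past their deadline are easy to list. The CorrectiveAction class, the overdue helper, and the sample rows are illustrative assumptions, not part of the template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CorrectiveAction:
    action_id: str
    description: str
    owner: str
    deadline: date
    status: str  # "Open", "In Progress", or "Complete"
    verification: str


def overdue(actions: list[CorrectiveAction], today: date) -> list[str]:
    """Return IDs of actions that are not complete and past their deadline."""
    return [a.action_id for a in actions
            if a.status != "Complete" and a.deadline < today]


# Illustrative rows only; owners and dates are placeholders.
actions = [
    CorrectiveAction("CA-001", "Add permission review to change checklist",
                     "Owner A", date(2024, 3, 15), "Complete", "Checklist audit"),
    CorrectiveAction("CA-002", "Implement runtime permission enforcement",
                     "Owner B", date(2024, 4, 1), "In Progress", "Integration test"),
    CorrectiveAction("CA-003", "Add monitoring rule for unauthorized calls",
                     "Owner C", date(2024, 5, 1), "Open", "Alert fires in drill"),
]

print(overdue(actions, today=date(2024, 4, 10)))  # ['CA-002']
```

Passing today explicitly keeps the check deterministic, which makes it easier to test and to rerun for a past review date.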

8. Lessons Learned Summary

Distill key takeaways that should be shared across the organization.

Key Lessons

  1. [LESSON TITLE]: [DESCRIPTION, e.g., "Permission reviews must be event-driven, not just calendar-based. Any change to system capabilities should trigger a permission review."]
  2. [LESSON TITLE]: [DESCRIPTION, e.g., "Runtime enforcement of agent boundaries is essential. Documentation-only controls are insufficient for autonomous systems."]
  3. [LESSON TITLE]: [DESCRIPTION, e.g., "Monitoring must be tuned to detect behavioral anomalies, not just system health metrics."]

Distribution List

This report is distributed to:

  • AI Governance Committee
  • Incident Response Team
  • System Owner(s)
  • Engineering Team Lead(s)
  • [ADDITIONAL STAKEHOLDERS]

Classification: Confidential — Internal Use Only
