SLA & Performance Baselines

Tier 2 DEPLOY

What This Requires

Define SLAs for AI systems (uptime, latency, error rate) and establish performance baselines. Monitor against SLAs and alert on violations. Review and adjust baselines quarterly based on actual performance.

Why It Matters

Without SLAs, teams have no objective criteria for judging whether an AI system is performing acceptably. Baselines make performance degradation detectable, whether it stems from model drift or infrastructure issues.

How To Implement

Define SLAs

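For each AI system, set SLA targets for uptime (e.g., 99.9%), p99 latency (e.g., <500ms), error rate (e.g., <1%), and sustained throughput (e.g., 100 QPS). Treat these figures as illustrative and align each target with business requirements.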
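One way to keep SLA targets reviewable is to record them as data rather than only in prose. The sketch below is a minimal illustration, not a prescribed format: the class name `SLA`, its field names, and the system name `fraud-scoring` are hypothetical, and the default values are simply the example targets from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLA:
    """Example SLA targets for one AI system (hypothetical structure)."""
    system: str
    uptime_pct: float = 99.9        # minimum availability, percent
    latency_p99_ms: float = 500.0   # maximum p99 latency, milliseconds
    error_rate_pct: float = 1.0     # maximum error rate, percent
    throughput_qps: float = 100.0   # minimum sustained throughput, queries/second

    def allowed_downtime_minutes(self, period_days: int = 30) -> float:
        """Downtime budget implied by the uptime target over the given period."""
        total_minutes = period_days * 24 * 60
        return total_minutes * (1 - self.uptime_pct / 100)


sla = SLA(system="fraud-scoring")  # hypothetical system name
# A 99.9% uptime target over 30 days leaves roughly 43.2 minutes of downtime.
print(round(sla.allowed_downtime_minutes(30), 1))
```

Expressing the uptime target as a downtime budget can make the business trade-off easier to discuss than the percentage alone.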

Baseline Establishment

Run load tests and collect 2 weeks of production metrics. Calculate the baseline for each metric: median (p50) and p99. Document the results in the runbook.
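The baseline calculation can be sketched with Python's standard library. This is an illustration under stated assumptions: the function name `compute_baseline` and the output keys are hypothetical, the input is assumed to be a list of per-request latencies in milliseconds, and the "inclusive" quantile method is one reasonable choice among several.

```python
import statistics


def compute_baseline(latencies_ms):
    """Return the median (p50) and p99 of a latency sample in milliseconds."""
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50_ms": cuts[49], "p99_ms": cuts[98]}


# Synthetic sample for illustration only; real input would be the
# two weeks of collected production latencies.
sample = [float(v) for v in range(1, 102)]
baseline = compute_baseline(sample)
print(baseline)
```

Percentile estimates differ slightly between methods and tools, so the runbook entry should record which method produced the baseline.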

Monitoring & Alerting

Configure alerts for SLA violations: uptime <99.9%, latency p99 >500ms, error rate >1%. Alert on-call via PagerDuty/Opsgenie.
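The threshold logic behind these alerts can be sketched as a small check over one evaluation window. The names `THRESHOLDS` and `find_violations` and the metric keys are hypothetical, and the limits are the example values from this section. Routing the result to PagerDuty or Opsgenie is deliberately left out, since that depends on the integration in use.

```python
# Each entry: metric name -> (direction, limit).
# "min" means the observed value must not fall below the limit;
# "max" means it must not exceed it.
THRESHOLDS = {
    "uptime_pct": ("min", 99.9),
    "latency_p99_ms": ("max", 500.0),
    "error_rate_pct": ("max", 1.0),
}


def find_violations(metrics):
    """Return human-readable SLA violations for one evaluation window."""
    violations = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            # Assumption: missing data is treated as a violation so that
            # a broken metrics pipeline does not silently pass.
            violations.append(f"{name}: no data")
            continue
        breached = value < limit if kind == "min" else value > limit
        if breached:
            violations.append(f"{name}: observed {value}, limit {limit}")
    return violations


window = {"uptime_pct": 99.95, "latency_p99_ms": 620.0, "error_rate_pct": 0.4}
for violation in find_violations(window):
    print(violation)
```

Treating missing metrics as a violation is a design choice; some teams prefer a separate "no data" alert with a lower severity.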

Quarterly Review

Compare measured performance against both the SLA targets and the baseline. Adjust the baseline where a change is justified (e.g., latency increased due to added features), and document each change in the change log.
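A quarterly review can start from a simple comparison of current figures against the recorded baseline. The sketch below flags metrics whose relative change exceeds a tolerance. The function name `baseline_drift`, the metric keys, and the 10% default tolerance are assumptions for illustration, not values this control prescribes.

```python
def baseline_drift(baseline, current, tolerance_pct=10.0):
    """Return metrics whose relative change from baseline exceeds tolerance_pct.

    The result maps metric name to percent change, rounded to one decimal.
    """
    drifted = {}
    for name, base in baseline.items():
        if name not in current or base == 0:
            continue  # cannot compute a relative change
        change_pct = (current[name] - base) / base * 100
        if abs(change_pct) > tolerance_pct:
            drifted[name] = round(change_pct, 1)
    return drifted


# Illustrative numbers only, not measurements from any real system.
recorded = {"p50_ms": 120.0, "p99_ms": 400.0}
this_quarter = {"p50_ms": 125.0, "p99_ms": 500.0}
print(baseline_drift(recorded, this_quarter))
```

A flagged metric is a prompt for review rather than an automatic baseline change: the reviewer still decides whether the shift is justified and records the decision in the change log.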

Evidence & Audit

  • SLA definitions per AI system
  • Baseline establishment methodology and results
  • Monitoring dashboards showing SLA metrics
  • Alert configuration and incident history
  • Quarterly review records with baseline adjustments

Related Controls