Incident Severity Matrix
Define five clear severity levels (SEV0–SEV4 or SEV1–SEV5) so your team agrees on what counts as an emergency. Generate, customize, and export them to your runbooks.
Why define severity levels?
Clear severity definitions prevent "everything is a crisis" fatigue. When everyone knows what a SEV0 is versus a SEV3, on-call engineers can prioritize effectively and avoid burnout.
Why You Need a Severity Matrix
Without clear severity definitions, every alert feels like a fire drill. This leads to alert fatigue, where on-call engineers stop taking alerts seriously because "everything is urgent."
A good severity matrix answers two questions instantly:
- How bad is this? (Impact)
- How fast do I need to fix it? (Response SLA)
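One way to make both answers available at a glance is to keep the matrix as data. The Python sketch below is only an illustration, not a prescribed format: the level names follow the five-level framework in this guide, while the `SeverityLevel` class, the `lookup` helper, the impact wording, and every `respond_within` value are placeholder assumptions to swap for your own team's definitions.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class SeverityLevel:
    name: str                  # label from the five-level framework
    impact: str                # answers "How bad is this?"
    respond_within: timedelta  # answers "How fast do I need to fix it?"


# Keys use SEV0-SEV4 numbering. All impact text and SLA values below are
# illustrative placeholders, not recommendations.
SEVERITY_MATRIX = {
    0: SeverityLevel("Catastrophic", "placeholder: total outage or data loss", timedelta(minutes=5)),
    1: SeverityLevel("Critical Emergency", "placeholder: core feature down for most users", timedelta(minutes=15)),
    2: SeverityLevel("Major Incident", "placeholder: significant degradation", timedelta(hours=1)),
    3: SeverityLevel("Minor Incident", "placeholder: limited impact, workaround exists", timedelta(hours=8)),
    4: SeverityLevel("Low Priority", "placeholder: cosmetic or internal only", timedelta(days=3)),
}


def lookup(sev: int) -> SeverityLevel:
    """Return the matrix row for a SEV number, or raise ValueError if unknown."""
    if sev not in SEVERITY_MATRIX:
        raise ValueError(f"unknown severity number: {sev}")
    return SEVERITY_MATRIX[sev]
```

With this in place, `lookup(2).impact` and `lookup(2).respond_within` answer the two questions for a Major Incident.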
Best Practices
- ✓ Keep it simple: Don't overcomplicate definitions. "Is money being lost?" is a good binary check.
- ✓ Print it out: Keep the matrix where responders will see it, for example in your Slack incident channel topic.
- ✓ Automate it: Use tools like Runframe to automatically set severity based on alert tags.
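To show what tag-based automation can look like, here is a minimal Python sketch. It is not Runframe's API or any real tool's configuration: the `RULES` table, the tag names, and the `severity_from_tags` function are all hypothetical, and the `revenue-impact` tag simply stands in for the "Is money being lost?" check.

```python
# Rules are ordered from most to least severe; the first match wins.
# Every tag name here is a hypothetical example, not a standard.
RULES = [
    (0, {"full-outage", "data-loss"}),
    (1, {"revenue-impact"}),  # stands in for "Is money being lost?"
    (2, {"degraded"}),
    (3, {"minor"}),
]
DEFAULT_SEVERITY = 4  # assumption: untagged alerts fall to the lowest level


def severity_from_tags(tags: set[str]) -> int:
    """Return a SEV0-SEV4 number based on the tags attached to an alert."""
    for sev, trigger_tags in RULES:
        if tags & trigger_tags:  # any overlap between alert tags and rule tags
            return sev
    return DEFAULT_SEVERITY
```

Because the rules are checked in order, an alert tagged both `minor` and `revenue-impact` is treated as the more severe of the two.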
Standard Severity Levels
A common 5-level framework used by many high-performing teams (often SEV0–SEV4).
- SEV0: Catastrophic
- SEV1: Critical Emergency
- SEV2: Major Incident
- SEV3: Minor Incident
- SEV4: Low Priority
Note: Some teams use SEV1–SEV5 instead of SEV0–SEV4. Both schemes have five levels; only the numbering differs. Starting at SEV0 emphasizes that "zero" means everything is on fire 🔥
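Since the two schemes differ only by an offset of one, translating between them is mechanical. The helpers below are a small Python sketch for teams that need to map labels across tools; the function names are made up for illustration, and the code requires Python 3.9 or later for `str.removeprefix`.

```python
def _parse(label: str, lowest: int, highest: int) -> int:
    """Extract the number from a label like 'SEV2' and range-check it."""
    number = int(label.upper().removeprefix("SEV"))
    if not lowest <= number <= highest:
        raise ValueError(f"{label!r} is outside SEV{lowest}-SEV{highest}")
    return number


def to_one_based(label: str) -> str:
    """Convert a SEV0-SEV4 label to its SEV1-SEV5 equivalent."""
    return f"SEV{_parse(label, 0, 4) + 1}"


def to_zero_based(label: str) -> str:
    """Convert a SEV1-SEV5 label to its SEV0-SEV4 equivalent."""
    return f"SEV{_parse(label, 1, 5) - 1}"
```

For example, `to_one_based("SEV0")` gives `"SEV1"`, and `to_zero_based("SEV5")` gives `"SEV4"`.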