During a SEV0, everyone wants answers at once.

- Executives want a timeline and business impact.
- Support wants a script to calm customers down.
- Sales/CSMs want something they can forward to key accounts.
- Someone on social asks "are you aware?"
- The person fixing the database keeps getting interrupted.

The technical fix might take 45 minutes. The communication mess can take 2 hours.

This guide gives you **copy-paste templates** and a simple operating rule: **one owner, one source of truth, consistent cadence**.

---

## The only framework you need

In incidents: **status is the truth. Everything else points to it.**

1) **One owner**: the Incident Commander (IC) owns outbound updates.
2) **One source of truth**: pick one place where updates live (customer email thread, status page, or a single internal update doc). Everything else should point to it.
3) **One cadence**: predictable updates beat "big updates when we feel like it."
4) **Impact over internals**: describe symptoms and scope, not system trivia.
5) **Honest uncertainty**: "unknown at this time" beats fake ETAs.

---

## Frequently asked questions

### Who should send incident updates?

The Incident Commander. The person debugging should not also be writing customer updates. For more on the IC role, see [our incident response playbook](/blog/incident-response-playbook).

### How often should we update during a SEV0?

Every 15 minutes on your canonical source (status page or a customer email thread). If you don't have either, use a single internal update doc. Also update executives every 15–30 minutes. Always include the next update time.

### What if we don't know the ETA?

Say "unknown at this time" and commit to the next update time. Fake ETAs destroy trust.
---

## Template index (jump to what you need)

- [Status page incident communication templates](#1-status-page-incident-communication-templates)
- [Customer outage email templates](#2-customer-outage-email-templates-only-when-needed)
- [Executive incident update templates](#3-executive-incident-update-templates-forwardable)
- [Support incident communication kit](#4-support-incident-communication-kit-paste-into-slack-pin)
- [Sales / CSM forwardable note](#5-salescsm-key-account-note-forwardable-low-drama)
- [Internal engineering update](#6-internal-engineering-update-context-without-noise)
- [Social incident response templates](#7-social-incident-response-templates-xlinkedin)
- [Post-incident customer summary](#8-post-incident-customer-summary-short-trust-building)

---

## Who needs updates and what they actually want

**Customers**
- Want: are we impacted, what changed, what's the workaround, when's the next update.
- Don't want: your root-cause guesses.

**Executives**
- Want: customer impact, revenue risk (or "unknown"), timeline, mitigations, next update.

**Support**
- Want: a script + how to handle tickets + what not to promise.

**Sales/CSMs**
- Want: a forwardable note for key accounts + status link + what to say on renewals.

**Engineering**
- Want: what's broken, who owns it, what's next, where to coordinate.

**Public/social**
- Want: acknowledgment + status link. Nothing else.

---

## Cadence: how often to update

If you only remember one line: **set the next update time in every message**.
Recommended cadence (adjust for your business, but keep it consistent):

| Severity | Customer/status page | Exec | Support | Social |
|---|---:|---:|---:|---:|
| SEV0 (outage) | every 15 min | every 15–30 min | push when status changes + at least every 30 min | acknowledge once, then link |
| SEV1 (degraded) | every 30–60 min | every 30–60 min | push when status changes | usually link only |
| SEV2 (minor) | every 60–120 min | on request | push when status changes | none |

**Cadence (plain text):**

- **SEV0 (outage):** customer/canonical every **15 min** · exec every **15–30 min** · support on change + at least every **30 min** · social: acknowledge once, then link
- **SEV1 (degraded):** customer/canonical every **30–60 min** · exec every **30–60 min** · support on change · social: usually link only
- **SEV2 (minor):** customer/canonical every **60–120 min** · exec on request · support on change · social: none

Middle of the night does not change expectations. The IC might change; the cadence should not.

For more on severity levels, see [our SEV0-SEV4 framework](/blog/incident-severity-levels).

---

## The master message map

To avoid fragmented comms, decide where each message type lives.

**First: pick your canonical source**

The "status page" in this article means whatever your canonical source is:

- **Status page** (status.yourcompany.com): most common, public
- **Social**: pointer only (rarely canonical — use it to link to your status page or email)
- **Customer email**: B2B companies often skip public status pages entirely
- **Internal only**: early-stage or regulated industries

The rule: **one source, everything points to it.** Don't let Slack say one thing and email say another.

**Message destinations:**

- **Canonical source**: status page, customer email, or a single internal update doc — your timeline lives here.
- **Internal Slack channel**: operational coordination + internal updates.
- **Support channel**: the "support kit" pinned and updated.
- **Exec email/Slack**: business impact + timeline + next update.
- **Social (if not your canonical source)**: acknowledgment + link.

Rule: **if your canonical source says "Investigating," no other channel is allowed to say "Resolved in 10 minutes."**

---

## Template quick picker

Don't search during a SEV0. Find what you need instantly.

| Scenario | Use template |
|----------|--------------|
| SEV0 declared, first 5 minutes | Status page: Initial |
| SEV0, 15 min later, no fix yet | Status page: Update (identified) |
| SEV0, fix implemented, monitoring | Status page: Update (mitigation in progress) |
| SEV0, resolved | Status page: Resolved |
| SEV0 lasting > 30 min, enterprise customers | Customer email: Initial notification |
| Executive asks "what's the impact?" | Executive update: Initial |
| Support getting slammed with tickets | Support kit: Initial |
| Key account at renewal risk, incident active | Sales/CSM note |
| Internal engineers asking "what's broken?" | Internal engineering update |
| Social media asking "are you aware?" | Social: Acknowledgment |

**Template quick picker (plain text):**

1. **SEV0 declared (first 5 minutes)** → Status page: Initial
2. **SEV0, 15 min later, no fix yet** → Status page: Update (identified)
3. **SEV0, fix implemented, monitoring** → Status page: Update (mitigation in progress)
4. **SEV0, resolved** → Status page: Resolved
5. **SEV0 > 30 min (enterprise customers)** → Customer email: Initial notification
6. **Exec asks "what's the impact?"** → Executive update: Initial
7. **Support getting slammed** → Support kit: Initial
8. **Key account at renewal risk** → Sales/CSM note
9. **Engineers asking "what's broken?"** → Internal engineering update
10. **Social asks "are you aware?"** → Social: Acknowledgment

---

## One filled example (SEV0 checkout outage)

Scenario: Checkout is failing with "Unable to process payment" errors for most customers.

**Status page (initial):**

```
We're experiencing an outage affecting checkout.
Customers may see "Unable to process payment" errors.
We're investigating.

Next update: 20:15 UTC
```

**Exec update (initial):**

```
We're investigating a SEV0 incident affecting checkout.

Impact:
- Customer checkout failing for most traffic (scope still being confirmed)
- Revenue impact: unknown at this time

Timeline:
- Started: 20:00 UTC
- Status: Investigating
- ETA: unknown at this time

Next update: 20:30 UTC
```

**Support kit (initial):**

```
What to tell customers:
"We're aware of an outage affecting checkout. We're investigating and posting
updates here: [status link]. Next update by 20:15 UTC."

Do NOT promise:
- Resolution times
- Credits
- Root cause guesses
```

---

## Good vs bad: why wording matters

Most incident communication fails because it talks about internals instead of impact.

**❌ Bad update:**

> "We're experiencing database replication lag on shard 3. The GC pause caused a cascading failure in the payment microservice. We're restarting the pods and investigating the root cause. Our SRE team is looking into query optimization."

**Why it's bad:**

- Customers don't know what "shard 3" or "GC pause" means
- "Microservice" and "pods" are internal jargon
- No clear next update time
- Doesn't say whether they can use your product

**✅ Good update (using this template):**

> "We're experiencing an outage affecting checkout. Customers may see 'Unable to process payment' errors. We're investigating.
>
> Next update: 3:15 PM ET"

**Why it works:**

- Clear impact: "checkout" is down and payments are failing
- Specific symptom: customers know what to expect
- Next update time: sets expectations
- No technical jargon: describes what customers see, not what's broken internally

**The pattern:** Describe symptoms, not systems. Customers care about "can I check out," not "your database shard."
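The good-update pattern (symptom, scope, next update time) can also be enforced mechanically: render every customer-facing update from a template that refuses to go out without a next-update time. A minimal sketch in Python, with illustrative field and function names:

```python
from string import Template

# Illustrative status-page template: service, symptom, next update time.
STATUS_INITIAL = Template(
    "We're experiencing an outage affecting $service.\n"
    "Customers may see $symptom.\n"
    "We're investigating.\n\n"
    "Next update: $next_update"
)

def render_status_update(service: str, symptom: str, next_update: str) -> str:
    """Render the update; fail loudly if the next-update time is missing."""
    if not next_update.strip():
        raise ValueError("Every update must include a next update time.")
    return STATUS_INITIAL.substitute(
        service=service, symptom=symptom, next_update=next_update
    )

print(render_status_update(
    "checkout", '"Unable to process payment" errors', "20:15 UTC"
))
```

Raising on a missing next-update time is deliberate: it is the one field people forget under pressure, and a hard failure is cheaper than a silent omission.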
---

# Copy-paste templates

## 1) Status page incident communication templates

### SEV0: complete outage

**Initial (send within 5 minutes of declaring incident):**

```
We're experiencing an outage affecting [service].
Customers may see [symptom].
We're investigating.

Next update: [HH:MM TZ] (in 15 minutes)
Status: Investigating
```

**Update (identified, working on fix):**

```
We've identified the issue and are working on a fix.
Customers may continue to see [symptom].

Next update: [HH:MM TZ]
```

**Update (mitigation in progress / partial recovery):**

```
We've applied a mitigation and are monitoring recovery.
Some customers may still see [symptom] while systems stabilize.

Next update: [HH:MM TZ]
```

**Resolved:**

```
This incident is resolved. [Service] is operating normally.
We'll share a brief post-incident summary within [24–48 hours].
```

### SEV1: degraded performance

```
We're seeing degraded performance affecting [service].
Some customers may see [symptom].
We're investigating.

Next update: [HH:MM TZ] (in 30–60 minutes)
```

### SEV2: minor impact / limited scope

```
Some customers may be experiencing [symptom].
This affects [region / tier / % of users].
We're investigating.
```

### Status page "what not to do"

- Don't post internal jargon ("shards," "rebalance," "GC pause").
- Don't promise resolution times you can't keep. Promise the next update time instead.
- Don't write 200-word paragraphs. Keep it under ~100 words.

---

## 2) Customer outage email templates (only when needed)

Use customer email when:

- SEV0 lasts > 30–60 minutes, or
- regulated / high-trust domain requires it, or
- you have contractual comms obligations.

### Customer email: initial notification

**Subject:** Service disruption affecting [Product/Feature]

```
We're currently experiencing an issue impacting [Product/Feature].
What you may see:
- [Symptom 1]
- [Symptom 2] (optional)

Current status: Investigating
Latest updates: [Status page URL]
Next update by: [HH:MM TZ]

We're sorry for the disruption.
[Company] Team
```

### Customer email: recovery in progress

**Subject:** Update: [Product/Feature] disruption (recovery in progress)

```
We've identified the cause and are implementing a fix.

Current impact: [Symptom] (if changed, say what changed)

Latest updates: [Status page URL]
Next update by: [HH:MM TZ]

[Company] Team
```

### Customer email: resolution + next steps

**Subject:** Resolved: [Product/Feature] disruption

```
The issue affecting [Product/Feature] is resolved.

Duration: [X minutes/hours]
Impact: [brief, customer-facing impact]

We'll publish a short post-incident summary within [24–48 hours] here:
[Link to summary or status page incident post]

[Company] Team
```

---

## 3) Executive incident update templates (forwardable)

Executives want business impact + timeline + next update.

### Exec update: initial

**Subject:** Incident Update: [Service] — SEV0 — [HH:MM TZ]

```
We're investigating a SEV0 incident affecting [service].

Impact:
- Customers affected: [X / % / unknown]
- Customer symptoms: [checkout failing / login errors / etc.]
- Revenue/contract risk: [known estimate / unknown at this time]

Timeline:
- Started: [HH:MM TZ]
- Current status: Investigating
- ETA: [honest estimate or "unknown at this time"]

Next update: [HH:MM TZ] (in 15–30 minutes) or sooner if status changes.

[Name], Incident Commander
```

### Exec update: follow-up (delta-based)

**Subject:** Update: [Service] incident — [Status]

```
What changed since last update:
- [1–3 bullets]

Current status: [Investigating / Fix in progress / Monitoring / Resolved]
Revised ETA: [if known / unchanged / unknown]

Next update: [HH:MM TZ]
```

### Exec summary: post-incident (within 24 hours)

**Subject:** Post-Incident Summary: [Service] — [Date]

```
The incident affecting [service] is resolved.
What happened (high level): [1–2 sentences]

Business impact:
- Duration: [X]
- Customers affected: [X / %]
- Revenue impact: [known / unknown]

Root cause (high level): [1–2 sentences]

What we're doing to prevent recurrence:
- [Action + owner + due date]
- [Action + owner + due date]
- [Action + owner + due date]

Postmortem: [link] (due [date])
```

---

## 4) Support "incident communication kit" (paste into Slack + pin)

Support needs a script and clear boundaries.

### Support kit: initial

```
🚨 INCIDENT COMMUNICATION KIT

Incident: [Service] is [down / degraded]
Severity: SEV0/SEV1/SEV2
Customer impact: [What customers are experiencing]
Status page: [URL]

What to tell customers (copy/paste):
"We're experiencing an issue affecting [service]. Our team is investigating.
We're posting updates here: [URL]. Next update by [HH:MM TZ]."

Do NOT promise:
- Resolution times
- Credits/compensation
- Root cause guesses

ETA: [honest estimate / unknown at this time]
Next support update: [HH:MM TZ]
Owner: [Incident Commander] in #[incident-channel]
```

### Support kit: update (only when context changes)

```
🚨 INCIDENT UPDATE — [HH:MM TZ]

What changed:
- [1–3 bullets]

Updated customer script: [only if needed; otherwise "same as above"]

Next support update: [HH:MM TZ]
```

For more on incident coordination, see [our guide on reducing context switching during incidents](/blog/engineering-productivity-incident-management).

---

## 5) Sales/CSM "key account note" (forwardable, low drama)

Use this when:

- customers are enterprise/high-touch, or
- you have renewal risk, or
- accounts are likely to escalate.

**Subject:** Update: [Service] disruption — status + next update

```
Sharing a quick update on an incident affecting [service].

Current customer impact: [One sentence]
Latest updates: [Status page URL]
Next update by: [HH:MM TZ]

If your customer asks for details: keep it to impact + status link.
Avoid root-cause speculation.
```

---

## 6) Internal engineering update (context without noise)

This is for broad awareness, not incident-room debugging.

```
FYI: SEV0/SEV1 incident in progress for [service].

Customer impact: [One sentence]
Incident channel: #[channel]
IC: [Name]
Status page: [URL]

Next update: [HH:MM TZ]
```

For more on incident roles, see [our incident response playbook with roles and escalation rules](/blog/incident-response-playbook).

---

## 7) Social incident response templates (X/LinkedIn)

Goal: acknowledge + link to status page. Nothing else.

**Acknowledgment (within 5–10 minutes of public awareness):**

```
We're aware of an issue affecting [service] and are investigating.
Updates: [canonical source URL]
```

**If issue persists > 1 hour:**

```
Still working on the [service] issue.
Latest updates: [canonical source URL]
```

**After resolution:**

```
The [service] issue is resolved. Thanks for your patience.
We'll share a post-incident summary within [24–48 hours].
```

---

## 8) Post-incident customer summary (short, trust-building)

This is not the engineering postmortem. It's a customer-facing closure.

```
Post-incident summary (customer-facing)

Incident: [1 sentence]
Duration: [X]
Customer impact: [1 sentence]

What we changed:
- [1–2 bullets]

How we'll prevent recurrence:
- [1–3 bullets]
```

For postmortem templates, see [our post-incident review templates with 3 ready-to-use formats](/blog/post-incident-review-template).

---

# Common communication failures (and how to prevent them)

### 1) The debugger is also the communicator

Fix: separate roles. IC owns comms; engineers fix.

### 2) "We'll be back in 10 minutes"

Fix: next update time, not resolution time.

### 3) Explaining internals instead of impact

Fix: describe symptoms, scope, workarounds.

### 4) Fragmented messaging

Fix: pick one canonical source (status page, customer email, or a single internal update doc) and make everything point to it.
### 5) Radio silence after resolution

Fix: close the loop with a short summary within 24 hours.

---

## How Runframe bakes this into your Slack incident workflow

Most teams don't fail at templates; they fail at consistency. The hard part is enforcing one owner, one canonical source, and a predictable cadence when everyone is stressed.

Runframe operationalizes the exact rules above inside Slack:

- **Role assignment:** the Incident Commander owns outbound updates. The scribe and responders stay focused on the fix.
- **Canonical-source discipline:** Runframe treats your chosen source (status page or customer email) as the timeline and makes every other update point to it.
- **Cadence prompts:** if a SEV0 is active and the next update time passes, Runframe nudges the IC to post the next update (no more "we forgot for 45 minutes" gaps).
- **Channel-specific templates:** the IC can post a customer-safe update, an exec update, or a support kit update without rewriting from scratch.

Two concrete examples:

1) **SEV0 declared:** the IC posts "Status page: Initial" (copy-paste), then immediately posts the support kit template in #support with the status link.
2) **Update time reached:** Runframe prompts the IC with the exact "Update (identified)" block so the next update goes out on time, with no fake ETA.

---

# The bottom line

Incident communication is a system, not a talent.

- Assign one owner (IC).
- Keep one source of truth (your canonical source).
- Use predictable cadence.
- Talk in impact, not internals.
- Say "unknown" when it's unknown.

Templates make this easy. They also make you look calm under pressure.
---

**Read more:**

- [Incident Response Playbook: Roles, Scripts & Templates](/blog/incident-response-playbook)
- [Reducing Context Switching During Incidents](/blog/engineering-productivity-incident-management)
- [Post-Incident Review Templates: 3 Ready-to-Use Formats](/blog/post-incident-review-template)
- [On-Call Rotation: Handoffs, Escalation, and Schedules](/blog/on-call-rotation-guide)
- [Incident Severity Levels: The Framework That Actually Works](/blog/incident-severity-levels)

---

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Who should send incident updates to customers?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "The Incident Commander. The person debugging should not also be writing customer updates. This separation prevents the technical fix from slowing down and the communication from being fragmented."
      }
    },
    {
      "@type": "Question",
      "name": "How often should we update during a SEV0?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Every 15 minutes on your canonical source (status page, customer email, or a single internal update doc), and every 15–30 minutes to executives. Always include the next update time."
      }
    },
    {
      "@type": "Question",
      "name": "What if we don't know the ETA?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Say 'unknown at this time' and commit to the next update time. Fake ETAs destroy trust faster than honesty."
      }
    },
    {
      "@type": "Question",
      "name": "Should we apologize in incident communications?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, briefly. Acknowledge impact. Avoid defensiveness and blame. Focus on fixing the problem and preventing recurrence."
      }
    },
    {
      "@type": "Question",
      "name": "Do we need to post public updates for SEV2?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "If customers are impacted, yes — briefly. If you're not sure, err toward a small note on your canonical source. Some customers are affected and they deserve transparency."
      }
    },
    {
      "@type": "Question",
      "name": "What's the difference between customer and executive updates?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Customers need impact + next update. Executives need impact + business risk + timeline + next update. Executives get business metrics; customers get simple language."
      }
    },
    {
      "@type": "Question",
      "name": "How do I handle fragmented communication across channels?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Pick one canonical source immediately — status page, customer email, or a single internal update doc. All other channels should link to it rather than duplicating information. When the canonical source updates, notify stakeholders that it updated."
      }
    },
    {
      "@type": "Question",
      "name": "What if customers are asking on social media?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Acknowledge once with your standard social template, link to your canonical source, then stop engaging. Don't debate, don't get defensive, and don't promise ETAs in replies. Every customer question gets the same response: link to canonical source."
      }
    }
  ]
}
</script>
Incident Communication: 8 Copy-Paste Templates for Status, Email & Execs
Stop writing updates at 2 AM. Copy-paste templates for status pages, emails, exec updates, and social posts. Plus cadence and ownership rules for SREs.