MSP Incident Management: How It Actually Works

Incident management is the backbone of MSP operations. It's how you turn chaos into a process. But in most MSPs, the reality is very different from the ITIL textbook version.

Priority Classifications

P1 — Critical (Response: 15 min, Resolution: 4 hours)

What qualifies as P1:
- Complete business down (all users affected)
- Core infrastructure failure (Exchange, AD, file server)
- Security breach or active attack
- Data loss event
- Compliance-critical system failure

Example scenarios:
- Ransomware encryption spreading across the network
- Primary domain controller offline
- Complete internet connectivity loss
- Financial system compromised

What happens:
- Immediate page to on-call engineer
- All other work stops
- Client notified within 15 minutes
- Bridge call established
- Status updates every 30 minutes
- Post-incident review mandatory

P2 — High (Response: 30 min, Resolution: 8 hours)

What qualifies as P2:
- Major business function impaired
- Multiple users affected (>20%)
- Workaround not available
- Performance degradation on critical systems

Example scenarios:
- Email delivery delays
- CRM system slow or unavailable
- Printer fleet down office-wide
- VPN connectivity issues affecting remote workers

What happens:
- Assigned to available engineer
- Client notified within 30 minutes
- Status updates every 2 hours
- Escalation to P1 if not resolved in 4 hours

P3 — Medium (Response: 2 hours, Resolution: 24 hours)

What qualifies as P3:
- Single user or small group affected
- Workaround available
- Non-critical system issue
- Convenience functions impaired

Example scenarios:
- Individual user can't print
- One user's Outlook not syncing
- Non-critical application slow
- Mobile device issues

What happens:
- Normal ticket queue processing
- Client notified within 2 hours
- Status updates at resolution
- Escalation to P2 if impact increases

P4 — Low (Response: 8 hours, Resolution: 5 days)

What qualifies as P4:
- Cosmetic issues
- Feature requests
- Information requests
- Scheduled maintenance items

Example scenarios:
- User wants a software change
- Password reset request
- New user setup (non-urgent)
- Documentation update request

What happens:
- Normal ticket queue processing
- Resolved during standard business hours
- No escalation required unless deadline-driven
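The response and resolution targets for the four priorities can be kept in a small lookup table so that deadlines are computed rather than remembered. The following is a minimal sketch, not a PSA integration: the names `SlaTarget`, `SLA_TARGETS`, and `sla_deadlines` are hypothetical, and treating "5 days" as calendar days (rather than business days) is an assumption your contracts may not share.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaTarget:
    response: timedelta
    resolution: timedelta


# Targets copied from the priority definitions above.
# Assumption: "5 days" means calendar days, not business days.
SLA_TARGETS = {
    "P1": SlaTarget(timedelta(minutes=15), timedelta(hours=4)),
    "P2": SlaTarget(timedelta(minutes=30), timedelta(hours=8)),
    "P3": SlaTarget(timedelta(hours=2), timedelta(hours=24)),
    "P4": SlaTarget(timedelta(hours=8), timedelta(days=5)),
}


def sla_deadlines(priority: str, created_at: datetime) -> tuple[datetime, datetime]:
    """Return (respond_by, resolve_by) for a ticket created at created_at."""
    target = SLA_TARGETS[priority]
    return created_at + target.response, created_at + target.resolution
```

For example, a P1 ticket created at 09:00 would need a response by 09:15 and a resolution by 13:00 the same day.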

Escalation Paths

Functional Escalation

L1 Service Desk
    → Triage and basic troubleshooting
    → If can't resolve in 30 min → L2

L2 Systems Engineer
    → Advanced troubleshooting
    → If can't resolve in 2 hours → L3

L3 Senior Engineer / Architect
    → Complex root cause analysis
    → If can't resolve → Vendor support / External specialist

L4 Vendor Support
    → Microsoft, Cisco, etc.
    → Escalation through partner channels
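
The tier time limits above reduce to a simple check. This sketch assumes each limit is measured from the moment the tier picked up the ticket, which the text does not specify; `TIERS` and `should_escalate` are illustrative names only.

```python
from datetime import timedelta
from typing import Optional

# Tier names and time limits come from the escalation path above.
# None means the text gives no fixed time limit for that tier.
TIERS: list[tuple[str, Optional[timedelta]]] = [
    ("L1 Service Desk", timedelta(minutes=30)),
    ("L2 Systems Engineer", timedelta(hours=2)),
    ("L3 Senior Engineer / Architect", None),
    ("L4 Vendor Support", None),
]


def should_escalate(tier_index: int, time_at_tier: timedelta) -> bool:
    """True once a ticket has sat at a tier for that tier's full time limit."""
    _, limit = TIERS[tier_index]
    return limit is not None and time_at_tier >= limit
```

L3 and L4 have no timer in this sketch, so escalating from them stays a judgment call, as in the text.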

Management Escalation

15 min — Team Lead notified
30 min — Service Delivery Manager notified
1 hour — Operations Manager notified
2 hours — Director / VP notified
4 hours — Executive briefing (for P1)

Client Escalation

15 min — Client primary contact notified (email + phone)
30 min — Client IT manager notified
1 hour — Client executive sponsor notified (P1 only)
2 hours — Client leadership briefing (P1 only)
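
The management and client schedules above are both "who should know by minute N" tables, so one helper can answer either. This is a rough sketch with hypothetical names (`MANAGEMENT`, `CLIENT`, `due_notifications`). It assumes elapsed time is counted from incident start and that steps marked P1 in the text are skipped for other priorities.

```python
# Each entry: (minutes since incident start, who to notify, P1-only?)
MANAGEMENT = [
    (15, "Team Lead", False),
    (30, "Service Delivery Manager", False),
    (60, "Operations Manager", False),
    (120, "Director / VP", False),
    (240, "Executive briefing", True),
]

CLIENT = [
    (15, "Client primary contact", False),
    (30, "Client IT manager", False),
    (60, "Client executive sponsor", True),
    (120, "Client leadership briefing", True),
]


def due_notifications(schedule, elapsed_minutes: int, priority: str) -> list[str]:
    """List everyone who should have been notified by now, in order."""
    return [
        who
        for minutes, who, p1_only in schedule
        if elapsed_minutes >= minutes and (priority == "P1" or not p1_only)
    ]
```

For instance, 45 minutes into a P2, `due_notifications(MANAGEMENT, 45, "P2")` lists the Team Lead and the Service Delivery Manager.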

Communication Templates

Initial Notification (P1)

Subject: [P1] [Client] - [Issue Summary] - OUTAGE

Severity: P1 - Critical
Status: Investigating
Impact: [X users affected] - [Business function] unavailable
Start Time: [Time]
Assigned To: [Engineer name]

Next update: [Time + 30 min]

Current Actions:
- Investigating root cause
- [Specific action taken]
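
If you send these often, the initial notification can be filled from a few fields so the "next update" time is never miscalculated under pressure. The sketch below is illustrative only: `render_initial_p1` and its parameters are hypothetical, and it assumes the next update is due 30 minutes after the message is sent.

```python
from datetime import datetime, timedelta

# Layout mirrors the Initial Notification (P1) template above.
INITIAL_P1_TEMPLATE = """\
Subject: [P1] {client} - {summary} - OUTAGE

Severity: P1 - Critical
Status: Investigating
Impact: {users_affected} users affected - {business_function} unavailable
Start Time: {start:%Y-%m-%d %H:%M}
Assigned To: {engineer}

Next update: {next_update:%Y-%m-%d %H:%M}

Current Actions:
- Investigating root cause
{actions}"""


def render_initial_p1(*, client, summary, users_affected, business_function,
                      start, sent_at, engineer, actions,
                      update_interval=timedelta(minutes=30)):
    """Fill the P1 initial notification; next update = sent_at + interval."""
    return INITIAL_P1_TEMPLATE.format(
        client=client,
        summary=summary,
        users_affected=users_affected,
        business_function=business_function,
        start=start,
        engineer=engineer,
        next_update=sent_at + update_interval,
        actions="\n".join(f"- {action}" for action in actions),
    )
```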

Status Update (P1)

Subject: [P1] [Client] - [Issue Summary] - UPDATE [X]

Status: [Investigating/Identified/Monitoring/Resolved]
Impact: [Current impact]
Root Cause: [If identified]

Actions Taken:
- [Action 1]
- [Action 2]

Next Steps:
- [Next action]
- ETA for resolution: [Time]

Next update: [Time + 30 min]

Resolution Notification

Subject: [P1] [Client] - [Issue Summary] - RESOLVED

Status: Resolved
Resolution Time: [X hours X minutes]
Root Cause: [Brief description]

Actions Taken:
- [Resolution steps]

Follow-up:
- Post-incident review scheduled for [Date]
- [Any ongoing monitoring]

Please confirm normal operations from your end.

SLA Implications

How SLAs Affect Your Work

Response Time SLA:
- The clock starts when the ticket is created
- If you're on-call, your phone should never be on silent
- Missed response SLAs = client credits = your bonus takes a hit

Resolution Time SLA:
- Complex issues may legitimately exceed SLA
- But poor documentation and communication make it look worse
- Escalate early if you think you'll miss resolution SLA

Escalation SLAs:
- If you don't escalate per process, you own the delay
- Document every escalation and response
- Cover yourself: "Escalated to L2 at [time], awaiting response"

SLA Breach Consequences

For the MSP:
- Financial penalties (contractual credits)
- Client churn risk
- Reputation damage
- Partner status impact (Microsoft, etc.)

For you:
- Performance reviews
- Bonus impact
- On-call rotation changes
- Increased scrutiny

On-Call: The Reality

What On-Call Actually Involves

Phone must be on 24/7 during your rotation
Response within 15 minutes of page (most MSPs use PagerDuty or similar)
Resolve or escalate — don't sit on issues during on-call
Document everything — on-call work must be logged in the PSA
Handover — brief the next on-call engineer on any open issues

Surviving On-Call

Prepare before your rotation — Know the current state of all clients
Keep tools accessible — VPN, RMM, PSA should be on your phone
Set up alerts properly — Don't get paged for every low-priority ticket
Sleep when you can — But don't miss pages
Track your hours — On-call should be compensated (standby + call-in rates)
Request TOIL — Time off in lieu for after-hours work

On-Call Compensation (Australia)

Under most awards and agreements:
- Standby allowance: paid for being available (typically $3-5/hour)
- Call-in rate: minimum 3 hours at overtime rates when called in
- Weekend/holiday rates: higher standby and call-in rates

Warning: If your MSP doesn't compensate on-call work, check your award and employment contract. Under the Professional Employees Award, additional hours beyond 38/week must be "reasonable" and may attract penalty rates.
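
To sanity-check a payslip against the standby and call-in structure described above, the arithmetic can be sketched as follows. Every figure here is a placeholder: the 1.5x overtime multiplier, the hourly rates, and the function name `on_call_pay` are assumptions for illustration, and your actual entitlement depends on your award or agreement.

```python
def on_call_pay(standby_hours: float, standby_rate: float,
                call_in_hours: list[float], base_hourly_rate: float,
                overtime_multiplier: float = 1.5,
                minimum_call_in_hours: float = 3.0) -> float:
    """Estimate on-call pay: standby allowance plus call-ins at overtime rates.

    Each call-in is paid for at least minimum_call_in_hours.
    overtime_multiplier is an assumed value; check your own award.
    """
    standby = standby_hours * standby_rate
    paid_hours = sum(max(hours, minimum_call_in_hours) for hours in call_in_hours)
    call_in = paid_hours * base_hourly_rate * overtime_multiplier
    return standby + call_in
```

As a worked example under those assumptions: 100 standby hours at $4/hour is $400. Two call-ins of 0.5 and 4 hours are paid as 3 + 4 = 7 hours, and at a $50 base rate with a 1.5x multiplier that is $525. The total is $925.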

