
MSP Disaster Recovery Testing: Don't Wait Until It's Too Late - MSP Guide Australia

Compliance | 2026-06-11 | 5 min read | 1089 words


Your MSP says you have a disaster recovery plan. Your backups are running. Your systems are protected. But when was the last time anyone actually tested whether recovery works?

In the Australian MSP industry, disaster recovery testing is one of the most neglected activities. Many MSPs back up data religiously but never verify that the backups can be restored in a real disaster. Here is why testing matters and how to ensure it happens.

Why Untested DR Plans Are Dangerous

A disaster recovery plan that has not been tested is a hypothesis, not a plan. It may look good on paper, but until you have actually recovered systems from backups and verified they work, you have no confidence that recovery will succeed when you need it.

Common DR Testing Failures

  • Backup corruption — backups were running but the data was corrupted, making restoration impossible
  • Missing dependencies — the primary system was restored but a critical dependency was not
  • Insufficient infrastructure — the DR site cannot handle the load of production systems
  • Configuration drift — the DR environment was not updated to match production changes
  • Documentation gaps — the recovery steps were incomplete or outdated
  • Personnel issues — the people who wrote the plan are no longer available

The Real-World Consequence

When a disaster strikes and recovery fails, the impact is catastrophic:

  • Extended downtime (days instead of hours)
  • Data loss beyond the backup window
  • Revenue loss from inability to serve customers
  • Regulatory penalties for breach notification delays
  • Reputational damage that persists long after systems are restored

Types of DR Testing

Tabletop Exercises

A guided discussion where stakeholders walk through a disaster scenario:

  • What happens if your primary data centre goes offline?
  • Who is responsible for initiating recovery?
  • How do you communicate with staff and customers?
  • What are the recovery priorities?

Tabletop exercises are low-cost and reveal gaps in planning without any technical risk. Run them at least twice per year; quarterly, as in the testing calendar in this guide, is a better target.

Parallel Tests

Recovery systems are built and tested alongside production:

  • Restore data to a separate environment
  • Verify the restored systems function correctly
  • Measure actual recovery time against targets
  • Identify issues without impacting production

Parallel tests provide meaningful validation without the risk of disrupting live operations.

Full Failover Tests

Production traffic is actually switched to recovery systems:

  • All critical systems fail over to DR infrastructure
  • Production operations continue on DR systems
  • Systems fail back to primary after verification

Full failover tests are the most realistic but carry the highest risk. They require careful planning and should only be conducted when the business can tolerate potential disruption.

What Recovery Metrics Mean

Recovery Time Objective (RTO)

The maximum acceptable time from disaster declaration to restored operations:

RTO      | What It Means                     | Typical Cost
1 hour   | Mission-critical systems          | Premium
4 hours  | Business-critical systems         | Moderate
24 hours | Important but not urgent          | Standard
72 hours | Can tolerate significant downtime | Budget

Recovery Point Objective (RPO)

The maximum acceptable data loss measured in time:

RPO      | What It Means                   | Backup Frequency
1 hour   | Can lose up to 1 hour of data   | Hourly backups
4 hours  | Can lose up to 4 hours of data  | 4-hourly backups
24 hours | Can lose up to 1 day of data    | Daily backups
1 week   | Can lose up to 1 week of data   | Weekly backups

Your MSP should define RTO and RPO for each critical system and test against these targets. Our MSP Backup and Disaster Recovery guide covers backup strategy in detail.
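As a rough illustration of testing against these targets, the sketch below compares what a drill achieved with the agreed RTO and RPO. The function name, field names, and timestamps are hypothetical. Achieved RTO is taken as the time from disaster declaration to restored operations, and achieved RPO as the gap between the last good backup and the declaration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTargets:
    rto: timedelta  # maximum acceptable time to restore operations
    rpo: timedelta  # maximum acceptable window of lost data


def evaluate_drill(targets, declared_at, restored_at, last_good_backup_at):
    """Compare what a DR drill achieved against the agreed targets."""
    achieved_rto = restored_at - declared_at
    achieved_rpo = declared_at - last_good_backup_at
    return {
        "achieved_rto": achieved_rto,
        "achieved_rpo": achieved_rpo,
        "rto_met": achieved_rto <= targets.rto,
        "rpo_met": achieved_rpo <= targets.rpo,
    }


# Hypothetical drill for a business-critical system (4-hour RTO, 4-hour RPO).
targets = RecoveryTargets(rto=timedelta(hours=4), rpo=timedelta(hours=4))
result = evaluate_drill(
    targets,
    declared_at=datetime(2026, 3, 2, 9, 0),
    restored_at=datetime(2026, 3, 2, 14, 30),
    last_good_backup_at=datetime(2026, 3, 2, 6, 0),
)
print(result["achieved_rto"], result["rto_met"])  # 5:30:00 False
print(result["achieved_rpo"], result["rpo_met"])  # 3:00:00 True
```

In this made-up drill the backups were recent enough, but recovery took 90 minutes longer than the target. That is the kind of finding a test report should record.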

How to Verify Your MSP's DR Testing

Ask for Test Reports

Request documentation of the most recent DR test, including:

  • Date of the test
  • Scope (what systems were tested)
  • Methodology (tabletop, parallel, or failover)
  • Results (what worked, what failed)
  • Issues found and remediation actions
  • Actual RTO and RPO achieved vs targets

If your MSP cannot produce this documentation, testing is either not happening or not being recorded.
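If your MSP supplies test reports in a structured form, a completeness check like the one below can flag gaps quickly. This is a minimal sketch: the field names are assumptions that mirror the checklist above, not a standard report schema.

```python
# Field names are illustrative, mirroring the checklist of report contents.
REQUIRED_FIELDS = (
    "test_date",
    "scope",
    "methodology",
    "results",
    "issues_and_remediation",
    "achieved_vs_target",
)


def missing_fields(report: dict) -> list:
    """Return required fields that are absent or empty in a DR test report."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]


# Hypothetical report: empty values are treated the same as missing ones.
report = {
    "test_date": "2026-05-14",
    "scope": ["file server", "accounting database"],
    "methodology": "parallel",
    "results": "file server restored; database restore failed on first attempt",
    "issues_and_remediation": [],
    "achieved_vs_target": None,
}
print(missing_fields(report))  # ['issues_and_remediation', 'achieved_vs_target']
```

Here the report describes a failed restore but lists no issues or remediation, and gives no achieved RTO or RPO. Both gaps are worth raising with the MSP.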

Request RTO and RPO Commitments

Ensure your MSP contract specifies RTO and RPO targets for each service tier. Without contractual commitments, there is no accountability for recovery performance.

Attend DR Tests

Request to participate in tabletop exercises. This ensures you understand the recovery process and can provide input on business priorities.

Review DR Infrastructure

Ask your MSP to demonstrate the DR environment. Where is it located? How is it maintained? How current is the data? What capacity does it have?

The Testing Calendar

A practical testing schedule for MSP clients:

Test Type               | Frequency     | Participants
Tabletop exercise       | Quarterly     | MSP + client leadership
Backup restoration test | Monthly       | MSP technical team
Parallel DR test        | Semi-annually | MSP + client IT
Full failover test      | Annually      | All stakeholders
Communication test      | Quarterly     | MSP + client leadership
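A schedule like this is easy to track programmatically. The sketch below flags tests that are overdue, assuming the frequencies above can be approximated as day counts. The `last_run` dates are invented for illustration.

```python
from datetime import date

# Frequencies from the testing calendar, approximated in days (an assumption).
SCHEDULE_DAYS = {
    "Tabletop exercise": 91,
    "Backup restoration test": 31,
    "Parallel DR test": 182,
    "Full failover test": 365,
    "Communication test": 91,
}


def overdue_tests(last_run: dict, today: date) -> list:
    """Return tests never run, or last run longer ago than their interval."""
    overdue = []
    for test, interval_days in SCHEDULE_DAYS.items():
        last = last_run.get(test)
        if last is None or (today - last).days > interval_days:
            overdue.append(test)
    return overdue


# Hypothetical record of when each test was last performed.
last_run = {
    "Tabletop exercise": date(2026, 4, 20),
    "Backup restoration test": date(2026, 3, 1),
    "Parallel DR test": date(2026, 1, 15),
    "Full failover test": date(2025, 2, 10),
}
print(overdue_tests(last_run, today=date(2026, 6, 11)))
# ['Backup restoration test', 'Full failover test', 'Communication test']
```

A test with no recorded run, such as the communication test here, is treated as overdue, since undocumented testing cannot be verified.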

Red Flags in MSP DR Testing

"We Test Automatically"

Some MSPs claim their monitoring tools "automatically test" backups. Automated verification typically confirms that backup files exist and are readable. It does not, on its own, verify that systems can be restored and function correctly with all their dependencies. Automated checks are useful but insufficient.
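The gap between "the backup file is readable" and "the restore is complete" can be shown with a small sketch. It assumes a zip archive as the backup and a manifest of file checksums recorded at backup time. The file names are hypothetical, and a real restore test would also start the application and check that it works.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_is_readable(archive: Path) -> bool:
    """Shallow check: the archive opens and every member passes its CRC."""
    try:
        with zipfile.ZipFile(archive) as zf:
            return zf.testzip() is None
    except (OSError, zipfile.BadZipFile):
        return False


def restore_matches_manifest(archive: Path, manifest: dict) -> bool:
    """Deeper check: restore to a scratch directory and compare checksums."""
    with tempfile.TemporaryDirectory() as scratch:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(scratch)
        for name, digest in manifest.items():
            restored = Path(scratch) / name
            if not restored.is_file() or sha256(restored) != digest:
                return False
    return True


with tempfile.TemporaryDirectory() as work:
    src = Path(work) / "src"
    src.mkdir()
    (src / "app.db").write_text("customer records")
    (src / "app.conf").write_text("db_path=app.db")
    manifest = {p.name: sha256(p) for p in src.iterdir()}

    # Simulate a backup job that silently skips the configuration file.
    archive = Path(work) / "backup.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.write(src / "app.db", arcname="app.db")

    readable = backup_is_readable(archive)
    restorable = restore_matches_manifest(archive, manifest)

print(readable, restorable)  # True False
```

The archive passes the automated readability check, yet the restore is missing a file the system depends on. Only the restore step reveals that.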

No Client Involvement

If your MSP tests DR without any involvement from your team, you are not prepared for a real disaster. Recovery requires coordination between the MSP and the business.

No Test Documentation

If there are no written test reports, the testing is not being taken seriously. Documentation creates accountability and provides evidence of due diligence.

Always "Successful"

If every DR test is reported as "successful" with no issues found, either the tests are too superficial or the reports are not honest. Real testing reveals real issues — and that is the point.

No Improvement Actions

If the same issues appear in consecutive test reports without remediation, the testing process is not driving improvement.
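One way to spot this pattern is to compare the issue lists of consecutive reports. The sketch below assumes each report carries a list of short issue labels; the report structure and labels are made up for illustration.

```python
def recurring_issues(reports: list) -> list:
    """Return issues that appear in two consecutive reports (oldest first)."""
    repeated = set()
    for earlier, later in zip(reports, reports[1:]):
        repeated |= set(earlier["issues"]) & set(later["issues"])
    return sorted(repeated)


# Hypothetical reports, ordered oldest to newest.
reports = [
    {"date": "2025-06", "issues": ["runbook outdated", "DR site undersized"]},
    {"date": "2025-12", "issues": ["runbook outdated", "DNS failover slow"]},
    {"date": "2026-06", "issues": ["DNS failover slow"]},
]
print(recurring_issues(reports))  # ['DNS failover slow', 'runbook outdated']
```

Any issue in the output was carried over from one test to the next without being fixed, so it is a reasonable starting point for a conversation with the MSP.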

DR Testing and Compliance

For regulated industries and organisations working to security frameworks, DR testing is often a compliance requirement:

  • Essential 8: the regular backups strategy expects restoration from backups to be tested as part of disaster recovery exercises
  • ISO 27001: requires testing of business continuity and disaster recovery plans
  • SOC 2: testing of recovery plan procedures falls under the Availability trust services criteria
  • APRA CPS 234: requires testing of information security controls, including incident response plans

Our Essential 8 Implementation Checklist includes data recovery requirements.

The Bottom Line

Disaster recovery is not a set-and-forget activity. It requires regular testing, documentation, and improvement. Your MSP should be conducting DR tests and providing you with evidence of results.

If your MSP cannot demonstrate that they test DR — with documentation, results, and improvement actions — you have an untested recovery plan. And an untested plan is no plan at all.


Use our MSP Health Score to evaluate your MSP's operational maturity, or review our Essential 8 Guide for data recovery requirements.

Frequently Asked Questions

How often should disaster recovery testing occur?
Full DR tests should occur at least annually, with partial tests quarterly. Critical systems should be tested monthly. Many MSPs do not test DR at all, which means their clients have untested recovery plans.

What types of DR testing should an MSP perform?
There are three main types: tabletop exercises (walk through the plan), parallel tests (test recovery in parallel with production), and full failover tests (actually switch to backup systems). Each serves a different purpose and provides a different level of confidence.

Should I as a business owner be involved in DR testing?
Yes. You should understand the recovery plan, know the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for your critical systems, and participate in tabletop exercises. If the MSP handles testing without your involvement, you may not be prepared for an actual disaster.

What if my MSP says they test DR but cannot prove it?
Ask for test reports, including the date of the last test, what was tested, what the results were, and what issues were found. If they cannot produce documentation, the testing is either not happening or not being documented. Both are problems.
