How do RTO and RPO affect your Disaster Recovery Plan?

28 June 2021 - by Charlie Metcalfe



Introduction

Organisations that offer time-critical services to their customers cannot afford periods of partial or full system downtime.

Disasters that degrade your infrastructure or take it offline completely can happen in a split second, and at any time. Incidents ranging from natural disasters, fire and flood to human error, power outages and theft can all take your infrastructure offline for days, or even weeks.

This downtime costs valuable productivity and can leave you unable to provide some or all of your business services. Delays to customer projects, for example, can damage your organisation’s reputation. This is one of the main reasons why every organisation must have a disaster recovery and business continuity plan in place, and must test it regularly to prove that it works.

There’s a lot to take in when it comes to RTO and RPO, and how these parameters should be fine-tuned to your organisation and the unique service you provide. In this blog post, our Disaster Recovery experts at Flownet have created a guide to help you understand the importance of these parameters. If you’d like to learn more, we invite you to contact us.

What do RTO and RPO mean?

Whilst they look similar, RTO and RPO have very different meanings, and both are equally important when planning an effective Disaster Recovery system.

RTO (Recovery Time Objective) looks forward: it is the maximum acceptable downtime of your environment. RTO defines how quickly services should be ‘back online’ after an outage, in other words how quickly your systems should recover to an operational state.
RPO (Recovery Point Objective) looks backwards: it is the maximum amount of data loss that is acceptable once the system has been restored. In practice, it determines how frequently you replicate or back up, and therefore how much data you could lose and have to re-enter following a system failure or outage.

If you would like to learn more about these terms, please contact us today.

What is an example of RTO and RPO?

As discussed, RTO is about the downtime of your system, or how long it’ll take to recover. RPO, on the other hand, is about the data lost since the last successful replication or backup, once systems have been recovered within the timeframe of the RTO.

Imagine you’re working on an important project and there’s a power outage. You could think of the RPO in this situation in terms of the last time you saved your work: it is the amount of work you can afford to lose between your last save and the power cut. The RTO could be thought of as how long you can manage before you need to be able to work on that document again.
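To put numbers on the power-cut example, here is a minimal Python sketch. The timestamps, the `Incident` class and the `meets_objectives` helper are all hypothetical and chosen purely for illustration; they do not describe any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    last_backup: datetime   # last successful save, backup or replication
    outage_start: datetime  # moment the system went offline
    restored_at: datetime   # moment the system was usable again

    @property
    def data_loss_window(self) -> timedelta:
        # Work done between the last backup and the outage is lost.
        return self.outage_start - self.last_backup

    @property
    def downtime(self) -> timedelta:
        # Time during which the system could not be used.
        return self.restored_at - self.outage_start


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    """True if the data loss stayed within the RPO and the downtime within the RTO."""
    return incident.data_loss_window <= rpo and incident.downtime <= rto


# Hypothetical timeline: saved at 10:00, power cut at 10:45, back online at 12:15.
incident = Incident(
    last_backup=datetime(2021, 6, 28, 10, 0),
    outage_start=datetime(2021, 6, 28, 10, 45),
    restored_at=datetime(2021, 6, 28, 12, 15),
)

print(incident.data_loss_window)  # 0:45:00
print(incident.downtime)          # 1:30:00
print(meets_objectives(incident, rpo=timedelta(hours=1), rto=timedelta(hours=2)))  # True
```

With an assumed RPO of one hour and RTO of two hours, this incident stays within both objectives: 45 minutes of work is lost and the system is down for 90 minutes.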

For example, agreements with customers on project turnaround times would be a key influence when setting a realistic RTO, whereas the number of changes made in a given period could be a key driver when setting an RPO. In other words, a system whose data changes frequently may need a shorter RPO than one whose data rarely changes, in order to avoid losing a large amount of data.
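The link between rate of change and RPO can be sketched as simple arithmetic. The example below assumes periodic backups, a steady rate of change, and that the worst case is an outage just before the next backup is due; the function names and figures are hypothetical.

```python
from datetime import timedelta


def worst_case_changes_lost(changes_per_hour: float, backup_interval: timedelta) -> float:
    """Changes lost if the outage happens just before the next backup runs."""
    hours = backup_interval.total_seconds() / 3600
    return changes_per_hour * hours


def max_backup_interval(changes_per_hour: float, tolerable_changes_lost: float) -> timedelta:
    """Longest gap between backups that keeps losses within the tolerable amount."""
    if changes_per_hour <= 0:
        raise ValueError("changes_per_hour must be positive")
    return timedelta(hours=tolerable_changes_lost / changes_per_hour)


# Hypothetical busy system: 1,200 changes an hour, backed up every 4 hours.
print(worst_case_changes_lost(1200, timedelta(hours=4)))  # 4800.0

# If losing more than 300 changes is unacceptable, back up at least this often:
print(max_backup_interval(1200, 300))  # 0:15:00
```

Under these assumptions, the busier the system, the shorter the backup interval, and so the RPO, has to be to keep the potential loss at the same level.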

How do RTO and RPO differ in my Disaster Recovery planning?

Both RTO and RPO are critical in planning and designing a Disaster Recovery system. RTO defines how long your organisation can survive offline before you need systems online again, while RPO defines how much data you can afford to lose. Different systems can have different RTO and RPO values, depending on how critical each service is.

There’s no right or wrong answer as to how much data loss or downtime you can tolerate. This is why having a business continuity plan which reflects the specifics and intricacies of your business operations is crucial.

Many organisations have a variety of systems, which vary in criticality and importance. This is why you can have ‘tiers’ of services which fall under different RTO and RPO measures, in order to break down which services are the most important to get back online, and which you can ‘live without’ for a little longer following an outage.

For example, you could organise your services into three tiers:

Tier 1 – Mission Critical Systems

These are the services your business needs at all times in order to operate. For instance, in an IT disaster recovery sense, this could be your CRM, sales platform, website or other highly important application. In a physical sense, it could be the power feed into your server room, or your server room air conditioning.

Tier 1 services should be recovered within 0 to 2 hours.

Tier 2 – Business Critical Systems

Business-critical applications are less critical than your mission-critical services. Often, your business-critical services depend upon your mission-critical services. For example, Tier 2 services may be powered by your Tier 1 servers.

In an IT Disaster Recovery sense, an example of a Tier 2 service could be your Remote Desktop / VPN system, especially in today’s flexible working world, but it could also be a payment gateway or other Line of Business application.

Tier 2 services should be recovered within 4 to 24 hours.

Tier 3 – Non-Critical Services

Non-Critical Services are those you could survive without, although it’s inconvenient to have them offline. Examples could be your phone lines in an environment where phones are less used, or your development / testing systems, rather than production systems. For example, if you have in-house developers, then it’s important that they can continue working on their development systems, but regaining production systems is more important in an outage.

Tier 3 services have the lowest priority when recovering from downtime. However, they should still be restored within around 24 to 48 hours.
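The three tiers can be sketched as a simple lookup that turns a list of services into a recovery order. The recovery windows are the upper bounds quoted for each tier; the service inventory and the `recovery_plan` helper are hypothetical, loosely based on the examples given.

```python
from datetime import timedelta

# Upper bound of the recovery window quoted for each tier.
TIER_RTO = {
    1: timedelta(hours=2),   # Mission Critical Systems
    2: timedelta(hours=24),  # Business Critical Systems
    3: timedelta(hours=48),  # Non-Critical Services
}

# Hypothetical inventory: service name -> tier.
SERVICES = {
    "Development environment": 3,
    "CRM": 1,
    "VPN": 2,
    "Website": 1,
}


def recovery_plan(services: dict[str, int]) -> list[tuple[str, int, timedelta]]:
    """Order services by tier (most critical first), then by name."""
    ordered = sorted(services.items(), key=lambda item: (item[1], item[0]))
    return [(name, tier, TIER_RTO[tier]) for name, tier in ordered]


for name, tier, rto in recovery_plan(SERVICES):
    hours = int(rto.total_seconds() // 3600)
    print(f"Tier {tier}: {name} (restore within {hours} hours)")
```

In this sketch the CRM and website are restored first, then the VPN, and the development environment last. Your own tier assignments would depend on what your business relies on most.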

The perfect balance between What’s Ideal & What’s Realistic


In an ideal world, every organisation’s RTO and RPO would be as close to zero as possible. However, for the majority of organisations, this would be extremely expensive and may not be required. This is exactly why you cannot take a “one size fits all” approach to your business continuity and disaster recovery plan: it needs to be bespoke for your organisation.

Flownet’s disaster recovery experts can help your organisation understand the recovery times your various applications and systems need, and define what works for you.

Contact Flownet for your Business Continuity & Disaster Recovery Solutions

At Flownet, we have a team of experienced IT professionals, trusted by organisations across a variety of sectors, who are well versed in the latest technologies and best practices needed to design and build a rock-solid Disaster Recovery solution that protects you when you need it most.

Flownet’s team of experts have the capability to understand your business needs and explain intricate IT issues in a way that’s easy to understand, ensuring you have a robust business continuity plan. Our team will also ensure that your plan is always maintained and tested frequently – so it’s available for when you need it the most!

For professional advice and consultancy on business continuity and disaster recovery solutions, contact us today!