This week Charlie looks at how businesses determine their Recovery Time Objectives (RTO) and presents the possibility of a new type of RTO.
Next week, Ewan (PlanB Senior Consultant and occasional BC Training tutor) and I are going to conduct a full business continuity lifecycle for an international airport. We have both been thinking about how we will adapt our lifecycle to this particular type of client. One of the things I have been thinking about a lot is how businesses determine and represent their RTOs.
Within the BCI’s Good Practice Guidelines 2013, an RTO is defined as 'The period of time following an incident within which a product or an activity must be resumed, or resources must be recovered'. For many activities, this is quite simple. As part of your BIA, you can identify an activity such as 'producing management accounts', and by looking at the impact of the activity not being done, you can provide a draft RTO of, say, three days. During the design stage of the lifecycle, you can work out whether a strategy can be devised to meet the draft RTO. If you can’t meet the three days, you may be able to adjust the RTO to, say, four days, and devise a strategy that can meet it.
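To make the draft-then-adjust step concrete, here is a minimal sketch in Python. The `Activity` class, its field names and the `agreed_rto` function are hypothetical and exist only for illustration; the sketch assumes that the time a strategy can actually achieve is already known from the design stage.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Activity:
    """Hypothetical record of one activity identified in the BIA."""
    name: str
    draft_rto: timedelta            # draft RTO taken from the BIA
    achievable_recovery: timedelta  # what the designed strategy can deliver


def agreed_rto(activity: Activity) -> timedelta:
    """Return an RTO that the strategy can reasonably meet.

    If the strategy cannot meet the draft RTO, the RTO is relaxed
    to the time the strategy can actually achieve.
    """
    return max(activity.draft_rto, activity.achievable_recovery)


accounts = Activity(
    name="producing management accounts",
    draft_rto=timedelta(days=3),
    achievable_recovery=timedelta(days=4),
)
print(agreed_rto(accounts).days)  # 4
```

In practice the adjustment is a management decision rather than a simple maximum, so treat this only as a way of showing that the agreed figure should never be shorter than what the strategy can deliver.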
Once you have looked at all the RTOs and have strategies in place, you should then get them signed off by top management. This is so they understand that if an incident occurs, they have already agreed that they will have no management accounts produced until four days after the disaster. I am a great believer that you should be honest in your designation of RTOs and you should never have an agreed RTO which cannot reasonably be met. This is for several reasons:
1. Management might believe that the RTO can be met, and if it is not, the impact on the business could be major.
2. You might tell your customers that they will get a certain activity back within the RTO, but you could be effectively lying to them if that time cannot be met.
3. If you are audited and it is discovered that the RTO has been stated but cannot be met, it could easily undermine the credibility of the rest of your programme.
So, this brings us back to the airport. When you speak to the senior managers and you ask them for their tolerance for downtime, they will tell you (I suspect) that they want their airport operational all the time. If there is downtime they will want it up and running as soon as possible.
Within the airport, if we have key activities which we cannot operate without, such as security scanning or vital IT systems which manage boarding, how do we set their RTOs? We have some options:
1. We could set them at 0 hours, but we know that unless we have two completely separate sets of scanning equipment the RTO cannot be 0. The RTO will be as long as it takes to fix the equipment (minutes or hours) or, in a worst-case scenario, to replace it (weeks or months).
2. They could be set at an arbitrary time of 1 or 2 hours, which we think is the most likely time it will take to conduct repairs in the majority of cases. In this case, there is absolutely no guarantee that this time can be met.
3. Or we could set them at several months, which is the time it will take for them to be replaced.
Option three might be very difficult to sell to senior managers, as it suggests that the business can survive with the airport not operational for several months.
I think there needs to be a new designated RTO, and I have called it a 'Keystone RTO', although I am open to suggestions for a better term! The idea is that a Keystone RTO is assigned to an activity or resource which is essential to the delivery of your key products and services, and without which delivery stops. The designation is a 0 hours RTO, which shows its importance to the process. We must accept that with Keystone RTOs, we will do whatever it takes to get up and running, but we have no ‘guaranteed’ or planned recovery time. The recovery will take as long as it takes. If we can designate a ‘keystone’, we can say to our clients or auditors that we recognise its importance, but we understand that it will not be instantly recovered.
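One way to picture how a Keystone RTO might sit alongside ordinary RTOs in a BIA record is the short Python sketch below. The `RecoveryObjective` class, its fields and the wording of the output are all hypothetical, chosen only to illustrate the idea that a keystone is flagged as 0 hours without promising a recovery time.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    """Hypothetical BIA entry for one activity or resource."""
    activity: str
    rto: timedelta
    keystone: bool = False  # True marks a Keystone RTO

    def describe(self) -> str:
        # A keystone is designated 0 hours to show its importance,
        # but carries no guaranteed or planned recovery time.
        if self.keystone:
            return (f"{self.activity}: Keystone RTO "
                    "(0 hours designated, no guaranteed recovery time)")
        hours = self.rto.total_seconds() / 3600
        return f"{self.activity}: RTO {hours:g} hours"


scanning = RecoveryObjective("security scanning", timedelta(0), keystone=True)
accounts = RecoveryObjective("producing management accounts", timedelta(days=4))

print(scanning.describe())
print(accounts.describe())  # producing management accounts: RTO 96 hours
```

Keeping the keystone flag separate from the RTO value means a report for clients or auditors can show the 0 hours designation while making clear that it is not a commitment to instant recovery.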
Many organisations, especially utilities, manufacturing, critical infrastructure, emergency services, transport and, in our case, airports, have a very low tolerance of downtime but have to accept that it will happen. If they give their essential activities or resources a Keystone RTO, they are at least acknowledging that these are absolutely key to the organisation and will be restored as soon as possible.