This week Charlie discusses the recent British Airways IT disaster and how the incident was handled by the organisation.
The pictures of stranded passengers sleeping on the floor at Gatwick and Heathrow are not good for the reputation of British Airways. I noticed, with slight amusement, that every business continuity and IT armchair pundit has taken to social media to express their outrage. Many are asking why the airline didn’t have backup systems, where their planning for this type of event was and why the plan hadn’t been exercised, as this would have greatly improved their response.
British Airways is a reasonably well-run airline and their profits last year were twice those of similar-sized airlines, so they must be doing something right. I suspect they do have plans and backups, which they exercise in a similar way to other airlines, large companies and organisations that provide services to the public. The bit the pundits have not really thought about is the criticality of the IT systems and how British Airways simply cannot operate without their systems.
I have been scouring the internet for any information on why the outage happened, but the only information released so far is that it was due to a power surge. The East Coast Amazon data centre outage a couple of months ago should have taught us that however many backup systems an organisation has, their IT can still fail. In the business continuity profession, we should understand this and should accept that these things happen; we just have to deal with the consequences.
For British Airways, if they lose their IT, there is no manual workaround. You cannot print off who is meant to be on each plane, as there are no systems up and working to get the information from. There is only one strategy available to you, which is cancelling all flights until your systems are up again. If you have passengers in transit, they are stranded until you are operating again.
What makes this situation worse is that access to the booking systems (either through their call centre or on the web) doesn’t work, which means even if a passenger can get through to the call centre, the employees can’t do anything. Your staff in the terminal cannot give passengers any information, because there is none to be had. The only thing you can proactively do is tell passengers not to come to the airport, which at least lets them be angry at home, rather than in public view.
Once your systems are up and running, there is the issue of planes being in the wrong place, luggage to send on and a backlog of passengers to clear. When you run your operations at near capacity, the disruption has a long tail of knock-on effects.
Yes, I am sure the outage should never have happened and yes, I am sure that there could have been better information at the terminal and better management of passengers, but in defence of the airline they did quite a lot right. They had the CEO, Alex Cruz, apologising and speaking directly to customers through a series of YouTube videos, in which I noticed passengers in the terminal had yoga mats to sleep on.
For British Airways, the media circus and internet outrage will move on. A few weeks ago it was United Airlines dragging the passenger off the plane; this week it's British Airways; next week there will be something new.
I travel a lot and I have a short memory. I can’t count the number of times I have said I am never going to fly Ryanair, British Airways or Flybe again, travel on Virgin, Southern or Scotrail, or drive on the A14, A82 or M25. If I followed through with my threats, I would be walking everywhere, and having got soaked a couple of times walking, this form of transport is not available to me either! Memories fade very quickly and people will continue to fly British Airways, whether it is because they have the right route at the right time, they have lots of air miles to use, or they were not affected by the incident.
The lesson for us business continuity people is that however ‘bomb proof’ our IT people say the organisation’s systems are and however many backup systems you have, they can still fail. In your plans, you have to think about how you would handle an incident where all your IT systems go down. For many organisations, there is no manual workaround and we have to deal with the consequences of the incident until the systems are back up again.