Cloud Outage Risk Mitigation

All Our Eggs In One Basket?

The Fastly outage of 8th June 2021 was a wake-up call on the risks of relying on outsourced platforms. Fastly is a cloud web service provider. A bug in their content delivery software led to an outage of under an hour, and the bug fix was deployed within a day. The incident hit global media because the sites knocked off-line included PayPal and The New York Times. It is not the first ‘big ticket’ failure of a cloud service. Amazon Web Services was disrupted on 25th November 2020, which affected devices such as Roomba and Ring. Google services including Gmail and YouTube were disrupted for about 45 minutes on 14th December 2020.

None of these outages lasted for an extended period and their overall impact on business and lifestyle was minimal. Nevertheless, they prove that major service providers can go down not because of criminal attacks but because of vulnerabilities within the systems themselves. These were all major incidents from big players. It is some consolation that there have been relatively few incidents and that these were swiftly overcome.

On the other hand, the consequences of any outage are out of the control of the data user, and there is always the possibility of a longer or more serious incident. An additional worry is that key services are increasingly provided by a small number of big business players. Those players may have the experience and resources to handle increasing demand, but the result is that disparate services rely on single points of failure.

The end user of these services needs to be aware that there could be failures that impact on their own business model. Use of an external data system must be treated in the same way as dealing with any other supplier. The supplier's services need to be audited and the processes in place for safeguarding data identified. Unfortunately, with some of the popular cloud services it is a case of ‘take it or leave it’. The services have a very large number of customers and adapt their service provision based on individual requirements. This means that, to some degree, the customer pays for what they need and can scale up or down as their demand changes. This ability to vary the scale and security of a cloud solution is reflected in the costs and risks of that provision. Outsourcing of provision does not mean outsourcing of responsibility. A supplier audit will not guarantee against failure but will identify where the supplier’s and user’s responsibilities lie.

It should always be assumed that no data is completely secure. In a private network there should be a system of backups and off-site systems. A cloud provider cannot be regarded as an immutable off-site data provider. Even a remote chance of loss of service or data needs to be built into the business continuity plan. Some means of continuing business must be in place if the cloud service is not available. The solution could be as simple as temporary web and social media presences that apologise for and explain the issue, together with some means of carrying on working, even as far as relying on paper records. Most importantly, the business continuity plan needs to include details of what staff need to do during any system outage.
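For readers who want to see what "some means of continuing business" might look like in software, here is a minimal Python sketch of one possible approach: try the cloud service first and fall back to a local copy if it does not respond. The names `fetch_with_fallback`, `cloud_provider` and `local_backup` are hypothetical and used only for illustration. The outage is simulated rather than detected from a real service, so this is a sketch under those assumptions and not a complete continuity solution.

```python
from typing import Callable, Sequence, Tuple


class AllSourcesFailed(Exception):
    """Raised when neither the cloud service nor any fallback responded."""


def fetch_with_fallback(
    sources: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each (name, fetch) pair in order; return (name, data) from the first that works."""
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:  # any failure means "try the next source"
            errors.append(f"{name}: {exc}")
    raise AllSourcesFailed("; ".join(errors))


# Hypothetical sources, for illustration only.
def cloud_provider() -> str:
    # Assumption: we simulate an outage instead of calling a real service.
    raise ConnectionError("service unavailable")


def local_backup() -> str:
    # Assumption: a recent off-site or local copy of the data exists.
    return "customer records from last night's backup"


if __name__ == "__main__":
    used, data = fetch_with_fallback([
        ("cloud", cloud_provider),
        ("local backup", local_backup),
    ])
    print(f"Served from {used}: {data}")
```

The order of the list is the continuity plan in miniature: each entry is the next thing to try when the one before it fails. If every source fails, the error lists what went wrong with each, which is the point at which staff would turn to the manual procedures in the plan.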

Cloud computing services can be reliable and cost-effective, but there will always be some risk. Kindus advocates that users take care when selecting a provider and plan for the event of system failure, no matter how unlikely that might be.
