A question I’ve encountered many times in the field of late is: what are the impacts of DDoS attacks on cloud compute environments? The primary benefit of cloud is that it elastically scales to meet variable demand, scaling up instantly and scaling down when demand subsides, in seconds. Layman’s logic might therefore suggest that cloud-based services are immune to the downtime effects of DDoS attacks, with the possibility of a gigantic, unexpected bill as the trade-off.
After exploring the topic with a number of eminent cloud architects and representatives of online organizations with experience running primary services on cloud platforms such as Amazon AWS and Azure, here are the seven interlinked complexities they chose to reflect on:
- Customer experience is key – The simple fact is that the majority of DDoS attacks do not aim to kill a service entirely but rather to significantly impair customer experience. Without DDoS protection that can distinguish legitimate traffic from bad, it is not uncommon for attacks to go unnoticed while blighting customer experience in cloud environments just as they do in traditional physical data centres. Several of the points below expand on the wheres and hows of this key consideration.
- DDoS infrastructure pains differ between traditional data centres and cloud – In a traditional physical data centre, any DDoS payload targeting base infrastructure becomes a potential customer experience and economic hit against capacity assumptions. In cloud environments this pattern changes. For starters, attacks against the underlying infrastructure in front of the customer’s front-end apps are generally dealt with by the cloud service provider. There have been outages where providers have not got things entirely right, such as the 2016 Dyn attack, but such incidents remain exceptions to the rule for now. The upshot of this in-built protection is that customers do not normally feel the burn of poor user experience, or the bill, when DDoS attacks hit shared internet connectivity or APIs. However, when an attack reaches the customer’s own compute domain, the pain begins. User experience degradation is discussed below; focusing for now on economic impacts, these can vary both within a service and between providers. For example, AWS does not bill you for ingress traffic into an Elastic Load Balancer, only for egress traffic, but if the DDoS traffic is being NATed you would be paying $0.050 per GB processed for the inbound traffic. Once a front-end web application starts to take a hit, only the severity of the cost is variable.
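To make the per-GB exposure concrete, here is a minimal back-of-envelope sketch of what NATed flood traffic could cost at the data-processing rate quoted above. The attack size, duration and rate are illustrative assumptions; check your provider’s current pricing before relying on the figure.

```python
def nat_processing_cost(attack_gbps: float, duration_hours: float,
                        rate_per_gb: float = 0.050) -> float:
    """Estimated NAT data-processing charge for a sustained flood.

    attack_gbps    -- sustained attack bandwidth in gigabits per second
    duration_hours -- how long the flood lasts
    rate_per_gb    -- assumed per-GB processing rate (illustrative)
    """
    gb_processed = attack_gbps / 8 * 3600 * duration_hours  # Gbps -> GB total
    return gb_processed * rate_per_gb

# A modest 2 Gbps flood sustained for 24 hours:
cost = nat_processing_cost(2, 24)
print(f"${cost:,.2f}")  # over a thousand dollars of processing charges alone
```

Even before any auto-scaling kicks in, the inbound processing charge alone mounts quickly at sustained attack bandwidths.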
- Lift and shift of a traditional data centre to cloud renders auto-scalability virtually useless for fending off DDoS attacks – If a customer moves their traditional data centre footprint into a cloud environment without transformation, they will inevitably carry most of their previous capacity ceilings into their new cloud home. Under attack, the customer is almost assured of hitting a limitation in licensing, an O/S threshold, a messaging queue or some other interlink between front-end and back-end applications, causing an outage or major service degradation that no amount of horizontal auto-scaling or virtual RAM and CPU could mitigate. In one sense this scaling failure might protect them from the worst of a so-called Economic Denial-of-Service (EDoS) attack, AKA a huge bill. Not something to applaud, of course.
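The ceiling effect can be sketched in a few lines: however many front-end instances auto-scaling adds, effective capacity is capped by the slowest lifted-and-shifted component. All numbers here are hypothetical.

```python
def effective_capacity(front_end_instances: int,
                       per_instance_rps: int,
                       backend_ceiling_rps: int) -> int:
    """Overall requests/sec the service can actually sustain.

    Horizontal scaling multiplies front-end throughput, but a
    lifted-and-shifted ceiling (licence count, message queue,
    DB connection pool) caps the end-to-end figure.
    """
    return min(front_end_instances * per_instance_rps, backend_ceiling_rps)

# Doubling the fleet past the back-end ceiling buys nothing:
print(effective_capacity(10, 500, 4000))  # ceiling already reached
print(effective_capacity(20, 500, 4000))  # same result, twice the bill
```

The second call costs twice as much to run yet serves no more traffic, which is exactly the lift-and-shift failure mode described above.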
- Excess billing prevention is on you – The major cloud providers simply do not provide utilisation caps. It is not only that doing so runs contrary to revenue interests (which it does, of course), but as much that they do not want to be the party that brings a customer’s house down through involvement in an automatic shutdown of service. It is therefore down to end users to define thresholds for dollar-spend alerts and then run countermeasures to limit the economic blast radius of an EDoS attack. For instance, since 2012 Amazon has offered functionality that allows customers to get an aggregate view of their billing across services (S3, EC2, etc.) and set alert thresholds around it. Obviously, it is on you to determine exactly what you do with those alerts.
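A minimal sketch of the customer-side logic that billing alerts leave to you: map month-to-date spend onto tiered countermeasures. The thresholds and actions here are entirely hypothetical; in practice each action would trigger automation or an on-call runbook.

```python
# Tiered spend thresholds (USD) and the countermeasure each one triggers.
# Both the dollar figures and the responses are illustrative assumptions.
THRESHOLDS = [
    (1_000.0, "notify on-call"),
    (5_000.0, "enable aggressive rate limiting"),
    (20_000.0, "fail over to static holding page"),
]

def actions_for_spend(month_to_date_usd: float) -> list:
    """Return every countermeasure whose spend threshold has been crossed."""
    return [action for limit, action in THRESHOLDS
            if month_to_date_usd >= limit]

print(actions_for_spend(6_000.0))
# ['notify on-call', 'enable aggressive rate limiting']
```

The point of the tiering is that the cheap responses fire early, so the drastic ones (and their own customer-experience cost) are reserved for a genuine EDoS scenario.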
- Cloud-architected applications are the road to infinite billing – Customers that have gone through the cloud transformation playbook face the nemesis of infinite cloud scale: infinite cloud billing. Stripping an environment of its traditional scalability limits in favor of cloud-built apps quite simply takes the gloves off a DDoS attack’s ability to hurt economically. Organizations embarking on this transformative cloud journey will have received the very sound advice of cloud migration experts and a multitude of best-practice whitepapers to purchase effective DDoS protection. If they have not taken heed, the clock is ticking until they face down the bill caused by the insatiable demand of an IoT botnet.
- Knowledge is power, metrics matter – Organizations that take their cloud play seriously are among the most likely to leverage utilisation data to better understand customer experience and forecast demand for their products: the so-called data-driven company. As a stark example, a streaming service like Netflix might measure the volume of video commencements, the logic being that a higher-than-normal ratio of commencements could indicate a service issue, with frustrated users attempting to restart content after an initial failure. It is not difficult to see how an unmitigated DDoS attack could completely throw off such metrics and the value they deliver to the company’s bottom line.
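The commencement-ratio heuristic above can be sketched as a simple check. The baseline and tolerance values are invented for illustration, not anything a real streaming service publishes.

```python
def start_ratio_alert(stream_starts: int, unique_viewers: int,
                      baseline_ratio: float = 1.2,
                      tolerance: float = 0.5) -> bool:
    """Flag when starts-per-viewer drifts well above baseline.

    An elevated ratio suggests users retrying playback after failures,
    but an unmitigated DDoS attack distorts both counts, making the
    signal unreliable (the point made in the article).
    """
    if unique_viewers == 0:
        return False
    observed_ratio = stream_starts / unique_viewers
    return observed_ratio > baseline_ratio + tolerance

print(start_ratio_alert(2000, 1000))  # ratio 2.0 -> alert
print(start_ratio_alert(1200, 1000))  # ratio 1.2 -> normal
```

A flood of bot-generated "starts" would trip this alert with no real user frustration behind it, which is exactly how DDoS noise destroys the metric's value.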
- Auto-scaling combined with pulse attacks is the primary source of both economic and experience pain – One of the key trends Radware sees is pulse attacks, in which organizations endure multi-vector attacks oscillating between volumetric and application-layer vectors over short periods, leaving no room for infrastructure to recover or for manual countermeasures to be formed. Think of how such burst attacks play out against auto-scaling triggers, even if the organization has DDoS technology: that technology would need to detect and mitigate attacks within seconds to prevent the auto-scaling triggers from firing. Fleets of virtual servers come online automatically, remain up for a period of time, then shut down again when the load disappears. The unnecessary bills from mass over-provisioning certainly add up, but more seriously, consider the implications for customer experience and the management overhead of supervising services coming online and going offline at high frequency. Most organizations expect service to go into overdrive just once or twice a day to meet variable demand; what if that were happening 500+ times a day? The organizations I’ve heard about that encountered this scenario were proud of how they fought off 48-hour attacks through the determined efforts of DevOps staff, repeatedly putting plugs in the dam to limit the billing blast radius. But in campaigns that go on much longer, the distraction for the ICT organization is irrefutably problematic without highly effective DDoS mitigation technology.
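The over-provisioning arithmetic behind a pulse campaign is easy to sketch. Every figure below (pulse count, fleet size, billed duration, hourly rate) is a hypothetical assumption; plug in your own auto-scaling policy and instance pricing.

```python
def pulse_overprovision_cost(pulses_per_day: int,
                             instances_per_pulse: int,
                             hours_billed_per_pulse: float,
                             hourly_rate: float) -> float:
    """Daily cost of fleets spun up by attack pulses, not real demand.

    Each pulse trips the auto-scaling trigger, the extra instances run
    for at least their minimum billed duration, then terminate.
    """
    return (pulses_per_day * instances_per_pulse
            * hours_billed_per_pulse * hourly_rate)

# 500 pulses/day, 20 extra instances per pulse, each billed for
# 15 minutes (0.25 h) at an assumed $0.10/hour:
daily = pulse_overprovision_cost(500, 20, 0.25, 0.10)
print(f"${daily:,.2f}/day")
```

Even at these deliberately modest rates the waste compounds daily, and the figure excludes the load balancer, NAT and data-transfer charges the same pulses generate, plus the operational cost of staff supervising 500 scale events a day.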
In summary, cloud computing provides no assurance that inherent scalability can mitigate DDoS attacks. The pain could be service outage, user experience degradation or unnecessary recurring spend, and the only viable mitigation is DDoS technology automated enough to cope.