How the strongest DDoS attack in the Czech Republic took place

Since the early morning hours of Monday 05.04.2021, our website and our infrastructure have been the target of very strong DDoS attacks, so strong that this is probably the strongest DDoS attack in the Czech Republic to date. We have prepared the technical background and charts for this attack, simply to show you what can happen on the Czech Internet.

First, a statement on the attack, or: don’t worry, we’ve got it under control

The vast majority of our customers were not significantly affected by the attack. Some services were slow, and a small share of them had a few outages lasting a few minutes. The impact was greater for a single-digit percentage of services (unavailability of tens of minutes or hours, but only from some locations). The vast majority of services had no problem at all.

The attack was noticeable mainly because our website, client administration and our status page were down for short periods of time. That was mainly the situation on Monday 5. 4. 2021 and partly also on Tuesday 6. 4. 2021. After that we made some adjustments and improvements to the infrastructure, and by Wednesday 7. 4. 2021 such strong attacks were causing only minor slowdowns on some of our websites.

Some monitoring services reported an outage to customers, but in reality there was no outage. We were just filtering various types of traffic (including ICMP packets, etc.). We had to block some networks and autonomous systems completely, so we could not be reached from them at all.

This was the first attack of this strength and this kind. We have not encountered anything like it before. The attack was extremely powerful (hundreds of Gbps) and long (basically 72 hours and still partially ongoing). It took us a while to adapt to the new situation. Hopefully, further attacks like this will become routine for us. We automated what we could, and we will keep automating what we can.

The attackers attacked our routers first. When we fought that off, they started attacking our company websites. When we fought that off too, they started attacking customer services (web hosting and some virtual servers). The targets kept changing.
By the way, the strongest attack measured against a single service (one of our websites) was over 160 Gbps. The attacks of course overlapped and combined in various ways.

What is a DDoS attack

A DDoS attack is a coordinated effort by a large number of compromised or vulnerable Zombie computers (PCs, servers, mobile phones, IoT devices, etc.) to overload a network element by overwhelming it with requests. DDoS attacks are carried out by a so-called Botnet, which is a coordinated network of these infected Zombie computers controlled via so-called Command and Control (C&C) Servers.

The more Zombie machines there are in the Botnet, the more powerful an attack the attacker is able to prepare and execute. The largest Botnets can contain millions of devices. Most people don’t even know that their computer is part of a Botnet and is waiting for orders from the C&C server.

DDoS attacks usually try to overload the target with brute force. Two methods are used: clogging the connectivity with a large volume of traffic, and overloading network elements with a large number of packets.

In the first case, the idea is to direct more traffic to the target than the connectivity of the target can handle. If you manage to fully load it, regular users can no longer reach the target and it seems to be slow or offline.

The second type of attack attempts to overload an element in the target network with a large number of packets. Either its processor can’t keep up or it runs out of memory.
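To see why packet count matters separately from bandwidth, here is a minimal sketch of how many Ethernet frames fit on a link per second. It assumes standard Ethernet framing (each frame costs an extra 20 bytes on the wire for the preamble and inter-frame gap); the function name and the chosen frame sizes are only illustrative.

```python
# Each Ethernet frame occupies 20 extra bytes on the wire:
# 8 bytes of preamble/start delimiter + 12 bytes of inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def max_packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for a link speed and frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


# Compare minimum-size, medium and full-size frames on a 100 Gbps link.
for size in (64, 512, 1518):
    mpps = max_packets_per_second(100e9, size) / 1e6
    print(f"{size:>5} B frames on 100 Gbps: {mpps:7.2f} Mpps")
```

With minimum-size 64-byte frames a single 100 Gbps link can carry roughly 148.8 million packets per second, while full-size frames fill the same link with under 10 million. An attacker sending small packets therefore stresses the per-packet processing of routers and filters long before the link itself is full.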

In both cases, it is important that the whole network infrastructure is dimensioned to match the full connectivity. If you have 100 Gbps connectivity and a 20 Gbps border router, a 30 Gbps attack will take it down. The same applies to packet rates.
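The bottleneck rule can be written down in a few lines. This is only an illustration: the element names and function names are invented for the example, and the capacities are the ones from the paragraph above, not a description of a real network.

```python
def path_capacity_gbps(element_capacities_gbps):
    """A path can only carry as much as its weakest element."""
    return min(element_capacities_gbps)


def survives(attack_gbps, element_capacities_gbps):
    """True if the attack fits within the weakest element on the path."""
    return attack_gbps <= path_capacity_gbps(element_capacities_gbps)


# Hypothetical path: a 100 Gbps uplink behind a 20 Gbps border router.
path = {"uplink": 100, "border_router": 20}

print("usable capacity:", path_capacity_gbps(path.values()), "Gbps")
print("survives 30 Gbps attack:", survives(30, path.values()))
```

The same check works for packet rates if the capacities are given in packets per second instead of Gbps.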

First, some information about our network infrastructure

At the end of 2017, we were one of the first commercial hosts to connect a 100 Gbps route. This was a significant event that allowed us to much better analyze and filter very strong DDoS attacks.

Connection of the first 100 Gbps route in Prague in December 2017. The picture still shows temporary wiring, which is why there are different cables… 🙂

Gradually we upgraded all 3 routes to 100 Gbps. We currently have:

Route 1 – 100 Gbps – DC1 WEDOS ⟶ Tábor ⟶ Praha SITEL (CeColo)
Route 2 – 100 Gbps – DC1 WEDOS ⟶ Písek ⟶ Praha SITEL (CeColo)
Route 3 – 100 Gbps – DC1 – DC2 WEDOS ⟶ Jihlava ⟶ ČDT (U2) ⟶ TTC

And then two more 10 Gbps backup routes.

The main routes are Route 1 (CETIN – Tábor) and Route 2 (CETIN – Písek). They lead from the SITEL datacenter over CETIN fiber to us in Hluboká and form the so-called backbone network (200 Gbps). In Prague, at SITEL (CeColo), we have connections of 100 Gbps to Cogent, 100 Gbps to Telia, 100 Gbps to Kaora and 10 Gbps to ČDT.

As a backup we have Route 3 (ČDT – Jihlava), and there in the U2 datacenter (ČDT) we have a 100 Gbps connection to Kaora and 10 Gbps to ČDT.

There are 100 Gbps Arista smart switches all along the route, which we think are some of the best machines in the world. We are so happy with them that we decided to start using them for internal infrastructure as well.

The Arista 7280QR-C36 smart switch, which we use as edge “routers”, can handle up to 4.32 Tbps of traffic or 1.44 billion packets per second.

Arista DCS-7050QX-32S. Each has 32 ports for 40 Gbps. The switch capacity is 2.56 Tbps and 1.44 billion packets per second, all with a latency of 550 ns.

Arista DCS-7050QX-32S, rear side and accessories. In case of problems it also allows hot-swapping of the power supplies.

How the attack on WEDOS was carried out

The attack began at 3:25 a.m. on Monday, April 5, 2021, and continued in various forms and strength for the following days.

The graphs are not complete: because of the capacity overload, some SNMP statistics are not available, as they have been overwritten with more recent data.

Moreover, the charts show averages over different intervals of several minutes (3 to 15), which distorts the picture. We always filtered the attacks after about a second, so the traffic dropped again and the multi-minute average came out well below the actual peak.
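A small sketch shows how strongly a multi-minute average can hide a short peak. All numbers here are assumptions chosen for illustration (a 10 Gbps baseline and a 2-second burst that saturates a 100 Gbps link), not measured values from the attack.

```python
def window_average(samples_gbps):
    """Plain arithmetic mean of per-second samples in one graph window."""
    return sum(samples_gbps) / len(samples_gbps)


WINDOW_SECONDS = 180   # one 3-minute point on the graph
BASELINE_GBPS = 10.0   # assumed normal traffic
BURST_GBPS = 100.0     # link fully saturated
BURST_SECONDS = 2      # burst is filtered after a second or two

# Per-second samples: baseline everywhere, burst at the start of the window.
samples = [BASELINE_GBPS] * WINDOW_SECONDS
for second in range(BURST_SECONDS):
    samples[second] = BURST_GBPS

print(f"peak:    {max(samples):.1f} Gbps")
print(f"average: {window_average(samples):.1f} Gbps")
```

In this made-up case the link is completely full for two seconds, yet the 3-minute graph point reads only 11 Gbps. The longer the averaging window, the smaller the visible trace of a short burst.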

The following graphs show the traffic and packets intercepted by our DDoS protection during each day.

The traffic graphs show the attacks on each route. The strongest one was on Monday, and we describe it in more detail. On Tuesday we adjusted the routing and brought the Telia backup route into service.

Individual routes (05.04 – 07.04)

Cogent

Kaora

Telia

The attacks from Monday to Tuesday

The attack only got really interesting at 12:00 on Monday 5. 4. 2021, when the attacker started to step things up. At 1:00 the next day he then tried a very strong packet-based attack. There were other attacks throughout, but we focus only on this time window.

In the following graph you can see the attacks as they came in on the Cogent and Kaora routes (we only fully engaged Telia on Tuesday afternoon).

The first graph shows the raw power in Gbps.

DDoS attack 05.04.2021 – 06.04.2021 on route 1 and route 2, brute force in Gbps

The second graph shows the attempts to take down individual elements of our network infrastructure through sheer packet count.

DDoS attack 05.04.2021 – 06.04.2021 on route 1 and route 2, brute force in packets

As for the attack itself, it was in fact several hundred different attacks, varying in form, strength, and targets. As we gradually reacted to them and added new filters and rules to our protections, the attacker tried to adapt. Overall, about 7 minutes out of the whole day were critical, when we saw a drop in legitimate traffic on one router.

As for the targets, the attacker changed them continuously depending on what he found and on what he hoped was less well protected, or where he hoped we had under-dimensioned the network infrastructure. However, he did not do much damage. It was more a desperate switching between IP addresses and a few domains.

What he did manage to bring down was our status page. We have kept it outside our own infrastructure and protections from the beginning so that we can communicate with customers if there are problems. We have now verified that even an external solution is not 100% reliable.

The attacks came from all over the world

On Monday evening, during the attack, we held an online conference with 70 people via meet.wedos.com. When a strong evening attack came, the boss just gave us a heads-up, and every now and then you could see him quickly clicking through the stats and monitoring. On the conference call itself, however, the attack was not noticeable, even though we had people from abroad there.

Record-breaking brute force attack – attempt to clog connectivity (Gbps)

Let’s now split the two routes and see what the attacker “managed”. Keep in mind that the graph shows minute averages, so in reality 100 Gbps was reached.

The Cogent connection was hitting its maximum. Over the whole period we recorded a total of tens of seconds when it was clogged and traffic had to go through route 2. During that time, some packets that do not require acknowledgement (such as UDP) may actually have been lost. Websites, however, use TCP, so if a packet is lost (no acknowledgement from the other side), it is sent again and should go through the other route.

At the highest peaks we cannot tell how strong the attack on this route was, because it exceeded 100 Gbps, which is the most we are able to measure.

DDoS attack 5. 4. 2021 – 6. 4. 2021 on route 1 in Gbps

The second route did much better. Not “so much” was going through it and it was not overwhelmed. So we can say that our 2x 100 Gbps capacity was sufficient for this attack.

DDoS attack 5. 4. 2021 – 6. 4. 2021 on route 2, brute force in Gbps

This was the strongest part of Monday’s brute force attack with an attempt to clog our connectivity. Minute averages are calculated and recorded once every 3 minutes.

The strongest part of Monday’s brute force DDoS attack.

Graphs of the individual routes are attached.

Brute force attack – attempt to overload network elements (packets)

When the attacker was unable to clog our connectivity, he decided to try to overwhelm our network elements with packets. This kind of attack is very unpleasant because it can take many forms. Here we are really glad that we invested in quality switches and routers that far exceed what we need for normal operation.

In the end, this attack did not cause any major damage, but 77.1 million packets per second on a single machine sounds really scary.

DDoS attack 5. 4. 2021 – 6. 4. 2021 on route 1, brute force in packets

The second route also had its fun, but the attack on/over it was not as strong.

Attacks from Tuesday to Wednesday

In the afternoon we made a number of adjustments to the routing and improved the filtering. The main result was the separation of CZ/SK traffic, which is what most of our customers need. Connection 3 (Telia) is now also fully in use.

It worked well. Only a few Czech IP addresses ended up on the blacklist in the morning, and some of the attacks, in the range of tens of Gbps, came through NIX.

A few tens of Gbps also came in via NIX.

So there are a lot of compromised/vulnerable devices in the networks of Czech ISPs, and everyone should be prepared for the fact that they can generate an attack of tens of Gbps. Broader IP range blocking or blackholing cannot be used here. We are glad to have our “washing machines”, which can clean such traffic.

We are well accessible from abroad, but some monitoring services may report unavailability because we have stricter filters (you can use our WEDOS OnLine for now). In general, during peak attack times, services may experience slowdowns or very brief outages.

On Wednesday, however, something happened that we (didn’t) expect. For the first time, the attackers tried to clog all 3 routes, and for a few seconds they actually succeeded. So for those few seconds we had 3x 100 Gbps congested. In the end, however, it came to “only” 142.3 Gbps in minute averages (we record 1x every 3 minutes), and overall the average was around 100 million packets per second. The peak was a clogged 300 Gbps and over 200 million packets per second. We always filtered after about a second and gradually resolved the situation.

Where did we get the numbers, you ask? The explanation is simple. The routers show these numbers for each interface. They keep accurate stats (maximums and averages over the last time period).
We also collect various NetFlow and SNMP data that give a similar picture.
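As an illustration of how a Gbps figure is typically derived from SNMP data, here is a minimal sketch. It assumes the usual approach of polling a cumulative 64-bit octet counter (such as `ifHCInOctets` from the standard IF-MIB) and dividing the difference by the polling interval. The function name and the counter values are made up for the example; this is not a description of our actual monitoring.

```python
COUNTER64_MODULUS = 2 ** 64  # 64-bit SNMP counters wrap around at this value


def rate_bps(prev_octets: int, curr_octets: int, interval_seconds: float) -> float:
    """Average bit rate between two readings of a cumulative octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The modulo keeps the delta correct even if the counter wrapped once.
    delta_octets = (curr_octets - prev_octets) % COUNTER64_MODULUS
    return delta_octets * 8 / interval_seconds


# Two hypothetical polls of the same interface, taken 180 seconds apart.
previous_reading = 1_000_000_000_000
current_reading = 4_696_750_000_000

gbps = rate_bps(previous_reading, current_reading, 180) / 1e9
print(f"average over the interval: {gbps:.1f} Gbps")
```

Because the counter only says how many bytes passed between two polls, the result is always an average over the polling interval. This is the reason a burst of a few seconds cannot be read back from a 3-minute data point.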

Our infrastructure has held up. However, such massive traffic also overwhelms the routes of our suppliers and of some other ISPs along the way. Attacks like this can therefore only be sustained for a very short time.

The attackers were no longer able to bring down our sites; we only received occasional warnings that they were slower. Instead, they tried literally everything possible. For example, they picked random domains that they thought were hosted with us and directed attacks at the IP addresses found in DNS.

We can always get to the management and control of everything because we have a physically separate internal LAN.
From this internal LAN we go to the public IP addresses of the servers through the internal firewall and then further through our routers, because the gateway of this internal LAN is in a different IP address range. So we go from our PCs to public IP addresses of websites and servers over a congested part of the network.

What we plan to improve

We knew that at some point attacks over 100 Gbps would come, but we honestly didn’t expect it to come so soon and straight at us 🙂

We are in a completely different position than in 2014, when the first strong attacks came at us and we had no real way to defend ourselves. What we lacked was experience, know-how and hardware. Today we have all that, plus 3x 100 Gbps to boot 🙂

We won’t pretend that these attacks did not trouble us at all. Some of us stayed awake waiting to see what else the attackers could come up with, given such an arsenal. If you get six hours of sleep in three nights, you’re probably tired…

We also found a few “weaknesses” that such a strong attack could exploit. For example, we switched some IP addresses and made adjustments in DNS at night. We had to act quickly and a few mistakes were made. When you are rewriting hundreds of IP addresses and calculating ranges on a few hours of sleep, sometimes you miss something.

For example, the attackers (newly) attacked the IP addresses of services according to their DNS names, and we needed to know which ones. So we rewrote the IP addresses in DNS for the services and gradually narrowed the group down until we figured out exactly what they were attacking. Unfortunately, this goes very slowly. You have a huge group of domains and need to know which one is being attacked, so you break it down into smaller groups and wait for the DNS change to take effect to see which group the attack follows. Then you divide that group into smaller ones and wait again, and so on, step by step, until you get to a specific name. Since you are always waiting for DNS changes, the operation takes many, many hours. But in the end it succeeded.
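The narrowing-down procedure is essentially a bisection over the set of domains. The sketch below is a simplified model under stated assumptions: exactly one domain is being targeted, the group is split in half each round, and a hypothetical callback `group_is_attacked` stands in for "move this group to separate IP addresses in DNS, wait for the change to take effect, and check whether the attack followed". The domain names are invented.

```python
from typing import Callable, Sequence


def find_attacked_domain(
    domains: Sequence[str],
    group_is_attacked: Callable[[Sequence[str]], bool],
) -> tuple[str, int]:
    """Halve the candidate group until a single domain remains.

    Returns the domain and the number of rounds; in practice every round
    costs one wait for the DNS change to take effect.
    """
    candidates = list(domains)
    rounds = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        rounds += 1
        # Keep whichever half the attack follows.
        candidates = first if group_is_attacked(first) else second
    return candidates[0], rounds


# Simulated example: 1000 hypothetical domains, one of them targeted.
domains = [f"customer{i}.example" for i in range(1000)]
target = "customer637.example"

found, rounds = find_attacked_domain(domains, lambda group: target in group)
print(f"found {found} after {rounds} rounds")
```

Halving keeps the number of rounds at about the base-2 logarithm of the group size, so even a thousand domains need at most ten rounds. The hours come from the waiting in each round, not from the number of domains.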

Of course, stopping such an attack is different than filtering it. We have learned to filter attacks of tens of Gbps so that the customer who is the target of such an attack cannot tell the difference. This helped us a lot on Tuesday when more attacks from the Czech NIX started to appear. Yes, the attacks came from the Czech Republic as well.

On Monday, we analyzed the data to see whether the attacks came from just one continent, which we would then block. Unfortunately not. The IP addresses were from all over the world, including all the countries around the Czech Republic and the Czech Republic itself. And you don’t just block that… We’ve probably been attacked by every modern refrigerator.

For these reasons it was not entirely easy to use so-called selective blackholing, where we stop announcing some of our IP addresses to certain (geographic) locations.

However, filtering will not be enough for such strong attacks. That is where we need to come up with a different solution at a global level. We plan to do this with WEDOS AnyCast. The priority of its development and rapid deployment has increased significantly.

Already in January we decided to replace the machines that filter unwanted traffic with much more powerful ones. We have the machines and we want to deploy them now. Four new ones are planned, and each should be able to filter 40-120 Gbps (depending on the type of attack) and up to 50 million packets per second. We’ll use the old ones as probes so that we can analyze the attacks more accurately. We have some ideas that require a lot of computing power.

Hmm we thought 3x 100 Gbps would be enough. I guess we’ll have to consider the possibilities of an increase 🙂

Conclusion

We were able to measure a DDoS attack with a strength of over 300 Gbps (164.3 Gbps in minute averages – the number is recorded once every 3 minutes) and hundreds of millions of packets per second. We assume that the attack was significantly stronger, but because of the averaging over several minutes and the effective saturation of the routes, we cannot measure this accurately. On Wednesday there were attacks lasting a few seconds that clogged all three routes, which means an attack of over 300 Gbps.

The official figure we were able to measure as an average is 164.3 Gbps. The record peak, however, is 300 Gbps.

The only question that remains is how strong the attack really was, and where the threshold lies beyond which we could no longer handle it.

DDoS attack 5. 4. 2021. The strongest part, with a measured strength of 164.3 Gbps.

In terms of packets and attempts to take down individual elements of our network infrastructure, the most we saw was 98.1 million packets per second. Again, these are minute averages: 77.1 million packets per second went through one route and 21 million through the other. The short-term numbers (without averaging) were above 200 million packets per second.

DDoS attack 5. 4. 2021. The strongest part, with a measured 98.1 million packets per second.

Thank you all for your support during a very challenging few days. We have tried to keep the impact on clients to a minimum and we have succeeded on the whole. For example, the virtual and dedicated servers had no problems, with a few exceptions where they slowed down for a few tens of seconds.

Thank you all for your understanding. From the beginning, we have reported on our status page, which we have been using for a similar purpose for some time.

We are planning several other improvements. We will keep you informed.

Long attacks deserved a long article 🙂

Addition on 9. 4. 2021:

As a precaution against DDoS attacks, we have launched anycast DNS in Asia, Europe and America and will soon add more locations. So far, one of our DNS servers is running there.
We launched 2 new filters, each able to filter 80 Gbps and about 50 million packets per second. We’ll add another pair after the weekend, so our filtering capacity will reach 320 Gbps and 200 million packets per second.
We have prepared a backup URL for customer administration, which is on a completely different domain and is served via anycast.