
Work Header

Language: English

Stats:
Published: 2024-03-25
Words: 1,783
Chapters: 1/1
Comments: 427
Kudos: 4,809
Bookmarks: 214
Hits: 27,845

The AO3 July/August DDoS Attacks: Behind the Scenes

Summary:

The AO3 July/August 2023 DDoS attacks from the perspective of the OTW Systems Committee.

Work Text:

Introduction

This work provides an overview of the July & August 2023 DDoS attacks against the Archive of Our Own from the perspective of the OTW’s Systems Committee. As such, it may include some technical terms and information; we’ll do our best to explain or link to external resources to provide context as needed. We will focus on the series of events for which we have data & evidence, rather than speculation. All times & dates are in Coordinated Universal Time (UTC) and in 24-hour format unless otherwise stated.

As a reminder, the Systems Committee consists of 8 volunteers (6 at the time of the incident) who donate their free time to maintaining the OTW infrastructure. The events outlined below were fit in around our day jobs, during evenings and late nights.

Background

The OTW infrastructure consists of multiple servers & networking devices with differing roles. In order to understand how the attack affected us, we’ll need to briefly explain a few of these layers.

At the edge of our network, we have dual redundant firewalls. These are primarily responsible for restricting traffic into our network, but they are also used to load balance between our frontend servers.

The frontend servers are responsible for traffic shaping and load balancing of traffic that will be served by the application servers. The frontends also serve some static files, such as images and stylesheets.

The application servers actually generate the pages of the Archive. They talk to numerous support services, such as our database & Elasticsearch, to generate pages for everyone.
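
For illustration, here is a minimal Python sketch of the request path described above. The server names, counts, and simple round-robin balancing are hypothetical, not our actual configuration; the sketch only shows how the layers relate to each other.

# Hypothetical sketch of the layers described above; names, counts and the
# round-robin balancing are illustrative only, not our real configuration.
import itertools

FRONTENDS = ["front01", "front02"]                  # behind the firewalls
APP_SERVERS = ["app01", "app02", "app03", "app04"]  # generate Archive pages

frontend_cycle = itertools.cycle(FRONTENDS)  # firewall layer balances frontends
app_cycle = itertools.cycle(APP_SERVERS)     # frontend layer balances app servers

def handle_request(path):
    """Trace a request through firewall -> frontend -> application server."""
    frontend = next(frontend_cycle)
    # Static assets such as images and stylesheets are served by the frontend.
    if path.startswith("/images/") or path.endswith(".css"):
        return f"{frontend} served {path} directly"
    # Dynamic pages are passed on to an application server, which talks to
    # the database, Elasticsearch and other supporting services.
    return f"{frontend} proxied {path} to {next(app_cycle)}"

print(handle_request("/works/12345"))
print(handle_request("/stylesheets/site.css"))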

July 10th, 2023

The Archive was operating normally. At approximately 11:48 UTC, we began to see increased levels of traffic to the Archive, which rapidly began generating errors and overloading the CPUs on our frontend servers.

Graph showing requests per minute to AO3. A relatively flat line sits at about 80-90k rpm, then spikes to over 200k rpm before dropping to near zero.

The first Systems Committee volunteer responded shortly after 12:00 UTC. The volunteer felt that the traffic was likely malicious, but could not immediately conclude it was an attack, since some initial signals suggested the suspicious traffic could instead have been caused by a browser update or a misbehaving bot.
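
To give a sense of what that triage involves, here is a rough Python sketch that tallies requests per client IP and per user agent from a standard web server access log. The log path and format are assumptions made for the example; this is not our actual tooling.

# Rough triage sketch: count requests per client IP and per user agent.
# The log path and combined-log format are assumed for illustration only.
import re
from collections import Counter

AGENT = re.compile(r'"([^"]*)"\s*$')   # final quoted field: the user agent

ips, agents = Counter(), Counter()
with open("access.log") as log:        # hypothetical access log
    for line in log:
        if not line.strip():
            continue
        ips[line.split()[0]] += 1      # first field: client IP
        match = AGENT.search(line)
        if match:
            agents[match.group(1)] += 1

# A browser update tends to show one new agent string spread across many IPs,
# while an HTTP flood concentrates on a handful of agents, paths or IPs.
print(ips.most_common(5))
print(agents.most_common(5))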

To buy time to investigate further and prepare a fix, the volunteer placed the site in maintenance mode around 12:15 UTC, which stopped requests at the frontend layer. This relieved load on the application servers, but did not reduce the noticeable load on the frontend servers. At 13:52 UTC, the next Systems volunteer checked into our internal chat, and by 14:00 the team had more or less concluded that the outage was due to foul play, specifically an HTTP DDoS attack. Characteristics of the traffic were identified in order to attempt mitigation.

The third Systems team member checked in around 15:21 UTC and began deploying a potential fix provided by the first responding volunteer. Due to excess load on the frontend servers, this was taking much longer than expected, so traffic was stopped at our firewalls, which allowed the deployment to complete in a reasonable time.

Unfortunately, after allowing traffic back through the firewalls, we continued to see high load on our frontend servers. Around 17:16 UTC, a change was deployed to our firewalls to block traffic to the page that was being abused. This stopped the abusive traffic at the edge of our network, which reduced the load on our frontend servers and kept things mostly at bay. The site returned to more or less normal operation for about an hour.

After an hour of uptime, the attackers began targeting different pages on the site. The three team members followed up by blocking requests to those pages, which allowed the site to remain mostly available between 19:05 and 21:42 UTC. Around that time, we began to see traffic spikes of over 1 million requests per minute in our APM (application performance monitoring) tool. For context, normal peak traffic hovers around 150k requests per minute. Since our servers were doing their best to respond to all of these requests, we began to exhaust the capacity of our physical internet connection.
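
As a rough back-of-the-envelope illustration of why the connection saturated (the average response size below is an assumption made for the example, not a figure measured on our systems):

# Back-of-the-envelope estimate; the response size is an assumed value.
requests_per_minute = 1_000_000        # observed order of magnitude at peak
avg_response_bytes = 50 * 1024         # assumed ~50 KB average response
gbits_per_second = requests_per_minute * avg_response_bytes * 8 / 60 / 1e9
print(f"~{gbits_per_second:.1f} Gbit/s outbound")   # roughly 6.8 Gbit/s

Even with a much smaller average response, that volume comfortably exceeds a link in the low single-digit gigabits per second, which is the order of capacity implied by the next day’s figures (1.2 Tb/s being roughly 600 times our bandwidth).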

From 21:42 UTC to about 23:10 UTC, the site was sporadically up and down while the team tried to keep it alive. We were attempting to deploy completely new rate limiting measures on the fly, without much success. The request spikes continued, reaching a peak of 1.5 million requests per minute (including requests answered with “you’re browsing too fast” responses), which is simply what our application servers were able to process. There were undoubtedly more requests that were congested upstream and thus never logged.

Requests per minute graph showing numerous spikes near & over 1 million RPM. One spike reaches 1.5 million RPM.
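
For readers unfamiliar with rate limiting, the sketch below shows the general idea in Python as a per-client token bucket: each client has a small budget of requests that refills over time, and anything beyond it gets a “you’re browsing too fast” style response. This is only an illustration of the concept, not the measures we were actually trying to deploy on our frontends.

# Minimal token-bucket rate limiter; the rates are illustrative only and
# this is not the configuration we actually attempted to deploy.
import time
from collections import defaultdict

RATE = 5.0    # tokens refilled per second, per client (illustrative)
BURST = 20.0  # maximum bucket size (illustrative)

buckets = defaultdict(lambda: {"tokens": BURST, "stamp": time.monotonic()})

def allow(client_ip):
    """Return True if the request may proceed, False for a 429-style reply."""
    bucket = buckets[client_ip]
    now = time.monotonic()
    bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["stamp"]) * RATE)
    bucket["stamp"] = now
    if bucket["tokens"] >= 1.0:
        bucket["tokens"] -= 1.0
        return True
    return False   # caller answers with "you're browsing too fast"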

After 23:10 UTC, the site was more or less down as we were completely flooded with more traffic than we could physically handle.

July 11th, 2023

The team continued trying to mitigate the attack ourselves into July 11th, but each time we attempted to return to service, we were immediately overwhelmed by the traffic.

At 00:21 UTC, our datacenter informed us that the attack had exceeded 1.2 terabits per second, which is around 600 times the bandwidth capacity we had at the time. This caused temporary disruption for the whole datacenter until further upstream filtering was enabled. It was likely the result of a DNS amplification attack or similar, in addition to the HTTP flooding we were receiving, and was unknown to us until this point.
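
For context, a DNS amplification attack works by sending small queries to open DNS resolvers with the victim’s address forged as the source, so the resolvers’ much larger responses are reflected onto the victim. The quick arithmetic below, using only the figures above, shows the scale involved:

# Implied link capacity from the figures reported by our datacenter.
attack_gbps = 1.2 * 1000      # 1.2 Tbit/s expressed in Gbit/s
capacity_multiple = 600       # "around 600 times the bandwidth capacity we had"
print(f"~{attack_gbps / capacity_multiple:.0f} Gbit/s of capacity at the time")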

By 05:00 UTC, the team had spent hours attempting to handle things ourselves, and it was clear we weren’t getting far on our own. At this point, we made the decision to set up Cloudflare to get the site back online, and we worked with our datacenter to make the necessary preparations. Around 09:47 UTC, we started setting ourselves up on the Cloudflare free tier while waiting for the necessary approvals to upgrade, but further changes to the backend were needed. In the meantime, one of our volunteers was able to connect with a Cloudflare employee & fellow user of the site, who referred us to Project Galileo and supported our application internally.

Thanks to these efforts, we were officially approved at 14:04 UTC, only ~2 hours after applying, which granted us access to more advanced tooling. At 15:00 UTC, some traffic began successfully hitting our application servers via Cloudflare. This initially included some attack traffic. We worked with our Cloudflare contact to put in place some rules to further mitigate the abusive traffic, and their system began to recognize and stop more or less all of it. The Archive was once more fully accessible around 15:42 UTC.

Requests per minute to the Archive returned roughly to normal upon successfully implementing Cloudflare.

Traffic graph from the Cloudflare portal showing unmitigated & mitigated requests. Initially the majority of requests are unmitigated, but this later reverses once Cloudflare’s systems kick in & specific rules are applied. Times are US Eastern.

July 11th to August 31st

We continued to receive a series of attacks in this time frame. The majority of these attacks had no impact on the site and were mitigated by Cloudflare. However, there were a couple of notable events.

On August 26th, 2023 at 12:29 UTC, we received a large attack peaking at 10 million requests per second. The attack had no notable impact on the Archive, but was the largest attack we had recorded at the time.

On August 28th, 2023 at 20:49 UTC, we received a notification from Cloudflare that a DDoS attack of 6.95 million requests per second was detected. The numbers on these alerts are frequently lower than the actual peak of the traffic, so this was initially alarming to us.

Internal Cloudflare alert showing a 6.95 million RPS attack in progress.

We later found out that the attack had actually peaked at 65 million requests per second. For context, the largest publicly announced HTTP DDoS attack by Cloudflare at the time was a 71 million request per second attack. Additionally, we received information that the attack originated from the Mirai botnet. However, Cloudflare did its job well and we saw very little, if any, impact.

A screenshot from our, at the time, in progress Cloudflare stats dashboard. The peak reaching slightly over 65 million RPS is visible.

On August 30th, 2023 at about 22:15 UTC, we received a set of attacks that was not initially mitigated well by Cloudflare, which caused some disruption. The attack lasted until approximately 00:10 UTC on the 31st. We believe the disruption in this case was due to some long-standing issues in a piece of legacy software that was part of our stack, which we disabled at the time and later removed.

In the later hours of August 31st, we received another set of attacks, which caused brief problems. Initially the attack was not fully mitigated by Cloudflare, but we were able to put in place some caching rules that helped the situation. Some of Cloudflare’s automatic rules then kicked in, but caused some brief collateral damage: although this wasn’t an issue for long, legitimate users were shown the default Cloudflare block page, which is a little scary. We later replaced this with a custom page that is nicer and more reassuring. :)

The Cloudflare default block page, stating "Sorry, you have been blocked. You are unable to access archiveofourown.org."

Our custom block page, stating more reassuringly that the user's action was temporarily blocked and does not affect account status.

A number of smaller attacks have occurred since then; however, essentially none have had any impact on the Archive or required any major action from us.

Acknowledgements

We thank Cloudflare for the quick turnaround during the initial attack, for providing us with services under Project Galileo, and for continuing to be responsive to our needs. 🧡

We are very grateful to our datacenter for hosting us as long as they have, for their initial support during the attacks, and for their quick response in getting us items needed to enable Cloudflare. 🧡

We thank all of the OTW volunteers who were around to support us during the attacks. We also appreciate everything you do for the org. ❤️

Finally, we thank the users of the Archive for all of the love and support we received during the downtime. We also received a lot of offers to assist us in any way possible from various industry professionals. All of this was incredibly motivating in keeping us going. Thank you. ❤️