Tech Made Simple

Mystery Solved: Amazon Web Services Says “Overwhelmed Network Devices” Triggered Its Outage

December 13, 2021


If you’ve been wondering how that major Amazon Web Services (AWS) outage happened, and nervously asking, “Could it happen again?”, you’re not alone. The December 7th outage knocked out a slew of popular services like Venmo, Tinder, Disney Plus, and even Roomba, and also put some Amazon deliveries on hold. Amazon experienced its last major outage around this time last year, when a number of sites and apps went down for hours.

Now, AWS has explained what caused the outage, which downed parts of its own services as well as the third-party websites and online platforms that rely on AWS. In a post on the AWS website, the company says that an automated process caused the outage, which began around 10:30 AM ET in the Northern Virginia (US-EAST-1) region.

“An automated activity to scale capacity of one of the AWS services hosted in the main AWS network triggered an unexpected behavior from a large number of clients inside the internal network,” Amazon’s report says. “This resulted in a large surge of connection activity that overwhelmed the networking devices between the internal network and the main AWS network, resulting in delays for communication between these networks.”

According to the report, the issue even impaired Amazon’s ability to see what exactly was going wrong with the system. It prevented the company’s operations team from using the real-time monitoring system and internal controls that they typically rely on, which explains why the outage took so long to fix. Amazon notes that service didn’t start improving until 4:34 PM ET, and the issue was fully resolved at 5:22 PM ET.

Since Amazon’s Support Contact Center also runs on the AWS network, customers weren’t able to create support cases for seven hours during the outage. Amazon’s Service Health Dashboard, which the platform uses to provide status updates, was also impacted, resulting in Amazon’s delayed acknowledgment of the issue. The company says that it’s working on a way to improve its response to outages, and plans to release a revamped version of the Service Health Dashboard that should help customers receive timely updates if an outage occurs.


Photo Credit: Gil C / Shutterstock.com

Written by: Vipology Staff Writer
