Amazon Web Services resolves massive outage that affected thousands of global sites and apps

Amazon Web Services (AWS) announced late Monday that it had resolved a massive outage that left some of the world’s largest websites offline for most of the day.

More than 1,000 apps and websites, including social media platforms such as Snapchat and the banks Lloyds and Halifax, were affected by the problem, which occurred at Amazon’s cloud computing operations centers in the US.

The outage-monitoring platform Downdetector said users worldwide submitted more than 11 million problem reports during the outage. The problems began around 7 a.m. on Monday and affected a wide range of services, from massively multiplayer online games like Fortnite to language learning apps like Duolingo.

Technology experts warned of the dangers of so many companies relying on a single dominant provider.

“What this episode has highlighted is how interdependent our infrastructure is,” said Professor Alan Woodward of the University of Surrey. “Many online services rely on third parties for their physical infrastructure, and this shows that problems can occur even with the largest third-party providers. Small errors, often caused by humans, can have a wide and significant impact,” he added.

The outage was addressed after several hours of intervention, but according to Mike Chapple, a professor of information technology at the University of Notre Dame, several “cascading” failures can arise after the initial outage: “It’s like when you have a large-scale power outage. Crews start working to get it back up and running. The power can be interrupted several times. It’s possible that Amazon initially only addressed the symptoms and not the cause,” Chapple said.

At around 11 p.m., Amazon confirmed that all AWS services had returned to normal operations, after slowing down parts of the system to address the root problem.