/plushcap/analysis/assemblyai/8-31-2021-aws-outage-post-mortem

8/31/2021 AWS Outage Post-Mortem

What's this blog post about?

On August 31st, AWS experienced an outage in their us-west-2 region, impacting a single availability zone (usw2-az2). This resulted in increased API error responses and slowdowns in transcription turnaround time for users. The issue was caused by a component within the subsystem responsible for processing network packets becoming impaired, affecting health checks and causing packet loss within the region. During this incident, there were also massive spikes in DynamoDB query times and some SQS queues started to back up. AWS' incident updates mentioned impacts to many key AWS services, including Kinesis. The company has learned from this experience and plans to make use of all four availability zones in the us-west-2 region and move production workloads out of the default VPC to follow best practices.

Company
AssemblyAI

Date published
Sept. 3, 2021

Author(s)
Mitch Anderson

Word count
1338

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.