2023-03-08 Incident: A Deep Dive into the Platform-level Impact
On March 8, 2023, Datadog experienced an outage that affected all of its services across multiple regions. The cause was a systemd update on Ubuntu 22.04: the new systemd-networkd behavior flushed IP rules, cutting network connectivity for both host and pod traffic on AWS and Azure, and for host traffic only on Google Cloud. Because the incident hit multiple regions across distinct cloud providers, and each provider required different recovery actions, recovery was delayed.
Company
Datadog
Date published
May 24, 2023
Author(s)
Laurent Bernaille
Word count
3864
Hacker News points
1
Language
English