Post Mortem: The Ugly, the Bad & the Good

What's this blog post about?

On February 24, 2012, at around 7:30 GMT, a DNS update pushed by Cloudflare caused an outage that affected some websites for approximately 30 minutes. The update was part of a new DNS infrastructure designed to make updates faster, but the primary DNS database was accidentally deleted during the process. It took Cloudflare about five minutes to recognize the issue, retrieve the backup, and push it to production. Some users experienced longer downtime, either because their ISP's recursive DNS had cached the bad results or because two data centers did not apply all of the corrected DNS file updates. The company apologized for the incident and has added safeguards to prevent similar occurrences in the future.

Company
Cloudflare

Date published
Feb. 24, 2012

Author(s)
Matthew Prince

Word count
1059

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.