Today's Outage Post Mortem
On March 3rd, 2013 at 09:47 UTC, CloudFlare experienced a system-wide failure of its edge routers, causing an outage that affected all of its services including DNS and web proxies. The cause was traced to a bad rule in the Flowspec protocol, which led to routers consuming all their RAM and crashing. The issue was resolved within 30 minutes by removing the bad rule and performing hard reboots on unresponsive routers. CloudFlare is working with Juniper to investigate the issue further and improve its network protocol controls.
Company
Cloudflare
Date published
March 3, 2013
Author(s)
Matthew Prince
Word count
1105
Hacker News points
None found.
Language
English