Cloudflare Dashboard and API Outage on April 15, 2020
On April 15, 2020, between 1531 UTC and 1952 UTC (4 hours 21 minutes), Cloudflare's Dashboard and API were unavailable because multiple redundant fiber connections were accidentally disconnected from one of its two core data centers. The outage was not caused by a DDoS attack or any other external factor. While it lasted, customers could not log into the dashboard, use the API, make configuration changes, purge cache, run automated load balancing health checks, create or maintain Argo Tunnel connections, create or update Cloudflare Workers, transfer domains to Cloudflare Registrar, access logs and analytics, encode videos on Cloudflare Stream, or log information from edge services. The Cloudflare network itself continued to operate normally, and all security services remained functional. The company worked on two fronts at once: restoring connectivity and cutting over to its disaster recovery core data center. Full redundant connectivity was restored at 2031 UTC. To reduce the risk of similar incidents, Cloudflare plans to improve the design, documentation, and processes related to hardware decommissioning and cable management in its data centers.
Company: Cloudflare
Date published: April 16, 2020
Author(s): John Graham-Cumming
Word count: 799
Language: English
Hacker News points: 32