Cloudflare incident on October 30, 2023
A service disruption occurred in Workers KV, a distributed key-value store used by many applications, including first-party Cloudflare products such as Pages, Access, and Zero Trust. A fault in the deployment tool directed production traffic to a version of the service that was not authorized for production access, so requests failed with HTTP 401 errors. The issue affected parts of the Cloudflare dashboard and other services that depend on Workers KV. The incident lasted approximately one hour before being resolved. The team has identified the root cause and is taking several steps to prevent similar incidents: improving the deployment tooling, enhancing the rollback process, adding pre-checks to deployments, hardening progressive deployment scripts, and ensuring compatibility between applications and their environments during deployments. The company apologizes for any inconvenience caused by this incident.
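The remediation steps named above (a pre-check before deployment, a staged rollout, and an automatic rollback) can be illustrated with a minimal sketch. It is not Cloudflare's tooling: the names `Version`, `precheck`, `progressive_deploy`, and the `error_rate_at` probe are all hypothetical, and the stage percentages and error threshold are assumed values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A deployable build (hypothetical model, not Cloudflare's)."""
    name: str
    authorized_envs: frozenset  # environments this build is allowed to serve


def precheck(version, target_env):
    """Pre-check: the build must be authorized for the target environment."""
    return target_env in version.authorized_envs


def progressive_deploy(version, target_env, error_rate_at,
                       steps=(1, 10, 50, 100), max_error_rate=0.01):
    """Shift traffic in stages and return the final traffic percentage.

    error_rate_at(percent) is an assumed probe that returns the observed
    HTTP 401 rate while `percent` of traffic is served by the new version.
    """
    if not precheck(version, target_env):
        return 0  # refuse to deploy; no traffic is moved
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            return 0  # roll back: all traffic returns to the old version
    return 100


staging_build = Version("kv-staging", frozenset({"staging"}))
prod_build = Version("kv-prod", frozenset({"production"}))

# A staging-only build never receives production traffic.
print(progressive_deploy(staging_build, "production", lambda p: 0.0))
# A healthy, authorized build reaches full traffic.
print(progressive_deploy(prod_build, "production", lambda p: 0.0))
```

In this sketch the authorization check runs before any traffic is moved, so a build like the one described in the summary would be rejected before it could return 401 errors to production clients.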
Company
Cloudflare
Date published
Nov. 1, 2023
Author(s)
Matt Silverlock, Kris Evans
Word count
1670
Language
English
Hacker News points
None found.