Incident report on memory leak caused by Cloudflare parser bug
On February 23, 2017, John Graham-Cumming of Cloudflare reported a security problem with the company's edge servers that had been discovered by Tavis Ormandy of Google's Project Zero. In some unusual circumstances, the edge servers ran past the end of a buffer and returned memory containing private information such as HTTP cookies, authentication tokens, HTTP POST bodies, and other sensitive data. The bug was serious both because the leaked memory could contain private information and because some of it had been cached by search engines. Cloudflare quickly identified the problem and stopped the leak by disabling three minor Cloudflare features that all used the same HTML parser chain that was causing the leakage. A cross-functional team from software engineering, infosec, and operations formed in San Francisco and London to fully understand the underlying cause and the effect of the memory leakage, and to work with Google and other search engines to remove any cached HTTP responses. The greatest period of impact was between February 13 and February 18, when around 1 in every 3,300,000 HTTP requests through Cloudflare potentially resulted in memory leakage (about 0.00003% of requests). The root cause was a pointer error in the Ragel code used to parse HTML pages on the fly. The problem had been dormant for years until the introduction of cf-html changed the internal feng shui of the buffers passed between NGINX filter modules. After identifying and fixing the issue, Cloudflare worked closely with Google and other search engines to purge any cached HTTP responses containing leaked memory. It also undertook a project to fuzz older software in search of other potential security problems.
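To make the idea of "running past the end of a buffer" concrete, here is a minimal simulation in Python. It is not Cloudflare's Ragel or C code: the names `scan`, `PAGE`, `NEIGHBOUR`, and `MEMORY` are hypothetical, and the single "extra advance on an unfinished tag" stands in for the pointer error under the assumption that the end-of-buffer test used an equality comparison. The sketch only shows why a pointer that steps over the end marker defeats an `==` check while a `>=` check still stops.

```python
# Simulated process memory: the page being parsed, followed directly by
# unrelated data that happens to sit next to it (hypothetical contents).
PAGE = b"<p>hi<"               # ends with an unfinished tag
NEIGHBOUR = b"SECRET-COOKIE"   # data belonging to some other request
MEMORY = PAGE + NEIGHBOUR
END = len(PAGE)                # logical end of the page buffer


def scan(memory: bytes, start: int, end: int, strict: bool) -> bytes:
    """Copy bytes starting at `start`, stopping when the end check fires.

    strict=True  uses `p == end` (misses a pointer that jumped past `end`).
    strict=False uses `p >= end` (stops even if the pointer overshot).
    """
    out = bytearray()
    p = start
    while p < len(memory):  # hard stop so the simulation itself stays safe
        out.append(memory[p])
        # Assumed pointer error: an unfinished tag as the last byte of the
        # buffer makes the parser advance one position too many.
        if memory[p] == ord("<") and p + 1 == end:
            p += 1
        p += 1
        at_end = (p == end) if strict else (p >= end)
        if at_end:
            break
    return bytes(out)


leaky = scan(MEMORY, 0, END, strict=True)
safe = scan(MEMORY, 0, END, strict=False)
print(leaky)  # page bytes followed by part of the neighbouring data
print(safe)   # page bytes only
```

In this toy model the equality check never fires because the pointer goes from one position before the end to one position after it, so the loop keeps copying neighbouring bytes. The real incident involved generated C code and NGINX buffers, which this sketch does not attempt to reproduce.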
Company
Cloudflare
Date published
Feb. 23, 2017
Author(s)
John Graham-Cumming
Word count
3256
Language
English
Hacker News points
115