
Training a million models per day to save customers of all sizes from DDoS attacks

What's this blog post about?

The post explains why Distributed Denial of Service (DDoS) attacks are hard to detect and presents an anomaly detection pipeline that Cloudflare built to identify unmitigated or partially mitigated attacks. The initial approach, a naive volumetric model, proves ineffective because it assumes traffic volume stays stable over time, which rarely holds in practice. Time series forecasting methods are also considered but judged impractical for various reasons. Cloudflare's solution instead measures traffic along multiple dimensions and identifies the correlations between those variables. Careful analysis turned up a dozen variables that follow a normal distribution, are not correlated with volume, and deviate from that underlying normal distribution during "under attack" events. Principal Component Analysis (PCA) is then used to transform this multidimensional data so that the cloud of normal traffic points becomes roughly spherical, which allows an anomaly score to be defined as the distance from the center of the cloud. The process is highly parallelizable and can be scaled horizontally as needed. Cloudflare currently retrains the models every day but may reduce this frequency in the future because it has observed minimal intraday model drift. The company trains models for a large sample of representative customers, including those on the Free plan, and uses the attacks it identifies for further study and for tuning its existing DDoS systems for all customers.
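The PCA step described above can be illustrated with a minimal sketch. It is not Cloudflare's implementation: it assumes NumPy, a feature matrix with one row per observation, and two hypothetical helper names, fit_whitening and anomaly_score, chosen here for illustration. The idea is to rotate the data onto its principal components and rescale each component to unit variance, so that normal traffic forms a roughly spherical cloud and the score is simply the distance from its center.

```python
import numpy as np


def fit_whitening(X, eps=1e-9):
    """Fit a PCA whitening transform on baseline ("normal") observations.

    X has shape (n_samples, n_features). Returns the feature means and a
    matrix W such that (X - mean) @ W has approximately identity covariance.
    The names and the eps regulariser are illustrative assumptions.
    """
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    # eigh is intended for symmetric matrices such as a covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each principal axis (column) by 1 / sqrt(variance along it).
    scale = 1.0 / np.sqrt(eigvals + eps)
    return mean, eigvecs * scale


def anomaly_score(X, mean, W):
    """Distance from the center of the whitened cloud, one score per row."""
    Z = (np.atleast_2d(X) - mean) @ W
    return np.linalg.norm(Z, axis=1)
```

A model of this kind would be fitted on a window of traffic assumed to be normal, and new observations whose score exceeds a chosen threshold would be flagged for review. Because each model depends only on its own customer's data, many such fits can run independently, which is consistent with the parallelism the post describes.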

Company
Cloudflare

Date published
Oct. 23, 2024

Author(s)
Nick Wood, Manish Arora

Word count
2159

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.