Using the power of Cloudflare’s global network to detect malicious domains using machine learning

Company

Cloudflare

Date Published

March 15, 2023

Author

Jesse Kipp

Word count

2271

Language

English

Hacker News points

None

URL

blog.cloudflare.com/threat-detection-machine-learning-models

Summary

Cloudflare uses machine learning to detect Domain Generation Algorithm (DGA) domains and DNS tunneling, two techniques that attackers use to evade detection and control, both of which rely on domain names that look like random strings. To identify DGA domains, the company trains a model that extends a pre-trained, transformer-based neural network and achieves over 99% accuracy on test data. For DNS tunneling detection, Cloudflare uses a two-stage model consisting of a gradient-boosted decision tree and a neural network. The first stage makes a quick yes/no decision about whether a domain might be a DNS tunneling domain; the second stage refines that categorization to distinguish legitimate applications from malicious ones.
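
The two-stage flow described in the summary can be illustrated with a minimal Python sketch. It shows only the cascade's control flow, not Cloudflare's models: `stage_one` is a hypothetical entropy threshold standing in for the gradient-boosted decision tree, `stage_two` is a hypothetical suffix allowlist standing in for the neural network, and the threshold, suffixes, and label names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical allowlist of suffixes for applications that tunnel data
# over DNS for legitimate reasons. Not taken from the article.
KNOWN_BENIGN_SUFFIXES = ("benign-app.example",)


def label_entropy(domain: str) -> float:
    """Shannon entropy, in bits per character, of the leftmost DNS label."""
    label = domain.split(".")[0].lower()
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def stage_one(domain: str, threshold: float = 3.5) -> bool:
    """Cheap yes/no filter: might this be a DNS tunneling domain?

    Stand-in for the fast first-stage model; the threshold is an assumption.
    """
    return label_entropy(domain) >= threshold


def stage_two(domain: str) -> str:
    """Refine a first-stage 'yes' into legitimate vs. suspicious.

    Stand-in for the slower second-stage model.
    """
    if domain.lower().endswith(KNOWN_BENIGN_SUFFIXES):
        return "legitimate_tunneling"
    return "suspicious_tunneling"


def classify(domain: str) -> str:
    """Run the cascade: only domains flagged by stage one reach stage two."""
    if not stage_one(domain):
        return "not_tunneling"
    return stage_two(domain)


if __name__ == "__main__":
    for name in (
        "www.example.com",
        "a1b2c3d4e5f6g7h8.benign-app.example",
        "a1b2c3d4e5f6g7h8.unknown.example",
    ):
        print(name, classify(name))
```

The point of the cascade is that the inexpensive first stage discards most traffic, so the more costly second stage runs only on the small share of domains that look like possible tunneling.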