How we made our DNS stack 3x faster
In 2017, Cloudflare announced significant improvements to its DNS infrastructure. The work replaced two core elements of the service: the part of its DNS server that answers authoritative queries, and the data pipeline that distributes customers' changes to DNS records across the globe. The new system was designed to handle millions of zones more efficiently than its predecessor. The new data model groups data by RRSet, so a single request to the KV store retrieves all the data needed to answer a query; this change significantly reduced memory usage and improved performance. The team also switched serialization from JSON and Protocol Buffers to MessagePack, which further improved efficiency. A new data pipeline distributes zone changes globally in seconds rather than minutes or hours. Global distribution is challenging because of the various layers of caches involved, but the new system has been successful so far. The authoritative rrDNS filter was also rewritten from scratch (v2) to better suit the scale and shape of today's DNS traffic. The migration moved zones over gradually while keeping both systems in sync, which allowed quick rollbacks if needed. Overall, these improvements resulted in a 3x performance boost in the code that handles DNS queries, faster global updates to DNS data, a more robust system, and more test coverage.
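The RRSet grouping mentioned above can be illustrated with a minimal sketch. It is not Cloudflare's implementation: the `Record` type, the `name|TYPE` key layout, and the in-memory dictionary standing in for the KV store are all assumptions made for illustration, and serialization (MessagePack in the original system) is left out. The point is only that storing one entry per RRSet lets a query for a given name and type be answered with a single lookup.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """A single DNS record (hypothetical shape, for illustration only)."""
    name: str   # owner name, e.g. "example.com."
    rtype: str  # record type, e.g. "A"
    ttl: int
    rdata: str


def rrset_key(name: str, rtype: str) -> str:
    # Assumed key layout: lowercased owner name plus uppercased record type.
    return f"{name.lower()}|{rtype.upper()}"


def group_by_rrset(records):
    """Build a stand-in KV store with one entry per RRSet."""
    store = defaultdict(list)
    for rec in records:
        store[rrset_key(rec.name, rec.rtype)].append(rec)
    return dict(store)


def answer(store, qname, qtype):
    """Answer a query with a single lookup in the stand-in KV store."""
    return store.get(rrset_key(qname, qtype), [])


records = [
    Record("example.com.", "A", 300, "192.0.2.1"),
    Record("example.com.", "A", 300, "192.0.2.2"),
    Record("example.com.", "MX", 300, "10 mail.example.com."),
]
store = group_by_rrset(records)
```

Here `answer(store, "example.com.", "A")` returns both A records from one key, whereas a record-per-key layout would need one read per record.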
Company
Cloudflare
Date published
April 11, 2017
Author(s)
Tom Arnfeld
Word count
2298
Language
English
Hacker News points
15