Squeezing the firehose: getting the most from Kafka compression
To reduce the size of messages flowing through Apache Kafka, we enabled message compression. We started with Snappy, a fast algorithm with a decent compression ratio for many kinds of data, but it was not enough to keep up with our ingress and egress rates. We then switched to Zstandard, a modern algorithm promising both high compression ratio and high throughput, tunable in small increments. For our data, Zstandard at level 6 offered the best trade-off between compression ratio and CPU cost. With it, messages sent through Kafka were up to 4.5x smaller than with no compression at all, which significantly improved our network and storage utilization.
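Producer-side compression in Kafka is a client configuration setting rather than application code. A minimal sketch of the setup the summary describes, in Java-properties form; note these are assumptions about versions, not details from the post: zstd support landed in Apache Kafka 2.1 (KIP-110), and the per-codec level knob `compression.zstd.level` arrived later (KIP-390), so clients contemporary with this post would have needed a patched build to pin a specific level.

```properties
# Enable Zstandard compression on the producer (Kafka 2.1+, KIP-110)
compression.type=zstd
# Pin the level the post settles on; this knob is a later addition (KIP-390),
# and without it the client compresses at zstd's default level
compression.zstd.level=6
```

The same `compression.type` setting also accepts `none`, `gzip`, `snappy`, and `lz4`, which makes it straightforward to compare codecs against the same traffic.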
Company
Cloudflare
Date published
March 5, 2018
Author(s)
Ivan Babrou
Word count
3309
Hacker News points
None found.
Language
English