Processing CDN logs exactly-once with Kafka transactions

What's this blog post about?

Mux Video is a platform for building video-centric applications with global audiences. It partners with Content Delivery Network (CDN) providers such as Fastly, Cloudflare, and Highwinds to deliver video, thumbnails, subtitles, and other content from its origin more efficiently, which reduces costs and improves the end-user experience. To monitor performance and invoice customers correctly, Mux Video processes request-level logs retrieved from the CDN providers through an Apache Kafka pipeline that provides Exactly-Once Semantics (EOS). The system was re-architected to scale better by splitting log file ingestion and processing into two roles: a Leader, which determines which log files need ingesting, and Workers, which download, parse, and write each individual log to Kafka. This design lets the Workers use Kafka transactions, so the records written for a log file become visible only when the corresponding offsets are committed to Kafka, which keeps the data consistent and the billing accurate.
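The all-or-nothing behavior described above can be illustrated with a toy, in-memory model. This is a minimal sketch and not the Kafka client API or Mux's implementation: `ToyLog`, `ToyTransaction`, `process_file`, the `fail_after` parameter, and the file name are hypothetical names introduced for illustration, and a set of finished file names stands in for the committed offsets.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Hypothetical in-memory stand-in for an output topic plus committed progress."""
    records: list = field(default_factory=list)   # records visible to readers
    done_files: set = field(default_factory=set)  # stands in for committed offsets


class ToyTransaction:
    """Buffers records and a progress marker, then applies both or neither."""

    def __init__(self, log):
        self.log = log
        self.pending_records = []
        self.pending_file = None

    def write(self, record):
        # Buffered only; nothing is visible in the log yet.
        self.pending_records.append(record)

    def mark_done(self, file_name):
        self.pending_file = file_name

    def commit(self):
        # Records and the progress marker become visible in one step.
        self.log.records.extend(self.pending_records)
        if self.pending_file is not None:
            self.log.done_files.add(self.pending_file)


def process_file(log, file_name, lines, fail_after=None):
    """Worker step: parse every line of one log file inside one transaction."""
    if file_name in log.done_files:
        return  # already committed, so a redelivery adds no duplicates
    txn = ToyTransaction(log)
    for i, line in enumerate(lines):
        if fail_after is not None and i >= fail_after:
            # Simulated crash: the uncommitted buffer is simply dropped.
            raise RuntimeError("simulated worker crash")
        txn.write((file_name, line.strip()))
    txn.mark_done(file_name)
    txn.commit()


if __name__ == "__main__":
    log = ToyLog()
    lines = ["GET /a.ts 200", "GET /b.ts 200", "GET /c.ts 404"]
    try:
        process_file(log, "cdn-0001.log", lines, fail_after=2)
    except RuntimeError:
        pass
    process_file(log, "cdn-0001.log", lines)  # retry after the crash
    process_file(log, "cdn-0001.log", lines)  # duplicate delivery is skipped
    print(len(log.records), sorted(log.done_files))
```

In this model a crash partway through a file leaves no partial records behind, and a retry or duplicate delivery of the same file adds each log line once. A real pipeline would presumably rely on a Kafka transactional producer and consumers that read only committed data rather than on this in-memory bookkeeping.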

Company
Mux

Date published
July 14, 2021

Author(s)
Drew Rodman

Word count
2997

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.