
Kafka Connect: The Magic Behind Mux Data Realtime Exports

What's this blog post about?

Mux has seen growing demand for access to the raw and enriched video QoS data processed by Mux Data. To meet it, the company has made real-time metric and event-stream exports available to its Enterprise customers. These exports support applications such as monitoring CDN performance, identifying viral videos, and joining Mux metrics with a customer's internal metrics for faster troubleshooting. Mux uses Apache Kafka as its internal streaming platform and chose the open-source Kafka Connect to manage real-time event-stream exports to external streaming services. In production, Kafka Connect has proven reliable, scalable, and easy to extend and customize. Getting started with Kafka Connect is a three-step process: build a Kafka Connect Docker image, start the Kafka Connect cluster, and run connectors. Lessons learned from running Kafka Connect include setting the 'CONNECT_REST_ADVERTISED_HOST_NAME' environment variable, configuring the 'errors.retry.timeout' value on sinks, and being cautious when implementing autoscaling for Kafka consumers. Monitoring is crucial for managing Kafka Connect resources effectively. Mux uses Grafana to visualize Prometheus metrics including CPU utilization, memory utilization, topic consumer lag, records consumed by connectors, sink batch average put-time, task errors, and task failures. Alerts are set on key metrics such as CPU utilization, memory utilization, and Kafka consumer fetch-record lag so that problems are caught early.
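As a rough illustration of the "run connectors" step and the 'errors.retry.timeout' lesson, the sketch below builds a sink connector definition and submits it to the Kafka Connect REST API with a POST to /connectors. It is not Mux's configuration: the connector name, topic, output file, retry values, and localhost URL are all assumptions made for the example, and the FileStreamSinkConnector that ships with Apache Kafka stands in for whatever sink is actually used. Kafka Connect's default for 'errors.retry.timeout' is 0, meaning failed operations are not retried, so the sketch sets it explicitly.

```python
import json
import urllib.request


def build_sink_config(name, topics, retry_timeout_ms=600_000):
    """Return the JSON body for POST /connectors on a Kafka Connect worker.

    All values are illustrative assumptions, not Mux's settings.
    """
    return {
        "name": name,
        "config": {
            # Stand-in sink bundled with Apache Kafka; a real export would
            # use a connector for the target streaming service.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),
            "file": "/tmp/example-export.txt",  # hypothetical output path
            # Keep retrying failed operations for up to this many milliseconds
            # (the default of 0 disables retries).
            "errors.retry.timeout": str(retry_timeout_ms),
            # Upper bound on the backoff between consecutive retries.
            "errors.retry.delay.max.ms": "60000",
        },
    }


def create_connector(base_url, body):
    """Submit the connector definition to the Kafka Connect REST API."""
    request = urllib.request.Request(
        f"{base_url}/connectors",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    body = build_sink_config("example-export", ["example-topic"])
    print(json.dumps(body, indent=2))
    # With a worker running locally (assumed address), uncomment to submit:
    # create_connector("http://localhost:8083", body)
```

Keeping the payload builder separate from the HTTP call makes the configuration easy to inspect or unit-test without a running Kafka Connect cluster.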

Company
Mux

Date published
Dec. 16, 2020

Author(s)
Scott Kidder

Word count
1859

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.