Adopting OpenTelemetry for our logging pipeline
Cloudflare recently migrated its logging pipeline from syslog-ng to the OpenTelemetry Collector, aiming to replace the existing infrastructure as transparently as possible. The reasons for the shift included the ability to contribute improvements more easily, better support for post-quantum cryptography libraries, built-in support for Prometheus metrics, and unification onto one daemon for all types of telemetry. The migration involved building an internal distribution of the OpenTelemetry Collector with OCB (OpenTelemetry Collector Builder), writing four custom components for the initial version (cfjs1exporter, fileexporter, externaljsonprocessor, and a ratelimit processor), and deploying the collector to both edge and core data centers. Lessons learned along the way included addressing failover issues and minimizing cutover delays. Planned improvements include better handling of log sampling, insights for engineering teams into the telemetry they produce, migration to OTLP as a line protocol, and upstreaming of the custom components.
Company
Cloudflare
Date published
June 3, 2024
Author(s)
Colin Douch, Jayson Cena
Word count
2146
Language
English
Hacker News points
None found.