Adopting OpenTelemetry for our logging pipeline

What's this blog post about?

Cloudflare recently migrated its logging pipeline from syslog-ng to the OpenTelemetry Collector. The goal was to replace the existing infrastructure as transparently as possible, which required building four internal components for the initial version of the collector. Reasons for the shift included the ability to contribute improvements more easily, better support for post-quantum cryptography libraries, built-in support for Prometheus metrics, and consolidation onto one daemon for all types of telemetry. The team built an internal distribution of the OpenTelemetry Collector using OCB (OpenTelemetry Collector Builder), created custom components such as cfjs1exporter, fileexporter, externaljsonprocessor, and a ratelimit processor, and deployed the collector to both edge and core data centers. Lessons learned included addressing failover issues and minimizing cutover delays. Planned improvements include better handling of log sampling, insights for engineering teams into the telemetry they produce, migration to OTLP as a line protocol, and upstreaming of the custom components.
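The summary names a ratelimit processor but does not say how it works. As a rough illustration of what such a pipeline stage could do, the sketch below applies a token bucket per log source. The token-bucket choice, the `LogRateLimiter` class, and the `service` field are all assumptions made for this example; none of them come from the original post, and the real component is part of a Go-based collector rather than Python.

```python
import time
from typing import Callable, Dict, Iterable, List


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `burst`."""

    def __init__(self, rate: float, burst: float, clock: Callable[[], float]):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst          # start with a full bucket
        self.last = clock()          # time of the last refill

    def allow(self) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class LogRateLimiter:
    """Hypothetical processor stage: one bucket per `service` value."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def process(self, records: Iterable[dict]) -> List[dict]:
        kept = []
        for record in records:
            # "service" is an assumed field name used to group log records.
            key = record.get("service", "")
            bucket = self.buckets.get(key)
            if bucket is None:
                bucket = TokenBucket(self.rate, self.burst, self.clock)
                self.buckets[key] = bucket
            if bucket.allow():
                kept.append(record)  # within budget: pass downstream
            # Otherwise the record is dropped.
        return kept
```

Keeping one bucket per source means a single noisy service exhausts only its own budget, while records from other services still pass through. The clock is injectable so the behavior can be checked without waiting on real time.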

Company
Cloudflare

Date published
June 3, 2024

Author(s)
Colin Douch, Jayson Cena

Word count
2146

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.