How we built ngrok's data platform

What's this blog post about?

ngrok, the company behind the popular tunneling service, has built a comprehensive data platform that manages an extensive data lake with a single full-time data person (the author of the blog post). The platform is designed to be scalable and flexible, and it relies on open-source tools such as Apache Iceberg, Apache Flink, and dbt. The work is organized around a small team of engineers operating horizontally across the organization: subject matter experts write reusable dbt models, and other teams contribute to the overall system. The platform stores several kinds of data, including customer information, usage metrics, subscription details, and third-party data, all of which are processed and analyzed through a combination of batch and streaming pipelines. One key challenge was integrating the existing data tools with ngrok's Go monorepo, which the team addressed by building standardized tooling, enforcing standards around dbt models, and adopting custom Nix derivations for various components. The platform also handles complex schemas in Airbyte through a combination of code generation, automated processing, and post-processing steps that keep data consistent across systems. Another significant challenge was running Apache Flink with Scala and Protobuf at large data volumes, which required customizing the Protobuf parser and creating a typeclass called ProtoHandle that provides schema parsing and encoding capabilities. The platform additionally uses meta signals to fight abuse, such as phishing scams, by analyzing metadata from various events and acting on the findings. Overall, the post presents ngrok's data platform as scalable, flexible, and maintainable, with a strong focus on collaboration and continuous improvement.

Company
ngrok

Date published
Sept. 26, 2024

Author(s)
Christian Hollinger

Word count
4477

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.