Airflow in Action: Data Engineering Insights from Processing PBs of Data Every Day at Stripe
Stripe, the payment infrastructure of the internet, processes $1 trillion in payments annually. To ensure regulatory compliance and maintain data integrity while still letting developers innovate quickly, Stripe developed User Scope Mode (USM), an internal tool for safely testing Apache Airflow® data pipelines without risk of corrupting production data. Stripe operates Airflow at massive scale, processing multiple petabytes of data daily and managing 250 complex pipelines with 150,000 tasks. The company is transitioning from its own Airflow fork to the mainline project to reduce engineering effort and gain faster access to new features. USM has revolutionized Stripe's development and testing workflows by enabling efficient validation of pipelines while upholding strict compliance, permissioning, and data integrity requirements.
Company
Astronomer
Date published
Nov. 21, 2024
Author(s)
Matthew Keep
Word count
676
Language
English
Hacker News points
None found.