Company
Date Published
Author
Sandy Ryza
Word count
2369
Language
English
Hacker News points
None

Summary

Dagster is a data orchestrator that helps build and operate machine learning pipelines. It models graphs of data assets and the data transformations that connect them, allowing users to define training and batch inference pipelines in Python and run them reliably in production. Unlike workflow managers like Apache Airflow, Dagster is designed for use during development and modeling, with a lightweight execution model that enables quick iteration and easy debugging. It also models data assets, not just tasks, which helps users understand how changes to one asset affect the entire pipeline. By using Dagster, machine learning engineers can try out ideas faster, translate those ideas into production more easily, and understand how they're performing over time.