
Processing Paradigms: Stream vs Batch in the ML Era

What's this blog post about?

Batch and stream processing are two paradigms for efficiently handling data ingestion and processing. Batch processing takes a finite set of input data, runs a job on it, and produces output data. It is generally measured by throughput and data quality, but it can introduce significant latency into a system. Stream processing, by contrast, consumes inputs and produces outputs continuously, operating on "events" shortly after they occur, which allows for near-real-time data ingestion and processing. When choosing between batch and stream processing pipelines, consider factors such as latency requirements and available resources. Both paradigms play a part in training, deploying, and maintaining quality ML models.
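The contrast between the two paradigms can be illustrated with a minimal Python sketch. It is not taken from the original post: the event shape `(user_id, amount)` and the function names `batch_totals` and `stream_totals` are hypothetical, chosen only to show a job that reads a finite input and produces one output next to a job that emits an updated result after each event.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (user_id, amount).
Event = Tuple[str, float]


def batch_totals(events: Iterable[Event]) -> Dict[str, float]:
    """Batch style: consume the whole finite input, then return one result."""
    totals: Dict[str, float] = defaultdict(float)
    for user, amount in events:
        totals[user] += amount
    return dict(totals)


def stream_totals(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Stream style: yield an updated running total right after each event."""
    totals: Dict[str, float] = defaultdict(float)
    for user, amount in events:
        totals[user] += amount
        yield user, totals[user]


if __name__ == "__main__":
    sample = [("a", 1.0), ("b", 2.0), ("a", 3.0)]

    # Batch: nothing is available until the whole job has finished.
    print(batch_totals(sample))

    # Stream: a result is available after every event.
    for update in stream_totals(sample):
        print(update)
```

In this sketch the batch function gives a complete answer only once all input has been read, which is where the latency comes from, while the generator makes a partial answer available after each event at the cost of keeping state in memory for as long as the stream runs.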

Company
Airbyte

Date published
Dec. 19, 2023

Author(s)
Jacob Prall

Word count
741

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.