What is data pipeline architecture?
A data pipeline is a set of actions and technologies that route raw data from different sources to a destination such as a data warehouse. It consists of three components: a data source, a transformation step, and a target destination. Data pipelines centralize data from disparate sources into one place for analysis and can enforce consistent data quality along the way. There are three types of data pipeline architecture: batch processing, stream processing, and hybrid processing. ETL pipelines always extract, transform, and load data into a target system in that order, whereas data pipelines more broadly do not always transform data or run in batches. Automated data connectors are the most effective way to reduce the engineering burden and free data analysts and data scientists to focus on analysis rather than pipeline maintenance.
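To make the three components concrete, here is a minimal batch-pipeline sketch in Python. The CSV source file, the SQLite database standing in for a warehouse, the column names, and the cleaning rule are all illustrative assumptions, not details from the article.

# Minimal sketch of the three components named above:
# a data source, a transformation step, and a target destination.
# File names, table names, and the cleaning rule are assumptions.
import csv
import sqlite3

def extract(path: str) -> list[dict]:
    # Source: read raw rows from a CSV export.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    # Transformation: drop incomplete rows and normalize types,
    # enforcing consistent data quality before loading.
    cleaned = []
    for row in rows:
        if not row.get("order_id") or not row.get("amount"):
            continue
        cleaned.append((row["order_id"], float(row["amount"])))
    return cleaned

def load(rows: list[tuple], db_path: str = "warehouse.db") -> None:
    # Destination: append the cleaned rows to a warehouse table
    # (SQLite here as a stand-in for a real data warehouse).
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

if __name__ == "__main__":
    load(transform(extract("orders.csv")))

This is the batch pattern: the whole job runs on a schedule and processes a finite set of records. A streaming pipeline would instead run the same transform-and-load steps continuously against each record as it arrives.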
Company: Fivetran
Date published: Nov. 8, 2022
Author(s): Fivetran
Word count: 2054
Hacker News points: None found.
Language: English