ELT for Beginners: Extract from S3, Load to Databricks and Run Transformations
This tutorial demonstrates how to build an ELT (extract, load, transform) pipeline using AWS S3 and Databricks, two popular data engineering tools. The pipeline extracts data from S3, loads it into Databricks, and runs transformations defined in Databricks notebooks. The pattern is versatile and applies across industries such as FinTech, e-commerce, and B2C services. The tutorial provides step-by-step instructions for setting up the necessary connections between S3, Databricks, and Airflow, creating the Databricks notebooks that hold the transformation logic, and deploying the ELT DAG to an Astro deployment.
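The extract-load-transform sequence the summary describes can be sketched as three chained steps. The sketch below is a minimal, self-contained stand-in: the function names (`extract_from_s3`, `load_to_warehouse`, `run_transformations`), the sample records, and the in-memory `warehouse` dict are all illustrative assumptions, not the tutorial's actual code. A real pipeline would read S3 objects with `boto3` and trigger notebook runs through Airflow's Databricks provider.

```python
# Illustrative ELT skeleton. All names and data here are hypothetical;
# in the real tutorial, extraction uses S3 and the load/transform steps
# run inside Databricks, orchestrated as an Airflow DAG.

def extract_from_s3(bucket: str) -> list[dict]:
    """Stand-in for the Extract step: read raw objects from an S3 bucket."""
    # Hardcoded sample records in place of real S3 reads.
    return [
        {"order_id": 1, "amount": 120.0},
        {"order_id": 2, "amount": 80.0},
    ]

def load_to_warehouse(records: list[dict], warehouse: dict) -> None:
    """Stand-in for the Load step: append raw records to a warehouse table."""
    warehouse.setdefault("raw_orders", []).extend(records)

def run_transformations(warehouse: dict) -> None:
    """Stand-in for the Transform step: logic a Databricks notebook would run."""
    raw = warehouse["raw_orders"]
    warehouse["order_totals"] = {
        "count": len(raw),
        "total": sum(r["amount"] for r in raw),
    }

# The three steps run in order, mirroring the DAG's task dependencies.
warehouse: dict = {}
load_to_warehouse(extract_from_s3("my-bucket"), warehouse)
run_transformations(warehouse)
print(warehouse["order_totals"])  # {'count': 2, 'total': 200.0}
```

In the ELT pattern (as opposed to ETL), the transformation step runs *after* the data is loaded, inside the warehouse itself, which is why the tutorial keeps the transformation logic in Databricks notebooks rather than in the extraction code.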
Company
Astronomer
Date published
Dec. 2, 2024
Author(s)
Tamara Fingerlin
Word count
2626
Language
English
Hacker News points
None found.