Push & Pull: Reducing DynamoDB spend with CDC & Kinesis
Propel exposes data sources such as Snowflake, Amazon S3, and Kafka through GraphQL and SQL APIs by syncing them into optimized ClickHouse tables. The first implementation of its sync scheduler continuously polled DynamoDB for changes, which became expensive as the company grew. To cut that cost, Propel introduced a push-based approach: DynamoDB publishes change data capture (CDC) records to Kinesis Data Streams, and the sync scheduler reacts to them. This reduced read usage on Propel's global secondary index (GSI) by 90%. Starting with the simple polling design, in keeping with the KISS principle, let the team iterate quickly and optimize later.
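As a rough illustration of the push-based side, the sketch below shows how a scheduler might react to CDC payloads instead of querying the GSI. It is not Propel's implementation. It assumes each Kinesis record body is the JSON change document that DynamoDB writes to Kinesis Data Streams (with `eventName` and `dynamodb.Keys` fields); the function name `syncs_to_schedule`, the key attribute `id`, and the table name `syncs` are hypothetical.

```python
import json
from typing import Iterable, List


def syncs_to_schedule(payloads: Iterable[bytes], key_attribute: str = "id") -> List[str]:
    """Return the keys of items that were inserted or modified, in first-seen order.

    `payloads` are raw Kinesis record bodies. `key_attribute` is a hypothetical
    string-typed partition key name; adjust it to the real table schema.
    """
    due: List[str] = []
    for raw in payloads:
        change = json.loads(raw)
        # Only new or updated items need scheduling; REMOVE events are skipped.
        if change.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        keys = change.get("dynamodb", {}).get("Keys", {})
        value = keys.get(key_attribute, {}).get("S")
        # De-duplicate so several changes to one item schedule it once.
        if value is not None and value not in due:
            due.append(value)
    return due


def example_payload(event_name: str, sync_id: str) -> bytes:
    """Build a minimal change document for demonstration (assumed shape)."""
    return json.dumps({
        "eventName": event_name,
        "tableName": "syncs",  # hypothetical table name
        "dynamodb": {"Keys": {"id": {"S": sync_id}}},
    }).encode("utf-8")


example = [
    example_payload("INSERT", "sync-1"),
    example_payload("MODIFY", "sync-1"),
    example_payload("REMOVE", "sync-3"),
    example_payload("MODIFY", "sync-2"),
]
print(syncs_to_schedule(example))
```

With this shape, the scheduler reads from the table only when a change arrives, rather than repeatedly querying the index on a timer.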
Company
Propel Data
Date published
Feb. 26, 2024
Author(s)
Mark Roberts
Word count
739
Hacker News points
1
Language
English