Push & Pull: Reducing DynamoDB spend with CDC & Kinesis

What's this blog post about?

Propel exposes data sources such as Snowflake, Amazon S3, and Kafka through GraphQL and SQL APIs by syncing them into optimized tables in ClickHouse. The initial implementation of its sync scheduler continuously polled DynamoDB for changes, which became expensive as the company grew. To cut that cost, Propel introduced a push-based approach: DynamoDB publishes change data capture (CDC) records to Kinesis Data Streams, and the sync scheduler reacts to them. The change reduced read usage on Propel's global secondary index (GSI) by 90%. Following the KISS principle, the company started with the simpler polling design so it could iterate quickly, and optimized later.
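The difference between the pull and push models can be illustrated with a minimal sketch. It is not Propel's implementation: the in-memory `FakeIndex`, the `sync_id` and `status` attributes, and the `PENDING`/`SCHEDULED` values are all hypothetical, and the change record is passed in as a JSON string rather than read from a real Kinesis stream. Only the `eventName` and `dynamodb.NewImage` fields follow the shape of DynamoDB change records.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeIndex:
    """In-memory stand-in for a DynamoDB GSI that counts query calls (hypothetical)."""
    items: dict = field(default_factory=dict)
    reads: int = 0

    def query_pending(self):
        # Every call counts as a read, whether or not anything changed.
        self.reads += 1
        return [key for key, status in self.items.items() if status == "PENDING"]


def poll_scheduler(index, ticks):
    """Pull model: query the index on every tick and schedule what it finds."""
    scheduled = []
    for _ in range(ticks):
        for sync_id in index.query_pending():
            index.items[sync_id] = "SCHEDULED"
            scheduled.append(sync_id)
    return scheduled


def handle_cdc_record(raw):
    """Push model: inspect one change record and return a sync id to schedule, or None.

    The attribute names `sync_id` and `status` are assumptions for illustration.
    """
    record = json.loads(raw)
    if record.get("eventName") not in ("INSERT", "MODIFY"):
        return None
    image = record.get("dynamodb", {}).get("NewImage", {})
    if image.get("status", {}).get("S") != "PENDING":
        return None
    return image.get("sync_id", {}).get("S")


# Pull: one pending sync still costs one read per tick.
index = FakeIndex(items={"sync-1": "PENDING"})
polled = poll_scheduler(index, ticks=100)
print(polled, index.reads)

# Push: the same sync arrives as a single change record and needs no index read.
event = json.dumps({
    "eventName": "INSERT",
    "dynamodb": {"NewImage": {"sync_id": {"S": "sync-1"}, "status": {"S": "PENDING"}}},
})
print(handle_cdc_record(event))
```

In the pull model the read count grows with the number of ticks, while in the push model the work grows with the number of changes, which is the general reason a CDC-driven scheduler can lower read usage. The actual savings depend on the workload; this sketch does not reproduce the figure reported in the post.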

Company
Propel Data

Date published
Feb. 26, 2024

Author(s)
Mark Roberts

Word count
739

Hacker News points
1

Language
English


By Matt Makai. 2021-2024.