Reading Very Large Postgres Tables - Top Lessons We Learned
Over the past year, Airbyte has worked on making its Postgres Source connector enterprise-grade by adding support for more replication methods (CDC and xmin), handling most data types and formats, and improving performance. Throughput has increased from 4 MB to 11 MB per second, compared with 5 MB per second for Fivetran. The key lessons learned are to read data in its natural order, break large table reads into smaller sub-queries, use checkpoints for recovery, transition to incremental updates once the initial load completes, and measure performance continuously. These improvements allow Airbyte to read a PostgreSQL table of any size without putting the server under stress or requiring increased resources.
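To make the sub-query and checkpoint lessons concrete, here is a minimal sketch. It is not Airbyte's implementation: it uses an in-memory SQLite database as a stand-in for Postgres so that it runs anywhere, and it orders by an integer primary key as a proxy for "natural order". The table `events`, the helper `make_demo_table`, and the function `read_in_chunks` are hypothetical names chosen for illustration.

```python
import sqlite3


def make_demo_table(n_rows):
    """Create a hypothetical in-memory table standing in for a large Postgres table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        [(f"row-{i}",) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def read_in_chunks(conn, start_after=0, chunk_size=1000):
    """Yield (rows, last_id) pairs, one bounded sub-query per chunk.

    Each query resumes strictly after the last id already read, so the
    server never has to materialize or sort the whole table at once.
    The caller persists last_id as a checkpoint after processing a chunk.
    """
    last_id = start_after
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        last_id = rows[-1][0]
        yield rows, last_id


if __name__ == "__main__":
    conn = make_demo_table(2500)
    checkpoint = 0  # in a real sync this would be loaded from durable storage
    for rows, last_id in read_in_chunks(conn, start_after=checkpoint):
        # ... write rows to the destination here ...
        checkpoint = last_id  # save only after the chunk is safely written
    print("final checkpoint:", checkpoint)
```

If the process fails midway, restarting with `start_after` set to the saved checkpoint continues from the next unread row instead of starting the table over. Saving the checkpoint only after a chunk has been written means a crash can cause a chunk to be re-read, but never skipped.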
Company
Airbyte
Date published
Aug. 9, 2023
Author(s)
Rodi Reich-Zilberman
Word count
1418
Hacker News points
1
Language
English