Faster Bulk-Data Loading in CockroachDB

What's this blog post about?

Cockroach Labs introduced a new algorithm for organizing files in Pebble, its storage engine, which cut ingestion time for the standard TPC-C benchmark dataset by more than 80%. The team had first replaced the implementation of the IMPORT bulk-loading feature with a simpler and faster data ingestion pipeline, but some IMPORTs then ran much slower or got stuck. The cause turned out to be that the new pipeline sent out-of-order data directly to the KV storage layer, while LSM-based stores like RocksDB keep data in sorted order. The fix depended on the company's recent switch from RocksDB to Pebble as its key-value store, which let the team add the new file-organization algorithm to the storage engine itself.
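To make the out-of-order problem concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not CockroachDB or Pebble code: each "file" is reduced to its smallest and largest key, and the helper names `file_ranges` and `max_overlap` are made up for this illustration. It shows that batches cut from an ordered key stream produce files with disjoint key ranges, while batches cut from a shuffled stream produce files whose ranges pile on top of one another.

```python
import random


def file_ranges(keys, batch_size):
    """Cut a key stream into fixed-size batches (hypothetical 'files').

    Each file is represented only by its (smallest, largest) key.
    """
    ranges = []
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        ranges.append((min(batch), max(batch)))
    return ranges


def max_overlap(ranges):
    """Return the largest number of files whose key ranges share a key.

    Sweep over range endpoints; ranges are inclusive at both ends, so an
    'open' event sorts before a 'close' event at the same key.
    """
    events = []
    for lo, hi in ranges:
        events.append((lo, 0))  # 0 = range opens
        events.append((hi, 1))  # 1 = range closes
    events.sort()
    depth = best = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


keys = list(range(1000))

# Ordered stream: every file covers its own slice of the key space.
in_order = file_ranges(keys, 100)

# Shuffled stream: every file contains keys from all over the key space.
shuffled = keys[:]
random.Random(0).shuffle(shuffled)
out_of_order = file_ranges(shuffled, 100)

print("in-order max overlap:    ", max_overlap(in_order))
print("out-of-order max overlap:", max_overlap(out_of_order))
```

In this model, a higher overlap count stands in for the extra work a sorted store has to do to reconcile files that cover the same keys. How Pebble actually organizes such files is described in the original post, not here.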

Company
Cockroach Labs

Date published
Oct. 13, 2020

Author(s)
Bilal Akhtar

Word count
5283

Hacker News points
14

Language
English


By Matt Makai. 2021-2024.