
Optimizations around Cold SSTables

What's this blog post about?

Cassandra stores data in large immutable files called SSTables, which compaction merges to discard obsolete data and improve read efficiency. In older versions, every SSTable was indexed at the same granularity, so the resources consumed grew in proportion to the amount of data stored. Not all SSTables are equally important, however, especially in time series data models, where recently written data is read most often. Cassandra 2.0.2 introduced tracking of per-SSTable read rates, which allowed manual tuning per table. Cassandra 2.0.3 used this data to improve size-tiered compaction, and 2.1 added automatic resource management. Size-tiered compaction gained two optimizations for cold SSTables: it prioritizes compacting the hottest SSTables, and it can skip cold SSTables altogether through a new compaction strategy option called 'cold_reads_to_omit'. Starting with Cassandra 2.1, this option is enabled by default with a value of 0.05. Cassandra 2.1 also reduces memory usage on systems with many cold SSTables by moving index summaries off-heap and periodically resizing them to fit within a fixed-size memory pool.
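One way to picture how 'cold_reads_to_omit' might pick SSTables to skip is the minimal Python sketch below. It is an illustration, not Cassandra's implementation: it assumes the option is the largest fraction of total reads that the skipped SSTables may account for, it ranks SSTables by raw reads per second, and the names `SSTable`, `reads_per_sec`, and `omit_cold_sstables` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    # Hypothetical stand-in for an SSTable and its tracked read rate.
    name: str
    reads_per_sec: float


def omit_cold_sstables(sstables, cold_reads_to_omit=0.05):
    """Return the SSTables that stay eligible for compaction.

    Assumption: the coldest SSTables are skipped for as long as their
    combined reads stay within cold_reads_to_omit of the total read rate.
    """
    total = sum(s.reads_per_sec for s in sstables)
    if total == 0 or cold_reads_to_omit <= 0:
        # No read data, or the option is disabled: keep everything.
        return list(sstables)

    budget = total * cold_reads_to_omit
    omitted_reads = 0.0
    by_coldest = sorted(sstables, key=lambda s: s.reads_per_sec)

    for i, table in enumerate(by_coldest):
        if omitted_reads + table.reads_per_sec > budget:
            # Skipping this one would exceed the budget: keep it and
            # every hotter SSTable after it.
            return by_coldest[i:]
        omitted_reads += table.reads_per_sec

    return []


tables = [
    SSTable("a", 100.0),
    SSTable("b", 50.0),
    SSTable("c", 2.0),
    SSTable("d", 1.0),
]
kept = omit_cold_sstables(tables, cold_reads_to_omit=0.05)
kept_names = sorted(t.name for t in kept)
```

In this example the two coldest SSTables together serve 3 of 153 reads per second, which is under the 5% budget, so only "a" and "b" remain compaction candidates. Setting the option to 0 keeps every SSTable eligible.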

Company
DataStax

Date published
Dec. 3, 2013

Author(s)
Tyler Hobbs

Word count
872

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.