
Improving Secondary Index Write Performance in 1.2

What's this blog post about?

Secondary indexes, introduced in Cassandra 0.7, allow data to be accessed by attributes other than the row key. Each index uses an auxiliary column family to model an inverted index of values from a primary column family, and the SecondaryIndex interface is an extension point for alternative implementations. Indexes add complexity because they must be kept in sync with the primary data: when a RowMutation is received and any of the mutated columns are configured with secondary indexes, additional work is required to keep the two consistent. Before 1.2, that work included a read-before-write. Cassandra 1.2 removes this requirement by pushing index updates down the stack, writing new index entries at the same time as the primary data is updated, and deleting old entries lazily at query time, in effect implementing read-repair for indexes. The result was an ~11% improvement in write throughput.
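The write-then-lazily-clean-up idea described above can be illustrated with a small in-memory sketch. This is not Cassandra code: the class ToyIndexedTable, its methods, and the dictionary layout are all hypothetical, chosen only to show a write path that never reads the old value and a query path that discards stale index entries when it finds them.

```python
from collections import defaultdict


class ToyIndexedTable:
    """Hypothetical toy model; not Cassandra's actual data structures."""

    def __init__(self):
        # Primary data: row_key -> {column: value}
        self.primary = {}
        # Inverted index: (column, value) -> set of row keys
        self.index = defaultdict(set)

    def write(self, row_key, column, value):
        # No read-before-write: the old value is never looked up,
        # so any old index entry is simply left behind as stale.
        self.primary.setdefault(row_key, {})[column] = value
        self.index[(column, value)].add(row_key)

    def query(self, column, value):
        matches = []
        # sorted() copies the candidates, so the set can be modified safely.
        for row_key in sorted(self.index.get((column, value), ())):
            if self.primary.get(row_key, {}).get(column) == value:
                matches.append(row_key)
            else:
                # Stale entry: the primary data has moved on, so repair
                # the index lazily at query time.
                self.index[(column, value)].discard(row_key)
        return matches


table = ToyIndexedTable()
table.write("u1", "state", "TX")
table.write("u1", "state", "CA")  # leaves a stale ("state", "TX") entry for u1
table.write("u2", "state", "TX")
print(table.query("state", "TX"))  # u1 is filtered out and its stale entry removed
```

The trade-off in this sketch is that writes stay cheap while queries absorb the cost of validating each candidate against the primary data.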

Company
DataStax

Date published
March 28, 2013

Author(s)
Sam Tunnicliffe

Word count
805

Language
English

Hacker News points
None found.

