Tuning the row cache in Cassandra 2.1

What's this blog post about?

Cassandra performs best when the data a query needs is already in memory, because disk reads are comparatively slow. An effective data model therefore writes rows to disk in the same order they will be read, using the PRIMARY KEY to control that order. The post's example is a table of time-series status updates whose primary key stores rows on disk in reverse chronological order by status_id, so the last 10 status updates for a user can be retrieved efficiently. Cassandra's row cache builds on this layout: you enable it and specify how many rows to cache per partition. You must also tell Cassandra how much memory to dedicate to the cache through the row_cache_size_in_mb setting in the cassandra.yaml config file. To check that data is really being served from the cache rather than from disk, enable tracing in cqlsh; the trace shows whether a disk read was needed. If the cached rows are not enough to satisfy a request, a disk read may still be necessary. This can be mitigated by raising the cache size limit or by restructuring the table so that frequently accessed rows sit at the head of the partition. By studying your application's query model and tuning the cache according to these practices, you can achieve very good response times without an external caching layer.
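
The interaction between on-disk row order and a per-partition row limit can be illustrated with a toy model. The sketch below is not Cassandra code and makes several simplifying assumptions: the `Partition` class, its `rows_per_partition` field, and the `read_head` method are hypothetical names chosen for illustration, integers stand in for status_id values, and cache warm-up, invalidation on write, and the overall memory limit are ignored. It only shows why a query for the newest rows can be answered from the cached head of a partition, while a larger query cannot.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy stand-in for one user's partition (hypothetical, not Cassandra's API)."""
    rows_per_partition: int                    # assumed cap on cached head rows
    rows: list = field(default_factory=list)   # (status_id, body), newest first

    def insert(self, status_id: int, body: str) -> None:
        self.rows.append((status_id, body))
        # Keep rows in descending status_id order, mimicking a table that
        # stores updates in reverse chronological order.
        self.rows.sort(key=lambda r: r[0], reverse=True)

    def read_head(self, limit: int):
        """Return the newest `limit` rows and whether the cached head suffices."""
        fits_in_cache = (
            limit <= self.rows_per_partition
            or len(self.rows) <= self.rows_per_partition
        )
        return self.rows[:limit], fits_in_cache


if __name__ == "__main__":
    p = Partition(rows_per_partition=10)
    for i in range(25):
        p.insert(i, f"update {i}")

    _, hit = p.read_head(10)
    print("last 10 served from cached head:", hit)

    _, hit = p.read_head(20)
    print("last 20 served from cached head:", hit)
```

In this model the second read falls outside the cached head, which corresponds to the case where the real trace would report a disk read. Raising `rows_per_partition` in the sketch plays the role of caching more rows per partition in the actual table definition.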

Company
DataStax

Date published
May 16, 2014

Author(s)
Ryan McGuire

Word count
784

Language
English

Hacker News points
None found.