
Common Mistakes and Misconceptions

What's this blog post about?

The post discusses important operational aspects of running Apache Cassandra, an open-source distributed database management system. It covers the following key points:

1. Repair: This process ensures data consistency across replicas, but it can be expensive in terms of resources and latency. Running repair weekly is recommended.

2. read_repair_chance: This setting controls how often Cassandra checks for inconsistencies between replicas during reads. The default value is 0.1, which means that 10% of read requests will trigger a background read repair.

3. Cleanup: This process removes data that a node no longer owns after a topology change. Cleanup should be run only when necessary, not on a regular schedule.

4. Compaction: This optimization process merges rows in the background to reduce the IO and CPU time needed for reads. Major compactions can exacerbate issues with tombstones and updates, so it is better to let compaction run its natural course.

5. JVM Heap Size: Cassandra performs best with a relatively small heap (ideally less than 12 GB). Memory not allocated to the heap is not wasted: Cassandra uses it for memory-mapped IO, which improves overall efficiency.
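To make the read_repair_chance point concrete, the following is a minimal sketch of what a 0.1 setting means in practice: each read independently has a 10% chance of also triggering a background read repair. This is an illustrative simulation only, not Cassandra's actual code; the function names `triggers_read_repair` and `simulate` are hypothetical.

```python
import random


def triggers_read_repair(read_repair_chance: float, rng: random.Random) -> bool:
    """Decide whether a single read also kicks off a background read repair.

    Hypothetical model: draw a uniform number in [0, 1) and compare it with
    the configured chance. This is not Cassandra's real implementation.
    """
    return rng.random() < read_repair_chance


def simulate(reads: int, read_repair_chance: float = 0.1, seed: int = 42) -> float:
    """Return the fraction of simulated reads that triggered a read repair."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    repaired = sum(
        triggers_read_repair(read_repair_chance, rng) for _ in range(reads)
    )
    return repaired / reads


if __name__ == "__main__":
    fraction = simulate(100_000)
    print(f"Fraction of reads that triggered read repair: {fraction:.3f}")
```

Over a large number of reads the printed fraction settles close to the configured value, which is why the setting can be read as "roughly one read in ten".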

Company
DataStax

Date published
Oct. 11, 2013

Author(s)
Ben Coverston

Word count
1393

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.