Handling Disk Failures In Cassandra 1.2

What's this blog post about?

The post explains how Cassandra 1.2 improves the handling of disk failures. Cassandra is robust against the failure of whole nodes, but before version 1.2 a single unavailable disk could make an entire replica unresponsive because of problems with memtables and commitlog appends. The traditional workaround was to use RAID10 volumes, an approach that becomes less feasible as data volumes grow. The upcoming Cassandra 1.2 release introduces a disk_failure_policy setting with two options, stop and best_effort. With stop, the affected node is stopped; with best_effort, the failed drive is blacklisted. Which one to choose depends on availability and consistency requirements. As a result, Cassandra nodes can be deployed with large disk arrays without the overhead of RAID10.
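The difference between the two policies can be illustrated with a small toy model. This is only a sketch of the behavior described above, not Cassandra's actual implementation: the `Node` class, its fields, and the directory paths are hypothetical names invented for the example. Only the setting name disk_failure_policy and its values stop and best_effort come from the post.

```python
from dataclasses import dataclass, field
from enum import Enum


class DiskFailurePolicy(Enum):
    """The two policy values named in the post."""
    STOP = "stop"
    BEST_EFFORT = "best_effort"


@dataclass
class Node:
    """Hypothetical stand-in for a Cassandra node (illustration only)."""
    policy: DiskFailurePolicy
    data_dirs: list[str]
    blacklisted: set[str] = field(default_factory=set)
    serving: bool = True

    def on_disk_failure(self, failed_dir: str) -> None:
        if self.policy is DiskFailurePolicy.STOP:
            # stop: the whole node stops serving requests.
            self.serving = False
        else:
            # best_effort: only the failed drive is taken out of use.
            self.blacklisted.add(failed_dir)

    def usable_dirs(self) -> list[str]:
        """Directories the node can still use after any failures."""
        if not self.serving:
            return []
        return [d for d in self.data_dirs if d not in self.blacklisted]


node = Node(DiskFailurePolicy.BEST_EFFORT, ["/data/disk1", "/data/disk2"])
node.on_disk_failure("/data/disk1")
print(node.serving, node.usable_dirs())
```

In this model a best_effort node stays up with one fewer drive, while a stop node would report no usable directories at all. That mirrors the availability versus consistency trade-off the post describes.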

Company
DataStax

Date published
Oct. 11, 2012

Author(s)
Aleksey Yeschenko

Word count
237

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.