When a timeout is not a failure: how Cassandra delivers high availability, part 1
Cassandra is designed for fault tolerance and availability in distributed systems. A client may send a request to any node in a Cassandra cluster; that node acts as the coordinator and routes the request to the appropriate replicas. If the coordinator fails mid-request, or if a replica fails before the request arrives, the client is left in the dark and has no choice but to retry. If a replica fails after the coordinator has forwarded the client's request, Cassandra replies with a TimedOutException that includes an acknowledged_by count of how many replicas succeeded. Such a timeout is not a failure: the coordinator can still force the result toward either the pre-update or the post-update state. With hinted handoff it rolls forward, storing the update locally and re-sending it to the failed replica when that replica recovers.
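The flow described above can be illustrated with a small in-memory simulation. This is only a sketch under simplifying assumptions, not Cassandra's implementation: the `Replica`, `Coordinator`, and `TimedOut` classes and the `required_acks` parameter are hypothetical names invented for the example, a down replica is modeled as immediately not acknowledging rather than as a real network timeout, and writes are sent sequentially.

```python
from dataclasses import dataclass, field


class TimedOut(Exception):
    """Toy stand-in for TimedOutException (hypothetical, not the real class)."""

    def __init__(self, acknowledged_by):
        super().__init__(f"only {acknowledged_by} replica(s) acknowledged")
        self.acknowledged_by = acknowledged_by


@dataclass
class Replica:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        if not self.up:
            return False  # a down replica never acknowledges
        self.data[key] = value
        return True


@dataclass
class Coordinator:
    replicas: list
    hints: list = field(default_factory=list)  # pending (replica, key, value)

    def write(self, key, value, required_acks):
        acks = 0
        for replica in self.replicas:
            if replica.apply(key, value):
                acks += 1
            else:
                # Hinted handoff: keep the update locally for later delivery.
                self.hints.append((replica, key, value))
        if acks < required_acks:
            # Not a rollback: replicas that acknowledged keep the update.
            raise TimedOut(acknowledged_by=acks)
        return acks

    def replay_hints(self):
        # Re-send stored updates; keep any hint whose replica is still down.
        self.hints = [
            (replica, key, value)
            for replica, key, value in self.hints
            if not replica.apply(key, value)
        ]


replicas = [Replica("a"), Replica("b", up=False), Replica("c", up=False)]
coordinator = Coordinator(replicas)

try:
    coordinator.write("user:1", "alice", required_acks=2)
    outcome = "ok"
except TimedOut as exc:
    outcome = f"timed out, acknowledged_by={exc.acknowledged_by}"

# The failed replicas recover and the coordinator rolls the write forward.
for replica in replicas:
    replica.up = True
coordinator.replay_hints()
converged = all(r.data.get("user:1") == "alice" for r in replicas)
print(outcome, converged)
```

In this sketch the client sees a timeout with `acknowledged_by=1`, yet once the two failed replicas come back and the hints are replayed, all three replicas hold the post-update value.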
Company
DataStax
Date published
Aug. 15, 2012
Author(s)
Jonathan Ellis
Word count
567
Hacker News points
None found.
Language
English