Company
Date Published
Aug. 15, 2012
Author
Jonathan Ellis
Word count
567
Language
English
Hacker News points
None

Summary

Cassandra is designed for fault tolerance and availability in distributed systems. When a client makes a request, it may talk to any node in a Cassandra cluster, which acts as the coordinator responsible for routing the request to appropriate replicas. If the coordinator fails mid-request or if a replica fails before the request arrives, the client is in the dark and has no choice but to retry. In case of a replica failure after the coordinator has forwarded the client's request, Cassandra replies with a TimedOutException and provides an acknowledged_by count of how many replicas succeeded. The coordinator can force the results towards either the pre-update or post-update state using hinted handoff, which stores the update locally and re-sends it to the failed replica when it recovers.