An Epic Read on Follower Reads

Company

Cockroach Labs

Date Published

Sept. 16, 2021

Author

Andrei Matei

Word count

8679

Language

English

Hacker News points

URL

www.cockroachlabs.com/blog/follower-reads-stale-data

Summary

CockroachDB is a distributed SQL database that provides strong consistency, survivability, and high availability. One of the key features of CockroachDB is its ability to serve stale reads from follower replicas, which can significantly improve read performance in multi-region deployments. In this post, we'll discuss how CockroachDB enables follower reads and some of the trade-offs involved. Closed Timestamps: CockroachDB uses closed timestamps to enable follower reads. A closed timestamp is a point in time after which no writes can occur on a range (a contiguous set of keys). Once a timestamp is closed, it's guaranteed that all subsequent reads at or below the closed timestamp will see a consistent view of the data. To close a timestamp, CockroachDB uses a side-transport mechanism to propagate closed timestamp information between nodes in the cluster. Each node maintains a sender-side state and a receiver-side state for each connection with another node. The sender updates its state when it sends closed timestamp information, and the receiver updates its state when it receives this information. When a follower replica wants to determine whether it's up to date enough to serve a read at a particular timestamp, it checks its own state against the closed timestamp information received from other nodes. If the timestamp is below or equal to the range's closed timestamp, the follower can safely serve the read without blocking on any locks. Optimizations: To minimize access to the side-transport's state and improve performance, CockroachDB uses a per-replica cache in front of the side-transport and queries the closed timestamp information in conjunction with a timestamp that's "sufficient". The sufficient threshold is the timestamp of the read which we suspect we might be able to serve locally as a follower-read. Transaction commit and closed timestamps vs resolved timestamps: CockroachDB allows transactions to commit at timestamps below the closed timestamp, but it doesn't count transaction commits as writes for the purpose of updating closed timestamps. This is because transaction commits are not tied to any particular range and can only ratchet up the timestamps of the committed keys - so they cannot cause keys to float down from >ct to <ct. Request routing: CockroachDB uses a component called the DistSender to route read/write requests to the right place. The DistSender tries to route requests to leaseholders, but if the request is read-only with a timestamp that's old enough, it chooses the closest replica instead. A read timestamp is considered "old enough" if it's likely that every follower has been informed of a closed timestamp higher than the read timestamp. Comparing CockroachDB with Spanner: CockroachDB and Google's Spanner have some similarities in how they handle stale reads, but there are also some differences. In Spanner, every replica serves both strongly-consistent and stale reads, while in CockroachDB, the DistSender tries to route requests to leaseholders for strong reads and to followers for stale reads. Additionally, Spanner requires a follower to have caught up to a resolved timestamp rather than a closed timestamp before serving a read at that timestamp. In conclusion, CockroachDB's ability to serve follower reads is an important feature that can significantly improve read performance in multi-region deployments. By using closed timestamps and optimizing the request routing process, CockroachDB enables efficient and consistent stale reads while maintaining strong consistency guarantees.