
Moving Quicksilver into production

What's this blog post about?

The post describes Cloudflare's migration from Kyoto Tycoon, its existing key-value (KV) data store, to Quicksilver, a new distributed KV data store. The migration was challenging because it required zero downtime and seamless integration with existing systems. The team built QSKTBridge, a bridge service that replicated from Kyoto Tycoon and wrote the changes to Quicksilver in batches every 500ms. They then phased out reads and writes to Kyoto Tycoon gradually, confirming that Quicksilver was performing well in each data center before moving on to the next. Several issues surfaced along the way: replication saturation caused by I/O contention, high write amplification that increased pressure on SSDs, and difficulty configuring the topology dynamically. Removing Kyoto Tycoon from the edge took around a year, and removing it from the core was even harder. Despite these challenges, the team migrated all Cloudflare services to Quicksilver, with significant improvements in performance and reliability. They are now working on a sharded version of Quicksilver to address storage size problems and to keep pace with growing product and customer bases. Overall, the migration yielded valuable insights into how the KV store's design should evolve for future needs.
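
To make the batching idea in the summary concrete, here is a minimal sketch of a bridge that buffers incoming changes and writes them out once a 500ms window has elapsed. It is not the actual QSKTBridge implementation: the class name `BatchingBridge`, the `write_batch` callback, and the explicit `now` argument are all hypothetical names chosen for illustration, and the real service may batch, order, or retry differently.

```python
from typing import Callable, List, Optional, Tuple

Change = Tuple[str, bytes]  # hypothetical (key, value) pair read from the source store


class BatchingBridge:
    """Illustrative only: buffers changes and flushes them as one batch per interval."""

    def __init__(self, write_batch: Callable[[List[Change]], None],
                 interval_s: float = 0.5) -> None:
        self._write_batch = write_batch      # assumed sink, e.g. a write to the new store
        self._interval_s = interval_s        # 0.5 s matches the 500ms figure in the summary
        self._pending: List[Change] = []
        self._window_start: Optional[float] = None

    def on_change(self, key: str, value: bytes, now: float) -> None:
        """Record one replicated change, then flush if the window has elapsed."""
        if not self._pending:
            self._window_start = now         # the window opens with the first buffered change
        self._pending.append((key, value))
        self.maybe_flush(now)

    def maybe_flush(self, now: float) -> bool:
        """Write the buffered changes as a single batch once the interval has passed."""
        if not self._pending or self._window_start is None:
            return False
        if now - self._window_start < self._interval_s:
            return False
        self._write_batch(list(self._pending))  # pass a copy so the sink owns its batch
        self._pending.clear()
        self._window_start = None
        return True
```

Passing the clock value in as `now` keeps the sketch deterministic. A real service would more likely read a monotonic clock and call `maybe_flush` from a timer, so that a quiet period does not leave buffered changes unwritten.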

Company
Cloudflare

Date published
Nov. 25, 2020

Author(s)
Geoffrey Plouviez

Word count
4056

Hacker News points
4

Language
English


By Matt Makai. 2021-2024.