Dealing with rejection (in distributed systems)

Blog post from WarpStream

Post Details
Company: WarpStream
Author: Richard Artoul
Word Count: 2,458
Language: English
Hacker News Points: 12
Summary

There are two ways to learn about distributed systems: studying the literature, and building and operating them in production. The author, who has spent 10 years building and operating large-scale distributed databases, draws on the second: practical knowledge of what it takes to turn a design into an implementation that works at scale. Many such topics are poorly covered in the literature. One is backpressure, a critical detail that every distributed system must get right to survive in production. The post describes the backpressure system built for WarpStream, which uses signals such as memory usage and in-flight request counts to decide when to reject work, and argues that this approach performs and scales better than traditional rate limiting. The goal is to make the system feel "springy": it recovers immediately when resources are added or load is removed.
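
The summary does not show how such a mechanism is built, so the following is only a minimal sketch of the general idea, not WarpStream's implementation. It assumes a hypothetical `AdmissionController` that tracks in-flight request count and in-flight payload bytes (used here as a rough stand-in for memory usage) and rejects new work once either would exceed its limit. All names and limits are illustrative.

```python
import threading


class Rejected(Exception):
    """Raised when a request is shed instead of being queued."""


class AdmissionController:
    """Hypothetical gate that rejects work once in-flight load hits a limit."""

    def __init__(self, max_inflight_requests, max_inflight_bytes):
        self.max_inflight_requests = max_inflight_requests
        self.max_inflight_bytes = max_inflight_bytes
        self._inflight_requests = 0
        self._inflight_bytes = 0  # assumption: payload bytes approximate memory use
        self._lock = threading.Lock()

    def try_admit(self, request_bytes):
        """Reserve capacity for one request; return False if it must be rejected."""
        with self._lock:
            over_requests = self._inflight_requests + 1 > self.max_inflight_requests
            over_bytes = self._inflight_bytes + request_bytes > self.max_inflight_bytes
            if over_requests or over_bytes:
                return False
            self._inflight_requests += 1
            self._inflight_bytes += request_bytes
            return True

    def release(self, request_bytes):
        """Return the capacity reserved by a previously admitted request."""
        with self._lock:
            self._inflight_requests -= 1
            self._inflight_bytes -= request_bytes


def handle(controller, payload, process):
    """Run process(payload) only if the controller admits it."""
    size = len(payload)
    if not controller.try_admit(size):
        raise Rejected("overloaded, retry later")
    try:
        return process(payload)
    finally:
        # Capacity is freed as soon as the work finishes, with no cooldown timer.
        controller.release(size)
```

Because the decision depends only on the load currently in flight, rather than on a fixed requests-per-second budget, the gate reopens the moment earlier requests complete or the limits are raised. That is one way to get the "springy" behavior the summary describes.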