Load Balancing without Load Balancers
Cloudflare experienced an hour-long outage last weekend, caused by a flaw in its architecture that the company is working to eliminate. Cloudflare designs its systems on the assumption that failure is inevitable and must be planned for at every level. It uses Anycast routing on both the wide area network (WAN) and the local area network (LAN) to handle failures gracefully. If a process, server, switch, or router crashes, traffic fails over to the next closest data center. The company continues to work on improving its fault tolerance and is hiring for roles focused on building a robust network.
Company
Cloudflare
Date published
March 6, 2013
Author(s)
Matthew Prince
Word count
2183
Hacker News points
None found.
Language
English