Twilio engineers improved the availability of their core services by combining Chaos Engineering with Ratequeue HA, a high-availability design that removes the need for human intervention in common faults affecting their queueing and rate-limiting system. The team built a custom solution on top of existing Twilio services to automate failover: it detects failure of the primary host, promotes a replica, and preserves data integrity. They also built Ratequeue Chaos, a tool that simulates failures, monitors recovery, and validates that the automated failover works as intended. Together, these changes increased the system's resilience and availability: once a failure is detected, the automated failover completes in under a minute.
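The failover loop described above (detect primary failure, promote a replica, protect data integrity) and the chaos check that exercises it can be illustrated with a minimal in-memory sketch. This is not Twilio's implementation: the `Host` and `FailoverController` classes, the `replicated_offset` field, the `"fenced"` role, and the failure threshold are all hypothetical names and choices made for illustration, and a real system would replace the in-memory flags with network health checks and actual replication state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical model of one queueing host (not a Twilio type)."""
    name: str
    role: str                    # "primary", "replica", or "fenced"
    healthy: bool = True
    replicated_offset: int = 0   # assumed measure of how much data the host has applied


class FailoverController:
    """Promotes a replica after several consecutive failed health checks."""

    def __init__(self, hosts: List[Host], failure_threshold: int = 3):
        self.hosts = hosts
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def primary(self) -> Optional[Host]:
        return next((h for h in self.hosts if h.role == "primary"), None)

    def tick(self) -> Optional[Host]:
        """Run one health-check cycle; return the new primary if a failover happened."""
        current = self.primary()
        if current is not None and current.healthy:
            self.consecutive_failures = 0
            return None
        # Require repeated failures so a single missed check does not trigger failover.
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failure_threshold:
            return None
        return self._promote(current)

    def _promote(self, failed: Optional[Host]) -> Optional[Host]:
        candidates = [h for h in self.hosts if h.role == "replica" and h.healthy]
        if not candidates:
            return None
        # Data-integrity heuristic (an assumption): pick the most caught-up replica.
        new_primary = max(candidates, key=lambda h: h.replicated_offset)
        if failed is not None:
            failed.role = "fenced"   # keep the old primary from serving again
        new_primary.role = "primary"
        self.consecutive_failures = 0
        return new_primary


def run_chaos_experiment(controller: FailoverController, max_ticks: int = 10) -> Optional[int]:
    """Fail the primary on purpose; return how many ticks recovery took, or None."""
    victim = controller.primary()
    if victim is None:
        return None
    victim.healthy = False           # simulated fault injection
    for tick in range(1, max_ticks + 1):
        if controller.tick() is not None:
            return tick
    return None


if __name__ == "__main__":
    hosts = [
        Host("host-a", "primary", replicated_offset=10),
        Host("host-b", "replica", replicated_offset=8),
        Host("host-c", "replica", replicated_offset=10),
    ]
    controller = FailoverController(hosts, failure_threshold=3)
    ticks = run_chaos_experiment(controller)
    print(f"recovered after {ticks} checks; new primary: {controller.primary().name}")
```

In this sketch, recovery time is measured in health-check cycles rather than seconds; multiplying the tick count by a real check interval would give a figure comparable to the under-a-minute target mentioned above. Requiring several consecutive failures before promoting trades a slightly slower failover for fewer false positives.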