Company
Date Published
Feb. 22, 2024
Author
Bu Kinoshita
Word count
469
Language
English
Hacker News points
73

Summary

Resend experienced a significant outage on February 21st, 2024, which lasted approximately 12 hours and affected all users. The cause was a database migration command mistakenly pointed at the production database, which dropped all of its tables. The company immediately began restoring from backups, but two restore attempts failed, and some users ultimately lost a 5-minute window of data. Resend says it is committed to learning from the incident and to improving its operations and tooling to avoid similar outages. Planned measures include improvements to local development, added redundancy, and more frequent disaster recovery tests. The goal is to restore reliability and minimize the impact of any future incident on users.