How Go shadowing and bad choices caused our first data loss bug in years
Ryland Goldstein of Temporal, an MIT-licensed open-source platform for building highly reliable distributed applications, describes a recent data-loss bug that affected users' mission-critical applications. The bug was caused by variable shadowing in Temporal's Go persistence code, not by a defect in Go itself, and it occurred only when Cassandra returned specific errors from failed transactions. The team initially assumed the issue stemmed from memory problems on the clusters running their persistence layer, but later traced it to the shadowed variable. They have since taken steps to prevent similar issues, including adding tests for dependency-level failures and improving communication channels with users.
Company
Temporal
Date published
March 3, 2021
Author(s)
Ryland Goldstein
Word count
2207
Hacker News points
None found.
Language
English