How Go shadowing and bad choices caused our first data loss bug in years

What's this blog post about?

Ryland Goldstein of Temporal, an MIT-licensed open-source platform for building highly reliable distributed applications, describes a recent data-loss bug that affected users' mission-critical applications. The bug was a Go variable-shadowing mistake in the persistence code, and it surfaced only when Cassandra returned specific errors from failed transactions. The team first suspected memory problems in the clusters running their persistence layer, and only later traced the data loss to the shadowing bug. They have since taken steps to prevent a recurrence, including adding tests for dependency-level failures and improving communication channels with users.

Company
Temporal

Date published
March 3, 2021

Author(s)
Ryland Goldstein

Word count
2207

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.