Reliability Review Q1 2022

What's this blog post about?

Buildkite, a software development tool used by thousands of teams worldwide, conducted a reliability review in Q1 2022 following several reliability incidents in late 2021. The company is now defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to better understand customer expectations and improve the product's reliability. It has also introduced error budgets: when a budget is exhausted, the team must stop feature work and focus on reliability. Additionally, Buildkite now operates from a third AWS availability zone (AZ), improving resilience to single-AZ incidents. The company plans to continue its database improvements, including a potential migration to Aurora and partitioning of large tables.
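
The error-budget rule described above can be illustrated with a minimal sketch. This is not Buildkite's implementation: the `ErrorBudget` class, its field names, and the 99.9% target are hypothetical, and the example assumes a simple request-based SLI (the share of requests that succeed within a window).

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical request-based error budget for one SLO window."""

    slo_target: float     # assumed target, e.g. 0.999 means 99.9% of requests succeed
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that violated the SLI

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Failures still available before the budget runs out (negative if overspent).
        return self.allowed_failures - self.failed_requests

    @property
    def exhausted(self) -> bool:
        # Once exhausted, the policy is to pause feature work and focus on reliability.
        return self.remaining <= 0


healthy = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
overspent = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=1_500)
```

Under these assumed numbers, a 99.9% target over one million requests permits roughly 1,000 failures, so `healthy` still has budget left while `overspent` would trigger the switch from feature work to reliability work.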

Company
Buildkite

Date published
April 11, 2022

Author(s)
Miguel Molina

Word count
1066

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.