/plushcap/analysis/datadog/introducing-recovery-thresholds

Introducing recovery thresholds for metric alerts

What's this blog post about?

The text discusses the problem of unstable metrics causing frequent alerts and introduces recovery thresholds in Datadog to address this issue. Recovery thresholds work on the principle of hysteresis, similar to how a thermostat regulates temperature. They set different values for when an alert triggers and when it resolves, ensuring that an alert only enters the "recovered" state once a metric has passed the recovery threshold. This helps reduce noise and distraction from flapping monitors and increases confidence in issue resolution. Recovery thresholds can be set up during monitor creation via the UI or API, with separate thresholds for recovery from alert and warning states.

Company
Datadog

Date published
Nov. 6, 2017

Author(s)
Rachel Windzberg

Word count
364

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.