Watchdog: Auto-detect performance anomalies without setting alerts
Watchdog is an auto-detection engine by Datadog that surfaces performance problems in applications without any manual setup or configuration. It uses machine learning algorithms to automatically detect issues such as latency spikes, elevated error rates, and network issues across microservices, endpoints, and cloud zones. The tool creates a feed of "stories" with notable findings, each highlighting the affected resource, timeframe, and impact duration. Each story links to a detail page that provides further context and aggregates performance statistics for the specific service or resource. Watchdog also identifies likely culprits by surfacing related behavior and common stack traces. It detects network issues in cloud infrastructure and helps identify the underlying cause of anomalies. The tool builds on Datadog's existing machine learning features to provide high-quality results tailored for large-scale infrastructure and applications, with plans to add more algorithms for root cause analysis and Kubernetes anomaly detection.
Company
Datadog
Date published
July 12, 2018
Author(s)
Brad Menezes
Word count
512
Hacker News points
3
Language
English