Best practices for monitoring and remediating connection churn
Connection churn is the rate at which clients open and close TCP connections in a distributed system. Elevated churn can increase latency, create request bottlenecks, and reduce throughput because of the overhead of repeatedly creating and closing network connections. Common causes include spikes in user traffic, misconfigured client services, and load balancing issues. To detect elevated connection churn, gather monitoring data from all distributed services and use tools like Datadog Network Performance Monitoring (NPM) and Universal Service Monitoring (USM) to track established connections, closed connections, TCP socket latency, and more.
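As a rough illustration of the idea of churn as a rate, the sketch below derives connections opened plus closed per second from two samples of cumulative counters. It is not Datadog's API or method. The names ConnSample, churn_rate, and is_elevated, and the 3x baseline factor, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnSample:
    """Cumulative TCP connection counters at one moment (hypothetical shape)."""
    timestamp_s: float   # sample time in seconds
    opened_total: int    # connections established since the counter started
    closed_total: int    # connections closed since the counter started


def churn_rate(prev: ConnSample, curr: ConnSample) -> float:
    """Return connections opened plus closed per second between two samples."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    opened = curr.opened_total - prev.opened_total
    closed = curr.closed_total - prev.closed_total
    return (opened + closed) / elapsed


def is_elevated(rate: float, baseline: float, factor: float = 3.0) -> bool:
    """Flag churn that exceeds the baseline by an assumed multiplier."""
    return rate > baseline * factor


if __name__ == "__main__":
    before = ConnSample(timestamp_s=0.0, opened_total=100, closed_total=90)
    after = ConnSample(timestamp_s=10.0, opened_total=400, closed_total=380)
    rate = churn_rate(before, after)
    print(f"churn: {rate:.1f} events/s, elevated: {is_elevated(rate, baseline=10.0)}")
```

In practice the baseline would come from a service's own history rather than a fixed number, since normal churn varies widely between workloads.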
Company
Datadog
Date published
Sept. 18, 2024
Author(s)
Nicholas Thomson, Guy Arbitman
Word count
1673
Hacker News points
None found.
Language
English