Nomad’s internal garbage collection and optimization discovery during the Nomad Bench project
During the development of Nomad's 1.8 LTS release, a benchmarking infrastructure was built to test performance and find areas where efficiency could be improved. This article focuses on Nomad's internal garbage collection process and on an optimization discovered during the project. Nomad garbage collection applies to evaluations, nodes, jobs, deployments, and plugins, and users can configure it. When a Nomad server becomes the leader, it starts periodic garbage collection "tickers" that remove objects marked for garbage collection from memory. Job deregistration is an interesting example: a user issues a stop command or API call, but information about the stopped job remains in memory until the job_gc_interval time passes. The optimization removed the loop and the code that created evaluations when garbage collecting a job, which reduced the load on both Raft and the eval broker. The change resulted in higher throughput, greater stability, and minor improvements in CPU and memory consumption.
Company
HashiCorp
Date published
Aug. 12, 2024
Author(s)
Piotr Kazmierczak
Word count
1123
Hacker News points
None found.
Language
English