Test in Production: A Panel Discussion on Chaos Engineering
LaunchDarkly recently hosted a meetup on chaos engineering, featuring engineers from Netflix, LinkedIn, and Gremlin. The event explored how teams approach testing in production and shared best practices for doing so safely. Honeycomb CEO Charity Majors emphasized the importance of resilience testing and of acknowledging unknown unknowns when building systems. Netflix's Nora Jones discussed a chaos engineering tool developed by her team, while LinkedIn's Ted Strzalkowski described how his team focuses on providing a comprehensive, automated, and measurable resiliency feedback loop. Gremlin's Pat Higgins talked about thinking holistically about failure and about developing user experience values in Gremlin's platform. The panel covered several aspects of chaos engineering, including the importance of order, monitoring, and safety measures. It also touched on the challenge of bridging the chasms between the stages of adopting chaos engineering within a company, and of obtaining buy-in for what is a long process. The engineers shared how they prevent thundering herds of chaos experiments from taking over production systems, and how they ensure that failure mitigation strategies are practiced regularly enough to build a culture around them.
Company
LaunchDarkly
Date published
Oct. 5, 2018
Author(s)
Andrea Echstenkamper
Word count
3960
Hacker News points
None found.
Language
English