Company
Date Published
Author
Janos Szendi-Varga
Word count
1713
Language
English
Hacker News points
None

Summary

Chaos engineering helps companies such as Google and Netflix reduce downtime and unexpected production breakdowns by deliberately and regularly causing failures. The idea echoes the way children break toys apart to explore how they work, and it was popularized by Jesse Robbins, who created GameDay to increase reliability by regularly staging major failures. Chaos engineering is now widely adopted; Netflix in particular promotes it online and publishes business cases for its value. To conduct an experiment, define a steady state, use both a control group and an experimental group, inject failures into the experimental group, try to disprove the hypothesis that the system is resilient, and keep a big red button at hand to stop the experiment at any time. Tools such as Chaos Monkey, Mangle, and Chaos Monkey for Spring Boot can be used to run these experiments. For Neo4j, monitoring the "known knowns" of the database and its applications, and using tools such as Grafana and Failure Injection Testing (FIT), can help identify weaknesses in the system. Ultimately, chaos engineering with Neo4j involves simulating failures, injecting latency, and testing transaction faults, with a metric registry in place to monitor and control the experiment.
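
The experiment steps in the summary can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it does not talk to Neo4j, Chaos Monkey, Mangle, or Grafana, and every name in it (`MetricRegistry`, `handle_request`, `run_experiment`, the latency figures, and the thresholds) is hypothetical. It shows a control group and an experimental group, latency injected into the experimental group only, a metric registry that records both, and an abort check standing in for the big red button.

```python
import random
import statistics
from dataclasses import dataclass, field


@dataclass
class MetricRegistry:
    """Collects latency samples per group (a stand-in for real monitoring)."""
    samples: dict = field(
        default_factory=lambda: {"control": [], "experiment": []}
    )

    def record(self, group: str, latency_ms: float) -> None:
        self.samples[group].append(latency_ms)

    def mean(self, group: str) -> float:
        return statistics.mean(self.samples[group])


def handle_request(rng: random.Random, injected_latency_ms: float = 0.0) -> float:
    """Simulated service call (hypothetical): returns a latency in ms, no real I/O."""
    return rng.uniform(10.0, 20.0) + injected_latency_ms


def run_experiment(injected_latency_ms: float,
                   abort_above_ms: float,
                   tolerance_ms: float,
                   requests: int = 200,
                   seed: int = 42) -> dict:
    """Compare a control group with a group that receives injected latency.

    The hypothesis under test: the experimental group's mean latency stays
    within tolerance_ms of the control group's (the steady state).
    """
    rng = random.Random(seed)
    registry = MetricRegistry()
    aborted = False

    for _ in range(requests):
        # Control group: no fault injected.
        registry.record("control", handle_request(rng))
        # Experimental group: same call, plus the injected latency.
        latency = handle_request(rng, injected_latency_ms)
        registry.record("experiment", latency)
        # "Big red button": stop as soon as one request breaches the limit.
        if latency > abort_above_ms:
            aborted = True
            break

    deviation = registry.mean("experiment") - registry.mean("control")
    return {
        "aborted": aborted,
        "deviation_ms": deviation,
        "hypothesis_holds": (not aborted) and abs(deviation) <= tolerance_ms,
    }


if __name__ == "__main__":
    # Small fault: the steady-state hypothesis is expected to survive.
    print(run_experiment(injected_latency_ms=5.0, abort_above_ms=100.0, tolerance_ms=10.0))
    # Large fault: the abort check is expected to stop the run early.
    print(run_experiment(injected_latency_ms=500.0, abort_above_ms=100.0, tolerance_ms=10.0))
```

In a real setup the simulated call would be replaced by actual traffic against the system under test, and the abort condition would be tied to production alerts rather than a single latency threshold.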