Case Study: Bringing Chaos Engineering to the Cloud Native Developers


Authors: Ramiro Berrelleza, Uma Mukkara


The presentation discusses the importance of incorporating chaos engineering into the development workflow for cloud-native applications using Litmus and Okteto on Kubernetes.
  • Litmus and Okteto are open-source tools that let developers validate code resilience and application functionality on Kubernetes.
  • Chaos engineering should be incorporated into the development workflow to improve application quality and resilience.
  • Self-service portals and catalogs make it easy for developers to run chaos experiments and tests.
  • Running chaos experiments on ephemeral dev environments on Kubernetes makes it easier to run tests and reuse experiments in staging and production.
  • The more often chaos tests are run, the cheaper and more routine they become in the development workflow.
  • Using chaos engineering tools in all phases of development and with multiple components will improve application quality and resilience.
The speaker demonstrates a live environment of a simple to-do app running on Kubernetes using Okteto and Litmus. They first run a baseline test to observe normal behavior, then run a chaos experiment that deletes the application's database to see whether the app can handle a database outage. The experiment runs for 45 seconds, with Litmus killing the database every 5 seconds while continuous requests are made to the to-do list. The objective is to test the application's resilience to database failures.
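The experiment described above can be sketched as a Litmus ChaosEngine manifest. This is a minimal illustration, not the manifest used in the demo: it assumes the demo corresponds to Litmus's pod-delete experiment, and the engine name, namespace, label selector, workload kind, and service account name are all hypothetical. Only the 45-second duration and 5-second interval come from the session summary.

```python
import json


def build_pod_delete_engine(name, namespace, app_label,
                            app_kind="deployment",
                            duration_s=45, interval_s=5):
    """Build a Litmus ChaosEngine manifest (as a dict) for pod-delete.

    duration_s and interval_s default to the values mentioned in the
    session; every other argument is a placeholder chosen by the caller.
    """
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "engineState": "active",
            # Selects the workload whose pods are deleted.
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": app_kind,
            },
            # Hypothetical service account; it must exist in the cluster.
            "chaosServiceAccount": "pod-delete-sa",
            "experiments": [{
                "name": "pod-delete",
                "spec": {"components": {"env": [
                    # Total length of the chaos window, in seconds.
                    {"name": "TOTAL_CHAOS_DURATION", "value": str(duration_s)},
                    # Delay between successive pod deletions, in seconds.
                    {"name": "CHAOS_INTERVAL", "value": str(interval_s)},
                    {"name": "FORCE", "value": "false"},
                ]}},
            }],
        },
    }


if __name__ == "__main__":
    # "todo-dev" and "app=db" are assumed names for the to-do app's
    # dev namespace and database label.
    engine = build_pod_delete_engine("db-chaos", "todo-dev", "app=db")
    print(json.dumps(engine, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f` in a cluster where Litmus and the pod-delete experiment are already installed. The continuous requests against the to-do list would be driven separately, for example by a load script or a Litmus probe.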


Though Chaos Engineering started as a solution for fixing unknown problems at scale, it has evolved in recent years into a totally different practice area. It is now beginning to play a major role in CI/CD apart from Ops and figures as an aid that improves developer experience. Chaos frameworks are beginning to feature in the list of must-have dev tools. In this session, we discuss the role of Chaos Engineering in stepping up the cloud native dev experience and how developers can use cloud native chaos tests to verify the resilience of their application even before the code is merged. Okteto is an open source tool that enables developers to deploy development environments directly in Kubernetes. The community behind Okteto has succeeded with the idea of providing cloud native chaos tests to the developers in their toolset. In this session we take examples of Litmus chaos tests on Okteto and show how developers can run them as part of the development process, rather than just on CI.