Chaos engineering is a discipline that helps identify and resolve issues in distributed systems, which in turn supports a better customer experience. FIS has developed an ecosystem to automate and scale chaos engineering across multiple products.
- Chaos engineering is the disciplined practice of experimenting on a system to find out how resilient it is under turbulent or faulty conditions.
- Distributed systems architecture brings its own challenges, and chaos engineering can help identify and resolve issues around resiliency.
- FIS has developed an ecosystem to automate and scale chaos engineering across multiple products, identifying the toil within the practice and making it a repeatable process.
- The ecosystem includes load generators, APM tools, chaos tools, and Captain, which evaluates the chaos experiment and provides a pass/fail result.
- Chaos engineering can help validate architecture, measure business metrics, and ultimately deliver a better customer experience and higher product quality.
Chaos engineering helps identify issues that can occur under load, such as network latency between microservices. By generating load on an application under test and injecting chaos through a tool like Captain, FIS can monitor the health of the application and evaluate the chaos experiment to identify issues and improve resiliency.
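The evaluation step of the loop above (generate load, inject chaos, monitor health, evaluate) can be sketched as a simple pass/fail check. This is a minimal illustration under stated assumptions, not how Captain or any FIS tool is actually implemented: the `Sample` type, the `evaluate` function, and both thresholds are hypothetical names and values chosen for the example. It compares samples collected under chaos against a baseline run and fails the experiment if the error rate or the 95th-percentile latency degrades too far.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One request observed during a load run (hypothetical shape)."""
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def evaluate(baseline, under_chaos, max_error_rate=0.01, max_p95_ratio=1.5):
    """Return a pass/fail verdict for one chaos experiment.

    Both thresholds are assumptions for illustration: the experiment
    passes only if the error rate under chaos stays at or below
    max_error_rate AND p95 latency under chaos is at most
    max_p95_ratio times the baseline p95.
    """
    base_p95 = percentile([s.latency_ms for s in baseline], 95)
    chaos_p95 = percentile([s.latency_ms for s in under_chaos], 95)
    error_rate = sum(not s.ok for s in under_chaos) / len(under_chaos)
    passed = (error_rate <= max_error_rate
              and chaos_p95 <= base_p95 * max_p95_ratio)
    return {
        "passed": passed,
        "error_rate": error_rate,
        "baseline_p95_ms": base_p95,
        "chaos_p95_ms": chaos_p95,
    }


if __name__ == "__main__":
    # Synthetic data standing in for load-generator output.
    baseline = [Sample(100.0, True) for _ in range(100)]
    # Simulated effect of injected network latency between microservices.
    slowed = [Sample(300.0, True) for _ in range(100)]
    print(evaluate(baseline, slowed)["passed"])
```

In practice the samples would come from the load generator or APM tool rather than synthetic lists, and the thresholds would be derived from the service's own objectives.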
FIS, a fintech company with more than 20,000 clients around the globe, offers Banking-as-a-Service Hub, which enables banks and corporations to rapidly configure new financial services. The delivery of “as-a-service” features across accounts, cards, and establishments is enabled by functional modules deployed on Kubernetes, which are used by thousands of customers each day. To bolster the resiliency of this critical infrastructure, FIS uses LitmusChaos to expose and help remediate system flaws, thereby ensuring highly available services for its customers. In this talk, Rajeshwar (FIS) and Neelanjan (Harness) will lay out the reliability challenges of delivering Banking-as-a-Service and demonstrate how chaos experimentation was leveraged as part of the organization’s “client-experience-year” initiatives to improve the banking APIs.