Authors: Amit Kalamkar, Vigith Maurice
2022-10-27

tldr - powered by Generative AI

Intuit's new platform, Numaproj, uses AI-based observability to reduce change-related incidents and lower MTTR and MTTD.
  • Numaproj is a Kubernetes-native data processing and analytics tool used to derive actionable insights in areas such as operational excellence, cost, and security.
  • Innovation is a core principle at Intuit, which invests in Argo to keep its products available and to resolve issues quickly.
  • Change-related incidents accounted for one-third of Intuit's incidents, and MTTR was higher because the deployment and operational experiences were disjointed.
  • Numaproj integrated AI-based observability into Argo CD and Argo Rollouts to add a metrics tab, run a multivariate model, and remove humans from the equation.
  • The AI-based observability results are computed in real time and normalized into a human-readable format.
  • Numaproj uses a streaming system that performs feature engineering and inference, and triggers inline training to discover new applications and configurations.
  • Real-time streaming brings challenges such as boilerplate and non-standard code, which make quick experimentation and extension difficult.
Conference: CloudOpen 2022
Authors: Marcel Hild, Karsten Wade
2022-06-21

What are the benefits of running a project's code in an all-open-source community cloud? What happens when a community of Site Reliability Engineering (SRE) practitioners decides to open source its craft? How does this Operate First concept help the nascent discipline of AIOps? There are many ways the Operate First concept can improve open source software development through operational insights. In this session you'll learn a few of those ways through stories and demonstrations. You'll see how the OS-Climate initiative has accelerated participation in the financial community via the Operate First community cloud. You'll explore the content and material from the SIG-SRE community that lets anyone see and learn how a real production cloud is run. You'll get a look behind the scenes of the Operate First project's running OpenShift-based community cloud.