Authors: Amit Kalamkar, Vigith Maurice
2022-10-27

tldr - powered by Generative AI

Intuit's new platform, Numaproj, uses AI-based observability to reduce change-related incidents and lower mean time to resolve (MTTR) and mean time to detect (MTTD).
  • Numaproj is a Kubernetes-native data processing and analytics tool used to derive actionable insights in areas such as operational excellence, cost, and security.
  • Innovation is a core principle at Intuit, which invests in Argo to keep its products available and to resolve issues quickly.
  • Changes caused one-third of Intuit's incidents, and MTTR was higher because the deployment and operational experiences were disjointed.
  • Numaproj integrates AI-based observability into Argo CD and Argo Rollouts: it adds a metrics tab, runs a multivariate model, and removes humans from the equation.
  • The AI-based observability output is computed in real time and normalized to a human-readable format.
  • Numaproj uses a streaming system that performs feature engineering and inference, and triggers inline training to discover new applications and configurations.
  • The challenges of real-time streaming include boilerplate and non-standard code, which make quick experimentation and extension difficult.
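
The idea of computing a signal in real time and normalizing it to a human-readable format can be sketched as follows. This is a minimal illustration, not the model described in the talk: it swaps in a simple rolling z-score for the multivariate model, and the class name `StreamingAnomalyScorer`, the window size, the `max_z` cap, and the 0-10 output scale are all assumptions made for the example.

```python
import math
from collections import deque


class StreamingAnomalyScorer:
    """Hypothetical scorer: rolling z-score mapped onto a 0-10 scale."""

    def __init__(self, window: int = 60, max_z: float = 5.0):
        self.values = deque(maxlen=window)  # recent observations only
        self.max_z = max_z                  # z-scores at or above this map to 10

    def score(self, value: float) -> float:
        # Not enough history yet: record the point and report "normal".
        if len(self.values) < 2:
            self.values.append(value)
            return 0.0

        # Feature engineering step: mean and spread of the recent window.
        mean = sum(self.values) / len(self.values)
        var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
        std = math.sqrt(var)
        self.values.append(value)

        # Inference step: distance of the new point from the window, in std units.
        if std == 0:
            return 0.0 if value == mean else 10.0
        z = abs(value - mean) / std

        # Normalization step: cap the z-score and rescale it to 0-10.
        return round(min(z, self.max_z) / self.max_z * 10, 1)


scorer = StreamingAnomalyScorer(window=5)
latencies_ms = [100, 101, 99, 100, 100, 100, 500]
scores = [scorer.score(v) for v in latencies_ms]
print(scores)
```

A bounded score like this is easier to show in a metrics tab or compare against a rollout threshold than a raw model output, which is the motivation the summary gives for normalization.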
Authors: Shimon Tolts, Noaa Barki
2021-10-15

tldr - powered by Generative AI

Learning from 100+ Kubernetes post-mortems to prevent production outages
  • Reviews 100+ post-mortems to uncover recurring patterns, anti-patterns, and root causes of typical outages in Kubernetes-based systems
  • Aggregates the insights into the most obvious DON'Ts, plus some less obvious ones, to help prevent production outages
  • Shifts responsibility left by sharing knowledge with developers and educating them on industry best practices
  • Includes an anecdote about attending a DevOps meetup and realizing how important DevOps is for developers