Kubernetes clusters are critical infrastructure at large public companies, where they handle heavy traffic, complex dependencies on third-party services, and constant change as developers release features and traffic scales up and down. In this panel discussion, engineers from Airbnb, Lyft, Netflix, and Robinhood share their challenges, experiences, and lessons learned in managing a sustainable on-call rotation that meets the needs of their internal users while maintaining high uptime for business-critical workloads. Topics covered will include:
+ Keeping on-call engineers happy
+ Balancing rapid response with alert fatigue
+ Strategies to proactively deal with production issues
+ Preparing engineers for on-call