Relativity schedules almost a million Windows containers per day to a globally distributed set of Kubernetes clusters. Two years ago we started to break apart our enterprise .NET monolith into microservices hosted on Kubernetes. At that time our developers had a multi-month release cadence. Now we have automated vulnerability patching, can do zero-downtime migrations of workloads between clusters, have automated failover for critical services in the event of regional failures, and have happy developers who can test and push to production immediately. How did we get here? By traveling a rocky road full of issues. Come learn from our mistakes so you don't have to repeat them. We will talk about application and orchestration design patterns that have been successful for our teams, custom operators for Windows node problem identification that we have built and found useful, and monitoring patterns that have helped us stay ahead of issues.