How to Migrate 700 Kubernetes Clusters to Cluster API with Zero Downtime - Tobias Giese & Sean Schneeweiss, Mercedes

2022-05-19

Authors:   Tobias Giese, Sean Schneeweiss


Summary

The presentation discusses the migration of 700 clusters to Cluster API for central cluster management and the lessons learned from the process.
  • Cluster API provides central cluster management for the complete life cycle of a cluster.
  • The migration process involved transitioning from a legacy provisioning architecture to Cluster API.
  • Zero downtime was a crucial requirement during the migration process.
  • Testing and bug fixing were important steps in the migration process.
  • The platform is under constant construction with feature improvements and bug fixing.
  • The team plans to implement additional Cluster API features and offer public clouds to users.
  • The migration process took around a year to complete.
During the migration, the team hit a caching problem: a controller watching resources did not notice that a resource had been deleted, because its cache held stale data. As a mitigation they restarted the pods, and a colleague opened a pull request against the controller-runtime project to fix the bug.
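
The caching problem described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not controller-runtime code: the `InformerCache` class, the `machine-a`/`machine-b` names, and the idea that the delete event simply never reached the cache are all assumptions made for illustration, since the summary does not state the root cause. It only shows why a stale cache hides a deletion and why a restart, which rebuilds the cache from a full list, clears the symptom.

```python
class InformerCache:
    """Toy stand-in for a controller's local cache of API objects (hypothetical)."""

    def __init__(self, api_server):
        self.api_server = api_server
        self.store = {}
        self.relist()

    def relist(self):
        # Rebuild the local copy from a full list, as happens when a pod starts.
        self.store = dict(self.api_server)

    def on_event(self, kind, name, obj=None):
        # Apply a single watch event to the local copy.
        if kind == "DELETED":
            self.store.pop(name, None)
        else:
            self.store[name] = obj


# The "API server" is modelled as a plain dict of resource name -> object.
api_server = {
    "machine-a": {"phase": "Running"},
    "machine-b": {"phase": "Running"},
}
cache = InformerCache(api_server)

# The resource is deleted on the server, but (by assumption) the
# corresponding DELETED event is never applied to the cache.
del api_server["machine-b"]

stale_view = sorted(cache.store)  # the controller still sees machine-b
cache.relist()                    # "restart": cache rebuilt from the server
fresh_view = sorted(cache.store)  # machine-b is gone

print("before restart:", stale_view)
print("after restart: ", fresh_view)
```

A restart only removes the stale entry after the fact; it does not prevent the next missed update, which is why the summary also mentions a fix in the upstream library.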

Abstract

Cluster API promises "to simplify provisioning, upgrading, and operating multiple Kubernetes clusters." Do you find it challenging to migrate your existing Kubernetes cluster provisioning to Cluster API? Would you like to benefit from all the features that Cluster API offers and manage your infrastructure the Kubernetes style? At Mercedes-Benz, we run and operate more than 700 Kubernetes clusters and 3,500 machines all over the world in on-premises OpenStack data centers. By migrating to Cluster API, we replaced our legacy provisioning, consisting of Terraform, custom self-written tools and Kubernetes operators. Expect valuable insights on what it takes to transfer production systems into the control of Cluster API with zero downtime and zero customer impact. Get to know the technical challenges of migrating, how they can be solved and how to extend Cluster API functionality to fit your needs.
