Airbnb's journey to heterogeneous clusters and the technical and organizational hurdles they faced in migrating from homogeneous clusters
- Airbnb migrated from running homogeneous Kubernetes clusters to heterogeneous clusters to improve cost and efficiency
- Changes were required in almost every part of their infrastructure to support multiple different node types
- They faced three specific technical and organizational hurdles in this journey
- Heterogeneous clusters have been instrumental for Airbnb and their team
Airbnb previously ran chef on AWS EC2 where each replica of each service would have its own machine. In 2017 and 2018, they started building OneTouch, an abstraction layer on top of Kubernetes for developers. In 2018 and 2019, they migrated 90% of their 700+ services to Kubernetes. Initially, their clusters were separated out by environment, but they were forced to split these clusters into different cluster types as they hit Kubernetes single node single cluster node size limits. They started with single instance type clusters, but as more specialized workloads started migrating, they required different instance types such as GPUs.