Keep Calm and Containerd On!


Authors: Anusha Ragunathan


The presentation discusses Intuit's migration from 'dockerd' to 'containerd' as the CRI runtime for their Kubernetes clusters, and the challenges they faced during the process.
  • Intuit had over 200 Kubernetes clusters with 20,000 nodes running 'dockerd' as the CRI runtime
  • The upcoming removal of dockerd from upstream Kubernetes prompted the migration to containerd
  • Lessons learned during the migration process, including issues with log management, SELinux, and GPU support
  • Rollout of containerd to production clusters and handling compatibility issues during cluster upgrades
  • Performance analysis showed that containerd had lower startup times and CPU consumption compared to dockerd
During the migration, Intuit hit a problem with its CNI: it queried the containerd socket, received an empty list of pods, and began deallocating IP addresses from live docker pods. To solve this, they created a generic symlink for both the containerd and dockerd sockets in their bootstrap code, and made sure this change was released before the migration began.
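The symlink approach described above can be sketched roughly as follows. This is a minimal illustration, not Intuit's actual bootstrap code: the function name `link_runtime_socket`, the generic link path, and the idea that bootstrap is told which runtime the node uses are all assumptions. The two socket paths are the usual defaults for dockershim and containerd and may differ on a given node.

```python
import os

# Usual default socket paths for each runtime (assumption: nodes use the defaults).
RUNTIME_SOCKETS = {
    "dockerd": "/var/run/dockershim.sock",
    "containerd": "/run/containerd/containerd.sock",
}


def link_runtime_socket(runtime, link_path, sockets=RUNTIME_SOCKETS):
    """Point a generic symlink at the socket of the runtime this node uses.

    Hypothetical helper: addons such as the CNI would read link_path
    instead of hard-coding one runtime's socket.
    """
    target = sockets[runtime]  # raises KeyError for an unknown runtime name

    # Create the new link under a temporary name, then rename it over the
    # old one, so readers never observe a missing link.
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target, tmp_path)
    os.replace(tmp_path, link_path)
    return target
```

For example, bootstrap code on a node still running dockerd might call `link_runtime_socket("dockerd", "/run/cri-runtime.sock")` (the link path is hypothetical) and switch the argument to `"containerd"` once the node is migrated. Addons that read the generic link then follow the runtime that actually owns the pods, rather than an idle containerd socket.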


Letting go isn't easy! Especially when it comes to your Kubernetes cluster’s CRI implementation. Like most big Kubernetes deployments, Intuit’s 200+ clusters with 20,000 nodes were running ‘dockerd’ as the CRI runtime, with dependencies on the docker API and CLI. We migrated our fleet of clusters to ‘containerd’. Whether you have a complicated Kubernetes installation with customized cluster addons or a simple set of clusters, you will be affected by the upcoming removal of dockerd from upstream Kubernetes. Come listen to us, learn from our journey, and be prepared to make this migration smooth and seamless. We will share lessons learned migrating clusters to containerd. From issues with log management, SELinux, and GPU support to rewiring cluster addons related to CNI and runtime security, this talk is about Intuit’s journey moving to containerd. We will also talk about the rollout of containerd to our production clusters and how we handled compatibility issues during cluster upgrades.


