This is The Way: A Crash Course on the Intricacies of Managing CPUs in K8s

2022-05-18

Authors: Marlow Weston, Swati Sehgal


Summary

The presentation discusses the intricacies of managing CPUs in Kubernetes and the various options available for resource management.
  • Early Kubernetes had simple resource management with only CPU and memory as native resources
  • Resource Management Working Group was formed to enhance Kubernetes to support diverse and complex classes of applications
  • There are currently three options for resource management: CRI Resource Manager (CRI-RM), CPU Pooler, and CPU Manager for Kubernetes (CMK)
  • Community discussion is ongoing to address gaps in CPU management
  • Kubernetes has various operational areas organized as SIGs and working groups
In the early days of Kubernetes, resource management was simple, and nodes were assumed to be single-socket. As workloads became more complex, the need to support specialized hardware and performance-sensitive workloads became apparent. This led to the formation of the Resource Management Working Group, which included representatives from various companies. There are currently three options for resource management, and the community is actively discussing ways to address the remaining gaps in CPU management. Follow SIG Node and SIG Scheduling for updates on this topic.

Abstract

Optimizing CPU management improves cluster performance and security, but is daunting to almost everyone. CPU management may seem complex, but it can be explained in such a way that even your inner toddler will comprehend. With this talk, we will give a path to success. You may have a multi-socket node cluster where your AI/ML workloads care about the proximity of your CPUs to GPUs. You may be running scientific workloads where you want to pin cores at the container level rather than only at the pod level. You may have a single-socket server where you want to reserve a single core outside of Kubernetes for a daemon dedicated to mining bitcoin, without affecting your other jobs (please do not do this). We will cover these and more, helping you understand the intricacies of CPU management within the kubelet and what Kubernetes can and cannot currently do. We will also cover how you can help raise the visibility of use cases not currently covered within Kubernetes.

Materials: