Authors: Alexander Kanevskiy, Swati Sehgal, David Porter, Sascha Grunert, Evan Lezar
2023-04-19

tldr - powered by Generative AI

The presentation discusses the importance of resource management in Kubernetes and highlights new features and enhancements in the ecosystem, such as the Container Device Interface (CDI) and Cgroups V2.
  • CDI allows GPUs and other devices to be shared across different containers and pods, as well as dynamically partitioned and mixed and matched (a minimal spec sketch follows this list).
  • Topology-aware scheduling is not the only use case for Node Resource Interface (NRI) plugins, and top-level attributes can be used for other capabilities as well.
  • Cgroups V2 provides new resource management capabilities, such as memory QoS and PSI metrics, and there are plans to explore I/O isolation and network QoS guarantees.
  • The speaker encourages feedback from the audience on resource management challenges and desired features.
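
To make the CDI bullet concrete, here is a minimal sketch of what a CDI spec file looks like, built from hand-rolled Go structs and printed as JSON. The vendor kind example.com/gpu, the device name, the environment variable, and the device path are hypothetical, and the structs only approximate the spec format rather than using the official CDI Go packages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hand-rolled structs mirroring the shape of a CDI spec file
// (e.g. a JSON file under /etc/cdi/); illustration only, not the
// official CDI Go API.
type cdiSpec struct {
	CDIVersion string      `json:"cdiVersion"`
	Kind       string      `json:"kind"`
	Devices    []cdiDevice `json:"devices"`
}

type cdiDevice struct {
	Name           string         `json:"name"`
	ContainerEdits containerEdits `json:"containerEdits"`
}

type containerEdits struct {
	Env         []string     `json:"env,omitempty"`
	DeviceNodes []deviceNode `json:"deviceNodes,omitempty"`
}

type deviceNode struct {
	Path string `json:"path"`
}

func main() {
	// Hypothetical spec: one GPU exposed under the fully qualified
	// device name "example.com/gpu=gpu0"; a CDI-aware runtime injects
	// the listed device node and env var into containers requesting it.
	spec := cdiSpec{
		CDIVersion: "0.5.0",
		Kind:       "example.com/gpu",
		Devices: []cdiDevice{{
			Name: "gpu0",
			ContainerEdits: containerEdits{
				Env:         []string{"VISIBLE_GPUS=0"},
				DeviceNodes: []deviceNode{{Path: "/dev/nvidia0"}},
			},
		}},
	}

	out, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(out))
}
```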
Authors: David Porter, Mrunal Patel
2022-10-28

tldr - powered by Generative AI

The presentation discusses the implementation and benefits of Cgroup V2 in Kubernetes.
  • Cgroup V2 is a new version of the control group subsystem in the Linux kernel that provides better resource management and control.
  • Kubernetes has integrated Cgroup V2 to improve node stability and resource management.
  • The presentation highlights the benefits of Cgroup V2, including improved memory usage, PSI pressure metrics, disk throttling, and oomd, the userspace out-of-memory daemon (a small node-level probe follows this list).
  • Users should test their applications for compatibility with Cgroup V2 and work with vendors to ensure compatibility.
  • Popular projects like cAdvisor and automaxprocs have been upgraded to support Cgroup V2.
  • Java applications should upgrade to JDK 11.0.16+ or 15+ to ensure compatibility with Cgroup V2.
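
As a quick way to check the points above on a node, here is a minimal Go sketch, assuming a Linux host with procfs and sysfs mounted in the usual places, that detects the cgroup v2 unified hierarchy and dumps the system-wide memory PSI file.

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// On a cgroup v2 (unified hierarchy) node, cgroup.controllers is
	// present at the root of the cgroup mount; on v1 it is not.
	if _, err := os.Stat("/sys/fs/cgroup/cgroup.controllers"); err != nil {
		fmt.Println("cgroup v1 (or hybrid) hierarchy detected")
		return
	}
	fmt.Println("cgroup v2 unified hierarchy detected")

	// PSI (Pressure Stall Information) is one of the cgroup v2-era
	// features mentioned above; the system-wide memory pressure file
	// reports "some"/"full" stall averages over 10s, 60s, and 300s.
	data, err := os.ReadFile("/proc/pressure/memory")
	if err != nil {
		fmt.Println("PSI not available:", err)
		return
	}
	fmt.Print(string(data))
}
```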
Authors: Maulin Patel, Pradeep Venkatachalam
2022-05-18

tldr - powered by Generative AI

The presentation discusses the challenges of sharing GPUs in Kubernetes and introduces two solutions: time-sharing and multi-instance GPU (MIG).
  • Notebooks attached to GPUs waste expensive resources when idle
  • Real-time applications like chatbots, vision product search, and product recommendation are latency-sensitive and business-critical
  • Kubernetes allows fractional utilization of CPUs but not GPUs, leading to inefficient allocation
  • Time sharing allows multiple containers to run on a single GPU by allocating time slices fairly to all containers
  • Multi-instance GPU allows multiple containers to share a single GPU by partitioning it into multiple isolated GPU instances that are consumed like separate devices
  • Both solutions address most use cases and workload needs
  • Both solutions are fully managed by GKE and can be configured through API calls or the UI (a pod-level sketch follows below)
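
A rough sketch of what a workload asking for a slice of a shared GPU could look like, generated here with the Kubernetes Go API types and printed as YAML. The GKE node-selector label and the container image tag are assumptions based on GKE's GPU time-sharing documentation, not details from the talk, so treat them as placeholders to verify against your cluster.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	pod := corev1.Pod{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{Name: "shared-gpu-demo"},
		Spec: corev1.PodSpec{
			// Assumed GKE node label for time-shared GPU node pools;
			// check the actual labels on your nodes before relying on it.
			NodeSelector: map[string]string{
				"cloud.google.com/gke-gpu-sharing-strategy": "time-sharing",
			},
			Containers: []corev1.Container{{
				Name:  "cuda-workload",
				Image: "nvidia/cuda:12.2.0-base-ubuntu22.04", // illustrative image tag
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						// The container still requests one "GPU"; on a
						// time-shared node this maps to a time slice of
						// the physical device rather than the whole card.
						"nvidia.com/gpu": resource.MustParse("1"),
					},
				},
			}},
			RestartPolicy: corev1.RestartPolicyNever,
		},
	}

	out, err := yaml.Marshal(pod)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```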