Improving GPU Utilization using Kubernetes
Conference: KubeCon + CloudNativeCon Europe 2022
Authors: Maulin Patel, Pradeep Venkatachalam
2022-05-18
tldr - powered by Generative AI
The presentation discusses the challenges of sharing GPUs in Kubernetes and introduces two solutions: time sharing and multi-instance GPU.
Notebooks attached to GPUs waste expensive resources when idle
Real-time applications such as chatbots, vision product search, and product recommendation are latency-sensitive and business-critical
Kubernetes lets containers request a fraction of a CPU, but GPUs can only be requested as whole devices, leading to inefficient allocation
Time sharing allows multiple containers to run on a single GPU by allocating time slices fairly to all containers
Multi-instance GPU allows multiple containers to share a single GPU by partitioning it into multiple isolated GPU instances
Both solutions address most use cases and workload needs
The solution is fully managed by GKE and can be configured through API calls or the UI
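The points above on fractional CPU requests and GPU time sharing can be illustrated with a small sketch. This is not the GKE implementation and is not taken from the talk: `time_share`, the workload names, and the millisecond figures are hypothetical, and the scheduler is a plain round-robin used only to show what "allocating time slices fairly" could mean. The `requests` dict mirrors the shape of a Kubernetes container resource request, where `cpu` accepts fractional values such as `500m` while the `nvidia.com/gpu` extended resource must be a whole number.

```python
from collections import deque
from typing import Dict, List, Tuple

# Shape of a container resource request: half a CPU core is allowed,
# but the GPU count must be an integer (no "0.5" of a GPU).
requests = {"cpu": "500m", "nvidia.com/gpu": 1}


def time_share(workloads: Dict[str, int], slice_ms: int) -> List[Tuple[str, int]]:
    """Round-robin sketch of GPU time sharing (hypothetical helper).

    workloads maps a container name to the GPU time it still needs, in ms.
    Returns the ordered list of (container, ms_run) slices on one GPU.
    """
    if slice_ms <= 0:
        raise ValueError("slice_ms must be positive")

    # Containers with no pending GPU work never enter the queue.
    queue = deque((name, ms) for name, ms in workloads.items() if ms > 0)
    schedule: List[Tuple[str, int]] = []

    while queue:
        name, remaining = queue.popleft()
        run = min(slice_ms, remaining)  # each turn gets at most one slice
        schedule.append((name, run))
        if remaining > run:
            # Unfinished work goes to the back of the queue.
            queue.append((name, remaining - run))

    return schedule
```

Calling `time_share({"notebook": 25, "inference": 10}, slice_ms=10)` interleaves the two containers on the same GPU: the notebook runs for 10 ms, inference for 10 ms, then the notebook finishes its remaining 15 ms in two further slices. Multi-instance GPU differs in that each container would get its own isolated partition rather than a turn on the whole device.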
Tags: GPU utilization, resource management, Kubernetes, containers, performance