The presentation discusses GPU utilization and benchmarking, focusing on time slicing and Multi-Instance GPU (MIG), and covers their use cases and performance trade-offs.
- Time slicing is useful for low-priority jobs that leave the GPU idle part of the time, but it is not suitable for latency-sensitive or performance-intensive tasks.
- MIG enables GPU sharing but comes with a performance loss due to the reduction in available streaming multiprocessors (SMs).
- Benchmarking shows that time slicing incurs a significant performance loss when context switching is required between long-running processes.
- Doubling memory and bandwidth through MIG can improve performance, but using MIG without actually sharing the GPU results in a performance loss for no benefit.
- Monitoring pipeline utilization can help understand user jobs and optimize GPU usage.
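
The trade-offs above can be made concrete with a rough back-of-envelope model. The sketch below is not taken from the presentation: the function names, the linear scaling with SM count, and every number in the example are illustrative assumptions. It only shows the shape of the argument, namely that time slicing stretches a busy job by the number of jobs sharing the GPU plus any switching overhead, while a MIG instance slows a compute-bound job roughly in proportion to the SMs it gives up.

```python
def time_sliced_runtime(solo_seconds, n_jobs, switch_overhead=0.0):
    """Estimate wall-clock time of one job under round-robin time slicing.

    Assumes (hypothetically) that all n_jobs keep the GPU busy all the time
    and that a fixed fraction of every time slice, switch_overhead, is lost
    to context switching.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if not 0.0 <= switch_overhead < 1.0:
        raise ValueError("switch_overhead must be in [0, 1)")
    return solo_seconds * n_jobs / (1.0 - switch_overhead)


def mig_runtime(solo_seconds, instance_sms, full_gpu_sms):
    """Estimate wall-clock time of a compute-bound job on a MIG instance.

    Assumes (hypothetically) that runtime scales inversely with the number
    of streaming multiprocessors; memory-bound jobs will behave differently.
    """
    if not 0 < instance_sms <= full_gpu_sms:
        raise ValueError("instance_sms must be in (0, full_gpu_sms]")
    return solo_seconds * full_gpu_sms / instance_sms


if __name__ == "__main__":
    solo = 100.0  # hypothetical runtime in seconds on a dedicated GPU

    # Two always-busy jobs sharing one GPU, 10% of each slice lost to switching.
    print("time slicing:", time_sliced_runtime(solo, n_jobs=2, switch_overhead=0.10))

    # A MIG instance holding a hypothetical 42 of 100 SMs.
    print("MIG instance:", mig_runtime(solo, instance_sms=42, full_gpu_sms=100))

    # MIG enabled but not shared: a single instance holding a hypothetical
    # 98 of 100 SMs is still slower than the plain GPU.
    print("MIG, unshared:", mig_runtime(solo, instance_sms=98, full_gpu_sms=100))
```

Real numbers have to come from benchmarks like the ones in the presentation; measured pipeline utilization indicates whether a given job is compute-bound enough for the linear SM assumption to be reasonable.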