
Trimaran: Real Load Aware Scheduling in Kubernetes

2021-10-13

Authors: Chen Wang, Abdul Qadeer


Summary

Load balancing and resource allocation in Kubernetes clusters using Trimaran plugins
  • Trimaran is a set of plugins for Kubernetes clusters that optimize resource allocation and load balancing
  • The Target Load Packing plugin aims to achieve high utilization across all nodes while maintaining a safe margin for CPU usage spikes
  • The Load Variation Risk Balancing plugin computes a risk score for CPU and for memory utilization, then scores each node by its bottleneck resource
  • Trimaran uses multiple metric sources and caches data to avoid overwhelming metric providers
  • Future work includes integrating Trimaran with other schedulers and incorporating additional resources like IO and network latency
In an experiment with 100 nodes and 400 pods, the Target Load Packing plugin resulted in better capacity utilization, fewer hot nodes, and fewer fragmented cores compared to the default scheduler
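
As a rough illustration of the Target Load Packing idea summarized above, the sketch below scores a node by its predicted CPU utilization after a pod is placed on it. The piecewise-linear shape (the score rises toward a target utilization, drops once the target is exceeded, and is zero above 100%) is an assumption based on the publicly described design. The function name, parameter names, and the 40% target are hypothetical and are not taken from the plugin's code.

```python
def target_load_packing_score(node_cpu_util_pct: float,
                              pod_cpu_request_pct: float,
                              target_pct: float = 40.0) -> float:
    """Return a 0-100 score for placing a pod on a node (illustrative only).

    node_cpu_util_pct   -- measured CPU utilization of the node, in percent
    pod_cpu_request_pct -- the pod's CPU request as a percent of node capacity
    target_pct          -- assumed target utilization; leaves headroom for spikes
    """
    if not 0.0 < target_pct < 100.0:
        raise ValueError("target_pct must be strictly between 0 and 100")

    # Predicted utilization if the pod lands on this node.
    predicted = node_cpu_util_pct + pod_cpu_request_pct

    if predicted <= target_pct:
        # Below the target: reward packing, reaching 100 exactly at the target.
        return (100.0 - target_pct) * predicted / target_pct + target_pct
    if predicted <= 100.0:
        # Above the target: penalize linearly, reaching 0 at full utilization.
        return target_pct * (100.0 - predicted) / (100.0 - target_pct)
    # Over capacity: lowest possible score.
    return 0.0


if __name__ == "__main__":
    for util in (0, 20, 30, 60, 95):
        print(util, target_load_packing_score(util, pod_cpu_request_pct=10))
```

Under these assumptions, a node predicted to land exactly on the 40% target scores highest, while a node pushed to 70% scores below an idle one, which is how the safe margin for CPU spikes is preserved.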

Abstract

Kubernetes is a popular solution for container orchestration and cluster management. Cluster management creates an opportunity to improve resource utilization, which can save an organization money. To achieve this, we can make the native Kubernetes scheduler aware of the gap between its declarative resource allocation model and actual node resource utilization. By considering the real load on nodes, we can pack pods more efficiently onto fewer nodes. The native scheduler with its default plugins, on the other hand, considers only pod requests and allocatable resources on nodes. To address this problem, we introduced two plugins to the scheduler community under the Trimaran framework, TargetLoadPacking and LoadVariationRiskBalancing, in a collaboration between PayPal and IBM. The plugins provide scheduling support for all pod QoS guarantees.
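
To make the LoadVariationRiskBalancing idea more concrete, the following sketch combines the mean and the standard deviation of recent utilization samples into a per-resource risk value, then takes the lower of the CPU and memory scores as the node score. The equal weighting of mean and deviation, the 0 to 100 scale, and all names are assumptions for illustration; the plugin's actual formula and tuning parameters may differ.

```python
from statistics import mean, pstdev
from typing import Sequence


def resource_score(samples: Sequence[float]) -> float:
    """Score one resource from utilization samples given as fractions in [0, 1].

    Assumed risk model: the average of mean utilization and its standard
    deviation, so both high load and highly variable load raise the risk.
    """
    if not samples:
        raise ValueError("at least one utilization sample is required")
    mu = mean(samples)
    sigma = pstdev(samples)
    risk = min(1.0, (mu + sigma) / 2.0)  # clamp so the score stays in [0, 100]
    return (1.0 - risk) * 100.0


def node_score(cpu_samples: Sequence[float], mem_samples: Sequence[float]) -> float:
    """The bottleneck resource (the one with the lower score) decides the node score."""
    return min(resource_score(cpu_samples), resource_score(mem_samples))


if __name__ == "__main__":
    steady_cpu = [0.5, 0.5]      # mean 0.5, no variation
    bursty_mem = [0.25, 0.75]    # same mean, high variation
    print(resource_score(steady_cpu))
    print(resource_score(bursty_mem))
    print(node_score(steady_cpu, bursty_mem))
```

In this example both resources have the same average utilization, but memory varies more, so memory becomes the bottleneck and sets the node score.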

