Network-aware Scheduling in Kubernetes

Conference: KubeCon + CloudNativeCon Europe 2022

2022-05-18

Authors: José Santos

Summary

The presentation discusses a network-aware framework for workload scheduling in Kubernetes clusters, which aims to reduce latency and improve performance.

The network-aware framework uses a combination of plugins and algorithms to optimize workload scheduling based on network topology and bandwidth resources.
The framework includes two controllers (AppGroup and NetworkTopology), a load watcher component, and a scheduler with filtering and scoring functions.
The framework was tested with the Redis cluster application and was able to improve throughput by 20% on average.
The framework is not yet production-ready but is expected to be contributed to the Kubernetes SIG Scheduling community in the next few months.
Future plans include adding a plugin for monitoring bandwidth and dynamically adjusting workload scheduling based on real-time network congestion.
A demo compared the Online Boutique application under the network-aware framework and under the default Kubernetes scheduler.

The presenter ran the Online Boutique application under both the network-aware framework and the default Kubernetes scheduler, using the Locust load-testing tool to generate requests. With the framework, average latency dropped by at least 30-40% for most requests, and the minimum and average response times also improved.
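
The filtering and scoring steps mentioned in the summary can be illustrated with a minimal sketch. It is not the framework's actual plugin code: the function names, the cost table, and the `max_cost` threshold are all hypothetical, and it assumes a single numeric network cost (for example, latency in milliseconds) between each pair of nodes.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cost table (not from the talk): cost[(a, b)] is an assumed
# network cost, such as latency in ms, between nodes a and b.
Cost = Dict[Tuple[str, str], float]


def network_cost(cost: Cost, a: str, b: str) -> float:
    """Cost between two nodes; zero when both pods would share a node."""
    if a == b:
        return 0.0
    # Treat the table as symmetric; unknown pairs count as unreachable.
    return cost.get((a, b), cost.get((b, a), float("inf")))


def filter_nodes(candidates: List[str], placed_deps: List[str],
                 cost: Cost, max_cost: float) -> List[str]:
    """Keep nodes whose cost to every already-placed dependency is within max_cost."""
    return [n for n in candidates
            if all(network_cost(cost, n, d) <= max_cost for d in placed_deps)]


def score_nodes(feasible: List[str], placed_deps: List[str],
                cost: Cost) -> Dict[str, float]:
    """Total cost from each feasible node to the placed dependencies (lower is better)."""
    return {n: sum(network_cost(cost, n, d) for d in placed_deps) for n in feasible}


def pick_node(candidates: List[str], placed_deps: List[str],
              cost: Cost, max_cost: float) -> Optional[str]:
    """Filter, then score, then return the cheapest node (None if nothing fits)."""
    feasible = filter_nodes(candidates, placed_deps, cost, max_cost)
    if not feasible:
        return None
    scores = score_nodes(feasible, placed_deps, cost)
    # Iterate node names in sorted order so ties break deterministically.
    return min(sorted(scores), key=scores.get)


if __name__ == "__main__":
    costs = {("n1", "n2"): 1.0, ("n1", "n3"): 10.0, ("n2", "n3"): 8.0}
    # A dependency already runs on n1; choose a node for the next pod.
    print(pick_node(["n2", "n3"], ["n1"], costs, max_cost=5.0))
```

Separating the hard constraint (filter) from the preference (score) mirrors the two-phase structure the summary describes; a real scheduler would also weigh CPU, memory, and bandwidth.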

Abstract

Recent applications are latency-sensitive, demanding low latency between microservices in the application. Current scheduling algorithms in Kubernetes aim to reduce costs and increase resource efficiency, which is not enough for applications where end-to-end latency becomes a primary objective. Applications such as databases and multi-tier web services would benefit the most from network-aware scheduling policies that consider latency and bandwidth in addition to default resources (CPU and memory). We introduce a network-aware scheduling framework to tackle this challenge, including two controllers (AppGroup and NetworkTopology) and three scheduling plugins (TopologicalSort, NodeNetworkCostFit, and NetworkMinCost). The framework ensures bandwidth reservations and optimizes the end-to-end application latency since it schedules pods in an application with chained dependencies close to each other. We will show a demo highlighting the benefits of our framework.
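
To make the idea of ordering pods with chained dependencies concrete, here is a minimal sketch of a topological sort over a service dependency graph. It is not the TopologicalSort plugin itself: the service names and the `scheduling_order` helper are hypothetical, and the choice to place called services before their callers is an assumption, since the abstract does not state the direction.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def scheduling_order(calls: Dict[str, Set[str]]) -> List[str]:
    """Return services so that each one comes after every service it calls.

    `calls` maps a service to the set of services it depends on
    (hypothetical input format, assumed for this sketch).
    """
    try:
        # TopologicalSorter treats the dict values as predecessors,
        # so dependencies are emitted before their dependents.
        return list(TopologicalSorter(calls).static_order())
    except CycleError as err:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


if __name__ == "__main__":
    # Illustrative three-tier chain: frontend calls backend, backend calls database.
    app = {"frontend": {"backend"}, "backend": {"database"}, "database": set()}
    print(scheduling_order(app))
```

With a fixed order like this, a scheduler can place each pod knowing where its dependencies already run, which is what allows a cost-based choice of nearby nodes.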
