
Scheduling Framework: Batch Extensions with Apache YuniKorn

2023-04-20

Authors:   Wilfred Spiegelenburg, Peter Bacsko


Summary

Using the Kubernetes scheduling framework, Apache YuniKorn has extended the Kubernetes scheduler to add batch-focused functionality, including workload queuing, gang scheduling, and application sorting.
  • Batch and data processing workloads have different scheduling requirements than service-oriented workloads.
  • Apache YuniKorn provides batch-focused functionality on top of the existing Kubernetes scheduler.
  • Features include workload queuing, gang scheduling, and application sorting.
  • These features are useful for bursty deployments and high-performance computing.
  • Apache YuniKorn is designed to be flexible and customizable.
Data processing is one example of where this batch-focused functionality is useful. When processing large amounts of data, it is important to be able to schedule a set of pods together as a gang, rather than one pod at a time. This allows for more efficient processing and better resource utilization. Additionally, queuing workloads and scheduling based on what each application requests allows for more flexibility and better management of resources.
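
The all-or-nothing idea behind gang scheduling can be illustrated with a small sketch. This is not how Apache YuniKorn implements it; the `Node` class, the `place_gang` function, and the CPU-only resource model are hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: int  # free CPU in millicores (hypothetical, CPU-only model)


def place_gang(pod_cpu_requests, nodes):
    """Return {pod_index: node_name} if every pod fits, otherwise None.

    Illustrative only: a simple first-fit heuristic over a copy of the
    free capacity, so a failed attempt leaves the nodes untouched. Being
    a heuristic, it can return None even when some placement exists.
    """
    free = {node.name: node.free_cpu for node in nodes}
    placement = {}
    # Try the largest requests first; they are the hardest to fit.
    order = sorted(range(len(pod_cpu_requests)),
                   key=lambda i: -pod_cpu_requests[i])
    for idx in order:
        need = pod_cpu_requests[idx]
        target = next((name for name, cpu in free.items() if cpu >= need), None)
        if target is None:
            return None  # one pod does not fit, so nothing is placed
        free[target] -= need
        placement[idx] = target
    return placement


nodes = [Node("node-a", 2000), Node("node-b", 1000)]
fits = place_gang([1000, 1000, 1000], nodes)          # all three pods fit
too_big = place_gang([1000, 1000, 1000, 1000], nodes)  # the fourth pod does not
```

In this sketch a gang is either placed as a whole or not at all, which avoids the situation where a few pods of a job hold resources while waiting for the rest.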

Abstract

Kubernetes is no longer just running service-oriented workloads. Batch and data processing workloads are everywhere; Apache Spark is a good example. Scheduling requirements differ between these types of workloads. The default scheduler and workload resources, like a Job or CronJob, are not always a good fit. Jobs change each run based on the data being processed. Some jobs need a set of pods, or gang, to start processing, but could use more or fewer. How do we get that flexibility? Pods are often created on the fly by a management pod. Jobs can fail because pods are rejected as part of quota enforcement. Why not queue the pods instead until quota is available? What happens if the management pod gets preempted? Using the scheduling framework, Apache YuniKorn has extended the scheduler to add batch-focused functionality. All added functionality is opt-in and can be used on top of the existing default functionality. Job-aware preemption, gang scheduling, and queueing with quotas enforced while scheduling are some of the features we will cover.
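
The difference between rejecting a pod at quota and queueing it until quota is available can be sketched as follows. This is a minimal illustration under stated assumptions, not Apache YuniKorn's implementation: the `QuotaQueue` class, its method names, the strict first-in-first-out order, and the CPU-only quota are all hypothetical.

```python
from collections import deque


class QuotaQueue:
    """Hypothetical queue that holds pods until quota frees up."""

    def __init__(self, quota_cpu):
        self.quota_cpu = quota_cpu  # total CPU quota in millicores
        self.used_cpu = 0
        self.pending = deque()      # (pod_name, cpu) waiting for quota
        self.running = {}           # pod_name -> cpu currently counted

    def submit(self, pod_name, cpu):
        # Never reject: the pod waits in the queue if quota is exhausted.
        self.pending.append((pod_name, cpu))
        self._admit()

    def finish(self, pod_name):
        # Release the finished pod's quota and admit whoever is waiting.
        self.used_cpu -= self.running.pop(pod_name)
        self._admit()

    def _admit(self):
        # Strict FIFO (an assumption): stop at the first pod that does not fit.
        while self.pending and self.used_cpu + self.pending[0][1] <= self.quota_cpu:
            name, cpu = self.pending.popleft()
            self.running[name] = cpu
            self.used_cpu += cpu


queue = QuotaQueue(quota_cpu=2000)
queue.submit("driver", 1000)
queue.submit("executor-1", 1000)
queue.submit("executor-2", 1000)   # over quota: queued, not rejected
running_before = sorted(queue.running)
queue.finish("driver")             # frees quota, executor-2 is admitted
running_after = sorted(queue.running)
```

With rejection, the third pod would fail at creation and the job might fail with it; with queueing, it simply starts later once quota is released.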

