How To Handle Node Shutdown In Kubernetes

2022-10-26

Authors:   Xing Yang, Ashutosh Kumar


Summary

The presentation covers how Kubernetes handles node shutdown: the Graceful Node Shutdown feature, and a newer feature for handling non-graceful node shutdown, where the kubelet does not detect the shutdown ahead of time.
  • Non-graceful node shutdown handling was introduced as alpha in Kubernetes v1.24 and is planned to move to beta in v1.26.
  • Graceful shutdown gives pods time to terminate before the node goes down. Non-graceful shutdown handling lets replacement pods for StatefulSets be created on a different running node instead of getting stuck.
  • The non-graceful shutdown feature currently requires manually tainting the node and may not work with certain pod disruption policies.
  • The presentation encourages audience members to get involved in the project and provide feedback.
  • Anecdote: The presenter discusses a common issue in node rebalancing, where cloud providers take nodes out of rotation, and how the non-graceful shutdown feature can help when such a shutdown is not graceful.

Abstract

Node shutdown is inevitable in a Kubernetes cluster, and it can be graceful or non-graceful. A node shutdown can be graceful only if the kubelet detects it ahead of the actual shutdown. For a variety of reasons the kubelet may fail to detect a shutdown, in which case the shutdown is non-graceful. In this talk, Xing and Ashutosh explain the concepts behind graceful shutdown and its impact on running workloads, including the systemd inhibitor locks mechanism and the related configuration settings. Kubernetes v1.24 introduced alpha support for handling non-graceful shutdown, which enables replacement pods for StatefulSets to be created successfully on a different running node; without it, those pods would be stuck. The talk explains how to use the non-graceful shutdown feature with taints, and outlines the future roadmap for making the feature more automated.
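
To make the two mechanisms in the abstract concrete, here is a minimal Python sketch. It is an illustration, not material from the talk. It assumes the kubelet configuration fields `shutdownGracePeriod` and `shutdownGracePeriodCriticalPods` and the taint key `node.kubernetes.io/out-of-service`, as described in the upstream Kubernetes documentation. The names `ShutdownConfig`, `split_grace_period`, `out_of_service_taint`, and `kubectl_taint_command` are hypothetical helpers invented for this example, and `nodeshutdown` is just an example taint value.

```python
from dataclasses import dataclass

# Taint key used to mark a node as out of service (assumed from upstream docs).
OUT_OF_SERVICE_TAINT_KEY = "node.kubernetes.io/out-of-service"


@dataclass(frozen=True)
class ShutdownConfig:
    """Hypothetical mirror of two kubelet settings, in seconds."""
    shutdown_grace_period_s: int                 # shutdownGracePeriod
    shutdown_grace_period_critical_pods_s: int   # shutdownGracePeriodCriticalPods


def split_grace_period(cfg: ShutdownConfig) -> tuple[int, int]:
    """Return (seconds for regular pods, seconds for critical pods).

    The critical-pod window is carved out of the total grace period,
    so regular pods get whatever remains.
    """
    total = cfg.shutdown_grace_period_s
    critical = cfg.shutdown_grace_period_critical_pods_s
    if critical < 0 or total < critical:
        raise ValueError("critical-pod period must be between 0 and the total period")
    return total - critical, critical


def out_of_service_taint(value: str = "nodeshutdown") -> dict[str, str]:
    """Build the taint that an operator applies manually after a non-graceful shutdown."""
    return {"key": OUT_OF_SERVICE_TAINT_KEY, "value": value, "effect": "NoExecute"}


def kubectl_taint_command(node_name: str, taint: dict[str, str]) -> str:
    """Render the equivalent kubectl command line for a given node name."""
    return (
        f"kubectl taint nodes {node_name} "
        f"{taint['key']}={taint['value']}:{taint['effect']}"
    )


if __name__ == "__main__":
    regular, critical = split_grace_period(ShutdownConfig(30, 10))
    print(f"regular pods: {regular}s, critical pods: {critical}s")
    print(kubectl_taint_command("node-1", out_of_service_taint()))
```

With a total grace period of 30 seconds and 10 seconds reserved for critical pods, regular pods get the first 20 seconds to terminate. The taint should only be applied once the node is confirmed to be shut down, and removed again after the node recovers.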

Materials: