
How to Develop a Robust Operator for Day-2 (Lesson Learned on KubeVirt/HCO)

2023-04-20

Authors:   Simone Tiraboschi


Abstract

Developing a new Operator for day-1 operations (deployment, initial configuration) is nowadays quite easy. But from our experience, and from our mistakes, developing the Hyperconverged Cluster Operator for the KubeVirt project, we know that this is just the tip of the iceberg. KubeVirt manages VMs, and VMs are strange beasts: they should not simply be destroyed and restarted on a different node; they should be migrated, and this takes time, so the upgrade is long and complex. This presentation will share what we learned while developing, over the years, an operator that manages a rich product hosting stateful applications.

You will learn about:

- Control plane vs. workload upgrades
- Long-running upgrades
- Reliability concerns: canary deployments and fail-forward upgrades
- Protecting pre-release features with feature gates
- How to introduce new APIs and deprecate others
- How to distinguish defaults, explicit user choices, and "don't care" values
- How to implement this with a declarative approach, writing less imperative code
- How to keep the upgrade matrix small and how to plot the upgrade graph

Attendees will be ready to face upgrade challenges, providing a robust operator that users can trust for fully automatic and continuous upgrades.

