The presentation discusses the reliability of running Cluster Autoscaler in production and provides insights on monitoring and debugging tools.
- Cluster Autoscaler's primary job is to ensure that all pods can be scheduled
- Metrics such as the number of pending (unschedulable) pods are useful for monitoring Cluster Autoscaler's performance
- Cluster Autoscaler should be run on dedicated nodes or on the control plane VMs to prevent issues with scaling down
- Testing configurations before using them in production is recommended
- Ignoring certain flags can have significant side effects
- Autoscaling behavior can change significantly at scale, so it should be tested at scale as well
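
As a rough illustration of the pending-pod metric mentioned in the list, the sketch below counts unschedulable pods and reports how long the oldest one has been waiting. It is not taken from the presentation. It assumes pods are available as dictionaries shaped like Kubernetes API Pod objects (for example, parsed from `kubectl get pods -A -o json`), and the helper names `unschedulable_since` and `pending_pod_summary` are hypothetical.

```python
from datetime import datetime, timezone


def parse_k8s_time(value):
    """Parse a Kubernetes timestamp such as 2024-01-01T00:00:00Z into UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def unschedulable_since(pod):
    """Return when the pod was marked unschedulable, or None if it was not."""
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return None
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return parse_k8s_time(cond["lastTransitionTime"])
    return None


def pending_pod_summary(pods, now):
    """Count unschedulable pods and find the age of the oldest one in seconds."""
    ages = []
    for pod in pods:
        since = unschedulable_since(pod)
        if since is not None:
            ages.append((now - since).total_seconds())
    return {
        "unschedulable_count": len(ages),
        "oldest_age_seconds": max(ages, default=0.0),
    }


def make_pod(phase, conditions):
    """Build a minimal Pod-like dict for the example (illustrative data only)."""
    return {"status": {"phase": phase, "conditions": conditions}}


def unschedulable_condition(timestamp):
    return {
        "type": "PodScheduled",
        "status": "False",
        "reason": "Unschedulable",
        "lastTransitionTime": timestamp,
    }


sample_pods = [
    make_pod("Pending", [unschedulable_condition("2024-01-01T00:00:00Z")]),
    make_pod("Running", [{"type": "PodScheduled", "status": "True"}]),
    make_pod("Pending", [unschedulable_condition("2024-01-01T00:08:00Z")]),
    # Pending but already scheduled (for example, still pulling its image).
    make_pod("Pending", [{"type": "PodScheduled", "status": "True"}]),
]

now = datetime(2024, 1, 1, 0, 10, 0, tzinfo=timezone.utc)
summary = pending_pod_summary(sample_pods, now)
print(summary)
```

A count that stays above zero, or an oldest age that keeps growing, would suggest that scale-up is not keeping pace with demand. What threshold to alert on is a judgment call that depends on the cluster and its workloads.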