The presentation covers the methods and technologies behind a machine learning system for anomaly detection, root cause analysis, and predictive auto-scaling in Kubernetes clusters.
- The system is divided into two main parts: preparation and evaluation of models, and real-time execution of the trained models
- Istio is used as the service mesh and source of mesh metrics, and Prometheus is used as the data layer that collects time series data
- Combining workloads into scheduling groups can reduce network resource consumption and improve overall latency
- Anomaly detection methods can significantly reduce the volume of alert notifications and automate the setting of monitoring thresholds
- Predictive auto-scaling can forecast the required number of service pods ahead of time using time series data and feature generation
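
To make the point about automated monitoring thresholds concrete, here is a minimal sketch. The summary does not say which detection method the speaker used, so the rolling mean plus `k` standard deviations rule below is an assumption chosen for illustration, and the names `dynamic_threshold`, `detect_anomalies`, and `latency_ms` are hypothetical.

```python
from statistics import mean, stdev

def dynamic_threshold(history, k=3.0):
    """Upper alert threshold: mean + k * sample standard deviation of recent samples."""
    return mean(history) + k * stdev(history)

def detect_anomalies(series, window=5, k=3.0):
    """Return indices whose value exceeds the threshold derived from the preceding window."""
    anomalies = []
    for i in range(window, len(series)):
        # The threshold is recomputed from the last `window` points,
        # so nobody has to hand-tune a fixed limit per service.
        if series[i] > dynamic_threshold(series[i - window:i], k):
            anomalies.append(i)
    return anomalies

# Hypothetical latency samples in milliseconds, with one obvious spike.
latency_ms = [100, 102, 98, 101, 99, 100, 250, 101, 99]
print(detect_anomalies(latency_ms))  # prints [6]
```

Only the spike at index 6 is reported; the ordinary fluctuations around 100 ms stay below the computed threshold, which is how this kind of rule cuts down on notifications.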
The speaker also described how slow application start-up became a problem for colleagues writing services in Java: by the time a new replica was ready, the load it was meant to absorb had already arrived. To address this, the team scaled the application proactively with predictive auto-scaling, which keeps the necessary number of replicas running in advance based on predicted values of a chosen metric. This approach can help achieve elasticity in Kubernetes clusters.
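
The proactive scaling idea can be sketched as follows. This is an illustration under stated assumptions, not the speaker's implementation: the forecasting model is a plain least-squares line (the talk mentions time series data and feature generation but the summary does not name a model), and `forecast_linear`, `replicas_needed`, `capacity_per_replica`, and the sample values are hypothetical.

```python
import math

def forecast_linear(history, steps_ahead):
    """Extrapolate a least-squares line fitted to `history` (needs at least 2 points)."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    var = sum((x - x_mean) ** 2 for x in range(n))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)

def replicas_needed(predicted_load, capacity_per_replica, min_replicas=1, max_replicas=20):
    """Convert a predicted metric value into a replica count, clamped to limits."""
    wanted = math.ceil(predicted_load / capacity_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

# Hypothetical requests-per-second samples, one per minute.
rps = [100, 120, 140, 160, 180]
# Look far enough ahead to cover the application's start-up time (assumed 3 minutes here).
predicted = forecast_linear(rps, steps_ahead=3)
replicas = replicas_needed(predicted, capacity_per_replica=50)
print(predicted, replicas)  # prints 240.0 5
```

The forecast horizon is the key design choice: it should be at least as long as the start-up time of a new replica, so that the extra capacity is ready when the predicted load arrives.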