Getting the Optimal Service Efficiency That Autoscalers Won’t Give You

Conference: KubeCon + CloudNativeCon Europe 2022

2022-05-18

Authors: Mauro Pessina

Summary

The presentation describes an AI-powered optimization methodology for improving the cost efficiency and performance of a company's digital services.

The challenge faced by the customer was to optimize their application while continuing to release updates that introduce new business functionality and align with new regulations.
The tuning practice in place was manual, and tuning a single macro service took almost two months.
The AI-powered optimization methodology iterates over five steps: applying the new configuration suggested by the AI, applying a workload to the target system, collecting KPIs, analyzing the results, and producing a new configuration to be tested in the next iteration.
The methodology allows setting goals and constraints, such as minimizing application cost while ensuring service reliability.
The presentation includes a case study in which the methodology was used to optimize a customer's authentication service on Kubernetes, resulting in a 49% improvement in cost efficiency compared to the baseline configuration.
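
The five-step loop and the goal/constraint pairing described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the optimizer used in the talk: the `Config` fields, the prices, the toy latency model in `run_experiment`, the 200 ms limit, and the random-search `suggest` function are all hypothetical stand-ins for a real cluster, a real load test, and the AI component.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    cpu_millicores: int
    memory_mib: int
    max_heap_mib: int


@dataclass(frozen=True)
class Kpis:
    cost: float            # goal: minimize
    p95_latency_ms: float  # constraint: stay under a limit


# Hypothetical prices and baseline, chosen only for illustration.
CPU_PRICE_PER_CORE = 30.0
MEM_PRICE_PER_GIB = 4.0
BASELINE = Config(cpu_millicores=2000, memory_mib=4096, max_heap_mib=2048)


def run_experiment(config: Config) -> Kpis:
    """Stand-in for steps 1-3: apply config, apply workload, collect KPIs.

    A real run would deploy the config and drive a load test; this toy
    model just makes latency worse with less CPU or a small heap.
    """
    cost = (config.cpu_millicores / 1000 * CPU_PRICE_PER_CORE
            + config.memory_mib / 1024 * MEM_PRICE_PER_GIB)
    latency = 50.0 + 100_000 / config.cpu_millicores
    if config.max_heap_mib < 1024:
        latency += 300.0  # toy penalty for garbage-collection pressure
    return Kpis(cost=cost, p95_latency_ms=latency)


def suggest(rng: random.Random, history: list) -> Config:
    """Stand-in for step 5. Plain random search; a real optimizer would
    use the history of (config, KPIs) pairs to pick the next candidate."""
    memory = rng.choice([1024, 2048, 4096])
    heap = int(memory * rng.choice([0.5, 0.75]))
    return Config(rng.choice([500, 1000, 1500, 2000]), memory, heap)


def optimize(iterations: int, seed: int = 0, latency_limit_ms: float = 200.0):
    rng = random.Random(seed)
    history = []
    candidate = BASELINE  # the first iteration measures the baseline
    best = None
    for _ in range(iterations):
        kpis = run_experiment(candidate)                    # steps 1-3
        history.append((candidate, kpis))
        feasible = kpis.p95_latency_ms <= latency_limit_ms  # step 4
        if feasible and (best is None or kpis.cost < best[1].cost):
            best = (candidate, kpis)
        candidate = suggest(rng, history)                   # step 5
    return best


if __name__ == "__main__":
    config, kpis = optimize(iterations=30, seed=42)
    print(config, kpis)
```

Because the baseline is always measured first and satisfies the assumed latency limit, the loop never returns a configuration that costs more than the baseline; any candidate that violates the constraint is discarded no matter how cheap it is.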

In the authentication-service case, the baseline configuration caused spikes in response time and higher memory usage, leading to a lack of performance, operational efficiency, and business agility. By experimenting with different Kubernetes and Java configurations suggested by the AI, the methodology identified a new configuration that increased both memory and CPU request limits, adjusted JVM options, and allowed a single pod to sustain the full load. This is the configuration that delivered the 49% improvement in cost efficiency over the baseline.

Abstract

A challenge when tuning a Kubernetes microservices application is identifying the right container size (CPU and memory), due to frequent application changes and varying traffic patterns. Kubernetes autoscalers are the standard solution to automatically adjust Kubernetes container resources for service efficiency. We present the results of an extensive tuning activity we successfully conducted on a Kubernetes application delivering business-critical financial services to SMB customers. Our goal was to minimize cloud cost without compromising the performance of this application. The unexpected result was that the configurations minimizing the service cost were not the ones recommended by the autoscaler. Indeed, autoscalers work by adjusting resource sizing with respect to historical usage, without being aware of the actual cost of cloud resources or of the impact on application performance. In our session, we illustrate how you can use the exploratory testing approach we leveraged to get these results.
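
To make the abstract's point about usage-based sizing concrete, here is a minimal sketch of a recommender that looks only at historical usage. The rule shown (a high percentile of observed usage plus a safety margin) and the function name `recommend_from_usage` are assumptions made for illustration; this is not the actual algorithm of any Kubernetes autoscaler.

```python
import math


def recommend_from_usage(samples, percentile=0.9, margin=0.15):
    """Toy usage-based sizing: a high percentile of observed usage
    (nearest-rank method) plus a safety margin.

    Note what is missing from the inputs: no resource price and no
    latency or error-rate measurement.
    """
    if not samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * (1 + margin)


if __name__ == "__main__":
    # Hypothetical CPU usage samples in millicores.
    cpu_usage = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
    print(recommend_from_usage(cpu_usage))
```

Since cost and performance never enter the calculation, a rule of this kind has no way to prefer a cheaper configuration that still meets the service's performance targets, which is the gap the exploratory testing approach is meant to cover.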

Materials:

Tags: