The presentation discusses a proof of concept for goal-driven scheduling and energy optimization in a cluster environment using a policy engine, machine learning, and metrics pipelines.
- The goal is to move workloads across a cluster while keeping CO2 emissions within a set target.
- The solution architecture includes a governance layer with a policy engine that enforces energy-efficiency policies, an intelligent scheduler, and a metrics pipeline that feeds the system with data.
- The metrics pipeline includes components such as Kepler (Kubernetes-based Efficient Power Level Exporter), Telegraf, and an XGBoost machine learning model.
- The Metrics Proxy component exposes the metrics for consumption by the scheduler.
- The presentation includes an anecdote about the challenges of scaling and distributing workload blocks in the telco world.
The presenter gives an example from the telco world, where a single device such as an antenna can generate around 10 gigabits per second of metrics. Scaling and distributing workload blocks at that volume is challenging, and the solution has to work in both centralized and highly distributed environments. The presenter suggests that automation helps in such situations and that the system should be designed to allow easy experimentation and versioning.
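To make the goal-driven scheduling idea more concrete, the following is a minimal sketch of how a scheduler might pick a node under a CO2 target. It is not the presenter's implementation: the `Node` fields, the `pick_node` function, and all numbers are hypothetical, and the per-node power figures are hard-coded here where the proof of concept would presumably obtain them from the metrics pipeline and the machine learning model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # All fields are illustrative assumptions, not names from the presentation.
    name: str
    predicted_power_watts: float   # e.g. a model's power estimate for the workload
    carbon_intensity_g_per_kwh: float  # grid carbon intensity at the node's site
    free_cpu: int


def emissions_g_per_hour(node: Node) -> float:
    """Estimated CO2 rate: watts * gCO2/kWh / 1000 gives gCO2 per hour."""
    return node.predicted_power_watts * node.carbon_intensity_g_per_kwh / 1000


def pick_node(nodes: list[Node], cpu_needed: int,
              budget_g_per_hour: float) -> Optional[Node]:
    """Return the lowest-emission node that fits the workload and the budget.

    Returns None when no node satisfies both constraints, which a policy
    engine could treat as "do not schedule" or "defer".
    """
    candidates = [
        n for n in nodes
        if n.free_cpu >= cpu_needed
        and emissions_g_per_hour(n) <= budget_g_per_hour
    ]
    return min(candidates, key=emissions_g_per_hour, default=None)


nodes = [
    Node("node-a", 200, 400, free_cpu=8),  # 80 g/h
    Node("node-b", 300, 100, free_cpu=4),  # 30 g/h
    Node("node-c", 100, 50, free_cpu=1),   # 5 g/h, but too little CPU
]

chosen = pick_node(nodes, cpu_needed=2, budget_g_per_hour=50)
print(chosen.name if chosen else "no node within budget")
```

The budget check stands in for a policy decision and the ranking stands in for the scheduler's intelligence; in a real system these would likely be separate components.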