Multi-Cluster Observability with Service Mesh - That Is a Lot of Moving Parts!?

Conference: KubeCon + CloudNativeCon Europe 2023

2023-04-19

Authors: Ryota Sawada

Summary

The presentation covers multi-cluster observability and the challenges of managing metrics and data retention across clusters.

Cardinality and data retention are two key concerns in multi-cluster observability.
Metrics can be fetched from a running service such as Prometheus, but the cost of retaining that data adds up quickly.
Dashboards are only effective when they can tell clusters and applications apart.
The presentation focuses on Istio, Prometheus, and Thanos as the key projects for multi-cluster observability.
The demo walks through installing Istio and creating certificates for secure communication between clusters.
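
To make the cardinality point concrete, here is a minimal sketch of how the worst-case number of time series for one metric grows once a label is added to tell clusters apart. The label names and counts are hypothetical and are not taken from the talk; the only assumption about Prometheus is that each distinct combination of label values is stored as its own series.

```python
from math import prod


def max_series(label_values: dict[str, int]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical label counts for a single request metric inside one cluster.
per_cluster = {
    "source_workload": 20,
    "destination_workload": 20,
    "response_code": 5,
}
single_cluster_bound = max_series(per_cluster)

# Adding a "cluster" label (hypothetical name) multiplies the bound
# by the number of clusters being observed.
multi_cluster_bound = max_series({**per_cluster, "cluster": 3})

print(single_cluster_bound, multi_cluster_bound)
```

Real series counts are usually below this bound because not every label combination occurs, but the multiplication shows why each extra label needs a deliberate decision.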

The presenter stresses that data retention costs need active management in a multi-cluster setup. Historical analysis and investigations depend on keeping data long enough, so a deliberate retention strategy is needed to keep that affordable.

Abstract

Observability is complicated and multi-faceted by nature. Multiply that by a multi-cluster setup, and the complexity can seem untameable. Service Mesh solutions can look like the key to such a daunting task: they hide multi-cluster handling away and provide an observability setup by default. So, is Service Mesh a silver bullet for any complex Observability requirement? No, it isn't. In fact, it can make things more complicated. Ryota has been running Istio in production since its v1.1 release. He will share how Istio helped in many areas, and also highlight the parts he had trouble with, such as cross-cluster traces and metrics. We will then take a step back to Prometheus basics, understand what Istio does by default, and find the gaps. After covering the challenges of alert handling, high cardinality, and remote read/write, we will wrap up with a demo of how such a multi-cluster Observability setup can be achieved using Istio, Prometheus Operator, and Thanos.
