Presentations | Hack Dojo

Metrics at Full Throttle: Intro and Deep Dive Into Thanos

Conference: KubeCon + CloudNativeCon Europe 2023

Authors: Filip Petkovski, Saswata Mukherjee

2023-04-21

tldr - powered by Generative AI

Thanos is an open-source solution for scaling Prometheus-based monitoring by providing a distributed, highly available metrics system with long-term retention. It addresses scaling challenges such as querying metrics across large time ranges (via downsampling) and ingesting metrics at scale.

Prometheus is a standalone monitoring system that scrapes metrics from applications and stores them locally, but it cannot handle a large multi-environment setup or retain data for a long period of time
Thanos fills the gaps in Prometheus by providing a global view, long-term retention, downsampling, and multi-tenancy features
Thanos achieves a global view through a standalone query service, the Thanos Querier, which evaluates PromQL, and by defining the Store API, which allows the querier to request time series data from any component
Thanos also provides global alerting and rule recording through the Thanos Ruler, which executes alerting rules across the entire data set
Thanos Sidecar can be configured to upload data from Prometheus into object storage, so data can be retained for longer periods without depending on local disks
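
The downsampling idea mentioned in the summary can be illustrated with a minimal sketch. It is not Thanos code: the `Aggregate` type and `downsample` function are hypothetical names, and the only assumption carried over is that raw samples are reduced to per-window aggregates (count, sum, min, max) so that queries over long time ranges touch far fewer points.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Aggregate:
    """Summary of all raw samples that fall into one time window (hypothetical type)."""
    window_start: int  # seconds since epoch, aligned to the window size
    count: int
    total: float
    minimum: float
    maximum: float


def downsample(samples: Iterable[Tuple[int, float]], window_seconds: int = 300) -> List[Aggregate]:
    """Group (timestamp, value) samples into fixed windows and aggregate each window."""
    buckets = {}
    for timestamp, value in samples:
        start = timestamp - timestamp % window_seconds  # align to window boundary
        buckets.setdefault(start, []).append(value)
    return [
        Aggregate(start, len(values), sum(values), min(values), max(values))
        for start, values in sorted(buckets.items())
    ]


raw = [(0, 1.0), (60, 3.0), (299, 2.0), (300, 10.0)]
result = downsample(raw)
```

With the default 5-minute window, the first three samples collapse into one aggregate and the fourth starts a new window. Keeping count and sum alongside min and max means an average can still be derived later from the reduced data.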

Unlocking Argo CD’s Hidden Tools for Chaos Engineering - Featuring VCluster and More

Conference: KubeCon + CloudNativeCon Europe 2023

Authors: Dan Garfield, Brandon Phillips

2023-04-20

tldr - powered by Generative AI

The presentation discusses the scalability of Argo CD and the importance of separating concerns to minimize blast radius. The speaker also introduces the SIG Scalability project and encourages audience participation.

Argo CD is scalable, but it is important to separate concerns to minimize blast radius
SIG Scalability is a project aimed at improving the performance of Argo CD, and the speaker encourages audience participation
The speaker demonstrates the scalability of Argo CD by syncing all apps from the internet
The speaker also discusses the Argo CD certification program

Unlocking the Potential of KEDA: New Features and Best Practices

Conference: KubeCon + CloudNativeCon Europe 2023

Authors: Zbynek Roubalik, Jorge Turrado

2023-04-20

tldr - powered by Generative AI

The presentation discusses the importance of certificate management and webhook validation in KEDA, a Kubernetes-based event-driven autoscaler.

Encrypting internal traffic inside the cluster is necessary to prevent unauthorized access and scaling issues
KEDA introduces mechanisms for automatically generating TLS certificates and supports the use of a custom CA
Validation webhooks prevent scaling conflicts and ensure that required metrics are present
Managed identities are a secure way to connect to cloud provider infrastructure
Exposing metrics is critical for monitoring KEDA's performance
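
As a rough illustration of how a validation webhook can prevent scaling conflicts, the sketch below rejects a new scaling definition when another one already targets the same workload. It is not KEDA's implementation: `ScaleTarget` and `validate_scaled_object` are hypothetical names, and the rule "one scaler per target" is assumed here as the conflict being checked.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ScaleTarget:
    """Workload that a scaling definition points at (hypothetical type)."""
    namespace: str
    kind: str
    name: str


def validate_scaled_object(
    new_name: str,
    new_target: ScaleTarget,
    existing: Dict[str, ScaleTarget],
) -> Tuple[bool, str]:
    """Return (allowed, reason). Reject if a different object already scales the same target."""
    for other_name, other_target in existing.items():
        if other_name != new_name and other_target == new_target:
            return False, f"target already scaled by '{other_name}'"
    return True, "ok"


existing = {"orders-scaler": ScaleTarget("shop", "Deployment", "orders")}
conflict = validate_scaled_object(
    "orders-scaler-2", ScaleTarget("shop", "Deployment", "orders"), existing
)
allowed = validate_scaled_object(
    "payments-scaler", ScaleTarget("shop", "Deployment", "payments"), existing
)
```

Updating an existing definition under its own name is allowed, since the check skips the entry with the same name. Running such a check at admission time surfaces the conflict before two autoscalers start fighting over one workload.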

Availability and Storage Autoscaling of Stateful Workloads on Kubernetes

Conference: KubeCon + CloudNativeCon Europe 2023

Authors: Leila Vayghan

2023-04-19

This talk is the story of how Shopify runs a highly available and scalable stateful application on Kubernetes that is accessed securely over the internet. The application is Elasticsearch, which stores petabytes of data across the globe. Search is a fundamental component of an ecommerce platform, and high availability is an important requirement for it. While Kubernetes has proven to be the perfect platform for deploying stateless applications, running stateful applications on it in a highly available and scalable manner can be complicated. This talk will discuss these challenges and share the steps taken to solve them. For example, Leila will explain the obstacles to implementing storage autoscaling and how using existing Kubernetes features allowed seamless expansion of the persistent disks that store critical search data. She will also explain how her team implemented a feature that allows shrinking persistent disks without any data loss, saving costs by releasing unused storage. Leila will also explain how Envoy is used to allow clients to connect to Elasticsearch through Kubernetes ingress. This talk will give insight into the challenges and rewards of running highly available and scalable stateful applications on Kubernetes.

Tags: high availability, stateful applications, Kubernetes, Elasticsearch, scaling

Tales from on-Call: Fun with Operating Etcd at Scale

Conference: KubeCon + CloudNativeCon Europe 2023

Authors: Chao Chen, Geeta Gharpure

2023-04-19

tldr - powered by Generative AI

Operational issues and their mitigations in running etcd

Database size exceeding the storage quota
Revision divergence
Out-of-memory panic
Timeouts due to defragmentation
Oversized requests
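
Two of the issues listed above, database size and oversized requests, lend themselves to a simple pre-flight check. The sketch below is illustrative only: the function name and return codes are hypothetical, and the two limits are assumptions based on the documented etcd defaults (a 2 GiB backend quota and a 1.5 MiB request size limit), which should be verified against the configuration of the cluster in question.

```python
from typing import List

# Assumed limits; both are configurable in etcd, so treat these as placeholders.
ASSUMED_QUOTA_BYTES = 2 * 1024**3          # backend storage quota
ASSUMED_MAX_REQUEST_BYTES = 1572864        # 1.5 MiB request size limit


def preflight(
    db_size_bytes: int,
    request_bytes: int,
    quota_bytes: int = ASSUMED_QUOTA_BYTES,
    max_request_bytes: int = ASSUMED_MAX_REQUEST_BYTES,
    warn_ratio: float = 0.8,
) -> List[str]:
    """Return a list of problem codes (hypothetical names); empty means no issue found."""
    problems = []
    if request_bytes > max_request_bytes:
        problems.append("request_too_large")
    if db_size_bytes >= quota_bytes:
        problems.append("quota_exceeded")
    elif db_size_bytes >= warn_ratio * quota_bytes:
        problems.append("quota_nearly_exceeded")
    return problems


healthy = preflight(db_size_bytes=100 * 1024**2, request_bytes=4096)
unhealthy = preflight(db_size_bytes=2 * 1024**3, request_bytes=2_000_000)
```

A warning threshold below the hard limit gives operators time to compact and defragment before writes start failing.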

Vitess: Introduction And New Features

Conference: KubeCon + CloudNativeCon North America 2022

Authors: Deepthi Sigireddi, Rohit Nayak, Matt Lord

2022-10-28

tldr - powered by Generative AI

Vitess is a cloud-native database solution that enables virtually unlimited scaling of MySQL. The architecture is based on keyspaces and shards, and it includes components such as vttablet, vtgate, and vtctld. VReplication is a subsystem that enables seamless migrations, resharding, materialized views, CDC, job queues, and other data workflows. Vitess is highly scalable, available, and compatible with various MySQL flavors. Key users include JD.com and Slack.

Vitess is a cloud-native database solution that enables virtually unlimited scaling of MySQL
The architecture is based on keyspaces and shards, and it includes components such as vttablet, vtgate, and vtctld
VReplication is a subsystem that enables seamless migrations, resharding, materialized views, CDC, job queues, and other data workflows
Vitess is highly scalable, available, and compatible with various MySQL flavors
Key users include JD.com and Slack
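
To make the keyspace and shard terminology concrete, here is a minimal sketch of range-based routing. It assumes shards are named by hexadecimal key ranges such as `-80` and `80-`, and that each row maps to a binary keyspace ID. The functions `keyspace_id_for` and `shard_for` are hypothetical, and the SHA-256 hash is only a stand-in, not the hashing that Vitess itself uses.

```python
import hashlib
from typing import Sequence


def keyspace_id_for(sharding_key: str) -> bytes:
    """Stand-in hash (assumption): map a sharding key to an 8-byte keyspace ID."""
    return hashlib.sha256(sharding_key.encode("utf-8")).digest()[:8]


def shard_for(keyspace_id: bytes, shards: Sequence[str]) -> str:
    """Pick the shard whose [low, high) hex range contains the keyspace ID."""
    for name in shards:
        low_hex, high_hex = name.split("-")
        low = bytes.fromhex(low_hex)    # empty string means "from the start"
        high = bytes.fromhex(high_hex)  # empty string means "to the end"
        if keyspace_id >= low and (not high or keyspace_id < high):
            return name
    raise ValueError("no shard covers this keyspace ID")


shards = ["-80", "80-"]
low_half = shard_for(bytes.fromhex("7fffffffffffffff"), shards)
high_half = shard_for(bytes.fromhex("8000000000000000"), shards)
routed = shard_for(keyspace_id_for("customer-42"), shards)
```

Because routing depends only on the range a keyspace ID falls into, splitting `-80` into `-40` and `40-80` changes where rows live without changing how keys are hashed, which is the property that makes resharding workflows practical.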

How Adobe Planned For Scale With Argo CD, Cluster API, And VCluster

Conference: KubeCon + CloudNativeCon North America 2022

Authors: Dan Garfield, Joseph Sandoval

2022-10-28

tldr - powered by Generative AI

The presentation discusses the scalability and security challenges of using Argo CD in large organizations and offers strategies for addressing them.

Argo CD is a tool for managing Kubernetes applications that can support large numbers of developers and objects
The presenter offers a conservative benchmark for Argo CD scalability, suggesting that 15,000 objects and 50 clusters are safe limits
However, the presenter notes that with tweaking, Argo CD can support much larger numbers of objects and clusters
The presenter emphasizes the importance of security in using Argo CD, particularly in multi-tenant environments
The presenter suggests using an app of apps pattern to manage large numbers of objects and dependencies
The presenter recommends splitting Argo CD instances to provide better isolation and prevent noise
The presenter suggests that a control plane may be necessary for managing large numbers of Argo CD instances
The presenter notes that Adobe has been speaking at the conference and is part of a larger Adobe Cinematic Universe of talks

Stateless Collectors For Stateful Data: Scaling Prometheus As a Node Agent

Conference: KubeCon + CloudNativeCon North America 2022

Authors: Danny Clark

2022-10-28

tldr - powered by Generative AI

The presentation discusses the challenges of scaling Prometheus and offers a solution through a managed service that leverages Prometheus as a node agent.

Scaling Prometheus can be challenging due to issues with data aggregation and network failures
Existing solutions such as federation, remote read, and Thanos require manual maintenance and expertise
A managed service that leverages Prometheus as a node agent can mitigate scaling issues and separate state and query concerns
The service forwards metrics data to a remote backend and uses Kubernetes resources and a DaemonSet to achieve the setup
Google's Monarch provides the capacity needed to offer a PromQL-compatible API and long-term retention of metrics

Kcp: Towards 1,000,000 Clusters, Name^WWorkspaced CRDs

Conference: KubeCon + CloudNativeCon North America 2022

Authors: Stefan Schimanski

2022-10-26

tldr - powered by Generative AI

The presentation discusses the KCP machine, a generalized API server built on Kubernetes, and its three dimensions of extension.

The KCP machine is a generalized API server built on Kubernetes that can be extended in three dimensions.
The first dimension is the addition of one million workspaces, each of which is like a small Kubernetes cluster.
The second dimension involves creating services between the workspaces and programming controllers that are multi-workspace aware.
The third dimension involves adding locality over the planet and eventual consistency for global state.
The KCP machine can be used to build multi-tenant services and has various use cases, such as end-to-end testing of controllers and modeling company hierarchies in workspaces.
The goal of the KCP machine is to make clusters uninteresting and allow for easy and cheap creation of workspaces.
The presentation emphasizes that the KCP machine is not meant to replace Kubernetes, but rather to generalize it for other use cases beyond container orchestration.
