Authors: Filip Petkovski, Saswata Mukherjee
2023-04-21

tldr - powered by Generative AI

Thanos is an open-source solution for scaling Prometheus-based monitoring: a distributed, highly available metrics system with long-term retention. It addresses scaling challenges such as ingesting metrics at scale and, through downsampling, querying metrics across large time ranges.
  • Prometheus is a standalone monitoring system that scrapes metrics from applications and stores them locally, but it cannot handle a large multi-environment setup or retain data for a long period of time
  • Thanos fills the gaps in Prometheus by providing a global view, long-term retention, downsampling, and multi-tenancy features
  • Thanos achieves a global view through a standalone querier service that evaluates PromQL and through the Store API, which lets the querier request time series data from any component
  • Thanos also provides global alerting and recording rules through the Thanos Ruler, which evaluates rules across the entire data set
  • The Thanos sidecar can be configured to upload data from Prometheus into object storage, making it easier to retain data for longer periods of time and to move disks around
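The global-view idea above can be illustrated with a minimal sketch. It is not Thanos code: the `Store` interface, `InMemoryStore`, and `query` are hypothetical names, and the "first store wins" merge is a simplifying assumption standing in for Thanos's real deduplication. It only shows one querier fanning a request out to several stores that share a common API and merging the results.

```python
from typing import Protocol

Sample = tuple[int, float]  # (timestamp in seconds, value)


class Store(Protocol):
    """Hypothetical stand-in for a component that exposes a store API."""

    def series(self, name: str, start: int, end: int) -> list[Sample]: ...


class InMemoryStore:
    """Toy store backed by a dict; a real store would be a sidecar or bucket reader."""

    def __init__(self, data: dict[str, list[Sample]]) -> None:
        self._data = data

    def series(self, name: str, start: int, end: int) -> list[Sample]:
        return [(t, v) for t, v in self._data.get(name, []) if start <= t <= end]


def query(stores: list[Store], name: str, start: int, end: int) -> list[Sample]:
    """Fan out to every store and merge the samples by timestamp."""
    merged: dict[int, float] = {}
    for store in stores:
        for t, v in store.series(name, start, end):
            # Assumption: on duplicate timestamps, the first store queried wins.
            merged.setdefault(t, v)
    return sorted(merged.items())


recent = InMemoryStore({"up": [(100, 1.0), (200, 1.0)]})
historical = InMemoryStore({"up": [(0, 1.0), (100, 0.0)]})
result = query([recent, historical], "up", 0, 200)
```

Because every store answers the same call, the querier does not need to know whether the data comes from a live Prometheus or from object storage.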
Authors: Dan Garfield, Brandon Phillips
2023-04-20

tldr - powered by Generative AI

The presentation discusses the scalability of Argo CD and the importance of separating concerns to minimize blast radius. The speaker also introduces the SIG Scalability project and encourages audience participation.
  • Argo CD is scalable but it's important to separate concerns to minimize blast radius
  • SIG Scalability is a project aimed at improving the performance of Argo CD and the speaker encourages audience participation
  • The speaker demonstrates the scalability of Argo CD by syncing all apps from the internet
  • The speaker also discusses the Argo CD certification program
Authors: Zbynek Roubalik, Jorge Turrado
2023-04-20

tldr - powered by Generative AI

The presentation discusses the importance of certificate management and webhook validation in KEDA, a Kubernetes-based event-driven autoscaler.
  • Encrypting internal traffic inside the cluster is necessary to prevent unauthorized access and scaling issues
  • KEDA introduces mechanisms for automatically generating TLS certificates and supports the use of a custom CA
  • Validation webhooks prevent scaling conflicts and ensure that required metrics are present
  • Managed identities are a secure way to connect to cloud provider infrastructure
  • Exposing metrics is critical for monitoring KEDA's performance
Authors: Leila Vayghan
2023-04-19

This talk is a story of how Shopify runs a highly available and scalable stateful application on Kubernetes which is accessed securely over the internet. The application discussed is Elasticsearch which stores petabytes of data over the globe. Search is a fundamental component of an ecommerce platform and high availability is an important requirement for it. While Kubernetes has proven to be the perfect platform for deploying stateless applications, running stateful applications on this platform in a highly available and scalable manner can be complicated. This talk will discuss these challenges and will share the steps towards solving them. For example, Leila will explain the obstacles of implementing storage autoscaling and how using the existing Kubernetes features allowed seamless expansion of persistent disks that store critical search data. She will also explain how her team implemented a feature that allowed shrinking persistent disks without any data loss and saved costs by releasing unused storage. Leila will also explain how Envoy is used to allow clients to connect to Elasticsearch through Kubernetes' ingress. This talk will give insight into the challenges and rewards of running highly available and scalable stateful applications on Kubernetes.
Authors: Chao Chen, Geeta Gharpure
2023-04-19

tldr - powered by Generative AI

Operational issues and their mitigations in running etcd
  • Database size exceeding the storage quota
  • Revision divergence
  • Out-of-memory panic
  • Timeouts due to defragmentation
  • Oversized requests
Authors: Jorge Palma
2023-04-19

tldr - powered by Generative AI

The presentation discusses the importance of building sustainable, carbon-aware cloud-native apps and reducing carbon emissions for Kubernetes workloads using the CNCF open-source project KEDA.
  • Sustainability in the technology space requires reducing emissions while facing greater demand to build scalable applications
  • Green software principles include energy efficiency, hardware efficiency, and carbon awareness
  • Carbon intensity measures how much carbon is emitted to produce the energy we use
  • The carbon-aware scaler for KEDA uses demand shaping to scale workloads based on the carbon intensity of the infrastructure where they're running
  • The carbon-aware scaler is implemented using a Kubernetes operator that reads the infrastructure provider's data from a ConfigMap
  • The carbon-aware scaler is an open-source wrapper for public sources of data
  • The carbon-aware scaler allows users to define carbon emission thresholds and maximum replicas
  • The project is being developed for KEDA core and users are encouraged to join the sustainability efforts
  • Join the CNCF sustainability TAG and check the links for more information
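The threshold idea in the summary above can be sketched as a small lookup. This is not the operator's code or configuration format: the function name `max_replicas_for`, the threshold values, and the unit (gCO2eq/kWh) are assumptions made for illustration. Each entry pairs an upper bound on carbon intensity with the replica ceiling that applies up to that bound.

```python
def max_replicas_for(intensity: float, thresholds: list[tuple[float, int]]) -> int:
    """Return the replica ceiling for the current carbon intensity.

    thresholds is a list of (intensity_upper_bound, max_replicas) pairs.
    Above the highest bound, the most restrictive (last) ceiling applies.
    """
    if not thresholds:
        raise ValueError("at least one threshold is required")
    ordered = sorted(thresholds)
    for upper_bound, max_replicas in ordered:
        if intensity <= upper_bound:
            return max_replicas
    return ordered[-1][1]


# Hypothetical policy: allow fewer replicas as the grid gets dirtier.
policy = [(200.0, 100), (400.0, 50), (600.0, 10)]
low = max_replicas_for(150.0, policy)
high = max_replicas_for(450.0, policy)
```

The workload can then scale as usual on its own demand signal, with this ceiling limiting how far it may scale while carbon intensity is high.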
Authors: Deepthi Sigireddi, Rohit Nayak, Matt Lord
2022-10-28

tldr - powered by Generative AI

Vitess is a cloud-native database solution that enables virtually unlimited scaling of MySQL. The architecture is based on keyspaces and shards, and it includes components such as VTTablet, VTGate, and vtctld. VReplication is a subsystem that enables seamless migrations, resharding, materialized views, CDC, job queues, and other data workflows. Vitess is highly scalable, available, and compatible with various MySQL flavors. Key users include JD.com and Slack.
  • Vitess is a cloud-native database solution that enables virtually unlimited scaling of MySQL
  • The architecture is based on keyspaces and shards, and it includes components such as VTTablet, VTGate, and vtctld
  • VReplication is a subsystem that enables seamless migrations, resharding, materialized views, CDC, job queues, and other data workflows
  • Vitess is highly scalable, available, and compatible with various MySQL flavors
  • Key users include JD.com and Slack
Authors: Dan Garfield, Joseph Sandoval
2022-10-28

tldr - powered by Generative AI

The presentation discusses the scalability and security challenges of using Argo CD in large organizations and offers strategies for addressing them.
  • Argo CD is a tool for managing Kubernetes applications that can support large numbers of developers and objects
  • The presenter offers a conservative benchmark for Argo CD scalability, suggesting that 15,000 objects and 50 clusters are safe limits
  • However, the presenter notes that with tweaking, Argo CD can support much larger numbers of objects and clusters
  • The presenter emphasizes the importance of security in using Argo CD, particularly in multi-tenant environments
  • The presenter suggests using an app-of-apps pattern to manage large numbers of objects and dependencies
  • The presenter recommends splitting Argo CD instances to provide better isolation and prevent noise
  • The presenter suggests that a control plane may be necessary for managing large numbers of Argo CD instances
  • The presenter notes that Adobe has been speaking at the conference and is part of a larger Adobe Cinematic Universe of talks
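As a rough worked example of the benchmark figures above, the following sketch estimates how many Argo CD instances a fleet would need if each instance stayed within 15,000 objects and 50 clusters. The function name and the assumption that load divides evenly across instances are illustrative only. The summary also notes that tuning can push a single instance well past these limits.

```python
import math

# Conservative per-instance limits quoted in the summary above.
SAFE_OBJECTS_PER_INSTANCE = 15_000
SAFE_CLUSTERS_PER_INSTANCE = 50


def instances_needed(total_objects: int, total_clusters: int) -> int:
    """Estimate the instance count, assuming load can be split evenly."""
    by_objects = math.ceil(total_objects / SAFE_OBJECTS_PER_INSTANCE)
    by_clusters = math.ceil(total_clusters / SAFE_CLUSTERS_PER_INSTANCE)
    # Whichever limit is hit first decides the count; always at least one.
    return max(by_objects, by_clusters, 1)


estimate = instances_needed(total_objects=40_000, total_clusters=60)
```

Here the object count is the binding limit: 40,000 objects need three instances, while 60 clusters would need only two.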
Authors: Danny Clark
2022-10-28

tldr - powered by Generative AI

The presentation discusses the challenges of scaling Prometheus and offers a solution through a managed service that leverages Prometheus as a node agent.
  • Scaling Prometheus can be challenging due to issues with data aggregation and network failures
  • Existing solutions such as Federation, remote read, and Thanos require manual maintenance and expertise
  • A managed service that leverages Prometheus as a node agent can mitigate scaling issues and separate state and query concerns
  • The service forwards metrics data to a remote backend and uses Kubernetes resources and a DaemonSet to achieve the setup
  • Google's Monarch provides the capacity needed to offer a PromQL-compatible API and long-term retention of metrics
Authors: Stefan Schimanski
2022-10-26

tldr - powered by Generative AI

The presentation discusses the KCP machine, a generalized API server built on Kubernetes, and its three dimensions of extension.
  • The KCP machine is a generalized API server built on Kubernetes that can be extended in three dimensions.
  • The first dimension is the addition of one million workspaces, each of which is like a small Kubernetes cluster.
  • The second dimension involves creating services between the workspaces and programming controllers that are multi-workspace aware.
  • The third dimension involves adding locality over the planet and eventual consistency for global state.
  • The KCP machine can be used to build multi-tenant services and has various use cases, such as end-to-end testing of controllers and modeling company hierarchies in workspaces.
  • The goal of the KCP machine is to make clusters uninteresting and allow for easy and cheap creation of workspaces.
  • The presentation emphasizes that the KCP machine is not meant to replace Kubernetes, but rather to generalize it for other use cases beyond container orchestration.