Authors: Ana Medina, Brad McCoy, Meha Bhalodiya, Giovanni Liva
2023-04-21

tldr - powered by Generative AI

The presentation discusses the use cases and benefits of Keptn, a Kubernetes deployment lifecycle management toolkit that provides observability and automation for Kubernetes deployments.
  • Keptn provides observability and automation for Kubernetes deployments
  • Use cases include preventing bad deployments during maintenance windows, supporting quality gates, and running evaluations after deployments
  • Keptn allows for monitoring and tracing of application deployments across different stages
  • The Keptn metrics server brings observability directly into the cluster by exposing Kubernetes native metrics
  • This allows HPA, Argo Rollouts, and Flux to be configured to use Kubernetes native metrics
Authors: Brad McCoy, Meha Bhalodiya
2023-04-21

tldr - powered by Generative AI

The presentation discusses the experience of contributing to open source projects, particularly through Google Summer of Code (GSoC), and the benefits it brings to both mentors and mentees.
  • The speaker shares her experience of contributing to Jenkins through GSoC
  • The CDF community is welcoming and supportive of new contributors
  • Goal setting, time commitment, and communication are important factors in successful contributions
  • Imposter syndrome is a common challenge for mentees
  • Anyone can be a mentor regardless of their level of expertise
  • Matching the right project is crucial for successful contributions
Authors: Christian Hernandez, Leigh Capili, Priyanka Pinky Ravi, Roberth Strand, Filip Jansson
2023-04-21

tldr - powered by Generative AI

The panel discusses the evolution and principles of GitOps and its impact on configuration management and infrastructure deployment.
  • GitOps is a set of principles and practices that decouples CI from CD, allowing tasks to be coordinated more asynchronously.
  • GitOps tools, such as Flux, Argo, and Carvel, have emerged to support the GitOps workflow.
  • Stateful infrastructure is a reality in larger teams and systems, but GitOps can help by providing a desired state for complex computers like Kubernetes.
  • The principles of GitOps have driven the development of the tools, resulting in similar workflows across different tool sets.
  • The panel emphasizes the importance of understanding what is GitOps material and what is not when setting up infrastructure with tools like Terraform.
Authors: Andre Marcelo-Tanner
2023-04-20

tldr - powered by Generative AI

Lessons learned from a Kubernetes outage and disaster recovery process
  • Complete your migrations
  • Be experts in your tooling
  • Always be practicing your disaster recovery
Authors: Dan Garfield, Brandon Phillips
2023-04-20

tldr - powered by Generative AI

The presentation discusses the scalability of Argo CD and the importance of separating concerns to minimize blast radius. The speaker also introduces the SIG Scalability project and encourages audience participation.
  • Argo CD is scalable, but it is important to separate concerns to minimize blast radius
  • SIG Scalability is a project aimed at improving the performance of Argo CD, and the speaker encourages audience participation
  • The speaker demonstrates the scalability of Argo CD by syncing all apps from the internet
  • The speaker also discusses the Argo CD certification program
Authors: Martin Villumsen, Michael Vittrup Larsen
2023-04-20

tldr - powered by Generative AI

The presentation discusses the development of a common Kubernetes platform and multi-tenant platform to reduce developer cognitive load and abstract away infrastructure. The focus is on using the Kubernetes API for everything and implementing the Gateway API for network configuration.
  • Development teams have been building their own cloud platforms for the past 4-5 years, resulting in many similar platforms that differ only in some details
  • Increased need for network features external to people such as web application firewalls and DDoS protection led to the establishment of a platform team to build a common Kubernetes platform
  • The main principle is to reduce developer cognitive load and provide a paved path for running applications in the cloud
  • The team aims to use the Kubernetes API for everything and expose it with some kind of abstraction on top
  • The team is building a custom Kubernetes controller from scratch using Kubebuilder and implementing the Gateway API for network configuration
  • The Gateway API is a networking model that consists of several Kubernetes resources making it more flexible and role-oriented
  • The team plans to use the Gateway API in production by the end of the year
Authors: Shahar Shmaram, Ran Mansoor
2023-04-20

tldr - powered by Generative AI

The presentation discusses the challenges faced by a company during hyper growth and how they implemented a solution using the GitOps methodology and Backstage to manage their resources and visualize them in one place.
  • The company faced challenges during hyper growth such as lack of alignment, manually managed resources, unknown resource dependencies and ownership, exploding budget, and lack of technical documentation.
  • They implemented a solution using the GitOps methodology, which emphasizes declarative infrastructure as code, versioning, immutability, automatic deployment pipelines, and continuous reconciliation.
  • They also used Backstage, an open platform for building developer portals, to manage their resources in one location, write documentation easily, search for information, use automated software templates, and create self-contained plugins.
  • The solution was auditable, declarative, had a single source of truth, was community-driven, self-serve, and provided visibility.
  • An anecdote was given about how the GitOps solution detected a drift in a policy and automatically brought it back to its desired state.
Tags: AI, Cybersecurity, DevOps, GitHub, Backstage, hyper growth, resource management, visualization, automation, documentation, self-serve, community-driven
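The drift anecdote in this summary describes continuous reconciliation: compare the declared desired state with what is actually running, then correct any difference. The following is a minimal Python sketch of a single reconciliation pass, not the speakers' implementation. The `Cluster` class, the `diff` and `reconcile` functions, and the policy names are all hypothetical, and the "cluster" is just an in-memory dictionary rather than a real API client.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a live system; not a real API client."""
    resources: dict = field(default_factory=dict)


def diff(desired: dict, actual: dict) -> dict:
    """Return the changes needed to make `actual` match `desired`.

    Each key maps to the desired spec, or to None when the resource
    exists in `actual` but is not declared in `desired` and should be removed.
    (Assumes specs themselves are never None.)
    """
    changes = {}
    for name, spec in desired.items():
        if actual.get(name) != spec:
            changes[name] = spec
    for name in actual:
        if name not in desired:
            changes[name] = None
    return changes


def reconcile(desired: dict, cluster: Cluster) -> dict:
    """Run one reconciliation pass and return the drift that was corrected."""
    changes = diff(desired, cluster.resources)
    for name, spec in changes.items():
        if spec is None:
            del cluster.resources[name]
        else:
            cluster.resources[name] = spec
    return changes


# Desired state as it would be declared in version control (illustrative values).
desired_state = {"policy/read-only": {"effect": "deny-write"}}

# Live state after someone changed the policy by hand.
live = Cluster(resources={"policy/read-only": {"effect": "allow-write"}})

corrected = reconcile(desired_state, live)
print("corrected drift:", corrected)
print("live state now:", live.resources)
```

A real GitOps agent would run such a pass repeatedly against the live API and read the desired state from a Git repository; the sketch only shows the compare-and-correct step.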
Authors: Ionut-Maxim Margelatu, Larisa Andreea Danaila
2023-04-20

tldr - powered by Generative AI

The presentation discusses the challenges of having separate workflows for infrastructure provisioning and application deployment and proposes a unified approach using Crossplane. The speaker also highlights the importance of putting everything in a single release.
  • Separate workflows for infrastructure provisioning and application deployment lead to inefficiency, higher risk of errors, longer feedback loop, and unmanageable complexity
  • A unified approach using Crossplane can improve iteration speed, quality, and time to market
  • Putting everything in a single release is crucial for a continuous deployment pipeline and for reducing cognitive load on developers
  • Examples of challenges include running post-deployment tests, making changes in configuration, and dealing with multiple repositories
Authors: Joaquin Rodriguez, Alessandro Vozza
2023-04-20

tldr - powered by Generative AI

The presentation discusses the challenges of scaling observability and deployment automation in GitOps and proposes a solution using open-source tools like Cluster API, Argo CD, and Prometheus with Thanos to manage and organize deployments.
  • GitOps has clear advantages over traditional CI/CD tools, but scaling observability and deployment automation can be challenging
  • Open-source tools like Cluster API, Argo CD, and Prometheus with Thanos can help manage and organize deployments
  • The presentation proposes treating clusters as immutable, always stamped out from a template, to address the fear of upgrading
  • The Cluster API project can be used to describe a cluster declaratively and to interact with different cloud providers
  • The vcluster project can be used to create ephemeral clusters that live inside management clusters, which is useful when provisioning time is a crucial parameter
  • The presentation emphasizes the importance of monitoring ephemeral clusters and collecting metrics from them
  • The use of open-source tools can automate the deployment of hundreds of clusters and applications securely
Authors: Ricardo Rocha, Spyros Trigazis
2023-04-20

The Kubernetes infrastructure at CERN runs a variety of workloads, from scientific computing to critical services for the campus and our physics accelerator complex. It is important to offer the features and capabilities our users require, and even more so the high levels of service they need. In this session we present in detail a recent incident in which a rogue maintenance tool deleted a third of our production capacity in minutes, how this resulted in no downtime and only service degradation, and how we were able to recover in a short time. We describe our architecture for achieving high service availability, the options we took to reduce blast radius, the concept of “clusters as cattle”, and how extensive use of GitOps saved the day. We will also describe some lessons learned in the process, the cyclic dependencies detected when recovering from a major outage, and the corner cases where more care is needed for stateful workloads and multi-cluster scheduling. We will demo this on stage, showing how real CERN services recover from events that not so long ago would have had a very serious impact, and how the effort of the last years has paid off, with our users responding calmly and positively while going through a major incident.