Authors: Pavol Loffay, Benedikt Bongartz, Yuri Oliveira Sa, Severin Neumann, Kristina Pathak
2023-04-21

tldr - powered by Generative AI

The tutorial explores the use of OpenTelemetry for end-to-end observability data collection on Kubernetes. Participants will learn how to instrument applications using auto-instrumentation, deploy the OpenTelemetry collector, and collect traces, metrics, and logs.
  • Observability is about understanding applications by looking at metrics, logs, and traces
  • OpenTelemetry is a neutral approach to ship telemetry data
  • The OpenTelemetry project includes a specification, API, SDK, data model, tools for generating traces, and a collector
  • The OpenTelemetry collector can be run on Kubernetes or locally
  • The tutorial covers manual and automatic instrumentation
  • The OpenTelemetry operator can be used to integrate with Prometheus and get logs from nodes
Authors: Damien Grisonnet
2023-04-21

tldr - powered by Generative AI

The presentation discusses the importance of capacity planning, metrics, and logging in Kubernetes and the need for stability and automation in these areas.
  • Capacity planning requires up-to-date and fresh data, and aggregation at collection time to reduce scope.
  • The project provides a tool for capacity planning that does not require knowledge of PromQL.
  • The metrics framework provides stability levels to prevent breaking changes and automation to prevent users from making breaking changes.
  • Structured logging in JSON format is easier to query and analyze than text-based logging.
  • Contextual logging allows for attaching context and data to log lines for better analysis and correlation with tracing.
  • The structured logging working group is actively working on migrating the code base to structured and contextual logging.
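The points about structured and contextual logging can be illustrated with a minimal sketch. It uses Python's standard `logging` module rather than Kubernetes' own Go logging code, and the `JsonFormatter` class, the `context` key, and the pod and trace values are all hypothetical names chosen for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (hypothetical helper)."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Contextual key/value pairs arrive via extra={"context": {...}};
        # the "context" key is a convention assumed for this sketch.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry, sort_keys=True)


def make_logger(stream):
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("structured-demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
base_logger = make_logger(buffer)

# Bind the context once; every line logged through the adapter carries it.
pod_logger = logging.LoggerAdapter(
    base_logger, {"context": {"pod": "web-0", "trace_id": "abc123"}}
)
pod_logger.info("container started")

# Because the output is JSON, a consumer can query fields instead of grepping text.
parsed = json.loads(buffer.getvalue())
print(parsed["pod"], parsed["trace_id"], parsed["msg"])
```

Binding the context to the logger, rather than formatting it into the message string, is what lets a log line be correlated with a trace by field lookup.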
Authors: Benjamin Raskin, Emma Wang
2022-10-28

tldr - powered by Generative AI

The presentation discusses the migration of infrastructure and application metrics from StatsD to Prometheus at DoorDash, and the challenges and lessons learned during the process.
  • The migration involved over 130 services, 1500 dashboards, and more than 7000 alerts.
  • The use of histograms instead of percentiles was a difficult change for engineers to adapt to.
  • The instance label is a high cardinality label that needs to be pre-aggregated to reduce volume.
  • The Prometheus Aggregation Gateway was used for some metrics, but push models were limited to special cases.
  • Automating the monitoring onboarding process for teams is crucial.
  • The migration was completed in one year, resulting in over 27,000 alerts and 2200 dashboards.
  • Post-migration, DoorDash ingests over 15 million metrics per second and persists over 10 million metrics per second.
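The histogram and instance-label points can be sketched together: cumulative histogram buckets can be summed across instances, which precomputed percentiles cannot, and a quantile can then be estimated from the merged buckets. The sketch below is a simplified illustration in plain Python, not DoorDash's pipeline or Prometheus's implementation; the sample data and the function names `drop_label` and `estimate_quantile` are assumptions made for this example.

```python
from collections import defaultdict

# Cumulative bucket counts per instance (made-up data):
# upper bound in seconds -> number of observations at or below that bound.
samples = [
    ({"service": "orders", "instance": "10.0.0.1"}, {0.1: 50, 0.5: 90, 1.0: 100}),
    ({"service": "orders", "instance": "10.0.0.2"}, {0.1: 10, 0.5: 60, 1.0: 100}),
]


def drop_label(samples, label):
    """Pre-aggregate by summing bucket counts over every value of `label`."""
    merged = defaultdict(lambda: defaultdict(int))
    for labels, buckets in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        for bound, count in buckets.items():
            merged[key][bound] += count
    return {key: dict(buckets) for key, buckets in merged.items()}


def estimate_quantile(q, buckets):
    """Estimate a quantile by linear interpolation inside the matching bucket."""
    bounds = sorted(buckets)
    rank = q * buckets[bounds[-1]]
    prev_bound, prev_count = 0.0, 0
    for bound in bounds:
        count = buckets[bound]
        if count >= rank:
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return bounds[-1]


aggregated = drop_label(samples, "instance")
orders = aggregated[(("service", "orders"),)]
p90 = estimate_quantile(0.9, orders)
print(orders, round(p90, 3))
```

Two series per service collapse into one here; with hundreds of instances the same step is what cuts the stored volume.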
Authors: Han Kang, David Ashpole, Damien Grisonnet
2022-10-28

tldr - powered by Generative AI

The presentation discusses the importance of observability in Kubernetes and the role of the SIG Instrumentation group in maintaining and improving the stability and quality of metrics.
  • The SIG Instrumentation group is responsible for maintaining and improving the stability and quality of metrics in Kubernetes.
  • Observability is important in Kubernetes to identify and fix issues related to latency regression and other problems.
  • The group is working on automatically generated documentation to help users understand and use metrics more effectively.
  • The group is also working on adding a beta stage to the stability framework to improve expressiveness.
  • The group is actively seeking new contributors and offers various ways to contribute, including code reviews and documentation.
  • The group maintains several sub-projects, including kube-state-metrics, metrics-server, and prometheus-adapter, which are used for auto-scaling and adapting queries.
  • The group also maintains the logging infrastructure in Kubernetes.
Authors: Jéssica Lins, Matej Gera
2022-05-20

tldr - powered by Generative AI

Metrics can be leveraged to improve end-to-end testing by externalizing the internal behavior of an application and asserting on the externalized information. This allows for more complex testing scenarios and better control over test scenarios.
  • Metrics provide performance insights about an application
  • Metrics allow for more detailed test assertions
  • Metrics enable better control over test scenarios
  • Metrics provide various extra data points about tests
  • Different types of tests can be applied using metrics, including benchmark tests
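As a rough illustration of asserting on externalized metrics, the sketch below parses two scrapes in a simplified subset of the Prometheus text format and checks how a counter changed during a test. It is not the speakers' framework; the metric name, label values, and the helpers `parse_metrics` and `counter_delta` are hypothetical.

```python
import re

# Matches `name{labels} value` or `name value` (simplified; no timestamps or escapes).
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition text into {(name, sorted label pairs): value}."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(LABEL_RE.findall(raw_labels or "")))
        values[(name, labels)] = float(value)
    return values


def counter_delta(before, after, name, **labels):
    """Return how much a counter grew between two scrapes."""
    key = (name, tuple(sorted(labels.items())))
    return after.get(key, 0.0) - before.get(key, 0.0)


# In a real test these strings would come from scraping the application
# before and after the scenario runs; here they are inlined.
before = parse_metrics('''
# TYPE http_requests_total counter
http_requests_total{code="200",method="GET"} 3
http_requests_total{code="500",method="GET"} 0
''')
after = parse_metrics('''
# TYPE http_requests_total counter
http_requests_total{code="200",method="GET"} 5
http_requests_total{code="500",method="GET"} 0
''')

ok_delta = counter_delta(before, after, "http_requests_total", code="200", method="GET")
error_delta = counter_delta(before, after, "http_requests_total", code="500", method="GET")
print(ok_delta, error_delta)
```

Asserting on the delta rather than the absolute value keeps the check independent of whatever traffic the application served before the test started.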
Authors: Patrick Ohly, Damien Grisonnet
2022-05-19

tldr - powered by Generative AI

The presentation discusses the role of SIG Instrumentation in maintaining the logging and metrics infrastructure of Kubernetes and their initiatives to improve the quality of metrics and logging output.
  • SIG Instrumentation maintains the logging and metrics infrastructure of Kubernetes
  • They review issues and PRs related to metrics and Kubernetes
  • They are involved in feature development and announcements related to observability
  • They maintain projects such as kube-state-metrics, metrics-server, and prometheus-adapter
  • They are working on implementing structured logging to improve the quality of logging output
  • They are deprecating command line options in klog related to log file handling
  • They aim to remove unnecessary code in Kubernetes
Authors: Dotan Horovits
2022-05-19

tldr - powered by Generative AI

OpenTelemetry is a new open-source project that aims to provide a single set of APIs, libraries, agents, and collector services to capture distributed traces, metrics, and logs.
  • OpenTelemetry supports metric pipelines and has Prometheus support.
  • The OpenTelemetry Collector has receivers and exporters for Prometheus formats.
  • OpenTelemetry is working on adding logging support.
  • The API is still in draft, but the focus is on getting a specification for a strongly typed and machine-readable format for logs.
  • OpenTelemetry has a new working group for client instrumentation.
  • The OpenTelemetry collector supports the Prometheus format, and you can use Prometheus as a back-end.
  • The future of OpenTelemetry includes making the operational side easier and adding more signals beyond logs, metrics, and tracing.
Authors: Ted Young, Liudmila Molkova
2021-10-15

tldr - powered by Generative AI

The presentation discusses the importance of instrumentation and semantic conventions in distributed tracing for libraries and applications using OpenTelemetry SDK.
  • Instrumentation should be opt-in initially and mature over time with user feedback
  • Performance impact should be considered and users should be mindful of costs
  • Semantic conventions are critical for user experience and should be followed
  • Context propagation is essential for distributed tracing and should be implemented in libraries and applications
  • OpenTelemetry SDK provides solutions for instrumentation, semantic conventions, and context propagation
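To make the context propagation point concrete, the following sketch extracts and injects a W3C Trace Context `traceparent` header by hand. It deliberately avoids the OpenTelemetry SDK, which provides propagators for this, so the functions `extract`, `inject`, and `start_child` are hypothetical stand-ins that only show what travels between services.

```python
import re
import secrets

# version "00", 32 hex trace-id, 16 hex parent-id, 2 hex trace-flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def extract(headers):
    """Read the caller's span context from incoming headers, or None if absent/invalid."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid
    return {"trace_id": trace_id, "span_id": span_id, "sampled": bool(int(flags, 16) & 1)}


def start_child(parent):
    """Create a new span context that continues the parent's trace if there is one."""
    return {
        "trace_id": parent["trace_id"] if parent else secrets.token_hex(16),
        "span_id": secrets.token_hex(8),
        "sampled": parent["sampled"] if parent else True,
    }


def inject(context, headers):
    """Write the span context into outgoing headers."""
    flags = "01" if context["sampled"] else "00"
    headers["traceparent"] = f"00-{context['trace_id']}-{context['span_id']}-{flags}"


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
parent = extract(incoming)
child = start_child(parent)

outgoing = {}
inject(child, outgoing)
print(outgoing["traceparent"])
```

A library that forwards requests without doing this extract-then-inject step breaks the trace at that hop, which is why the talk treats propagation as a library responsibility.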
Authors: Frederic Branczyk, Han Kang, Elana Hashman, David Ashpole
2021-10-13

tldr - powered by Generative AI

The presentation discusses the role of SIG Instrumentation in maintaining and improving observability in Kubernetes through metrics, logging, and auto-scaling.
  • SIG Instrumentation is responsible for maintaining and improving observability in Kubernetes through metrics, logging, and auto-scaling
  • Structured logging is being implemented to improve the logging infrastructure in Kubernetes
  • Projects such as kube-state-metrics, metrics-server, and prometheus-adapter are being maintained to generate and expose metrics for Kubernetes objects
  • Auto-scaling can be done based on any metric using projects such as prometheus-adapter
  • SIG Instrumentation reviews new additions and changes related to metrics to ensure high quality
  • Deprecated command line options related to log file handling will be removed in Kubernetes 1.26
  • SIG Instrumentation also maintains the klog implementation itself
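For the auto-scaling point, the arithmetic the Horizontal Pod Autoscaler applies to a metric value, whichever adapter supplies it, can be sketched as below. This is a simplified illustration of the documented scaling formula with its default tolerance of 0.1; the function name `desired_replicas` is hypothetical, and replica limits and stabilization behavior are left out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Scale the replica count by the ratio of the observed metric to its target."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: do not scale
    return math.ceil(current_replicas * ratio)


# Example: 4 replicas each handling 150 requests/s against a target of 100.
scaled_up = desired_replicas(4, 150, 100)
unchanged = desired_replicas(4, 105, 100)
scaled_down = desired_replicas(4, 50, 100)
print(scaled_up, unchanged, scaled_down)
```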