Authors: Reese Lee
2023-04-19

tldr - powered by Generative AI

The presentation covers the basics of metrics in OpenTelemetry, including the architecture of a metrics pipeline, the metric instruments, and their use cases.
  • OpenTelemetry supports observability by providing an API and SDK for instrumenting code and collecting telemetry data.
  • The MeterProvider is the API entry point for metrics; meters create the instruments that record measurements.
  • Aggregation, temporality, and dimensions are important concepts in metrics.
  • Asynchronous UpDownCounters and gauges are two types of metric instruments that serve different purposes.
  • There is much more to learn about OpenTelemetry metrics, including customization options and processors for transforming metrics data.
  • The presentation lists references for further exploration and credits the people who contributed to the content.
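The temporality concept mentioned in this summary can be illustrated with a minimal sketch. It does not use the OpenTelemetry SDK; the function names are hypothetical, and the handling of counter resets and of the first reading are assumptions made for the example. Cumulative temporality reports a running total since some start time, while delta temporality reports only the change since the previous report.

```python
from typing import List


def cumulative_to_delta(cumulative: List[float]) -> List[float]:
    """Convert cumulative counter readings into per-interval deltas.

    Assumptions for this sketch: the counter started at zero before the
    first reading, and a drop in value means the counter was reset.
    """
    deltas = []
    previous = 0.0
    for value in cumulative:
        if value >= previous:
            deltas.append(value - previous)
        else:
            # Counter reset: everything counted since the reset is new.
            deltas.append(value)
        previous = value
    return deltas


def delta_to_cumulative(deltas: List[float]) -> List[float]:
    """Rebuild a running total from per-interval deltas."""
    total = 0.0
    running = []
    for delta in deltas:
        total += delta
        running.append(total)
    return running
```

For example, the cumulative readings 5, 12, 12, 20 correspond to the deltas 5, 7, 0, 8, and summing those deltas recovers the original series.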
Authors: Benjamin Raskin, Emma Wang
2022-10-28

tldr - powered by Generative AI

The presentation discusses the migration of infrastructure and application metrics from StatsD to Prometheus at DoorDash, and the challenges and lessons encountered during the process.
  • The migration involved over 130 services, 1500 dashboards, and more than 7000 alerts.
  • Switching from percentiles to histograms was a difficult change for engineers to adapt to.
  • The instance label is a high cardinality label that needs to be pre-aggregated to reduce volume.
  • The Prometheus aggregation gateway was used for some metrics, but push models were limited to special cases.
  • Automating the monitoring onboarding process for teams is crucial.
  • The migration was completed in one year, resulting in over 27,000 alerts and 2200 dashboards.
  • Post-migration, DoorDash ingests over 15 million metrics per second and persists over 10 million metrics per second.
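Two of the points above, histograms versus percentiles and pre-aggregating the instance label, can be sketched together. The example below is not DoorDash's pipeline and does not call Prometheus; the data layout and function names are hypothetical. It assumes cumulative bucket counts keyed by upper bound, sums them across instances (which drops the instance label), and then estimates a quantile by linear interpolation within the matching bucket, similar in spirit to the `histogram_quantile` function in PromQL. Precomputed per-instance percentiles cannot be combined this way, which is one reason histograms suit aggregation.

```python
import math
from collections import defaultdict
from typing import Dict

# Hypothetical layout: upper bound ("le") -> cumulative observation count.
Buckets = Dict[float, int]


def drop_instance_label(per_instance: Dict[str, Buckets]) -> Buckets:
    """Sum cumulative bucket counts across instances."""
    merged: Dict[float, int] = defaultdict(int)
    for buckets in per_instance.values():
        for upper_bound, count in buckets.items():
            merged[upper_bound] += count
    return dict(merged)


def estimate_quantile(q: float, buckets: Buckets) -> float:
    """Estimate the q-quantile from cumulative buckets by interpolation."""
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound in bounds:
        count = buckets[upper_bound]
        if count >= rank:
            in_bucket = count - lower_count
            # The open-ended bucket cannot be interpolated, so fall back
            # to the highest finite bound.
            if math.isinf(upper_bound) or in_bucket == 0:
                return lower_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Made-up latency buckets (seconds) for two instances of one service.
scraped = {
    "instance-a": {0.1: 50, 0.5: 90, 1.0: 100, math.inf: 100},
    "instance-b": {0.1: 10, 0.5: 60, 1.0: 95, math.inf: 100},
}
service_level = drop_instance_label(scraped)
median = estimate_quantile(0.5, service_level)
```

The merged series carries one set of buckets per service rather than one per instance, so the stored volume no longer grows with the number of instances.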
Authors: Ganesh Vernekar
2022-05-19

tldr - powered by Generative AI

The presentation discusses the implementation of sparse histograms in Prometheus and Grafana for efficient monitoring of metrics.
  • Sparse histograms are a new type of histogram that offers high resolution with low memory usage.
  • The implementation of sparse histograms in Prometheus and Grafana allows for efficient scraping and visualization of metrics.
  • The use of sparse histograms can be applied to various types of metrics, including latency and memory usage.
  • The implementation of sparse histograms is open source and available in the client_golang library and the Prometheus server.
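The summary does not describe how sparse histograms achieve high resolution at low memory cost, so the following is only a sketch under stated assumptions. It assumes an exponential bucket layout in which bucket boundaries are powers of 2^(2^-schema), with bucket `i` covering the range from base^(i-1) (exclusive) to base^i (inclusive), and it stores only the buckets that actually received observations. The function names are hypothetical and this is not the client_golang code.

```python
import math
from collections import Counter
from typing import Iterable


def bucket_index(value: float, schema: int) -> int:
    """Index of the exponential bucket containing a positive value.

    Assumed layout: base = 2 ** (2 ** -schema); bucket i covers
    (base ** (i - 1), base ** i]. A higher schema means finer buckets.
    """
    if value <= 0:
        raise ValueError("this sketch only handles positive observations")
    return math.ceil(math.log2(value) * 2.0 ** schema)


def bucket_upper_bound(index: int, schema: int) -> float:
    """Upper boundary of the bucket with the given index."""
    return 2.0 ** (index * 2.0 ** -schema)


def observe_all(values: Iterable[float], schema: int) -> Counter:
    """Count observations per bucket, keeping only populated buckets."""
    counts: Counter = Counter()
    for value in values:
        counts[bucket_index(value, schema)] += 1
    return counts


# Four observations spread over a wide range occupy only three buckets.
sparse = observe_all([0.3, 0.3, 3.0, 100.0], schema=0)
```

Because empty buckets are never materialized, the memory cost follows the number of distinct populated buckets rather than the full range the layout could cover.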