OpenTelemetry provides correlations between different types of data that can be used to improve service operations and responses to outages.
- OpenTelemetry captures distributed traces, metrics, logs, and resource metadata
- Correlating this information is crucial for understanding failures in highly distributed systems
- OpenTelemetry allows for correlations between language runtime traces and network events
- Correlations can provide general production insights and improve development velocity
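As a rough illustration of what correlating signals means in practice, the sketch below joins log records to spans through the trace and span IDs they share. It is a standard-library toy, not the OpenTelemetry SDK: the `Span` and `LogRecord` classes, the `correlate` helper, and all sample values are hypothetical names made up for this example.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-ins for telemetry records (not SDK types).
@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    name: str


@dataclass(frozen=True)
class LogRecord:
    trace_id: str
    span_id: str
    body: str


def correlate(spans, logs):
    """Map each span name to the log bodies that carry its trace and span IDs."""
    by_context = {}
    for log in logs:
        by_context.setdefault((log.trace_id, log.span_id), []).append(log.body)
    return {
        span.name: by_context.get((span.trace_id, span.span_id), [])
        for span in spans
    }


# Invented sample data: two spans in one trace, one matching log, one unrelated log.
spans = [
    Span("trace-1", "span-a", "checkout"),
    Span("trace-1", "span-b", "charge-card"),
]
logs = [
    LogRecord("trace-1", "span-b", "payment gateway timed out"),
    LogRecord("trace-2", "span-z", "unrelated request"),
]

correlated = correlate(spans, logs)
```

Because both record types carry the same identifiers, an investigation that starts from a slow span can move directly to the logs emitted while that span was active.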
In a mock e-commerce example, correlating distributed traces with service information identified the checkout service as the source of extremely high latency, a problem that could lead to lost customers and decreased faith in the e-commerce system.
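The kind of analysis described in the e-commerce example can be sketched as grouping span durations by the `service.name` resource attribute and picking the slowest service. This is a minimal standard-library sketch under stated assumptions: the span dictionaries, the `duration_ms` field, the helper names, and the numbers are all invented for illustration and are not taken from the talk or from any OpenTelemetry API.

```python
from collections import defaultdict
from statistics import mean

# Invented sample spans; each carries resource metadata naming its service.
sample_spans = [
    {"name": "GET /", "duration_ms": 40, "resource": {"service.name": "frontend"}},
    {"name": "GET /cart", "duration_ms": 55, "resource": {"service.name": "frontend"}},
    {"name": "load-cart", "duration_ms": 30, "resource": {"service.name": "cart"}},
    {"name": "place-order", "duration_ms": 2400, "resource": {"service.name": "checkout"}},
    {"name": "place-order", "duration_ms": 2600, "resource": {"service.name": "checkout"}},
]


def mean_latency_by_service(spans):
    """Average span duration in milliseconds, keyed by service name."""
    durations = defaultdict(list)
    for span in spans:
        durations[span["resource"]["service.name"]].append(span["duration_ms"])
    return {service: mean(values) for service, values in durations.items()}


def slowest_service(spans):
    """Name of the service with the highest mean span duration."""
    latency = mean_latency_by_service(spans)
    return max(latency, key=latency.get)


latency = mean_latency_by_service(sample_spans)
slowest = slowest_service(sample_spans)
```

With this sample data the checkout service stands out by a wide margin, which mirrors how resource metadata attached to traces can narrow an investigation to a single service.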
OpenTelemetry is used across the industry to capture distributed traces, but this is just a sliver of the value that the project provides. OpenTelemetry also gathers metrics (launched earlier this year) and logs (beta) from your applications and infrastructure, allowing you to capture all telemetry through a single pipeline and analyze it in whatever tools you choose! In this session we will discuss:
- How OpenTelemetry correlates these signals, which allows your investigations to flow seamlessly between all of your services and underlying infrastructure
- The deep functionality that OpenTelemetry provides for metrics and logs, including metric formats and aggregations, tailing logs from flat files, and a high-performance, strongly typed logging pipeline for new applications
- Real stories about how large, well-known organizations use OpenTelemetry and the improvements that they’ve gained
- What’s next for OpenTelemetry: new data sources, signals, and more