Thanos is an open-source project for scaling Prometheus-based monitoring. It provides a distributed, highly available metric system with long-term retention, and it addresses scaling challenges such as ingesting metrics at scale and querying them across large time ranges, which it speeds up through downsampling.
- Prometheus is a standalone monitoring system that scrapes metrics from applications and stores them locally, but it cannot handle a large multi-environment setup or retain data for a long period of time
- Thanos fills the gaps in Prometheus by providing a global view, long-term retention, downsampling, and multi-tenancy features
- Thanos achieves a global view through a standalone query service, the Thanos Querier, which evaluates PromQL, and through the Store API, which lets the Querier request time series data from any component that implements it
- Thanos also provides global alerting and recording rules through the Thanos Ruler, which evaluates rules across the entire data set
- The Thanos Sidecar can be configured to upload data from Prometheus into object storage, so data can be retained for long periods without keeping it all on local disks or moving disks around
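To make the downsampling idea from the points above concrete, here is a minimal Python sketch. It is not Thanos's implementation: the `Aggregate` class, the `downsample` function, the 300-second window, and the sample data are all assumptions chosen for illustration. It only shows the general technique of replacing many raw samples with one summary per time window, which is what makes queries over long time ranges cheaper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Aggregate:
    """Summary of all raw samples that fall inside one time window."""
    window_start: int  # start of the window, in seconds
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def downsample(samples, window_seconds=300):
    """Group (timestamp, value) samples into fixed windows and aggregate each."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align every timestamp to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return [
        Aggregate(start, len(values), sum(values), min(values), max(values))
        for start, values in sorted(buckets.items())
    ]


# Hypothetical raw samples, one every 60 seconds for ten minutes.
raw = [(t, float(t // 60)) for t in range(0, 600, 60)]
for agg in downsample(raw):
    print(agg.window_start, agg.count, agg.minimum, agg.maximum, agg.mean)
```

Keeping several aggregates per window, rather than only the mean, means a query over the downsampled data can still answer questions about minima, maxima, and sums.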
A single Prometheus instance only sees the targets it scrapes, so in a large multi-environment setup it cannot provide a global view of the data. Thanos solves this with a standalone query service, the Thanos Querier, which evaluates PromQL, and with the Store API, which lets the Querier request time series data from any component that implements it. The Querier can therefore connect to multiple Prometheus instances and present their data as one global view. Building on that view, the Thanos Ruler provides global alerting and recording rules by evaluating rules across the entire data set.
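The fan-out behind the global view can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Thanos code or its real Store API: `InMemoryStore`, its `add` and `select` methods, `global_query`, and the sample data are hypothetical names and values invented here. The sketch only shows the shape of the idea: every data source answers the same kind of request, and a query layer sends that request to all of them and merges the results.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for any component that can serve series data."""
    name: str
    # Maps (metric name, sorted label pairs) to a list of (timestamp, value).
    series: dict = field(default_factory=dict)

    def add(self, metric, labels, samples):
        self.series[(metric, tuple(sorted(labels.items())))] = list(samples)

    def select(self, metric, start, end):
        """Return matching series restricted to the [start, end] time range."""
        result = {}
        for (name, labels), samples in self.series.items():
            if name != metric:
                continue
            in_range = [(ts, v) for ts, v in samples if start <= ts <= end]
            if in_range:
                result[labels] = in_range
        return result


def global_query(stores, metric, start, end):
    """Send the same request to every store and merge the answers."""
    merged = {}
    for store in stores:
        for labels, samples in store.select(metric, start, end).items():
            points = merged.setdefault(labels, {})
            for ts, value in samples:
                # Keep the first value seen for a given timestamp.
                points.setdefault(ts, value)
    return {labels: sorted(points.items()) for labels, points in merged.items()}


# Two hypothetical Prometheus instances in different environments, plus a
# store holding older data for one of them.
prom_eu = InMemoryStore("prometheus-eu")
prom_eu.add("up", {"env": "eu"}, [(100, 1.0), (160, 1.0)])
prom_us = InMemoryStore("prometheus-us")
prom_us.add("up", {"env": "us"}, [(100, 0.0), (160, 1.0)])
archive = InMemoryStore("object-storage")
archive.add("up", {"env": "eu"}, [(40, 1.0), (100, 1.0)])

view = global_query([prom_eu, prom_us, archive], "up", start=0, end=200)
for labels, samples in sorted(view.items()):
    print(dict(labels), samples)
```

Because the query layer depends only on the shared `select` interface, adding another data source means adding one more entry to the list of stores; nothing in `global_query` changes.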