You can connect Prometheus directly, or have it discovered automatically when you connect
Grafana. Either route gives investigations the same access — pick
whichever fits how you run Prometheus.
What we support
Investigations query Prometheus with PromQL, its query language. They go beyond reading a metric’s raw value: PromQL turns counters and histograms into the rates and percentiles you actually reason about during an incident.
- Rates from counters. Counters only ever climb, so the raw number means little on its own. Investigations wrap them in rate() to ask the real question: how fast are requests failing right now, and was that different before the incident started?
- Percentiles from histograms. Latency lives in histogram buckets, not a single number. Investigations use histogram_quantile() to pull out the p95 or p99 your responders care about, rather than an average that hides the tail.
- Aggregations across dimensions. With sum by, avg by, and max by, investigations roll a metric up to the dimension that matters (per service, per route, per namespace) to find which slice of your fleet is misbehaving.
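As an illustration of these query shapes, the sketch below builds two PromQL strings in Python: a per-service rate over a counter, and a percentile over histogram buckets. It is not how investigations build queries internally. The helper names, the 5m window, and the metric names (http_requests_total, http_request_duration_seconds_bucket) are assumptions chosen for the example.

```python
def error_rate(counter: str, window: str = "5m", by: str = "service") -> str:
    """Per-second rate of a counter, summed to one series per `by` value."""
    return f"sum by ({by}) (rate({counter}[{window}]))"


def latency_quantile(bucket_metric: str, q: float = 0.95,
                     window: str = "5m", by: str = "service") -> str:
    """Quantile from histogram buckets.

    The `le` label must be kept in the aggregation, because
    histogram_quantile() needs the bucket boundaries to interpolate.
    """
    inner = f"sum by (le, {by}) (rate({bucket_metric}[{window}]))"
    return f"histogram_quantile({q}, {inner})"


if __name__ == "__main__":
    # Hypothetical metric names, following common Prometheus conventions.
    print(error_rate('http_requests_total{status=~"5.."}'))
    print(latency_quantile("http_request_duration_seconds_bucket", 0.99))
```

Changing the `by` argument to route or namespace gives the same question at a different level of the fleet.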
Discovering your metrics and labels
Prometheus exposes thousands of metrics and labels, and a query that names the wrong one returns nothing. Rather than guess, investigations read what your Prometheus actually holds: the metric names, their types, and the help text you’ve attached, plus every label and a sense of how many distinct values each one takes.
That last point shapes how queries are built. Grouping by a low-cardinality label like service or namespace gives a readable breakdown; grouping by a high-cardinality one like pod or instance produces noise. Investigations learn which labels are which, so they group on the ones that clarify and filter on the ones that would overwhelm.
Investigations learn this structure — your metrics, their types, and your labels and their cardinality — automatically. How that works is covered in How telemetry works.
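To see the same structure yourself, the Prometheus HTTP API exposes it through /api/v1/metadata, /api/v1/labels, and /api/v1/label/<name>/values. The sketch below shows one way to read those endpoints and sort labels by cardinality. It is an illustration, not a description of how investigations do it: the helper names, the example host, and the threshold of 50 are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical cutoff: labels with more distinct values than this are
# treated as high-cardinality (filter on them rather than group by them).
HIGH_CARDINALITY = 50


def discovery_urls(base_url: str, label: str) -> dict:
    """Prometheus HTTP API endpoints for metric metadata and label discovery."""
    base = base_url.rstrip("/")
    return {
        "metadata": f"{base}/api/v1/metadata",  # metric types and help text
        "labels": f"{base}/api/v1/labels",      # all label names
        "values": f"{base}/api/v1/label/{quote(label, safe='')}/values",
    }


def fetch_data(url: str):
    """GET a Prometheus API URL and return the `data` field of the response."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]


def split_labels(value_counts: dict, threshold: int = HIGH_CARDINALITY):
    """Partition labels into (group_by, filter_only) by distinct-value count."""
    group_by = sorted(l for l, n in value_counts.items() if n <= threshold)
    filter_only = sorted(l for l, n in value_counts.items() if n > threshold)
    return group_by, filter_only
```

Given counts such as {"service": 12, "namespace": 8, "pod": 4000, "instance": 900}, split_labels puts namespace and service in the group-by set and instance and pod in the filter-only set.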
Connecting Prometheus
There are two ways to connect Prometheus. Both give investigations the same access, so choose whichever matches how you run it.
Directly
Connect Prometheus on its own, with its endpoint and credentials:
- The URL of your Prometheus instance, or any backend that speaks the Prometheus HTTP API. Thanos, Cortex, Mimir, and VictoriaMetrics all work the same way.
- Any authentication it requires — for example basic auth or a bearer token.
sum(rate(...)) returns one correct global figure rather than per-instance numbers stitched together after the fact.
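Before connecting, it can help to confirm that the URL and credentials work against the Prometheus HTTP API. The sketch below builds an instant-query request to /api/v1/query with either a bearer token or basic auth. The function name, host, and credentials are placeholders for illustration, and the request is built but not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(base_url, promql, bearer_token=None, basic_auth=None):
    """Build (but do not send) an instant-query request to /api/v1/query."""
    url = base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})
    req = Request(url)
    if bearer_token:
        req.add_header("Authorization", f"Bearer {bearer_token}")
    elif basic_auth:
        user, password = basic_auth
        creds = base64.b64encode(f"{user}:{password}".encode()).decode()
        req.add_header("Authorization", f"Basic {creds}")
    return req


if __name__ == "__main__":
    # Placeholder endpoint and token; substitute your own before sending
    # the request with urllib.request.urlopen(req).
    req = build_query_request("https://prometheus.example", "up",
                              bearer_token="REPLACE_ME")
    print(req.full_url)
```

The same request shape applies to any backend that speaks the Prometheus HTTP API, though the path prefix may differ between backends.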
Through Grafana
If your Prometheus already sits behind Grafana, connect Grafana and Prometheus is discovered automatically as one of the data sources behind it, using Grafana’s own credentials, with nothing separate to configure.
Either way, Prometheus is disabled by default. You opt in deliberately: enable the Prometheus data sources your team uses once they’re connected.
Best practice
- Connect the Grafana dashboards that query Prometheus. Investigations learn your real query patterns from them — which metrics matter, how they’re filtered and grouped — which makes Prometheus queries more accurate.
- Keep your metric metadata and help text populated. Investigations read it to tell counters from gauges and histograms, and to pick the right PromQL function for each.
- Enable the Prometheus data sources your responders actually reach for during incidents, rather than every source available.
Related
Grafana
The provider Prometheus can be connected through.
How telemetry works
How investigations query your metrics.