Datadog is one connection that covers four kinds of telemetry: logs, metrics, traces, and error tracking. Investigations query it to see what your services were doing around the time of an incident — the log lines, the metric that moved, the slow span, and the errors that were firing.
You connect Datadog directly, with a Datadog API key, application key, and your Datadog site. A single connection brings all four capabilities, and you choose which ones investigations can use.

What we support

Connecting Datadog gives investigations four capabilities, each of which you enable independently:
Capability | What it queries
Logs | Log lines from your services, and trends derived from those logs
Metrics | Time-series metrics, graphed for the incident window
Traces | Spans across your services to find slow or failing requests
Error tracking | Errors grouped into issues, with counts, services, and error types

Logs

Investigations search your Datadog logs to read what a service was logging at the time of an incident: the errors, the warnings, the request that failed. They can also turn those logs into time series, so you get a graph of an error rate climbing or request volume dropping away even where you never set up a dedicated metric.
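As a rough illustration of the two kinds of log query described above, the sketch below builds request bodies for Datadog's public Logs API: a search (`POST /api/v2/logs/events/search`) and a count-over-time aggregate (`POST /api/v2/logs/analytics/aggregate`). The helper names, the example query, and the window sizes are assumptions made for this example; it is not a description of how investigations issue queries internally.

```python
from datetime import datetime, timedelta, timezone


def incident_window(incident_at: datetime, minutes_before: int = 30,
                    minutes_after: int = 15) -> tuple[str, str]:
    """Return ISO 8601 start/end strings bracketing the incident (sizes are arbitrary)."""
    start = incident_at - timedelta(minutes=minutes_before)
    end = incident_at + timedelta(minutes=minutes_after)
    return start.isoformat(), end.isoformat()


def log_search_body(query: str, start: str, end: str, limit: int = 50) -> dict:
    """Body for POST /api/v2/logs/events/search: raw log lines, oldest first."""
    return {
        "filter": {"query": query, "from": start, "to": end},
        "sort": "timestamp",
        "page": {"limit": limit},
    }


def log_timeseries_body(query: str, start: str, end: str, interval: str = "1m") -> dict:
    """Body for POST /api/v2/logs/analytics/aggregate: count of matching logs per interval."""
    return {
        "filter": {"query": query, "from": start, "to": end},
        "compute": [{"aggregation": "count", "type": "timeseries", "interval": interval}],
    }


# Hypothetical incident time and query, for illustration only.
start, end = incident_window(datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc))
search = log_search_body("service:checkout status:error", start, end)
trend = log_timeseries_body("service:checkout status:error", start, end)
```

The second body is what makes "an error rate climbing" graphable without a dedicated metric: the same filter is reused, and only the compute clause changes.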

Metrics

Investigations query your metrics and graph them for the incident’s time window, so CPU saturation, a latency change, or a queue backing up shows up against the period that matters.
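For a sense of what a metric query scoped to an incident window looks like, here is a minimal sketch that builds a URL for Datadog's timeseries query endpoint (`GET /api/v1/query`), which takes `from` and `to` as epoch seconds. The function name, metric name, and tag are assumptions chosen for the example.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def metric_query_url(site: str, query: str, start: datetime, end: datetime) -> str:
    """Build a GET /api/v1/query URL for the given window (epoch seconds)."""
    params = urlencode({
        "from": int(start.timestamp()),
        "to": int(end.timestamp()),
        "query": query,
    })
    return f"https://api.{site}/api/v1/query?{params}"


# Hypothetical metric and window, for illustration only.
start = datetime(2024, 5, 1, 11, 30, tzinfo=timezone.utc)
end = datetime(2024, 5, 1, 12, 15, tzinfo=timezone.utc)
url = metric_query_url("datadoghq.eu", "avg:system.cpu.user{service:checkout}", start, end)
```

The request would be sent with the `DD-API-KEY` and `DD-APPLICATION-KEY` headers; the sketch stops at building the URL so it can be run without credentials.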

Traces

Investigations search your spans to find the requests behind a problem — which service was slow, where a request failed, how an error propagated across services. Where you’ve connected the Datadog Service Catalog, services are described with their team, tier, and runbook links, so an investigation can tell which team owns a failing service.
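A span search of the kind described above can be sketched against Datadog's public Spans API (`POST /api/v2/spans/events/search`), which wraps the filter in a `data.attributes` envelope. The helper name and the example query are assumptions for illustration; the queries an investigation actually runs may look different.

```python
def span_search_body(query: str, start: str, end: str, limit: int = 25) -> dict:
    """Body for POST /api/v2/spans/events/search."""
    return {
        "data": {
            "type": "search_request",
            "attributes": {
                # from/to accept ISO 8601 timestamps or relative values such as "now-15m".
                "filter": {"query": query, "from": start, "to": end},
                "sort": "timestamp",
                "page": {"limit": limit},
            },
        }
    }


# Hypothetical query: failing spans from one service in the last 15 minutes.
body = span_search_body("service:checkout status:error", "now-15m", "now")
```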

Error tracking

Datadog groups individual error events into issues, so the same exception that fired ten thousand times is one issue rather than ten thousand log lines. Investigations query error tracking to see which issues are active and at what volume, then drill from an issue into the underlying logs or traces for the full stack traces and individual events. This is the difference between “errors appeared” and “this specific issue, on this service, started after the deploy and is firing at this rate”, which is usually what you want during an incident.

Investigations automatically learn the structure of your Datadog data: your log fields, metric names and tags, services, and the error issues that show up. How that works is covered in How telemetry works.
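To make the idea of grouping error events into issues concrete, here is a toy sketch. It is not Datadog's grouping logic: the `ErrorEvent` and `Issue` types and the choice of (service, error type) as the grouping key are hypothetical, picked only to show why ten thousand events can collapse into a single issue with a count.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ErrorEvent:
    service: str
    error_type: str
    message: str


@dataclass
class Issue:
    service: str
    error_type: str
    count: int = 0
    sample_messages: list = field(default_factory=list)


def group_into_issues(events, max_samples: int = 3) -> list:
    """Collapse events into issues keyed by a hypothetical (service, error_type) fingerprint."""
    issues = {}
    for event in events:
        key = (event.service, event.error_type)
        issue = issues.setdefault(key, Issue(event.service, event.error_type))
        issue.count += 1
        # Keep only a few example messages per issue instead of every event.
        if len(issue.sample_messages) < max_samples:
            issue.sample_messages.append(event.message)
    # Highest-volume issues first.
    return sorted(issues.values(), key=lambda issue: issue.count, reverse=True)


# Hypothetical events, for illustration only.
events = [ErrorEvent("checkout", "TimeoutError", "upstream timed out")] * 10_000
events += [ErrorEvent("cart", "KeyError", "missing sku")] * 2
issues = group_into_issues(events)
```

Here the 10,002 events reduce to two issues, with the high-volume one listed first, which is the shape of answer that matters during an incident.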

Connecting Datadog

You connect Datadog directly. There’s no provider in front of it, and one connection covers all four capabilities.
What you’ll need:
  • A Datadog API key.
  • A Datadog application key.
  • Your Datadog site — the region your Datadog account lives in, for example datadoghq.com (US), datadoghq.eu (EU), or another Datadog region.
We recommend keys scoped to read-only access, because investigations only ever read from Datadog. Each capability reads a different part of Datadog, so grant the application key the read scopes for the capabilities you plan to use: for example, read access to log data for logs, and read access to error tracking for error tracking. Some optional scopes enrich what investigations can do, such as reading your monitors and notebooks to learn the queries your team already relies on, and the Service Catalog to attach team and tier detail to your services.
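If you want to sanity-check the three values before entering them, the sketch below shows how a site maps to an API base URL (`https://api.` followed by the site) and which headers Datadog's API expects. The helper names are assumptions for this example. Note that Datadog's `GET /api/v1/validate` endpoint checks the API key only, so a passing check there does not confirm the application key or its scopes.

```python
def api_base_url(site: str) -> str:
    """Map a Datadog site such as 'datadoghq.eu' to its API base URL."""
    site = site.strip().lower()
    # Expect a bare site name, not a full URL.
    if not site or "/" in site or ":" in site:
        raise ValueError(f"expected a site like 'datadoghq.com', got {site!r}")
    return f"https://api.{site}"


def auth_headers(api_key: str, app_key: str) -> dict:
    """Headers sent with authenticated Datadog API requests."""
    return {
        "DD-API-KEY": api_key,
        "DD-APPLICATION-KEY": app_key,
        "Accept": "application/json",
    }


def validate_request(site: str, api_key: str) -> tuple:
    """URL and headers for GET /api/v1/validate (checks the API key only)."""
    return f"{api_base_url(site)}/api/v1/validate", {"DD-API-KEY": api_key}


# Placeholder values; substitute your own keys and site.
url, headers = validate_request("datadoghq.eu", "<your-api-key>")
```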
  1. From the Investigations settings, add a telemetry data source and choose Datadog.
  2. Enter your API key, application key, and site, then test the connection.
  3. Choose which of the four capabilities — logs, metrics, traces, error tracking — investigations can use.
All four capabilities are enabled by default once you connect. Turn off any you don’t want investigations to query.
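The default-on behaviour can be pictured as a small settings object. This is a hypothetical model for illustration, not the product's actual configuration format: the class and field names are assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass
class DatadogCapabilities:
    # All four capabilities start enabled, mirroring the default after connecting.
    logs: bool = True
    metrics: bool = True
    traces: bool = True
    error_tracking: bool = True

    def enabled(self) -> list:
        """Names of the capabilities investigations may query."""
        return [name for name, on in asdict(self).items() if on]


defaults = DatadogCapabilities()
without_metrics = DatadogCapabilities(metrics=False)
```

Turning a capability off removes it from what investigations can query, while the connection itself and the other capabilities stay as they are.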

Best practice

  • Grant the read scopes for every capability you intend to use, and the optional scopes for monitors, notebooks, and the Service Catalog. Investigations learn your real query patterns and service ownership from them, which makes queries more accurate.
  • Keep error tracking enabled alongside logs and traces. On its own it tells you which issues are firing; with logs and traces enabled, an investigation can follow an issue through to the stack traces and individual events behind it.

Telemetry overview

How data sources and capabilities fit together.

How telemetry works

Routing, query planning, guidance, and memory.