A single Datadog connection covers four kinds of telemetry: logs, metrics, traces, and error tracking. Investigations query it to see what your services were doing around the time of an incident: the log lines, the metric that moved, the slow span, and the errors that were firing.
You connect Datadog directly, with a Datadog API key, application key, and your Datadog site. A single connection brings all four capabilities, and you choose which ones investigations can use.

What we support

Connecting Datadog gives investigations four capabilities, each of which you enable independently:
Capability | What it queries
Logs | Log lines from your services, and trends derived from those logs
Metrics | Time-series metrics, graphed for the incident window
Traces | Spans across your services to find slow or failing requests
Error tracking | Errors grouped into issues, with counts, services, and error types

Logs

Investigations search your Datadog logs to read what a service was logging at the time of an incident: the errors, the warnings, the request that failed. They can also turn those logs into time series, so you get a graph of an error rate climbing or request volume dropping away even where you never set up a dedicated metric.
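For orientation, a search like the one described above corresponds roughly to a request against Datadog's public Logs Search API (POST /api/v2/logs/events/search). The sketch below only builds the request body for a window centred on an incident and sends nothing. The helper name, the service name, and the query string are hypothetical, and this is not a description of how investigations issue queries internally.

```python
import json
from datetime import datetime, timedelta, timezone


def build_log_search(query: str, incident_at: datetime,
                     window_minutes: int = 30, limit: int = 50) -> dict:
    """Build a Logs Search request body covering a window centred on the incident."""
    half = timedelta(minutes=window_minutes / 2)
    return {
        "filter": {
            "query": query,                              # Datadog log search syntax
            "from": (incident_at - half).isoformat(),    # ISO 8601 start of window
            "to": (incident_at + half).isoformat(),      # ISO 8601 end of window
        },
        "sort": "timestamp",                             # oldest first
        "page": {"limit": limit},
    }


# Hypothetical incident time and query, for illustration only.
incident = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
body = build_log_search("service:checkout status:error", incident)
print(json.dumps(body, indent=2))
```

Centring the window on the incident time keeps both the lead-up and the aftermath in view, which is usually what you want when reading logs around a failure.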

Metrics

Investigations query your metrics and graph them for the incident’s time window, so CPU saturation, a latency change, or a queue backing up shows up against the period that matters.
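As a rough illustration of querying a metric for an incident window, Datadog's public API exposes a timeseries query endpoint (GET /api/v1/query) that takes from and to as epoch seconds. The sketch below only constructs the URL. The helper name, the metric, and the service tag are assumptions chosen for the example, and a real request would also need your API and application keys in its headers.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def metric_query_url(api_host: str, query: str,
                     start: datetime, end: datetime) -> str:
    """Build a timeseries query URL with the window expressed in epoch seconds."""
    params = {
        "from": int(start.timestamp()),
        "to": int(end.timestamp()),
        "query": query,
    }
    return f"https://{api_host}/api/v1/query?{urlencode(params)}"


# Hypothetical incident time and metric query, for illustration only.
incident = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
url = metric_query_url(
    "api.datadoghq.com",
    "avg:system.cpu.user{service:checkout}",
    incident - timedelta(minutes=30),
    incident + timedelta(minutes=30),
)
print(url)
```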

Traces

Investigations search your spans to find the requests behind a problem — which service was slow, where a request failed, how an error propagated across services. Where you’ve connected the Datadog Service Catalog, services are described with their team, tier, and runbook links, so an investigation can tell which team owns a failing service.

Error tracking

Datadog groups individual error events into issues, so the same exception that fired ten thousand times is one issue rather than ten thousand log lines. Investigations query error tracking to see which issues are active and at what volume, then drill from an issue into the underlying logs or traces for the full stack traces and individual events. This is the difference between “errors appeared” and “this specific issue, on this service, started after the deploy and is firing at this rate”, which is usually what you want during an incident.

Investigations learn the structure of your Datadog data (your log fields, metric names and tags, services, and the error issues that show up) automatically. How that works is covered in How telemetry works.
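To picture the grouping that error tracking performs, here is a toy sketch that collapses individual error events into issues. It uses the (service, error type) pair as a stand-in fingerprint, which is an assumption made for the example; Datadog's own grouping is more involved, and the sample events are invented.

```python
from collections import Counter


def group_into_issues(events: list[dict]) -> Counter:
    """Count events per (service, error_type) pair, a stand-in for an issue fingerprint."""
    return Counter((e["service"], e["error_type"]) for e in events)


# Hypothetical sample data: one noisy exception and one rare one.
events = (
    [{"service": "checkout", "error_type": "TimeoutError"}] * 10_000
    + [{"service": "search", "error_type": "KeyError"}] * 3
)

issues = group_into_issues(events)
for (service, error_type), count in issues.most_common():
    print(f"{service}: {error_type} x{count}")
```

Here 10,003 events reduce to two issues, each with a volume, which is the shape of answer an investigation works from before drilling into individual events.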

Connecting Datadog

You connect Datadog directly. There’s no provider in front of it, and one connection covers all four capabilities.

What you’ll need:
  • A Datadog API key.
  • A Datadog application key.
  • Your Datadog URL, exactly as it appears in your browser: for example https://app.datadoghq.com (US1), https://app.datadoghq.eu (EU), or https://us3.datadoghq.com (US3). The URL tells us which Datadog region your account lives in, and, if you have one, the custom subdomain that identifies your organisation.
We recommend keys scoped to read-only access: investigations only ever read from Datadog. Each capability reads a different part of Datadog, so grant the application key the read scopes for the capabilities you plan to use — for example log data for logs, and error tracking for error tracking. Some optional scopes enrich what investigations can do, such as reading your monitors and notebooks to learn the queries your team already relies on, and the Service Catalog to attach team and tier detail to your services.
  1. From the Investigations settings, add a telemetry data source and choose Datadog.
  2. Enter your API key, application key, and Datadog URL, then test the connection.
  3. Choose which of the four capabilities — logs, metrics, traces, error tracking — investigations can use.
All four capabilities are enabled by default once you connect. Turn off any you don’t want investigations to query.
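To show how a browser URL can carry both the region and an optional custom subdomain, here is a minimal sketch that splits one into those parts. The function name and the KNOWN_SITES list are assumptions for illustration (the list is partial and based on Datadog's publicly documented sites), and this is not the parsing the connection form actually performs.

```python
from urllib.parse import urlparse

# Partial list of Datadog sites, assumed for this example.
KNOWN_SITES = (
    "us3.datadoghq.com",
    "us5.datadoghq.com",
    "ap1.datadoghq.com",
    "datadoghq.com",
    "datadoghq.eu",
)


def parse_datadog_url(url: str) -> dict:
    """Split a Datadog browser URL into site, custom subdomain, and API host."""
    host = (urlparse(url).hostname or "").lower()
    # Try the longest site first so "us3.datadoghq.com" wins over "datadoghq.com".
    for site in sorted(KNOWN_SITES, key=len, reverse=True):
        if host == site:
            return {"site": site, "subdomain": None, "api_host": f"api.{site}"}
        if host.endswith("." + site):
            prefix = host[: -len(site) - 1]
            # "app" is the shared default prefix, not an organisation subdomain.
            subdomain = None if prefix == "app" else prefix
            return {"site": site, "subdomain": subdomain, "api_host": f"api.{site}"}
    raise ValueError(f"unrecognised Datadog URL: {url!r}")


for example in ("https://app.datadoghq.com",
                "https://us3.datadoghq.com",
                "https://your-org.datadoghq.com"):
    print(example, "->", parse_datadog_url(example))
```

A URL pasted without its https:// scheme has no parseable hostname and is rejected here, which is one reason to copy it exactly as it appears in the browser.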

Connecting more than one Datadog organisation

Several Datadog organisations can share a region — app.datadoghq.com, for instance, is the same URL for every US1 account. To connect more than one, each organisation needs its own custom subdomain so we can tell them apart and route queries to the right place: a URL like https://your-org.datadoghq.com rather than the shared https://app.datadoghq.com. Custom subdomains aren’t enabled by default. If your organisations don’t have one yet, contact Datadog support to request it — see Datadog’s custom sub-domains guide. Once an organisation has its own subdomain, connect it using that full URL. If you connect a second organisation on the same region without a distinguishing subdomain, the connection test flags the clash so you don’t end up with two connections we can’t tell apart.
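The clash described above can be pictured as two connections resolving to the same hostname. The sketch below is a simplified illustration under that assumption; the function name is hypothetical and the real connection test may apply different or additional checks.

```python
from collections import Counter
from urllib.parse import urlparse


def find_clashes(urls: list[str]) -> list[str]:
    """Return hostnames used by more than one connection, in sorted order."""
    hosts = Counter((urlparse(u).hostname or "").lower() for u in urls)
    return sorted(host for host, count in hosts.items() if count > 1)


# Two organisations on the shared US1 URL, plus one with its own subdomain.
connections = [
    "https://app.datadoghq.com",
    "https://app.datadoghq.com/",
    "https://your-org.datadoghq.com",
]
print(find_clashes(connections))
```

Only the shared app.datadoghq.com host is reported; the connection using its own subdomain stays distinguishable.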

Best practice

  • Grant the read scopes for every capability you intend to use, and the optional scopes for monitors, notebooks, and the Service Catalog. Investigations learn your real query patterns and service ownership from them, which makes queries more accurate.
  • Keep error tracking enabled alongside logs and traces. On its own it tells you which issues are firing; with logs and traces enabled, an investigation can follow an issue through to the stack traces and individual events behind it.

Telemetry overview

How data sources and capabilities fit together.

How telemetry works

Routing, query planning, guidance, and memory.