The shape of an investigation
Every investigation moves through three phases.
Fast diagnosis
The investigation starts from what it already knows: the alert that fired (including any error and stack trace) and
the message that declared the incident. This gives it an initial read on what happened and where, in seconds.
Gather context in parallel
While that initial picture forms, the investigation fans out across every source you’ve connected, all at once —
searching Slack for relevant discussion, finding similar past incidents and what resolved them, surfacing the
runbooks and reference docs that apply, lining up recent deploys, feature flags, and config changes from your change
events, pulling in the code changes that could be responsible, and querying your telemetry for anomalies around the
time of the incident. It also checks whether any third-party providers you depend on were having an outage at the
same moment. All of this runs in parallel, so the slow searches never hold up the rest.
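The fan-out described above can be pictured with a small concurrency sketch. This is only an illustration, not incident.io's implementation: the source names, the latencies, and the functions `query_source`, `gather_context`, and `run_gather` are all hypothetical, and each lookup is simulated with a sleep.

```python
import asyncio

# Hypothetical sources with made-up latencies (seconds), for illustration only.
SOURCES = {
    "slack_search": 0.15,
    "past_incidents": 0.05,
    "telemetry": 0.20,
    "third_party_status": 0.10,
}


async def query_source(name: str, delay: float) -> str:
    """Stand-in for one evidence lookup; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return name


async def gather_context(sources: dict) -> list:
    """Start every lookup at once and collect results in arrival order."""
    tasks = [asyncio.create_task(query_source(n, d)) for n, d in sources.items()]
    arrived = []
    for finished in asyncio.as_completed(tasks):
        arrived.append(await finished)  # fast sources land first
    return arrived


def run_gather() -> list:
    """Synchronous entry point for the sketch."""
    return asyncio.run(gather_context(SOURCES))
```

Calling `run_gather()` returns the source names in the order they finished, so the quickest lookup is usable without waiting for the slowest one, and the whole run takes roughly as long as the slowest single source rather than the sum of all of them.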
Analyze and refine
The investigation pulls the gathered evidence into an initial hypothesis, then looks for what’s missing. It asks
targeted follow-up questions — often by reading your code or running further telemetry queries — and feeds the
answers back in. Each pass makes the hypothesis more specific and better grounded.
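One way to picture this refine loop is the sketch below. It is a simplified model under stated assumptions, not the product's logic: the `Hypothesis` class, the `refine` function, the `max_passes` budget, and the canned questions and answers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Hypothesis:
    statement: str
    open_questions: list[str] = field(default_factory=list)
    answers: dict[str, str] = field(default_factory=dict)


def refine(
    hypothesis: Hypothesis,
    ask: Callable[[str], tuple[str, list[str]]],
    max_passes: int = 3,
) -> int:
    """Answer open questions pass by pass; return the number of passes run.

    `ask` is a hypothetical lookup that returns an answer plus any
    follow-up questions that the answer raises.
    """
    passes = 0
    while hypothesis.open_questions and passes < max_passes:
        next_questions: list[str] = []
        for question in hypothesis.open_questions:
            answer, follow_ups = ask(question)
            hypothesis.answers[question] = answer  # feed the answer back in
            next_questions.extend(follow_ups)
        hypothesis.open_questions = next_questions
        passes += 1
    return passes


# Canned, made-up answers standing in for code reading and telemetry queries.
CANNED = {
    "what changed recently?": ("a deploy went out", ["what did that deploy touch?"]),
    "what did that deploy touch?": ("a new query on the orders table", []),
}


def canned_ask(question: str) -> tuple[str, list[str]]:
    return CANNED[question]
```

Starting from `Hypothesis("orders are slow", ["what changed recently?"])`, `refine(h, canned_ask)` runs two passes: the first answer raises a follow-up question, and the second pass answers it and leaves nothing open.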
Investigations run throughout the incident
An investigation doesn’t stop after its first report. It keeps running for as long as the incident is live, re-assessing as the situation changes. New activity in the channel, fresh alerts, a third-party provider changing state, or a responder steering it can each prompt the investigation to gather new evidence and reconsider its hypothesis. This means the investigation stays current with the incident rather than going stale the moment it posts. If the cause shifts, or new information rules out the original theory, the investigation follows along.
Investigations keep going until the incident is resolved or declined. You can also pause one if you’d rather it stopped, and pick it back up later.
From evidence to findings
The investigation reasons in terms of findings: concrete hypotheses about what happened, each backed by evidence.
- A finding is a claim, like “a recent deploy introduced a query that locks the orders table under load.”
- Evidence is what supports or contradicts it: a specific Slack message, a pull request diff, a metric spike, a line of code.
- Each finding carries a confidence level, so you can see how sure the investigation is.
This is why investigations improve as you connect more sources. A finding grounded in a real code diff and a matching
metric spike is far stronger than one inferred from an error message alone.
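The relationship between findings, evidence, and confidence can be sketched as a small data model. Everything here is an assumption made for illustration: the class and field names, the three confidence levels, and especially the scoring rule, which is a toy heuristic and not how incident.io computes confidence.

```python
from dataclasses import dataclass, field
from enum import Enum


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Evidence:
    source: str            # e.g. "pull_request", "metric", "slack", "error_message"
    summary: str
    supports: bool = True  # False means the evidence contradicts the finding


@dataclass
class Finding:
    claim: str
    evidence: list[Evidence] = field(default_factory=list)

    def confidence(self) -> Confidence:
        """Toy heuristic: more independent supporting sources, more confidence."""
        if not self.evidence or any(not e.supports for e in self.evidence):
            return Confidence.LOW
        kinds = {e.source for e in self.evidence}
        return Confidence.HIGH if len(kinds) >= 2 else Confidence.MEDIUM
```

Under this heuristic, a finding backed only by an error message scores medium, while the same claim backed by both a pull request diff and a metric spike scores high, which mirrors why connecting more sources strengthens the findings.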
Reading your code
When an investigation needs to understand the code itself, it does more than search for keywords. It works out which repositories are relevant from the incident’s context, plans the specific questions worth answering, such as “where is this value set?” or “what changed here recently?”, and then reads the code to answer them. All code analysis runs inside isolated, sandboxed containers. Repositories are cloned into ephemeral workspaces and deleted after use. See Code repositories for how access and security work.
Querying your telemetry
Investigations query the same logs, metrics, traces, and dashboards your responders reach for. Rather than blindly running queries, the investigation learns the shape of each connected data source (its labels, common query patterns, and the dashboards your team actually uses) so its queries are relevant to your systems. See Telemetry for the providers you can connect.
Checking your dependencies
Not every incident is your fault. Alongside everything else, an investigation checks whether the third-party providers you depend on (AWS, GitHub, Stripe, and the like) were having an outage around the time of your incident, and surfaces any that could explain it. This needs no setup. See Third-party dependencies.
What you get
The investigation posts a summary into the incident channel and keeps it up to date as it runs, with the same detail available on the incident in the dashboard. It surfaces:
- A summary: the headline conclusion in plain language.
- Findings: the hypotheses that survived, each with its confidence and supporting evidence.
- Evidence: links straight back to the source, such as the Slack message, the pull request, the dashboard, or the log line.
See Incident channel experience for what this looks like and how to interact with it through @incident.
Investigate alongside the agent
The investigation runs centrally, but you don’t have to work apart from it. With the incident.io desktop app, you can pull a live investigation into a local coding agent such as Claude Code, Codex, or Cursor and investigate side by side, each of you informing the other. It works as a loop:
- Pull the investigation in: your local agent downloads the full investigation, including its findings, the checks it ran, the incident context, and the conversation so far. It can read all of this alongside your actual codebase.
- Get live updates: as the central investigation learns more, your local agent keeps in sync, so you’re always working from its latest thinking.
- Send what you find back: when you spot something the investigation hasn’t (the real cause, a misleading metric, a wrong turn it’s taking), you can steer it. Your local agent feeds that back, with evidence, and the central investigation re-assesses its hypothesis within a few minutes. Your input is attributed in the incident channel, so everyone sees where a change in direction came from.
The desktop app connects incident.io to your local agent over the incident.io MCP. Mention an incident by reference (like INC-123) and a capable agent can pull it in and start working.
Tuning how deep investigations go
Investigations run more than one pass of analysis by default, refining the hypothesis each time. More passes mean a more thorough investigation but a slower one. If you’d like to adjust how deep investigations go for your organization, reach out to us.
Where to go next
Connect your data
The sources an investigation draws on, and how to set each one up.
Triggering investigations
Decide when investigations run.