Choosing which agent runs
Which agent writes the change is an explicit choice. In Settings → AI SRE → Code changes you pick one of three options: incident.io’s built-in agent, Cursor, or your own custom platform. The one you select runs for every code change, so connecting more than one is fine: your selection is always what’s used.
Cursor
Connect Cursor with an API key, and code change requests are handed to a Cursor cloud agent that makes the change and opens the pull request. It runs with your Cursor configuration, including any MCP servers you’ve set up for your repositories, and the resulting pull request is linked back to the incident exactly as the built-in flow would.
If you’d like us to support another hosted agent out of the box, such as Devin, get in touch and we’ll work with you.
Your own platform
If you run an internal coding agent platform, you can connect it directly by implementing the small HTTP interface specified below. We call you to launch and steer agent runs, and you call us back (or let us poll) with status. Your agent opens the pull request with its own credentials in your own infrastructure. We only ever need read access to the repository to attach the finished PR to the incident.
Connect your platform from Settings → AI SRE → Code changes → Custom agent in the dashboard. You’ll need your platform’s endpoint URL (which must start with https://) and a bearer token it issued for us. Everything else, including the webhook signing secret, is handled automatically. You can also give the connection a display name, which Slack uses when narrating the agent’s progress.
Each organization can connect one custom agent platform today. If your agents are split across several internal systems, put a thin router in front of them and connect that. The repository field on each launch tells you where to route. Get in touch if you’d like first-class support for multiple connections.
How a delegated task flows
- A responder asks for a code change (for example from the incident channel), and we render a self-contained markdown brief: the incident title and reference, the repository, the user’s instructions, our expectations for the PR, and a snapshot of the investigation so far.
- We call POST /agents on your platform with the brief and structured identifiers, plus a per-launch webhook URL and signing secret.
- Your platform starts a run and responds with a run ID and a link to its own UI, which we surface in Slack.
- While the run is in flight, you send signed status callbacks to the webhook, or we poll GET /agents/{id} every 2 minutes. Both work.
- Your agent opens a draft PR with its own GitHub credentials and reports finished with the PR URL. We attach the PR to the incident and move the task into review.
- If a responder asks for changes, we call POST /agents/{id}/followup with the new instructions, and the same run updates the same PR.
The interface you implement
All endpoints are served from the base URL you configure, authenticated with the bearer token you issue us (Authorization: Bearer <token>). Every request we send carries an X-Incident-Interface-Version header (currently 2026-06-12) so you can tell which revision of this contract you’re being called with. Breaking changes ship under a new version value, announced ahead of time.
| Endpoint | Required? | Purpose |
|---|---|---|
| POST /agents | Required | Launch a run |
| GET /agents/{id} | Required | Report run status |
| POST /agents/{id}/followup | Strongly recommended | Send follow-up instructions to a run |
| POST /agents/{id}/stop | Optional | Cancel a run (best effort) |
| GET /ping | Optional | Used by the dashboard’s “Test connection” button |
| Webhook (you call us) | Recommended | Push status instead of waiting for our poll |
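The bearer-token check and version header described above could be handled along these lines. This is a minimal sketch, not a reference implementation: EXPECTED_TOKEN, check_request and the choice to reject unknown interface versions with a 400 are all assumptions made for illustration.

```python
import hmac

# Hypothetical value: the bearer token your platform issued to incident.io.
EXPECTED_TOKEN = "example-token"

# Interface revisions this sketch knows how to serve (value taken from the text above).
SUPPORTED_VERSIONS = {"2026-06-12"}


def check_request(headers: dict[str, str]) -> tuple[int, str]:
    """Return an (HTTP status, reason) pair for an incoming call's headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}

    scheme, _, token = normalised.get("authorization", "").partition(" ")
    # compare_digest avoids leaking the token through timing differences.
    token_ok = hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())
    if scheme != "Bearer" or not token_ok:
        return 401, "invalid bearer token"

    version = normalised.get("x-incident-interface-version", "")
    if version not in SUPPORTED_VERSIONS:
        # Rejecting unknown versions is this sketch's choice, not a documented rule.
        return 400, f"unsupported interface version: {version!r}"

    return 200, "ok"
```

Running the same check in front of every endpoint keeps the individual handlers free of authentication logic.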
POST /agents: launch
Request fields:
- task_id is our identifier for the request. Treat it as an idempotency key: if you receive a second launch with a task_id you’ve already started, return the existing run rather than starting another.
- prompt is the rendered brief. It’s self-contained, so an agent with no other context can act on it.
- incident_id, reference and investigation_id are also embedded in the prompt, but exposed as structured fields so your harness can fetch live data through our MCP server without parsing markdown. They’re omitted when the task has no incident or investigation.
- webhook is where to send status callbacks for this run, and the secret to sign them with. You may ignore it and rely on polling, but webhook-driven updates feel noticeably faster in Slack.
Response fields:
- id is your opaque run identifier. We include it in every subsequent call.
- url (optional) links to your platform’s UI for the run and is shown to responders in Slack.
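As an illustration of treating task_id as an idempotency key, a launch handler might look roughly like the sketch below. The in-memory store, the launch function and the agents.example.com URL are hypothetical. The webhook value is stored as received because its inner shape is not spelled out here.

```python
import uuid

# In-memory stand-in for your platform's run store, keyed by incident.io's task_id.
_runs_by_task: dict[str, dict] = {}


def launch(body: dict) -> dict:
    """Handle POST /agents: start a run, or return the existing one for a repeated task_id."""
    task_id = body["task_id"]

    existing = _runs_by_task.get(task_id)
    if existing is not None:
        # Second launch for the same task: return the original run, don't start another.
        return {"id": existing["id"], "url": existing["url"]}

    run_id = uuid.uuid4().hex
    run = {
        "id": run_id,
        # Hypothetical UI link; substitute your platform's real run page.
        "url": f"https://agents.example.com/runs/{run_id}",
        "prompt": body["prompt"],
        "repository": body.get("repository"),
        # Optional structured fields: absent when there is no incident or investigation.
        "incident_id": body.get("incident_id"),
        "investigation_id": body.get("investigation_id"),
        # Kept per run so callbacks go to this launch's webhook with this launch's secret.
        "webhook": body.get("webhook"),
        "status": "running",
    }
    _runs_by_task[task_id] = run
    # A real platform would enqueue the agent here.
    return {"id": run["id"], "url": run["url"]}
```

A production version would need the lookup and insert to be atomic (for example a unique constraint on task_id) so that two concurrent launches cannot both start a run.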
GET /agents/{id}: status
- status is one of running, finished or error.
- summary is a short progress message. We show the latest one in Slack, so keep it human-readable and current.
- pr_url is required when reporting finished. A finished status without a PR URL is treated as a failure, not as “almost done”, so report running until the PR exists.
- branch (optional) is the feature branch the agent pushed to.
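One way to avoid accidentally reporting finished without a PR is to build every status body through a single helper that enforces the rule. The helper name status_body is an assumption. The field names are the ones listed above.

```python
from typing import Optional

ALLOWED_STATUSES = {"running", "finished", "error"}


def status_body(
    status: str,
    summary: str,
    pr_url: Optional[str] = None,
    branch: Optional[str] = None,
) -> dict:
    """Build the JSON body for GET /agents/{id}, refusing invalid combinations."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if status == "finished" and not pr_url:
        # finished without a PR URL is treated as a failure, so fail early on our side.
        raise ValueError("report 'running' until the PR exists")

    body = {"status": status, "summary": summary}
    # Optional fields are left out entirely when they have no value.
    if pr_url:
        body["pr_url"] = pr_url
    if branch:
        body["branch"] = branch
    return body
```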
POST /agents/{id}/followup: iteration
Report running to status calls while the follow-up is in progress, then finished with the same pr_url once the PR is updated.
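The status transitions for a follow-up can be sketched as two small state updates. The function names and the pending_instructions key are hypothetical. The sketch takes the instructions as a plain string rather than assuming a request body shape.

```python
def apply_followup(run: dict, instructions: str) -> dict:
    """Return the run's state after accepting follow-up instructions."""
    updated = dict(run)
    # Status calls should see 'running' again while the agent reworks the PR.
    updated["status"] = "running"
    updated["summary"] = "Working on follow-up instructions"
    updated["pending_instructions"] = instructions
    # pr_url is deliberately left untouched: the same run updates the same PR.
    return updated


def complete_followup(run: dict, summary: str) -> dict:
    """Return the run's state once the existing PR has been updated."""
    updated = dict(run)
    updated["status"] = "finished"
    updated["summary"] = summary
    updated.pop("pending_instructions", None)
    return updated
```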
POST /agents/{id}/stop: cancel
Best effort, no body. We call it when a responder abandons or retries a task. Returning 404 for a run you’ve already cleaned up is fine.
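A best-effort stop handler can be very small. In this sketch the runs dictionary and the stop function are hypothetical, and returning 200 on success is an assumption, since only the 404 case is described above.

```python
def stop(runs: dict[str, dict], run_id: str) -> int:
    """Handle POST /agents/{id}/stop and return the HTTP status code to send."""
    run = runs.pop(run_id, None)
    if run is None:
        # Already cleaned up (or never existed): 404 is acceptable here.
        return 404
    # A real platform would signal the agent process to cancel at this point.
    run["status"] = "error"
    run["summary"] = "Cancelled at the responder's request"
    return 200
```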
GET /ping: connection test
Return 200 to confirm the endpoint is reachable and the bearer token is valid. The dashboard’s “Test connection” button calls this; platforms that don’t implement it still work, but admins lose the ability to verify the connection before the first real launch.
The webhook you send us
POST the same JSON body as the status response to the per-launch webhook URL, signed with the per-launch secret.
- Send one whenever the run meaningfully progresses: on completion at minimum, and ideally on summary changes too.
- Our handling is idempotent, and the webhook and our poller converge on the same logic, so duplicate or out-of-order deliveries are harmless.
- You don’t need a retry pipeline: if a delivery fails, our 2-minute poll picks the status up. (Retries are welcome all the same.)
- We respond 200 even for runs we no longer track, so you can fire-and-forget.
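Signing a callback might look like the sketch below. The exact signature scheme is not reproduced on this page, so the details here are assumptions to be checked against the real contract: HMAC-SHA256 over the raw request body, hex-encoded, sent in a header that this sketch calls X-Example-Signature.

```python
import hashlib
import hmac
import json

# Hypothetical header name; use whatever the interface contract specifies.
SIGNATURE_HEADER = "X-Example-Signature"


def sign_callback(secret: str, status: dict) -> tuple[bytes, dict[str, str]]:
    """Serialise a status body and compute the headers to send with it."""
    # Serialise once and sign those exact bytes, so the signature matches what is sent.
    payload = json.dumps(status, separators=(",", ":")).encode("utf-8")
    # Assumed scheme: HMAC-SHA256 of the body, keyed with the per-launch secret.
    signature = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        SIGNATURE_HEADER: signature,
    }
    return payload, headers
```

The returned bytes and headers can be passed to any HTTP client. Because a missed delivery is picked up by polling, a failed POST can simply be logged.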
Timing expectations
| What | Expectation |
|---|---|
| Response to any of our calls | Within 10 seconds |
| Our polling cadence | Every 2 minutes per active run |
| Maximum run length | 2 hours, after which we mark the task as failed |
| Webhook deliveries | Any time during a run; completion at minimum |
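The limits in the table translate into a simple time budget your platform can track per run. In this sketch the function names and the 10-minute wrap-up margin are assumptions. Only the 2-minute and 2-hour figures come from the table.

```python
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(minutes=2)   # polling cadence per active run
MAX_RUN_LENGTH = timedelta(hours=2)    # after this the task is marked as failed

# Upper bound on how many polls a single run can see.
MAX_POLLS = MAX_RUN_LENGTH // POLL_INTERVAL


def time_remaining(started_at: datetime, now: datetime) -> timedelta:
    """How long the run has left before the 2-hour limit, never negative."""
    return max(MAX_RUN_LENGTH - (now - started_at), timedelta(0))


def should_wrap_up(
    started_at: datetime,
    now: datetime,
    margin: timedelta = timedelta(minutes=10),  # hypothetical safety margin
) -> bool:
    """True once the run is close enough to the limit that it should open its PR."""
    return time_remaining(started_at, now) <= margin
```

For example, should_wrap_up(started, datetime.now(timezone.utc)) can be checked between agent steps, so a long run reports finished or error itself rather than being timed out.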
What this interface doesn’t do
Instructions flow one way. There is no way for your agent to ask the responder a question mid-run (no elicitation step). If the brief is ambiguous, the agent should make a reasonable choice and explain it in the PR description. Responders iterate after the fact through follow-ups, which is the intended loop.
Giving your agent live incident data
The brief embeds an investigation snapshot capped at 30,000 characters. For full, current context, configure your agent harness with access to our MCP server using an incident.io API key:
- investigation_sync downloads the complete investigation (findings, hypotheses, check outputs) as an archive via a short-lived signed URL.
- incident_show and alert_show fetch the live incident and related alerts.
Security model
There are three credentials in play, each scoped to one direction:
- Us → your API: the bearer token you issue and paste into our dashboard. We store it encrypted and send it on every call. Rotate it by reconnecting with a new token.
- You → our webhook: the HMAC secret we generate when you connect, delivered to you inside each launch request. You never need to store it long-term. Sign each run’s callbacks with the secret that run was launched with.
- Your agent → our MCP: an incident.io API key you create and configure into your agent environment, governed by the same scopes as any other API key.
Related
Making code changes
How responders ask for a fix from the incident channel.
Remote MCP
Give your agent live incident and investigation data.