What opens an incident
Three triggers:

| Trigger | When it fires | Severity |
|---|---|---|
| Deny spike | An agent’s deny count exceeds its rolling 1-hour baseline by N standard deviations. | warning (≥ 3σ) or critical (≥ 5σ). |
| Critical DLP | A single tool call carries a high-severity DLP detection. | critical. |
| Manual | An operator opens one via New incident. | Operator-chosen. |
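
The deny-spike thresholds in the table can be illustrated with a small z-score check. This is a minimal sketch, not the product's implementation: the function name `deny_spike_severity`, the idea of passing the baseline as a list of per-interval deny counts, and the floor on the standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence

WARNING_SIGMA = 3.0   # from the table: warning at >= 3 sigma
CRITICAL_SIGMA = 5.0  # from the table: critical at >= 5 sigma
MIN_SIGMA = 1.0       # assumption: floor so a flat baseline cannot divide by zero


def deny_spike_severity(baseline: Sequence[int], current: int) -> Optional[str]:
    """Return 'critical', 'warning', or None for the current deny count.

    `baseline` is a hypothetical list of per-interval deny counts taken from
    the agent's rolling 1-hour window. How the window is bucketed in the real
    system is not documented here.
    """
    if len(baseline) < 2:
        return None  # not enough history to estimate a deviation
    mu = mean(baseline)
    sigma = max(pstdev(baseline), MIN_SIGMA)
    z = (current - mu) / sigma
    if z >= CRITICAL_SIGMA:
        return "critical"
    if z >= WARNING_SIGMA:
        return "warning"
    return None
```

With a baseline of `[2, 4, 2, 4, 2, 4]` (mean 3, standard deviation 1), a current count of 6 sits at 3σ and maps to `warning`, while 8 sits at 5σ and maps to `critical`.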
Opening one manually
Click New incident in the top-right. The modal asks for:

- Agent: required. Single-select over the org’s enrolled agents.
- Severity: `warning` or `critical`. Pill toggle.
- Summary: short free text. Becomes the incident title.
- Freeze this agent now: optional checkbox. Tick it if the whole reason you’re opening the incident is to stop the agent immediately.

After you submit:

- A new manual incident is opened. The operation is idempotent: if the agent already has an open or acknowledged incident, the modal returns the existing one rather than creating a duplicate.
- The agent’s recent denies (last hour, capped at 50) are auto-attached so the detail page opens with real evidence, not an empty shell.
- If Freeze this agent now was checked, the agent’s status flips to `frozen` and the SDK stops issuing tool calls on its next pull (at most 30 s later).
- Both the manual open and (if applicable) the freeze are written to the audit ledger on the same tamper-evident chain as everything else.

You are then taken to /incidents/[id] to keep working.
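
The manual-open behaviour described above (idempotent open, auto-attached denies, optional freeze, audit entries) can be sketched with an in-memory store. Everything here is hypothetical: `IncidentStore`, `open_manual`, the audit strings, and the choice to keep the 50 most recent denies are assumptions for illustration, not the product's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List

ACTIVE = {"open", "acknowledged"}   # statuses that block a duplicate incident
MAX_ATTACHED_DENIES = 50            # cap stated in the docs
LOOKBACK = timedelta(hours=1)       # "last hour" stated in the docs


@dataclass
class Deny:
    agent_id: str
    at: datetime


@dataclass
class Incident:
    id: int
    agent_id: str
    severity: str
    summary: str
    status: str = "open"
    events: List[Deny] = field(default_factory=list)


class IncidentStore:
    def __init__(self) -> None:
        self.incidents: List[Incident] = []
        self.agent_status: Dict[str, str] = {}
        self.audit_log: List[str] = []  # stand-in for the audit ledger

    def open_manual(self, agent_id: str, severity: str, summary: str,
                    denies: List[Deny], now: datetime,
                    freeze: bool = False) -> Incident:
        # Idempotency: return the agent's active incident if one exists.
        # Assumption: the sketch returns early without applying the freeze;
        # the docs do not say what happens to the checkbox in this case.
        for inc in self.incidents:
            if inc.agent_id == agent_id and inc.status in ACTIVE:
                return inc
        # Attach denies from the last hour. Keeping the most recent 50 is an
        # assumption; the docs only say "capped at 50".
        recent = [d for d in denies
                  if d.agent_id == agent_id and now - d.at <= LOOKBACK]
        recent.sort(key=lambda d: d.at, reverse=True)
        inc = Incident(len(self.incidents) + 1, agent_id, severity, summary,
                       events=recent[:MAX_ATTACHED_DENIES])
        self.incidents.append(inc)
        self.audit_log.append(f"incident.opened:{inc.id}")
        if freeze:
            self.agent_status[agent_id] = "frozen"
            self.audit_log.append(f"agent.frozen:{agent_id}")
        return inc
```

Calling `open_manual` twice for the same agent returns the same `Incident` object the second time, which mirrors the "returns the existing one" behaviour.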
Triage UI
Two panes:

- Left: queue. Tabs filter by status (`all`, `open`, `acknowledged`, `resolved`). Each row shows the incident title, agent, reason, deny count, a sparkline of the last hour, and the opened and last-event times.
- Right: detail. Severity, title, agent, reason, status, event count, and deny rate over the last hour as an area chart. Action buttons: Ack (mark acknowledged) and Freeze agent (link to detail page).
Live updates
The page subscribes to a websocket. New denies, severity changes, and status transitions arrive in real time, so you don’t need to refresh. A red badge next to “Incidents” in the sidebar shows the count of currently-open incidents.
Incident detail page
Click View full incident → in the detail pane to open the dedicated page. It shows:

- Hero with severity, status, title, opened/last-event timestamps.
- Stats: opened-at, last-event-at, event count, deny count.
- Remediation strip — one-click actions: ack, resolve, freeze agent, jump to the agent detail.
- Events table — every audit event attached to this incident, in time order. Click a row for the trace drawer.
Acknowledging vs resolving
| Action | Effect |
|---|---|
| Ack | Sets status to acknowledged. Stops the red badge from incrementing. The incident is still considered “active” — new events still attach. |
| Resolve | Sets status to resolved. New events on the same agent + reason will open a new incident rather than reopening this one. |
| Reopen | Sets a resolved incident back to open. Use sparingly — usually a new incident is the cleaner trail. |
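
The table implies a small state machine plus a routing rule for new events. The following is a rough sketch under assumptions: the names `ack`, `resolve`, `reopen`, and `attach_event` are hypothetical, and the guards on which status each action accepts are a guess rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import List

ACTIVE = ("open", "acknowledged")  # both still accept new events


@dataclass
class Incident:
    agent_id: str
    reason: str
    status: str = "open"
    event_count: int = 0


def ack(inc: Incident) -> None:
    # Assumption: only an open incident can be acknowledged.
    if inc.status == "open":
        inc.status = "acknowledged"


def resolve(inc: Incident) -> None:
    inc.status = "resolved"


def reopen(inc: Incident) -> None:
    # Only a resolved incident goes back to open.
    if inc.status == "resolved":
        inc.status = "open"


def attach_event(incidents: List[Incident], agent_id: str, reason: str) -> Incident:
    """Attach an event to the active incident for agent + reason, else open a new one."""
    for inc in incidents:
        if (inc.agent_id == agent_id and inc.reason == reason
                and inc.status in ACTIVE):
            inc.event_count += 1
            return inc
    # No active match (none exists, or the match is resolved): start fresh.
    new = Incident(agent_id, reason, event_count=1)
    incidents.append(new)
    return new
```

An acknowledged incident keeps collecting events, while a resolved one is skipped by `attach_event`, so the next matching event starts a separate incident.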
Freezing an agent from an incident
The detail pane’s Freeze agent button is the fastest way to stop an out-of-control agent. It:

- Sets the agent’s status to `frozen`.
- Revokes the agent’s current identity (so refresh is rejected next cycle).
- Records a freeze event on the incident’s timeline so the audit trail shows what happened.
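
The three freeze steps, and the rejected refresh that follows, can be sketched as below. This is an illustrative model only: `freeze_agent`, `refresh_identity`, the `"active"` status name, the token field, and the revocation set are assumptions, not the real SDK or server interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Agent:
    id: str
    status: str = "active"              # hypothetical name for the non-frozen state
    identity_token: Optional[str] = None


@dataclass
class Incident:
    agent_id: str
    timeline: List[str] = field(default_factory=list)


def freeze_agent(agent: Agent, incident: Incident, revoked: Set[str]) -> None:
    # 1. Flip the agent's status.
    agent.status = "frozen"
    # 2. Revoke the current identity so the next refresh fails.
    if agent.identity_token is not None:
        revoked.add(agent.identity_token)
    # 3. Record the freeze on the incident's timeline.
    incident.timeline.append(f"freeze:{agent.id}")


def refresh_identity(agent: Agent, revoked: Set[str]) -> bool:
    """Return True if a refresh would be accepted in this sketch."""
    return agent.status != "frozen" and agent.identity_token not in revoked
```

Before the freeze, `refresh_identity` returns `True` for an agent with a valid token. After `freeze_agent` runs, it returns `False`.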