MCP Tools for Webhook Recovery — Let Claude or Cursor Drive the Fix
The clusters page, replay-with-edit modal, and pattern hints we shipped over the last three weeks are all the same loop: triage → probe → fix → confirm → fan out. Today that loop is callable from MCP, so any AI assistant can drive recovery end to end.
The Loop, In An Agent
Failure clusters, replay-with-edit, the failure pattern library, the recent-changes tab — these are all pieces of one workflow:
- Find the active incident.
- Look at a sample failure.
- Form a hypothesis about the cause.
- Try a fix on one delivery.
- Confirm green.
- Apply the fix across the whole cluster.
Every step there is a tool call. We're shipping the MCP tools today so an AI assistant — Claude Code, Cursor, or any MCP-aware client — can run the entire loop for you.
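As a rough illustration, the six steps above can be written down as an ordered plan of tool names. This is a sketch, not an official mapping: the step labels and the `RECOVERY_LOOP` and `toolsFor` names are made up here, and the tools paired with the hypothesis step are only one plausible choice, since that step is mostly the model reasoning over context.

```typescript
// Sketch only: the step-to-tool pairing is one reading of this post, not an official mapping.
type LoopStep =
  | "find_incident"
  | "inspect_sample"
  | "form_hypothesis"
  | "try_fix"
  | "confirm"
  | "fan_out";

interface PlannedCall {
  step: LoopStep;
  tools: string[]; // tool names as they appear in this post
}

const RECOVERY_LOOP: readonly PlannedCall[] = [
  { step: "find_incident", tools: ["hookbase_list_delivery_clusters"] },
  { step: "inspect_sample", tools: ["hookbase_get_delivery"] },
  // Assumption: hypothesis-forming leans on context tools such as these.
  { step: "form_hypothesis", tools: ["hookbase_get_route", "hookbase_list_audit_logs"] },
  { step: "try_fix", tools: ["hookbase_replay_with_edit"] },
  { step: "confirm", tools: ["hookbase_get_delivery"] },
  { step: "fan_out", tools: ["hookbase_replay_cluster"] },
];

// Returns a copy of the tools an agent might reach for at a given step.
function toolsFor(step: LoopStep): string[] {
  const planned = RECOVERY_LOOP.find((p) => p.step === step);
  if (!planned) throw new Error(`unknown step: ${step}`);
  return [...planned.tools];
}
```

Writing the loop as data makes it easy to see that only the last step touches more than one delivery.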
What's New in hookbase-mcp-server
Three new tools join the dozens already in the server:
hookbase_replay_with_edit
Same primitive as the HTTP endpoint, exposed through MCP. The schema is the override matrix you already know:
{
  delivery_id: string;
  modified_payload?: any;
  destination_override?: string;
  transform_override?: { code, type?, input_format?, output_format? };
  headers_override?: Record<string, string>;
  persist_transform?: boolean;
}
The agent can probe a fix on one delivery, get back the new delivery id, then poll hookbase_get_delivery until the replayed delivery succeeds or fails.
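A minimal sketch of that probe-then-poll step, assuming a generic `callTool` function supplied by your MCP client. Several things here are assumptions rather than documented behavior: the `CallTool` type, the `probeFix` helper, the `delivery_id` field on the replay response, and the `"success"` status value (the post only shows `status=failed`).

```typescript
// Hypothetical stand-in for whatever your MCP client uses to invoke a tool.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

// Mirrors the override fields from the schema above; inner shapes are left loose.
interface ReplayOverrides {
  modified_payload?: unknown;
  destination_override?: string;
  transform_override?: Record<string, unknown>;
  headers_override?: Record<string, string>;
  persist_transform?: boolean;
}

interface ProbeOptions {
  intervalMs?: number;
  maxAttempts?: number;
  terminalStatuses?: string[]; // assumed values, adjust to the real API
}

async function probeFix(
  callTool: CallTool,
  deliveryId: string,
  overrides: ReplayOverrides,
  opts: ProbeOptions = {},
): Promise<{ deliveryId: string; status: string }> {
  const { intervalMs = 2000, maxAttempts = 15, terminalStatuses = ["success", "failed"] } = opts;

  // Replay a single delivery with the candidate fix applied.
  const replay = await callTool("hookbase_replay_with_edit", {
    delivery_id: deliveryId,
    ...overrides,
  });
  const newId = replay["delivery_id"]; // assumed response field name
  if (typeof newId !== "string") throw new Error("replay response had no delivery id");

  // Poll the new delivery until it reaches a terminal status or we give up.
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const delivery = await callTool("hookbase_get_delivery", { delivery_id: newId });
    const status = String(delivery["status"]);
    if (terminalStatuses.includes(status)) return { deliveryId: newId, status };
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return { deliveryId: newId, status: "timeout" };
}
```

Capping the attempts keeps an agent from polling forever when a delivery stays queued; the `"timeout"` result is a local convention of this sketch, not a Hookbase status.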
hookbase_list_delivery_clusters
The triage entry point. Returns the same data the clusters page shows, including the escalating flag from earlier this week. An agent investigating an alert can call this first to find out which cluster is the active fire and grab the sample delivery id to start the loop.
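To make the triage step concrete, here is one way an agent-side helper might choose which cluster to start with. The `DeliveryCluster` shape is a guess: `fingerprint`, `escalating`, `routeName`, and `errorMessageExcerpt` appear in the example run later in this post, while `deliveryCount` and `sampleDeliveryId` are hypothetical field names. The ranking rule (escalating first, then largest) is also just one reasonable policy.

```typescript
interface DeliveryCluster {
  fingerprint: string;
  escalating: boolean;
  routeName: string;
  errorMessageExcerpt: string;
  deliveryCount: number; // assumed field name
  sampleDeliveryId: string; // assumed field name
}

// Picks the cluster to investigate first: escalating clusters win,
// ties are broken by the number of failed deliveries.
function pickActiveCluster(clusters: DeliveryCluster[]): DeliveryCluster | undefined {
  const ranked = [...clusters].sort((a, b) => {
    if (a.escalating !== b.escalating) return a.escalating ? -1 : 1;
    return b.deliveryCount - a.deliveryCount;
  });
  return ranked[0];
}
```

The returned cluster's sample delivery id is what the agent would pass to hookbase_get_delivery next.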
hookbase_replay_cluster
Fan out the fix. After verifying a replay with hookbase_replay_with_edit, the agent calls this with the cluster fingerprint and the same override shape. Every delivery in the cluster (up to 2,000) gets re-queued with the fix. persist_transform still requires the cluster to live on a single route.
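The two constraints above (the 2,000-delivery cap and the single-route rule for persist_transform) are easy to check before the call is made. The sketch below assumes a `ClusterSummary` shape with `deliveryCount` and `routeIds` fields and a `fingerprint` argument name; none of these are confirmed by the tool schema, and the server presumably enforces the same rules itself.

```typescript
const CLUSTER_REPLAY_CAP = 2000; // cap stated in this post

interface ClusterSummary {
  fingerprint: string;
  deliveryCount: number; // assumed field name
  routeIds: string[]; // assumed field name
}

interface ClusterOverrides {
  modified_payload?: unknown;
  destination_override?: string;
  transform_override?: Record<string, unknown>;
  headers_override?: Record<string, string>;
  persist_transform?: boolean;
}

interface ClusterReplayPlan {
  args: Record<string, unknown>; // what would be sent to hookbase_replay_cluster
  expectedRequeued: number;
  leftOver: number; // deliveries beyond the cap that this call would not cover
}

function planClusterReplay(
  cluster: ClusterSummary,
  overrides: ClusterOverrides,
): ClusterReplayPlan {
  // Persisting a transform only makes sense when there is exactly one route to attach it to.
  const distinctRoutes = new Set(cluster.routeIds).size;
  if (overrides.persist_transform && distinctRoutes !== 1) {
    throw new Error("persist_transform requires the cluster to live on a single route");
  }
  return {
    args: { fingerprint: cluster.fingerprint, ...overrides },
    expectedRequeued: Math.min(cluster.deliveryCount, CLUSTER_REPLAY_CAP),
    leftOver: Math.max(0, cluster.deliveryCount - CLUSTER_REPLAY_CAP),
  };
}
```

Surfacing `leftOver` lets the agent tell the user up front when one fan-out call will not cover the whole cluster.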
A Concrete Run
A real loop, abbreviated:
Agent: hookbase_list_delivery_clusters({ since_hours: 1 })
  → cluster fingerprint=abc123, escalating=true, 47 deliveries,
    routeName="Stripe → BigQuery", errorMessageExcerpt="invalid signature"
Agent: hookbase_get_delivery({ delivery_id: <sample> })
  → status=failed, response_status=400, response_body="signature mismatch",
    rca_analysis.category=auth_signature
Agent thinks: looks like a rotated webhook secret.
  The destination's signing key may have been rotated.
Agent notifies user: "Cluster 'Stripe → BigQuery' is escalating
  (47 deliveries, 12.4/min vs 0.3/min baseline). Looks like a
  signature rejection — your destination's verifying secret may
  have been rotated. I can either replay these after you update
  the secret, or replay them to a staging URL for now. Which?"
The agent took no destructive action without your input. It walked the loop, gathered the context, and presented a decision.
The Existing Tools Still Apply
The replay tools compose with everything that was already there: hookbase_list_events, hookbase_get_event_debug, hookbase_get_route, hookbase_test_transform, hookbase_list_audit_logs. The new tools are the recovery primitives — the rest let the agent understand the system before reaching for them.
Install / Update
npm install -g hookbase-mcp-server
Configure your MCP client with your Hookbase API key and org id (the README has details). If you already have it installed globally, npm update -g hookbase-mcp-server picks up the new tools.
Where This Goes
This is the last piece of the inbound debugging UX initiative we've been shipping over the last three weeks. The big picture has been the same throughout: make incident recovery a single tight loop on a single seam, and then expose that loop to whatever drives it, whether that is the dashboard, the API, or an MCP-aware agent.
If you want to play with the agent-driven version, the MCP server is open. If you'd rather stick to the UI, the dashboard has every feature the agent calls into. Either way, the tools are the same underneath.