
RFC: Hookdeck MCP Server — Phase 1 Plan and Request for Feedback #228

@leggetter

Description


We're building a Hookdeck MCP server and we want community input before we ship. This issue lays out what we're planning to build, why we made the decisions we did, and the specific questions we'd love your help answering.

If you've been waiting for a Hookdeck MCP, or if you've thought about how you'd use one, please weigh in.


Background: why we're approaching this in two parts

There's an active debate in the developer tools space right now about what MCPs should actually do, and whether skills + CLI is a better fit for some workflows. We've thought about this carefully for Hookdeck and landed on a deliberate split.

The observation: the context in which you're building a webhook integration is fundamentally different from the context in which you're investigating a production issue.

  • When you're building, you're in an IDE with terminal access. You're setting up sources, destinations, and connections, scaffolding handler code, running a local tunnel, iterating on transformations. The Hookdeck CLI already handles all of this. We've published agent skills that teach AI agents (Cursor, Claude Code, etc.) the CLI workflows — setup, scaffolding, listen, iterate — without needing an MCP at all.

  • When you're investigating, you might be in a chat interface without a terminal. You're asking questions about what's happening in production: what's failing, why, is it still happening, what did the destination actually respond with. That's where an MCP earns its place — structured queries through natural language, no CLI required.

So instead of one MCP that tries to do everything, we're building purpose-built integrations for each context. Skills + CLI for development. MCP for investigation and operations.


Phase 1 scope: 11 tools

Phase 1 is focused entirely on investigation and production monitoring. The tool set is read-focused, with two lightweight operational exceptions (pause and unpause) that are natural responses to what you find during an investigation.

| Tool | Actions | What it's for |
| --- | --- | --- |
| projects | list, use | Set project context, switch between orgs |
| connections | list, get, pause, unpause | Inspect connections, pause a hammering connection |
| sources | list, get | Find source URLs, check configuration |
| destinations | list, get | Inspect destination config and auth |
| transformations | list, get | Read transformation code during debugging |
| requests | list, get | Inspect inbound requests from providers |
| events | list, get | Query events with filters, inspect payloads |
| attempts | list, get | See delivery attempts and destination responses verbatim |
| issues | list, get | Surface aggregated failure signals |
| metrics | events, requests, attempts, transformations | Aggregate stats: failure rates, error counts, over time |
| help | topic | Catch-all: redirects setup requests to skills + CLI, points operational queries to the right tool |

All tools use a compound pattern (one tool name, one action parameter) to keep the total surface area low. Anthropic's own data shows LLM tool-selection accuracy degrades above 30-50 tools; 11 compound tools sit well within the reliable range.
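To illustrate the compound pattern, here is a minimal sketch of how one tool could dispatch on an action parameter. The function name, action names, and return payloads are illustrative assumptions, not the shipped schema:

```python
# Hypothetical sketch of the compound-tool pattern: one tool name,
# one "action" parameter, dispatch handled inside the tool.
# Names and payloads are illustrative, not Hookdeck's actual schema.

def events_tool(action: str, **params) -> dict:
    """A single 'events' tool covering both the list and get actions."""
    if action == "list":
        # e.g. params: status="failed", limit=50
        return {"action": "list", "filters": params}
    if action == "get":
        # e.g. params: event_id="evt_123"
        return {"action": "get", "event_id": params["event_id"]}
    raise ValueError(f"Unknown action: {action!r}")

# One registry entry per compound tool keeps the advertised surface
# at 11 names instead of ~26 individual action tools.
TOOLS = {"events": events_tool}

result = TOOLS["events"]("get", event_id="evt_123")
```

The trade-off is that each tool's schema must document its valid actions clearly, since the model no longer gets one tool name per operation.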

The help tool is important. When someone asks the MCP to create a connection or start a tunnel, it won't find a write tool — but it will find help, which returns guidance pointing to skills + CLI. The MCP is never a dead end, and the topics people ask help about will tell us what to build in Phase 2.
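One way the fallback could behave (a hypothetical sketch; the real tool's topics and wording may differ): map known topics to guidance strings, and answer unknown topics with a pointer to the investigation tools rather than an error:

```python
# Hypothetical sketch of the 'help' fallback tool. Topic names and
# guidance text are illustrative assumptions.

GUIDANCE = {
    "create-connection": (
        "Creating connections is handled by the Hookdeck CLI and agent "
        "skills, not this MCP. See the skills for the setup workflow."
    ),
    "local-tunnel": (
        "Run a local tunnel with the Hookdeck CLI's listen command."
    ),
}

def help_tool(topic: str) -> str:
    """Return guidance for a topic; never a dead end."""
    # Unknown topics still get an answer -- and could be logged as
    # Phase 2 signal about what users actually ask for.
    return GUIDANCE.get(
        topic,
        "No guidance for this topic yet. For production investigation, "
        "try the events, attempts, or issues tools.",
    )
```
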

What's intentionally out of scope for Phase 1:

  • Write/CRUD operations (create, update, delete for sources, destinations, connections) — handled by skills + CLI today
  • Event retry — this creates a new attempt and generates data, which feels like a different category from pause/unpause. We may add it in Phase 2 if people want it.
  • Documentation search — Phase 2
  • CLI delegation (use_hookdeck_cli) — Phase 3

Phased roadmap

Phase 1 (shipping soon): The 11 tools above. Runs locally as hookdeck gateway mcp via stdio. Requires the CLI to be installed and authenticated.
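For MCP clients that support stdio servers, registration would look something like this. The `mcpServers` key follows the common Claude Desktop convention; the exact config shape depends on your client, and this assumes the Hookdeck CLI is installed and authenticated:

```json
{
  "mcpServers": {
    "hookdeck": {
      "command": "hookdeck",
      "args": ["gateway", "mcp"]
    }
  }
}
```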

Phase 2 (contingent on Phase 1 feedback): A search_docs tool that serves Hookdeck documentation and agent skills content through the MCP — so agents in chat interfaces that don't have the skills installed can still access setup and workflow knowledge. Also: streamable HTTP transport, which is the prerequisite for a hosted/cloud MCP.

Phase 3 (contingent on Phase 2): A use_hookdeck_cli tool for environments that have both MCP and terminal access, letting the MCP delegate execution to the CLI. At this point, MCP becomes a single integration for both investigation and development.

The phasing is a hypothesis, not a commitment. If Phase 1 feedback is "actually I really want write tools through the MCP", that jumps the queue.


Why not write operations in Phase 1?

Most developer-tool MCPs ship with full CRUD: of the 10 we surveyed, 8 offer write operations. We're starting read-focused anyway, for two reasons:

First, we don't have clear signal that anyone wants write operations through the MCP. One user (Julian) asked for write operations, but when we walked through the workflow, skills + CLI covered what they needed. They were open to trying it. That's one person, and they haven't reported back yet.

Second, the observability tools closest to our use case (Sentry, Datadog) are predominantly read-focused through MCP. That's because their MCP value is surfacing production data for investigation, which is exactly our Phase 1 goal.

If users come back saying "I need the MCP to create sources", we add those tools. The help tool is specifically designed to capture that signal.


The skills + CLI split: an honest assessment

We're making a bet that for development workflows, the CLI is a better "runner" than MCP. The data supports this: one benchmark showed CLI-driven agents completing 28% more tasks on the same token budget, with roughly 95% of the context window left for reasoning, while MCP consumed more of it on tool schemas and response wrappers. Agents are already trained on CLI patterns (git, docker, gh); skills bridge the gap for unfamiliar CLIs like Hookdeck by teaching the agent the commands through markdown.

But this is a less-tested bet. The counter-argument is that MCP's structured tool schemas provide guidance agents need for multi-turn, stateful workflows. We think skills handle this adequately for the setup → scaffold → listen → iterate cycle, but we haven't proven it with broad usage data.

If agents struggle to maintain state across the development cycle using skills alone, that's a signal to reconsider. The path back is to add write tools to the MCP.


What we'd like to know from you

1. Does the skills + CLI split make sense for how you work?
If you're primarily in an IDE with terminal access (Cursor, Claude Code, VS Code), does the idea of installing agent skills and using the CLI for setup workflows work for you? Or do you find yourself wanting everything through one MCP integration?

2. What would you actually use a Hookdeck MCP for?
If you've thought about connecting Hookdeck to Claude, ChatGPT, or another agent — what were you trying to do? Investigation and debugging? Setup automation? Something else?

3. Are there tools missing from the Phase 1 list that you'd consider essential?
The 11-tool set is focused on investigation. Are there specific capabilities you'd want on day one that aren't here? (Note: we're deliberately keeping the first version small to learn from usage — the question is whether there are hard gaps.)

4. What do you think about event retry being out of Phase 1?
Our reasoning: pause/unpause only affect flow control, retry creates new data. Does that distinction feel right, or do you see retry as a natural part of the investigation workflow?

5. If you use cloud IDEs or non-terminal environments (Codespaces, Gitpod, ChatGPT, Claude.ai), does the requirement to install and run the CLI locally create a problem for you?
The local CLI approach sidesteps hosting infrastructure and authentication complexity, but it excludes some chat-only environments. Knowing how many people hit this friction helps us prioritize the hosted MCP.


Any feedback welcome, including "this split doesn't work for me and here's why." We'd rather hear the hard cases now than after Phase 1 ships.
