
RFC: Hookdeck MCP Server — Phase 1 Plan and Request for Feedback #228

@leggetter

Description


We're building a Hookdeck MCP server and we want community input before we ship. This issue lays out what we're planning to build, why we made the decisions we did, and the specific questions we'd love your help answering.

If you've been waiting for a Hookdeck MCP, or if you've thought about how you'd use one, please weigh in.


Background: why we're approaching this in two parts

There's an active debate in the developer tools space right now about what MCPs should actually do, and whether skills + CLI is a better fit for some workflows. We've thought about this carefully for Hookdeck and landed on a deliberate split.

The observation: the context in which you're building a webhook integration is fundamentally different from the context in which you're investigating a production issue.

  • When you're building, you're in an IDE with terminal access. You're setting up sources, destinations, and connections, scaffolding handler code, running a local tunnel, iterating on transformations. The Hookdeck CLI already handles all of this. We've published agent skills that teach AI agents (Cursor, Claude Code, etc.) the CLI workflows — setup, scaffolding, listen, iterate — without needing an MCP at all.

  • When you're investigating, you might be in a chat interface without a terminal. You're asking questions about what's happening in production: what's failing, why, is it still happening, what did the destination actually respond with. That's where an MCP earns its place — structured queries through natural language, no CLI required.

So instead of one MCP that tries to do everything, we're building purpose-built integrations for each context. Skills + CLI for development. MCP for investigation and operations.


Phase 1 scope: 11 tools

Phase 1 is focused entirely on investigation and production monitoring. The tool set is read-focused, with two lightweight operational exceptions (pause and unpause) that are natural responses to what you find during an investigation.

| Tool | Actions | What it's for |
| --- | --- | --- |
| projects | list, use | Set project context, switch between orgs |
| connections | list, get, pause, unpause | Inspect connections, pause a hammering connection |
| sources | list, get | Find source URLs, check configuration |
| destinations | list, get | Inspect destination config and auth |
| transformations | list, get | Read transformation code during debugging |
| requests | list, get | Inspect inbound requests from providers |
| events | list, get | Query events with filters, inspect payloads |
| attempts | list, get | See delivery attempts and destination responses verbatim |
| issues | list, get | Surface aggregated failure signals |
| metrics | events, requests, attempts, transformations | Aggregate stats: failure rates, error counts, over time |
| help | topic | Catch-all: redirects setup requests to skills + CLI, points operational queries to the right tool |

All tools use a compound pattern (one tool name, one action parameter) to keep the total surface area low. Anthropic's own data shows LLM tool-selection accuracy degrades above 30-50 tools; 11 compound tools sit well within the reliable range.
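To illustrate the compound pattern, here is a minimal sketch of how one tool could dispatch on an action parameter. The function name, action names, and return payloads are illustrative assumptions, not the shipped schema:

```python
# Hypothetical sketch of the compound-tool pattern: one tool name,
# one "action" parameter, dispatch handled inside the tool.
# Names and payloads are illustrative, not Hookdeck's actual schema.

def events_tool(action: str, **params) -> dict:
    """A single 'events' tool covering both the list and get actions."""
    if action == "list":
        # e.g. params: status="failed", limit=50
        return {"action": "list", "filters": params}
    if action == "get":
        # e.g. params: event_id="evt_123"
        return {"action": "get", "event_id": params["event_id"]}
    raise ValueError(f"Unknown action: {action!r}")

# One registry entry per compound tool keeps the advertised surface
# at 11 names instead of ~26 individual action tools.
TOOLS = {"events": events_tool}

result = TOOLS["events"]("get", event_id="evt_123")
```

The trade-off is that each tool's schema must document its valid actions clearly, since the model no longer gets one tool name per operation.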

The help tool is important. When someone asks the MCP to create a connection or start a tunnel, it won't find a write tool — but it will find help, which returns guidance pointing to skills + CLI. The MCP is never a dead end, and the topics people ask help about will tell us what to build in Phase 2.
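One way the fallback could behave (a hypothetical sketch; the real tool's topics and wording may differ): map known topics to guidance strings, and answer unknown topics with a pointer to the investigation tools rather than an error:

```python
# Hypothetical sketch of the 'help' fallback tool. Topic names and
# guidance text are illustrative assumptions.

GUIDANCE = {
    "create-connection": (
        "Creating connections is handled by the Hookdeck CLI and agent "
        "skills, not this MCP. See the skills for the setup workflow."
    ),
    "local-tunnel": (
        "Run a local tunnel with the Hookdeck CLI's listen command."
    ),
}

def help_tool(topic: str) -> str:
    """Return guidance for a topic; never a dead end."""
    # Unknown topics still get an answer -- and could be logged as
    # Phase 2 signal about what users actually ask for.
    return GUIDANCE.get(
        topic,
        "No guidance for this topic yet. For production investigation, "
        "try the events, attempts, or issues tools.",
    )
```
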

What's intentionally out of scope for Phase 1:

  • Write/CRUD operations (create, update, delete for sources, destinations, connections) — handled by skills + CLI today
  • Event retry — this creates a new attempt and generates data, which feels like a different category from pause/unpause. We may add it in Phase 2 if people want it.
  • Documentation search — Phase 2
  • CLI delegation (use_hookdeck_cli) — Phase 3

Phased roadmap

Phase 1 (shipping soon): The 11 tools above. Runs locally as hookdeck gateway mcp via stdio. Requires the CLI to be installed and authenticated.
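For MCP clients that support stdio servers, registration would look something like this. The `mcpServers` key follows the common Claude Desktop convention; the exact config shape depends on your client, and this assumes the Hookdeck CLI is installed and authenticated:

```json
{
  "mcpServers": {
    "hookdeck": {
      "command": "hookdeck",
      "args": ["gateway", "mcp"]
    }
  }
}
```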

Phase 2 (contingent on Phase 1 feedback): A search_docs tool that serves Hookdeck documentation and agent skills content through the MCP — so agents in chat interfaces that don't have the skills installed can still access setup and workflow knowledge. Also: streamable HTTP transport, which is the prerequisite for a hosted/cloud MCP.

Phase 3 (contingent on Phase 2): A use_hookdeck_cli tool for environments that have both MCP and terminal access, letting the MCP delegate execution to the CLI. At this point, MCP becomes a single integration for both investigation and development.

The phasing is a hypothesis, not a commitment. If Phase 1 feedback is "actually I really want write tools through the MCP", that jumps the queue.


Why not write operations in Phase 1?

Most developer-tool MCPs ship with full CRUD: of the 10 we surveyed, 8 offer write operations. We're starting read-focused anyway, for two reasons:

First, we don't have clear signal that anyone wants write operations through the MCP. One user (Julian) asked for write operations, but when we walked through the workflow, skills + CLI covered what they needed. They were open to trying it. That's one person, and they haven't reported back yet.

Second, the observability tools closest to our use case (Sentry, Datadog) are predominantly read-focused through MCP. That's because their MCP value is surfacing production data for investigation, which is exactly our Phase 1 goal.

If users come back saying "I need the MCP to create sources", we add those tools. The help tool is specifically designed to capture that signal.


The skills + CLI split: an honest assessment

We're making a bet that for development workflows, the CLI is a better "runner" than MCP. The data supports this: one benchmark showed CLI-driven agents completing 28% more tasks on the same token budget, with roughly 95% of the context window left for reasoning, while MCP consumed more of it on tool schemas and response wrappers. Agents are already trained on CLI patterns (git, docker, gh); skills bridge the gap for unfamiliar CLIs like Hookdeck by teaching the agent the commands through markdown.

But this is a less-tested bet. The counter-argument is that MCP's structured tool schemas provide guidance agents need for multi-turn, stateful workflows. We think skills handle this adequately for the setup → scaffold → listen → iterate cycle, but we haven't proven it with broad usage data.

If agents struggle to maintain state across the development cycle using skills alone, that's a signal to reconsider. The path back is to add write tools to the MCP.


What we'd like to know from you

1. Does the skills + CLI split make sense for how you work?
If you're primarily in an IDE with terminal access (Cursor, Claude Code, VS Code), does the idea of installing agent skills and using the CLI for setup workflows work for you? Or do you find yourself wanting everything through one MCP integration?

2. What would you actually use a Hookdeck MCP for?
If you've thought about connecting Hookdeck to Claude, ChatGPT, or another agent — what were you trying to do? Investigation and debugging? Setup automation? Something else?

3. Are there tools missing from the Phase 1 list that you'd consider essential?
The 11-tool set is focused on investigation. Are there specific capabilities you'd want on day one that aren't here? (Note: we're deliberately keeping the first version small to learn from usage — the question is whether there are hard gaps.)

4. What do you think about event retry being out of Phase 1?
Our reasoning: pause/unpause only affect flow control, retry creates new data. Does that distinction feel right, or do you see retry as a natural part of the investigation workflow?

5. If you use cloud IDEs or non-terminal environments (Codespaces, Gitpod, ChatGPT, Claude.ai), does the requirement to install and run the CLI locally create a problem for you?
The local CLI approach sidesteps hosting infrastructure and authentication complexity, but it excludes some chat-only environments. Knowing how many people hit this friction helps us prioritize the hosted MCP.


Any feedback welcome, including "this split doesn't work for me and here's why." We'd rather hear the hard cases now than after Phase 1 ships.
