docs(problems): add MCP configuration drift problem doc #2011
Describes the threat of silent MCP config modification as an escalation vector, with approaches for baseline-and-diff, immutable harness input, and content-aware validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Kapner <bkapner@redhat.com>
Review findings: Low, Info
Previous run review findings: Medium, Low
Add README.md entry. Add TOFU bootstrapping risk to baseline-and-diff trade-offs. Correct SSRF coverage characterization for MCP connections. Add cross-references to Security Threat Model, Governance, and Agent Architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Kapner <bkapner@redhat.com>
Addressed all findings in bb260f0.

[missing-doc] Added a bullet point entry in README.md linking to

[edge-case] Added a TOFU bootstrapping risk note to the Approach 1 trade-offs: "if the first run occurs against an already-compromised config, the baseline captures the malicious state and all subsequent runs pass. The baseline should be established from a known-good state or reviewed by a human before being trusted."

[technical-accuracy] Corrected the SSRF coverage characterization. The SSRF pretool hook operates on Bash and WebFetch tool calls, but MCP server connections are established by the runtime's MCP client, which may not flow through the tool-call hook mechanism. Replaced "last line of defense" with a more accurate description of partial coverage, and repositioned drift detection as the primary defense for MCP specifically.

[section-structure] Added a "Relationship to other problem areas" section with cross-references to Security Threat Model (MCP drift as a specific instance of Threat 1), Governance (who controls MCP config policy), and Agent Architecture (MCP configs define agent role boundaries).
MCP configuration files define what external tools and services an agent can access. They are the agent's permission surface: every tool server declared in the config becomes available to the agent at startup. Today, these configs are plain files in the repo or workspace, loaded at agent startup without any integrity check, and not monitored between runs.
This creates a security blind spot. The existing security hooks (Tirith, SSRF validator, canary detection, secret redactor) all operate at runtime, checking what the agent does after it starts. But nobody checks whether the configuration that defines the agent's entire tool surface was tampered with before the agent even started. It is like having a security guard at the door checking IDs, while someone quietly replaced the list of who is allowed in.
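To make the gap concrete, here is a minimal sketch of the kind of pre-start integrity check that is missing today. It is an illustration only, not the implementation proposed in the doc: the function names `config_digest` and `check_against_baseline` and the separately stored baseline digest file are hypothetical. It assumes the config is valid JSON and that the baseline was recorded from a reviewed, known-good state.

```python
import hashlib
import json
from pathlib import Path


def config_digest(config_path: Path) -> str:
    """Return the SHA-256 hex digest of the config's canonical JSON form.

    Canonicalizing (sorted keys, fixed separators) means whitespace or
    key-order changes do not register as drift; only content changes do.
    """
    parsed = json.loads(config_path.read_text(encoding="utf-8"))
    canonical = json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_against_baseline(config_path: Path, baseline_path: Path) -> bool:
    """Return True if the config matches the recorded baseline digest.

    Hypothetical helper: raises FileNotFoundError when no baseline exists,
    instead of silently trusting whatever config is present on first use.
    """
    if not baseline_path.exists():
        raise FileNotFoundError(f"no reviewed baseline at {baseline_path}")
    expected = baseline_path.read_text(encoding="utf-8").strip()
    return config_digest(config_path) == expected
```

A harness could call `check_against_baseline` before launching the agent and refuse to start when it returns `False`. Failing when no baseline exists is a deliberate choice in this sketch: it avoids recording an already-modified config as the trusted state.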
If an attacker (or a compromised agent, or even an unreviewed PR) modifies a .mcp.json file, they can:

Existing defenses are insufficient for this specific threat:
This extends the "persistent injection via externally editable resources" concern already identified under Threat 1 in the security threat model, applying it specifically to MCP configurations.
The doc proposes three defense approaches with trade-offs: