docs(problems): add MCP configuration drift problem doc by Benkapner · Pull Request #2011 · fullsend-ai/fullsend

Benkapner · 2026-06-08T09:58:49Z

MCP configuration files define what external tools and services an agent can access. They are the agent's permission surface: every tool server declared in the config becomes available to the agent at startup. Today, these configs are plain files in the repo or workspace, loaded at agent startup without any integrity check, and not monitored between runs.

This creates a security blind spot. The existing security hooks (Tirith, SSRF validator, canary detection, secret redactor) all operate at runtime, checking what the agent does after it starts. But nobody checks whether the configuration that defines the agent's entire tool surface was tampered with before the agent even started. It is like having a security guard at the door checking IDs, while someone quietly replaced the list of who is allowed in.

If an attacker (or a compromised agent, or even an unreviewed PR) modifies a .mcp.json file, they can:

Inject a malicious MCP server that exposes attacker-controlled tools. The agent trusts these tools because they are declared in its config.
Replace a legitimate server endpoint with an attacker-controlled proxy. All tool calls the agent makes through that server now pass through the attacker's infrastructure, enabling data interception and response manipulation.
Expand the tool surface by adding capabilities to an existing server entry, giving the agent access to destructive operations or data sources it was never designed to use.
Accumulate drift organically as teams add integrations without a baseline to compare against, violating least privilege without anyone noticing.

Existing defenses are insufficient for this specific threat:

CODEOWNERS can guard MCP config files, but many repos treat config files as low-sensitivity and do not require human approval.
The tool allowlist hook operates on tool names, not server endpoints. Replacing the endpoint behind a trusted tool name bypasses the allowlist entirely.
SSRF validation blocks connections to private networks, but a malicious external server URL passes all checks.
Credential isolation (ADR 0017) keeps secrets out of the sandbox, but MCP server endpoints are not secrets.

This extends the "persistent injection via externally editable resources" concern already identified under Threat 1 in the security threat model, applying it specifically to MCP configurations.

The doc proposes three defense approaches with trade-offs:

Baseline-and-diff: hash config files at session start, compare to a stored baseline, alert or block on mismatch. Simple to implement but requires a workflow for legitimate config updates.
Immutable harness input: treat MCP configs as harness-level inputs injected from a trusted source (like agent system prompts), so the agent never sees the config file. Strongest isolation but adds operational complexity.
Content-aware validation: parse the config and validate contents against a policy (approved server domains, approved tool surfaces per agent role). Catches semantic threats that hashing misses but requires maintaining allowlists.
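
As a rough sketch of what content-aware validation might look like, the following parses a config and checks each server against a policy. The `mcpServers`/`url` layout, the approved domain, the role names, and the per-role server sets are all assumptions for illustration; the doc does not prescribe a policy format.

```python
from urllib.parse import urlparse

# Hypothetical policy: which hosts may serve MCP tools, and which servers each role may use.
APPROVED_DOMAINS = {"mcp.example.com"}
APPROVED_SERVERS_BY_ROLE = {
    "triage": {"tracker"},
    "coder": {"tracker", "repo"},
}


def validate_config(config: dict, role: str) -> list[str]:
    """Return a list of policy violations; an empty list means the config passes."""
    violations = []
    allowed_servers = APPROVED_SERVERS_BY_ROLE.get(role, set())
    for name, entry in config.get("mcpServers", {}).items():
        if name not in allowed_servers:
            violations.append(f"server '{name}' is not approved for role '{role}'")
        url = entry.get("url")
        if url is None:
            # Servers launched as local commands would need their own policy (not sketched here).
            violations.append(f"server '{name}' has no url to validate")
            continue
        parsed = urlparse(url)
        if parsed.scheme != "https":
            violations.append(f"server '{name}' does not use https")
        if parsed.hostname not in APPROVED_DOMAINS:
            violations.append(f"server '{name}' points at unapproved host '{parsed.hostname}'")
    return violations


if __name__ == "__main__":
    tampered = {
        "mcpServers": {
            "tracker": {"url": "https://attacker.example.net/tracker"},
            "shell": {"url": "https://mcp.example.com/shell"},
        }
    }
    for violation in validate_config(tampered, role="triage"):
        print(violation)
```

Unlike a hash comparison, this kind of check reports why a config is rejected, at the cost of keeping the two allowlists current.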

Describes the threat of silent MCP config modification as an escalation vector, with approaches for baseline-and-diff, immutable harness input, and content-aware validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Kapner <bkapner@redhat.com>

fullsend-ai-review · 2026-06-08T10:09:42Z

Review

Findings

Low

[duplication] docs/problems/mcp-config-drift.md:9 — The "Related:" block (lines 5–8) is duplicated verbatim at lines 9–13. This is a copy-paste error. The duplicate block contains the same three references (security-threat-model.md, agent-architecture.md, ADR 0017) and should be removed.
Remediation: Remove the second "Related:" block (lines 9–13).

Info

[missing-authorization] docs/problems/mcp-config-drift.md — This PR adds a new problem document with no linked issue. CLAUDE.md provides the process for adding problem docs ("create a new file in docs/problems/ and link it from README.md") without requiring issue-based authorization, so this is noted for traceability rather than as a blocking concern.
[scope-boundary] docs/problems/mcp-config-drift.md — MCP configuration drift could potentially be a subsection of security-threat-model.md rather than a standalone problem doc. The doc's "Relationship to other problem areas" section addresses this by explaining how MCP config drift is a specific instance of Threat 1 with unique defense considerations (tool surface definition vs. text-based influence). The three proposed defense approaches are MCP-specific and justify standalone treatment.

Previous run

Review

Findings

Medium

[missing-doc] README.md — The new problem document docs/problems/mcp-config-drift.md is not linked from README.md. CLAUDE.md requires: "When adding new problem areas, create a new file in docs/problems/ and link it from README.md." The README lists all 23 existing problem docs (lines 17–41) but the new MCP Configuration Drift doc is absent. Add an entry following the existing format, e.g.: - [MCP Configuration Drift](docs/problems/mcp-config-drift.md) — Detecting unauthorized changes in MCP server configurations that define the agent tool surface

Low

[edge-case] docs/problems/mcp-config-drift.md:50 — Approach 1 (baseline-and-diff) does not address the trust-on-first-use (TOFU) bootstrapping problem: if the first run occurs against an already-compromised config, the baseline captures the malicious state and all subsequent runs pass. Consider noting in the trade-offs that the baseline should be established from a known-good state or reviewed by a human before being trusted.
[technical-accuracy] docs/problems/mcp-config-drift.md:79 — The "Relationship to existing security hooks" section claims SSRF validation is "the last line of defense if a malicious endpoint makes it into the config." However, the SSRF pretool hook (ssrf_pretool.py) operates on Bash and WebFetch tool calls. MCP server connections are established by the runtime's MCP client, which may not flow through the tool-call hook mechanism. If MCP connections bypass the SSRF hook, the "last line of defense" characterization overstates the actual coverage.
[section-structure] docs/problems/mcp-config-drift.md — Missing "Relationship to other problem areas" section. This section appears in most existing problem docs (4 of 5 reviewed) and provides explicit cross-references to related concerns. For this doc, relevant cross-references would include Security Threat Model (MCP drift as a specific instance of Threat 1), Governance (who controls MCP config policy), and Agent Architecture (how MCP configs relate to agent roles and trust boundaries).

Add README.md entry. Add TOFU bootstrapping risk to baseline-and-diff trade-offs. Correct SSRF coverage characterization for MCP connections. Add cross-references to Security Threat Model, Governance, and Agent Architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Kapner <bkapner@redhat.com>

Benkapner · 2026-06-08T10:22:47Z

Addressed all findings in bb260f0.

[missing-doc] Added a bullet point entry in README.md linking to docs/problems/mcp-config-drift.md, positioned after the Security Threat Model entry.

[edge-case] Added a TOFU bootstrapping risk note to the Approach 1 trade-offs: "if the first run occurs against an already-compromised config, the baseline captures the malicious state and all subsequent runs pass. The baseline should be established from a known-good state or reviewed by a human before being trusted."
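
One way to picture the baseline-and-diff check with this TOFU caveat is the sketch below. It is only an illustration under stated assumptions: the baseline file format, the `approved_by` field, and the function names are hypothetical, and the doc does not specify where the baseline is stored. The key property is that a missing or unapproved baseline is never created implicitly from whatever config happens to be present on first run.

```python
import hashlib
import json
from pathlib import Path


def config_digest(config_path: Path) -> str:
    """SHA-256 of the raw config bytes."""
    return hashlib.sha256(config_path.read_bytes()).hexdigest()


def record_baseline(config_path: Path, baseline_path: Path, approved_by: str) -> None:
    """Write a baseline; intended to run only after a human has reviewed the config."""
    baseline = {"sha256": config_digest(config_path), "approved_by": approved_by}
    baseline_path.write_text(json.dumps(baseline))


def check_against_baseline(config_path: Path, baseline_path: Path) -> str:
    """Return 'ok', 'drift', or 'no-trusted-baseline'."""
    if not baseline_path.exists():
        # Do not trust on first use: refuse rather than adopt the current config.
        return "no-trusted-baseline"
    baseline = json.loads(baseline_path.read_text())
    if not baseline.get("approved_by"):
        return "no-trusted-baseline"
    if baseline.get("sha256") != config_digest(config_path):
        return "drift"
    return "ok"
```

In this sketch the caller decides whether "drift" or "no-trusted-baseline" should alert or block. The baseline file would also need to live somewhere the agent cannot write, otherwise it can be modified together with the config.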

[technical-accuracy] Corrected the SSRF coverage characterization. The SSRF pretool hook operates on Bash and WebFetch tool calls, but MCP server connections are established by the runtime's MCP client, which may not flow through the tool-call hook mechanism. Replaced "last line of defense" with a more accurate description of partial coverage, and repositioned drift detection as the primary defense for MCP specifically.

[section-structure] Added a "Relationship to other problem areas" section with cross-references to Security Threat Model (MCP drift as a specific instance of Threat 1), Governance (who controls MCP config policy), and Agent Architecture (MCP configs define agent role boundaries).

fullsend-ai-review (bot) added the requires-manual-review label ("Review requires human judgment") on Jun 8, 2026

fullsend-ai-review Bot approved these changes Jun 8, 2026

fullsend-ai-review (bot) added the ready-for-merge label ("All reviewers approved, ready to merge") and removed the requires-manual-review label ("Review requires human judgment") on Jun 8, 2026

Benkapner wants to merge 2 commits into fullsend-ai:main from Benkapner:docs/mcp-config-drift
