Skip to content

feat(validation): reject shell-string command in MCP stdio entries#809

Open
danielmeppiel wants to merge 10 commits intomainfrom
feat/mcp-stdio-shell-string-validation
Open

feat(validation): reject shell-string command in MCP stdio entries#809
danielmeppiel wants to merge 10 commits intomainfrom
feat/mcp-stdio-shell-string-validation

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

Summary

Reject self-defined stdio MCP entries with whitespace-containing command and no args at parse time. Surfaces with a fix-it error pointing at the canonical shape.

Why

Surfaced via #122 (thanks @lirantal). Users coming from the universal mcp.json / Claude Desktop / Cursor mental model write:

- name: nodejs-api-docs
  registry: false
  transport: stdio
  command: "npx mcp-server-nodejs-api-docs"   # WRONG: shell line in `command`

and get confused when nothing works. APM never whitespace-split command; per schema, command is a single binary path and args is the list of arguments. Silently accepting the loose shape was a UX trap.

What changes

MCPDependency.validate() now raises ValueError when:

  • registry: false, AND
  • transport == "stdio", AND
  • command is a string containing whitespace, AND
  • args is empty/missing

Error message includes a fix-it that splits the offending command at the first whitespace boundary and shows the corrected command: / args: shape.

Self-defined MCP dependency 'nodejs-api-docs': 'command' must be a single binary
path, not a shell line. APM does not split 'command' on whitespace.
Got: command='npx -y mcp-server-nodejs-api-docs'.
Did you mean: command: npx, args: ["-y", "mcp-server-nodejs-api-docs"] ?
See https://microsoft.github.io/apm/guides/mcp-servers/ for the canonical stdio shape.

Why error, not warn

Per maintainer steer (issue #806 thread): "move fast, breaking changes OK if they make sense, adoption base small enough." Panel ratified:

  • devx-ux: npm/cargo error on bad manifests; warn-then-mis-execute is worse UX.
  • supply-chain-security: Silently passing a whitespace string to a runtime that doesn't split = broken or surprising. Fail closed.
  • CEO: The loose shape was never specified; CHANGELOG (BREAKING) entry covers the comms.

If user feedback shows broader migration friction, downgrading to a warn is a one-line change.

Edge cases covered (tests)

  • Canonical shape (command: npx, args: [...]) keeps working
  • Whitespace command + empty args -> rejected with fix-it
  • Whitespace command + explicit args -> accepted (paths with spaces, e.g. /opt/My App/server, are legal when author has owned shape)
  • Tabs in command -> also rejected (covers \t, not just spaces)
  • Error message includes the suggested command: and args: tokens

Tests

tests/unit/test_mcp_overlays.py: 46 passed
tests/unit tests/test_console.py: 4069 passed

CHANGELOG

Added under ## [Unreleased] / Changed (BREAKING).

Followups

Closes #806
Refs #122

danielmeppiel and others added 10 commits April 20, 2026 13:07
Implements the core decision engine for issue #778 'transport selection v1'.
Strict-by-default semantics replace today's silent cross-protocol fallback:
explicit ssh:// and https:// dependencies no longer downgrade to a different
protocol, and shorthand (owner/repo) consults git insteadOf rewrites before
defaulting to HTTPS.

This commit ships Waves 1+2 of the transport-selection plan (per session plan):
- new module src/apm_cli/deps/transport_selection.py with ProtocolPreference,
  TransportAttempt/TransportPlan dataclasses, GitConfigInsteadOfResolver,
  and TransportSelector that returns a typed, strict-by-default plan
- DependencyReference grows explicit_scheme so the selector can distinguish
  user-stated transport from shorthand
- _clone_with_fallback in github_downloader.py now iterates the selector
  plan; per-attempt URL building stays in the orchestrator
- InstallContext / InstallRequest / pipeline / Service threaded with
  protocol_pref + allow_protocol_fallback so CLI args reach the downloader
- apm install gains --ssh / --https (mutually exclusive) and
  --allow-protocol-fallback flags; honours APM_GIT_PROTOCOL and
  APM_ALLOW_PROTOCOL_FALLBACK env vars
- two pre-existing tests in test_auth_scoping.py asserted the legacy
  permissive chain; updated to assert the new strict contract and added a
  coverage test for the allow_fallback escape hatch

Tests: 4029 unit tests pass. Test matrix + integration tests + docs land
in subsequent commits per Waves 3-5.

Refs #778, #328, #661

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…one env

Wave 2 panel gate (code-review subagent) flagged that
GitHubPackageDownloader._clone_with_fallback decided the clone env
(locked-down vs relaxed) ONCE per dependency from has_token. Under
allow_fallback=True the plan can mix attempts of different use_token
values, so SSH and plain-HTTPS attempts in a mixed chain were running
with GIT_ASKPASS=echo + GIT_CONFIG_GLOBAL=/dev/null +
GIT_CONFIG_NOSYSTEM=1, breaking ssh-agent passphrase prompts and git
credential helpers.

Fix the env per attempt; address an adjacent contract bug; add tests.

* github_downloader._clone_with_fallback: replace the per-dep clone_env
  with a per-attempt _env_for() helper so only token-bearing attempts
  get the locked-down env.
* github_downloader._build_repo_url: treat token="" as an explicit "no
  token" sentinel so plain-HTTPS attempts in a mixed chain genuinely
  run without embedded credentials, letting credential helpers (gh
  auth, Keychain) supply auth. Orchestrator passes "" instead of None
  for use_token=False attempts.
* transport_selection.GitConfigInsteadOfResolver: wrap the lazy
  insteadOf-rewrite cache in a threading.Lock so parallel downloads
  can't double-populate.

Tests:
* tests/unit/test_transport_selection.py (NEW, 30 tests): 14-row
  selection matrix (explicit-strict, shorthand+insteadOf, shorthand
  defaults, CLI prefs, allow_fallback chain, env helpers); resolver
  caching; "must use normal env" contract.
* tests/unit/test_auth_scoping.py: new
  test_allow_fallback_env_is_per_attempt_not_per_dep regression
  asserts auth-HTTPS gets locked-down env, SSH and plain-HTTPS get
  relaxed env, and plain-HTTPS does not embed the token in the URL.
* tests/integration/test_transport_selection_integration.py (NEW, 7
  tests): 2 always-on cases (public shorthand HTTPS; explicit https://
  strict); 5 SSH-required cases (explicit ssh:// strict, bad-host
  no-fallback, insteadOf override, APM_GIT_PROTOCOL=ssh env,
  allow_fallback rescue). Gated on APM_RUN_INTEGRATION_TESTS=1; SSH
  cases auto-skip if no key.
* tests/fixtures/gitconfig_insteadof_to_ssh (NEW): minimal gitconfig
  used by the integration test for the insteadOf-honored case.
* scripts/test-integration.sh: added "Transport Selection" block so
  the integration suite runs in CI.

Full unit suite: 4061 passed (was 4029; +32 net new tests).

Refs #778, #661, #328

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update the four user-facing surfaces affected by the new TransportSelector
contract:

* docs/src/content/docs/guides/dependencies.md:
  - New "Transport selection (SSH vs HTTPS)" section: breaking-change
    callout with the rescue env var, selection matrix, insteadOf
    example, --ssh / --https / APM_GIT_PROTOCOL overrides, and the
    --allow-protocol-fallback escape hatch.
  - Soften the existing "Custom ports preserved" sentence (cross-protocol
    retries are now opt-in).
  - Update the "Other Git Hosts" SSH bullet: SSH is no longer a silent
    fallback; point at explicit URLs or insteadOf.
* docs/src/content/docs/getting-started/authentication.md:
  - Rewrite the "SSH connection hangs" troubleshooting entry: remove the
    now-incorrect "tries SSH then falls back to HTTPS" framing.
  - New "Choosing transport (SSH vs HTTPS)" section with a pointer to
    dependencies.md for the full transport contract.
* docs/src/content/docs/reference/cli-commands.md:
  - Document --ssh / --https / --allow-protocol-fallback on apm install,
    plus APM_GIT_PROTOCOL and APM_ALLOW_PROTOCOL_FALLBACK env vars.
* packages/apm-guide/.apm/skills/apm-usage/dependencies.md (Rule 4 mirror):
  - Same transport contract in skill-resource voice with three runnable
    snippets and a selection matrix.

CHANGELOG: scope new entries to `apm install` only (there is no `apm
add` command in the codebase).

Refs #778

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two findings from the Copilot reviewer on PR #779:

1. Non-ASCII em dash introduced by this PR in the modified `Fields:`
   line of guides/dependencies.md: replace with `--`. The other
   non-ASCII chars Copilot flagged in the file (lines ~150-160 of the
   "nested groups" warning block) are pre-existing and out of scope
   for this PR.
2. CHANGELOG entries for the new transport-selection feature were too
   long and bundled multiple concerns into one bullet. Split into one
   tighter BREAKING entry plus two single-purpose Added entries
   (initial-protocol flags; fallback escape hatch). Each ends in
   `(#778)`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- docs(dependencies): pin BREAKING-change caution panel to APM 0.8.13
  (per maintainer review comment)
- transport_selection: collapse explicit `http://` URLs into the plain-HTTPS
  branch so TransportAttempt.scheme stays in {ssh, https} and the downloader
  contract is consistent until #700 lands a first-class HTTP transport
- github_downloader: gate the [!] "Protocol fallback" warning on actual
  scheme change (ssh<->https) rather than label change, so an auth downgrade
  inside a single protocol is not misreported as a protocol switch
- pipeline / request / context / commands: switch
  `allow_protocol_fallback` to `Optional[bool] = None` end-to-end so
  programmatic callers (non-CLI) keep the documented "None => read
  APM_ALLOW_PROTOCOL_FALLBACK env" behavior
- test_auth_scoping: ASCII-only docstring (replace one stray Unicode arrow)

All 4061 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ade (#780)

obra/superpowers (and any Claude Code plugin shipping both hooks/*.json
and agents/skills/commands) was misclassified as HOOK_PACKAGE because
the cascade in detect_package_type() checked _has_hook_json BEFORE the
plugin-evidence branch. The plugin synthesizer (_map_plugin_artifacts)
already maps hooks alongside agents/skills/commands, so MARKETPLACE_PLUGIN
is a strict superset -- swapping the order means hooks still install,
plus everything else that was being silently dropped.

Three deliverables:

A) Surgical detection fix: reorder cascade so MARKETPLACE_PLUGIN is
   checked before HOOK_PACKAGE. Refactored to use a new
   gather_detection_evidence() helper + DetectionEvidence dataclass so
   observability code (warnings, summaries) can reuse the same scan
   without breaking the detect_package_type() public signature.

B) Observability:
   - Add HOOK_PACKAGE to the package-type label table (it was missing
     entirely -- the silent classification path).
   - Update MARKETPLACE_PLUGIN label to mention plugin.json OR
     agents/skills/commands (matches the cascade behaviour).
   - New CommandLogger.package_type_warn() at default visibility.
   - New _warn_if_classification_near_miss() helper fires when a
     HOOK_PACKAGE classification disagrees with directory contents
     (catches near-misses the order swap cannot, e.g. .claude-plugin/
     dir without plugin.json).
   - Wired at both materialization sites (local + cached).

C) Architectural follow-up tracked in plan; will file as separate
   issue after merge for a Visitor / Format-Discovery refactor.

Tests: 17 detection tests + 9 sources-observability tests + 70
command-logger tests pass. Full unit suite (4180 tests) green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…in/ evidence

Per Copilot review on PR #781: the near-miss warning helper had dead
code paths once the cascade was reordered (a HOOK_PACKAGE classification
implies plugin_dirs_present is empty -- otherwise it would be
MARKETPLACE_PLUGIN). Two principled options were offered: simplify the
helper, or broaden detection so the invariants stay consistent. Chose
the latter -- it removes the inconsistency at the source.

Changes:
- Add .claude-plugin/ as first-class plugin evidence in DetectionEvidence
  (new has_claude_plugin_dir field) and in has_plugin_evidence. A Claude
  Code plugin without a plugin.json (manifest-less) now classifies as
  MARKETPLACE_PLUGIN; normalize_plugin_directory already handles the
  missing-manifest case (derives name from directory).
- Drop _warn_if_classification_near_miss helper, its two call sites,
  CommandLogger.package_type_warn method, and the corresponding tests.
  All scenarios it covered are now handled by the cascade itself.
- Add regression test test_claude_plugin_dir_alone_is_plugin_evidence
  asserting that .claude-plugin/ + hooks/ classifies as MARKETPLACE_PLUGIN
  with plugin_json_path=None (matched via directory evidence alone).
- Extend test_obra_superpowers_evidence to assert has_claude_plugin_dir.

Verified: 4175 unit tests pass (excluding the one pre-existing unrelated
failure documented on main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Self-defined stdio MCP entries with `command` containing whitespace and
no `args` are now rejected at parse time with a fix-it error pointing at
the canonical `command: <binary>, args: [<token>, ...]` shape.

Previously silently accepted; APM never split `command` on whitespace,
so the loose shape mis-executed downstream. The trap surfaced via #122
(thanks @lirantal) -- users coming from the universal `mcp.json` /
Claude Desktop / Cursor mental model wrote `command: "npx mcp-server-foo"`
and got confused when nothing worked.

Per maintainer steer ("move fast, breaking OK"): error in v1, not warn.
The loose shape was never specified; silent mis-execution is worse than
hard-fail with a clear fix-it. CHANGELOG entry under
Changed (BREAKING).

Closes #806
Refs #122

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds stricter parse-time validation for self-defined stdio MCP dependencies to prevent the common “shell string in command” misconfiguration, and also updates package-type detection/observability to address plugin-vs-hooks classification edge cases.

Changes:

  • Reject self-defined stdio MCP entries where command contains whitespace and args is missing/empty, with a fix-it suggestion in the error.
  • Introduce richer package-type detection evidence and reorder the detection cascade to classify plugin-shaped packages before hook-only packages.
  • Add/extend unit tests and update CHANGELOG.md entries for the behavior changes.
Show a summary per file
File Description
src/apm_cli/models/dependency/mcp.py Adds the new stdio shell-string command validation and fix-it error text.
tests/unit/test_mcp_overlays.py Adds unit tests covering the new MCP validation behavior and error contents.
src/apm_cli/models/validation.py Adds DetectionEvidence + gather_detection_evidence() and changes detection order to prefer plugin evidence over hooks.
tests/test_apm_package_models.py Adds regression tests for plugin-vs-hooks classification and new evidence-gathering helper.
src/apm_cli/install/sources.py Centralizes package-type label formatting and ensures hook packages get a label.
tests/unit/install/test_sources_classification.py New tests ensuring every classifiable PackageType has a human-readable label.
CHANGELOG.md Adds entries describing the new MCP validation breaking change and the plugin classification fix.

Copilot's findings

Comments suppressed due to low confidence (1)

src/apm_cli/models/dependency/mcp.py:142

  • This validation rejects any whitespace in command when args is empty, but per the schema command is a binary path and valid paths can contain spaces (notably on Windows, e.g. under Program Files). Consider narrowing the check to the specific "shell line" anti-pattern (e.g. multiple tokens and command doesn't look like a path) so you don't block legitimate single-binary paths that contain spaces but have no arguments.
            if (
                self.transport == 'stdio'
                and isinstance(self.command, str)
                and any(ch.isspace() for ch in self.command)
                and not self.args
            ):
  • Files reviewed: 7/7 changed files
  • Comments generated: 7

Comment thread CHANGELOG.md
### Changed (BREAKING)

- Strict-by-default transport selection: explicit `ssh://`/`https://` URLs no longer silently fall back to the other protocol; shorthand consults `git config url.<base>.insteadOf` and otherwise defaults to HTTPS. Set `APM_ALLOW_PROTOCOL_FALLBACK=1` (or pass `--allow-protocol-fallback`) to restore the legacy permissive chain; cross-protocol retries then emit a `[!]` warning. Closes #328 (#778)
- Self-defined stdio MCP entries with `command` containing whitespace and no `args` are now rejected at parse time with a fix-it error pointing at the canonical `command: <binary>, args: [<token>, ...]` shape. Previously silently accepted; APM never split `command` on whitespace, so the loose shape mis-executed downstream. Closes #806 (#122)
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changelog entry ends with (#122), but #122 is an issue (already referenced via Closes #806/Refs #122) and cannot also be the PR number. Per the repo changelog convention, the trailing parenthetical should be the PR number for this change.

Copilot generated this review using guidance from repository custom instructions.
Comment thread CHANGELOG.md

### Fixed

- `apm install` no longer silently drops skills, agents, and commands when a Claude Code plugin also ships `hooks/*.json`. The package-type detection cascade now classifies plugin-shaped packages as `MARKETPLACE_PLUGIN` (which already maps hooks via the plugin synthesizer) before falling back to the hook-only classification, and emits a default-visibility `[!]` warning when a hook-only classification disagrees with the package's directory contents (#780)
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entry appears to be a normal code change but does not end with a PR number, and (#780) reads like an issue reference rather than the required (#PR_NUMBER) suffix. Please adjust to Keep-a-Changelog format: one line per PR, ending with this PR's number (issue refs can stay in the sentence body).

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +235 to +245
evidence = gather_detection_evidence(package_path)

if has_apm_yml and has_skill_md:
if evidence.has_apm_yml and evidence.has_skill_md:
return PackageType.HYBRID, None
if has_apm_yml:
if evidence.has_apm_yml:
return PackageType.APM_PACKAGE, None
if has_skill_md:
if evidence.has_skill_md:
return PackageType.CLAUDE_SKILL, None
if _has_hook_json(package_path):
if evidence.has_plugin_evidence:
return PackageType.MARKETPLACE_PLUGIN, evidence.plugin_json_path
if evidence.has_hook_json:
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR's title/description are focused on MCP stdio validation (#806/#122), but this diff also adds new package-type detection behavior (gather_detection_evidence, .claude-plugin evidence, and a reordered detection cascade) tied to #780. If this extra scope is intentional, it should be called out explicitly in the PR description; otherwise consider splitting to keep review/rollback risk isolated.

Copilot uses AI. Check for mistakes.
and any(ch.isspace() for ch in self.command)
and not self.args
):
first, _, rest = self.command.strip().partition(' ')
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix-it suggestion splits command using .partition(' '), which only handles a literal space. For inputs containing other whitespace (e.g. tabs, multiple whitespace), the validation still triggers but the suggested command/args shape will be incorrect (e.g. npx\t-y\tpkg produces command: npx\t-y\tpkg, args: []). Split on any whitespace with a single pass (e.g. split(maxsplit=1) / regex \s+) so the suggestion matches the validation condition (isspace()).

This issue also appears on line 137 of the same file.

Suggested change
first, _, rest = self.command.strip().partition(' ')
command_parts = self.command.strip().split(maxsplit=1)
first = command_parts[0] if command_parts else ''
rest = command_parts[1] if len(command_parts) > 1 else ''

Copilot uses AI. Check for mistakes.
f"APM does not split 'command' on whitespace. "
f"Got: command={self.command!r}. "
f"Did you mean: command: {first}, args: {suggested_args} ? "
f"See https://microsoft.github.io/apm/guides/mcp-servers/ "
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message links to https://microsoft.github.io/apm/guides/mcp-servers/, but there is no mcp-servers guide in the current docs tree, so users will hit a 404 from the fix-it error. Either add the page in this PR or point to an existing canonical section (e.g. the current MCP dependency docs) until the new guide lands.

Suggested change
f"See https://microsoft.github.io/apm/guides/mcp-servers/ "
f"See https://microsoft.github.io/apm/reference/dependencies/ "

Copilot uses AI. Check for mistakes.
Comment on lines +1149 to +1162
def test_empty_directory(self, tmp_path):
from apm_cli.models.validation import gather_detection_evidence

evidence = gather_detection_evidence(tmp_path)
assert evidence.has_apm_yml is False
assert evidence.has_skill_md is False
assert evidence.has_hook_json is False
assert evidence.plugin_json_path is None
assert evidence.plugin_dirs_present == ()
assert evidence.has_plugin_evidence is False

def test_records_plugin_dirs_in_canonical_order(self, tmp_path):
from apm_cli.models.validation import gather_detection_evidence

Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests mix src.apm_cli... and apm_cli... imports for the same module. That can import the same source file twice under different module names, leading to duplicated class/enum singletons (e.g. PackageType identity mismatches) and flaky isinstance / equality behavior. Please standardize on a single import style within this file (preferably the existing src.apm_cli... used at the top).

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +61
PackageType.CLAUDE_SKILL: "Skill (SKILL.md detected)",
PackageType.MARKETPLACE_PLUGIN:
"Marketplace Plugin (plugin.json or agents/skills/commands)",
PackageType.HYBRID: "Hybrid (apm.yml + SKILL.md)",
PackageType.APM_PACKAGE: "APM Package (apm.yml)",
PackageType.HOOK_PACKAGE: "Hook Package (hooks/*.json only)",
}.get(pkg_type)
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MARKETPLACE_PLUGIN classification now also triggers on a bare .claude-plugin/ directory (see detect_package_type), but this label only mentions plugin.json or agents/skills/commands. That makes the install-time classification log misleading for the manifest-less plugin case. Update the wording (or make it more generic) so it matches the actual detection signals.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(validation): warn or error on shell-string command in MCP stdio entries [FEATURE] Support stdio MCP servers in inline dependency declarations

2 participants