Skip to content

feat(transport): TransportSelector + strict-by-default transport (#778)#779

Merged
danielmeppiel merged 6 commits intomainfrom
feat/transport-selection-v1
Apr 20, 2026
Merged

feat(transport): TransportSelector + strict-by-default transport (#778)#779
danielmeppiel merged 6 commits intomainfrom
feat/transport-selection-v1

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

Summary

Closes #778. Closes #328. Follow-up to #661 / #665.

Independent of #700 (HTTP-allow validation) — zero overlap; either can merge first. See "Independence from #700" below.

This PR implements Transport Selection v1: APM now distinguishes "user expressed a transport preference" from "user expressed none" instead of running the same permissive HTTPS-then-SSH chain for everything.

What changes for users

Scenario Before this PR After this PR
ssh://host/repo.git Silently downgraded to HTTPS on SSH failure (#661 hazard) Strict SSH-only. Fails loudly naming the SSH URL.
https://host/repo.git Could fall through to SSH Strict HTTPS-only.
owner/repo shorthand, no git config insteadOf HTTPS first, then SSH on failure HTTPS only (strict).
owner/repo shorthand, git config url.git@host:.insteadOf https://host/ HTTPS only — never tried SSH (#328) SSH first (strict), matching git clone owner/repo behavior on that machine.
Need today's permissive multi-protocol chain Default Set APM_ALLOW_PROTOCOL_FALLBACK=1 (or --allow-protocol-fallback). When fallback runs, a [!] warning names both protocols.

Migration / non-breaking transition for current users

This is a ### Changed (BREAKING) entry, but the rescue is one env var. The PR was scoped explicitly so a current user with a CI pipeline that depended on the silent-fallback can keep working with a single line:

env:
  APM_ALLOW_PROTOCOL_FALLBACK: "1"

The CHANGELOG entry leads with this, the new docs page leads with this, and the error message that triggers when a strict clone fails names both the URL and the rescue env var.

New CLI surface

On apm install:

  • --ssh / --https — pick the initial transport for shorthand URLs (mutually exclusive). Explicit-scheme URLs ignore CLI prefs.
  • --allow-protocol-fallback — opt back into the legacy permissive chain.

Env vars (apply globally, picked up by both CLI and programmatic callers):

  • APM_GIT_PROTOCOL=ssh|https
  • APM_ALLOW_PROTOCOL_FALLBACK=1

Independence from #700

Architecture

New module src/apm_cli/deps/transport_selection.py (~310 LOC) extracts the decision from the orchestrator:

  • ProtocolPreference enum (NONE / SSH / HTTPS).
  • TransportAttempt frozen dataclass — pure data, comparable with ==, no closures.
  • TransportPlan frozen dataclass — (attempts, strict, fallback_hint).
  • InsteadOfResolver Protocol; default GitConfigInsteadOfResolver shells out git config --get-regexp '^url\..*\.insteadof$'. Lazy load, double-checked locking around the cache.
  • TransportSelector.select(dep_ref, cli_pref, allow_fallback, has_token) -> TransportPlan.

GitHubPackageDownloader._clone_with_fallback now consumes a TransportPlan and iterates plan.attempts instead of running the hardcoded chain. URL building stays in the orchestrator (existing _build_repo_url honors use_ssh, token).

Critical invariant kept: GitConfigInsteadOfResolver._load_rewrites() runs with the process's normal env (NOT the downloader's locked-down git_env which sets GIT_CONFIG_GLOBAL=/dev/null). Otherwise issue #328 would be unfixable because the user's insteadOf config would be invisible.

Per-attempt clone env (Wave 2 panel finding, fixed in commit 724f8e7)

The first cut decided the clone env (locked-down vs relaxed) once per dependency from has_token. Under --allow-protocol-fallback the plan can mix attempts of different use_token values, so SSH + plain-HTTPS attempts in a mixed chain were running with GIT_ASKPASS=echo + GIT_CONFIG_GLOBAL=/dev/null + GIT_CONFIG_NOSYSTEM=1 — breaking ssh-agent passphrase prompts, gh auth, Keychain.

Fix:

  • _env_for(attempt.use_token) per-attempt closure: only token-bearing attempts get the locked-down env.
  • Adjacent contract bug: _build_repo_url(token=None) fell back to self.github_token, so "plain HTTPS" attempts in a mixed chain still embedded the token in the URL — violating the use_token=False contract. Fixed via empty-string sentinel: token="" means explicit "no token" and is what the orchestrator now passes for non-token attempts.
  • Regression test test_allow_fallback_env_is_per_attempt_not_per_dep drives a 3-attempt mixed chain and asserts auth-HTTPS=locked-down, SSH=relaxed, plain-HTTPS=relaxed-and-tokenless.

Also fixed concurrency: GitConfigInsteadOfResolver cache lazy-load now uses double-checked locking with threading.Lock so parallel downloads can't double-populate.

Tests

Suite Count Notes
tests/unit/test_transport_selection.py (NEW) 30 14-row selection matrix; env helpers; resolver caching; "must use normal env" contract
tests/unit/test_auth_scoping.py (regression) +1 Per-attempt env in mixed chain + plain-HTTPS token-leak guard
tests/integration/test_transport_selection_integration.py (NEW) 7 2 always-on (public shorthand HTTPS, explicit https:// strict) + 5 SSH-required (explicit ssh:// strict, bad-host no-fallback, insteadOf override, APM_GIT_PROTOCOL=ssh, allow_fallback rescue). Gated on APM_RUN_INTEGRATION_TESTS=1; SSH cases auto-skip if no key.
scripts/test-integration.sh +1 block Runs the integration suite in CI

Full unit suite: 4061 passed (was 4029; +32 net new).

APM Review Panel sign-off

Synthesized review across the panel personas (per .github/skills/apm-review-panel/).

python-architect (Wave 0 + 2): VERDICT: ship. All 6 mandatory edits from the Wave 0 verdict applied. Two deliberate scope-shrinks acknowledged: (1) _clone_with_fallback not renamed to _clone_with_transport (5+ tests patch it by name); (2) list_remote_refs (~line 876 in github_downloader.py) still hardcodes HTTPS — flagged as fast-follow, not in scope.

supply-chain-security-expert: Strict-by-default closes the induced-downgrade hazard #661 was filed for. Token-leak guard in plain-HTTPS attempts is a meaningful additional hardening discovered during the fix. Escape hatch is ### Changed (BREAKING)-documented and gated by an explicit env var, not implicit. APPROVED.

auth-expert: _env_for per-attempt is the correct contract — token attempts must run locked-down, non-token attempts must let credential helpers (gh auth, Keychain, ssh-agent) work. The token="" sentinel is documented in the source. Resolver _load_rewrites() running with normal env (not git_env) is correct and is the ONLY way to honor user insteadOf config. APPROVED.

devx-ux-expert: Migration story is one env var, named in the error message that triggers the broken case. Three new flags (--ssh, --https, --allow-protocol-fallback) plus three new env vars (APM_GIT_PROTOCOL, APM_ALLOW_PROTOCOL_FALLBACK) document cleanly in CLI ref + dependencies guide; warning on permitted fallback names both protocols. Migration-UX deviation noted: under allow_fallback=True we let explicit-scheme URLs fall through to the shorthand chain rather than staying strict. Architect's Wave 0 verdict said "explicit_scheme always strict." Deviation kept on purpose — users explicitly opting into fallback mode are saying "I want today's behavior" and should get it for explicit URLs too. APPROVED with deviation acknowledged.

cli-logging-expert: [!] warning format on permitted fallback matches STATUS_SYMBOLS convention. Errors on strict-mode failure name the URL and the rescue env var. Per-(url, from-to) dedup avoids warning fatigue. APPROVED.

oss-growth-hacker: Closing #328 (community pain on shorthand+SSH) is the highest-leverage win in this PR — closes a frequently-cited friction. Strict default also positions APM as a safer choice vs alternatives. APPROVED.

apm-ceo (final call): SHIP. Pre-1.0 breaking change with a one-flag rescue is consistent with CONTRIBUTING.md philosophy. The combination — closes #328, fixes #661 hazard, ships independently of #700 — is high-leverage. Wave 5 sign-off granted.

Risks & known follow-ups (out of scope for this PR)

  • list_remote_refs (~line 876 of github_downloader.py) still hardcodes HTTPS. Architect-flagged fast-follow; tracked separately.
  • git config --get-regexp requires git ≥ 1.8.5; older git silently degrades to no insteadOf detection (today's behavior).

Refs #778, #328, #661, #665, #700, #777

danielmeppiel and others added 4 commits April 20, 2026 13:07
Implements the core decision engine for issue #778 'transport selection v1'.
Strict-by-default semantics replace today's silent cross-protocol fallback:
explicit ssh:// and https:// dependencies no longer downgrade to a different
protocol, and shorthand (owner/repo) consults git insteadOf rewrites before
defaulting to HTTPS.

This commit ships Waves 1+2 of the transport-selection plan (per session plan):
- new module src/apm_cli/deps/transport_selection.py with ProtocolPreference,
  TransportAttempt/TransportPlan dataclasses, GitConfigInsteadOfResolver,
  and TransportSelector that returns a typed, strict-by-default plan
- DependencyReference grows explicit_scheme so the selector can distinguish
  user-stated transport from shorthand
- _clone_with_fallback in github_downloader.py now iterates the selector
  plan; per-attempt URL building stays in the orchestrator
- InstallContext / InstallRequest / pipeline / Service threaded with
  protocol_pref + allow_protocol_fallback so CLI args reach the downloader
- apm install gains --ssh / --https (mutually exclusive) and
  --allow-protocol-fallback flags; honours APM_GIT_PROTOCOL and
  APM_ALLOW_PROTOCOL_FALLBACK env vars
- two pre-existing tests in test_auth_scoping.py asserted the legacy
  permissive chain; updated to assert the new strict contract and added a
  coverage test for the allow_fallback escape hatch

Tests: 4029 unit tests pass. Test matrix + integration tests + docs land
in subsequent commits per Waves 3-5.

Refs #778, #328, #661

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…one env

Wave 2 panel gate (code-review subagent) flagged that
GitHubPackageDownloader._clone_with_fallback decided the clone env
(locked-down vs relaxed) ONCE per dependency from has_token. Under
allow_fallback=True the plan can mix attempts of different use_token
values, so SSH and plain-HTTPS attempts in a mixed chain were running
with GIT_ASKPASS=echo + GIT_CONFIG_GLOBAL=/dev/null +
GIT_CONFIG_NOSYSTEM=1, breaking ssh-agent passphrase prompts and git
credential helpers.

Fix the env per attempt; address an adjacent contract bug; add tests.

* github_downloader._clone_with_fallback: replace the per-dep clone_env
  with a per-attempt _env_for() helper so only token-bearing attempts
  get the locked-down env.
* github_downloader._build_repo_url: treat token="" as an explicit "no
  token" sentinel so plain-HTTPS attempts in a mixed chain genuinely
  run without embedded credentials, letting credential helpers (gh
  auth, Keychain) supply auth. Orchestrator passes "" instead of None
  for use_token=False attempts.
* transport_selection.GitConfigInsteadOfResolver: wrap the lazy
  insteadOf-rewrite cache in a threading.Lock so parallel downloads
  can't double-populate.

Tests:
* tests/unit/test_transport_selection.py (NEW, 30 tests): 14-row
  selection matrix (explicit-strict, shorthand+insteadOf, shorthand
  defaults, CLI prefs, allow_fallback chain, env helpers); resolver
  caching; "must use normal env" contract.
* tests/unit/test_auth_scoping.py: new
  test_allow_fallback_env_is_per_attempt_not_per_dep regression
  asserts auth-HTTPS gets locked-down env, SSH and plain-HTTPS get
  relaxed env, and plain-HTTPS does not embed the token in the URL.
* tests/integration/test_transport_selection_integration.py (NEW, 7
  tests): 2 always-on cases (public shorthand HTTPS; explicit https://
  strict); 5 SSH-required cases (explicit ssh:// strict, bad-host
  no-fallback, insteadOf override, APM_GIT_PROTOCOL=ssh env,
  allow_fallback rescue). Gated on APM_RUN_INTEGRATION_TESTS=1; SSH
  cases auto-skip if no key.
* tests/fixtures/gitconfig_insteadof_to_ssh (NEW): minimal gitconfig
  used by the integration test for the insteadOf-honored case.
* scripts/test-integration.sh: added "Transport Selection" block so
  the integration suite runs in CI.

Full unit suite: 4061 passed (was 4029; +32 net new tests).

Refs #778, #661, #328

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update the four user-facing surfaces affected by the new TransportSelector
contract:

* docs/src/content/docs/guides/dependencies.md:
  - New "Transport selection (SSH vs HTTPS)" section: breaking-change
    callout with the rescue env var, selection matrix, insteadOf
    example, --ssh / --https / APM_GIT_PROTOCOL overrides, and the
    --allow-protocol-fallback escape hatch.
  - Soften the existing "Custom ports preserved" sentence (cross-protocol
    retries are now opt-in).
  - Update the "Other Git Hosts" SSH bullet: SSH is no longer a silent
    fallback; point at explicit URLs or insteadOf.
* docs/src/content/docs/getting-started/authentication.md:
  - Rewrite the "SSH connection hangs" troubleshooting entry: remove the
    now-incorrect "tries SSH then falls back to HTTPS" framing.
  - New "Choosing transport (SSH vs HTTPS)" section with a pointer to
    dependencies.md for the full transport contract.
* docs/src/content/docs/reference/cli-commands.md:
  - Document --ssh / --https / --allow-protocol-fallback on apm install,
    plus APM_GIT_PROTOCOL and APM_ALLOW_PROTOCOL_FALLBACK env vars.
* packages/apm-guide/.apm/skills/apm-usage/dependencies.md (Rule 4 mirror):
  - Same transport contract in skill-resource voice with three runnable
    snippets and a selection matrix.

CHANGELOG: scope new entries to `apm install` only (there is no `apm
add` command in the codebase).

Refs #778

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 20, 2026 11:38
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Visual reference for reviewers

Three diagrams to make the PR easier to read end-to-end. All flows live in src/apm_cli/deps/transport_selection.py (TransportSelector.select) and src/apm_cli/deps/github_downloader.py (_clone_with_fallback).

1. Selection: input -> TransportPlan

How TransportSelector.select(dep_ref, cli_pref, allow_fallback, has_token) decides which attempts to make and whether they are strict.

flowchart TD
    Start[apm install dep_ref] --> Parse{URL kind?}

    Parse -->|"explicit ssh://"| ExSsh[plan: SSH only]
    Parse -->|"explicit https://"| ExHttps[plan: HTTPS only]
    Parse -->|"shorthand owner/repo"| Short[Look up insteadOf]

    Short --> InsteadOf{"git config<br/>url.&lt;base&gt;.insteadOf<br/>rewrites to SSH?"}

    InsteadOf -->|yes| ShortSsh[first attempt: SSH]
    InsteadOf -->|no| CliPref{CLI / env preference?}

    CliPref -->|--ssh / APM_GIT_PROTOCOL=ssh| ShortSsh
    CliPref -->|--https / APM_GIT_PROTOCOL=https / unset| ShortHttps[first attempt: HTTPS]

    ExSsh --> Fb1{allow_fallback?}
    ExHttps --> Fb1
    ShortSsh --> Fb2{allow_fallback?}
    ShortHttps --> Fb2

    Fb1 -->|no| StrictExplicit["TransportPlan(<br/>attempts=[picked_protocol],<br/>strict=True)"]
    Fb1 -->|yes| FallExplicit["TransportPlan(<br/>attempts=[picked, other],<br/>strict=False,<br/>warn on cross-protocol)"]

    Fb2 -->|no| StrictShort["TransportPlan(<br/>attempts=[first],<br/>strict=True)"]
    Fb2 -->|yes| FallShort["TransportPlan(<br/>attempts=[first, other,<br/>plain-https if has_token],<br/>strict=False)"]

    classDef strict fill:#d4edda,stroke:#28a745,color:#000
    classDef permissive fill:#fff3cd,stroke:#856404,color:#000
    class StrictExplicit,StrictShort strict
    class FallExplicit,FallShort permissive
Loading

Green = new strict default. Yellow = legacy permissive chain, only reachable via --allow-protocol-fallback / APM_ALLOW_PROTOCOL_FALLBACK=1.

2. Orchestration: per-attempt env + token contract

How _clone_with_fallback walks the plan. The fix from the Wave 2 gate lives in the _env_for(attempt.use_token) decision and the token="" empty-string sentinel.

flowchart TD
    In["plan = TransportPlan(attempts, strict)"] --> Loop{for attempt in attempts}

    Loop --> EnvDecide{attempt.use_token?}

    EnvDecide -->|true| Locked["env = self.git_env<br/>(GIT_ASKPASS=echo,<br/>GIT_CONFIG_GLOBAL=/dev/null,<br/>GIT_CONFIG_NOSYSTEM=1)"]
    EnvDecide -->|false| Relaxed["env = filtered process env<br/>(ssh-agent, gh auth,<br/>Keychain, ~/.gitconfig visible)"]

    Locked --> Url["url = _build_repo_url(<br/>token=dep_token if use_token else '')"]
    Relaxed --> Url

    Url --> Clone["Repo.clone_from(url, env=env)"]

    Clone --> Ok{success?}
    Ok -->|yes| Done[return path]

    Ok -->|no| Strict{plan.strict?}
    Strict -->|yes| FailLoud["raise: name URL,<br/>name protocol attempted,<br/>suggest APM_ALLOW_PROTOCOL_FALLBACK=1"]
    Strict -->|no| WarnNext["[!] warn: from->to,<br/>continue loop"]

    WarnNext --> Loop

    classDef new fill:#d4edda,stroke:#28a745,color:#000
    classDef bug fill:#f8d7da,stroke:#dc3545,color:#000
    class EnvDecide,Url new
    class FailLoud new
Loading

The two highlighted nodes are the Wave 2 fix:

  • EnvDecide was previously hoisted out of the loop (decided once per dep from has_token) -- broke ssh-agent and credential helpers in mixed --allow-protocol-fallback chains.
  • Url -- previously called _build_repo_url(token=None) for non-token attempts, but None triggered fallback to self.github_token, so the URL still embedded the token. token="" is now an explicit "no token" sentinel.

3. Before / after for a concrete bug case (#661)

User declares git+ssh://git@host:22/org/repo.git and SSH auth fails (e.g. ssh-agent locked).

sequenceDiagram
    autonumber
    participant U as User
    participant APM as apm install
    participant SSH as SSH transport
    participant HTTPS as HTTPS transport

    rect rgb(255, 235, 235)
        Note over U,HTTPS: Before this PR (silent downgrade hazard)
        U->>APM: install ssh://...:22/org/repo
        APM->>SSH: clone (locked-down env)
        SSH-->>APM: auth failed
        APM->>HTTPS: silently retry on https://host:22/org/repo
        HTTPS-->>APM: 200 OK (different transport, maybe different ACL)
        APM-->>U: install succeeded (user never sees the downgrade)
    end

    rect rgb(220, 245, 220)
        Note over U,HTTPS: After this PR (strict by default)
        U->>APM: install ssh://...:22/org/repo
        APM->>SSH: clone (locked-down env)
        SSH-->>APM: auth failed
        APM-->>U: error: SSH clone failed for ssh://...:22/org/repo<br/>Set APM_ALLOW_PROTOCOL_FALLBACK=1 to opt into HTTPS retry.
    end

    rect rgb(255, 250, 220)
        Note over U,HTTPS: Opt-in escape hatch (migration path)
        U->>APM: APM_ALLOW_PROTOCOL_FALLBACK=1 apm install ...
        APM->>SSH: clone (locked-down env)
        SSH-->>APM: auth failed
        APM->>U: [!] SSH failed, falling back to HTTPS for ssh://...:22/...
        APM->>HTTPS: clone (relaxed env, no token in URL if use_token=false)
        HTTPS-->>APM: 200 OK
        APM-->>U: install succeeded (with explicit consent, named in warning)
    end
Loading

The warning in the third pane is what makes the escape hatch safe: the user opted in and sees the cross-protocol retry happen; before this PR it happened silently every time.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements Transport Selection v1 so APM distinguishes explicit transport (scheme) from shorthand and becomes strict-by-default, with an opt-in legacy cross-protocol fallback via CLI/env.

Changes:

  • Add TransportSelector decision engine and route clone attempts through a computed TransportPlan.
  • Add apm install flags/env vars to control shorthand transport (--ssh/--https, APM_GIT_PROTOCOL) and legacy fallback (--allow-protocol-fallback, APM_ALLOW_PROTOCOL_FALLBACK).
  • Add unit + integration coverage and update docs/CHANGELOG for the new contract.
Show a summary per file
File Description
tests/unit/test_transport_selection.py New unit test matrix for transport planning and insteadOf resolution/caching.
tests/unit/test_auth_scoping.py Regression tests for per-attempt env + token scoping under fallback.
tests/integration/test_transport_selection_integration.py New networked integration coverage validating strictness + overrides.
tests/fixtures/gitconfig_insteadof_to_ssh Fixture .gitconfig for insteadOf HTTPS->SSH rewrite.
src/apm_cli/models/dependency/reference.py Add explicit_scheme signal to parsed dependency references.
src/apm_cli/install/service.py Thread transport prefs into the install pipeline.
src/apm_cli/install/request.py Extend request with protocol preference + fallback toggle.
src/apm_cli/install/pipeline.py Thread protocol preference + fallback into InstallContext.
src/apm_cli/install/phases/resolve.py Construct downloader with protocol pref + fallback.
src/apm_cli/install/context.py Store protocol pref + fallback on context.
src/apm_cli/deps/transport_selection.py New transport selection module (plan + insteadOf resolver).
src/apm_cli/deps/github_downloader.py Consume TransportPlan; implement strict-by-default cloning + warnings.
src/apm_cli/commands/install.py Add --ssh/--https/--allow-protocol-fallback and resolve env/flag behavior.
scripts/test-integration.sh Add CI integration test invocation for transport selection suite.
packages/apm-guide/.apm/skills/apm-usage/dependencies.md Document strict selection + overrides + fallback escape hatch.
docs/src/content/docs/reference/cli-commands.md Document new flags and transport env vars for apm install.
docs/src/content/docs/guides/dependencies.md Document strict transport contract and migration/fallback guidance.
docs/src/content/docs/getting-started/authentication.md Update SSH hang guidance and link to transport selection docs.
CHANGELOG.md Add BREAKING/Added entries describing transport selection changes.

Copilot's findings

Comments suppressed due to low confidence (2)

CHANGELOG.md:18

  • This entry is very long and doesn't match the changelog format used elsewhere (concise, one line per PR, ending with (#PR)), and it mixes multiple concerns into one bullet. Please condense and ensure the bullet ends with the PR number (e.g. (#778)).
### Added

- `apm install` accepts `--ssh` / `--https` flags and `APM_GIT_PROTOCOL=ssh|https` env to pick the initial transport for shorthand URLs. `apm install` also accepts `--allow-protocol-fallback` (env: `APM_ALLOW_PROTOCOL_FALLBACK=1`) as the escape hatch for cross-protocol fallback when migrating off the previous permissive behavior (#778).
- Add APM Review Panel skill (`.github/skills/apm-review-panel/`) and four new specialist personas (`devx-ux-expert`, `supply-chain-security-expert`, `apm-ceo`, `oss-growth-hacker`) with auto-activating per-persona skills. Routes specialist findings through an APM CEO arbiter for strategic / breaking-change calls, with the OSS growth hacker side-channeling adoption insights via `WIP/growth-strategy.md`. Instrumentation per Handbook Ch. 9 (`The Instrumented Codebase`); PROSE-compliant (thin SKILL.md routers, persona detail lazy-loaded via markdown links, explicit boundaries per persona).

docs/src/content/docs/guides/dependencies.md:155

  • This excerpt contains non-ASCII punctuation (e.g. em dash "—" and Unicode arrow "→"). The repo's cross-platform encoding rules require docs to stay printable ASCII; please replace with ASCII equivalents ("--", "->", straight quotes, etc.).
>
> ```yaml
> # DON'T — ambiguous: APM can't tell where the repo path ends
> # gitlab.com/group/subgroup/repo/file.prompt.md
> #   → parsed as repo=group/subgroup, virtual=repo/file.prompt.md (wrong!)
  • Files reviewed: 19/19 changed files
  • Comments generated: 5

Comment thread CHANGELOG.md
Comment thread src/apm_cli/deps/github_downloader.py
Comment thread src/apm_cli/deps/transport_selection.py Outdated
Comment thread tests/unit/test_auth_scoping.py Outdated
Comment thread src/apm_cli/install/pipeline.py
Two findings from the Copilot reviewer on PR #779:

1. Non-ASCII em dash introduced by this PR in the modified `Fields:`
   line of guides/dependencies.md: replace with `--`. The other
   non-ASCII chars Copilot flagged in the file (lines ~150-160 of the
   "nested groups" warning block) are pre-existing and out of scope
   for this PR.
2. CHANGELOG entries for the new transport-selection feature were too
   long and bundled multiple concerns into one bullet. Split into one
   tighter BREAKING entry plus two single-purpose Added entries
   (initial-protocol flags; fallback escape hatch). Each ends in
   `(#778)`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Both findings addressed in f179970:

1. Non-ASCII punctuation in docs/src/content/docs/guides/dependencies.md -- partial fix. Replaced the em dash in the Fields: line that I introduced with --. The other non-ASCII characters at lines ~150-160 (the "nested groups" warning block: # DON'T --, -> parsed as) are pre-existing in main and predate this PR -- the project does have a tech-debt cleanup needed for the existing docs, but doing it inside this PR would expand the diff well beyond the transport-selection scope. Tracking as a fast-follow.

2. CHANGELOG entry too long, mixed concerns -- valid. Split the original two-concern bullet into:

  • One tighter ### Changed (BREAKING) entry covering the strict-by-default contract + the rescue env var.
  • Two single-purpose ### Added entries: one for the initial-protocol flags (--ssh/--https/APM_GIT_PROTOCOL), one for the fallback escape hatch (--allow-protocol-fallback/APM_ALLOW_PROTOCOL_FALLBACK).

All three entries end with (#778) per Keep-a-Changelog convention.

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Architecture: design patterns at play

A fourth diagram for reviewers who want the design rationale rather than the runtime flow.

classDiagram
    direction LR

    class ProtocolPreference {
        <<Enum>>
        NONE
        SSH
        HTTPS
    }

    class TransportAttempt {
        <<Value Object / frozen dataclass>>
        +use_ssh: bool
        +use_token: bool
        +reason: str
    }

    class TransportPlan {
        <<Value Object / frozen dataclass>>
        +attempts: list~TransportAttempt~
        +strict: bool
        +fallback_hint: str
    }

    class InsteadOfResolver {
        <<Protocol / Strategy>>
        +resolve(host) Optional~str~
    }

    class GitConfigInsteadOfResolver {
        <<Default Strategy>>
        -_cache: dict
        -_lock: Lock
        +resolve(host) Optional~str~
        -_load_rewrites()
    }

    class FakeInsteadOfResolver {
        <<Test Double>>
        +rewrites: dict
        +resolve(host) Optional~str~
    }

    class TransportSelector {
        <<Factory / Pure Function>>
        -insteadof_resolver: InsteadOfResolver
        +select(dep_ref, cli_pref, allow_fallback, has_token) TransportPlan
    }

    class GitHubPackageDownloader {
        <<Orchestrator>>
        -protocol_pref: ProtocolPreference
        -allow_protocol_fallback: bool
        -github_token: str
        -git_env: dict
        -selector: TransportSelector
        +download(dep_ref) Path
        -_clone_with_fallback(...)
        -_env_for(use_token) dict
        -_build_repo_url(..., token: str) str
    }

    class InstallContext {
        <<DI Container>>
        +protocol_pref: ProtocolPreference
        +allow_protocol_fallback: bool
    }

    class CliInstallCmd {
        <<Adapter>>
        +--ssh / --https
        +--allow-protocol-fallback
        +APM_GIT_PROTOCOL env
        +APM_ALLOW_PROTOCOL_FALLBACK env
    }

    InsteadOfResolver <|.. GitConfigInsteadOfResolver : implements
    InsteadOfResolver <|.. FakeInsteadOfResolver : implements
    TransportSelector o-- InsteadOfResolver : injected (Strategy)
    TransportSelector ..> TransportPlan : returns
    TransportPlan o-- TransportAttempt : composed of
    TransportSelector ..> ProtocolPreference : consumes
    GitHubPackageDownloader o-- TransportSelector : composed
    GitHubPackageDownloader ..> TransportPlan : iterates
    InstallContext ..> GitHubPackageDownloader : configures
    CliInstallCmd ..> InstallContext : populates
Loading

Patterns mapped

Pattern Where Why it matters here
Strategy (via typing.Protocol) InsteadOfResolver + GitConfigInsteadOfResolver / FakeInsteadOfResolver Tests can swap in a fake without subprocess shelling; future hosts (e.g. Mercurial-style rewrites) plug in without touching TransportSelector.
Value Object (frozen dataclass) TransportAttempt, TransportPlan Plans are comparable with ==, hashable, safe to share across threads, and cheap to construct in tests. No behavior on the data -- forces the decision to live in TransportSelector.select().
Pure-function Factory TransportSelector.select(...) Same inputs always produce the same plan. The 14-row test matrix becomes assert select(...) == TransportPlan([...]). No side effects, no I/O (the only I/O is the resolver, which is injected).
Separation of policy / mechanism transport_selection.py (policy) vs github_downloader.py (mechanism) Selection logic is testable without git, without network, without filesystem. Clone logic stays focused on URL building + subprocess + retry.
Closure as per-attempt strategy _env_for(use_token) inside _clone_with_fallback Captures self.git_env once but lets the env switch per attempt within the loop. Avoids hoisting the decision out of the loop (the bug Wave 2 caught).
Sentinel value token="" in _build_repo_url Distinguishes "explicit no token" from "use the instance default" (None). Without it, use_token=False attempts in mixed chains silently embedded the token in the URL.
Lazy init + double-checked locking GitConfigInsteadOfResolver._cache + threading.Lock First resolve pays the git config subprocess cost; subsequent resolves are O(1) dict lookups. Lock prevents redundant subprocess calls under parallel downloads.
Adapter CLI command + env-var resolvers in commands/install.py Translates CLI flags / env vars into the typed ProtocolPreference enum + allow_fallback: bool that the install pipeline consumes. Programmatic callers (no CLI) get the same defaults via protocol_pref_from_env() / is_fallback_allowed().
Dependency injection through context InstallRequest -> InstallContext -> GitHubPackageDownloader ctor Transport prefs flow as data, not as globals. Each install run is reproducible from its InstallRequest.

What this design buys reviewers

  • The selection decision is one function with a 14-row test matrix -- no mocks, no subprocess, no filesystem. If the matrix changes, the test changes; if the matrix doesn't change, neither does the test.
  • The orchestrator stays agnostic to selection rules. Adding a new rule (e.g. "honor pushInsteadOf") means changing TransportSelector.select() and adding rows to the matrix -- not editing _clone_with_fallback.
  • The escape hatch is one boolean flowing through one frozen field on the plan. There is no runtime branching elsewhere -- the orchestrator's if plan.strict: break is the entire enforcement.

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Integration with neighboring systems

How TransportSelector composes with the rest of the install pipeline.

1. Lockfile repo_url -- transport-agnostic by design

LockedDependency.repo_url is a bare owner/repo (or org/project/repo for ADO). It does not include the scheme or host. That is the right choice and TransportSelector preserves it:

  • A dependency installed today over SSH and re-installed tomorrow over HTTPS produces the same lockfile entry.
  • Identity (get_unique_key()) returns repo_url -- so duplicate detection works regardless of which transport the user chose.
  • Switching APM_GIT_PROTOCOL between machines does not invalidate the lockfile.

The transport choice is a property of the clone attempt, not the identity of the package. Selection happens fresh on every install from (dep_ref, env, CLI flags) -- nothing is persisted.

2. Lockfile port -- preserved verbatim across attempts

port: Optional[int] was added on DependencyReference and LockedDependency in #665 as a first-class field. TransportSelector does not read it; the orchestrator's _build_repo_url(host, port=dep_ref.port, ...) re-emits it on every attempt regardless of the picked scheme:

ssh://host:7999/org/repo  -- port=7999 captured into dep_ref.port
                          -> first attempt:  ssh://host:7999/org/repo
                          -> if --allow-protocol-fallback fires:
                                              https://host:7999/org/repo

The port survives the cross-protocol retry. This was the explicit fix for #661 and remains intact under the new strict-by-default contract -- the only behavioral change is that the cross-protocol retry now requires opt-in.

3. apm-policy.yml -- gap, not conflict

The current policy schema has McpTransportPolicy (which restricts MCP server transport: stdio / sse / http) but no equivalent governance for git transport. Today there is no:

  • policy.git_transport.allowed = [ssh, https]
  • policy.git_transport.forbid_fallback = true
  • policy.git_transport.allow_http = false

This is a gap, not a conflict -- TransportSelector reads CLI/env, not policy. Worth filing as a follow-up: org-level enforcement of strict-only across CI (so a contractor cannot set APM_ALLOW_PROTOCOL_FALLBACK=1 in their pipeline to bypass the security improvement) is a natural next step. The hook points are clean:

  • commands/install.py resolves protocol_pref and allow_fallback -- the policy check would short-circuit there.
  • policy/schema.py already has the inheritance/discovery infra; a GitTransportPolicy would slot in next to McpTransportPolicy.

Out of scope for #778; flagging for a future issue.

4. Artifactory / PROXY_REGISTRY_URL -- different code path entirely

When PROXY_REGISTRY_URL (or the deprecated ARTIFACTORY_BASE_URL) is set, the dependency download path branches:

  • Proxied registry path (registry_proxy.RegistryClient.fetch_file / artifactory_entry.fetch_entry_from_archive): pulls files / archives over HTTP(S) GET from the configured proxy. TransportSelector is not involved -- there is no git clone. The proxy URL's scheme (https://artifactory.corp/...) is the only protocol in play, and it is fixed by the env var.
  • Direct git path (_clone_with_fallback): TransportSelector decides the scheme. Fires when no proxy is configured, or when the proxy host does not match the dep host (FQDN mismatch in download_file_from_artifactory() falls back to the standalone helper).

Practical implications:

  • Orgs using PROXY_REGISTRY_ONLY=1 are unaffected by the new strict-by-default behavior -- they never reach the git clone code path. The breaking change does not regress them.
  • Orgs using the proxy as a cache (not enforce-only) hit the git path for cache misses; they get the new strict default, which is the correct outcome.
  • No interaction between APM_GIT_PROTOCOL and PROXY_REGISTRY_URL -- they govern disjoint transports.
flowchart LR
    Install[apm install dep_ref] --> Cfg{PROXY_REGISTRY_URL set?}

    Cfg -->|yes, host matches| Proxy[fetch_file via RegistryClient<br/>HTTPS GET to proxy URL<br/>TransportSelector NOT used]
    Cfg -->|no, or host mismatch| Git[git clone via _clone_with_fallback]

    Git --> TS[TransportSelector.select<br/>builds TransportPlan]
    TS --> Loop[orchestrator iterates plan.attempts<br/>port preserved on every attempt]
    Loop --> Lock["LockedDependency<br/>(repo_url, port -- no scheme)"]

    Proxy --> Lock

    classDef bypass fill:#e2e3e5,stroke:#6c757d,color:#000
    classDef new fill:#d4edda,stroke:#28a745,color:#000
    class Proxy bypass
    class TS,Loop new
Loading

Summary table

Surface Interaction with TransportSelector Status after #778
LockedDependency.repo_url Identity is transport-agnostic; selection runs fresh on every install Unchanged
LockedDependency.port Read by _build_repo_url, re-emitted on every attempt Unchanged (#661 fix preserved)
apm-policy.yml No git_transport policy field today Gap; future opportunity for org-level enforcement of strict-only
PROXY_REGISTRY_URL / Artifactory Disjoint code path (HTTP file fetch, not git clone) Unchanged; proxy users unaffected

The net effect: the breaking change is localized to the direct-git-clone code path, which is exactly the surface that #661 / #328 were filed against. Lockfile, port handling, and proxy users see no behavior change.

APM picks SSH or HTTPS per dependency using a strict, predictable contract.

:::caution[Breaking change]
Earlier versions silently retried failed clones across protocols. The current
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indicate here APM < 0.8.13

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 51ab924. Tightened the caution panel to:

Breaking change in APM 0.8.13 -- APM versions before 0.8.13 silently retried failed clones across protocols. Starting in 0.8.13 the behavior is strict by default [...]

That makes the version cutover unambiguous for anyone landing on this page from a search result without scrolling to the changelog.

While here, also took care of the four other Copilot review findings in the same commit:

  • transport_selection.py -- collapsed the explicit == "http" branch into plain HTTPS so TransportAttempt.scheme stays in {ssh, https} and the downloader contract is consistent. Plain http:// URLs now build an HTTPS attempt with no token (no cleartext credential leak possible) until feat: support allow-insecure HTTP dependencies #700 introduces a first-class HTTP transport with explicit --allow-insecure gating.
  • github_downloader.py -- the [!] "Protocol fallback" warning is now gated on actual scheme change (ssh <-> https) rather than label change, so an auth-only downgrade inside a single protocol (e.g. authenticated HTTPS -> plain HTTPS in a --ssh + --allow-protocol-fallback mixed plan) is no longer misreported as a protocol switch.
  • pipeline.py / request.py / context.py / commands/install.py -- allow_protocol_fallback is now Optional[bool] = None end-to-end. Non-CLI programmatic callers regain the documented "None => read APM_ALLOW_PROTOCOL_FALLBACK env" behavior. The CLI still resolves env at the command boundary so behavior there is unchanged.
  • tests/unit/test_auth_scoping.py -- replaced the one stray Unicode arrow with -> to satisfy the ASCII-only source rule.

All 4061 unit tests pass.

- docs(dependencies): pin BREAKING-change caution panel to APM 0.8.13
  (per maintainer review comment)
- transport_selection: collapse explicit `http://` URLs into the plain-HTTPS
  branch so TransportAttempt.scheme stays in {ssh, https} and the downloader
  contract is consistent until #700 lands a first-class HTTP transport
- github_downloader: gate the [!] "Protocol fallback" warning on actual
  scheme change (ssh<->https) rather than label change, so an auth downgrade
  inside a single protocol is not misreported as a protocol switch
- pipeline / request / context / commands: switch
  `allow_protocol_fallback` to `Optional[bool] = None` end-to-end so
  programmatic callers (non-CLI) keep the documented "None => read
  APM_ALLOW_PROTOCOL_FALLBACK env" behavior
- test_auth_scoping: ASCII-only docstring (replace one stray Unicode arrow)

All 4061 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel merged commit 284d582 into main Apr 20, 2026
17 checks passed
@danielmeppiel danielmeppiel deleted the feat/transport-selection-v1 branch April 20, 2026 12:30
danielmeppiel added a commit that referenced this pull request Apr 20, 2026
…ade (#780) (#781)

* feat(transport): TransportSelector + strict-by-default transport (#778)

Implements the core decision engine for issue #778 'transport selection v1'.
Strict-by-default semantics replace today's silent cross-protocol fallback:
explicit ssh:// and https:// dependencies no longer downgrade to a different
protocol, and shorthand (owner/repo) consults git insteadOf rewrites before
defaulting to HTTPS.

This commit ships Waves 1+2 of the transport-selection plan (per session plan):
- new module src/apm_cli/deps/transport_selection.py with ProtocolPreference,
  TransportAttempt/TransportPlan dataclasses, GitConfigInsteadOfResolver,
  and TransportSelector that returns a typed, strict-by-default plan
- DependencyReference grows explicit_scheme so the selector can distinguish
  user-stated transport from shorthand
- _clone_with_fallback in github_downloader.py now iterates the selector
  plan; per-attempt URL building stays in the orchestrator
- InstallContext / InstallRequest / pipeline / Service threaded with
  protocol_pref + allow_protocol_fallback so CLI args reach the downloader
- apm install gains --ssh / --https (mutually exclusive) and
  --allow-protocol-fallback flags; honours APM_GIT_PROTOCOL and
  APM_ALLOW_PROTOCOL_FALLBACK env vars
- two pre-existing tests in test_auth_scoping.py asserted the legacy
  permissive chain; updated to assert the new strict contract and added a
  coverage test for the allow_fallback escape hatch

Tests: 4029 unit tests pass. Test matrix + integration tests + docs land
in subsequent commits per Waves 3-5.

Refs #778, #328, #661

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(transport): wave-3 unit + integration matrix; fix per-attempt clone env

Wave 2 panel gate (code-review subagent) flagged that
GitHubPackageDownloader._clone_with_fallback decided the clone env
(locked-down vs relaxed) ONCE per dependency from has_token. Under
allow_fallback=True the plan can mix attempts of different use_token
values, so SSH and plain-HTTPS attempts in a mixed chain were running
with GIT_ASKPASS=echo + GIT_CONFIG_GLOBAL=/dev/null +
GIT_CONFIG_NOSYSTEM=1, breaking ssh-agent passphrase prompts and git
credential helpers.

Fix the env per attempt; address an adjacent contract bug; add tests.

* github_downloader._clone_with_fallback: replace the per-dep clone_env
  with a per-attempt _env_for() helper so only token-bearing attempts
  get the locked-down env.
* github_downloader._build_repo_url: treat token="" as an explicit "no
  token" sentinel so plain-HTTPS attempts in a mixed chain genuinely
  run without embedded credentials, letting credential helpers (gh
  auth, Keychain) supply auth. Orchestrator passes "" instead of None
  for use_token=False attempts.
* transport_selection.GitConfigInsteadOfResolver: wrap the lazy
  insteadOf-rewrite cache in a threading.Lock so parallel downloads
  can't double-populate.

Tests:
* tests/unit/test_transport_selection.py (NEW, 30 tests): 14-row
  selection matrix (explicit-strict, shorthand+insteadOf, shorthand
  defaults, CLI prefs, allow_fallback chain, env helpers); resolver
  caching; "must use normal env" contract.
* tests/unit/test_auth_scoping.py: new
  test_allow_fallback_env_is_per_attempt_not_per_dep regression
  asserts auth-HTTPS gets locked-down env, SSH and plain-HTTPS get
  relaxed env, and plain-HTTPS does not embed the token in the URL.
* tests/integration/test_transport_selection_integration.py (NEW, 7
  tests): 2 always-on cases (public shorthand HTTPS; explicit https://
  strict); 5 SSH-required cases (explicit ssh:// strict, bad-host
  no-fallback, insteadOf override, APM_GIT_PROTOCOL=ssh env,
  allow_fallback rescue). Gated on APM_RUN_INTEGRATION_TESTS=1; SSH
  cases auto-skip if no key.
* tests/fixtures/gitconfig_insteadof_to_ssh (NEW): minimal gitconfig
  used by the integration test for the insteadOf-honored case.
* scripts/test-integration.sh: added "Transport Selection" block so
  the integration suite runs in CI.

Full unit suite: 4061 passed (was 4029; +32 net new tests).

Refs #778, #661, #328

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(transport): document strict-by-default transport selection (#778)

Update the four user-facing surfaces affected by the new TransportSelector
contract:

* docs/src/content/docs/guides/dependencies.md:
  - New "Transport selection (SSH vs HTTPS)" section: breaking-change
    callout with the rescue env var, selection matrix, insteadOf
    example, --ssh / --https / APM_GIT_PROTOCOL overrides, and the
    --allow-protocol-fallback escape hatch.
  - Soften the existing "Custom ports preserved" sentence (cross-protocol
    retries are now opt-in).
  - Update the "Other Git Hosts" SSH bullet: SSH is no longer a silent
    fallback; point at explicit URLs or insteadOf.
* docs/src/content/docs/getting-started/authentication.md:
  - Rewrite the "SSH connection hangs" troubleshooting entry: remove the
    now-incorrect "tries SSH then falls back to HTTPS" framing.
  - New "Choosing transport (SSH vs HTTPS)" section with a pointer to
    dependencies.md for the full transport contract.
* docs/src/content/docs/reference/cli-commands.md:
  - Document --ssh / --https / --allow-protocol-fallback on apm install,
    plus APM_GIT_PROTOCOL and APM_ALLOW_PROTOCOL_FALLBACK env vars.
* packages/apm-guide/.apm/skills/apm-usage/dependencies.md (Rule 4 mirror):
  - Same transport contract in skill-resource voice with three runnable
    snippets and a selection matrix.

CHANGELOG: scope new entries to `apm install` only (there is no `apm
add` command in the codebase).

Refs #778

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: revert accidental uv.lock churn (no dependency change in this PR)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs+changelog: address copilot review feedback (#779)

Two findings from the Copilot reviewer on PR #779:

1. Non-ASCII em dash introduced by this PR in the modified `Fields:`
   line of guides/dependencies.md: replace with `--`. The other
   non-ASCII chars Copilot flagged in the file (lines ~150-160 of the
   "nested groups" warning block) are pre-existing and out of scope
   for this PR.
2. CHANGELOG entries for the new transport-selection feature were too
   long and bundled multiple concerns into one bullet. Split into one
   tighter BREAKING entry plus two single-purpose Added entries
   (initial-protocol flags; fallback escape hatch). Each ends in
   `(#778)`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #779 review feedback

- docs(dependencies): pin BREAKING-change caution panel to APM 0.8.13
  (per maintainer review comment)
- transport_selection: collapse explicit `http://` URLs into the plain-HTTPS
  branch so TransportAttempt.scheme stays in {ssh, https} and the downloader
  contract is consistent until #700 lands a first-class HTTP transport
- github_downloader: gate the [!] "Protocol fallback" warning on actual
  scheme change (ssh<->https) rather than label change, so an auth downgrade
  inside a single protocol is not misreported as a protocol switch
- pipeline / request / context / commands: switch
  `allow_protocol_fallback` to `Optional[bool] = None` end-to-end so
  programmatic callers (non-CLI) keep the documented "None => read
  APM_ALLOW_PROTOCOL_FALLBACK env" behavior
- test_auth_scoping: ASCII-only docstring (replace one stray Unicode arrow)

All 4061 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(install): MARKETPLACE_PLUGIN beats HOOK_PACKAGE in detection cascade (#780)

obra/superpowers (and any Claude Code plugin shipping both hooks/*.json
and agents/skills/commands) was misclassified as HOOK_PACKAGE because
the cascade in detect_package_type() checked _has_hook_json BEFORE the
plugin-evidence branch. The plugin synthesizer (_map_plugin_artifacts)
already maps hooks alongside agents/skills/commands, so MARKETPLACE_PLUGIN
is a strict superset -- swapping the order means hooks still install,
plus everything else that was being silently dropped.

Three deliverables:

A) Surgical detection fix: reorder cascade so MARKETPLACE_PLUGIN is
   checked before HOOK_PACKAGE. Refactored to use a new
   gather_detection_evidence() helper + DetectionEvidence dataclass so
   observability code (warnings, summaries) can reuse the same scan
   without breaking the detect_package_type() public signature.

B) Observability:
   - Add HOOK_PACKAGE to the package-type label table (it was missing
     entirely -- the silent classification path).
   - Update MARKETPLACE_PLUGIN label to mention plugin.json OR
     agents/skills/commands (matches the cascade behaviour).
   - New CommandLogger.package_type_warn() at default visibility.
   - New _warn_if_classification_near_miss() helper fires when a
     HOOK_PACKAGE classification disagrees with directory contents
     (catches near-misses the order swap cannot, e.g. .claude-plugin/
     dir without plugin.json).
   - Wired at both materialization sites (local + cached).

C) Architectural follow-up tracked in plan; will file as separate
   issue after merge for a Visitor / Format-Discovery refactor.

Tests: 17 detection tests + 9 sources-observability tests + 70
command-logger tests pass. Full unit suite (4180 tests) green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #781 review: broaden detect_package_type with .claude-plugin/ evidence

Per Copilot review on PR #781: the near-miss warning helper had dead
code paths once the cascade was reordered (a HOOK_PACKAGE classification
implies plugin_dirs_present is empty -- otherwise it would be
MARKETPLACE_PLUGIN). Two principled options were offered: simplify the
helper, or broaden detection so the invariants stay consistent. Chose
the latter -- it removes the inconsistency at the source.

Changes:
- Add .claude-plugin/ as first-class plugin evidence in DetectionEvidence
  (new has_claude_plugin_dir field) and in has_plugin_evidence. A Claude
  Code plugin without a plugin.json (manifest-less) now classifies as
  MARKETPLACE_PLUGIN; normalize_plugin_directory already handles the
  missing-manifest case (derives name from directory).
- Drop _warn_if_classification_near_miss helper, its two call sites,
  CommandLogger.package_type_warn method, and the corresponding tests.
  All scenarios it covered are now handled by the cascade itself.
- Add regression test test_claude_plugin_dir_alone_is_plugin_evidence
  asserting that .claude-plugin/ + hooks/ classifies as MARKETPLACE_PLUGIN
  with plugin_json_path=None (matched via directory evidence alone).
- Extend test_obra_superpowers_evidence to assert has_claude_plugin_dir.

Verified: 4175 unit tests pass (excluding the one pre-existing unrelated
failure documented on main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request Apr 21, 2026
)

* feat(transport): TransportSelector + strict-by-default transport (#778)

Implements the core decision engine for issue #778 'transport selection v1'.
Strict-by-default semantics replace today's silent cross-protocol fallback:
explicit ssh:// and https:// dependencies no longer downgrade to a different
protocol, and shorthand (owner/repo) consults git insteadOf rewrites before
defaulting to HTTPS.

This commit ships Waves 1+2 of the transport-selection plan (per session plan):
- new module src/apm_cli/deps/transport_selection.py with ProtocolPreference,
  TransportAttempt/TransportPlan dataclasses, GitConfigInsteadOfResolver,
  and TransportSelector that returns a typed, strict-by-default plan
- DependencyReference grows explicit_scheme so the selector can distinguish
  user-stated transport from shorthand
- _clone_with_fallback in github_downloader.py now iterates the selector
  plan; per-attempt URL building stays in the orchestrator
- InstallContext / InstallRequest / pipeline / Service threaded with
  protocol_pref + allow_protocol_fallback so CLI args reach the downloader
- apm install gains --ssh / --https (mutually exclusive) and
  --allow-protocol-fallback flags; honours APM_GIT_PROTOCOL and
  APM_ALLOW_PROTOCOL_FALLBACK env vars
- two pre-existing tests in test_auth_scoping.py asserted the legacy
  permissive chain; updated to assert the new strict contract and added a
  coverage test for the allow_fallback escape hatch

Tests: 4029 unit tests pass. Test matrix + integration tests + docs land
in subsequent commits per Waves 3-5.

Refs #778, #328, #661

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(transport): wave-3 unit + integration matrix; fix per-attempt clone env

Wave 2 panel gate (code-review subagent) flagged that
GitHubPackageDownloader._clone_with_fallback decided the clone env
(locked-down vs relaxed) ONCE per dependency from has_token. Under
allow_fallback=True the plan can mix attempts of different use_token
values, so SSH and plain-HTTPS attempts in a mixed chain were running
with GIT_ASKPASS=echo + GIT_CONFIG_GLOBAL=/dev/null +
GIT_CONFIG_NOSYSTEM=1, breaking ssh-agent passphrase prompts and git
credential helpers.

Fix the env per attempt; address an adjacent contract bug; add tests.

* github_downloader._clone_with_fallback: replace the per-dep clone_env
  with a per-attempt _env_for() helper so only token-bearing attempts
  get the locked-down env.
* github_downloader._build_repo_url: treat token="" as an explicit "no
  token" sentinel so plain-HTTPS attempts in a mixed chain genuinely
  run without embedded credentials, letting credential helpers (gh
  auth, Keychain) supply auth. Orchestrator passes "" instead of None
  for use_token=False attempts.
* transport_selection.GitConfigInsteadOfResolver: wrap the lazy
  insteadOf-rewrite cache in a threading.Lock so parallel downloads
  can't double-populate.

Tests:
* tests/unit/test_transport_selection.py (NEW, 30 tests): 14-row
  selection matrix (explicit-strict, shorthand+insteadOf, shorthand
  defaults, CLI prefs, allow_fallback chain, env helpers); resolver
  caching; "must use normal env" contract.
* tests/unit/test_auth_scoping.py: new
  test_allow_fallback_env_is_per_attempt_not_per_dep regression
  asserts auth-HTTPS gets locked-down env, SSH and plain-HTTPS get
  relaxed env, and plain-HTTPS does not embed the token in the URL.
* tests/integration/test_transport_selection_integration.py (NEW, 7
  tests): 2 always-on cases (public shorthand HTTPS; explicit https://
  strict); 5 SSH-required cases (explicit ssh:// strict, bad-host
  no-fallback, insteadOf override, APM_GIT_PROTOCOL=ssh env,
  allow_fallback rescue). Gated on APM_RUN_INTEGRATION_TESTS=1; SSH
  cases auto-skip if no key.
* tests/fixtures/gitconfig_insteadof_to_ssh (NEW): minimal gitconfig
  used by the integration test for the insteadOf-honored case.
* scripts/test-integration.sh: added "Transport Selection" block so
  the integration suite runs in CI.

Full unit suite: 4061 passed (was 4029; +32 net new tests).

Refs #778, #661, #328

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(transport): document strict-by-default transport selection (#778)

Update the four user-facing surfaces affected by the new TransportSelector
contract:

* docs/src/content/docs/guides/dependencies.md:
  - New "Transport selection (SSH vs HTTPS)" section: breaking-change
    callout with the rescue env var, selection matrix, insteadOf
    example, --ssh / --https / APM_GIT_PROTOCOL overrides, and the
    --allow-protocol-fallback escape hatch.
  - Soften the existing "Custom ports preserved" sentence (cross-protocol
    retries are now opt-in).
  - Update the "Other Git Hosts" SSH bullet: SSH is no longer a silent
    fallback; point at explicit URLs or insteadOf.
* docs/src/content/docs/getting-started/authentication.md:
  - Rewrite the "SSH connection hangs" troubleshooting entry: remove the
    now-incorrect "tries SSH then falls back to HTTPS" framing.
  - New "Choosing transport (SSH vs HTTPS)" section with a pointer to
    dependencies.md for the full transport contract.
* docs/src/content/docs/reference/cli-commands.md:
  - Document --ssh / --https / --allow-protocol-fallback on apm install,
    plus APM_GIT_PROTOCOL and APM_ALLOW_PROTOCOL_FALLBACK env vars.
* packages/apm-guide/.apm/skills/apm-usage/dependencies.md (Rule 4 mirror):
  - Same transport contract in skill-resource voice with three runnable
    snippets and a selection matrix.

CHANGELOG: scope new entries to `apm install` only (there is no `apm
add` command in the codebase).

Refs #778

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: revert accidental uv.lock churn (no dependency change in this PR)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs+changelog: address copilot review feedback (#779)

Two findings from the Copilot reviewer on PR #779:

1. Non-ASCII em dash introduced by this PR in the modified `Fields:`
   line of guides/dependencies.md: replace with `--`. The other
   non-ASCII chars Copilot flagged in the file (lines ~150-160 of the
   "nested groups" warning block) are pre-existing and out of scope
   for this PR.
2. CHANGELOG entries for the new transport-selection feature were too
   long and bundled multiple concerns into one bullet. Split into one
   tighter BREAKING entry plus two single-purpose Added entries
   (initial-protocol flags; fallback escape hatch). Each ends in
   `(#778)`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #779 review feedback

- docs(dependencies): pin BREAKING-change caution panel to APM 0.8.13
  (per maintainer review comment)
- transport_selection: collapse explicit `http://` URLs into the plain-HTTPS
  branch so TransportAttempt.scheme stays in {ssh, https} and the downloader
  contract is consistent until #700 lands a first-class HTTP transport
- github_downloader: gate the [!] "Protocol fallback" warning on actual
  scheme change (ssh<->https) rather than label change, so an auth downgrade
  inside a single protocol is not misreported as a protocol switch
- pipeline / request / context / commands: switch
  `allow_protocol_fallback` to `Optional[bool] = None` end-to-end so
  programmatic callers (non-CLI) keep the documented "None => read
  APM_ALLOW_PROTOCOL_FALLBACK env" behavior
- test_auth_scoping: ASCII-only docstring (replace one stray Unicode arrow)

All 4061 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(install): MARKETPLACE_PLUGIN beats HOOK_PACKAGE in detection cascade (#780)

obra/superpowers (and any Claude Code plugin shipping both hooks/*.json
and agents/skills/commands) was misclassified as HOOK_PACKAGE because
the cascade in detect_package_type() checked _has_hook_json BEFORE the
plugin-evidence branch. The plugin synthesizer (_map_plugin_artifacts)
already maps hooks alongside agents/skills/commands, so MARKETPLACE_PLUGIN
is a strict superset -- swapping the order means hooks still install,
plus everything else that was being silently dropped.

Three deliverables:

A) Surgical detection fix: reorder cascade so MARKETPLACE_PLUGIN is
   checked before HOOK_PACKAGE. Refactored to use a new
   gather_detection_evidence() helper + DetectionEvidence dataclass so
   observability code (warnings, summaries) can reuse the same scan
   without breaking the detect_package_type() public signature.

B) Observability:
   - Add HOOK_PACKAGE to the package-type label table (it was missing
     entirely -- the silent classification path).
   - Update MARKETPLACE_PLUGIN label to mention plugin.json OR
     agents/skills/commands (matches the cascade behaviour).
   - New CommandLogger.package_type_warn() at default visibility.
   - New _warn_if_classification_near_miss() helper fires when a
     HOOK_PACKAGE classification disagrees with directory contents
     (catches near-misses the order swap cannot, e.g. .claude-plugin/
     dir without plugin.json).
   - Wired at both materialization sites (local + cached).

C) Architectural follow-up tracked in plan; will file as separate
   issue after merge for a Visitor / Format-Discovery refactor.

Tests: 17 detection tests + 9 sources-observability tests + 70
command-logger tests pass. Full unit suite (4180 tests) green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #781 review: broaden detect_package_type with .claude-plugin/ evidence

Per Copilot review on PR #781: the near-miss warning helper had dead
code paths once the cascade was reordered (a HOOK_PACKAGE classification
implies plugin_dirs_present is empty -- otherwise it would be
MARKETPLACE_PLUGIN). Two principled options were offered: simplify the
helper, or broaden detection so the invariants stay consistent. Chose
the latter -- it removes the inconsistency at the source.

Changes:
- Add .claude-plugin/ as first-class plugin evidence in DetectionEvidence
  (new has_claude_plugin_dir field) and in has_plugin_evidence. A Claude
  Code plugin without a plugin.json (manifest-less) now classifies as
  MARKETPLACE_PLUGIN; normalize_plugin_directory already handles the
  missing-manifest case (derives name from directory).
- Drop _warn_if_classification_near_miss helper, its two call sites,
  CommandLogger.package_type_warn method, and the corresponding tests.
  All scenarios it covered are now handled by the cascade itself.
- Add regression test test_claude_plugin_dir_alone_is_plugin_evidence
  asserting that .claude-plugin/ + hooks/ classifies as MARKETPLACE_PLUGIN
  with plugin_json_path=None (matched via directory evidence alone).
- Extend test_obra_superpowers_evidence to assert has_claude_plugin_dir.

Verified: 4175 unit tests pass (excluding the one pre-existing unrelated
failure documented on main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(validation): reject shell-string command in MCP stdio entries

Self-defined stdio MCP entries with `command` containing whitespace and
no `args` are now rejected at parse time with a fix-it error pointing at
the canonical `command: <binary>, args: [<token>, ...]` shape.

Previously silently accepted; APM never split `command` on whitespace,
so the loose shape mis-executed downstream. The trap surfaced via #122
(thanks @lirantal) -- users coming from the universal `mcp.json` /
Claude Desktop / Cursor mental model wrote `command: "npx mcp-server-foo"`
and got confused when nothing worked.

Per maintainer steer ("move fast, breaking OK"): error in v1, not warn.
The loose shape was never specified; silent mis-execution is worse than
hard-fail with a clear fix-it. CHANGELOG entry under
Changed (BREAKING).

Closes #806
Refs #122

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(mcp): address PR #809 review panel (cred leak, args:[], type gate, edge cases)

Addresses the APM Review Panel verdict on PR #809.

Must-fix:
- M1: Redact raw command in error 'Got:' framing (cli-log BLOCKER /
  sec HIGH). Echoing 'Got: command={self.command!r}' leaked tokens
  like '--token=ghp_...' to stderr / CI scrollback. Now shows only
  the first token plus argument count. The structured 'Did you mean:'
  suggestion still surfaces user input verbatim because that is the
  copy-paste recovery path.
- M2: Use 'self.args is None' instead of 'not self.args' (arch
  IMPORTANT). Explicit 'args: []' is a deliberate 'no extra args'
  signal (e.g., paired with '/opt/My App/server') and must be
  accepted -- 'not []' incorrectly evaluated truthy and rejected
  legitimate input in a BREAKING change.

Should-fix:
- S1: Whitespace-only command produces a dedicated 'empty or
  whitespace-only' error instead of the degenerate fix-it
  'Did you mean: command: , args: []' (arch + devx IMPORTANT).
- S2: Type gate for non-str command (sec HIGH). YAML
  'command: ["npx", "-y", "x"]' previously bypassed the
  isinstance guard silently and crashed downstream in
  validate_path_segments with an unhandled AttributeError.
- S3: Document rule 4 in manifest-schema.md section 4.2.3 (devx
  IMPORTANT). Spec and code ship together.

Adds 4 regression tests covering each fix. Removes the stray space
before '?' in the fix-it suggestion (cli-log NIT 8).

Follow-ups (not in this PR, to be filed as issues):
- Redact 'command' in MCPDependency.__repr__ (sec MEDIUM, pre-existing)
- Forward MCPDependency validation errors from plugin parser to
  DiagnosticCollector (cli-log IMPORTANT)
- Multi-line Cargo-style error format (cli-log IMPORTANT / devx NIT)
- Shell-metachar warning for stdio command (sec LOW)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(mcp): address PR #809 follow-up tickets in same PR

Per user direction, fold all four CEO-classified follow-up items into
this PR rather than deferring to separate issues.

FU1 -- Redact 'command' in MCPDependency.__repr__ (sec MEDIUM)
  Pre-existing leak: __repr__ echoed 'command={self.command!r}' verbatim
  while carefully redacting env and headers. Now shows only the first
  whitespace-separated token, mirroring the M1 fix.

FU2 -- Surface plugin-parser MCP validation warnings (cli-log IMPORTANT)
  The 'apm' stdlib logger has no handlers configured, so logger.warning
  calls in plugin_parser were silently dropped. Added _surface_warning
  helper that routes through both stdlib logger AND _rich_warning so
  invalid MCP servers are visible without --verbose. Applied to the
  validation-error catch site and the no-command/no-url skip.

FU3 -- Multi-line Cargo-style error format (cli-log IMPORTANT / devx NIT)
  The original 350-char single-line ValueError defeated terminal URL
  detection and the newspaper test. Restructured to:
    'command' contains whitespace in MCP dependency '<name>'.
      Rule: ...
      Got:  command='<first>' (N additional args)
      Fix:  command: <first>
            args: [...]
      See:  https://...
  URL now sits on its own line for click-through; field/rule/got/fix/see
  pattern is scannable per the cli-logging-ux skill's newspaper test.

FU4 -- Shell-metachar warning for stdio command (sec LOW, defense-in-depth)
  Extended _warn_shell_metachars(env, logger) to optionally check
  'command' as well, so 'command: "npx|curl evil.com"' (no whitespace,
  passes the rejection guard) still triggers a warning that MCP stdio
  servers run via execve with no shell. Hooked into the --mcp install
  path via entry.get('command').

Architectural improvement (LOC budget):
  Adding the command-checking branch pushed install.py over the 1525
  invariant ceiling. Per the python-architecture skill's guidance
  ('don't trim cosmetically -- modularize'), extracted the F5 SSRF
  helper, F7 shell-metachar helper, _is_internal_or_metadata_host,
  _SHELL_METACHAR_TOKENS, and _METADATA_HOSTS into a new dedicated
  module: apm_cli/install/mcp_warnings.py. install.py back-binds the
  symbols at module scope so existing test patches against
  apm_cli.commands.install._warn_* keep working unchanged.

  install.py: 1530 -> 1441 LOC (84 under budget, room to breathe).

Tests: 4715/4715 unit + console pass (excludes the known pre-existing
test_user_scope_skips_workspace_runtimes failure on main).

New regression tests:
- test_validate_stdio_error_uses_multiline_cargo_style_format
- test_repr_redacts_command_to_avoid_leaking_credentials

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants