Skip to content

Develop#40

Merged
mswdev merged 176 commits intomainfrom
develop
May 4, 2026
Merged

Develop#40
mswdev merged 176 commits intomainfrom
develop

Conversation

@mswdev
Copy link
Copy Markdown
Owner

@mswdev mswdev commented May 4, 2026

No description provided.

mswdev added 30 commits April 27, 2026 13:59
Sync develop with main
Plans the rename from claude-docker-sandbox to Ckipper and the
multi-account architecture: per-account CLAUDE_CONFIG_DIR, registry
in ~/.ckipper/accounts.json, ckipper CLI for registration, and
account-aware w + entrypoint integration.
Breaks the rename + multi-account work into 7 phases (25 tasks):
project rename, tooling relocation to ~/.ckipper/, ckipper CLI, w
and entrypoint account-awareness, migration command, docs, and live
deploy + end-to-end validation. Each task has a verification step
and its own commit.
Six-reviewer panel + Team Lead synthesis (all GO-WITH-FIXES) revealed
gaps that needed folding back into the docs before execution:

- Purge user-specific names (af, matt@msw.dev, /Users/matt/...) and
  replace with placeholders (<work>, <your-second-account>, $HOME/...).
- Drop the ~/.claude → ~/.claude-personal symlink (creates more failure
  modes than it solves; users use claude-personal post-migrate).
- Add migration safety: pgrep precondition, error-trap rollback,
  Keychain probe-before-trust.
- Extend hook regex to cover ~/.ckipper/ and per-account dirs
  (closes credential cross-contamination vector).
- Validate keychain_service shape before passing to security command.
- Fix Keychain awk: printf vs echo, lock detection, fixture-based
  regression test under tests/.
- Make aliases.zsh self-contained (cca works when sourced alone).
- chmod 600 on accounts.json; flock for atomic registry writes;
  schema version 1.
- Insert Phase 6.5 Docker smoke test before live deploy.
- Entrypoint errors on missing CLAUDE_CONFIG_DIR (no silent fallback).
- Per-subcommand --help; cca and pronunciation in help text.
- Extract Python heredocs to docker/cleanup-projects.py;
  registry-driven --rm cleanup (not glob).
- README expansions: stronger concurrent-use warning, surface
  multi-account earlier, ckipper migrate preamble.
- Concrete Section 12 isolation assertions in test-prompt.md.
- Expanded follow-ups list.

Plan ready for execution via superpowers:executing-plans.
Final additions before handing the plan to executing-plans:
- New "Prerequisites" section at top of implementation plan covers
  starting branch state, required tools (with macOS flock fallback),
  the GPG-signing-under-sandbox quirk, and the two-deployments rule.
- Phase 7 explicitly marked user-driven with an instruction for the
  executing agent to STOP at the Phase 6.5/7 boundary.
- Design doc tool-layout: removed misleading tests/ entry under
  ~/.ckipper/ — fixtures live in the repo, not the deployed install.
The jq walk regex anchors on $HOME/ so the substitution replaces the full
prefix (template uses "$HOME/.claude/hooks/...") instead of producing
$HOME<dir>/hooks/... — fixes a concatenation bug in the plan's regex.
…no symlink

Replaced ERR-trap rollback with explicit post-mv / post-finalize checks.
The trap-based design cascaded inside finalize when the registry write
itself failed (chmod -w'd HOME), causing the script to hang on a chain of
trap re-entries. Explicit checks return promptly on either failure point.

Also: _ckipper_finalize_registration now verifies the registry write by
reading back the registered account; without this, write failures were
silent and migrate's rollback couldn't tell finalize had failed.
…virtiofs)

The original design mounted .claude.json read-only at a sibling path
(.claude-host.json) for race protection against the host's Claude. With
multi-account dirs, that staging path lives nested inside the dir bind-mount
— and macOS Docker Desktop's virtiofs cannot create a nested mountpoint
under another bind-mount ("mountpoint outside of rootfs" error).

Fix: drop the staging mount. The entrypoint now mutates .claude.json in
place. Race protection devolves to the documented "don't run the same
account in two sessions" rule (issue #24317). For multi-account, this is
the intended workflow anyway — different accounts use different dirs and
have no race.

Caught by the Phase 6.5 Docker smoke test.
…top level

The plan's migration block (and Task 6 spec) had cp -a target $CKIPPER_DIR
directly. But the new layout puts tooling at $CKIPPER_DIR/docker/, so the
contents of legacy $HOME/.claude/docker/ should land at $CKIPPER_DIR/docker/.

Caught when running install.sh against my actual host: Dockerfile,
entrypoint.sh, init-firewall.sh, w-function.zsh, and (worse) the user's
customized w-config.zsh ended up at $CKIPPER_DIR/ top level. The fresh
install.sh copy then put canonical versions at $CKIPPER_DIR/docker/, so
the user's customizations were silently shadowed.

The guard now checks for $CKIPPER_DIR/docker/ existence (not $CKIPPER_DIR
itself) so the migration runs once even if $CKIPPER_DIR exists for some
other reason.
The previous sed pattern matched the path portion only, leaving the
trailing closing quote of source "$HOME/.claude/docker/w-function.zsh"
dangling — broke shell startup with 'unmatched "'.

The new pattern consumes the entire 'source ... w-function.zsh' line
(quoted or unquoted, ~/$HOME/absolute) and rewrites to a canonical
quoted form. Tested against 4 legacy variants and 2 unrelated lines.
Bare 'claude' defaults to ~/.claude/ and writes credentials to the
unsuffixed 'Claude Code-credentials' Keychain entry — the same entry
the migrated 'personal' account is registered against. A fresh /login
silently overwrites the registered account's creds, breaking it on next
token refresh.

The auto-generated aliases.zsh now defines a claude() shell function
that refuses bare invocation when accounts are registered, and tells
the user to use claude-<name> / cca <name> / 'command claude' for
intentional fresh login. Pass-through when no accounts are registered
(first-time setup before migrate).

Lives in aliases.zsh so .zshrc stays clean — only source lines.
mswdev added 29 commits May 2, 2026 03:17
Critical:
- statusline: detect interpreter-prefix commands like "bash <src>/script.sh"
  by walking every whitespace token, not just the first. The apply
  function's path rewrite switched from sub() (anchored, regex) to
  literal split+join — safe with paths that contain regex metacharacters.
- preview UX wired end-to-end. _summary functions are now called per
  changeset row to populate the summaries TSV. The Apply prompt is a
  3-way Apply / View changes / Abort chooser; View loops through the
  drill-down picker until Apply or Abort. _ckipper_account_sync_count_changes
  prints the trailing "N changes (a new, b overwrite)" line per §6.1.

Important:
- _CKIPPER_LEGACY_COMMANDS[sync-hooks] now points at the renamed
  redeploy-hooks subcommand.
- Drill-down id/display fix: items file is the source of truth for the
  original id, recovered via _drill_down_resolve_id(items, type, display).
  Without this, files-based diffs would resolve to display ('foo.md')
  instead of id ('agents/foo.md').
- Per-target rollback safety: manifest_append fires BEFORE the apply
  call. A strategy that backs up + writes + then errors mid-write now
  has its entry in the manifest, so rollback restores from the backup
  dir. Test added that simulates a crashing apply and verifies the
  destination is restored to its pre-sync state.
- Renamed every _core_* function defined in lib/account/sync/ to
  _ckipper_account_sync_* (55 functions). Per .claude/rules/shell-conventions.md,
  _core_* is reserved for lib/core/. Sed pass: _core_account_sync_ →
  _ckipper_account_sync_, then _core_sync_ → _ckipper_account_sync_.
- Added lint-merge-guards rule that catches future _core_* definitions
  outside lib/core/. The guard would have failed this PR's pre-fix
  state — it now guards against the same drift recurring.
Five issues from the latest code review of PR #37:

Critical
- preview_prompt stdout pollution: when the user picked "View changes"
  then "Apply", $action ended up as "<diff text>...apply" not "apply",
  so the [[ "$action" == "apply" ]] gate at finalize silently skipped
  the apply. Two distinct causes:
    1. drill_down_loop wrote diffs and the press-enter prompt to stdout,
       which was captured by the outer $() in run_one_target.
    2. `local choice` inside preview_prompt's loop printed `choice='...'`
       on every iteration after the first — zsh re-declares behaviour.
  Fix: redirect drill_down_loop output to stderr; hoist `local choice=""`
  outside the loop; same hoist for `local _ack=""` in drill_down_loop.
  Regression test added to dispatcher_test.bats.

3-parameter cap (.claude/rules/code-style.md)
- Introduced _SYNC_CTX associative array as the per-target context object.
  apply_one 8→3, finalize 7→1, drill_down_show 6→1, drill_down_loop 5→1,
  preview_prompt 4→0. Keys: src_dir, dst_dir, src_name, dst_name,
  backup_dir, changeset, summaries, items.

Nesting cap (.claude/rules/code-style.md)
- Extracted _ckipper_account_sync_accumulate_positional from parse_args's
  case `*)` arm, dropping the function from 3 nesting levels to 2.

undo_dispatch doc-header (.claude/rules/shell-conventions.md)
- Added the missing `Errors (stderr):` line for the "Unknown flag" and
  "Usage:" messages.

CHANGELOG
- Removed the false note claiming `lib/account/sync.zsh` is "intentionally
  untouched" — the file was deleted in this PR (commit ea61ee6) and
  replaced with the lib/account/sync/ tree.

Verification
- 445 bats + 6 pytest, all green
- make lint-merge-guards green
Seven verified bugs from the latest review, each with a regression test
that watched the bug fail before the fix landed.

- structured: mcp_compare/settings_compare returned "overwrite" when the
  destination JSON file was absent (jq -> empty, not "null"); manifest
  recorded op=overwrite and rollback would not delete the new file.
  Treat empty as the same case as "null".

- hooks: filter_src jq gsub treated $src as a regex pattern; switched to
  literal split($src) | join($dst) so paths with `.`/`-` (e.g.
  ~/.claude-personal) don't mangle unrelated commands.

- statusline: copy_and_rewrite now appends a manifest entry for the
  copied script file. Mid-write crash + rollback used to leave the
  script orphaned because manifest_rel only emitted settings.json.

- backup: backup_file is now idempotent — a second call for the same
  rel within one invocation is a no-op so the original-state snapshot
  survives when settings + statusline both write settings.json.

- dispatcher: skip assert_dst_idle for --dry-run; preview-only runs
  shouldn't be blocked by a still-open Claude session.

- account dispatcher: 'sync-hooks' now prints a rename hint pointing
  at 'redeploy-hooks' (mirrors the top-level legacy-command pattern in
  ckipper.zsh). Removed the misleading "[hidden but callable]" doc and
  the contradictory CHANGELOG line, plus the stale comment about a
  non-existent fallback function.

- backup: manifest_append cleans up its tmp file on jq failure (was
  using `&&` chain that left the tmp file in the backup dir).
Six confirmed defects in the new sync engine, all uncovered by a fresh
multi-agent review. Each fix is paired with a regression test that fails
on develop and passes after the change.

E. apply_one used to derive op (create|overwrite) from change_status. A
   new SUB-KEY (e.g. one new MCP server) maps to change_status="new" even
   when the destination FILE already exists with unrelated content. The
   manifest then recorded op=create, and rollback's `rm -rf $live` would
   destroy the entire .claude.json on `sync undo`. Fix: derive op from
   `[[ ! -e $live ]]` at apply time, where $live is computed via a new
   live_path helper. (lib/account/sync/engine.zsh, backup.zsh)

G. prefs rollback resolved the live path as `$dst_dir/$rel`, which for
   prefs is the destination ACCOUNT dir, not $CKIPPER_REGISTRY. Rollback
   wrote a stray `accounts.json` into the account dir and silently left
   the registry corrupted. Fix: live_path returns $CKIPPER_REGISTRY for
   type=prefs; rollback_target reads the type field from the manifest
   and forwards it to rollback_one. (lib/account/sync/backup.zsh,
   engine.zsh)

A. assert_dst_idle filtered `_core_running_claude_processes` output via
   `grep -F $dst_dir`, but `pgrep -lx claude` only emits "<PID> claude"
   on macOS — no env, no argv. The grep could never match and the
   safeguard was dead code. macOS does not expose foreign processes'
   `CLAUDE_CONFIG_DIR` env var, so dst-specific filtering is not
   achievable; revert to refusing on any running Claude CLI.
   (lib/account/sync/engine.zsh, CHANGELOG.md, dispatcher.zsh help text)

B. undo_dispatch read the module-level _SYNC_FORCE directly, which leaks
   across invocations: a prior `sync ... --force` left _SYNC_FORCE=true,
   and a subsequent `sync undo` (no --force) silently bypassed the
   running-Claude refusal because parse_args resets the flag only on the
   sync path. Fix: undo_dispatch uses a local force var.
   (lib/account/sync/dispatcher.zsh)

D. statusline_compare lacked the empty-string guard that mcp_compare and
   settings_compare have. When the destination's settings.json was
   missing entirely, jq exited non-zero and d="" — the [[ "$d" == "null" ]]
   guard missed and compare returned "overwrite" instead of "new",
   mislabeling the preview UI. (lib/account/sync/strategies/statusline.zsh)

F. hooks_apply mutates settings.json (adding the paired .hooks entry)
   but the engine's apply_one only recorded a manifest entry for the
   script file (manifest_rel returns "<id>" for hooks). The settings.json
   mutation was untracked, so rollback after a hook sync left a phantom
   .hooks entry pointing at a deleted script. Fix: hooks_apply now
   appends a settings.json manifest entry alongside the script entry;
   duplicate entries from multi-hook syncs are harmless because rollback
   is idempotent. (lib/account/sync/strategies/hooks.zsh)

Tests: 455 → 467 (+12 regression tests across all 6 fixes). All green.
#37)

Consolidations driven by a parallel reuse / quality / efficiency review.

- Extract lib/account/sync/_shared.zsh with hash_file / hash_dir /
  diff_line_stats / json_status. Removes the cross-strategy reach where
  hooks.zsh called _file_hash defined in files_flat.zsh, and the three
  copies of the new|unchanged|overwrite block in mcp/settings/statusline
  compare functions.
- Refactor run_one_target (28 → 16 body lines) by extracting
  prepare_target_artifacts. Both helpers under the 25-line cap.
- Replace [[ \"\$type\" == \"prefs\" ]] hardcoding (4 sites in
  engine.zsh + preview.zsh) with the new _CKIPPER_SYNC_TYPE_USES_NAMES
  registry array — adding a name-typed sync type now requires only a
  registry entry, not engine + preview edits.
- Fix O(N²) preview summary lookup in render_summary — read summaries
  TSV into an assoc-array once instead of awk-scanning per row.
- Replace 2 inline gum-detect blocks in interactive.zsh with
  _core_prompt_use_gum.
- Delete stale §6.1 references in preview.zsh and dispatcher.zsh
  comments.

Test files updated to source _shared.zsh in the same load order as
production. 467/467 bats tests pass; make lint clean.
- settings strategy: enumerator now also excludes `.statusLine.*` paths
  (in addition to `.hooks`). Without this, `--include settings` (or
  `--include claude-config`) silently planted the source's absolute
  `statusLine.command` path on the destination, breaking the destination's
  statusline. The dedicated `statusline` strategy owns path-rewrite +
  internal-script copy, so it must be the only enumerator for that subtree.
  Updated `structured_test.bats` to assert exclusion.
- preview: extract magic number 26 → `_CKIPPER_SYNC_DISPLAY_COL_WIDTH`,
  matching the existing `_CKIPPER_SYNC_DIVIDER_WIDTH=45` pattern.
- rename sweep cleanup: update stale `sync-hooks` → `redeploy-hooks`
  references missed in the original rename — `install.sh` (echo + comments),
  `doctor.zsh` WARN message, and `templates/settings-template.json` comment.
…cker

Two fixes in lib/account/account-management.zsh:

1. Account validation/usage messages ("Account 'foo' is already
   registered.", "Usage: ckipper account remove <name>", etc.) printed
   to stdout. Other modules (lib/config, lib/worktree, doctor.zsh)
   already use stderr, so `ckipper account remove foo 2>/dev/null`
   silenced success but leaked the failure path. Routed every error
   path through `>&2`.

2. The Keychain entry picker used by `ckipper account add --adopt`
   hand-rolled `nl + read -r` instead of using _core_prompt_choose.
   Switched to _core_prompt_choose with an explicit "(skip)" sentinel
   so gum/no-gum routing matches every other interactive picker.
Two consistency cleanups left over from the sync overhaul:

1. Three sites bypassed _core_prompt_*: lib/account/sync/interactive.zsh
   used raw `read -r "var?label"` for two free-form prompts, and
   lib/setup/prompts.zsh:98 used raw `read -r` for a y/N. All three now
   route through _core_prompt_input / _core_prompt_confirm. Same
   behavior in no-gum mode; future helper-layer changes propagate
   automatically.

2. Module-level globals in lib/account/sync/{dispatcher,engine,preview}
   used the bare _SYNC_* prefix. Every other module uses _CKIPPER_*
   per .claude/rules/shell-conventions.md. Renamed the eight
   dispatcher state vars (_SYNC_FROM, _SYNC_TARGETS, _SYNC_INCLUDE,
   _SYNC_EXCLUDE, _SYNC_DRY_RUN, _SYNC_YES, _SYNC_FORCE, _SYNC_CTX)
   plus the unprefixed ALIASES_FILE_PERMS readonly in aliases.zsh.
   Tests updated. Verified zero leftover _SYNC_* references via grep.
Three small additions that close gaps the README implies are covered:

- ckipper doctor now checks (a) gum on PATH (FAIL — required for the
  setup/sync wizards), (b) ~/.zsh/completions/_ckipper exists and
  matches CKIPPER_COMPLETION_VERSION (WARN — stale completions only
  produce outdated tab suggestions), and (c) aliases.zsh parses with
  `zsh -n` (FAIL — a broken aliases file disables every per-account
  launcher in the user's shell). Also renamed MIN_HOOK_FILES to
  _CKIPPER_DOCTOR_MIN_HOOK_FILES while in doctor.zsh.

- Tab completion now offers schema keys after `ck config get/set/unset`,
  sourced live from _CKIPPER_SCHEMA_TYPE. Bumped
  CKIPPER_COMPLETION_VERSION 7 → 8 so existing installs regenerate the
  cached completion file on next shell start.

- The setup-wizard final message now lists all three Claude-launching
  paths (`ckipper run`, `claude-<account>`, bare `<account>`) and the
  `ck` interactive menu, plus a hint to run `ckipper doctor`.
  Previously it only mentioned `ckipper run`.
- Renamed slogan from "Multi-account Claude Code manager with Docker
  isolation, per-account preferences, and worktree-aware launchers." to
  "A lightweight Claude Code account, worktree, and Docker manager."
  Shorter, scans cleaner.

- Core commands table now lists `sync` and `redeploy-hooks` under
  account-namespace subcommands. Both are fully implemented and
  documented later in the README, but the at-a-glance command index
  was missing them.

- Renamed "Known limitations" → "Known limitations (Docker mode only)"
  and "Multi-account caveats" → "Multi-account caveats (host and
  Docker)" with intro lines that explicitly tag scope. The previous
  headings left it ambiguous whether each limitation applied only to
  the dockerized launch path or to ckipper in general.

- Added two Docker-mode limitations confirmed via research:
  - Claude in Chrome MCP — the browser extension can't bridge the
    host/container boundary (anthropics/claude-code#25506).
  - Custom system-sound hooks — afplay/say/osascript don't exist in
    the Linux container; built-in notify-bell.sh sidesteps via the
    terminal bell (\a) character.
Post-sync-overhaul cohesion cleanup + Docker-mode README clarifications
In zsh an `EXIT` trap set inside a function fires when *that* function
returns, regardless of `local_traps` in the caller. Previously the trap
was set inside `_core_registry_acquire_mkdir_lock`, so the lockdir was
released the moment acquire returned — before the caller's jq write
ever ran. A concurrent writer could grab the lock during the supposed
critical section and clobber the in-flight update.

Move the trap into `_core_registry_update_mkdir_fallback` (the function
whose lifetime spans the critical section) and eager-expand `\$lockdir`
at trap-set time, since by the time the trap actually fires the
function's local is already out of scope.

Add three bats tests:
- acquire leaves the lockdir intact for the caller after returning;
- fallback releases the lockdir after a successful update;
- fallback releases the lockdir after a jq failure.

Verified that the test for the regression fails on the previous
implementation and passes after the fix.
`_core_config_write_global` emitted every value as a quoted scalar
regardless of schema type, so `ckipper config set ports 3000,3030,6006`
rewrote the templated `CKIPPER_PORTS=(3000 3030 6006)` array literal as
`CKIPPER_PORTS="3000,3030,6006"`. After the next shell startup,
`CKIPPER_PORTS` was a string, so `for port in "${CKIPPER_PORTS[@]}"` in
`lib/worktree/ports.zsh` iterated once with the whole CSV — silently
breaking port forwarding under Docker mode.

Extract `_core_config_format_line` and pick the on-disk form by schema
type: int_array → `KEY=(a b c)`, everything else → `KEY="value"`.
Teach `_core_config_read_global` to recognise the array form and
return CSV so `ckipper config get ports` round-trips with whatever the
user passed to `set`.

Add four bats tests covering the regression, the round-trip, the
sourcing-side shape, and idempotent rewrite of an existing array
literal in place.
…meref)

`local -n _picked_ref="$2"` is invalid in zsh — the `local` builtin
errored silently and the subsequent `_picked_ref="$picked"` leaked the
value to a literal global named `_picked_ref` instead of the caller's
variable. The adopt flow on macOS therefore never honoured the user's
Keychain selection: `picked` stayed empty, so the registry recorded
the account with no associated Keychain service.

Drop the failed nameref pattern. Echo the chosen service to stdout,
make the caller capture it: `picked=$(_helper "$name") || return 1`.
This matches the rest of the module's style (snapshot, validate, etc.
all use stdout-capture).

Add four bats tests: stdout shape on valid pick, empty output on skip,
empty output when the keychain has no candidates, and the existing
"invalid service shape" error path.
`_ckipper_account_remove` proceeded straight into the destructive
cleanup helpers — `_ckipper_account_cleanup_dir` runs `rm -rf` on the
account's config directory — without ever asserting that no Claude
process was alive. A user with Claude open in another terminal who ran
`ckipper account remove <name>` would have their live session's
config dir removed from underneath the running process.

`account rename` already has the guard at
`_ckipper_account_rename_check_preconditions:495`. `account remove` is
strictly more destructive; mirror the guard right after the
"is registered?" check and before any state mutation.

Add a bats case that runs `_ckipper_account_remove work` with a stub
pgrep reporting a fake process and `CKIPPER_FORCE=0`, and asserts
non-zero exit, the registry entry preserved, and the config dir intact.
Both `_ckipper_account_default` and `_ckipper_account_remove` invoked
`_core_registry_update` without checking its exit status, then printed
success ("Default account is now …" / "Unregistered …") regardless of
whether the registry write actually landed. `account_remove` was the
worse of the two: after a silent registry-write failure it still
proceeded to `rm -rf` the config dir and prompt to delete the keychain
entry — based on a registry state that hadn't changed.

Wrap both calls in `if ! _core_registry_update …; then …` and gate the
follow-up cleanup helpers on success. Surface a clear stderr error and
return 1 on failure.

Add bats tests that stub `_core_registry_update` to fail and assert
non-zero exit, an "Error" message on stderr, the absence of the
success line, and (for remove) that the config dir survives.
`_ckipper_account_sync_settings_*` interpolated the dotted item id
straight into a jq filter string (`.cleanup-period-days`), which jq
parsed as `.cleanup - .period - .days` — subtraction. Most Claude
Code settings (cleanup-period-days, max-tokens, …) hit this. Result:
`compare` returned "new" for both source and destination (both jq
calls compile-failed silently), and `apply` aborted with a jq compile
error mid-stream, rolling back the entire sync slice.

A second silent-corruption shape: keys containing literal dots (rare
but legal JSON, e.g. `{"some.key": …}`) were split on `.` into a
multi-element path, so `setpath` planted a nested
`{"some":{"key":null}}` and lost the original flat key — with rc=0.

Fix the contract so the id NEVER goes through a jq filter parser:
- `enumerate` emits the path as a JSON-encoded array
  (`["cleanup-period-days"]`, `["permissions","allow"]`); the display
  column keeps the dotted form for the preview UI.
- `compare`, `summary`, `diff`, `apply` consume the id via
  `--argjson p "$id"` and use `getpath`/`setpath` — never
  string-interpolated jq filters, never `split(".")`.
- Drop the dead `_ckipper_account_sync_settings_jq_path` helper.

Update the existing settings tests to pass JSON-array ids. Add three
new tests: hyphenated-key enumerate (id has correct shape), hyphenated
compare (round-trips without jq compile error), hyphenated apply
(value lands without aborting), and dotted-key apply (no nested-object
leak).
Two regex bugs in the bash-guardrails safety hook:

1. The recursive-delete guard required *both* `r` AND `f` in the rm
   flags (regex alternatives `r[a-zA-Z]*f` or `f[a-zA-Z]*r` or
   `--recursive`). Plain `rm -r /home/user/important` matched none of
   them and was silently allowed. With `--dangerously-skip-permissions`
   this could erase arbitrary user state. Replace with a single
   alternation that requires *either* an r/R flag or `--recursive`.

2. The git-history guard used `--force\b`. `\b` fires at any
   word/non-word boundary, including the `-` in `--force-with-lease` —
   the very replacement the error message tells the user to switch to.
   Anchor the right side against whitespace-or-end-of-string instead.

Add seven new bats cases: rm -r blocked, rm -R blocked,
rm --recursive blocked, rm -rf node_modules still allowed,
--force-with-lease passes, --force still blocked, -f still blocked.
The schema arrays (`_CKIPPER_SCHEMA_TYPE`, `_DEFAULT`, `_SCOPE`,
`_DESCRIPTION`) are pure data — no functions, no behavior. They were
sourced from `lib/config/schema.zsh` but read directly by foundation
modules (`lib/core/config.zsh`, `lib/core/registry.zsh`), which
violated the dependency-direction rule from
`.claude/rules/file-organization.md`: "Parent directories MUST NOT
import from child route/feature directories."

Move the file to `lib/core/schema.zsh` (still 9 source files in
`lib/core/`, under the 10-file cap) and chase every `lib/config/schema`
reference across:
  - source: ckipper.zsh, lib/account/sync/strategies/structured.zsh,
    lib/account/sync/registry.zsh, lib/account/doctor.zsh,
    lib/setup/{prompts,apply}.zsh, lib/worktree/worktree.zsh,
    lib/core/{config,registry}.zsh
  - tests: every *_test.bats file that sourced the schema
  - docs: README.md, CONTRIBUTING.md, doc-comments

`lib/config/` still owns the user-facing get/set/unset/list/edit
subcommands — only the data file moved. `make lint-merge-guards` and
the full bats suite remain green (493 tests).
…d.zsh

The same `typeset -gA _CKIPPER_SYNC_CTX` declaration lived in three
sibling modules — engine.zsh, dispatcher.zsh, preview.zsh — each with
an elaborate "re-declaration without `=()` is a no-op" comment
justifying the duplication. Residue of the /simplify pass that left
the parameter-object refactor half-applied: a single `=()` slip in any
of the three would silently reset state set by the others.

Move the canonical declaration into `_shared.zsh` (sourced first in
the production load order) and drop the three redundant copies. The
three module test files (engine_test.bats, dispatcher_test.bats,
preview_test.bats) now also source `_shared.zsh` so the declaration
is in scope when each test exercises its module in isolation.

Full bats suite remains green (493 tests).
Bare-named globals like `KEYCHAIN_TIMEOUT_SECONDS`, `REGISTRY_FILE_PERMS`,
and the various `LOCK_*` constants leaked into the user's zsh
namespace and could collide with random shell vars. The 9894aaf
"standardize prompt helpers and module-constant prefixes" commit moved
prompt/style/fuzzy/help to the `_CORE_<MODULE>_*` convention; keychain,
registry, and config were missed.

Rename in this commit:
  KEYCHAIN_TIMEOUT_SECONDS         → _CORE_KEYCHAIN_TIMEOUT_SECONDS
  REGISTRY_FILE_PERMS              → _CORE_REGISTRY_FILE_PERMS
  LOCK_NOTIFY_THRESHOLD_ATTEMPTS   → _CORE_REGISTRY_LOCK_NOTIFY_THRESHOLD_ATTEMPTS
  LOCK_MAX_ATTEMPTS                → _CORE_REGISTRY_LOCK_MAX_ATTEMPTS
  STALE_LOCK_AGE_THRESHOLD_SECONDS → _CORE_REGISTRY_STALE_LOCK_AGE_THRESHOLD_SECONDS
  LOCK_RETRY_INTERVAL_SECONDS      → _CORE_REGISTRY_LOCK_RETRY_INTERVAL_SECONDS
  _CKIPPER_GLOBAL_PREFIX           → _CORE_CONFIG_GLOBAL_PREFIX

Definitions and references updated. Test files included. Bats suite
remains green (493 tests).
… prefix

Same pattern as the previous core/ rename: bare-named globals leaked
into the user's namespace and could collide with arbitrary shell vars.

Rename:
  SHASUM_BITS                → _CKIPPER_WT_SHASUM_BITS
  CLAUDE_CONTAINER_UID       → _CKIPPER_WT_CLAUDE_CONTAINER_UID
  CLAUDE_CONTAINER_GID       → _CKIPPER_WT_CLAUDE_CONTAINER_GID
  CREDS_TMPFS_MODE           → _CKIPPER_WT_CREDS_TMPFS_MODE
  CREDS_TMPFS_SIZE           → _CKIPPER_WT_CREDS_TMPFS_SIZE
  DOCKER_GROUP_ADD_HOST_ROOT → _CKIPPER_WT_DOCKER_GROUP_ADD_HOST_ROOT
  MAX_PORT_FALLBACK_ATTEMPTS → _CKIPPER_WT_MAX_PORT_FALLBACK_ATTEMPTS

The `_CKIPPER_WT_` prefix matches the existing convention for
worktree-module shared state (`CKIPPER_WT_DOCKER_ARGS`,
`CKIPPER_WT_FLAG_DOCKER`, etc.); the leading underscore distinguishes
"private constant" from "module-shared state". Bats remains green (493).
…nd loops

Two shell-style fixes bundled because they overlap in the worktree files:

1. Boolean comparison style: `[[ "$X" = true ]]` → `[[ "$X" = "true" ]]`.
   The unquoted form is non-canonical per `.claude/rules/shell-conventions.md`
   ("test with `[[ \"$x\" = \"true\" ]]`"). Five sites in
   run.zsh/docker-mode.zsh/worktree.zsh.

2. `for x in $(find ...)` loops silently lose paths containing spaces /
   tabs / newlines because the unquoted command substitution undergoes
   word-splitting on $IFS. Convert to null-delimited iteration:
       while IFS= read -r -d '' x; do ... done < <(find ... -print0)
   Four sites: ckipper.zsh:304/317 (completion), launcher/menu.zsh:96
   (project enumeration), worktree.zsh:331 (.env file copy on create).

Also rename `CKIPPER_WT_FIND_MAX_DEPTH` → `_CKIPPER_WT_FIND_MAX_DEPTH`
(touched here anyway by the find-loop edit; the leading underscore
matches the module-private-constant convention used elsewhere in
worktree/).

Bats remains green (493 tests).
The contract in ckipper.zsh:191-193 ("Run \`ckipper <namespace>
<subcommand> --help\` for per-subcommand details") held for the
account and worktree dispatchers but not for config — running
\`ckipper config get --help\` returned "Unknown flag: '--help'"
because get/set/unset/list/edit forwarded the flag straight to the
handler.

Mirror the account/worktree pattern: short-circuit `<subcmd> --help`
to `_ckipper_config_help` before dispatching the handler. Five new
bats cases — one per subcommand — assert the exit-zero + overview
text. Other dispatchers (account, worktree, setup, run) already
comply; audited as part of this change.
`bash-guardrails.sh:78` blocked only `.git/info/attributes` and
`.git/info/exclude`; `protect-claude-config.sh:47` blocked all of
`.git/info/`. The two hooks gated the same threat (a container-side
edit landing on a host-side `.git/` file that executes on the next git
invocation) but disagreed on the surface — Slice 5 of the develop
review flagged this as the load-bearing inconsistency.

Align the inner subpath alternation:
    .git/(config|info/|hooks/|worktrees/)
The leading anchor still differs (Bash sees command strings,
Edit/Write sees realpath-resolved file paths), but the protected-path
list is identical now. Cross-reference comments added to both files so
future drift is harder to introduce silently.

Three new bats cases: writes anywhere under `.git/info/` are blocked,
writes to `.git/info/attributes` are blocked, reads of `.git/info/`
files still pass through.
Two standard supply-chain hardening steps for the workflow:

1. Top-level `permissions: contents: read`. The default GITHUB_TOKEN
   on `push` carries write access to the repo; with no opt-in
   permissions block any action invoked here would inherit that. The
   `make lint` / `make test-unit` job only needs to read source — make
   that explicit so a future regression that pulls in a malicious
   action can't escalate silently.

2. `timeout-minutes: 15` on the lint-and-test job. A hung step (network
   stall in `pip install`, an infinite loop in a test) currently runs
   for 6 hours before GitHub kills it. 15 min is generous: current CI
   is well under 5 min including brew install + pip install + the full
   bats + ruff + pytest passes.

SHA-pinning `actions/checkout@v4` to a commit hash is the third
standard hardening but I'm leaving that as a follow-up so we can pick
a known-good SHA from the actions/checkout release page rather than
guessing.
Three small follow-ups from the develop-branch review:

1. `_ckipper_account_list_row` took 4 positional args (.claude/rules
   3-param cap). Move `default` into a module-level
   `_CKIPPER_ACCOUNT_LIST_DEFAULT` global that
   `_ckipper_account_list` populates once per invocation; the row
   helper drops to 3 args + 1 documented context read. Same pattern
   already in use elsewhere in this module (_CKIPPER_FINALIZE_CTX,
   _CKIPPER_RENAME_CTX).

2. `lib/worktree/build-image.zsh:10` and `lib/worktree/ports.zsh:41`
   wrote error/warning text to stdout. Per the project checklist,
   error paths surface to monitoring — move both to stderr (`>&2`).

3. `lib/worktree/normal-mode.zsh` `cd` calls were unguarded. If the
   worktree path went missing between create and run, the command
   silently executed in the caller's directory (and the trailing
   restoring `cd "$old_pwd"` likewise). Add an upfront existence check,
   gate each `cd` with `|| return 1` (or `|| true` for the best-effort
   restore), and surface a stderr error when the path is missing.

Bats remains green (501).
Develop-branch code-review fixes: 9 Critical bugs + cross-cutting cleanup
@mswdev mswdev merged commit ae14db4 into main May 4, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant