What's broken
afx workspace recover resurrects builders that were previously cleaned up via afx cleanup -p <id>. The cleanup left the worktree + branch + porch state on disk (the documented default behavior — preserves user scratch), and workspace recover then treats that preserved state as "a builder whose shellper died and should be revived."
The two operations have contradictory views of the same on-disk state:
- afx cleanup writes worktree-preserved files as a deliberate post-cleanup snapshot ("to remove: git worktree remove ...").
- afx workspace recover sees the same files as evidence of a crashed builder and re-spawns the Tower terminal.
Net effect: the user ran workspace recover to revive a different set of crashed builders, and a previously-cleaned-up project got swept in as collateral.
Concrete incident (cluesmith/shannon, 2026-05-27)
- 2026-05-25: PIR builder for shannon issue #1778 ("infra(native): wire pnpm -F native test into Turbo + CI") investigated and determined no PR was needed (CI already runs native tests via the existing apps/* sweep). Builder closed the GitHub issue without a PR and emitted the "ready for cleanup" notification.
- Architect ran afx cleanup -p 1778. Output included the standard "Worktree preserved at: /Users/.../.builders/pir-1778" + "Branch preserved: builder/pir-1778".
- 2026-05-27: user ran afx workspace recover to revive a different set of builders that had been killed by a Tower restart.
- Side effect: builder-pir-1778 re-appeared in afx status with the worktree at its preserved post-cleanup commit. Porch phase: implement. The issue is still CLOSED on GitHub.
The builder then began making forward progress on the reopened porch state, producing commits that contradict the closed-issue disposition (21ed8bd75 adds a turbo.json change, e049c6bd9 reverts a ci.yml change).
Expected behavior
workspace recover should NOT revive a builder that was intentionally cleaned up. Specific signals it could check:
1. A cleanup marker file dropped by afx cleanup (e.g., .agent-farm/cleaned-up or a status.yaml field like cleaned_up_at: <ts>). This is the most direct fix: it gives cleanup and recover a shared piece of state that lets one see the other's intent.
2. GitHub issue state. If the linked issue is CLOSED (and there's no open PR referencing it), skip recovery. This can be determined per project from existing porch state.
3. Tower's own record of how the terminal exited. If cleanup killed the terminal cleanly (vs. a crash / SIGKILL from a reboot), recover should treat that as intentional.
(1) is the cleanest because it works without depending on GitHub state OR Tower's process bookkeeping.
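A minimal sketch of signal (1), assuming the marker lives at .agent-farm/cleaned-up inside the builder worktree (the example path mentioned above). The function names and the JSON payload are hypothetical and are not taken from the actual afx code:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical marker location, relative to the builder worktree.
MARKER = Path(".agent-farm") / "cleaned-up"


def write_cleanup_marker(worktree: Path) -> Path:
    """Cleanup side: record that this builder was retired on purpose."""
    marker = worktree / MARKER
    marker.parent.mkdir(parents=True, exist_ok=True)
    payload = {"cleaned_up_at": datetime.now(timezone.utc).isoformat()}
    marker.write_text(json.dumps(payload))
    return marker


def was_cleaned_up(worktree: Path) -> bool:
    """Recover side: True means the preserved state is intentional, so skip it."""
    return (worktree / MARKER).is_file()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        worktree = Path(tmp) / "pir-1778"
        worktree.mkdir()
        print(was_cleaned_up(worktree))  # False: looks like a crashed builder
        write_cleanup_marker(worktree)
        print(was_cleaned_up(worktree))  # True: recover should leave it alone
```

Because the marker travels with the worktree, removing the worktree later (git worktree remove) also removes the marker, so no separate bookkeeping is needed.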
Workaround
For now, after every workspace recover run, the user has to manually inspect afx status for builders that shouldn't be there and clean them up again. Each false-positive revival costs ~5 minutes of investigation + cleanup, plus the risk of accidentally landing the wrong work.
Severity
Medium. Doesn't lose user data, but breaks the assumption that afx cleanup is a terminal operation — recover silently undoes it. Surface area grows the more builders accumulate over time (every preserved worktree is a future false-positive candidate).
Related
- afx workspace recover --max-age defaults to 7 days; the 2026-05-25 → 2026-05-27 gap was within that window. Raising the default wouldn't help because the same race exists at any age threshold.
- Sibling design point: afx cleanup already prints "To remove: git worktree remove ..." / "To delete: git branch -d ..." as a reminder of follow-up steps. If we expect the user to run those manually, recover should at least filter out worktrees whose branches were deleted, but the branch typically isn't deleted either, so this isn't a robust signal.
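To illustrate the --max-age point with the dates from the incident, here is a small sketch. It assumes the flag simply compares the age of the preserved state against the threshold; the function name and that comparison are assumptions, not the actual afx logic:

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=7)  # default stated in this issue


def within_max_age(last_activity: date, today: date,
                   max_age: timedelta = MAX_AGE) -> bool:
    """Age-only filter: it cannot tell a crashed builder from a cleaned-up one."""
    return today - last_activity <= max_age


if __name__ == "__main__":
    # Cleaned up on 2026-05-25, recover run on 2026-05-27: a 2-day gap.
    print(within_max_age(date(2026, 5, 25), date(2026, 5, 27)))  # True
```

Any threshold that admits recently crashed builders also admits recently cleaned-up ones, which is why tuning the age alone does not separate the two cases.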
Acceptance
- afx cleanup -p <id> writes a cleanup marker (file or status.yaml field)
- afx workspace recover reads the marker and skips marked projects
- workspace recover --include-stale or a new flag can override the skip if the user really wants to revive a cleaned-up project
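The recover-side criteria could look roughly like the sketch below. The Builder type, the cleaned_up field, and the include_cleaned parameter are hypothetical stand-ins for whatever marker and flag are eventually chosen:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Builder:
    project_id: str
    cleaned_up: bool  # True if a cleanup marker was found for this project


def select_for_recovery(builders: List[Builder],
                        include_cleaned: bool = False) -> List[Builder]:
    """Skip marked builders unless the user explicitly overrides the skip."""
    return [b for b in builders if include_cleaned or not b.cleaned_up]


if __name__ == "__main__":
    # Project ids other than 1778 are made up for the example.
    candidates = [
        Builder("1778", cleaned_up=True),   # retired via afx cleanup
        Builder("1801", cleaned_up=False),  # killed by a Tower restart
    ]
    print([b.project_id for b in select_for_recovery(candidates)])
    # ['1801']
    print([b.project_id for b in select_for_recovery(candidates, include_cleaned=True)])
    # ['1778', '1801']
```

Defaulting the override to off keeps cleanup terminal unless the user asks otherwise.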