fix(integrity): roll back legacy packages on failed reinstall (InstallStash)#137
Merged
Conversation
…lStash)
A content-complete *legacy* package (no .mcpp_ok marker) was deleted
outright by clean_incomplete_install() on the resolve path before a
reinstall. If the reinstall then failed (xlings extracting to
$XLINGS_HOME, network error, etc.), the working package was gone for
good — e.g. xim-x-zlib vanished, leaving clang's RUNPATH dangling and
`clang++ --version` failing with 'libz.so.1: cannot open shared object'
(exit 127), with no way to recover short of a manual reinstall.
Add InstallStash, an RAII guard that replaces the bare clean on the
resolve path: it renames the no-marker dir aside instead of deleting it,
commits (drops the backup) on each success path, and on any failure path
its destructor:
- keeps the new install if verdir now has a marker (reinstall won);
- else restores a looks_complete_legacy() stash and adopts it (writes
the marker) so the next resolve trusts it instead of repeating the
delete-then-fail cycle;
- else discards genuine half-extracted residue (historical semantics).
Wire it into Fetcher::resolve_xpkg_path with stash.commit() before each
success return. Add 5 gtest cases covering restore / commit /
keep-new-complete / discard-residue / noop-when-marked.
Also lands the cross-repo root-cause + implementation doc under
.agents/docs/ (covers this and the compat.xcb codegen fix in mcpp-index).
The 3 stash tests that read a payload via std::ifstream then called fs::remove_all() on the temp tree left the stream in scope. POSIX allows unlinking open files; Windows does not, so remove_all() threw 'being used by another process' and the tests failed only on the windows CI leg. Read through a read_first_line() helper that closes the handle before returning. Production InstallStash code is unchanged.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem (问题2:
mcpp run→ llvmclang++exit 127)Switching to the llvm toolchain produced:
clang is fine — its
libz.so.1comes (via RUNPATH) fromxim-x-zlib, whoseversion directory had been deleted and never restored. Root cause: a
content-complete legacy package (installed before the
.mcpp_okmarkersystem, so unmarked) is deleted outright by
clean_incomplete_install()on theresolve path before a reinstall. When that reinstall then fails (xlings
extracting to
$XLINGS_HOMEinstead of the sandbox, network error, …), theworking package is gone for good — and
mcpp toolchain install llvmdoesn'thelp because it only reinstalls llvm itself, not the orphaned zlib sibling.
Fix
Add
InstallStash, an RAII guard replacing the bare clean on the resolvepath. It renames the no-marker dir aside instead of deleting it, and:
commit()on each success path → drops the backup;verdirnow has a marker (reinstall won);looks_complete_legacy()stash and adopts it(writes the marker, so the next resolve trusts it instead of repeating the
delete-then-fail cycle);
Wired into
Fetcher::resolve_xpkg_pathwithstash.commit()before eachsuccess
return.Tests
New
tests/unit/test_install_integrity.cpp— 5 gtest cases, all green locally(
mcpp test: 22 passed / 0 failed):RestoresLegacyPackageOnFailedReinstallCommitDropsBackupAndKeepsNewInstallKeepsNewCompleteInstallWhenUncommittedDiscardsNonLegacyResidueOnFailureNoopWhenAlreadyCompleteScope
Only the resolve/install path's pre-clean changes from delete→stash. Normal
installs (already-complete dirs return earlier; genuine residue still
discarded) are unchanged. Companion to mcpplibs/mcpp-index#43 (the
compat.xcbcodegen fix). Full analysis in.agents/docs/2026-06-21-xcb-and-install-integrity-cross-repo-fix.md.