fix(ci): resolve integration test suite failures on main#1151
Three groups of integration test failures fixed:
1. **Volume Mounts** (docker-manager.ts): Custom --mount paths were not
visible inside the agent container because chroot mode (always enabled)
does `chroot /host`, making paths like /data invisible. Fixed by
prefixing the container path with /host (e.g., /data -> /host/data)
so mounts are accessible after chroot.
2. **API Proxy Tests** (observability + rate-limit): Tests used
JSON.parse(result.stdout) and header assertions, but stdout contained
Docker build output mixed with curl output when --build-local is used.
Added extractLastJson() and extractCommandOutput() helpers to isolate
actual command output from Docker build noise. Converted header-based
assertions to use JSON response body instead of curl -i headers.
3. **One-Shot Token Tests**: Multiple issues fixed:
   - AWF_ONE_SHOT_TOKEN_DEBUG env var was not forwarded to the container,
     so debug log assertions always failed. Added forwarding in docker-manager.
   - COPILOT_GITHUB_TOKEN was never forwarded to the container in non-api-proxy
     mode, causing token caching tests to fail. Added to selective forwarding.
   - Python inline scripts used single quotes inside single-quoted bash commands,
     causing shell syntax errors. Converted to heredoc syntax.
   - NORMAL_VAR tests used options.env which doesn't reach the container;
     switched to cliEnv which uses explicit -e flags.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
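To illustrate the volume mount fix from item 1, here is a minimal sketch of rewriting a mount spec so its container path lands under /host. The helper name `prefixMountForChroot`, the `CHROOT_ROOT` constant, and the `host:container[:mode]` parsing are assumptions for illustration; this is not the actual code in docker-manager.ts.

```typescript
// Hypothetical sketch, not the real docker-manager.ts implementation.
// Assumes mount specs look like "hostPath:containerPath[:mode]".
const CHROOT_ROOT = "/host";

function prefixMountForChroot(spec: string): string {
  const parts = spec.split(":");
  if (parts.length < 2 || parts.length > 3) {
    throw new Error(`invalid mount spec: ${spec}`);
  }
  const hostPath = parts[0] ?? "";
  const containerPath = parts[1] ?? "";
  if (!containerPath.startsWith("/")) {
    throw new Error(`container path must be absolute: ${containerPath}`);
  }
  // After `chroot /host`, the path /host/data is seen as /data,
  // so the container-side path is always placed under /host.
  const target = CHROOT_ROOT + containerPath;
  // Preserve the optional mode suffix (e.g. "ro") when present.
  const suffix = parts.length === 3 ? `:${parts[2]}` : "";
  return `${hostPath}:${target}${suffix}`;
}

console.log(prefixMountForChroot("/tmp/test:/data:ro"));
```

With this sketch, the spec `/tmp/test:/data:ro` becomes `/tmp/test:/host/data:ro`, which matches the /data -> /host/data example above.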
|
| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 82.03% | 82.18% | 📈 +0.15% |
| Statements | 82.01% | 82.12% | 📈 +0.11% |
| Functions | 82.50% | 82.50% | ➡️ +0.00% |
| Branches | 74.20% | 74.03% | 📉 -0.17% |
📁 Per-file Coverage Changes (1 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/docker-manager.ts | 83.1% → 83.7% (+0.55%) | 82.4% → 82.8% (+0.40%) |
Coverage comparison generated by scripts/ci/compare-coverage.ts
Build Test: Node.js Results
Overall: ✅ PASS
|
.NET Build Test Results
Overall: PASS
|
Build Test: Deno ✅
Overall: PASS Deno version: 2.7.4
|
C++ Build Test Results
Overall: PASS 🎉
|
🧪 Bun Build Test Results
Overall: ✅ PASS
|
|
PR titles: fix(test): resolve integration test suite failures on main; fix: speed up firewall shutdown by ~10s
|
|
Smoke Test Results, run #22731026835: ✅ GitHub MCP, last 2 merged PRs: #1078 "fix: add explicit execute directive to smoke-codex to prevent noop", #1070 "chore: investigate issue duplication detector workflow failure". Overall: PASS
|
Rust Build Test Results
Overall: PASS ✅
|
Java Build Test Results
Overall: ✅ PASS All projects compiled and tested successfully against Maven Central via the AWF proxy.
|
Chroot Version Comparison Results
Result: Tests did not fully pass — Python and Node.js versions differ between host and chroot environments.
|
Go Build Test Results
Overall: PASS ✅
|
|
Smoke test results (run 22731026947) ✅ GitHub MCP — #1003: chore(deps): bump all-npm-dependencies; #1078: fix: add explicit execute directive to smoke-codex Overall: PASS
|
- npm audit fix for minimatch ReDoS (GHSA-7r86-cg39-jmmj, GHSA-23c5-xmqv-rm74)
- npm audit fix --force for svgo DoS (GHSA-xpqw-6gx7-v673) in docs-site
- Remaining 4 moderate lodash vulnerabilities require breaking @astrojs/check upgrade
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Go Build Test Results ✅
Overall: PASS
|
🧪 Build Test: Bun
Overall: ✅ PASS Bun version: 1.3.10
|
|
fix: add explicit execute directive to smoke-codex to prevent noop
|
Java Build Test Results ☕
Overall: PASS ✅ All Java projects compiled and passed their tests successfully via Maven (using Squid proxy at
|
Chroot Version Comparison Results
Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environment.
|
…ll escaping Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build Test: Deno ✅
Overall: PASS Deno 2.7.4 · All 2 tests passed
|
Build Test: Bun Results ✅
Overall: PASS Bun version: 1.3.10
|
Build Test: Node.js ✅
Overall: PASS
|
Go Build Test Results
Overall: ✅ PASS
|
|
🔬 Smoke test results for
Overall: PASS
|
Rust Build Test Results
Overall: ✅ PASS
|
.NET Build Test Results
Overall: PASS
|
|
PRs: fix(squid): run Squid container as non-root user; fix(deps): resolve minimatch ReDoS and ajv vulnerabilities
|
Java Build Test Results
Overall: PASS ✅
|
|
Smoke Test Results — PASS
|
Chroot Version Comparison Results
Result: ❌ Not all versions match — the chroot environment uses different Python and Node.js versions than the host runner.
|
C++ Build Test Results
Overall: PASS ✅
|
The X-Request-ID injection test checked that stdout doesn't contain '<script>', but container entrypoint logs in stdout can contain angle brackets. Instead, verify that the x-request-id response header contains a valid UUID (proving the proxy rejected the injected value).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
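The header check described in this commit could look roughly like the sketch below. The names `findRequestId` and `isUuid` are hypothetical, and the pattern accepts any UUID version because the version generated by the proxy is not stated here.

```typescript
// Hypothetical sketch of the assertion, not the actual test code.
// Accepts any UUID version; the proxy's real format is an assumption.
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value: string): boolean {
  return UUID_RE.test(value);
}

// Finds the first x-request-id header line in mixed output
// (for example `curl -i` output interleaved with container logs).
function findRequestId(output: string): string | null {
  const match = output.match(/^x-request-id:[ \t]*(.*?)[ \t]*\r?$/im);
  return match?.[1] ?? null;
}

const sample = [
  "[entrypoint] starting <agent>",
  "HTTP/1.1 200 OK",
  "x-request-id: 123e4567-e89b-12d3-a456-426614174000",
  "",
  '{"ok":true}',
].join("\r\n");

const requestId = findRequestId(sample);
console.log(requestId !== null && isUuid(requestId));
```

Checking the header value directly avoids false failures from log lines that happen to contain angle brackets, since an injected value such as `<script>` can never match the UUID pattern.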
C++ Build Test Results
Overall: PASS ✅
|
Smoke Test Results
Overall: PASS
|
🟢 Build Test: Go — PASS
Overall: PASS — All 3 projects built and tested successfully.
|
|
Smoke Test Results — ✅ GitHub MCP: Last 2 merged PRs retrieved
✅ Playwright: github.com title contains "GitHub" Overall: PASS
|
Node.js Build Test Results
Overall: ✅ PASS
|
|
fix(security): eliminate TOCTOU race conditions in ssl-bump.ts
|
Bun Build Test Results
Overall: PASS ✅ Tested with Bun v1.3.10
|
.NET Build Test Results
Overall: PASS
|
🦕 Deno Build Test Results
Overall: ✅ PASS
|
Java Build Test Results ✅
Overall: PASS
|
Chroot Version Comparison Results
Result: FAILED — Python and Node.js versions differ between host and chroot environments.
|
Summary
Fixes three groups of integration test failures in `test-integration-suite.yml` that have been failing on main since the workflow was added on Feb 26.
Volume Mounts (source code fix)
- Custom `--mount` paths (e.g., `/tmp/test:/data:ro`) were invisible inside the agent container because chroot mode (always enabled) does `chroot /host`, making paths not under `/host` inaccessible.
- Fix: prefix the container path with `/host` (e.g., `/data` becomes `/host/data`) so mounts are accessible after chroot.
API Proxy Tests (test fixes)
- `JSON.parse(result.stdout)` and header assertions failed because stdout contained Docker build output mixed with the actual curl output when `--build-local` is used.
- Fix: added `extractLastJson()` and `extractCommandOutput()` helpers in `tests/fixtures/stdout-helpers.ts` to isolate command output from Docker build noise. Converted header-based assertions (`curl -i`) to use the JSON response body instead.
One-Shot Token Tests (source code + test fixes)
- `AWF_ONE_SHOT_TOKEN_DEBUG` env var was not forwarded to the container, so `[one-shot-token]` debug log assertions always failed.
- `COPILOT_GITHUB_TOKEN` was never forwarded to the container in non-api-proxy mode.
- Python inline scripts used single quotes inside single-quoted bash commands, causing shell syntax errors.
- `NORMAL_VAR` tests used `options.env`, which doesn't reach the container; they needed `cliEnv` (explicit `-e` flags).
- Fix: forward `AWF_ONE_SHOT_TOKEN_DEBUG` and `COPILOT_GITHUB_TOKEN` in docker-manager.ts, use heredoc syntax for Python scripts, and use `cliEnv` for non-sensitive vars.
Test plan
🤖 Generated with Claude Code
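For the API proxy test fix, one possible way to pull the JSON response out of stdout that also contains Docker build output is sketched below. This is an assumption about the approach, not the real helper in tests/fixtures/stdout-helpers.ts; the name `extractLastJsonSketch` is hypothetical, and the sketch only handles JSON that fits on a single line.

```typescript
// Hypothetical sketch, not the real extractLastJson() helper.
// Scans stdout from the last line upward and returns the first line
// that parses as a JSON object or array, or null if none is found.
function extractLastJsonSketch(stdout: string): unknown {
  const lines = stdout.split(/\r?\n/).reverse();
  for (const raw of lines) {
    const line = raw.trim();
    if (!line.startsWith("{") && !line.startsWith("[")) {
      continue; // cheap filter: skip build noise such as "Step 1/5 : FROM node"
    }
    try {
      return JSON.parse(line);
    } catch {
      // Looked like JSON but was not; keep scanning earlier lines.
    }
  }
  return null;
}

const mixedStdout = [
  "Step 1/5 : FROM node:20",
  " ---> 3f2a1b",
  "{not json}",
  '{"status":"ok","count":2}',
  "",
].join("\n");

console.log(extractLastJsonSketch(mixedStdout));
```

Scanning from the end reflects the assumption that the command's own output is printed after the build output, so the last parseable line is the response body.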