Enforce zip per-entry size limit on actual decompressed bytes #641
ErisDS wants to merge 1 commit
Conversation
The metadata pre-flight check from #632 only inspects an entry's declared `uncompressedSize`. An archive whose central directory lies about that size slips past the check, and (worse) hangs `extract()` indefinitely because yauzl's built-in `AssertByteCountStream` overrides its own `destroy` so the mid-stream error never surfaces to extract-zip's `pipeline()`.

In `onEntry`, switch off yauzl's broken counter (`validateEntrySizes = false`) and monkey-patch `zipfile.openReadStream` to pipe each read stream through our own counting Transform, whose errors propagate cleanly through extract-zip's `pipeline()`. No dependency changes; existing pre-flight checks, error shapes, and the `onEntry` hook contract are preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walkthrough
This PR adds streaming byte counting to enforce per-entry uncompressed size limits during ZIP decompression. Previously, size validation relied only on central-directory metadata.
Codecov Report

```
@@            Coverage Diff             @@
##             main     #641      +/-   ##
==========================================
- Coverage   98.12%   95.95%    -2.17%
==========================================
  Files          84        2       -82
  Lines        2771       99     -2672
  Branches      510       17      -493
==========================================
- Hits         2719       95     -2624
+ Misses         11        2        -9
+ Partials       41        2       -39
```
🧹 Nitpick comments (1)
packages/zip/lib/extract.js (1)
191-192: Consider skipping the streaming counter when the per-entry limit is Infinity.

The counter is installed even when `limits.perEntryUncompressedBytes === Infinity`, adding Transform overhead for every entry without any enforcement benefit. The comparison `observedBytes > Infinity` is always false, so it's correct but wasteful.

Optional: skip the counter when no limit is configured.
```diff
 function installStreamingCounter(zipfile, limits) {
-    if (zipfile.__streamingCounterInstalled || typeof zipfile.openReadStream !== 'function') {
+    if (
+        limits.perEntryUncompressedBytes === Infinity ||
+        zipfile.__streamingCounterInstalled ||
+        typeof zipfile.openReadStream !== 'function'
+    ) {
         return;
     }
```

Also applies to: 213-248
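The suggested guard can be sketched as a standalone predicate. This is an illustration only: `__streamingCounterInstalled` and the `limits` shape are taken from the diff above, while the predicate name is an assumption. Using `Number.isFinite` also catches a limit of `Infinity`.

```javascript
'use strict';

// Only install the counting Transform when there is a finite per-entry limit
// to enforce, the counter hasn't been installed yet, and the zipfile exposes
// an openReadStream function to patch.
function shouldInstallStreamingCounter(zipfile, limits) {
    return Number.isFinite(limits.perEntryUncompressedBytes) &&
        !zipfile.__streamingCounterInstalled &&
        typeof zipfile.openReadStream === 'function';
}
```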
📒 Files selected for processing (2)
- packages/zip/lib/extract.js
- packages/zip/test/zip.test.js
Summary
Follow-up to #632. The size limits added there are enforced on `entry.uncompressedSize` from the central directory, i.e. declared metadata. A zip whose header lies about its uncompressed size slips past the check, and (worse) hangs `extract()` indefinitely.

This PR enforces `limits.perEntryUncompressedBytes` against the actual decompressed bytes streamed for each entry. No dependency changes.

Why the hang
yauzl already includes an `AssertByteCountStream` that should error mid-stream when actual bytes exceed declared. But yauzl overrides its own `destroy` with a closure that silently discards the error, so the mid-stream error never reaches extract-zip's `pipeline()`, which then waits forever on a stream that has been destroyed without emitting `end` or `error`.

The fix
In the existing `onEntry` wrapper, two small additions:

- `zipfile.validateEntrySizes = false` skips yauzl's broken counter.
- Monkey-patch `zipfile.openReadStream` so each returned read stream is piped through a small counting `Transform` of ours. Our Transform calls `cb(err)` on overflow, which Node-stream's standard machinery propagates through extract-zip's `pipeline()` as a clean rejection.

That's the whole fix: about 40 lines of new code in `extract.js`, plus refactoring `throwOnEntryTooLarge` into an `entryTooLargeError` factory so the Transform can construct the same error shape.

What's deliberately out of scope
`totalUncompressedBytes` still uses the existing metadata pre-flight (same as today). A future change could extend streaming to the total; left out here to keep the diff focused.

Diff
- `lib/extract.js`: 64 insertions / 6 deletions
- `test/zip.test.js`: 35 insertions

Test plan
- `pnpm --filter @tryghost/zip test`: 16/16 pass, coverage above thresholds, lint clean
- New test: a zip whose central directory under-declares `uncompressedSize` as 5 bytes, extracted with `perEntryUncompressedBytes: 100`, asserts `ENTRY_TOO_LARGE` with `observedBytes > 100` (currently hangs on main)

🤖 Generated with Claude Code