Enforce zip per-entry size limit on actual decompressed bytes #641

Open

ErisDS wants to merge 1 commit into main from fix/zip-streaming-byte-count

Enforce zip per-entry size limit on actual decompressed bytes#641
ErisDS wants to merge 1 commit into
mainfrom
fix/zip-streaming-byte-count

Conversation

@ErisDS (Member) commented May 14, 2026

Summary

Follow-up to #632. The size limits added there are enforced on entry.uncompressedSize from the central directory — declared metadata. A zip whose header lies about its uncompressed size slips past the check, and (worse) hangs extract() indefinitely.

This PR enforces limits.perEntryUncompressedBytes against the actual decompressed bytes streamed for each entry. No dependency changes.

Why the hang

yauzl already includes an AssertByteCountStream that should error mid-stream when actual bytes exceed declared. But yauzl overrides its own destroy with a closure that silently discards the error, so the mid-stream error never reaches extract-zip's pipeline() — which then waits forever on a stream that has been destroyed without emitting end or error.

The fix

In the existing onEntry wrapper, two small additions:

  1. zipfile.validateEntrySizes = false — skips yauzl's broken counter.
  2. Wrap zipfile.openReadStream so each returned read stream is piped through a small counting Transform of ours. On overflow the Transform passes an error to its callback, which Node's standard stream machinery propagates through extract-zip's pipeline() as a clean rejection.

That's the whole fix — about 40 lines of new code in extract.js, plus refactoring throwOnEntryTooLarge into an entryTooLargeError factory so the Transform can construct the same error shape.

What's deliberately out of scope

totalUncompressedBytes still uses the existing metadata pre-flight (same as today). A future change could extend streaming to the total; left out here to keep the diff focused.

Diff

  • lib/extract.js: 64 insertions / 6 deletions
  • test/zip.test.js: 35 insertions

Test plan

  • pnpm --filter @tryghost/zip test — 16/16 pass, coverage above thresholds, lint clean
  • New regression test: builds a real zip with 1 MB payload, forges its central directory uncompressedSize to 5 bytes, extracts with perEntryUncompressedBytes: 100 → asserts ENTRY_TOO_LARGE with observedBytes > 100 (currently hangs on main)
  • Sanity-check on a real Ghost theme zip
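
The header-forging step of the regression test can be sketched as follows (a hypothetical helper; the real test's implementation may differ). In a zip central directory file header, which starts with the signature `PK\x01\x02`, the 4-byte little-endian `uncompressedSize` field sits at offset 24 from the signature:

```javascript
// Hypothetical sketch of the test's forging step, not the PR's code.
// Finds the first central directory file header and overwrites its
// uncompressedSize field (4 bytes, little-endian, offset 24 from signature).
function forgeUncompressedSize(zipBuffer, fakeSize) {
    const signature = Buffer.from([0x50, 0x4b, 0x01, 0x02]); // "PK\x01\x02"
    const offset = zipBuffer.indexOf(signature);
    if (offset === -1) {
        throw new Error('no central directory file header found');
    }
    const forged = Buffer.from(zipBuffer); // copy; leave the input intact
    forged.writeUInt32LE(fakeSize, offset + 24);
    return forged;
}
```

Feeding the forged archive to extract() with a small perEntryUncompressedBytes then exercises exactly the declared-vs-actual mismatch this PR guards against.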

🤖 Generated with Claude Code

The metadata pre-flight check from #632 only inspects an entry's
declared `uncompressedSize`. An archive whose central directory lies
about that size slips past the check, and (worse) hangs `extract()`
indefinitely because yauzl's built-in `AssertByteCountStream`
overrides its own `destroy` so the mid-stream error never surfaces
to extract-zip's `pipeline()`.

In `onEntry`, switch off yauzl's broken counter (`validateEntrySizes
= false`) and monkey-patch `zipfile.openReadStream` to pipe each read
stream through our own counting Transform — whose errors propagate
cleanly through extract-zip's pipeline.

No dependency changes; existing pre-flight checks, error shapes, and
the `onEntry` hook contract are preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai (Bot) commented May 14, 2026


Walkthrough

This PR adds streaming byte counting to enforce per-entry uncompressed size limits during ZIP decompression. Previously, size validation relied only on central-directory metadata. The changes introduce a new entryTooLargeError helper, monkey-patch yauzl's openReadStream within an installStreamingCounter function to wrap reads with a Transform stream that counts decompressed bytes, and integrate this into the extraction onEntry handler. Test coverage includes a helper that forges central-directory uncompressedSize fields and a test case verifying extract() rejects when streamed payload exceeds limits despite understated metadata.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • TryGhost/framework#632: Both PRs modify packages/zip/lib/extract.js to enforce perEntryUncompressedBytes during ZIP extraction, with this PR implementing streaming byte counting via openReadStream monkey-patch and the retrieved PR implementing onEntry limit checks.
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 warning

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 28.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title directly and accurately reflects the main change: enforcing per-entry size limits on actual decompressed bytes during streaming, not just declared metadata.
  • Description check: ✅ Passed. The description is comprehensive and directly related to the changeset, explaining the root cause, the fix, and the testing approach with clear technical detail.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.



Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint


ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



@codecov-commenter

Codecov Report

❌ Patch coverage is 85.71429% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.95%. Comparing base (75be1c1) to head (2fe4bb3).
⚠️ Report is 1 commit behind head on main.

Files with missing lines:
  • packages/zip/lib/extract.js: patch 85.71% (2 missing, 2 partials ⚠️)
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #641      +/-   ##
==========================================
- Coverage   98.12%   95.95%   -2.17%     
==========================================
  Files          84        2      -82     
  Lines        2771       99    -2672     
  Branches      510       17     -493     
==========================================
- Hits         2719       95    -2624     
+ Misses         11        2       -9     
+ Partials       41        2      -39     

☔ View full report in Codecov by Sentry.

@coderabbitai (Bot) left a comment


🧹 Nitpick comments (1)
packages/zip/lib/extract.js (1)

191-192: 💤 Low value

Consider skipping streaming counter when per-entry limit is Infinity.

The counter is installed even when limits.perEntryUncompressedBytes === Infinity, adding Transform overhead for every entry without any enforcement benefit. The comparison observedBytes > Infinity is always false, so it's correct but wasteful.

♻️ Optional: skip counter when no limit is configured
 function installStreamingCounter(zipfile, limits) {
-    if (zipfile.__streamingCounterInstalled || typeof zipfile.openReadStream !== 'function') {
+    if (
+        limits.perEntryUncompressedBytes === Infinity ||
+        zipfile.__streamingCounterInstalled ||
+        typeof zipfile.openReadStream !== 'function'
+    ) {
         return;
     }

Also applies to: 213-248

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/zip/lib/extract.js` around lines 191 - 192, Skip installing the
streaming counter when no per-entry limit is configured by checking
limits.perEntryUncompressedBytes === Infinity before calling
installStreamingCounter (and similarly in the other install sites around the
per-entry check). Modify the code paths that call installStreamingCounter
(reference: installStreamingCounter, limits.perEntryUncompressedBytes, and the
zipfile handling code) so the Transform is only created when
limits.perEntryUncompressedBytes is a finite number; leave behavior unchanged
otherwise.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ccf229d-7080-45bb-a8d8-67f09f4bceca

📥 Commits

Reviewing files that changed from the base of the PR and between f5a44ed and 2fe4bb3.

📒 Files selected for processing (2)
  • packages/zip/lib/extract.js
  • packages/zip/test/zip.test.js

