Skip to content

Use yauzl directly instead of extract-zip#639

Open
ErisDS wants to merge 1 commit into
mainfrom
fix/zip-use-yauzl
Open

Use yauzl directly instead of extract-zip#639
ErisDS wants to merge 1 commit into
mainfrom
fix/zip-use-yauzl

Conversation

@ErisDS
Copy link
Copy Markdown
Member

@ErisDS ErisDS commented May 14, 2026

Summary

Swaps the unmaintained extract-zip (no release since 2020) for yauzl directly. Closes the lying-header hang in #632's size-limit checks: a zip whose central directory understates an entry's size currently bypasses the metadata pre-flight check and then hangs extract() indefinitely.

Why the swap matters

yauzl's built-in AssertByteCountStream is supposed to error when actual decompressed bytes exceed declared uncompressedSize. But it overrides its own destroy with a closure that silently discards the error, so extract-zip's pipeline() waits on an error that never arrives. The fix is to open yauzl with validateEntrySizes: false and run our own counting Transform whose errors propagate cleanly.

Scope

Minimum-viable swap:

  • lib/extract.js — existing helpers (normalizeLimit, getLimits, getEntryUncompressedSize, throwOn*, etc.) and the onEntry wrapper are untouched. The only change is removing the require('extract-zip') line and replacing the single await extract(zipToExtract, opts) call with a local extractZip() function that uses yauzl. Path-traversal guard, __MACOSX/ skip, symlink/dir/mode handling are ported verbatim.
  • test/zip.test.js — three existing tests that mocked extract-zip via Module._load are updated to mock yauzl via a small helper. One new regression test for the lying-header case.
  • package.jsonextract-zipyauzl.

Total diff: ~130 net new lines in extract.js (the local yauzl-based extractor) plus the test refactor and the one new regression test.

What's deliberately out of scope

The streaming counter currently only enforces perEntryUncompressedBytes against actual decompressed bytes. totalUncompressedBytes still uses the existing metadata pre-flight (same as today). A future change could extend streaming to the total — left out here to keep the diff focused.

Test plan

  • pnpm --filter @tryghost/zip test — 16/16 pass, coverage above thresholds, lint clean
  • Sanity-check on a real Ghost theme zip

🤖 Generated with Claude Code

Swaps the unmaintained `extract-zip` (no release since 2020) for `yauzl`
directly. extract-zip is a thin wrapper around yauzl, so the swap is
small.

The motivation is that yauzl's built-in `AssertByteCountStream`
overrides its own `destroy` in a way that silently swallows errors when
an entry's actual decompressed bytes exceed its declared uncompressed
size. extract-zip pipes that stream into a `pipeline()` and waits on
the error that never arrives, hanging the extraction promise forever
on any zip whose central directory understates the payload — including
the metadata-pre-flight checks from #632. By opening yauzl with
`validateEntrySizes: false` and running our own counting Transform,
the per-entry size limit is enforced against actual decompressed bytes
and errors propagate cleanly through `pipeline()`.

All existing pre-flight checks, error shapes, helpers, and the
`onEntry` hook contract are preserved. Path traversal, `__MACOSX/`
skip, symlink/dir/mode detection are ported verbatim from extract-zip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 702cb5ff-840f-44c0-94dc-6632992c2002

📥 Commits

Reviewing files that changed from the base of the PR and between 75be1c1 and 1a4e70f.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (3)
  • packages/zip/lib/extract.js
  • packages/zip/package.json
  • packages/zip/test/zip.test.js

Walkthrough

This PR replaces the extract-zip dependency with an internal yauzl-based streaming extractor. The new implementation introduces extractZip() to orchestrate ZIP file lifecycle and extractEntry() to handle individual entries with built-in path traversal guards, mode/directory detection, and per-entry decompressed byte counting via Transform streams. Tests are updated with a binary header manipulation helper to forge understated ZIP sizes and refactored with yauzl-based mocks to validate size limit enforcement on actual streamed bytes rather than declared metadata.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • TryGhost/framework#632: Enforces uncompressed-size limits based on actual streamed decompressed bytes with corresponding test updates for oversized ZIP entries.
  • TryGhost/framework#636: Updates ZIP safety validation paths for symlink rejection and filename length checks with richer UnsupportedMediaTypeError detail assertions.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: replacing extract-zip with yauzl as the primary ZIP extraction mechanism.
Description check ✅ Passed The description is directly related to the changeset, explaining the rationale for the swap, implementation details, scope, and test plan.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/zip-use-yauzl

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 86.88525% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.87%. Comparing base (75be1c1) to head (1a4e70f).

Files with missing lines Patch % Lines
packages/zip/lib/extract.js 86.88% 1 Missing and 7 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #639      +/-   ##
==========================================
- Coverage   98.12%   97.87%   -0.25%     
==========================================
  Files          84       84              
  Lines        2771     2830      +59     
  Branches      510      524      +14     
==========================================
+ Hits         2719     2770      +51     
- Misses         11       12       +1     
- Partials       41       48       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants