perf: reduce allocations and copies in load-modify-save path #34

Merged
Mythie merged 1 commit into main from perf/load-and-save
Feb 23, 2026

Conversation

Mythie commented Feb 23, 2026

Improves performance of the load → modify → save cycle.

Why: We were only ~1.5x faster than pdf-lib for load/modify/save, which is underwhelming given our architectural advantages.

How: Profiling showed the bottleneck was allocation churn and unnecessary copying. This PR:

  • Pre-sizes ByteWriter buffers using size hints (original PDF length for full saves, estimated output sizes for filters) to avoid repeated geometric reallocation
  • Uses subarray instead of slice for stream data in the parser — zero-copy views into the original PDF bytes, which stay alive for the document lifetime anyway
  • Returns the internal buffer directly from ByteWriter.toBytes() when it's already the right size (zero-copy fast path), falls back to subarray instead of slice for the trimmed case
  • Hoists the trailing-zero regex in formatPdfNumber out of the function body so it isn't recompiled on every call
  • Routes page tree loading through registry.resolve so objects are tracked for modification detection
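
The two ByteWriter changes above can be illustrated with a minimal sketch. This is not the library's actual implementation: the class name `ByteWriterSketch`, the `sizeHint` constructor parameter, and the doubling growth policy are assumptions made for the example. Only the ideas (pre-sizing from a hint, returning the internal buffer when it is exactly full, and using `subarray` otherwise) come from the description.

```typescript
// Hypothetical sketch of a growable byte buffer with a size hint.
class ByteWriterSketch {
  private buf: Uint8Array;
  private len = 0;

  // sizeHint is an assumed parameter: e.g. the original PDF length for a full save.
  constructor(sizeHint = 1024) {
    // Capacity must be at least 1 so that doubling can make progress.
    this.buf = new Uint8Array(Math.max(sizeHint, 1));
  }

  // Grow geometrically only when the hint turned out to be too small.
  private ensure(extra: number): void {
    const needed = this.len + extra;
    if (needed <= this.buf.length) return;
    let capacity = this.buf.length;
    while (capacity < needed) capacity *= 2;
    const next = new Uint8Array(capacity);
    next.set(this.buf.subarray(0, this.len));
    this.buf = next;
  }

  write(bytes: Uint8Array): void {
    this.ensure(bytes.length);
    this.buf.set(bytes, this.len);
    this.len += bytes.length;
  }

  toBytes(): Uint8Array {
    // Fast path: the buffer is exactly full, so return it without copying.
    if (this.len === this.buf.length) return this.buf;
    // Otherwise return a view of the written prefix; subarray does not copy.
    return this.buf.subarray(0, this.len);
  }
}
```

With an accurate hint, `toBytes()` hands back the internal buffer and no reallocation happens during writes. The trade-off in a design like this is that the caller and the writer share memory afterwards, so the writer should not be written to again once its bytes have been handed out.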

We were only ~1.5x faster than pdf-lib for load → modify → save,
which is underwhelming given our architectural advantages. Profiling
with bun --cpu-prof showed the bottleneck was allocation churn and
unnecessary copying, not parsing or serialization logic.

Key changes:
- Pre-size ByteWriter buffers using size hints (original PDF length
  for full saves, estimated output sizes for filters/serializers)
  to avoid repeated geometric reallocation
- Use subarray instead of slice for stream data in the parser —
  these are zero-copy views into the original PDF bytes which stay
  alive for the document lifetime anyway
- Return the internal buffer directly from ByteWriter.toBytes()
  when it's already the right size (zero-copy fast path), fall back
  to subarray instead of slice for the trimmed case
- Hoist the trailing-zero regex in formatPdfNumber out of the
  function body so it isn't recompiled on every call
- Route page tree loading through registry.resolve so objects are
  tracked for modification detection (was using parsed.getObject
  which bypassed the registry)
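
To make the parser and number-formatting changes concrete, here is a small sketch of both ideas. All names (`ascii`, `streamDataCopy`, `streamDataView`, `TRAILING_ZEROS`, `formatPdfNumberSketch`) and the choice of five decimal places are assumptions for illustration, not the library's code. It shows that `subarray` returns a view over the same `ArrayBuffer` while `slice` allocates a copy, and that a module-level regex is created once rather than on each call.

```typescript
// Hypothetical helper: encode an ASCII string as bytes for the demo.
function ascii(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) out[i] = text.charCodeAt(i);
  return out;
}

// Copying variant: allocates a new buffer and copies (end - start) bytes.
function streamDataCopy(pdf: Uint8Array, start: number, end: number): Uint8Array {
  return pdf.slice(start, end);
}

// Zero-copy variant: a view over the same underlying buffer as `pdf`.
// The view is only valid while `pdf` is kept alive and left unmodified.
function streamDataView(pdf: Uint8Array, start: number, end: number): Uint8Array {
  return pdf.subarray(start, end);
}

// Hoisted to module scope so a single RegExp object is reused across calls.
const TRAILING_ZEROS = /\.?0+$/;

// Assumed behaviour: fixed-point output with trailing zeros removed.
// Assumes a finite input of modest magnitude.
function formatPdfNumberSketch(value: number): string {
  if (Number.isInteger(value)) return String(value);
  const text = value.toFixed(5).replace(TRAILING_ZEROS, "");
  // Tiny negative values round to "-0"; normalise that to "0".
  return text === "-0" ? "0" : text;
}
```

Because a `subarray` view aliases the source bytes, later writes to the source are visible through the view. That is acceptable here only because, as the commit message notes, the original PDF bytes stay alive for the document lifetime.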

vercel bot commented Feb 23, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| core | Ready | Preview, Comment | Feb 23, 2026 3:13am |

@github-actions

Benchmark Results

Comparison

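Load PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 2.43ms | 3.01ms | ±0.9% | 207 |
| pdf-lib | 37.72ms | 44.02ms | ±3.7% | 14 |

Create blank PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 56μs | 121μs | ±1.2% | 8916 |
| pdf-lib | 411μs | 1.45ms | ±2.4% | 1216 |

Add 10 pages

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 98μs | 163μs | ±0.9% | 5110 |
| pdf-lib | 500μs | 1.65ms | ±2.4% | 1001 |

Draw 50 rectangles

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 307μs | 702μs | ±1.2% | 1629 |
| pdf-lib | 1.67ms | 5.55ms | ±5.8% | 300 |

Load and save PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 2.76ms | 5.98ms | ±3.7% | 181 |
| pdf-lib | 87.45ms | 99.56ms | ±4.9% | 10 |

Load, modify, and save PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 41.94ms | 46.44ms | ±3.8% | 13 |
| pdf-lib | 90.68ms | 105.56ms | ±6.8% | 10 |

Extract single page from 100-page PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 3.75ms | 6.74ms | ±2.2% | 134 |
| pdf-lib | 9.06ms | 11.90ms | ±1.9% | 56 |

Split 100-page PDF into single-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 32.59ms | 36.00ms | ±2.3% | 16 |
| pdf-lib | 87.42ms | 90.00ms | ±1.9% | 6 |

Split 2000-page PDF into single-page PDFs (0.9MB)

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 600.85ms | 600.85ms | ±0.0% | 1 |
| pdf-lib | 1.63s | 1.63s | ±0.0% | 1 |

Copy 10 pages between documents

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 4.60ms | 5.94ms | ±2.2% | 109 |
| pdf-lib | 11.87ms | 14.10ms | ±1.6% | 43 |

Merge 2 x 100-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 14.12ms | 20.53ms | ±3.5% | 36 |
| pdf-lib | 52.39ms | 52.88ms | ±0.3% | 10 |

Copying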

Copy pages between documents

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| copy 1 page | 966μs | 1.88ms | ±2.1% | 518 |
| copy 10 pages from 100-page PDF | 4.56ms | 8.38ms | ±2.4% | 110 |
| copy all 100 pages | 7.26ms | 8.13ms | ±0.9% | 69 |

Duplicate pages within same document

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| duplicate page 0 | 876μs | 1.44ms | ±0.9% | 571 |
| duplicate all pages (double the document) | 867μs | 1.21ms | ±0.7% | 577 |

Merge PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| merge 2 small PDFs | 1.42ms | 1.96ms | ±1.0% | 351 |
| merge 10 small PDFs | 7.40ms | 9.31ms | ±1.0% | 68 |
| merge 2 x 100-page PDFs | 13.76ms | 19.96ms | ±3.1% | 37 |

Drawing

benchmarks/drawing.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| draw 100 rectangles | 560μs | 1.15ms | ±3.4% | 893 |
| draw 100 circles | 1.27ms | 2.68ms | ±2.8% | 394 |
| draw 100 lines | 491μs | 1.10ms | ±1.3% | 1019 |
| draw 100 text lines (standard font) | 1.54ms | 2.19ms | ±1.2% | 324 |
| create 10 pages with mixed content | 1.31ms | 2.17ms | ±1.5% | 383 |

Forms

benchmarks/forms.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| get form fields | 3.53ms | 8.20ms | ±5.0% | 142 |
| fill text fields | 10.84ms | 14.99ms | ±3.0% | 47 |
| read field values | 2.96ms | 4.39ms | ±1.3% | 170 |
| flatten form | 8.43ms | 11.81ms | ±2.9% | 60 |

Loading

benchmarks/loading.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| load small PDF (888B) | 69μs | 162μs | ±3.8% | 7266 |
| load medium PDF (19KB) | 111μs | 206μs | ±4.0% | 4524 |
| load form PDF (116KB) | 1.37ms | 2.60ms | ±2.1% | 364 |
| load heavy PDF (9.9MB) | 2.59ms | 4.00ms | ±1.7% | 193 |

Saving

benchmarks/saving.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| save unmodified (19KB) | 106μs | 263μs | ±0.9% | 4734 |
| save with modifications (19KB) | 755μs | 1.42ms | ±1.4% | 662 |
| incremental save (19KB) | 161μs | 343μs | ±0.8% | 3113 |
| save heavy PDF (9.9MB) | 2.29ms | 2.82ms | ±1.1% | 219 |
| incremental save heavy PDF (9.9MB) | 7.76ms | 10.94ms | ±3.4% | 65 |

Splitting

Extract single page

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| extractPages (1 page from small PDF) | 966μs | 2.04ms | ±2.2% | 518 |
| extractPages (1 page from 100-page PDF) | 3.68ms | 5.27ms | ±1.2% | 136 |
| extractPages (1 page from 2000-page PDF) | 59.76ms | 62.93ms | ±1.5% | 10 |

Split into single-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| split 100-page PDF (0.1MB) | 31.67ms | 37.64ms | ±4.8% | 16 |
| split 2000-page PDF (0.9MB) | 572.69ms | 572.69ms | ±0.0% | 1 |

Batch page extraction

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| extract first 10 pages from 2000-page PDF | 60.84ms | 63.03ms | ±1.3% | 9 |
| extract first 100 pages from 2000-page PDF | 63.63ms | 64.97ms | ±1.3% | 8 |
| extract every 10th page from 2000-page PDF (200 pages) | 70.08ms | 88.99ms | ±9.1% | 8 |

Environment
  • Runner: Linux (X64)
  • Runtime: Bun 1.3.9

Results are machine-dependent.
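
For reading the comparison tables, a speedup factor can be derived from two reported means. The helper below is a hypothetical convenience for that arithmetic, not part of the benchmark suite. The names `toMs` and `speedup` are assumptions, and it only understands the three units that appear in this report (μs, ms, s).

```typescript
// Convert a reported duration such as "56μs", "41.94ms" or "1.63s" to milliseconds.
// Both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted.
function toMs(duration: string): number {
  const match = /^([\d.]+)(\u03bcs|\u00b5s|ms|s)$/.exec(duration.trim());
  if (!match) throw new Error(`unrecognized duration: ${duration}`);
  const value = Number(match[1]);
  const unit = match[2];
  if (unit === "s") return value * 1000;
  if (unit === "ms") return value;
  return value / 1000; // microseconds
}

// How many times faster the candidate is than the baseline, by mean time.
function speedup(baselineMean: string, candidateMean: string): number {
  return toMs(baselineMean) / toMs(candidateMean);
}
```

For example, `speedup("90.68ms", "41.94ms")` applied to the load, modify, and save means gives roughly 2.2. Ratios built from rows with a single sample (such as the 2000-page split) carry no error estimate and should be read with extra caution.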

Mythie merged commit bc102b6 into main on Feb 23, 2026
6 checks passed