Add fastcdc chunker (keyed Gear hash)#9824

Open
ThomasWaldmann wants to merge 2 commits into borgbackup:master from ThomasWaldmann:fastcdc-chunker
ThomasWaldmann (Member) commented Jun 27, 2026

Based on #9823.

New fastcdc chunker (keyed Gear hash)

A FastCDC content-defined chunker using the window-less Gear rolling hash
(fp = (fp << 1) + Gear[byte]), which is cheaper per byte than buzhash's
cyclic-polynomial update. In the benchmarks below it chunks about 30-38% faster
than buzhash64, while chunk-size distribution and deduplication match within noise.
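
As a minimal sketch of the Gear update quoted above (not borg's actual C/Cython code), the following Python models a 64-bit fingerprint. The table here is filled from a fixed-seed PRNG purely for illustration; the names `gear_update`, `gear_hash` and `GEAR` are assumptions. It also shows why no window field is needed: each step shifts left by one bit, so a byte's contribution leaves the 64-bit fingerprint after 64 further bytes.

```python
import random

MASK64 = (1 << 64) - 1  # model a uint64 fingerprint


def gear_update(fp: int, byte: int, table: list[int]) -> int:
    """One Gear step: fp = (fp << 1) + Gear[byte], truncated to 64 bits."""
    return ((fp << 1) + table[byte]) & MASK64


def gear_hash(data: bytes, table: list[int]) -> int:
    """Feed all bytes of `data` through the Gear update, starting from 0."""
    fp = 0
    for b in data:
        fp = gear_update(fp, b, table)
    return fp


# Illustrative table only (fixed seed); the PR derives a keyed table instead.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]

# The fingerprint depends only on the last 64 bytes: two inputs with
# different prefixes but the same 64-byte tail hash identically.
tail = bytes(range(64))
h1 = gear_hash(b"prefix A" + tail, GEAR)
h2 = gear_hash(b"another, longer prefix" + tail, GEAR)
assert h1 == h2
```

The implicit 64-byte "window" falls out of the shift, so there is nothing to subtract when a byte leaves, which is where the per-byte saving over a cyclic-polynomial update comes from.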

The Gear table is keyed: derived from the repo id key via CSPRNG (own
fastcdc domain), exactly like the buzhash64 table, so chunk cut points stay
unpredictable without the key (anti-fingerprinting). It implements the same
FastCDC techniques as buzhash64 (sub-minimum skipping, normalized chunking with a
required nc_level, min/max clamping); the mask uses the high bits of the hash.
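
The pieces named above can be sketched together as follows. This is an illustration under stated assumptions, not the PR's implementation: the table derivation uses SHAKE-256 as a stand-in for the CSPRNG, the strict/loose masks follow the FastCDC paper's convention of `mask_bits + nc_level` and `mask_bits - nc_level` bits, and all function names and the demo parameters are hypothetical.

```python
import hashlib

MASK64 = (1 << 64) - 1


def keyed_gear_table(key: bytes, domain: bytes = b"fastcdc") -> list[int]:
    """Hypothetical derivation: 256 64-bit entries from SHAKE-256(domain, key)."""
    stream = hashlib.shake_256(domain + b"\0" + key).digest(256 * 8)
    return [int.from_bytes(stream[i:i + 8], "little") for i in range(0, len(stream), 8)]


def high_mask(bits: int) -> int:
    """Mask selecting the top `bits` bits of a 64-bit hash."""
    return (((1 << bits) - 1) << (64 - bits)) & MASK64


def cut_points(data, table, min_size, max_size, normal_size, mask_bits, nc_level):
    """Return the end offsets of all chunks in `data`."""
    strict = high_mask(mask_bits + nc_level)  # harder to match: used before normal_size
    loose = high_mask(mask_bits - nc_level)   # easier to match: used after normal_size
    cuts, start, n = [], 0, len(data)
    while start < n:
        end = min(start + max_size, n)        # max clamping
        pos = min(start + min_size, end)      # sub-minimum skipping: no hashing here
        fp, cut = 0, end
        while pos < end:
            fp = ((fp << 1) + table[data[pos]]) & MASK64
            pos += 1
            mask = strict if pos - start < normal_size else loose
            if fp & mask == 0:
                cut = pos
                break
        cuts.append(cut)
        start = cut
    return cuts


# Small demo parameters so the example runs quickly (not borg's defaults).
table = keyed_gear_table(b"example repo id key")  # placeholder key
data = hashlib.shake_256(b"demo data").digest(20000)
cuts = cut_points(data, table, min_size=64, max_size=1024,
                  normal_size=256, mask_bits=8, nc_level=2)
```

Without the key, an observer cannot reproduce the table and therefore cannot predict where cuts fall, which is the anti-fingerprinting property described above.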

chunker-params: fastcdc,chunk_min,chunk_max,chunk_mask,nc_level — no window
field, because Gear is window-less. E.g. fastcdc,19,23,21,2.
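
For illustration, a parameter string of this shape could be parsed as below. The parser and its names (`FastCDCParams`, `parse_fastcdc_params`) are hypothetical and are not the code in `src/borg/helpers/parseformat.py`. Reading the first three numbers as powers of two is inferred from the benchmark text, which pairs mask 21 with a 2 MiB target and max 23 with 8 MiB chunks.

```python
from typing import NamedTuple


class FastCDCParams(NamedTuple):
    chunk_min_exp: int    # minimum chunk size is 2**chunk_min_exp bytes
    chunk_max_exp: int    # maximum chunk size is 2**chunk_max_exp bytes
    chunk_mask_bits: int  # target chunk size is about 2**chunk_mask_bits bytes
    nc_level: int         # normalized chunking level (0 disables it)


def parse_fastcdc_params(spec: str) -> FastCDCParams:
    """Parse 'fastcdc,chunk_min,chunk_max,chunk_mask,nc_level' (hypothetical helper)."""
    algo, *fields = spec.split(",")
    if algo != "fastcdc":
        raise ValueError(f"not a fastcdc spec: {spec!r}")
    if len(fields) != 4:
        raise ValueError("fastcdc takes exactly 4 numeric fields (there is no window field)")
    return FastCDCParams(*(int(f) for f in fields))


p = parse_fastcdc_params("fastcdc,19,23,21,2")
print(1 << p.chunk_min_exp)    # 524288  (512 KiB)
print(1 << p.chunk_max_exp)    # 8388608 (8 MiB)
print(1 << p.chunk_mask_bits)  # 2097152 (2 MiB)
```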

borg benchmark cpu now measures the fastcdc chunker; tests live in
borg.testsuite.chunkers (golden vector, size distribution, keyed gear table,
param parsing, slow fuzz); docs and changelog updated.

Benchmarks

scripts/chunker_bench.py, buzhash64 vs fastcdc, both nc_level=2, incompressible
data unless noted:

| corpus / target | metric | buzhash64 | fastcdc |
|---|---|---|---|
| 5 GiB, 2 MiB target | CV | 0.294 | 0.295 |
| | throughput | 1011 MB/s | 1313 MB/s (+30%) |
| 64 MiB, 64 KiB target | CV | 0.374 | 0.359 |
| | shift-resilience | 0.9928 | 0.9929 |
| | throughput | 963 MB/s | 1331 MB/s (+38%) |
| 2.5 GiB re-backup, 64 edits | dedup (lower=better) | 0.5237 | 0.5236 |
| 2.5 GiB re-backup, 320 edits | dedup | 0.6133 | 0.6161 |

borg benchmark cpu, 1 GB: fastcdc 3.80s, buzhash 4.36s, buzhash64 8.13s, fixed 0.56s.

Chunk-size distribution, deduplication and shift-resilience match buzhash64 within
noise; fastcdc is consistently faster.
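
The two derived quantities in these results can be recomputed with a few lines of Python. This is a sketch: the helper names are hypothetical, and whether scripts/chunker_bench.py uses the population or the sample standard deviation for CV is an assumption (population is used here).

```python
from statistics import mean, pstdev


def coefficient_of_variation(sizes):
    """CV of chunk sizes: standard deviation divided by mean (population stdev assumed)."""
    return pstdev(sizes) / mean(sizes)


def speedup_percent(baseline_mb_s, candidate_mb_s):
    """Relative throughput gain of the candidate over the baseline, in percent."""
    return (candidate_mb_s / baseline_mb_s - 1.0) * 100.0


# Throughput figures quoted above (buzhash64 vs fastcdc).
print(round(speedup_percent(1011, 1313)))  # 30
print(round(speedup_percent(963, 1331)))   # 38

# Toy chunk sizes, only to show the CV calculation itself.
print(coefficient_of_variation([1, 3]))    # 0.5
```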

🤖 Generated with Claude Code

ThomasWaldmann changed the title from "Add fastcdc chunker (keyed Gear hash); buzhash64 normalized chunking" to "Add fastcdc chunker (keyed Gear hash)" Jun 27, 2026
ThomasWaldmann force-pushed the fastcdc-chunker branch 2 times, most recently from c16e0fe to f41a414, June 27, 2026 22:57

codecov Bot commented Jun 27, 2026

Codecov Report

❌ Patch coverage is 81.25000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.17%. Comparing base (106dfba) to head (afa8189).
⚠️ Report is 10 commits behind head on master.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/borg/helpers/parseformat.py | 66.66% | 3 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9824      +/-   ##
==========================================
+ Coverage   84.94%   85.17%   +0.22%     
==========================================
  Files          92       92              
  Lines       15291    15325      +34     
  Branches     2296     2307      +11     
==========================================
+ Hits        12989    13053      +64     
+ Misses       1611     1581      -30     
  Partials      691      691              


ThomasWaldmann and others added 2 commits June 28, 2026 12:00
Normalized chunking switches between a stricter and a looser cut mask
around the target chunk size. This greatly tightens the chunk-size
distribution (coefficient of variation ~0.9 -> ~0.3 in tests) and removes
the dedup-hostile max-size-clamped chunks, with unchanged deduplication.

chunker-params for buzhash64 gains a required 6th field, nc_level:

  buzhash64,chunk_min,chunk_max,chunk_mask,window_size,nc_level

Use nc_level=2 for the new default, nc_level=0 to disable (then behavior
is byte-identical to the previous single-mask chunker).

buzhash (32bit) is untouched and stays bit-compatible with borg 1.x.

The mask transition point (normal_size) defaults to a principled formula
(target minus the expected loose-phase tail) so the mean stays near the
target; it can be tuned via the normal_size constructor arg.

scripts/chunker_bench.py: evidence harness used to measure chunk-size
distribution, dedup ratio, throughput and shift-resilience.

Measurements (before = nc_level 0, after = nc_level 2; both at the default
params buzhash64,19,23,21,4095; measured with scripts/chunker_bench.py):

5 GiB of incompressible data (~2000-2700 chunks, statistically stable):

  before:  CV 0.739,  49 max-size-clamped (8 MiB) chunks,   953 MB/s
  after:   CV 0.311,   0 max-size-clamped chunks,          1024 MB/s

Re-backup of a 2.5 GiB file after a few scattered single-byte edits
(deduplication ratio; 0.5 = v2 fully deduplicated against v1, lower is
better):

   64 edits:  before 0.5424  ->  after 0.5235
  320 edits:  before 0.6791  ->  after 0.6142

Normalized chunking deduplicates better after edits: removing the
max-size-clamped chunks means a single-byte change invalidates much less
data (about 36% less dedup overhead at 320 edits). Throughput was also
consistently higher with nc_level=2 at this scale.
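
The "about 36%" figure follows from treating everything above the ideal ratio of 0.5 as dedup overhead. A short check using the numbers quoted above (the helper name is hypothetical):

```python
IDEAL = 0.5  # v2 fully deduplicated against v1


def overhead_reduction(before: float, after: float, ideal: float = IDEAL) -> float:
    """Fraction by which the dedup overhead (ratio minus ideal) shrinks."""
    return 1.0 - (after - ideal) / (before - ideal)


# 320 edits: overhead goes from 0.1791 to 0.1142.
print(round(overhead_reduction(0.6791, 0.6142), 2))  # 0.36
```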

Also: fix a bug in the mask computation. It must use 1ULL instead of 1,
so that the shift is done in a uint64, not in a 32-bit int.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a new "fastcdc" content-defined chunker selectable via --chunker-params.
It uses the FastCDC Gear rolling hash (fp = (fp << 1) + Gear[byte]), which is
window-less and cheaper per byte than buzhash's cyclic-polynomial update, so it
chunks noticeably faster (see "borg benchmark cpu" output), while producing
the same chunk-size distribution and deduplication.

The Gear table is keyed: it is derived from the repo id key via CSPRNG (own
"fastcdc" domain), exactly like the buzhash64 table, so chunk cut points stay
unpredictable without the key (anti-fingerprinting). It implements the same
FastCDC techniques as buzhash64 (sub-minimum skipping, normalized chunking with
a required nc_level, min/max clamping); the mask uses the high bits of the hash
(Gear accumulates entropy there).

chunker-params: "fastcdc,chunk_min,chunk_max,chunk_mask,nc_level" - there is no
window field, because Gear is window-less. e.g. fastcdc,19,23,21,2

Also: borg benchmark cpu now measures the fastcdc chunker; tests in
borg.testsuite.chunkers (golden vector, size distribution, keyed gear table,
param parsing, slow fuzz); docs and changelog.

Benchmarks (scripts/chunker_bench.py, buzhash64 vs fastcdc, both nc_level=2,
incompressible data unless noted):

  5 GiB, 2 MiB target (default params):
    buzhash64: CV 0.294, 1011 MB/s
    fastcdc:   CV 0.295, 1313 MB/s   (+30%)

  64 MiB, 64 KiB target:
    buzhash64: CV 0.374, shift-resilience 0.9928,  963 MB/s
    fastcdc:   CV 0.359, shift-resilience 0.9929, 1331 MB/s   (+38%)

  Re-backup of a 2.5 GiB file after scattered single-byte edits (dedup ratio,
  0.5 = v2 fully deduplicated, lower is better):
     64 edits:  buzhash64 0.5237, fastcdc 0.5236
    320 edits:  buzhash64 0.6133, fastcdc 0.6161

  borg benchmark cpu, 1 GB: fastcdc 3.80s, buzhash 4.36s, buzhash64 8.13s,
  fixed 0.56s.

Chunk-size distribution, deduplication and shift-resilience match buzhash64
within noise; fastcdc is consistently faster.

Also: fix a bug in the mask computation. It must use 1ULL instead of 1,
so that the shift is done in a uint64, not in a 32-bit int.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>