perf(snapshots): Parallelize image hashing with rayon #3250

Merged
NicoHinderling merged 2 commits into master from 03-26-perf_snapshots_parallelize_image_hashing_with_rayon on Mar 27, 2026

Conversation

NicoHinderling (Contributor) commented Mar 26, 2026

Use rayon's par_iter to hash all images concurrently instead of sequentially. Also increase the hash read buffer from 8 KB to 64 KB to reduce syscall overhead. On a 753-image / 99 MB dataset, hashing time drops from 5.3s to 0.8s (a 6.6x speedup).
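
For readers without the diff at hand, here is a rough sketch of the two ideas described above: reading each file through a 64 KB buffer, and hashing many files at once. It is not the code from this PR. To stay dependency-free it uses std::thread::scope as a stand-in for rayon's par_iter, and std's DefaultHasher as a placeholder because the description does not name the hash function. The names HASH_BUF_SIZE, hash_reader, hash_file, and hash_all are all hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::thread;

/// Read buffer size (hypothetical name). The description cites 8 KB -> 64 KB.
const HASH_BUF_SIZE: usize = 64 * 1024;

/// Hash a reader in fixed-size chunks. A larger buffer means fewer read
/// calls per file. DefaultHasher is a placeholder, not the PR's hash.
fn hash_reader<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    let mut buf = vec![0u8; HASH_BUF_SIZE];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}

/// Open a file and hash its contents.
fn hash_file(path: &Path) -> io::Result<u64> {
    hash_reader(File::open(path)?)
}

/// Hash all files concurrently, returning results in input order.
/// Assumption: with rayon, this whole function would roughly be
/// `paths.par_iter().map(|p| hash_file(p)).collect()`.
fn hash_all(paths: &[PathBuf]) -> Vec<io::Result<u64>> {
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Split the input into at most `workers` contiguous chunks.
    let chunk_len = ((paths.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn one thread per chunk before joining any of them.
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|p| hash_file(p)).collect::<Vec<_>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("hashing thread panicked"))
            .collect()
    })
}
```

Calling hash_all(&paths) yields one io::Result<u64> per path, so a single unreadable file does not abort the rest. Static chunking like this does not balance load the way rayon's work stealing does, which matters when image sizes vary widely.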

Contributor Author

NicoHinderling commented Mar 26, 2026

NicoHinderling marked this pull request as ready for review March 26, 2026 21:18
NicoHinderling requested review from a team as code owners March 26, 2026 21:18
NicoHinderling changed the base branch from fix/chunk-snapshot-uploads to graphite-base/3250 March 26, 2026 21:28
NicoHinderling force-pushed the 03-26-perf_snapshots_parallelize_image_hashing_with_rayon branch from 97fb068 to 3c123e0 March 26, 2026 21:28
NicoHinderling changed the base branch from graphite-base/3250 to master March 26, 2026 21:29
NicoHinderling merged commit 572bb4c into master Mar 27, 2026
27 of 41 checks passed
NicoHinderling deleted the 03-26-perf_snapshots_parallelize_image_hashing_with_rayon branch March 27, 2026 15:02

3 participants