perf: optimize KMeans centroid recomputation with thread-local parallel accumulators by hushengquan · Pull Request #6370 · lance-format/lance

hushengquan · 2026-04-01T03:10:25Z

What

Optimizes the centroid recomputation step of KMeansAlgoFloat::to_kmeans.

Changes

Replaced the old parallel-scan approach (each of P cores scans all N data points, filtering by centroid range) with a thread-local parallel accumulation pattern:

par_chunks splits data into P chunks (one per rayon thread)
Each thread accumulates into a private centroid buffer — zero write contention
reduce merges all thread-local buffers into the final centroids
Centroid normalization remains parallel over k clusters
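
The four steps above can be sketched as follows. This is a minimal illustration of the thread-local accumulation pattern, not the code from this PR: it swaps rayon's par_chunks and reduce for std::thread::scope and a serial merge so that it needs no external crates, and the function name recompute_centroids, its parameters, and the row-major layout are assumptions made for the example. Normalization is done serially here for brevity, whereas the PR keeps it parallel over the k clusters.

```rust
use std::thread;

/// Hypothetical helper: recompute centroids from point-to-cluster assignments.
/// `data` is row-major (n * dim); `assignments[i]` is the cluster id of point i.
/// Returns row-major centroids (k * dim). Empty clusters are left at zero.
pub fn recompute_centroids(
    data: &[f32],
    assignments: &[u32],
    dim: usize,
    k: usize,
    num_threads: usize,
) -> Vec<f32> {
    let n = assignments.len();
    assert!(dim > 0, "dim must be non-zero");
    assert_eq!(data.len(), n * dim, "data must hold n * dim values");

    // Ceil-divide so that at most `num_threads` chunks are produced.
    let rows_per_chunk = n.div_ceil(num_threads.max(1)).max(1);

    // Step 1 + 2: each thread scans only its own chunk and accumulates
    // into private buffers, so no two threads write to the same memory.
    let partials: Vec<(Vec<f32>, Vec<u64>)> = thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(rows_per_chunk * dim)
            .zip(assignments.chunks(rows_per_chunk))
            .map(|(rows, ids)| {
                s.spawn(move || {
                    let mut sums = vec![0.0f32; k * dim];
                    let mut counts = vec![0u64; k];
                    for (row, &c) in rows.chunks_exact(dim).zip(ids) {
                        let c = c as usize;
                        counts[c] += 1;
                        let acc = &mut sums[c * dim..(c + 1) * dim];
                        for (a, &x) in acc.iter_mut().zip(row) {
                            *a += x;
                        }
                    }
                    (sums, counts)
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Step 3: merge the per-thread buffers; cost is O(k * dim * P).
    let mut sums = vec![0.0f32; k * dim];
    let mut counts = vec![0u64; k];
    for (part_sums, part_counts) in partials {
        for (a, b) in sums.iter_mut().zip(&part_sums) {
            *a += *b;
        }
        for (a, b) in counts.iter_mut().zip(&part_counts) {
            *a += *b;
        }
    }

    // Step 4: turn sums into means (serial in this sketch).
    for (centroid, &count) in sums.chunks_exact_mut(dim).zip(&counts) {
        if count > 0 {
            let inv = 1.0 / count as f32;
            centroid.iter_mut().for_each(|v| *v *= inv);
        }
    }
    sums
}
```

Every point is read exactly once, by exactly one thread. The price is one extra k × dim buffer per thread plus the merge pass.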

Complexity comparison

	Old	New
Data reads	O(N × P)	O(N)
Merge overhead	—	O(k × dim × P)
Write contention	None (disjoint slices)	None (private buffers)
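
To make the table concrete, the counts can be computed directly from its formulas. The helper below is a hypothetical cost model, not part of the PR: it counts point visits and merge additions and ignores constant factors, cache effects, and memory bandwidth. The thread count P = 8 used in the note after the code is an assumption, as is reading "131K" as 131,072.

```rust
/// Hypothetical cost model for one centroid-recomputation pass.
#[derive(Debug, PartialEq)]
pub struct CentroidUpdateCost {
    /// Old approach: each of the P threads scans all N points.
    pub old_point_visits: u64,
    /// New approach: each point is visited once, by one thread.
    pub new_point_visits: u64,
    /// New approach: P private buffers of k * dim floats are merged.
    pub merge_adds: u64,
}

pub fn centroid_update_cost(n: u64, k: u64, dim: u64, p: u64) -> CentroidUpdateCost {
    CentroidUpdateCost {
        old_point_visits: n * p,
        new_point_visits: n,
        merge_adds: k * dim * p,
    }
}
```

For N = 131,072, k = 256, dim = 128 and an assumed P = 8, this gives 1,048,576 point visits for the old approach against 131,072 point visits plus 262,144 merge additions for the new one.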

Benchmarks (release, Apple M-series)

Config	Old	New	Speedup
N=131K, k=256, dim=128	3.53 ms	0.48 ms	7.4x
N=524K, k=1024, dim=256	9.98 ms	3.88 ms	2.6x
N=2M, k=4096, dim=256	45.95 ms	24.87 ms	1.8x

codecov · 2026-04-01T03:42:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

perf: optimize KMeans centroid recomputation with thread-local accumu…

f9eed91

…lators

github-actions bot added the performance label Apr 1, 2026

hushengquan wants to merge 1 commit into lance-format:main from hushengquan:optimize-kmeans