Skip to content

perf(query,types): lazy-init ShardedMap + inline small-input math#9718

Open
shaunpatterson wants to merge 2 commits into
dgraph-io:mainfrom
shaunpatterson:perf/sharded-map-and-math
Open

perf(query,types): lazy-init ShardedMap + inline small-input math#9718
shaunpatterson wants to merge 2 commits into
dgraph-io:mainfrom
shaunpatterson:perf/sharded-map-and-math

Conversation

@shaunpatterson
Copy link
Copy Markdown
Contributor

types: ShardedMap.shards becomes a [30]map fixed array (one alloc for the struct) with shard maps created lazily on first write. Adds PeekShard (read-only, no alloc), makes GetShardOrNil lazily allocate, and fixes IsEmpty (previously returned false for any non-nil map; now true when empty).

query: processBinary handles small inputs (<512 entries) inline instead of launching 30 goroutines + a WaitGroup, using PeekShard for read-only access.

Bundled because the inline math path depends on PeekShard. BenchmarkShardedMap: New 560n→0.25n (30→0 allocs). BenchmarkProcessBinarySmall: n=8 8.9µs→2.2µs (-75%).


Independent change; branched from 4b9b399d (no dependency on the other PRs in this set). Verified: package unit tests pass + Go benchmark (benchstat, p<0.01). Numbers are single-thread microbenchmarks; not yet run through the Docker integration suite.

shards becomes a [30]map fixed array (one heap alloc for the struct); shard
maps are created lazily on first write. Adds PeekShard (read-only, no alloc)
and makes GetShardOrNil lazily allocate. Also fixes IsEmpty, which previously
returned false for any non-nil map (it checked len(shards)) and now reports
true when all shards are empty.

BenchmarkShardedMap:
  New:    560.8n -> 0.25n, 30 allocs -> 0
  SetGet: 1060n -> 682n,  38 allocs -> 16 (-57.9%)
types: ShardedMap.shards becomes a [30]map fixed array (one heap alloc for the
struct) with shard maps created lazily on first write. Adds PeekShard
(read-only, no alloc), makes GetShardOrNil lazily allocate, and fixes IsEmpty
(previously returned false for any non-nil map; now reports true when empty).

query: processBinary processes small inputs (<512 total entries) inline instead
of launching 30 goroutines + a WaitGroup, using PeekShard for read-only shard
access. These are bundled because the inline math path depends on PeekShard.

BenchmarkShardedMap: New 560n->0.25n (30->0 allocs); SetGet 38->16 allocs.
BenchmarkProcessBinarySmall: n=8 8.9µs->2.2µs (-75%, 233->60 allocs).
@shaunpatterson shaunpatterson requested a review from a team as a code owner May 28, 2026 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant