compute: OffsetOptimized index_pair + strided_len caching - 2.0-2.25x arrangement lookup speedup #35159

Draft
def- wants to merge 1 commit into MaterializeInc:main from def-:pr-compute-offset-optimized

Conversation

@def- def- commented Feb 22, 2026

Three optimizations to the core offset lookup data structures used in differential dataflow arrangement spines (OffsetOptimized, BytesBatch, BytesContainer in row_spine.rs):

  1. OffsetStride::index_pair() returns (index(i), index(i+1)) with a single enum dispatch instead of two separate calls. For uniform stride (common after compaction), computes both values with one multiply.

  2. strided_len field caches strided.len() on OffsetOptimized, avoiding repeated enum dispatch on every index() call. Updated on push_into().

  3. Single-batch fast path in BytesContainer::index() avoids iterator loop for the common single-batch case after compaction/merge.
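As a rough illustration of items 1 and 2, the sketch below uses hypothetical, simplified types (`Stride`, `Offsets`, `try_push`, `spilled`) that are not the actual definitions in row_spine.rs. It assumes a uniform stride stores offsets `0, stride, 2 * stride, ...` and that irregular offsets spill into a plain vector. Only `index_pair` and the cached `strided_len` field are named in the description above; everything else is an assumption made for the example.

```rust
/// Hypothetical, simplified stand-in for a strided offset encoding.
#[derive(Clone, Copy, Debug)]
enum Stride {
    /// No offsets recorded yet.
    Empty,
    /// Only the offset 0 has been recorded.
    Zero,
    /// Offsets 0, stride, 2 * stride, ..., steps * stride.
    Uniform { stride: usize, steps: usize },
}

impl Stride {
    fn len(&self) -> usize {
        match self {
            Stride::Empty => 0,
            Stride::Zero => 1,
            Stride::Uniform { steps, .. } => steps + 1,
        }
    }

    /// Accepts `offset` only if it continues the uniform pattern.
    fn try_push(&mut self, offset: usize) -> bool {
        match *self {
            Stride::Empty if offset == 0 => *self = Stride::Zero,
            Stride::Zero => *self = Stride::Uniform { stride: offset, steps: 1 },
            Stride::Uniform { stride, steps } if offset == stride * (steps + 1) => {
                *self = Stride::Uniform { stride, steps: steps + 1 }
            }
            _ => return false,
        }
        true
    }

    fn index(&self, i: usize) -> usize {
        match self {
            Stride::Empty => panic!("index into empty stride"),
            Stride::Zero => 0,
            Stride::Uniform { stride, .. } => stride * i,
        }
    }

    /// Returns `(index(i), index(i + 1))` with a single `match`.
    fn index_pair(&self, i: usize) -> (usize, usize) {
        match self {
            Stride::Uniform { stride, .. } => {
                let lo = stride * i; // the only multiply
                (lo, lo + stride) // the neighbor is one addition away
            }
            _ => panic!("fewer than two strided offsets"),
        }
    }
}

#[derive(Debug)]
struct Offsets {
    strided: Stride,
    /// Cached `strided.len()`, refreshed only in `push`.
    strided_len: usize,
    /// Offsets that no longer fit the uniform pattern.
    spilled: Vec<usize>,
}

impl Offsets {
    fn new() -> Self {
        Offsets { strided: Stride::Empty, strided_len: 0, spilled: Vec::new() }
    }

    fn push(&mut self, offset: usize) {
        if self.spilled.is_empty() && self.strided.try_push(offset) {
            self.strided_len = self.strided.len();
        } else {
            self.spilled.push(offset);
        }
    }

    fn index(&self, i: usize) -> usize {
        // Compares against the cached length instead of calling `strided.len()`.
        if i < self.strided_len {
            self.strided.index(i)
        } else {
            self.spilled[i - self.strided_len]
        }
    }

    fn index_pair(&self, i: usize) -> (usize, usize) {
        if i + 1 < self.strided_len {
            self.strided.index_pair(i)
        } else {
            // The pair straddles or lies past the strided region: fall back.
            (self.index(i), self.index(i + 1))
        }
    }
}
```

In this sketch, pushing the offsets 0, 8, 16, 24, 30, 41 keeps the first four in the strided encoding and spills the last two, so `index_pair(1)` takes the single-dispatch path while `index_pair(3)` falls back to two `index` calls.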

Coverage data shows 1.58 trillion OffsetStride operations, making this the most frequently executed code path in the compute layer.
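The single-batch fast path from item 3 can be pictured roughly as follows. `Batch` and `Container` here are hypothetical stand-ins, not the real `BytesBatch` and `BytesContainer`; the sketch only assumes that a container holds a list of batches and that a lookup otherwise walks them while subtracting batch lengths.

```rust
/// Hypothetical batch: concatenated bytes plus offsets (the first offset is 0).
struct Batch {
    bytes: Vec<u8>,
    offsets: Vec<usize>,
}

impl Batch {
    fn new() -> Self {
        Batch { bytes: Vec::new(), offsets: vec![0] }
    }

    fn push(&mut self, item: &[u8]) {
        self.bytes.extend_from_slice(item);
        self.offsets.push(self.bytes.len());
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    fn get(&self, i: usize) -> &[u8] {
        &self.bytes[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Hypothetical container made of one or more batches.
struct Container {
    batches: Vec<Batch>,
}

impl Container {
    fn index(&self, mut i: usize) -> &[u8] {
        // Fast path: exactly one batch, so no loop or subtraction is needed.
        if let [only] = self.batches.as_slice() {
            return only.get(i);
        }
        // General path: find the batch that contains position `i`.
        for batch in &self.batches {
            if i < batch.len() {
                return batch.get(i);
            }
            i -= batch.len();
        }
        panic!("index out of bounds");
    }
}
```

Both paths return the same slices; the fast path only skips the per-batch loop when the container holds a single batch.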

Benchmark results (10K lookups):

  • DatumContainer sequential int5: 31.44µs → 15.00µs (2.10x)
  • DatumContainer sequential mixed5: 29.88µs → 15.10µs (1.98x)
  • DatumContainer sequential narrow1: 28.14µs → 12.52µs (2.25x)
  • DatumContainer random int5: 31.67µs → 19.42µs (1.63x)

Taken and cleaned up from #35076

@github-actions

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

… arrangement lookup speedup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@def- def- force-pushed the pr-compute-offset-optimized branch from 43f60ff to 7cbffe0 Compare February 22, 2026 22:53
@def- def- closed this Feb 23, 2026
@def- def- reopened this Feb 23, 2026