
feat(parquet): wire scan_filtered through ArrayReader stack; add with_miniblock_predicate#9770

Draft
sahuagin wants to merge 2 commits into apache:main from sahuagin:delta-scan-filtered-wiring

Conversation


@sahuagin sahuagin commented Apr 20, 2026

Which issue does this PR close?

Depends on #9769 (the decoder PR). Please review after #9769 merges; until then, this PR's diff includes those decoder changes.

Rationale for this change

Exposes the scan_filtered miniblock-level predicate pushdown added in #9769 through the full column reader stack and as a public API. For mandatory DELTA_BINARY_PACKED
INT32/INT64 columns, entire miniblocks (32/64 values) can be skipped without decoding when a caller-supplied range predicate rules them out. This is especially effective
for monotone columns (timestamps, sequence numbers, auto-increment IDs) where bw=0 blocks allow O(1) skipping.
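To make the bw=0 case concrete, here is a minimal sketch (not the actual crate internals) of why a zero-bit-width miniblock can be range-checked and skipped in O(1): every delta in such a miniblock equals the block's min_delta, so the values form an arithmetic progression whose bounds follow from the header alone. The names `start`, `min_delta`, and `n` stand in for fields the real decoder reads from the block header.

```rust
// Sketch: value range of a bw=0 miniblock without decoding any values.
// All n deltas equal min_delta, so values are
//   start + min_delta, start + 2*min_delta, ..., start + n*min_delta.
fn bw0_range(start: i64, min_delta: i64, n: i64) -> (i64, i64) {
    let first = start + min_delta;
    let last = start + n * min_delta;
    if min_delta >= 0 {
        (first, last) // monotone non-decreasing
    } else {
        (last, first) // monotone non-increasing
    }
}

fn main() {
    // A 32-value miniblock of timestamps stepping by 1000:
    let (lo, hi) = bw0_range(1_000_000, 1000, 32);
    assert_eq!((lo, hi), (1_001_000, 1_032_000));
    // A predicate such as |_lo, hi| hi >= 2_000_000 can now reject the
    // whole miniblock with one comparison, no per-value decoding.
    println!("{lo} {hi}");
}
```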

What changes are included in this PR?

Wiring chain (bottom to top):

  • ColumnValueDecoderImpl::scan_filtered_values() — dispatches to decoder
  • GenericColumnReader::scan_filtered_records() — mandatory columns only; optional/repeated fall back to full decode to keep def/rep levels in sync
  • GenericRecordReader::scan_filtered_records() — page-switching loop with same fallback
  • ArrayReader::scan_records() — new provided trait method (default = read_records, safe for all encodings/types)
  • PrimitiveArrayReader::scan_records() — override that calls the above chain
  • StructArrayReader::scan_records() — delegates to its single child for single-column projections; multi-column projections fall back to read_records
  • ParquetRecordBatchReader::next_inner — uses scan_records in the All selection branch when a predicate is set
  • ArrowReaderBuilder::with_miniblock_predicate() — fluent setter; works for both ParquetRecordBatchReaderBuilder (sync) and ParquetRecordBatchStreamBuilder (async)
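The "provided default" compatibility claim in the chain above can be illustrated with a minimal sketch; names and signatures here are illustrative stand-ins, not the crate's actual definitions:

```rust
use std::sync::Arc;

type MiniblockPredicate = Arc<dyn Fn(i64, i64) -> bool + Send + Sync>;

trait ArrayReader {
    fn read_records(&mut self, batch_size: usize) -> usize;

    // Provided default: ignore the predicate and decode everything.
    // Existing implementations stay correct without any changes;
    // only readers that can skip safely (e.g. the primitive reader
    // over DELTA_BINARY_PACKED) override this.
    fn scan_records(&mut self, batch_size: usize, _pred: &MiniblockPredicate) -> usize {
        self.read_records(batch_size)
    }
}

// A reader with no override silently gets the full-decode fallback.
struct FullDecodeReader { remaining: usize }

impl ArrayReader for FullDecodeReader {
    fn read_records(&mut self, batch_size: usize) -> usize {
        let n = batch_size.min(self.remaining);
        self.remaining -= n;
        n
    }
}

fn main() {
    let pred: MiniblockPredicate = Arc::new(|_lo, hi| hi >= 0);
    let mut r = FullDecodeReader { remaining: 100 };
    assert_eq!(r.scan_records(64, &pred), 64); // default path: full decode
    assert_eq!(r.scan_records(64, &pred), 36);
}
```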

New public API:
pub type MiniblockPredicate = Arc<dyn Fn(i64, i64) -> bool + Send + Sync>;

// on ArrowReaderBuilder:
pub fn with_miniblock_predicate(self, predicate: MiniblockPredicate) -> Self;

Example:
  let pred: MiniblockPredicate = Arc::new(|_lo, hi| hi >= 1_000_000);
  let reader = ParquetRecordBatchReaderBuilder::try_new(file)?
      .with_projection(mask)
      .with_miniblock_predicate(pred)
      .build()?;

Limitations (documented in MiniblockPredicate rustdoc):

  • Multi-column projections fall back to full decode (each column has independent value ranges)
  • Optional/repeated columns fall back (def/rep level synchronization required)
  • No file format changes; miniblock ranges are computed on-the-fly from data already present in DELTA_BINARY_PACKED block headers

Are these changes tested?

  • test_scan_records_delta_binary_packed_mandatory in primitive_array.rs: unit test exercising the full wiring chain on a mandatory INT64 DELTA column, verifying correct
    miniblock-level skipping and no false negatives
  • test_with_miniblock_predicate_single_column in arrow_reader/mod.rs: end-to-end test through ParquetRecordBatchReaderBuilder verifying the public API, correct output,
    and that fewer rows are returned than a full read

Are there any user-facing changes?

Yes — two additive public API items:

  • MiniblockPredicate type alias
  • ArrowReaderBuilder::with_miniblock_predicate() method

No breaking changes. The new scan_records method on the ArrayReader trait has a provided default (read_records) so no existing implementations are affected.

…red()

Two additions to DeltaBitPackDecoder:

1. skip() optimization: bw=0 miniblocks use an O(1) multiply instead of
   decoding 32/64 values per miniblock. Terminal skips (discarding all
   remaining page values) avoid heap allocation and last_value tracking.

2. Decoder::scan_filtered() — new provided method on the Decoder trait
   (default: decode everything, safe fallback for all encodings).
   DeltaBitPackDecoder overrides it to compute a conservative value range
   [lo, hi] per miniblock and skip non-matching miniblocks without decoding
   individual values.
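A minimal sketch of how such a conservative [lo, hi] can be derived from header data alone, assuming a miniblock of n deltas packed at `bit_width` bits as unsigned offsets from `min_delta` (so each delta lies in [min_delta, min_delta + 2^bit_width - 1]); names are illustrative, and bit widths near 64 would need overflow-checked arithmetic that this sketch omits:

```rust
// Conservative value range for one miniblock, computed from the block
// header without decoding: each of the n deltas is in [d_lo, d_hi],
// so the running sum after k deltas is in [k*d_lo, k*d_hi] when the
// bound has consistent sign, giving safe (possibly loose) bounds.
// Assumes bit_width <= 62 to keep the shift in range for i64.
fn miniblock_range(start: i64, min_delta: i64, bit_width: u32, n: i64) -> (i64, i64) {
    let d_lo = min_delta;
    let d_hi = min_delta + ((1i64 << bit_width) - 1);
    // Smallest reachable value: one minimal step if deltas can only
    // grow, n minimal steps if they can shrink; symmetrically for max.
    let lo = start + if d_lo < 0 { n * d_lo } else { d_lo };
    let hi = start + if d_hi > 0 { n * d_hi } else { d_hi };
    (lo, hi)
}

fn main() {
    // 32 deltas, each in [1, 4]: min value after one step, max after 32.
    assert_eq!(miniblock_range(100, 1, 2, 32), (101, 228));
    // Strictly decreasing block: deltas in [-3, -2].
    assert_eq!(miniblock_range(0, -3, 1, 4), (-12, -2));
    // bw=0 degenerates to the exact arithmetic-progression bounds.
    assert_eq!(miniblock_range(10, 5, 0, 3), (15, 25));
}
```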

Benchmarks vs upstream HEAD (arrow_reader bench):
  bw=0 single-value skip:      -21.6%
  bw=0 increasing-value skip:  -24.3%
  mixed stepped skip:           -3.9%

Wall-time scan_filtered on 1M-row DELTA file (monotone column):
  full decode: 1.96ms -> scan_filtered: 470us (4.2x speedup)
…_miniblock_predicate

Wires DeltaBitPackDecoder::scan_filtered() up the full column reader stack
and exposes it as a public API on both ParquetRecordBatchReaderBuilder
(sync) and ParquetRecordBatchStreamBuilder (async).

Wiring chain (bottom to top):
  ColumnValueDecoderImpl::scan_filtered_values()
  GenericColumnReader::scan_filtered_records()    (mandatory columns only)
  GenericRecordReader::scan_filtered_records()    (optional/repeated fallback)
  ArrayReader::scan_records() trait method + page-switching helper
  PrimitiveArrayReader::scan_records() override
  StructArrayReader::scan_records()               (single-child delegate)
  ParquetRecordBatchReader::next_inner All branch
  ArrowReaderBuilder::with_miniblock_predicate()

Public API additions:
  - MiniblockPredicate type alias
    (Arc<dyn Fn(i64, i64) -> bool + Send + Sync>)
  - ArrowReaderBuilder::with_miniblock_predicate(pred) fluent setter

Example:
  let pred: MiniblockPredicate = Arc::new(|_lo, hi| hi >= 1_000_000);
  let reader = ParquetRecordBatchReaderBuilder::try_new(file)?
      .with_projection(mask)
      .with_miniblock_predicate(pred)
      .build()?;

Limitations:
  - Multi-column projections fall back to full decode (column value ranges
    are independent; a shared predicate would produce mismatched lengths)
  - Optional/repeated columns fall back (def/rep levels must stay
    synchronized with the values buffer)
  - No file format changes; miniblock ranges computed on-the-fly from
    DELTA_BINARY_PACKED block headers already present in the page data
@github-actions github-actions Bot added the parquet Changes to the parquet crate label Apr 20, 2026
@sahuagin sahuagin marked this pull request as draft April 20, 2026 02:28
