parquet: add Decoder::scan_filtered for miniblock-level predicate pushdown #9788

Open
sahuagin wants to merge 1 commit into apache:main from sahuagin:pr/scan-filtered

@sahuagin commented Apr 21, 2026

Closes #9785

Adds scan_filtered(num_values, out, predicate) as a provided method on the
Decoder trait. The method scans up to num_values, appending to out only
values from regions where predicate(lo, hi) returns true.
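As a rough illustration of the API shape described above, here is a minimal sketch of a provided trait method whose default ignores the predicate. `SketchDecoder`, `PlainSketch`, and the `Vec<i64>` output type are hypothetical stand-ins, not the parquet crate's actual `Decoder` trait or signatures; the `(values_emitted, values_consumed)` return shape follows the commit message.

```rust
/// Simplified stand-in for a decoder trait (hypothetical; not the real `Decoder`).
trait SketchDecoder {
    /// Decode up to `max` values into `out`, returning how many were decoded.
    fn get(&mut self, out: &mut Vec<i64>, max: usize) -> usize;

    /// Provided method: the default ignores `_predicate` and decodes everything,
    /// so every consumed value is also emitted.
    /// Returns (values_emitted, values_consumed).
    fn scan_filtered<F>(
        &mut self,
        num_values: usize,
        out: &mut Vec<i64>,
        _predicate: F,
    ) -> (usize, usize)
    where
        F: FnMut(i64, i64) -> bool,
    {
        let n = self.get(out, num_values);
        (n, n)
    }
}

/// A PLAIN-like decoder over an in-memory buffer with no per-region metadata.
struct PlainSketch {
    values: Vec<i64>,
    pos: usize,
}

impl SketchDecoder for PlainSketch {
    fn get(&mut self, out: &mut Vec<i64>, max: usize) -> usize {
        // Copy as many values as remain, up to `max`.
        let n = max.min(self.values.len() - self.pos);
        out.extend_from_slice(&self.values[self.pos..self.pos + n]);
        self.pos += n;
        n
    }
}
```

Because `PlainSketch` does not override `scan_filtered`, even a predicate that always returns false leaves its output unchanged, which mirrors the "safe fallback" behavior described for encodings without per-region metadata.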

Default implementation (all encodings): ignores the predicate, decodes
everything. Safe fallback — no behavioral change for existing decoders.

DeltaBitPackDecoder override: computes a conservative [lo, hi] range per
miniblock from last_value, min_delta, bit_width, and miniblock value count.
If the predicate rejects the range, the miniblock is skipped without emitting
any values. One of three skip strategies applies, depending on the miniblock's
bit width (bw) and position:

  • bw=0: arithmetic advancement of last_value, no bit reads.
  • Terminal bw>0: BitReader::skip, no decode.
  • Mid-stream bw>0: decode into scratch buffer to maintain last_value
    accuracy for subsequent miniblock range checks.
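One way the conservative range could be derived is sketched below. It assumes each delta in a miniblock equals min_delta plus an unsigned offset of at most 2^bit_width - 1, which is how DELTA_BINARY_PACKED stores deltas. The function name `miniblock_range` and the choice to return `None` on overflow (meaning "cannot bound, so decode") are assumptions for illustration, not the PR's actual code.

```rust
/// Conservative [lo, hi] bounds for the next `n` values of a delta-encoded
/// miniblock. Returns None when no bound can be given (n == 0, an implausible
/// bit width, or i64 overflow); a caller would then decode instead of skipping.
fn miniblock_range(last_value: i64, min_delta: i64, bit_width: u32, n: u32) -> Option<(i64, i64)> {
    if n == 0 || bit_width > 63 {
        return None;
    }
    // Largest unsigned offset that fits in `bit_width` bits.
    let max_packed: i64 = if bit_width == 0 {
        0
    } else {
        ((1u64 << bit_width) - 1) as i64
    };
    let d_min = min_delta;
    let d_max = min_delta.checked_add(max_packed)?;
    let n = n as i64;
    // The smallest value is reached after one step if deltas cannot be
    // negative, otherwise after all n steps of the smallest delta.
    let lo_steps = if d_min >= 0 { 1 } else { n };
    // The largest value is reached after n steps if deltas can be
    // non-negative, otherwise after a single step of the largest delta.
    let hi_steps = if d_max >= 0 { n } else { 1 };
    let lo = last_value.checked_add(d_min.checked_mul(lo_steps)?)?;
    let hi = last_value.checked_add(d_max.checked_mul(hi_steps)?)?;
    Some((lo, hi))
}
```

The bounds are loose by design: they describe values the miniblock could contain given its header, not the values it actually contains, so a predicate can only ever reject a miniblock that is certain not to match.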

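The predicate contract is conservative: false means the region definitely
cannot match (safe to skip); true means it might match (decode and emit).
False positives are safe: they only cause a region to be decoded unnecessarily.
False negatives are not permitted: a predicate must never return false for a
range that contains a matching value.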
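To make the contract concrete, the sketch below builds region predicates for two common row filters. The helper names `gt_region` and `between_region` are hypothetical and exist only for this example; each returns false only when no value in [lo, hi] could satisfy the row-level filter.

```rust
/// Region predicate for the row filter `x > threshold`.
/// A region can contain a match only if its upper bound exceeds the threshold.
fn gt_region(threshold: i64) -> impl Fn(i64, i64) -> bool {
    move |_lo, hi| hi > threshold
}

/// Region predicate for the row filter `a <= x <= b`.
/// A region can contain a match only if [lo, hi] overlaps [a, b].
fn between_region(a: i64, b: i64) -> impl Fn(i64, i64) -> bool {
    move |lo, hi| lo <= b && hi >= a
}
```

Since an accepted region may still contain values that fail the row filter, a caller would normally apply the exact row-level filter to the emitted values afterwards.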

Benchmarks (arrow_reader bench vs upstream HEAD, combined with #9786 and #9787):

Int32 skip single value:            -24.8%
Int32 skip increasing value:        -20.9%
Int32 skip stepped increasing:      -11.0%
Int64 skip single value:            -25.4%
Int64 skip increasing value:        -27.2%

Note: scan_filtered in isolation shows smaller gains since it does not have
the bw=0 (#9786) and terminal-skip (#9787) optimizations underneath it. The
numbers above reflect the combined state, which is the intended deployment.
Benchmarks were run on a non-isolated machine (no CPU frequency pinning);
small variances of ±5% on non-bw=0 paths should be attributed to measurement
noise.

Tests added:

  • Default implementation (PLAIN): predicate ignored, all values emitted.
  • Delta reject-all: nothing emitted, all values consumed.
  • Delta accept-all: all values emitted (identical to get()).
  • Delta conservative overlap: miniblock accepted when range overlaps threshold.
  • Delta bw=0 reject/accept: constant column skipped or emitted O(1).
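The O(1) behavior in the bw=0 case follows from every delta in the miniblock being exactly min_delta, so the position after n values can be computed without reading any bits. A minimal sketch, assuming wrapping i64 arithmetic and a hypothetical helper name `skip_constant_miniblock`:

```rust
/// Advance `last_value` past `n` values of a bit_width == 0 miniblock.
/// Every delta equals `min_delta`, so no packed bits need to be read.
/// Wrapping arithmetic is an assumption made for this sketch.
fn skip_constant_miniblock(last_value: i64, min_delta: i64, n: u32) -> i64 {
    last_value.wrapping_add(min_delta.wrapping_mul(n as i64))
}
```

With min_delta equal to zero this leaves last_value unchanged, which corresponds to the constant-column case in the test list.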

Note on API surface: scan_filtered is a provided method with a safe
default, so adding it is non-breaking. Encodings that don't have per-region
metadata (PLAIN, RLE, etc.) get the correct conservative behavior for free.

Generated-by: Claude (claude-sonnet-4-6)

…te pushdown

Adds scan_filtered(num_values, out, predicate) as a provided method on the
Decoder trait. The default implementation ignores the predicate and decodes
everything — safe fallback for all encodings without per-region metadata.

DeltaBitPackDecoder overrides it to compute a conservative [lo, hi] range per
miniblock from last_value, min_delta, bit_width, and miniblock value count.
If the predicate rejects the range the miniblock is skipped without decoding:
  - bw=0: arithmetic advancement of last_value, no bit reads
  - terminal bw>0: BitReader::skip, no decode
  - mid-stream bw>0: decode into scratch to maintain last_value accuracy
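The choice among the three strategies can be pictured as a small dispatch on bit width and position. The enum, the function name, and the reading of "terminal" as "no later miniblock in this scan needs last_value" are assumptions for illustration only.

```rust
/// The three skip strategies listed above (names are hypothetical).
#[derive(Debug, PartialEq)]
enum SkipStrategy {
    /// bw=0: advance last_value arithmetically, no bit reads.
    Arithmetic,
    /// Terminal bw>0: skip the packed bits without decoding them.
    BitSkip,
    /// Mid-stream bw>0: decode into scratch so last_value stays accurate.
    DecodeToScratch,
}

/// Pick a strategy for a rejected miniblock.
/// `is_terminal` is assumed to mean that nothing after this miniblock
/// needs an accurate last_value within the current scan.
fn choose_skip(bit_width: u8, is_terminal: bool) -> SkipStrategy {
    match (bit_width, is_terminal) {
        (0, _) => SkipStrategy::Arithmetic,
        (_, true) => SkipStrategy::BitSkip,
        (_, false) => SkipStrategy::DecodeToScratch,
    }
}
```

The mid-stream case still pays for decoding because the range check for the next miniblock depends on last_value, which for bw>0 can only be known by summing the actual deltas.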

Returns (values_emitted, values_consumed).

Benchmarks vs upstream HEAD:
  scan_filtered on 1M-row monotone DELTA column: 1.96ms -> 470us (4.2x)

Split from apache#9769 as requested by reviewer.

@github-actions bot added the parquet (Changes to the parquet crate) label Apr 21, 2026
@sahuagin marked this pull request as draft April 21, 2026 19:09

@sahuagin

Note on benchmark variance: These results were collected on a non-isolated machine without CPU frequency pinning. Small variances of ±5% on non-bw=0 paths (particularly mandatory/optional, no NULLs) are consistent with measurement noise rather than real regressions — the changes to those code paths are additive only. The #[cold] annotation on the terminal path (PR #9787) was added specifically to prevent icache pressure on the non-terminal hot path. Happy to share raw criterion output or rerun on a more controlled setup if helpful.

Labels: parquet (Changes to the parquet crate), performance
