[AURON #2163] Support native Iceberg scans with residual filters via scan pruning and post-scan native filter #2164

Open
weimingdiit wants to merge 1 commit into apache:master from weimingdiit:feat/iceberg-native-support

@weimingdiit (Contributor) commented Apr 4, 2026

Which issue does this PR close?

Closes #2163

Rationale for this change

The previous behavior was too conservative for Iceberg scans with residual filters. Even when the scan could still be executed natively and the remaining filter logic could be handled above the scan, the planner would fall back entirely.

This PR improves native coverage for Iceberg reads by:

  • preserving correctness for unsupported predicates
  • increasing native scan applicability for common filter patterns
  • reusing the existing native filter path instead of requiring full scan-level predicate support up front

This is an incremental improvement to Iceberg native execution, not full Iceberg feature coverage.
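To illustrate why pruning at the scan combined with a filter above the scan keeps results correct, here is a minimal toy model. It assumes pruning works on per-file min/max statistics, as is common for Parquet-style scans; the `DataFile`, `may_contain_eq`, and `scan` names are hypothetical and are not Auron or Iceberg APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DataFile:
    """Toy stand-in for a data file; real files store stats in metadata."""
    rows: List[dict]

    def min_max(self, column: str) -> Tuple[object, object]:
        values = [row[column] for row in self.rows]
        return min(values), max(values)


def may_contain_eq(data_file: DataFile, column: str, value) -> bool:
    """Coarse check: can `column = value` match any row in this file?"""
    lo, hi = data_file.min_max(column)
    return lo <= value <= hi


def scan(files: List[DataFile], column: str, value,
         residual: Callable[[dict], bool]) -> Tuple[List[dict], int]:
    """Skip files ruled out by stats, then filter the surviving rows."""
    out, files_read = [], 0
    for data_file in files:
        if not may_contain_eq(data_file, column, value):
            continue  # pruned: this file is never read
        files_read += 1
        for row in data_file.rows:
            # Pruning is only a coarse skip, so the row-level filter still
            # checks the full predicate in this sketch.
            if row[column] == value and residual(row):
                out.append(row)
    return out, files_read


files = [
    DataFile([{"id": 1, "name": "ann"}, {"id": 3, "name": "bob"}]),
    DataFile([{"id": 4, "name": "bea"}, {"id": 5, "name": "bo"}, {"id": 5, "name": "cy"}]),
    DataFile([{"id": 7, "name": "ben"}, {"id": 9, "name": "dee"}]),
]
rows, files_read = scan(files, "id", 5, lambda r: r["name"].startswith("b"))
```

In this toy run only the middle file is read, and the result matches what filtering every row without pruning would return.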

What changes are included in this PR?

This PR:

  • removes the unconditional fallback for Iceberg scans with non-alwaysTrue residual filters
  • extends IcebergScanPlan to carry pruningPredicates
  • extracts Iceberg scan filter expressions and converts a supported subset into Spark expressions
  • converts those Spark expressions into native scan pruning predicates
  • passes pruning predicates down through NativeIcebergTableScanExec
  • keeps unsupported predicates on the upper NativeFilter path
  • adds integration coverage for:
    • equality-based pruning
    • IN-based pruning
    • partial pushdown where only part of the predicate is pushed to scan pruning
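The partial-pushdown idea can be sketched as splitting a conjunction into the parts a scan can use for pruning and the parts that stay on the filter above it. This is a language-neutral sketch in Python, not the PR's Scala code; `Pred`, `And`, `split_conjuncts`, `partition_for_scan`, and the `supported` callback are hypothetical names. Whether conjuncts pushed to pruning are also re-evaluated by the upper filter is not stated in this description, so the sketch only separates the two groups.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass(frozen=True)
class Pred:
    """A leaf predicate such as `id = 5` (toy model)."""
    column: str
    op: str
    value: object


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Pred, And]


def split_conjuncts(expr: Expr) -> List[Pred]:
    """Flatten nested ANDs into a list of top-level conjuncts."""
    if isinstance(expr, And):
        return split_conjuncts(expr.left) + split_conjuncts(expr.right)
    return [expr]


def partition_for_scan(
    expr: Expr, supported: Callable[[Pred], bool]
) -> Tuple[List[Pred], List[Pred]]:
    """Return (pruning predicates, predicates left for the post-scan filter)."""
    pushed, residual = [], []
    for conjunct in split_conjuncts(expr):
        (pushed if supported(conjunct) else residual).append(conjunct)
    return pushed, residual


# Assumption for the example: predicates on `name` cannot be pushed.
expr = And(Pred("id", "=", 5), And(Pred("name", "=", "a"), Pred("ts", ">", 10)))
pushed, residual = partition_for_scan(expr, lambda p: p.column != "name")
```

Splitting is only safe across AND: dropping one conjunct from the pruning side can make pruning less selective, but it never removes rows that the full predicate would keep.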

Supported predicate scope in this PR

The scan-pruning conversion added here supports a limited subset of Iceberg expressions, including:

  • AND
  • OR
  • NOT
  • IS NULL
  • IS NOT NULL
  • IS NAN
  • NOT NAN
  • comparison predicates such as =, !=, <, <=, >, >=
  • IN
  • NOT IN

The current implementation intentionally avoids pushing some types through scan pruning, including:

  • StringType
  • BinaryType
  • DecimalType

Unsupported predicates are not pushed into scan pruning and are instead left for post-scan native filtering.
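A support check along these lines could look like the following sketch. It models expressions as plain tuples and column types as lowercase strings, both of which are assumptions made for illustration; the operator names, `is_prunable`, and the `schema` dict are hypothetical and do not mirror the PR's actual Scala conversion code.

```python
# Operators and excluded types taken from the lists above; the string
# spellings are assumptions of this sketch.
LOGICAL_OPS = {"and", "or", "not"}
LEAF_OPS = {
    "is_null", "not_null", "is_nan", "not_nan",
    "=", "!=", "<", "<=", ">", ">=",
    "in", "not_in",
}
EXCLUDED_TYPES = {"string", "binary", "decimal"}


def is_prunable(expr: tuple, schema: dict) -> bool:
    """Return True if `expr` can be converted into a scan pruning predicate.

    Expressions are tuples: ("=", "id", 5), ("in", "id", [1, 2]),
    ("and", left, right), ("not", child), and so on.
    """
    op = expr[0]
    if op in LOGICAL_OPS:
        # Every child must be convertible; an OR or NOT with one
        # unsupported child cannot be pushed in part.
        return all(is_prunable(child, schema) for child in expr[1:])
    if op in LEAF_OPS:
        col_type = schema.get(expr[1])
        return col_type is not None and col_type not in EXCLUDED_TYPES
    return False  # unknown operator: leave it for the post-scan filter


schema = {"id": "long", "name": "string", "score": "double"}
print(is_prunable(("=", "id", 5), schema))
print(is_prunable(("or", ("=", "id", 1), ("=", "name", "a")), schema))
```

Returning `False` for anything unrecognized keeps the check conservative: an expression that is not pushed is still evaluated by the post-scan filter, so the only cost of a false negative is less pruning.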

How was this patch tested?

Integration coverage was added in AuronIcebergIntegrationSuite for equality-based pruning, IN-based pruning, and partial pushdown.

…s via scan pruning and post-scan native filter
@weimingdiit force-pushed the feat/iceberg-native-support branch from e19e1a9 to 4fe640b on April 4, 2026 10:33
@weimingdiit marked this pull request as ready for review on April 4, 2026 15:37

Development

Successfully merging this pull request may close these issues.

Improve native Iceberg scan coverage by pushing supported residual filters into scan pruning
