branch-4.1: [fix](iceberg) Fix NPE in COUNT(*) pushdown when snapshot summary omits total-* counters (#64648) #65061

Open
morningman wants to merge 1 commit into apache:branch-4.1 from morningman:41_bp64648

@morningman

Contributor

Proposed changes

Backport of #64648 to branch-4.1.

IcebergScanNode.getCountFromSnapshot() read total-equality-deletes /
total-position-deletes / total-records from the Iceberg snapshot summary and
called .equals() / Long.parseLong() directly on the Map.get() results. An
Iceberg snapshot summary is not guaranteed to carry these total-* counters
(snapshots produced by compaction or replace operations, or by some writers,
may omit them), so SELECT COUNT(*) threw a NullPointerException on such tables
while SELECT * still succeeded.
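
A minimal sketch of the failure mode described above, assuming a summary map with no total-* keys. The class and helper names (SummaryNpeDemo, hasNoEqualityDeletes, outcome) are hypothetical and are not taken from the Doris code base. It also shows that, in plain Java, only the .equals() call on a missing key raises NullPointerException; Long.parseLong(null) raises NumberFormatException instead.

```java
import java.util.HashMap;
import java.util.Map;

class SummaryNpeDemo {
    // Mirrors the unguarded pattern: call a method directly on the Map.get() result.
    static boolean hasNoEqualityDeletes(Map<String, String> summary) {
        return summary.get("total-equality-deletes").equals("0");
    }

    // Returns the simple class name of the exception thrown, or "ok" if none.
    static String outcome(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Assumption: a snapshot summary that carries none of the total-* counters.
        Map<String, String> summary = new HashMap<>();

        // Map.get() returns null, so .equals() dereferences null.
        System.out.println(outcome(() -> hasNoEqualityDeletes(summary)));

        // Long.parseLong(null) is rejected with NumberFormatException, not NPE.
        System.out.println(outcome(() -> Long.parseLong(summary.get("total-records"))));
    }
}
```

Either exception would abort count pushdown planning in this simplified model, which is why the counters need a null check before they are used.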

Fix

  • Extract the summary parsing into a pure static method,
    IcebergUtils.getCountFromSummary(summary, ignoreDanglingDelete), that
    null-checks the counters and falls back to a normal scan (returns -1) when
    any required counter is absent, and reuse it from both IcebergScanNode and
    IcebergUtils.getIcebergRowCount().
  • BE: IcebergTableReader::init_row_filters() accepts a table-level row count of
    0 (>= 0 instead of > 0) so a genuine pushed-down count of 0 takes the
    CountReader fast path. Applied to this branch's
    be/src/format/table/iceberg_reader.cpp (master's
    iceberg_reader_mixin.h does not exist on this branch).
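
The null-checking logic described in the first bullet could look roughly like the sketch below. This is not the code from the patch: the class name CountFromSummarySketch and the summaryOf helper are hypothetical, and two behaviors are assumptions made for illustration. First, all three counters are treated as required. Second, when position deletes exist, the count is pushed down as total-records minus total-position-deletes only if ignoreDanglingDelete is true.

```java
import java.util.HashMap;
import java.util.Map;

class CountFromSummarySketch {
    static final String TOTAL_RECORDS = "total-records";
    static final String TOTAL_POSITION_DELETES = "total-position-deletes";
    static final String TOTAL_EQUALITY_DELETES = "total-equality-deletes";

    // -1 means "count cannot be taken from the summary, fall back to a normal scan".
    static final long FALLBACK = -1L;

    static long getCountFromSummary(Map<String, String> summary, boolean ignoreDanglingDelete) {
        if (summary == null) {
            return FALLBACK;
        }
        String equalityDeletes = summary.get(TOTAL_EQUALITY_DELETES);
        String positionDeletes = summary.get(TOTAL_POSITION_DELETES);
        String totalRecords = summary.get(TOTAL_RECORDS);
        // Assumption: every counter is required; any missing one disables pushdown.
        if (equalityDeletes == null || positionDeletes == null || totalRecords == null) {
            return FALLBACK;
        }
        // Equality deletes cannot be resolved from counters alone.
        if (!"0".equals(equalityDeletes)) {
            return FALLBACK;
        }
        long deleteCount = Long.parseLong(positionDeletes);
        long records = Long.parseLong(totalRecords);
        if (deleteCount == 0) {
            return records; // may legitimately be 0 for an empty table
        }
        // Assumption: subtract position deletes only when dangling deletes are ignored.
        return ignoreDanglingDelete ? records - deleteCount : FALLBACK;
    }

    // Hypothetical helper: builds a summary map, skipping counters passed as null.
    static Map<String, String> summaryOf(String equalityDeletes, String positionDeletes, String totalRecords) {
        Map<String, String> summary = new HashMap<>();
        if (equalityDeletes != null) summary.put(TOTAL_EQUALITY_DELETES, equalityDeletes);
        if (positionDeletes != null) summary.put(TOTAL_POSITION_DELETES, positionDeletes);
        if (totalRecords != null) summary.put(TOTAL_RECORDS, totalRecords);
        return summary;
    }

    public static void main(String[] args) {
        System.out.println(getCountFromSummary(summaryOf(null, null, null), false)); // missing counters
        System.out.println(getCountFromSummary(summaryOf("0", "0", "42"), false));   // no deletes
        System.out.println(getCountFromSummary(summaryOf("0", "0", "0"), false));    // genuine zero count
    }
}
```

Because 0 is a valid result here and only -1 signals fallback, a consumer of this value has to test for >= 0 rather than > 0, which matches the BE change in the second bullet.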

Unit test IcebergCountPushDownTest covers the missing-counter, no-delete,
equality-delete, position-delete and zero-count cases.

… summary omits total-* counters (apache#64648)

Backport of apache#64648 to branch-4.1.

IcebergScanNode.getCountFromSnapshot() read total-equality-deletes /
total-position-deletes / total-records from the snapshot summary and called
.equals() / Long.parseLong() directly on the Map.get() results. An Iceberg
snapshot summary is not guaranteed to carry these counters (snapshots written
by compaction/replace, or some writers, may omit them), so `SELECT COUNT(*)`
threw a NullPointerException on such tables while `SELECT *` worked fine.

Extract the summary parsing into a pure static IcebergUtils.getCountFromSummary()
that null-checks the counters and falls back to a normal scan (returns -1) when
any is absent, and reuse it from both IcebergScanNode and IcebergUtils row-count
estimation. On the BE side, IcebergTableReader::init_row_filters() now accepts a
table-level row count of 0 (>= 0 instead of > 0) so a genuine pushed-down count
of 0 takes the CountReader fast path.

The BE change is applied to branch-4.1's
be/src/format/table/iceberg_reader.cpp, since master's
iceberg_reader_mixin.h does not exist on this branch.

Co-Authored-By: Raghvendra Singh <raghav@cashify.in>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@morningman morningman requested a review from yiguolei as a code owner July 1, 2026 03:07
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman

Contributor Author

run buildall

@hello-stephen

Contributor

FE UT Coverage Report

Increment line coverage 68.42% (13/19) 🎉
Increment coverage report
Complete coverage report


2 participants