Skip to content

fix: fallback NullType LocalTableScanExec to Spark#4458

Closed
pchintar wants to merge 1 commit into
apache:mainfrom
pchintar:fix/nulltype-localtablescan-fallback
Closed

fix: fallback NullType LocalTableScanExec to Spark#4458
pchintar wants to merge 1 commit into
apache:mainfrom
pchintar:fix/nulltype-localtablescan-fallback

Conversation

@pchintar
Copy link
Copy Markdown

@pchintar pchintar commented May 27, 2026

Which issue does this PR close?

Closes #4457 .

Rationale for this change

Queries containing NullType columns could fail when spark.comet.exec.localTableScan.enabled=true.

While plain NullType projections may execute successfully, aggregate queries can still reach native Arrow conversion paths through CometLocalTableScanExec and fail with an UnsupportedOperationException.

For example:

SELECT max(col) FROM VALUES
  (NULL),
  (NULL)
AS t(col)

crashes during Arrow schema conversion in CometLocalTableScanExec.

What changes are included in this PR?

This PR adds an explicit support-level check for NullType columns in CometLocalTableScanExec.

When a LocalTableScanExec contains NullType, Comet now marks the operator as unsupported and falls back to Spark execution instead of attempting native Arrow conversion.

This PR intentionally uses fallback behavior instead of adding full native NullType support because NullType is currently unsupported across downstream native aggregate and shuffle paths as well. Simply allowing Arrow schema conversion would move the failure deeper into execution rather than resolving the incompatibility consistently.

A regression test has also been added for aggregate queries over NullType local table scans.

How are these changes tested?

Added a regression test in CometExecSuite covering:

SELECT max(col) FROM VALUES (NULL), (NULL) AS t(col)

with:

spark.comet.exec.localTableScan.enabled=true

Before the fix, the query failed with:

java.lang.UnsupportedOperationException:
Unsupported data type: [org.apache.spark.sql.types.NullType$] void

After the fix, the query cleanly falls back to Spark execution and returns the expected result.

Test command used:

mvn -pl spark -DwildcardSuites=org.apache.comet.exec.CometExecSuite test

@mbutrovich
Copy link
Copy Markdown
Contributor

Thanks @pchintar! Instead of a fallback, I have the fix in #4393. I will peel that off as its own PR.

Thanks again!

@mbutrovich
Copy link
Copy Markdown
Contributor

Closing in favor of #4460. Thanks again @pchintar!

@mbutrovich mbutrovich closed this May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Queries with NullType aggregate fails when native LocalTableScanExec is enabled

2 participants