
fix: add EmptySchemaShufflePartitioner and test from #3858 #3893

Open
mbutrovich wants to merge 5 commits into apache:main from mbutrovich:empty_schema_partitioner

Conversation

@mbutrovich
Contributor

@mbutrovich mbutrovich commented Apr 3, 2026

Which issue does this PR close?

Closes #3846.

Rationale for this change

Native shuffle above a native scan that projects no columns (e.g., COUNT(*)) produces RecordBatches with an empty schema but a valid row count. Native shuffle currently panics trying to interleave those batches, but we can fast-path this scenario with a special partitioner. It is similar to SinglePartitionShufflePartitioner, but instead of concatenating batches into a shuffle file for a single partition, it accumulates the total row count and writes a single IPC batch carrying that count, while making sure the index file still has the expected number of partitions.
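The fast path described above can be sketched as follows. This is a minimal stand-in, not the actual Comet implementation: the type name and the plain row-count argument are illustrative, since the real partitioner receives Arrow RecordBatches.

```rust
// Hypothetical sketch: for empty-schema batches the only state worth
// keeping is the running row count, since the batches carry no columns.
struct EmptySchemaShuffleSketch {
    num_rows: usize,
}

impl EmptySchemaShuffleSketch {
    fn new() -> Self {
        Self { num_rows: 0 }
    }

    // Each incoming empty-schema batch contributes only its row count.
    fn insert_batch(&mut self, batch_num_rows: usize) {
        self.num_rows += batch_num_rows;
    }

    // On shutdown, a single zero-column IPC batch with this many rows
    // would be written to partition 0 of the shuffle data file.
    fn total_rows(&self) -> usize {
        self.num_rows
    }
}
```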

What changes are included in this PR?

  • native/shuffle/src/partitioners/empty_schema.rs: new EmptySchemaShufflePartitioner that accumulates row count, writes a single zero-column IPC batch to partition 0, and fills the index with equal offsets for all other partitions
  • native/shuffle/src/partitioners/mod.rs: exports the new partitioner
  • native/shuffle/src/shuffle_writer.rs: branches on schema.fields().is_empty() before falling through to MultiPartitionShuffleRepartitioner; added Rust test verifying row count roundtrip and index structure
  • spark/.../CometNativeShuffleSuite.scala: integration test from PR #3858 (chore: fix native shuffle for batches with no columns and 0 row count) for repartition(10).count() with a native DataFusion scan
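The dispatch in shuffle_writer.rs can be sketched as below. This is a hedged illustration, not the actual code: the real branch inspects the Arrow schema via `schema.fields().is_empty()`, which is modeled here as a plain field count, and the enum names are placeholders.

```rust
// Illustrative stand-ins for the three partitioner strategies.
enum PartitionerKind {
    EmptySchema,
    SinglePartition,
    MultiPartition,
}

// Sketch of the selection logic: an empty schema takes the new fast path
// before the writer falls through to the general repartitioner.
fn choose_partitioner(num_schema_fields: usize, num_output_partitions: usize) -> PartitionerKind {
    if num_schema_fields == 0 {
        // COUNT(*)-style plans: batches have rows but no columns.
        PartitionerKind::EmptySchema
    } else if num_output_partitions == 1 {
        PartitionerKind::SinglePartition
    } else {
        PartitionerKind::MultiPartition
    }
}
```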

How are these changes tested?

New test from #3858 that reflects repro in #3846.

@mbutrovich mbutrovich changed the title add EmptySchemaShufflePartitioner and test from #3858 fix: add EmptySchemaShufflePartitioner and test from #3858 Apr 3, 2026
@mbutrovich mbutrovich marked this pull request as ready for review April 3, 2026 15:13
@mbutrovich
Contributor Author

Based on grepping logs when I still had it at INFO level, these Spark SQL tests cover this codepath in addition to the unit test we added to CometNativeShuffleSuite:

  1. postgreSQL/union.sql
  2. subquery/exists-subquery/exists-orderby-limit.sql

/// This handles shuffles for operations like COUNT(*) that produce empty-schema record batches
/// but contain a valid row count. Accumulates the total row count and writes a single
/// zero-column IPC batch to partition 0. All other partitions get empty entries in the index file.
pub(crate) struct EmptySchemaShufflePartitioner {
Contributor


Would it be useful to attach a data flow graph or something similar, so readers can figure out how data transforms across shuffle phases?

Contributor Author


I'm not sure what you have in mind for this one, because this partitioner targets a very narrow class of queries. I think there are other resources to read about general Spark shuffle behavior.

#[async_trait::async_trait]
impl ShufflePartitioner for EmptySchemaShufflePartitioner {
async fn insert_batch(&mut self, batch: RecordBatch) -> datafusion::common::Result<()> {
let start_time = Instant::now();
Contributor


I'm starting to wonder if we need to wrap timings in macros and make them optional 🤔

Contributor Author


Timers have cost, but in the grand scheme of Spark jobs that last hours or days, they're not the highest priority to optimize.
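The idea floated above could look something like this sketch. The `shuffle-timing` feature name is hypothetical, as is the helper function; the point is only that `cfg!` lets the compiler drop the timing branch entirely when the feature is off.

```rust
use std::time::Instant;

// Hypothetical macro: times the body and adds the elapsed nanoseconds to a
// metric, but only when the (made-up) `shuffle-timing` feature is enabled.
macro_rules! timed {
    ($metric:expr, $body:expr) => {{
        if cfg!(feature = "shuffle-timing") {
            let start = Instant::now();
            let result = $body;
            *$metric += start.elapsed().as_nanos() as u64;
            result
        } else {
            $body
        }
    }};
}

// Illustrative caller: the work runs either way; only the timing is optional.
fn sum_with_metric(values: &[u64], metric: &mut u64) -> u64 {
    timed!(metric, values.iter().sum::<u64>())
}
```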

.map_err(|e| DataFusionError::Execution(format!("shuffle write error: {e:?}")))?;
let mut index_writer = BufWriter::new(index_file);
index_writer.write_all(&0i64.to_le_bytes())?;
for _ in 0..self.num_output_partitions {
Contributor


self.num_output_partitions? Am I right that it should be just 1 partition?

Contributor Author

@mbutrovich mbutrovich Apr 3, 2026


The shuffle writer must write index entries for all target partitions, even if we're accumulating everything into a single batch in the first partition.
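The index layout this reply describes can be sketched as below. This is a reconstruction under the stated assumption that, with all data in partition 0, every later partition boundary is an empty range ending at the data length; the function name is illustrative.

```rust
// Hedged sketch: the index file holds num_output_partitions + 1 offsets,
// each a little-endian i64. Partition 0 spans [0, data_len); every other
// partition gets an empty entry whose boundary also equals data_len.
fn encode_index(num_output_partitions: usize, data_len: i64) -> Vec<u8> {
    let mut bytes = Vec::with_capacity((num_output_partitions + 1) * 8);
    bytes.extend_from_slice(&0i64.to_le_bytes()); // partition 0 starts at 0
    for _ in 0..num_output_partitions {
        bytes.extend_from_slice(&data_len.to_le_bytes());
    }
    bytes
}
```

For repartition(10), this yields 11 offsets (88 bytes), so readers of every partition see a valid, possibly empty, byte range.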



Development

Successfully merging this pull request may close these issues.

native_shuffle crashes for repartition + count
