
Fix repartition from dropping data when spilling #20672

Open
xanderbailey wants to merge 1 commit into apache:main from xanderbailey:xb/fix_repartition

Conversation


@xanderbailey xanderbailey commented Mar 3, 2026

Which issue does this PR close?

Rationale for this change

In non-preserve-order repartitioning mode, all input partition tasks share clones of the same SpillPoolWriter for each output partition. SpillPoolWriter used #[derive(Clone)] but its Drop implementation unconditionally set writer_dropped = true and finalized the current spill file. This meant that when the first input task finished and its clone was dropped, the SpillPoolReader would see writer_dropped = true on an empty queue and return EOF — silently discarding every batch subsequently written by the still-running input tasks.

This bug requires three conditions to trigger:

  1. Non-preserve-order repartitioning (so spill writers are cloned across input tasks)
  2. Memory pressure causing batches to spill to disk
  3. Input tasks finishing at different times (the common case with varying partition sizes)

What changes are included in this PR?

datafusion/physical-plan/src/spill/spill_pool.rs:

  • Added active_writer_count: usize to SpillPoolShared to track the number of live writer clones.
  • Replaced #[derive(Clone)] on SpillPoolWriter with a manual Clone impl that increments active_writer_count under the shared lock.
  • Updated Drop to decrement active_writer_count and only finalize the current file / set writer_dropped = true when the count reaches zero (i.e. the last clone is dropped). Non-last clones now return immediately from Drop.
  • Added regression test test_clone_drop_does_not_signal_eof_prematurely that reproduces the exact failure: writer1 writes and drops, the reader drains the queue, then writer2 (still alive) writes. Without the fix the reader returns premature EOF and the assertion fails; with the fix the reader waits and reads both batches.

Are these changes tested?

Yes. A new unit test (test_clone_drop_does_not_signal_eof_prematurely) directly reproduces the bug. It was verified to fail without the fix and pass with the fix.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the physical-plan (Changes to the physical-plan crate) label Mar 3, 2026
.await
.expect("Reader timed out — should not hang");

assert!(
xanderbailey (Contributor, Author) commented on this line:

Without this fix we fail here.

hareshkh (Contributor) left a comment:

LGTM 🚀


Labels

physical-plan: Changes to the physical-plan crate

Development

Successfully merging this pull request may close these issues.

Repartition drops data when spilling

2 participants