feat: implement spill callback for cross-task memory eviction [experimental] #3869
Draft
andygrove wants to merge 12 commits into apache:main from
Add benchmarking script to measure peak RSS per TPC-H query under different Spark/Comet configurations and off-heap memory sizes. Includes analysis document investigating why Comet requires more off-heap memory than expected.
local[8] benchmark shows Comet memory usage now on par with Spark (8525 vs 8476 MB) and elastic growth eliminated.
Which issue does this PR close?
Related to reducing Comet's off-heap memory requirements for large-scale workloads.
Rationale for this change
When running TPC-H at SF100+ scale, Comet requires significantly more off-heap memory than Spark alone. Benchmarking on TPC-H SF100 with local[4] showed:
Q9 showed elastic memory growth (450 MB increase from 4g to 8g offHeap), traced to shuffle writer greedy buffering. The root cause is that Comet's NativeMemoryConsumer.spill() returned 0, preventing Spark from reclaiming memory across tasks.
What changes are included in this PR?
Spill callback implementation
- SpillState (native/core/src/execution/memory_pools/spill.rs): Shared state with atomics and condvar for coordinating spill requests between Spark's memory manager thread and DataFusion's operator threads.
- CometUnifiedMemoryPool: Checks SpillState.pressure() in try_grow() and returns ResourcesExhausted when spill pressure is active, triggering DataFusion's Sort/Aggregate/Shuffle operators to spill internally. Tracks freed bytes in shrink().
- CometTaskMemoryManager.spill(): Now forwards spill requests to native via JNI requestSpill() instead of returning 0. Waits up to 10 seconds for operators to react.
- CometExecIterator: Wires the native plan handle to CometTaskMemoryManager after creation, and clears it before releasePlan to prevent callbacks after destruction.
Memory profiling tools
- benchmarks/tpc/memory-profile.sh: Script that runs TPC-H queries individually under different configurations (Spark-only, Comet with varying offHeap sizes) in local mode, capturing peak RSS via /usr/bin/time -l.
- docs/source/contributor-guide/memory-management.md: Analysis of Comet's memory management architecture, benchmark results, comparison with Gluten's approach, and documentation of the spill callback design.
How are these changes tested?
Benchmarked with TPC-H SF100 comparing peak RSS before and after the change. The spill callback is exercised whenever Spark's memory manager calls spill() on Comet's NativeMemoryConsumer, which happens when multiple concurrent tasks compete for the shared off-heap pool. Existing test suites cover the execution paths (Sort, Aggregate, Shuffle) that react to ResourcesExhausted by spilling.
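For reviewers who want a feel for the coordination pattern, here is a minimal std-only sketch of the spill callback flow described above. It is not the code in this PR: SketchPool, PoolError, record_freed and request_spill are hypothetical names, the pool is a simplified stand-in rather than DataFusion's MemoryPool trait, and a single requester at a time is assumed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the shared spill state (single requester assumed).
#[derive(Default)]
pub struct SpillState {
    requested: AtomicUsize, // bytes the requester wants freed; 0 = no pressure
    freed: AtomicUsize,     // bytes freed since the current request began
    lock: Mutex<()>,
    cond: Condvar,
}

impl SpillState {
    pub fn pressure(&self) -> usize {
        self.requested.load(Ordering::Acquire)
    }

    /// Called when memory is released: count it and wake the requester.
    pub fn record_freed(&self, bytes: usize) {
        if self.pressure() == 0 {
            return;
        }
        self.freed.fetch_add(bytes, Ordering::AcqRel);
        let _guard = self.lock.lock().unwrap();
        self.cond.notify_all();
    }

    /// Raise pressure, wait until `bytes` are freed or `timeout` elapses,
    /// then clear pressure and report how much was actually freed.
    pub fn request_spill(&self, bytes: usize, timeout: Duration) -> usize {
        let mut guard = self.lock.lock().unwrap();
        self.freed.store(0, Ordering::Release);
        self.requested.store(bytes, Ordering::Release);
        let deadline = Instant::now() + timeout;
        while self.freed.load(Ordering::Acquire) < bytes {
            let now = Instant::now();
            if now >= deadline {
                break;
            }
            guard = self.cond.wait_timeout(guard, deadline - now).unwrap().0;
        }
        self.requested.store(0, Ordering::Release);
        self.freed.load(Ordering::Acquire)
    }
}

#[derive(Debug, PartialEq)]
pub enum PoolError {
    ResourcesExhausted(&'static str),
}

/// Simplified pool: refuses growth while spill pressure is active.
pub struct SketchPool {
    used: AtomicUsize,
    limit: usize,
    spill: Arc<SpillState>,
}

impl SketchPool {
    pub fn new(limit: usize, spill: Arc<SpillState>) -> Self {
        Self { used: AtomicUsize::new(0), limit, spill }
    }

    pub fn try_grow(&self, additional: usize) -> Result<(), PoolError> {
        if self.spill.pressure() > 0 {
            return Err(PoolError::ResourcesExhausted("spill requested"));
        }
        let prev = self.used.fetch_add(additional, Ordering::AcqRel);
        if prev + additional > self.limit {
            self.used.fetch_sub(additional, Ordering::AcqRel); // roll back
            return Err(PoolError::ResourcesExhausted("pool limit reached"));
        }
        Ok(())
    }

    pub fn shrink(&self, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::AcqRel);
        self.spill.record_freed(bytes);
    }

    pub fn used(&self) -> usize {
        self.used.load(Ordering::Acquire)
    }
}

fn main() {
    let spill = Arc::new(SpillState::default());
    let pool = Arc::new(SketchPool::new(1024, Arc::clone(&spill)));
    pool.try_grow(512).unwrap();

    // Stand-in for an operator thread: once try_grow fails it "spills".
    let worker_pool = Arc::clone(&pool);
    let worker = thread::spawn(move || {
        while worker_pool.try_grow(1).is_ok() {
            worker_pool.shrink(1);
            thread::yield_now();
        }
        worker_pool.shrink(512);
    });

    let freed = spill.request_spill(256, Duration::from_secs(10));
    worker.join().unwrap();
    println!("freed {freed} bytes, pool now holds {} bytes", pool.used());
}
```

The requester holds the mutex while it checks the freed counter, and the operator side takes the same mutex before notifying, so a wakeup cannot slip in between the check and the wait. The timeout keeps the requesting thread from blocking indefinitely if no operator reacts, which mirrors the bounded wait described for CometTaskMemoryManager.spill().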