JIT: Clear out return value references from continuations#129157
JIT: Clear out return value references from continuations#129157jakobbotsch wants to merge 3 commits into
Conversation
Async1 deterministically clears out awaiters on resumption, meaning that the callee's `Task` and its result do not stay alive. This change similarly makes it so that we do not keep results alive in async2. Async1 has similar clearing for locals based on lexical scope. I am hoping we can get away with not implementing something similar for runtime async (it would be expensive and impossible to guarantee similar behavior as async1).
There was a problem hiding this comment.
Pull request overview
This PR updates the CoreCLR JIT’s runtime-async transformation to proactively clear GC references from the continuation’s stored return value after resumption copies the result out, reducing unintended object rooting when continuations are reused.
Changes:
- Adds
AsyncTransformation::ClearReturnValueOnResumptionand invokes it after copying the awaited call’s return value when continuation reuse is enabled. - Implements two clearing strategies for struct returns with GC pointers: per-GC-slot clearing for small/ref-sparse structs, otherwise a bulk zeroing store.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/coreclr/jit/async.h |
Declares the new helper used to clear return-value GC references on resumption. |
src/coreclr/jit/async.cpp |
Calls the helper after copying results and implements logic to clear GC refs in ref/struct return slots when continuations are reused. |
| // If there are few GC references, and at most half of the struct is | ||
| // made up of GC references, then clear the individual GC pointers | ||
| // instead of zeroing out the whole struct. | ||
| if ((gcPtrCount <= 4) && ((gcPtrCount * 2) <= retLayout->GetSlotCount())) |
There was a problem hiding this comment.
@EgorBo Any suggestion on the heuristic to use here? Wondering if something in the backend does similar reasoning.
Could also always just be a struct clear.
There was a problem hiding this comment.
@jakobbotsch I assume we don't care about non-gc slots? it's actually a very interesting question that might improve perf everywhere. Example: https://godbolt.org/z/xMTcqn8EW today in SkipLocalsInit context (which is default everywhere in BCL) we don't try to optimize it like you do here 🤔 so I assume a fix for that might improve perf in many places.
I wonder if we should introduce a "GT_UNINIT" (or GT_POISON?) node as RHL of GT_STORE_BLK and then lower will come up with the best strategy
There was a problem hiding this comment.
Right, we only care about clearing GC slots to ensure they don't stay rooted.
Example: https://godbolt.org/z/xMTcqn8EW today in SkipLocalsInit context (which is default everywhere in BCL) we don't try to optimize it like you do here 🤔
We do have some logic it seems:
runtime/src/coreclr/jit/codegencommon.cpp
Lines 4097 to 4112 in 4b6b5d0
Async1 deterministically clears out awaiters on resumption, meaning that the callee's
Taskand its result do not stay alive. This change similarly makes it so that we do not keep results alive in async2.Async1 has similar clearing for locals based on lexical scope. I am hoping we can get away with not implementing something similar for runtime async (it would be expensive and impossible to guarantee similar behavior as async1).
Fix #126735