JIT: Clear out return value references from continuations #129157

Open
jakobbotsch wants to merge 3 commits into dotnet:main from jakobbotsch:runtime-async-clear-return-values

@jakobbotsch

Member

Async1 deterministically clears out awaiters on resumption, meaning that the callee's Task and its result do not stay alive. This change similarly makes it so that we do not keep results alive in async2.
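The rooting concern can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions: `Continuation`, `Resume`, and `ResultStaysRooted` are hypothetical names, and `std::shared_ptr` stands in for a GC reference. It does not reflect the runtime's actual continuation layout.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a reusable continuation object. It only models
// the fact that a stored reference keeps its target alive.
struct Continuation
{
    std::shared_ptr<std::string> returnValue; // slot filled in by the callee
};

// Copies the result out of the continuation. When clearOnResumption is set,
// the slot is cleared afterwards so the continuation stops rooting the result.
std::shared_ptr<std::string> Resume(Continuation& cont, bool clearOnResumption)
{
    assert(cont.returnValue != nullptr);
    std::shared_ptr<std::string> result = cont.returnValue;
    if (clearOnResumption)
    {
        cont.returnValue.reset();
    }
    return result;
}

// Returns true if the result is still alive after the caller has dropped its
// own copy, i.e. the continuation is the only thing keeping it alive.
bool ResultStaysRooted(bool clearOnResumption)
{
    Continuation cont;
    cont.returnValue = std::make_shared<std::string>("result");
    std::weak_ptr<std::string> observer = cont.returnValue;

    {
        std::shared_ptr<std::string> result = Resume(cont, clearOnResumption);
        // The caller uses the result here and then lets it go out of scope.
    }

    return !observer.expired();
}
```

In this model, `ResultStaysRooted(false)` reports that the result outlives the caller's use of it, while `ResultStaysRooted(true)` reports that it is released as soon as the caller drops it.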

Async1 has similar clearing for locals based on lexical scope. I am hoping we can get away with not implementing something similar for runtime async: it would be expensive, and it would be impossible to guarantee the same behavior as async1.

Fix #126735

Copilot AI review requested due to automatic review settings June 9, 2026 08:42
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 9, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the CoreCLR JIT’s runtime-async transformation to proactively clear GC references from the continuation’s stored return value after resumption copies the result out, reducing unintended object rooting when continuations are reused.

Changes:

  • Adds AsyncTransformation::ClearReturnValueOnResumption and invokes it after copying the awaited call’s return value when continuation reuse is enabled.
  • Implements two clearing strategies for struct returns with GC pointers: per-GC-slot clearing for small/ref-sparse structs, otherwise a bulk zeroing store.
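The two clearing strategies can be sketched as follows. This is a simplified model rather than the PR's implementation: `SlotLayout`, `ShouldClearPerSlot`, and `ClearReturnValue` are hypothetical names, the return value is modelled as a vector of pointer-sized slots, and only the condition itself is taken from the heuristic quoted in the review thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified layout: one flag per pointer-sized slot saying
// whether that slot holds a GC reference. This is not the JIT's ClassLayout.
struct SlotLayout
{
    std::vector<bool> isGcPtr;

    std::size_t SlotCount() const { return isGcPtr.size(); }

    std::size_t GcPtrCount() const
    {
        return static_cast<std::size_t>(std::count(isGcPtr.begin(), isGcPtr.end(), true));
    }
};

// Same condition as the heuristic quoted in the review thread: few GC
// references, and at most half of the slots are GC references.
bool ShouldClearPerSlot(const SlotLayout& layout)
{
    const std::size_t gcPtrCount = layout.GcPtrCount();
    return (gcPtrCount <= 4) && ((gcPtrCount * 2) <= layout.SlotCount());
}

// Clears the stored return value after it has been copied out. Either zeroes
// only the GC slots, or zeroes the whole value in one bulk operation.
void ClearReturnValue(const SlotLayout& layout, std::vector<std::uintptr_t>& slots)
{
    assert(slots.size() == layout.SlotCount());

    if (ShouldClearPerSlot(layout))
    {
        for (std::size_t i = 0; i < slots.size(); i++)
        {
            if (layout.isGcPtr[i])
            {
                slots[i] = 0;
            }
        }
    }
    else
    {
        std::fill(slots.begin(), slots.end(), std::uintptr_t{0});
    }
}
```

In this model the bulk path also wipes non-GC slots. That is harmless here because the value has already been copied out before it is cleared.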

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/coreclr/jit/async.h | Declares the new helper used to clear return-value GC references on resumption. |
| src/coreclr/jit/async.cpp | Calls the helper after copying results and implements logic to clear GC refs in ref/struct return slots when continuations are reused. |

Comment thread src/coreclr/jit/async.cpp
Comment on lines +3192 to +3195
// If there are few GC references, and at most half of the struct is
// made up of GC references, then clear the individual GC pointers
// instead of zeroing out the whole struct.
if ((gcPtrCount <= 4) && ((gcPtrCount * 2) <= retLayout->GetSlotCount()))

Member Author

@EgorBo Any suggestion on the heuristic to use here? Wondering if something in the backend does similar reasoning.
Could also always just be a struct clear.

Member

@jakobbotsch I assume we don't care about non-GC slots? It's actually a very interesting question that might improve perf everywhere. Example: https://godbolt.org/z/xMTcqn8EW. Today, in a SkipLocalsInit context (which is the default everywhere in the BCL), we don't try to optimize it like you do here 🤔, so I assume a fix for that might improve perf in many places.

I wonder if we should introduce a "GT_UNINIT" (or GT_POISON?) node as the RHS of GT_STORE_BLK, and then lowering would come up with the best strategy.

Member Author

Right, we only care about clearing GC slots to ensure they don't stay rooted.

> Example: https://godbolt.org/z/xMTcqn8EW today in SkipLocalsInit context (which is default everywhere in BCL) we don't try to optimize it like you do here 🤔

We do have some logic it seems:

if (varDsc->TypeIs(TYP_STRUCT) && !m_compiler->info.compInitMem &&
    (varDsc->lvExactSize() >= TARGET_POINTER_SIZE))
{
    // We only initialize the GC variables in the TYP_STRUCT
    const unsigned slots = (unsigned)m_compiler->lvaLclStackHomeSize(varNum) / REGSIZE_BYTES;
    ClassLayout* layout = varDsc->GetLayout();
    for (unsigned i = 0; i < slots; i++)
    {
        if (layout->IsGCPtr(i))
        {
            GetEmitter()->emitIns_S_R(ins_Store(TYP_I_IMPL), EA_PTRSIZE,
                                      genGetZeroReg(initReg, pInitRegZeroed), varNum, i * REGSIZE_BYTES);
        }
    }
}

Development

Successfully merging this pull request may close these issues.

Possible object rooting issue with runtime-async

3 participants