[cDAC] Stack walk GC reference scanning and bug fixes (1/5) #127395
max-charlamb wants to merge 1 commit into dotnet:main
Implement GC reference scanning for stub/transition frames and fix stack walker state machine bugs:
- PromoteCallerStack/PromoteCallerStackUsingGCRefMap for transition frames
- GCRefMap decoder for ReadyToRun import section resolution
- FindGCRefMap with FindReadyToRunModule fallback
- SOSDacImpl.GetStackReferences using cDAC contract
- Fix IsFirst preserved for skipped frames
- Fix skipped frame handling moved to UpdateState
- GCInfoDecoder goto removal (ReportUntrackedAndSucceed local function)
- RequiresInstArg, IsAsyncMethod, HasRetBuffArg on IRuntimeTypeSystem
- ExceptionInfo ClauseForCatch fields for catch handler detection
- Data descriptor additions for frame types and TransitionBlock layout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
Pull request overview
Implements the first slice of cDAC stack-walk GC reference scanning, extending data descriptors/contracts and wiring SOSDacImpl.GetStackReferences to use the cDAC StackWalk contract (including transition-frame root scanning via GCRefMap and signature-based fallback).
Changes:
- Add/extend cDAC data descriptors and contract data types for transition frames, ReadyToRun import sections, and exception clause ranges needed for stack GC root enumeration.
- Refactor StackWalk filtering/state transitions and implement managed/live-slot + frame-based GC root scanning paths.
- Extend contract APIs (ExecutionManager/GCInfo/RuntimeTypeSystem) and update solution/test project structure to include StressTests.
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/native/managed/cdac/tests/MockDescriptors/MockDescriptors.ExecutionManager.cs | Updates mock ReadyToRunInfo descriptor layout with import section fields. |
| src/native/managed/cdac/tests/Microsoft.Diagnostics.DataContractReader.Tests.csproj | Treats StressTests as a separate test project and excludes it from unit-test compilation. |
| src/native/managed/cdac/cdac.slnx | Adds StressTests project to the cdac solution filter. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/SOSDacImpl.cs | Implements GetStackReferences using cDAC StackWalk contract, with DEBUG cross-check. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/ReadyToRunInfo.cs | Reads/imports R2R import section pointer + count into the contract data. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Frames/TransitionBlock.cs | Exposes descriptor-provided layout offsets used for GCRefMap/arg-reg scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Frames/StubDispatchFrame.cs | Adds GCRefMap + lazy-resolution fields for stub dispatch frame root scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Frames/ExternalMethodFrame.cs | New contract data type for ExternalMethodFrame GCRefMap-based scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Frames/DynamicHelperFrame.cs | New contract data type for DynamicHelperFrame flag-based scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/ExceptionInfo.cs | Adds catch-handler clause range fields for interruptible-offset override logic. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/StackWalk_1.cs | Refactors stack-walk filtering/Next integration and adds GC root enumeration behaviors. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/GC/GcSignatureTypeProvider.cs | Adds signature-type classifier used by signature decoding for GC scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/GC/GCRefMapDecoder.cs | Adds managed decoder for the compact GCRefMap bitstream. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/FrameHandling/FrameIterator.cs | Adds return-address retrieval and frame-type-specific GC root scanning (GCRefMap/MetaSig paths). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/RuntimeTypeSystem_1.cs | Implements RequiresInstArg and IsAsyncMethod for TransitionFrame argument layout decisions. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/IGCInfoDecoder.cs | Extends decoder surface to expose interruptible ranges. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/GCInfo_1.cs | Plumbs GetInterruptibleRanges through the GCInfo contract. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/GCInfoDecoder.cs | Removes goto, fixes untracked-slot reporting behavior, and keeps decoding accessible. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_2.cs | Exposes FindReadyToRunModule in ExecutionManager v2 contract. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_1.cs | Exposes FindReadyToRunModule in ExecutionManager v1 contract. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManagerCore.cs | Implements R2R module resolution via RangeSection lookup. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/DataType.cs | Adds DataType entries for new frame data types. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IRuntimeTypeSystem.cs | Adds RequiresInstArg and IsAsyncMethod to abstractions interface. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IGCInfo.cs | Adds InterruptibleRange and GetInterruptibleRanges to the public abstraction contract. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IExecutionManager.cs | Adds FindReadyToRunModule to the abstraction contract. |
| src/coreclr/vm/readytoruninfo.h | Adds cdac_data offsets for import sections in ReadyToRunInfo. |
| src/coreclr/vm/frames.h | Adds cdac_data for ExternalMethodFrame/DynamicHelperFrame and fields for StubDispatchFrame GCRefMap resolution. |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Adds/extends descriptors for new frame fields, transition block offsets, and catch clause ranges. |
| docs/design/datacontracts/StackWalk.md | Documents new descriptors used by stack walking + GC reference scanning. |
| docs/design/datacontracts/RuntimeTypeSystem.md | Documents new RuntimeTypeSystem contract methods. |
| docs/design/datacontracts/GCInfo.md | Documents new GCInfo contract API for interruptible ranges. |

    // These are offsets relative to the TransitionBlock pointer, stored as field "offsets"
    // in the data descriptor. They represent computed layout positions, not actual memory reads.
    FirstGCRefMapSlot = (uint)type.Fields[nameof(FirstGCRefMapSlot)].Offset;
    ArgumentRegistersOffset = (uint)type.Fields[nameof(ArgumentRegistersOffset)].Offset;

TransitionBlock data descriptor now defines OffsetOfArgs, but the managed TransitionBlock contract type doesn’t expose or initialize a corresponding property. This leaves the contract/data-descriptor/documentation out of sync and makes it harder for callers to consume the new layout information consistently.
Consider adding an OffsetOfArgs property (similar to FirstGCRefMapSlot/ArgumentRegistersOffset) or removing the unused descriptor/doc entry until it’s actually consumed.
This will be used in a later PR
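
For readers unfamiliar with how descriptor offsets such as FirstGCRefMapSlot are consumed, here is a minimal sketch. It assumes the non-x86 convention in which GCRefMap position `pos` corresponds to the pointer-sized slot at `FirstGCRefMapSlot + pos * pointerSize` from the TransitionBlock address. `TransitionBlockLayout` and `SlotAddress` are hypothetical names for illustration and are not taken from this PR.

```csharp
using System;

// Hypothetical illustration; these names do not appear in the PR.
public static class TransitionBlockLayout
{
    // Assumption: non-x86 layout, where slots are contiguous and each
    // GCRefMap position maps to exactly one pointer-sized slot.
    public static ulong SlotAddress(
        ulong transitionBlock,   // target address of the TransitionBlock
        uint firstGCRefMapSlot,  // offset read from the data descriptor
        int pos,                 // position reported by the GCRefMap decoder
        int pointerSize)         // 4 or 8, depending on the target
    {
        if (pos < 0)
            throw new ArgumentOutOfRangeException(nameof(pos));
        if (pointerSize != 4 && pointerSize != 8)
            throw new ArgumentOutOfRangeException(nameof(pointerSize));

        // checked: fail loudly on overflow instead of wrapping to a bogus address.
        return checked(transitionBlock + firstGCRefMapSlot + (ulong)pos * (ulong)pointerSize);
    }
}
```

The point of the sketch is that the descriptor "offsets" are plain arithmetic inputs: nothing is read from the target until the computed slot address is dereferenced.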
    public GcTypeKind GetGenericMethodParameter(object? genericContext, int index) => GcTypeKind.Ref;
    public GcTypeKind GetGenericTypeParameter(object? genericContext, int index) => GcTypeKind.Ref;

GetGenericTypeParameter/GetGenericMethodParameter currently classify all generic parameters as GcTypeKind.Ref. In the native implementation (MetaSig + SigTypeContext), generic parameters are resolved against the actual instantiation and can be non-GC (e.g., int, float) or value types with embedded refs. Treating them unconditionally as object refs risks reporting bogus GC roots and producing incorrect/unstable stack reference enumeration.
Consider resolving generic parameters using the method/type instantiation from IRuntimeTypeSystem (or returning GcTypeKind.Other and deferring reporting) until proper instantiation-based classification is implemented.
Suggested change:

    - public GcTypeKind GetGenericMethodParameter(object? genericContext, int index) => GcTypeKind.Ref;
    - public GcTypeKind GetGenericTypeParameter(object? genericContext, int index) => GcTypeKind.Ref;
    + public GcTypeKind GetGenericMethodParameter(object? genericContext, int index) => GcTypeKind.Other;
    + public GcTypeKind GetGenericTypeParameter(object? genericContext, int index) => GcTypeKind.Other;

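
The instantiation-based alternative mentioned in this comment could look roughly like the sketch below. It is not the PR's implementation: the `GcTypeKind` member set shown here is assumed (only `Ref` and `Other` appear in the snippet above), and `GenericParamClassifier` with its pre-classified `instantiation` list is a hypothetical stand-in for whatever IRuntimeTypeSystem would provide.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Assumed member set; only Ref and Other are visible in the reviewed code.
public enum GcTypeKind { None, Ref, Other }

// Hypothetical helper, not part of the PR.
public static class GenericParamClassifier
{
    // 'instantiation' holds the already-classified type arguments of the
    // method or type, or null when the instantiation is not available.
    public static GcTypeKind Classify(IReadOnlyList<GcTypeKind>? instantiation, int index)
    {
        if (instantiation is null || index < 0 || index >= instantiation.Count)
        {
            // Unknown argument: fall back to Other rather than claiming an
            // object reference, so no bogus GC root is reported.
            return GcTypeKind.Other;
        }

        // Known argument: use its real classification (e.g. int -> None, string -> Ref).
        return instantiation[index];
    }
}
```

With this shape, `List<int>` would classify its `T` as non-GC while `List<string>` would classify it as a reference, and a missing instantiation degrades to the conservative `Other`.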
    catch (System.Exception)
    {
        hr = HResults.E_FAIL;

The catch block sets hr = E_FAIL for all exceptions, which drops the more specific HRESULTs typically returned elsewhere in this file (most methods use hr = ex.HResult). This can change observable behavior for callers (e.g., E_INVALIDARG, E_POINTER, E_NOTIMPL) and makes DEBUG validation against the legacy DAC less meaningful.
Consider setting hr = ex.HResult (and optionally mapping NotImplementedException/missing-contract cases to E_NOTIMPL) to keep HRESULT semantics consistent with the rest of SOSDacImpl.
Suggested change:

    - catch (System.Exception)
    - {
    -     hr = HResults.E_FAIL;
    + catch (System.Exception ex)
    + {
    +     hr = ex.HResult;

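
To illustrate what the suggestion preserves, the following standalone sketch wraps an action in the same try/catch shape and returns `ex.HResult`. `HResultDemo` and `Invoke` are hypothetical names; the real SOSDacImpl methods have different signatures and additional DEBUG validation.

```csharp
using System;

// Hypothetical demo type; not part of SOSDacImpl.
public static class HResultDemo
{
    // Runs 'body' and converts any exception into its HRESULT,
    // mirroring the 'hr = ex.HResult' pattern from the suggestion.
    public static int Invoke(Action body)
    {
        int hr = 0; // S_OK
        try
        {
            body();
        }
        catch (Exception ex)
        {
            // Keeps the specific failure code instead of collapsing to E_FAIL.
            hr = ex.HResult;
        }
        return hr;
    }
}
```

For example, `HResultDemo.Invoke(() => throw new ArgumentException())` yields 0x80070057 (E_INVALIDARG) and `HResultDemo.Invoke(() => throw new NotImplementedException())` yields 0x80004001 (E_NOTIMPL), whereas a blanket `E_FAIL` would make the two indistinguishable to the caller.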

    /// <summary>
    /// Finds the R2R module that contains the given address.
    /// Used by FindGCRefMap to resolve m_pZapModule when it's null.

Can we do this unconditionally and drop ZapModule from the contract? ZapModule seems to be a nice-to-have cache.
(Also, ZapModule can be renamed to Module or ReadyToRunModule. Zap is a very old codename for crossgen/ngen that we have almost eradicated from the codebase.)
Summary
Part 1 of 5 stacked PRs splitting #126408 into reviewable pieces.
What this PR contains
- PromoteCallerStack/PromoteCallerStackUsingGCRefMap for transition frames
- GCRefMapDecoder + FindGCRefMap with ReadyToRun import section resolution
- SOSDacImpl.GetStackReferences using cDAC contract
- GCInfoDecoder.EnumerateLiveSlots goto removal
- GcSignatureTypeProvider for GC type classification
- RequiresInstArg, IsAsyncMethod, HasRetBuffArg on IRuntimeTypeSystem
Stack overview
Testing
Note
This PR description was created with AI assistance from Copilot.