[cDAC] Fix GetMethodTableName empty type name mismatch#127405
[cDAC] Fix GetMethodTableName empty type name mismatch#127405max-charlamb wants to merge 1 commit intomainfrom
Conversation
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
Aligns cDAC GetMethodTableName behavior with legacy DAC for the “empty type name” case to prevent DEBUG validation asserts/crashes in SOS diagnostics runs, and improves diagnostic artifact availability in CI.
Changes:
- Return
E_OUTOFMEMORYwhenTypeNameBuilderyields an empty method table name (matching legacy DAC behavior), with additional stderr logging. - Add a DEBUG-only helper (
ValidateOutputStringBuffer) intended to soften output-buffer validation into warnings. - Update
runtime-diagnostics.ymlto publish SOS logs/test results artifacts on successful runs as well.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/SOSDacImpl.cs | Adds empty-name detection in GetMethodTableName and returns E_OUTOFMEMORY to match legacy DAC behavior. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/DebugExtensions.cs | Introduces a DEBUG helper to compare output buffers/needed sizes with warning-based reporting. |
| eng/pipelines/runtime-diagnostics.yml | Publishes test results/SOS logs artifacts for passing runs (not only failures). |
| string cdacStr = cdacBuffer is not null && cdacLen > 0 ? new string(cdacBuffer, 0, cdacLen - 1) : "<null>"; | ||
| string dacStr = dacNeeded > 0 ? new string(dacBuffer, 0, (int)dacNeeded - 1) : "<empty>"; | ||
| Trace.TraceWarning($"Output buffer pNeeded mismatch ({location}) - cDAC: {*cdacNeeded} (\"{cdacStr}\"), DAC: {dacNeeded} (\"{dacStr}\")"); | ||
| } | ||
| else if (cdacBuffer is not null && dacNeeded > 0 && count >= dacNeeded) | ||
| { | ||
| var cdacSpan = new ReadOnlySpan<char>(cdacBuffer, (int)dacNeeded - 1); | ||
| var dacSpan = new ReadOnlySpan<char>(dacBuffer, 0, (int)dacNeeded - 1); | ||
| if (!cdacSpan.SequenceEqual(dacSpan)) | ||
| { | ||
| Trace.TraceWarning($"Output buffer content mismatch ({location}) - cDAC: \"{new string(cdacBuffer, 0, (int)dacNeeded - 1)}\", DAC: \"{new string(dacBuffer, 0, (int)dacNeeded - 1)}\""); |
There was a problem hiding this comment.
ValidateOutputStringBuffer can throw when formatting dacStr/content: dacNeeded may be larger than dacBuffer.Length (e.g., when the legacy DAC reports required size > count). In the pNeeded-mismatch branch and the content-mismatch warning, clamp the length to the available buffer (e.g., min(dacNeeded, (uint)dacBuffer.Length, count)) before calling new string(...) or constructing spans.
| string cdacStr = cdacBuffer is not null && cdacLen > 0 ? new string(cdacBuffer, 0, cdacLen - 1) : "<null>"; | |
| string dacStr = dacNeeded > 0 ? new string(dacBuffer, 0, (int)dacNeeded - 1) : "<empty>"; | |
| Trace.TraceWarning($"Output buffer pNeeded mismatch ({location}) - cDAC: {*cdacNeeded} (\"{cdacStr}\"), DAC: {dacNeeded} (\"{dacStr}\")"); | |
| } | |
| else if (cdacBuffer is not null && dacNeeded > 0 && count >= dacNeeded) | |
| { | |
| var cdacSpan = new ReadOnlySpan<char>(cdacBuffer, (int)dacNeeded - 1); | |
| var dacSpan = new ReadOnlySpan<char>(dacBuffer, 0, (int)dacNeeded - 1); | |
| if (!cdacSpan.SequenceEqual(dacSpan)) | |
| { | |
| Trace.TraceWarning($"Output buffer content mismatch ({location}) - cDAC: \"{new string(cdacBuffer, 0, (int)dacNeeded - 1)}\", DAC: \"{new string(dacBuffer, 0, (int)dacNeeded - 1)}\""); | |
| int dacLen = (int)Math.Min(dacNeeded, Math.Min((uint)dacBuffer.Length, count)); | |
| string cdacStr = cdacBuffer is not null && cdacLen > 0 ? new string(cdacBuffer, 0, cdacLen - 1) : "<null>"; | |
| string dacStr = dacLen > 0 ? new string(dacBuffer, 0, dacLen - 1) : "<empty>"; | |
| Trace.TraceWarning($"Output buffer pNeeded mismatch ({location}) - cDAC: {*cdacNeeded} (\"{cdacStr}\"), DAC: {dacNeeded} (\"{dacStr}\")"); | |
| } | |
| else if (cdacBuffer is not null && dacNeeded > 0 && count >= dacNeeded) | |
| { | |
| int compareLen = (int)Math.Min(dacNeeded, Math.Min((uint)dacBuffer.Length, count)); | |
| if (compareLen > 0) | |
| { | |
| var cdacSpan = new ReadOnlySpan<char>(cdacBuffer, compareLen - 1); | |
| var dacSpan = new ReadOnlySpan<char>(dacBuffer, 0, compareLen - 1); | |
| if (!cdacSpan.SequenceEqual(dacSpan)) | |
| { | |
| Trace.TraceWarning($"Output buffer content mismatch ({location}) - cDAC: \"{new string(cdacBuffer, 0, compareLen - 1)}\", DAC: \"{new string(dacBuffer, 0, compareLen - 1)}\""); | |
| } |
| displayName: 'Publish Test Results and SOS Logs' | ||
| continueOnError: true | ||
| condition: failed() | ||
| condition: always() |
There was a problem hiding this comment.
Using condition: always() will also run this artifact publish step on canceled jobs, which often produces noisy "path not found" failures (even with continueOnError) and extra pipeline work. If the goal is to publish on both pass and fail runs, condition: succeededOrFailed() is typically a better fit (or an always() condition additionally excluding Agent.JobStatus == 'Canceled').
| condition: always() | |
| condition: succeededOrFailed() |
| displayName: 'Publish Test Results and SOS Logs' | ||
| continueOnError: true | ||
| condition: failed() | ||
| condition: always() |
There was a problem hiding this comment.
Same as earlier: consider using succeededOrFailed() (or exclude canceled) instead of always() so this publish step doesn’t run on canceled jobs and create noisy failures.
| condition: always() | |
| condition: succeededOrFailed() |
| displayName: 'Publish Test Results and SOS Logs' | ||
| continueOnError: true | ||
| condition: failed() | ||
| condition: always() |
There was a problem hiding this comment.
Same as earlier: consider using succeededOrFailed() (or exclude canceled) instead of always() so this publish step doesn’t run on canceled jobs and create noisy failures.
| condition: always() | |
| condition: succeededOrFailed() |
| System.Console.Error.WriteLine($"CDAC_EMPTY_NAME GetMethodTableName MT=0x{(ulong)mt:X} count={count} — TypeNameBuilder produced empty string, returning E_OUTOFMEMORY"); | ||
| System.Console.Error.Flush(); |
There was a problem hiding this comment.
This stderr logging is unconditional and will run in normal (non-"CDAC_NO_FALLBACK") scenarios, which can add noise and overhead in a hot path like GetMethodTableName. Consider gating it behind an opt-in switch (similar to LegacyFallbackHelper’s CDAC_NO_FALLBACK logging) or using Trace/Conditional("DEBUG") so release builds don’t emit unsolicited stderr output.
| System.Console.Error.WriteLine($"CDAC_EMPTY_NAME GetMethodTableName MT=0x{(ulong)mt:X} count={count} — TypeNameBuilder produced empty string, returning E_OUTOFMEMORY"); | |
| System.Console.Error.Flush(); | |
| Debug.WriteLine($"CDAC_EMPTY_NAME {nameof(GetMethodTableName)} MT=0x{(ulong)mt:X} count={count} - TypeNameBuilder produced empty string, returning E_OUTOFMEMORY"); |
The legacy DAC returned E_OUTOFMEMORY when TypeString::AppendType produced an empty string, while the cDAC returned S_OK with pNeeded=1. This mismatch caused the #if DEBUG validation assertion to crash SOS.WebApp3 in all runtime-diagnostics PR builds. The E_OUTOFMEMORY was incorrect — an empty type name is not an OOM condition. Fix the native DAC to write an empty string with pNeeded=1 (matching the cDAC's existing behavior) instead of returning a misleading error code. This was exposed by c835d0f ('Change heap dumps to use HEAP2 as the default') which changed what memory regions are included in heap dumps, causing some type names to become unresolvable. Also: - Add ValidateOutputStringBuffer helper to DebugExtensions for future use in converting output buffer assertions to soft-fail - Change runtime-diagnostics pipeline to always publish test results and SOS logs (not just on failure) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6f603d6 to
da6d53f
Compare
Note
This PR was generated with the assistance of GitHub Copilot.
Problem
The
SOS.WebApp3test has been crashing in allruntime-diagnosticsPR builds since April 24 with aDebug.Assertfailure inGetMethodTableName.Root Cause
Commit c835d0f ("Change heap dumps to use HEAP2 as the default") removed explicit enumeration of
m_pAvailableParamTypes,m_pInstMethodHashTable, and per-type/methodEnumMemoryRegionsfromModule::EnumMemoryRegionsfor heap dumps. HEAP2 relies solely onLoaderAllocatorheap enumeration, but generic instantiation lookup tables aren't fully captured by raw heap page enumeration. This causes some type names to be unresolvable from HEAP2 dumps.When
TypeString::AppendTypecan't resolve a type name (produces empty string), the native DAC was returningE_OUTOFMEMORYwhile the cDAC returnedS_OKwithpNeeded=1. The#if DEBUGvalidation caught this mismatch and crashed.What this PR fixes
request.cpp: Fix the native DAC to write an empty string with
pNeeded=1instead of returningE_OUTOFMEMORY— an empty type name is not an OOM condition. Both DACs now agree onS_OKwith empty string.runtime-diagnostics.yml: Changed "Publish Test Results and SOS Logs" from
condition: failed()tocondition: always()for all three SOS test jobs, so diagnostic output is accessible for passing runs.DebugExtensions.cs: Added
ValidateOutputStringBufferhelper for future use in converting output buffer assertions to soft-fail.Follow-up needed
The HEAP2 dump regression should be addressed separately. The removed code from
ceeload.cpp(m_pAvailableParamTypes,m_pInstMethodHashTable, and per-MethodTable/MethodDesc enumeration) needs to be partially restored to ensure generic instantiation type names are resolvable from heap dumps. cc @hoyosjs