Skip to content

JIT: Enable CHECK_IR earlier#127425

Draft
AndyAyersMS wants to merge 22 commits intodotnet:mainfrom
AndyAyersMS:check-ir-stepwise-minimal
Draft

JIT: Enable CHECK_IR earlier#127425
AndyAyersMS wants to merge 22 commits intodotnet:mainfrom
AndyAyersMS:check-ir-stepwise-minimal

Conversation

@AndyAyersMS
Copy link
Copy Markdown
Member

We only do a detailed check of some IR invariants (notably flags) after morph. Start moving the enabling earlier in the phase list (here, by two phases).

Allow a brief window between morphing implicit byrefs and morph where USEASG is not correct. We possibly should revisit the division of labor here.

AndyAyersMS and others added 15 commits April 23, 2026 16:04
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move CHECK_IR enablement up one phase past PHASE_MORPH_IMPBYREF.
- Default GTF_EXCEPT on in gtNewCallNode; clear for no-throw helpers
  in both gtNewHelperCallNode overloads.
- Remove GT_QMARK !op1->CanCSE() assertion from fgDebugCheckFlags
  (per user direction).

Not yet clean on aspnet2: 312 missing, 457 extra, 1 USEASG.
Total assertions went from 945 -> 769 (else-if redistribution).
- fginline: after RET_EXPR substitution, run gtUpdateStmtSideEffects on the
  owning statement to refresh stale ancestor side-effect flags.
- gentree: GT_RET_EXPR now sets GTF_CALL in OperRequiresCallFlag so that
  gtUpdateStmtSideEffects does not strip GTF_CALL from nested RET_EXPR
  nodes (the fginline walker uses GTF_CALL to decide whether to recurse).
- lclmorph: when marking a call with GTF_CALL_M_RETBUFFARG_LCLOPT, also
  set GTF_ASG on the call, matching OperRequiresAsgFlag.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rofile this-arg

- indirectcalltransformer.cpp ClonabilityVisitor: when the scout substitutes
  a RET_EXPR with its gtSubstExpr in a subsequent statement, remember it and
  call gtUpdateStmtSideEffects on the affected statement so ancestor STORE/
  COMMA nodes no longer carry stale GTF_CALL.
- fgprofile.cpp class profile instrumentation: after replacing the this-arg
  with a COMMA(stores, this), propagate GTF_ALL_EFFECT from the COMMA to the
  owning call so GTF_ASG/GTF_CALL flags stay consistent.
- fginline.cpp LateDevirtualization: mark m_substitutedRetExpr = true after
  a successful gtSplitTree so the outer walker refreshes side effects on
  m_curStmt (safety belt for devirt-triggered substitutions).

aspnet2 replay failures: 150 -> 106.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fgRetypeImplicitByRefArgs retypes implicit byref args from TYP_STRUCT to
TYP_BYREF but does not rewrite existing LCL_FLD/STORE_LCL_FLD references
into those args. Those references are rewritten later in
fgMorphExpandImplicitByRefArg during PHASE_MORPH_GLOBAL. During the
intervening window the GTF_VAR_USEASG flag that physical promotion set on
partial stores based on the old (struct) size no longer matches
IsPartialLclFld (computed from the new byref size), which trips the
post-phase CHECK_IR assert.

Advance CHECK_IR enablement one more phase (to before
PHASE_IMPBYREF_COPY_OMISSION) so the check first fires on the IR entering
that phase, and temporarily suppress CHECK_IR across PHASE_MORPH_IMPBYREF
where the IR is transiently inconsistent by design. CHECK_IR is restored
for PHASE_MORPH_GLOBAL and all subsequent phases.

Full SuperPMI replay (all 11 MCHs) is now clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
LocalAddressVisitor can rewrite IND(FIELD_ADDR(LCL_VAR)) into a LCL_FLD,
which removes GTF_EXCEPT (and other effect bits) from the rewritten node.
Ancestor nodes retain the stale effect flags, which tripped CHECK_IR once
it was enabled earlier in the phase pipeline.

Call gtUpdateStmtSideEffects once per modified statement to recompute
ancestor effect flags. This matches the pattern already used by other
phases that mutate IR shape (e.g. class-profile instrumentation).

On a full SuperPMI replay this reduces 'Extra flags' asserts ~88% and
'Missing flags' asserts ~80% across all collections.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…truction

With CHECK_IR enabled earlier in the phase pipeline (before PHASE_MORPH_IMPBYREF)
the diagnostic check fires in several places where the importer constructs calls
or intrinsics without accurately carrying over child side-effect flags. Fix the
underlying IR construction sites:

* importercalls.cpp: six sites that propagate arg/this/fptr flags to a call via
  `& GTF_GLOB_EFFECT` drop `GTF_ORDER_SIDEEFF`. Use `GTF_ALL_EFFECT` so
  volatile indirections and other ordering-sensitive side effects are carried
  onto the call node.
* importercalls.cpp: `RuntimeHelpers.IsKnownConstant` is modelled as a
  `GT_INTRINSIC` that `OperRequiresCallFlag` treats as call-implemented.
  Explicitly set `GTF_CALL` on the constructed node (matching the pattern
  already used for other user-call intrinsics) so the check-IR invariant holds
  until morph folds the node to 0/1.
* gentree.cpp (`gtNewSimdBinOpNode` small-integer shift path): after
  `fgMakeMultiUse` rewrites the existing `CreateScalar` operand into a
  COMMA containing a STORE_LCL_VAR, propagate the new child's side-effect
  flags onto the parent HWINTRINSIC.

Measured on SuperPMI full replay:
  Missing-flags diagnostics: 1828 -> 10 (total across 11 MCHs)
  No asm diffs expected; the flags were already semantically required.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndow

PHASE_MORPH_IMPBYREF retypes implicit byref args from TYP_STRUCT to TYP_BYREF
but does not rewrite existing LCL_FLD/STORE_LCL_FLD references into them;
that rewrite happens later in fgMorphExpandImplicitByRefArg during
PHASE_MORPH_GLOBAL. In the window between those two phases, the GTF_VAR_USEASG
flag on such nodes was set for the original TYP_STRUCT size and may not match
IsPartialLclFld computed against the new TYP_BYREF size, tripping the
fgDebugCheckFlags invariant when CHECK_IR runs before PHASE_MORPH_GLOBAL.

Replace the previous broad `activePhaseChecks &= ~CHECK_IR` suppression
with a narrow predicate: a new debug-only `fgImplicitByRefLclFldsStale`
flag gates the USEASG/IsPartialLclFld check so it is skipped only for
STORE_LCL_FLD/LCL_FLD nodes whose local is an implicit-byref now typed
TYP_BYREF. All other checks continue to run during the window.

The flag must be true when the post-phase check fires at the end of
PHASE_MORPH_IMPBYREF, so set it before DoPhase and clear it after
PHASE_MORPH_GLOBAL.

SuperPMI full replay: USEASG diagnostic count 309 max/MCH -> 0 across all
11 MCHs. No other assertion counts regress.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PHASE_MORPH_GLOBAL rewrites the stale LCL_FLD/STORE_LCL_FLD references to
the now-TYP_BYREF implicit byref args (via fgMorphExpandImplicitByRefArg),
so by the time that phase's post-phase CHECK_IR runs the IR is consistent
again. Clear fgImplicitByRefLclFldsStale before DoPhase(PHASE_MORPH_GLOBAL)
instead of after it, so the full USEASG/IsPartialLclFld invariant check
runs at the end of morph rather than being relaxed.

SuperPMI full replay: USEASG count remains 0 across all 11 MCHs; no new
assertions surfaced by tightening the scope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fgCompactBlock merges a target block into a predecessor, preserving the
set of flags listed in BBF_COMPACT_UPD from the target. The list did not
include BBF_RECURSIVE_TAILCALL, so when an earlier inline pass split a
block holding a recursive tail call and later passes compacted the
subsequent empty tail back into a new predecessor, the merged block lost
BBF_RECURSIVE_TAILCALL even though the recursive tail call statement
itself ended up in that block.

With CHECK_IR now running before PHASE_MORPH_GLOBAL, this trips the
`block->HasFlag(BBF_RECURSIVE_TAILCALL)` invariant for recursive
tailcall nodes in fgDebugCheckFlags.

Add BBF_RECURSIVE_TAILCALL to BBF_COMPACT_UPD so the merged block
retains the annotation.

SuperPMI full replay: BBF_RECURSIVE_TAILCALL assertion count 7 max/MCH
(8 occurrences) -> 0 across all 11 MCHs. No other categories regress.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…MD MinMax fixup

Three related fixes where IR construction/rewrite sites introduced GTF_ASG,
GTF_CALL, or GTF_EXCEPT on a descendant without updating ancestor flags,
triggering CHECK_IR 'Missing flags on tree' when that check runs before
PHASE_MORPH_IMPBYREF:

1. HandleHistogramProbeInstrumentor::Instrument (class-profile instrumentation):
   wrap the existing per-statement walk so that when a probe is actually
   inserted, we call gtUpdateStmtSideEffects on the statement. The inserter
   rewrites 'this' args with COMMA(STORE, COMMA(helper-call, LCL_VAR)),
   introducing new ASG/CALL/EXCEPT flags that must be propagated up the
   statement. Remove the earlier narrower fix at the inserter that only
   updated the immediate CALL's flags.

2. ValueInstrumentor::Instrument (value-profile instrumentation): when
   the length arg is rewritten to COMMA(helper-call, LCL_VAR), OR the new
   arg's effect flags onto the parent CALL so GTF_ASG surfaces there.

3. gtNewSimdMinMaxNode FixupScalar path: after fgMakeMultiUse rewrites
   op2 (a CreateScalarUnsafe) into a COMMA(STORE, LCL_VAR) and that COMMA
   is re-attached to the MaxScalar/MinScalar HWINTRINSIC, propagate the
   new COMMA's effect flags onto the HWINTRINSIC so GTF_ASG is reflected.

Reduces max 'Missing flags' count per MCH from 4 to 3 across 11 MCHs in
SuperPMI replay, and reduces affected MCHs from 4 to 3. No regressions
in Extra/USEASG/tailcall categories.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… instrumentation

The initial fix only updated the direct enclosing call, but the new GTF_ASG
introduced by the wrapping COMMA(profile-helper, load) also needs to
propagate up to ancestors (e.g., RETURN). Use the same gtUpdateStmtSideEffects
pattern applied for HandleHistogramProbeInstrumentor: track per-statement
whether any probe was inserted, and refresh the whole statement's
side-effect flags when so.

Eliminates the final Missing-flags CHECK_IR assertion (libraries_tests.run
context #389863, StartsWith[long]) and brings the Missing category to 0
across all SuperPMI replay MCHs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r swap

In gtNewSimdBinOpNode's Shift expansion, fgMakeMultiUse on op2->Op(1) wraps
the shift count in COMMA(STORE, LCL_VAR). GTF_ASG was OR'd onto op2
(CreateScalar) immediately. For GT_RSH the subsequent std::swap pulls the
COMMA out of op2 into shiftCountDup (used under maskAmountOp), leaving
op2->Op(1) as a plain LCL_VAR with no side effects. The OR on op2 is now
stale, yielding an Extra GTF_ASG on CreateScalar.

Move the OR to after the swap so we pick up flags of whatever ends up as
op2->Op(1) — the COMMA for non-RSH, the plain LCL_VAR (no extra flags) for RSH.

Eliminates the Extra GTF_ASG on CreateScalar across Vector*.Shift*
contexts and drops replay max Extra count from 109 to 101, and from 10
to 9 affected MCHs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two CHECK_IR Extra-flag patterns surface when CHECK_IR is enabled before
PHASE_MORPH_IMPBYREF:

1. fgInsertInlineeArgument bashes single-use temp LCL_VARs in-place to
   the original arg expression. Parents of the bashed node (NULLCHECK,
   FIELD_ADDR, ARR_LENGTH, ...) keep stale GTF_EXCEPT when the new
   child cannot fault (e.g. IND with GTF_IND_NONNULL/NONFAULTING).
   Fix: after argument insertion, walk all inlinee statements and call
   gtUpdateStmtSideEffects so ancestor flags are recomputed.
   Also: at nullcheck creation time, consult the original arg node
   alongside the temp; if the original is non-null, skip the redundant
   nullcheck entirely.

2. Forward Substitution only refreshed side effects when the substituted
   node had GTF_ALL_EFFECT. But replacing a possibly-null local use with
   a known non-null expression (e.g. IND of static-readonly init field)
   removes a potential fault from ancestors like ARR_LENGTH and
   FIELD_ADDR. Fix: always refresh side effects on the receiving
   statement after substitution.

Full SuperPMI replay is now clean across all 11 MCH collections.
Asm diffs show 3 size improvements (-10 bytes, -0.63% PerfScore) and no
regressions, reflecting elimination of redundant null/fault checks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 25, 2026 14:37
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 25, 2026
@AndyAyersMS
Copy link
Copy Markdown
Member Author

Draft for now since I want to see CI's read on TP. There is a full inlinee tree walk that may need to be more surgical.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Moves CHECK_IR earlier in the CoreCLR JIT phase pipeline so IR flag invariants (notably side-effect flags) are validated before morph, and adds targeted flag/side-effect refreshes to keep IR consistent during earlier phases.

Changes:

  • Enable PhaseChecks::CHECK_IR earlier (pre implicit-byref copy-omission), and scope a debug-only relaxation for a transient implicit-byref USEASG invariant violation.
  • Refresh statement/node side-effect flags after various IR rewrites (local morphing, forward substitution, inlining RET_EXPR substitution/splitting, PGO probe insertion, GDV cloning prep).
  • Adjust/propagate effect flags more precisely (e.g., use GTF_ALL_EFFECT propagation, ensure certain nodes carry required GTF_CALL/GTF_ASG/GTF_EXCEPT flags).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/lclmorph.cpp Refresh stmt side effects after local-address rewrites; mark retbuf-local-optimized calls with GTF_ASG.
src/coreclr/jit/indirectcalltransformer.cpp Update stmt side effects after substituting clonable GT_RET_EXPR nodes.
src/coreclr/jit/importercalls.cpp Propagate GTF_ALL_EFFECT into call flags; ensure specific intrinsic placeholder carries GTF_CALL.
src/coreclr/jit/gentree.cpp Treat GT_RET_EXPR as requiring GTF_CALL; default calls to GTF_EXCEPT; propagate effects in some SIMD builders.
src/coreclr/jit/forwardsub.cpp Always recompute receiving statement side-effect flags after forward substitution.
src/coreclr/jit/fgprofile.cpp Recompute stmt side effects after histogram/value probe insertion mutates call argument trees.
src/coreclr/jit/fginline.cpp Recompute stmt side effects after RET_EXPR substitution/splitting; adjust nullcheck decision; refresh inlinee stmt side effects after arg insertion.
src/coreclr/jit/fgdiagnostic.cpp Relax USEASG invariant check during implicit-byref retype window; remove GT_QMARK CSE assert.
src/coreclr/jit/compiler.hpp Clear GTF_EXCEPT for helper calls proven NoThrow now that calls default to GTF_EXCEPT.
src/coreclr/jit/compiler.h Add debug flag fgImplicitByRefLclFldsStale to scope check relaxation.
src/coreclr/jit/compiler.cpp Enable CHECK_IR earlier; bracket implicit-byref stale window with fgImplicitByRefLclFldsStale.
src/coreclr/jit/block.h Include BBF_RECURSIVE_TAILCALL in BBF_COMPACT_UPD set.
Comments suppressed due to low confidence (1)

src/coreclr/jit/gentree.cpp:21459

  • In this path fgMakeMultiUse can rewrite op2->Op(1) to a COMMA(tempStore, lcl), and for GT_RSH the subsequent std::swap can replace that operand with a plain LCL_VAR. Using op2->gtFlags |= ... only ever adds flags and does not clear any stale effects op2 may still be carrying from its pre-rewrite operand, which can trip CHECK_IR as an "extra flags" failure. After mutating/swapping the operand, please recompute op2's effect flags (e.g., via gtUpdateNodeSideEffects(op2) or equivalent) rather than OR-ing bits in.

Comment thread src/coreclr/jit/fginline.cpp Outdated
Comment thread src/coreclr/jit/fgdiagnostic.cpp Outdated
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Copy link
Copy Markdown
Member Author

There is a decent TP hit in minopts, possibly from just the more "surgical" flag repairs. Will have to dig in and see what's up.

AndyAyersMS and others added 2 commits April 25, 2026 20:13
…g shrinks

LocalAddressVisitor::VisitStmt was calling gtUpdateStmtSideEffects whenever
`m_stmtModified` was set, which triggered a full pre+post tree walk for
every modified statement. Most lclmorph rewrites (e.g. local-address
canonicalization at MorphLocalAddress) do not actually remove any
side-effect bits, so the refresh is unnecessary in the common case.

Track a separate `m_stmtSideEffectsShrunk` flag and only refresh ancestor
flags when a rewrite is known to drop side effects (BashToConst, nullcheck
NOP, IndirTransform::Nop, MorphLocalIndir's IND->LCL, address-comparison
constant fold, and the promoted-struct-field replacements). For the
common MorphLocalAddress site, only mark shrunk when the address node
actually had side-effect bits before being reset to GTF_EMPTY.

This restores the lclmorph throughput cost on Tier0 from +9.3% to under
+1% on the aspnet2 minopts subset, and brings the overall Tier0
throughput delta on that subset from +1.65% down into noise (+0.05%).
SPMI replay is clean and asmdiffs are unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 26, 2026 14:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR moves CHECK_IR (detailed IR invariant checking, especially flag correctness) earlier in the JIT phase sequence and updates multiple transforms to keep side-effect flags consistent as IR is rewritten before/around morphing.

Changes:

  • Enable PhaseChecks::CHECK_IR earlier (before implicit-byref copy omission analysis) and relax one debug-only invariant during the implicit-byref retyping window.
  • Refresh statement side-effect flags (gtUpdateStmtSideEffects) after various rewrites/substitutions that can add/remove GTF_* effects.
  • Tighten/adjust effect flag propagation in call/import and selected SIMD/HWIntrinsic node construction.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/lclmorph.cpp Tracks when local-morph rewrites may stale ancestor effect flags and refreshes stmt side effects accordingly; marks retbuf-as-local calls as GTF_ASG.
src/coreclr/jit/indirectcalltransformer.cpp Updates stmt side effects when substituting GT_RET_EXPR during clonability scanning.
src/coreclr/jit/importercalls.cpp Propagates GTF_ALL_EFFECT (not just GTF_GLOB_EFFECT) into call flags; marks IsKnownConstant intrinsic as call-like pre-morph.
src/coreclr/jit/gentree.cpp Adjusts operators requiring call flags; makes new calls default to GTF_EXCEPT; fixes up effect propagation for certain SIMD construction paths.
src/coreclr/jit/forwardsub.cpp Always refreshes side-effect flags on the target statement after forward substitution.
src/coreclr/jit/fgprofile.cpp Refreshes stmt side effects after instrumentation inserts COMMA-wrapped helpers/temps.
src/coreclr/jit/fginline.cpp Refreshes side-effect flags after RET_EXPR substitution/splitting; adds null-check decision refinement; refreshes side effects across inlinee statements after arg insertion.
src/coreclr/jit/fgdiagnostic.cpp Relaxes USEASG vs IsPartialLclFld checking during implicit-byref retyping window; updates required call-flag set.
src/coreclr/jit/compiler.hpp Clears GTF_EXCEPT for no-throw helpers now that call nodes default to GTF_EXCEPT.
src/coreclr/jit/compiler.h Adds a debug-only flag to scope the implicit-byref “stale LCL_FLD” invariant relaxation.
src/coreclr/jit/compiler.cpp Moves enabling of CHECK_IR earlier; scopes the implicit-byref stale window around relevant phases.
src/coreclr/jit/block.h Ensures BBF_RECURSIVE_TAILCALL is preserved when compacting blocks.

Comment thread src/coreclr/jit/gentree.cpp
Comment thread src/coreclr/jit/fginline.cpp
When LocalAddressVisitor rewrites an address tree to a known non-null
LCL_ADDR, ancestor indirections that previously held GTF_EXCEPT due to
a potentially-null address can have stale flags. The narrowing
introduced in 98c29c7 only refreshed when the address itself had
side effects, missing this case and causing CHECK_IR 'Extra flags on
tree' assertions in the libraries.pmi MCH (e.g., context 126507) once
the check runs after Allocate Objects + Local morph.

Always set m_stmtUpdateSideEffects = true here so the per-statement
gtUpdateStmtSideEffects pass at the end of the visitor recomputes
ancestor flags. The other narrowing sites in lclmorph.cpp remain
narrow (they handle cases where the rewritten subtree's faulting
potential does not change).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…LocalAddress

The previous commit refreshed unconditionally inside MorphLocalAddress,
but only the ProcessIndirection wide-indir caller has a parent indir
whose GTF_EXCEPT can be made stale by collapsing the address to a known
non-null LCL_ADDR. The EscapeAddress caller has no such parent, so the
original narrow condition (refresh only when the address tree itself
had effects) is sufficient there.

Restores the MinOpts portion of the lclmorph throughput improvement:
aspnet2 MinOpts goes from +0.45% back to +0.26% (matching the
pre-correctness-fix measurement) while keeping replay clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 26, 2026 22:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Moves detailed IR invariant checking (notably effect/flag consistency) earlier in the JIT phase order, and updates/repairs effect-flag propagation in several transformations so CHECK_IR can run sooner without false positives.

Changes:

  • Enable PhaseChecks::CHECK_IR earlier (pre-morph) and scope a debug-only relaxation for the implicit-byref USEASG/IsPartialLclFld invariant during the transient retyping window.
  • Add/expand gtUpdateStmtSideEffects refreshes (and some new flag propagation) after IR rewrites that can add/remove GTF_ASG/GTF_CALL/GTF_EXCEPT.
  • Adjust call/intrinsic flag initialization and effect propagation (e.g., default GTF_EXCEPT on calls; propagate GTF_ALL_EFFECT in more places).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/lclmorph.cpp Tracks when local-morph rewrites can invalidate ancestor effect flags and refreshes statement side effects accordingly; ensures retbuf-as-local calls carry GTF_ASG.
src/coreclr/jit/indirectcalltransformer.cpp Refreshes statement side effects when substituting GT_RET_EXPR during GDV chaining analysis.
src/coreclr/jit/importercalls.cpp Uses GTF_ALL_EFFECT propagation for indirect-call/control expressions and args; marks certain intrinsics with GTF_CALL.
src/coreclr/jit/gentree.cpp Makes call nodes default to GTF_EXCEPT; extends OperRequiresCallFlag to GT_RET_EXPR; fixes missing effect propagation in some SIMD/HWI construction paths.
src/coreclr/jit/forwardsub.cpp Always refreshes receiving statement side effects after forward substitution to avoid stale effect flags.
src/coreclr/jit/fgprofile.cpp Refreshes statement side effects after PGO probe insertion when instrumentation modifies call argument trees.
src/coreclr/jit/fginline.cpp Refreshes statement side effects after RET_EXPR substitution/splitting and after inline argument insertion (currently via full inlinee walk).
src/coreclr/jit/fgdiagnostic.cpp Relaxes USEASG/IsPartialLclFld debug invariant during the implicit-byref transient window; removes one GT_QMARK CSE assert.
src/coreclr/jit/compiler.hpp Clears GTF_EXCEPT for NoThrow helpers now that calls default to GTF_EXCEPT.
src/coreclr/jit/compiler.h Adds debug-only fgImplicitByRefLclFldsStale flag to scope invariant relaxation.
src/coreclr/jit/compiler.cpp Enables CHECK_IR earlier and toggles fgImplicitByRefLclFldsStale around implicit-byref retyping/morph window; removes later enabling.
src/coreclr/jit/block.h Includes BBF_RECURSIVE_TAILCALL in BBF_COMPACT_UPD so compaction preserves this block property.

Comment thread src/coreclr/jit/fginline.cpp
Comment thread src/coreclr/jit/fginline.cpp Outdated
…ng occurred

After commit 6c6fbb3, fgInlinePrependStatements unconditionally walks
every statement of every inlinee block and calls gtUpdateStmtSideEffects.
This walk is only required when fgInsertInlineeArgument bashed a single-use
LCL_VAR temp in place (the path that calls argSingleUseNode->ReplaceWith).
Other paths (gtNewTempStore, gtUnusedValNode, gtTryRemoveBoxUpstreamEffects)
do not mutate pre-existing inlinee body trees and therefore cannot leave
stale side-effect flags on inlinee parents.

Track whether any in-place bash occurred via a bool out-parameter from
fgInsertInlineeArgument and skip the refresh loop entirely when none did.

Throughput on aspnet2:
  FullOpts (full): -0.64%
  Tier1 subset:    -0.77%

Replay clean and no asm diffs across all 11 SPMI collections.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AndyAyersMS
Copy link
Copy Markdown
Member Author

AndyAyersMS commented Apr 27, 2026

TP impact is now down to about 0.5% worst case: diffs

Not sure how much lower it can go.

We may be able to claw some of this back when optimizing, eg simplifying some of the extra checks in forward sub.

- fgdiagnostic.cpp: tighten the implicit-byref-stale guard to OperIs(GT_STORE_LCL_FLD).
  GT_LCL_FLD cannot reach the GT_STORE_LCL_VAR/GT_STORE_LCL_FLD switch arm, and
  GT_STORE_LCL_VAR is whole-local so its USEASG/IsPartialLclFld relationship is
  not affected by the implicit-byref retyping window.

- gentree.cpp gtNewSimdMinMaxNode: in the MIN !isNumber/!isScalar branch the
  inner block had a stray unconditional overwrite of needsFixup. The first
  assignment (IsVectorZero) was the wrong predicate for this case; the
  comment block above states fixup is needed when cns is -0. Collapse the
  two blocks into a single else that sets needsFixup from
  IsVectorNegativeZero(simdBaseType), matching the intent.

- fginline.cpp SubstitutePlaceholdersAndDevirtualizeWalker: rename
  m_substitutedRetExpr to m_needsSideEffectRefresh. The flag is set both when
  a RET_EXPR is substituted and when LateDevirtualization performs gtSplitTree;
  the new name reflects the actual role.

Replay clean across all 11 SPMI collections.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 27, 2026 18:15
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR moves PhaseChecks::CHECK_IR earlier in the JIT phase list (starting before implicit-byref copy omission analysis) and then adjusts several IR-mutating transforms to keep side-effect/flag invariants valid at the earlier check point. It also introduces a debug-only “stale implicit byref LCL_FLD” window to scope a temporary relaxation of the USEASG invariant.

Changes:

  • Enable CHECK_IR earlier and relax one debug invariant during the implicit-byref retype-to-rewrite window.
  • Add targeted gtUpdateStmtSideEffects refreshes after rewrites that can add/remove GTF_* effect flags.
  • Adjust call/intrinsic flag propagation and defaults so GTF_* summaries are consistent earlier.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/lclmorph.cpp Tracks when rewrites can stale ancestor effect flags and refreshes statement side effects.
src/coreclr/jit/indirectcalltransformer.cpp Refreshes statement side effects if GT_RET_EXPR is substituted during clonability scouting.
src/coreclr/jit/importercalls.cpp Switches effect propagation to GTF_ALL_EFFECT and annotates IsKnownConstant intrinsic with GTF_CALL.
src/coreclr/jit/gentree.cpp Tightens/aligns required flags (e.g., GT_RET_EXPR call-flag requirement), adjusts call defaults, and fixes effect propagation in some SIMD helpers.
src/coreclr/jit/forwardsub.cpp Always refreshes side-effect flags on the receiving statement after substitution.
src/coreclr/jit/fgprofile.cpp Refreshes stmt side effects when probe insertion changes call subtree effects.
src/coreclr/jit/fginline.cpp Adds conditional side-effect refresh after placeholder substitution/splits and after in-place arg temp bashing.
src/coreclr/jit/fgdiagnostic.cpp Scopes a debug-only relaxation for USEASG checks during implicit-byref rewrite window; removes a GT_QMARK CSE assertion.
src/coreclr/jit/compiler.hpp Updates helper-call flagging to clear GTF_EXCEPT for no-throw helpers (given new call defaults).
src/coreclr/jit/compiler.h Adds debug-only fgImplicitByRefLclFldsStale state and updates inline argument helper signature.
src/coreclr/jit/compiler.cpp Moves CHECK_IR enablement earlier and toggles the implicit-byref stale window around relevant phases.
src/coreclr/jit/block.h Ensures BBF_RECURSIVE_TAILCALL is preserved during block compaction flag updates.

m_compiler->lvaSetHiddenBufferStructArg(lclNum);
escapeAddr = false;
callUser->gtCallMoreFlags |= GTF_CALL_M_RETBUFFARG_LCLOPT;
callUser->gtFlags |= GTF_ASG;
Comment on lines 3671 to +3673
// op1 is not a known constant, we'll do the expansion in morph
retNode = new (this, GT_INTRINSIC) GenTreeIntrinsic(TYP_INT, op1, ni, method R2RARG(*entryPoint));
retNode->gtFlags |= GTF_CALL;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants