JIT: coalesce constant-indexed bounds checks within a block#127439
AndyAyersMS wants to merge 2 commits into dotnet:main
Add a new phase `optBoundsCheckCoalesce` that runs before assertion prop, looking for sequences of bounds checks that can be collapsed into a single dominating check.

For example: `a[0] + a[1] + a[2] + a[3]` produces four bounds checks with indices 0, 1, 2, 3 and the same length VN. The phase rewrites the first check index to 3 and marks the other three checks as "in bound" so they get removed during assertion prop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
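The rewrite described above can be sketched on a toy model. This is only an illustration, not the code in the PR: `BoundsCheck` and `CoalesceBlock` are hypothetical names, indices are assumed to be non-negative constants, and side-effect barriers between checks are ignored here.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy stand-in for a bounds check node (hypothetical, not a JIT type).
struct BoundsCheck
{
    int  lengthVN;  // value number of the array length being checked
    int  index;     // constant index being checked (assumed >= 0)
    bool redundant; // set when an earlier, stronger check covers this one
};

// Coalesce the checks of one block: for each length VN, raise the first
// check's index to the largest index seen and mark later checks redundant.
inline void CoalesceBlock(std::vector<BoundsCheck>& checks)
{
    std::unordered_map<int, std::size_t> head; // lengthVN -> first check position
    for (std::size_t i = 0; i < checks.size(); i++)
    {
        auto it = head.find(checks[i].lengthVN);
        if (it == head.end())
        {
            head.emplace(checks[i].lengthVN, i);
            continue;
        }
        BoundsCheck& first  = checks[it->second];
        first.index         = std::max(first.index, checks[i].index);
        checks[i].redundant = true;
    }
}
```

For the four checks of `a[0] + a[1] + a[2] + a[3]`, the first check ends up with index 3 and the remaining three are marked redundant.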
@EgorBo FYI

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR adds a new JIT optimization phase (`optBoundsCheckCoalesce`) that runs before assertion propagation to reduce redundant bounds checks for constant indices within a basic block by strengthening one dominating check and marking others as provably in-bounds for later removal.

Changes:
- Introduces a new JIT phase (`PHASE_BOUNDS_CHECK_COALESCE`) and schedules it before assertion propagation.
- Adds `Compiler::optBoundsCheckCoalesce()` and implements block-local coalescing for constant-index bounds checks with the same length VN.
- Wires the new implementation into the JIT build (CMake sources).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/coreclr/jit/compphases.h | Registers the new phase name/id for diagnostics and phase tracking. |
| src/coreclr/jit/compiler.h | Declares optBoundsCheckCoalesce() on Compiler. |
| src/coreclr/jit/compiler.cpp | Runs the new coalescing phase immediately before assertion propagation. |
| src/coreclr/jit/boundscheckcoalesce.cpp | Implements the new bounds-check coalescing pass. |
| src/coreclr/jit/CMakeLists.txt | Adds the new .cpp file to the JIT build sources. |

    bool IsSideEffectBarrier(Compiler* comp, GenTree* node, bool blockIsInsideTry)
    {
        if (node->IsCall())
        {
            return true;
        }
        if (node->OperIs(GT_MEMORYBARRIER))

IsSideEffectBarrier only treats calls/stores/atomics/memory barriers as blockers, but the transformation can also change observable exception ordering across other potentially-throwing or ordering-constrained nodes (e.g., checked overflow ops, div/mod, nullchecks/indirections on unrelated objects, or nodes with GTF_ORDER_SIDEEFF). Strengthening an earlier bounds check to a larger index can make IndexOutOfRangeException occur before an intervening DivideByZeroException/OverflowException/NullReferenceException that would have been thrown first per the original evaluation order. Consider extending the barrier criteria to include nodes that may throw (or at least those with exception sets beyond the bounds-check group) and nodes with GTF_ORDER_SIDEEFF, so coalescing only happens when it cannot change which exception is observed.
It does seem kind of ad hoc, can't we just use exception flags here?
Yeah, let me try and simplify this a bit.
As for using flags: we'd have to filter out summary effects since we are walking node by node in execution order, so not clear if it makes things simpler.
Well, I am more worried about correctness here...
You can use OperEffects to get flags only for the top level node.
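The exception-ordering concern raised in this thread can be made concrete with a small simulation. This is a sketch under stated assumptions: `OpKind`, `Op`, and `FirstException` are hypothetical names for a mini-IR invented for illustration, and the exception names are returned as plain strings.

```cpp
#include <string>
#include <vector>

// Hypothetical mini-IR: an op either checks an index against the array
// length or divides by a value. Neither maps to a real JIT node type.
enum class OpKind { BoundsCheck, Divide };

struct Op
{
    OpKind kind;
    int    operand; // index for BoundsCheck, divisor for Divide
};

// Returns the name of the first exception raised when the ops run in order
// against an array of the given length, or "none" if nothing throws.
inline std::string FirstException(const std::vector<Op>& ops, int length)
{
    for (const Op& op : ops)
    {
        if (op.kind == OpKind::BoundsCheck && (op.operand < 0 || op.operand >= length))
        {
            return "IndexOutOfRangeException";
        }
        if (op.kind == OpKind::Divide && op.operand == 0)
        {
            return "DivideByZeroException";
        }
    }
    return "none";
}
```

Take the sequence check(0), divide-by-0, check(3) on an array of length 2: the original order reports `DivideByZeroException`. If the first check is strengthened to index 3 across the divide, the same input reports `IndexOutOfRangeException` instead, which is the observable change the review comment warns about.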
Should it already be optimized today via block cloning?

    static int Max(int[] a)
    {
        return a[0] + a[1] + a[2] + a[3];
    }

    ; Method MinMaxBench:Max(int[]):int (FullOpts)
    G_M60565_IG01:  ;; offset=0x0000
           sub      rsp, 40
    G_M60565_IG02:  ;; offset=0x0004
           mov      edx, dword ptr [rcx+0x08]
           cmp      edx, 3
           jle      SHORT G_M60565_IG04
           mov      eax, dword ptr [rcx+0x10]
           add      eax, dword ptr [rcx+0x14]
           add      eax, dword ptr [rcx+0x18]
           add      eax, dword ptr [rcx+0x1C]
    G_M60565_IG03:  ;; offset=0x0018
           add      rsp, 40
           ret
    G_M60565_IG04:  ;; offset=0x001D
           test     edx, edx
           je       SHORT G_M60565_IG06
           mov      eax, dword ptr [rcx+0x10]
           cmp      edx, 1
           jbe      SHORT G_M60565_IG06
           add      eax, dword ptr [rcx+0x14]
           cmp      edx, 2
           jbe      SHORT G_M60565_IG06
           add      eax, dword ptr [rcx+0x18]
           cmp      edx, 3
           jbe      SHORT G_M60565_IG06
           add      eax, dword ptr [rcx+0x1C]
    G_M60565_IG05:  ;; offset=0x003C
           add      rsp, 40
           ret
    G_M60565_IG06:  ;; offset=0x0041
           call     CORINFO_HELP_RNGCHKFAIL
           int3
    ; Total bytes of code: 71

I assume in this specific case we can avoid cloning, but this will be just about removing a cold block?
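To illustrate why a single strengthened check could stand in for both the fast path and the cold per-access path in the listing, here is a rough C++ analogue. It is an assumption-laden sketch: `SumPerAccess` and `SumCoalesced` are hypothetical names, `std::vector` stands in for a managed array, and `std::out_of_range` stands in for the range-check failure helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// One check per access, like the slow path that tests the length
// before each load.
inline int SumPerAccess(const std::vector<int>& a)
{
    int sum = 0;
    for (std::size_t i = 0; i < 4; i++)
    {
        if (i >= a.size())
        {
            throw std::out_of_range("index");
        }
        sum += a[i];
    }
    return sum;
}

// A single check against the largest index (3) covers all four loads.
inline int SumCoalesced(const std::vector<int>& a)
{
    if (a.size() <= 3)
    {
        throw std::out_of_range("index");
    }
    return a[0] + a[1] + a[2] + a[3];
}
```

Both functions return the same sum when the array has at least four elements and throw the same exception type otherwise. That equivalence holds here because nothing observable happens between the loads.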
It does trim away some of the cloning cold code, but also seems to get other cases. Will post some examples in a bit.

diffs -- lots of hits in tests, fewer in more realistic code.

A case where this is not just streamlining range cloning (there are more, just pointing one out that is readily visible from the diffs): aspnet2
- Use `OperMayThrow` + `GTF_ORDER_SIDEEFF` as the side-effect barrier rule instead of an ad-hoc list (memorybarrier/atomic). `GT_BOUNDS_CHECK` is exempted since IOOB is the same exception class our strengthened check throws.
- Switch from `block->hasTryIndex()` to `block->HasPotentialEHSuccs(this)` for the EH-reachability test, matching usage elsewhere in the JIT.
- Drop `GTF_CHK_INDEX_INBND` tagging on followers; strengthening only the head is enough -- forward assertion prop drops the followers.

Add regression tests covering exception-ordering invariants: divide/NRE between BCs, IOOB on short arrays, and locals live into catch/finally handlers in a try block.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
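The revised barrier rule can be sketched on a toy node model. This is an approximation for illustration only: `Node`, `IsBarrier`, and `StrengthenHeads` are hypothetical names, the two boolean fields merely play the roles that `OperMayThrow` and `GTF_ORDER_SIDEEFF` play in the commit message, all checks are assumed to share one length VN, and the EH-successor test is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy node model (hypothetical, not the JIT's GenTree).
struct Node
{
    bool isBoundsCheck;
    int  index;           // constant index, meaningful for bounds checks only
    bool mayThrow;        // the node itself can raise an exception
    bool orderSideEffect; // the node is ordering-constrained
};

// A node ends the current run if it may throw or is ordering-constrained.
// Bounds checks are exempt: they raise the same exception class as the
// strengthened head check.
inline bool IsBarrier(const Node& n)
{
    return !n.isBoundsCheck && (n.mayThrow || n.orderSideEffect);
}

// Strengthen only the head check of each barrier-free run. Followers are
// left untouched; a later pass is assumed to remove them.
inline void StrengthenHeads(std::vector<Node>& nodes)
{
    const std::size_t none = nodes.size();
    std::size_t       head = none;
    for (std::size_t i = 0; i < nodes.size(); i++)
    {
        if (IsBarrier(nodes[i]))
        {
            head = none; // later checks must not be hoisted across this node
        }
        else if (nodes[i].isBoundsCheck)
        {
            if (head == none)
            {
                head = i;
            }
            else
            {
                nodes[head].index = std::max(nodes[head].index, nodes[i].index);
            }
        }
    }
}
```

With checks at indices 0 and 1, then a throwing divide, then a check at index 3, the first head is raised only to 1 and the check after the divide starts a new run. A non-throwing node between two checks does not split the run.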
