JIT: Transform arithmetic using distributive property#126852

Open
BoyBaykiller wants to merge 10 commits into dotnet:main from
BoyBaykiller:transform-using-distributive-property

@BoyBaykiller
Contributor

@BoyBaykiller BoyBaykiller commented Apr 13, 2026

Generalization of #126070

Previously the JIT did no simplification based on the distributive property, i.e. transforming
((A op1 B) op2 (A op1 C)) => (A op1 (B op2 C)). This PR adds basic support for it. Examples:

int MulDistedOverAdd(int A, int B, int C)
{
    return (A * B) + (A * C);
}
;; ------ BASE
G_M000_IG02:
       mov      eax, edx
       imul     eax, r8d
       imul     r9d, edx
       add      eax, r9d

;; ------ DIFF
G_M48043_IG02:  ;; offset=0x0000
       lea      eax, [r8+r9]
       imul     eax, edx

bool AfterOptimizeBools(int A, int B)
{
    return (A & 4) != 0 || (A & 8) != 0;
}
;; ------ BASE
       mov      eax, edx
       and      eax, 4
       and      edx, 8
       or       eax, edx
       setne    al
       movzx    rax, al

;; ------ DIFF
G_M55610_IG02:
       test     dl, 12
       setne    al
       movzx    rax, al
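The first example relies on multiplication distributing over addition even when the intermediate results overflow. A minimal C++ sketch of that identity follows. It assumes wrapping 32-bit arithmetic, so it uses `uint32_t`, where overflow is well-defined in C++. The function names are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Unfactored form: two multiplies and one add, like the BASE codegen.
constexpr uint32_t mulThenAdd(uint32_t a, uint32_t b, uint32_t c)
{
    return (a * b) + (a * c);
}

// Factored form: one add and one multiply, like the DIFF codegen.
constexpr uint32_t addThenMul(uint32_t a, uint32_t b, uint32_t c)
{
    return a * (b + c);
}

// Compile-time spot checks, including one case where the products wrap.
static_assert(mulThenAdd(3, 4, 5) == addThenMul(3, 4, 5), "small values agree");
static_assert(mulThenAdd(0x80000001u, 0xFFFFFFFFu, 7u) == addThenMul(0x80000001u, 0xFFFFFFFFu, 7u),
              "values agree when the arithmetic wraps");

// Asserts that both forms agree for every combination of a few edge-case values.
inline void checkFormsAgreeOnSamples()
{
    const uint32_t samples[] = {0u, 1u, 2u, 12u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
    for (uint32_t a : samples)
        for (uint32_t b : samples)
            for (uint32_t c : samples)
                assert(mulThenAdd(a, b, c) == addThenMul(a, b, c));
}
```

The identity holds modulo 2^32, which is why the rewrite is safe for wrapping integer multiplication. It would not carry over unchanged to overflow-checked or floating-point arithmetic.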

We still need a reassociation step to enable this optimization when the matching operands are not siblings. In the case below, (A | B) | C would first have to become A | (B | C):

uint Reassociate(uint foo, uint flags)
{
    return (foo | (flags & 256)) | (flags & 512);
}
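To make the missing step concrete, here is a small C++ sketch of the two rewrites that `Reassociate` would need. It only demonstrates that the three forms compute the same value. The function names are hypothetical, and the sketch says nothing about how the JIT would implement the reordering.

```cpp
#include <cassert>
#include <cstdint>

// Shape from the example: the two masked terms are not siblings in the tree.
constexpr uint32_t original(uint32_t foo, uint32_t flags)
{
    return (foo | (flags & 256u)) | (flags & 512u);
}

// Step 1 (reassociation): (A | B) | C  ==>  A | (B | C)
constexpr uint32_t reassociated(uint32_t foo, uint32_t flags)
{
    return foo | ((flags & 256u) | (flags & 512u));
}

// Step 2 (distribution): (X & M1) | (X & M2)  ==>  X & (M1 | M2)
constexpr uint32_t factored(uint32_t foo, uint32_t flags)
{
    return foo | (flags & (256u | 512u));
}

// Asserts that all three forms agree for every 10-bit flags value
// and a handful of foo values.
inline void checkAllFormsAgree()
{
    const uint32_t foos[] = {0u, 1u, 256u, 512u, 0xFFFFFFFFu};
    for (uint32_t foo : foos)
    {
        for (uint32_t flags = 0; flags < 1024u; ++flags)
        {
            assert(original(foo, flags) == reassociated(foo, flags));
            assert(original(foo, flags) == factored(foo, flags));
        }
    }
}
```

Only after step 1 do the two `flags & mask` terms share a parent node, which is the pattern the distributive rewrite looks for.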

Also, I have to check for GTF_ORDER_SIDEEFF to exclude volatile loads, but that check has false negatives: https://discord.com/channels/143867839282020352/312132327348240384/1487221372899037234

* add OR and AND are 'distributive' over themselves
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 13, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Apr 13, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/jit/morph.cpp Outdated
return tree;
}

if (((tree->gtFlags & GTF_PERSISTENT_SIDE_EFFECTS) != 0) || ((tree->gtFlags & GTF_ORDER_SIDEEFF) != 0))
Contributor

This check could use the same optimization you are trying to apply 😅. The C++ compiler probably handles it already, though, so this is purely documentation ;-)

Contributor Author

Lol, a new level of self-documenting code. I didn't even notice that; I copied this check from elsewhere. I guess this just goes to show how useful the optimization is. Unfortunately, even with this PR it's still not handled :(

Contributor Author

Update: This is now handled by running this optimization again after the "optimizeBools" phase, which simplifies e.g.
(A & 4) != 0 || (A & 8) != 0 into ((A & 4) | (A & 8)) != 0.
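The two-stage simplification described in this thread can be sketched in C++ as three equivalent predicates. The constants 4 and 8 come from the example above. The function names are hypothetical, and stage 1 only mirrors the shape the comment attributes to optimizeBools.

```cpp
#include <cassert>
#include <cstdint>

// Stage 0: the source form, two tests joined by a short-circuit OR.
constexpr bool asWritten(uint32_t a)
{
    return (a & 4u) != 0 || (a & 8u) != 0;
}

// Stage 1: a single compare over the OR of both masked values.
constexpr bool afterOptimizeBools(uint32_t a)
{
    return ((a & 4u) | (a & 8u)) != 0;
}

// Stage 2: AND distributed over OR, (A & 4) | (A & 8)  ==>  A & (4 | 8).
constexpr bool afterDistribution(uint32_t a)
{
    return (a & 12u) != 0;
}

// Asserts that all three stages agree for every 8-bit input.
inline void checkAllStagesAgree()
{
    for (uint32_t a = 0; a < 256u; ++a)
    {
        assert(asWritten(a) == afterOptimizeBools(a));
        assert(asWritten(a) == afterDistribution(a));
    }
}
```

The flags check quoted from morph.cpp at the top of this thread has the same stage 0 shape, with two flag masks in place of 4 and 8.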

@BoyBaykiller
Contributor Author

BoyBaykiller commented Apr 15, 2026

@EgorBo PTAL. The diffs are rather small, but I still believe this is a step in the right direction. I've written down some future work (all of which would improve the diffs). I think the most interesting case for real-world code is around bitwise ops.
Also, should this go into post-order or pre-order morphing?

@BoyBaykiller BoyBaykiller marked this pull request as ready for review April 15, 2026 01:07
…implification like '(A & 4) != 0 || (A & 8) != 0'
@BoyBaykiller
Contributor Author

Calling fgMorphBlockStmt after optimizeBools fixes point 2 in my list and produces much better diffs (old vs. new).

@BoyBaykiller
Contributor Author

BoyBaykiller commented Apr 15, 2026

Regarding point 1: the issue is that this runs first:

runtime/src/coreclr/jit/morph.cpp

Lines 10665 to 10690 in e0fb1f9

else if (mulShiftOpt && (lowestBit > 1) && jitIsScaleIndexMul(lowestBit))
{
    int     shift  = genLog2(lowestBit);
    ssize_t factor = abs_mult >> shift;
    if (factor == 3 || factor == 5 || factor == 9)
    {
        // if negative negate (min-int does not need negation)
        if (mult < 0 && mult != SSIZE_T_MIN)
        {
            op1        = gtNewOperNode(GT_NEG, genActualType(op1), op1);
            mul->gtOp1 = op1;
            fgMorphTreeDone(op1);
        }
        // change the multiplication into a smaller multiplication (by 3, 5 or 9) and a shift
        GenTree* const factorNode = gtNewIconNodeWithVN(this, factor, mul->TypeGet());
        factorNode->SetMorphed(this);
        op1        = gtNewOperNode(GT_MUL, mul->TypeGet(), op1, factorNode);
        mul->gtOp1 = op1;
        fgMorphTreeDone(op1);
        op2->AsIntConCommon()->SetIconValue(shift);
        changeToShift = true;
    }
}

Perhaps this should be moved to Lower. LLVM also does it in "X86 DAG->DAG Instruction Selection" rather than in "InstCombine": https://godbolt.org/z/jsPea1PhP
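For readers unfamiliar with the quoted morph code, here is a simplified C++ sketch of the arithmetic it performs: a multiplication by a constant is split into a multiplication by 3, 5 or 9 followed by a left shift. `MulDecomposition`, `decompose` and `mulViaShift` are hypothetical names. The sketch handles only positive constants and does not model the `mulShiftOpt` or `jitIsScaleIndexMul` conditions.

```cpp
#include <cassert>
#include <cstdint>

struct MulDecomposition
{
    bool     applies; // true if the constant splits into (3, 5 or 9) << shift
    uint32_t factor;  // 3, 5 or 9 when applies is true, otherwise 0
    uint32_t shift;   // log2 of the lowest set bit when applies is true
};

// Splits a positive constant into factor * 2^shift, mirroring the
// (lowestBit > 1) and factor checks of the quoted code.
inline MulDecomposition decompose(uint32_t mult)
{
    if (mult == 0)
    {
        return {false, 0u, 0u};
    }

    const uint32_t lowestBit = mult & (~mult + 1u); // isolate the lowest set bit
    if (lowestBit <= 1u)
    {
        return {false, 0u, 0u}; // odd constant: there is nothing to shift out
    }

    uint32_t shift = 0;
    while ((1u << shift) != lowestBit)
    {
        ++shift;
    }

    const uint32_t factor = mult >> shift;
    if (factor == 3u || factor == 5u || factor == 9u)
    {
        return {true, factor, shift};
    }
    return {false, 0u, 0u};
}

// Evaluates x * mult in its decomposed form: (x * factor) << shift.
inline uint32_t mulViaShift(uint32_t x, const MulDecomposition& d)
{
    assert(d.applies);
    return (x * d.factor) << d.shift;
}
```

For example, 40 decomposes into factor 5 and shift 3, so `x * 40` becomes `(x * 5) << 3`. Once the tree has this shape, the original constant multiplier is no longer visible as a single `GT_MUL` operand, which is presumably why running this rewrite first gets in the way of the distributive transform.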

Comment thread src/coreclr/jit/optimizebools.cpp Outdated
@@ -346,6 +346,9 @@ bool OptBoolsDsc::optOptimizeBoolsCondBlock()

optOptimizeBoolsUpdateTrees();

// There may be new opportunities for distributive arithmetic optimization
m_compiler->fgMorphBlockStmt(m_b1, s1 DEBUGARG(__FUNCTION__), false);
Member

We have had many correctness issues resulting from calls like this into morph. I am not generally inclined to think these calls are worth it.

Contributor Author

Can you give an example correctness issue from the past so I can better understand the problem with calling it like this? I am hoping this is fixable so that in the future we can call fgMorphBlockStmt from more or less anywhere without much thought. More of a utility than a fixed global thing.

The diffs are ~3x better with this. I guess I could make fgOptimizeDistributiveArithemtic public and only call that, same as in #125549 (comment). I'm not a big fan, though, because this basically means we are exposing a "safe" subset of morph to be called from anywhere, which... isn't that what if (fgGlobalMorph) in morph is supposed to be doing?

Member

@EgorBo EgorBo Apr 16, 2026

The main issue with morph (and fgMorphBlockStmt specifically) is that it can delete statements and modify block layout. If the place you call it from is not ready for that, e.g. somewhere in the call chain it walks over blocks and statements using iterators, it's a recipe for a bug. If you see better diffs, you might want to examine them; maybe there are low-hanging fruits we can copy to Lower.
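The hazard described here is the general "mutating a container while someone else is iterating over it" problem. The following C++ sketch is only an analogy built on `std::list`. It does not use the JIT's real statement or block types, and `StmtList`, `shouldDelete` and `walkAndPrune` are hypothetical names.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for a statement list; each int is a statement id.
using StmtList = std::list<int>;

// Hypothetical "optimization" that decides a statement is dead (here: odd ids).
inline bool shouldDelete(int stmt)
{
    return (stmt % 2) != 0;
}

// Safe pattern: the walker itself performs the deletion and continues from the
// iterator returned by erase(), so it never touches an invalidated iterator.
//
// The unsafe variant is a callee that erases the current element behind the
// walker's back; the walker's subsequent ++it would then be undefined behavior.
inline int walkAndPrune(StmtList& stmts)
{
    int removed = 0;
    for (auto it = stmts.begin(); it != stmts.end();)
    {
        if (shouldDelete(*it))
        {
            it = stmts.erase(it); // erase returns the next valid iterator
            ++removed;
        }
        else
        {
            ++it;
        }
    }
    return removed;
}
```

A utility that may delete statements is only safe to call from a walker written like this one. A caller several frames up that still holds an iterator to a removed statement has no way to learn that the statement is gone.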

Contributor Author

What happens is that after optimizeBools, an expression like (A & 4) != 0 || (A & 8) != 0 is turned into ((A & 4) | (A & 8)) != 0, so the control flow is simplified and fgOptimizeDistributiveArithemtic can now deal with it.

I've removed the call to fgMorphBlockStmt and instead call fgOptimizeDistributiveArithemtic directly. This keeps all the diffs and is hopefully considered safe.

Alternatively, optimizeBools could be moved before global morph. But there are probably plenty of reasons to not do that.

@BoyBaykiller BoyBaykiller requested a review from EgorBo April 18, 2026 17:58
Comment thread src/coreclr/jit/morph.cpp Outdated
// Return Value:
// The unchanged tree or optimized tree with oper GT_MUL/GT_OR/GT_AND.
//
GenTree* Compiler::fgOptimizeDistributiveArithemtic(GenTreeOp* tree)
Member

Looks like there's a typo in this name

Contributor Author

Fixed

Comment thread src/coreclr/jit/morph.cpp Outdated

auto isLeftDistributive = [](genTreeOps op1, genTreeOps op2) {
// op1 is left distributive over op2 iff:
// "A op1 (B op2 C)" <==> "(A op1 B) op2 (A op1 C)"
Member

We already do something similar for GT_ADD somewhere, we should unify it with that
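The left-distributivity condition quoted in this thread can be sketched as a small predicate plus a brute-force check. The `Op` enum and all function names are hypothetical; the JIT works with `genTreeOps` values such as `GT_MUL`. The accepted operator pairs are ones that are mathematically valid and that this conversation mentions or implies, so the PR's actual table may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operator enum standing in for genTreeOps.
enum class Op { Add, Mul, And, Or };

// Evaluates "x op y" with wrapping 32-bit arithmetic.
constexpr uint32_t apply(Op op, uint32_t x, uint32_t y)
{
    return op == Op::Add ? x + y
         : op == Op::Mul ? x * y
         : op == Op::And ? (x & y)
                         : (x | y);
}

// outer is left-distributive over inner iff
// "A outer (B inner C)" <==> "(A outer B) inner (A outer C)".
constexpr bool isLeftDistributive(Op outer, Op inner)
{
    return (outer == Op::Mul && inner == Op::Add)
        || (outer == Op::And && (inner == Op::Or || inner == Op::And))
        || (outer == Op::Or && (inner == Op::And || inner == Op::Or));
}

// Brute-force check of the identity over all operands in [0, 16).
inline bool identityHolds(Op outer, Op inner)
{
    for (uint32_t a = 0; a < 16u; ++a)
        for (uint32_t b = 0; b < 16u; ++b)
            for (uint32_t c = 0; c < 16u; ++c)
                if (apply(outer, a, apply(inner, b, c)) !=
                    apply(inner, apply(outer, a, b), apply(outer, a, c)))
                    return false;
    return true;
}

// Asserts that the predicate and the brute-force check agree on every pair.
inline void checkPredicateMatchesBruteForce()
{
    const Op ops[] = {Op::Add, Op::Mul, Op::And, Op::Or};
    for (Op outer : ops)
        for (Op inner : ops)
            assert(isLeftDistributive(outer, inner) == identityHolds(outer, inner));
}
```

Within this four-operator set, the predicate accepts exactly the pairs for which the identity survives the brute-force check. Addition over multiplication, for instance, is rejected because 1 + (1 * 1) differs from (1 + 1) * (1 + 1).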

Comment thread src/coreclr/jit/optimizebools.cpp Outdated
// There may be new opportunities for distributive arithmetic optimization
if (m_foldOp == GT_OR || m_foldOp == GT_AND)
{
cmpOp1 = m_compiler->fgOptimizeDistributiveArithmetic(cmpOp1->AsOp());
Member

that probably should live in gtFoldExpr

@BoyBaykiller
Contributor Author

BoyBaykiller commented Apr 27, 2026

TODO: https://discord.com/channels/143867839282020352/312132327348240384/1498386796919263313
Also look at how this relates to:

/*****************************************************************************
 *
 *  A little helper used to rearrange nested commutative operations. The
 *  effect is that nested associative, commutative operations are transformed
 *  into a 'left-deep' tree, i.e. into something like this:
 *
 *      (((a op b) op c) op d) op...
 */
#if REARRANGE_ADDS
void Compiler::fgMoveOpsLeft(GenTree* tree)
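The 'left-deep' shape described in the quoted comment can be illustrated with a toy expression tree. This is a sketch under stated assumptions, not the logic of `fgMoveOpsLeft`. `Node`, `leaf`, `binary`, `toLeftDeep` and `toString` are hypothetical, the operator is assumed to be associative, and subtrees under a different operator are left untouched.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression node: op is 0 for a leaf, otherwise the operator character.
struct Node
{
    char                  op;
    std::string           name; // only meaningful for leaves
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

using NodePtr = std::unique_ptr<Node>;

inline NodePtr leaf(std::string name)
{
    return NodePtr(new Node{0, std::move(name), nullptr, nullptr});
}

inline NodePtr binary(char op, NodePtr l, NodePtr r)
{
    return NodePtr(new Node{op, std::string(), std::move(l), std::move(r)});
}

// Rotates "a op (b op c)" into "(a op b) op c" until the right child is no
// longer a node with the same operator, then repeats on the left subtree.
inline NodePtr toLeftDeep(NodePtr n)
{
    if (n->op == 0)
    {
        return n; // a leaf is already left-deep
    }
    while (n->right->op == n->op)
    {
        NodePtr r = std::move(n->right);
        n->right  = std::move(r->left); // n becomes (a op b)
        r->left   = std::move(n);       // r becomes ((a op b) op c)
        n         = std::move(r);
    }
    n->left = toLeftDeep(std::move(n->left));
    return n;
}

// Renders the tree with explicit parentheses so its shape is visible.
inline std::string toString(const Node& n)
{
    if (n.op == 0)
    {
        return n.name;
    }
    return "(" + toString(*n.left) + n.op + toString(*n.right) + ")";
}
```

With this sketch, a right-nested chain such as `a+(b+(c+d))` is rewritten to `((a+b)+c)+d`. A canonical shape like this is relevant to the distributive transform because it decides which operands end up as siblings.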

4 participants