JIT: Transform arithmetic using distributive property#126852
JIT: Transform arithmetic using distributive property#126852BoyBaykiller wants to merge 10 commits intodotnet:mainfrom
Conversation
* add OR and AND are 'distributive' over themselves
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
| return tree; | ||
| } | ||
|
|
||
| if (((tree->gtFlags & GTF_PERSISTENT_SIDE_EFFECTS) != 0) || ((tree->gtFlags & GTF_ORDER_SIDEEFF) != 0)) |
There was a problem hiding this comment.
This could use the same optimization you are trying to apply😅... Probably handled by c++ though, so this is purely documentation ;-)
There was a problem hiding this comment.
Lol. A new level of self documenting code. I didn't even notice that, I copied this check from elsewhere. I guess this just goes to show how useful the optimization is. Unfortunately even with this PR it's still not handled : (
There was a problem hiding this comment.
Update: This now gets handled by calling this opt again after the "optimizeBools" phase where it simplifies e.g
(A & 4) != 0 || (A & 8) != 0 into ((A & 4) | (A & 8)) != 0.
|
@EgorBo PTAL. Diffs are rather small, but nonetheless I believe it's a step in the right direction. I've written down some future work (which all improve diffs). I think the most interesting case for real-world code is arround bitwise ops. |
…implification like '(A & 4) != 0 || (A & 8) != 0'
|
Regarding point 1. The issue is that this runs first: runtime/src/coreclr/jit/morph.cpp Lines 10665 to 10690 in e0fb1f9 Perhaps this should be moved to Lower. LLVM also does it in "X86 DAG->DAG Instruction Selection" and not "InstCombine" https://godbolt.org/z/jsPea1PhP |
| @@ -346,6 +346,9 @@ bool OptBoolsDsc::optOptimizeBoolsCondBlock() | |||
|
|
|||
| optOptimizeBoolsUpdateTrees(); | |||
|
|
|||
| // There may be new opportunities for distributive arithmetic optimization | |||
| m_compiler->fgMorphBlockStmt(m_b1, s1 DEBUGARG(__FUNCTION__), false); | |||
There was a problem hiding this comment.
We have had many correctness issues resulting from calls like this into morph. I am not generally inclined to think these calls are worth it.
There was a problem hiding this comment.
Can you give an example correctness issue from the past so I can better understand the problem with calling it like this? I am hoping this is fixable so that in the future we can call fgMorphBlockStmt from more or less anywhere without much thought. More of a utility than a fixed global thing.
The diffs are ~3x better with this. I guess I could make fgOptimizeDistributiveArithemtic public and only call that, same as in #125549 (comment). Not a big fan though, because this basically means we are exposing a "safe" subset of morph to be called from anywhere which... isn't that what if (fgGlobalMorph) in morph is supposed to be doing
There was a problem hiding this comment.
the main issue with morph (and fgMorphBlockStmt specifically) is that it can delete statements and modify block layout. If the place where you call it from is not ready for that - e.g. somwhere in the call chain it walks over blocks and statements using iterators - it's a recipe for a bug. if you see better diffs you might want to examine them and maybe there are low-hanging fruits we can copy to Lower.
There was a problem hiding this comment.
What happens is that after optimizeBools expression like (A & 4) != 0 || (A & 8) != 0 is turned into ((A & 4) | (A & 8)) != 0 so the control flow is simplified and fgOptimizeDistributiveArithemtic can now deal with it.
I've removed the call to fgMorphBlockStmt and instead call fgOptimizeDistributiveArithemtic directly. This has all the diffs but hopefully is considered safe.
Alternatively, optimizeBools could be moved before global morph. But there are probably plenty of reasons to not do that.
e92d4c5 to
8e06971
Compare
…ockStmt * call gtFoldExpr
| // Return Value: | ||
| // The unchanged tree or optimized tree with oper GT_MUL/GT_OR/GT_AND. | ||
| // | ||
| GenTree* Compiler::fgOptimizeDistributiveArithemtic(GenTreeOp* tree) |
There was a problem hiding this comment.
Looks like there's a typo in this name
|
|
||
| auto isLeftDistributive = [](genTreeOps op1, genTreeOps op2) { | ||
| // op1 is left distributive over op2 iff: | ||
| // "A op1 (B op2 C)" <==> "(A op1 B) op2 (A op1 C)" |
There was a problem hiding this comment.
We already do something similar for GT_ADD somewhere, we should unify it with that
| // There may be new opportunities for distributive arithmetic optimization | ||
| if (m_foldOp == GT_OR || m_foldOp == GT_AND) | ||
| { | ||
| cmpOp1 = m_compiler->fgOptimizeDistributiveArithmetic(cmpOp1->AsOp()); |
There was a problem hiding this comment.
that probably should live in gtFoldExpr
|
TODO: https://discord.com/channels/143867839282020352/312132327348240384/1498386796919263313 runtime/src/coreclr/jit/morph.cpp Lines 2720 to 2731 in 14d3e32 |
Generalization of #126070
Basically we weren't doing any simplification based on distributive property before. So transforming
((A op1 B) op2 (A op1 C)) => (A op1 (B op2 C)). And this adds some basic support. Examples:We still need something that changes order to enable this opt. So (A | B) | C becoming A | (B | C) in this case:
Also I have to check for
GTF_ORDER_SIDEEFFto exclude volatile loads but that has false negatives: https://discord.com/channels/143867839282020352/312132327348240384/1487221372899037234