Commit be4016e
[X86] Fix logic for optimizing movmsk(bitcast(shuffle(x))); PR67287
Prior logic would remove the shuffle iff all of the elements in `x`
were used. This is incorrect.
The issue is that `movmsk` only cares about the high bits, so if the width
of the elements in `x` is smaller than the width of the elements
for the `movmsk`, then the shuffle, even if it preserves all the elements,
may change which ones supply the high bits.
For example:
`movmsk64(bitcast(shuffle32(x, (1,0,3,2))))`
Even though the shuffle mask `(1,0,3,2)` preserves all the elements, it
flips which elements are relevant to the `movmsk64` (x[1] and x[3]
before, x[0] and x[2] after).
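
The example above can be checked with a small scalar model. This is only an illustrative sketch, not code from the patch: `shuffle32` and `movmsk64` are hypothetical helpers, and the model assumes little-endian lane order, so the sign bit of 64-bit lane `j` comes from 32-bit element `2*j+1`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4i32 = std::array<uint32_t, 4>;

// Hypothetical model of a 4 x i32 shuffle: result[i] = x[mask[i]].
V4i32 shuffle32(const V4i32 &x, const std::array<int, 4> &mask) {
  V4i32 r{};
  for (int i = 0; i < 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 4); // undef entries are not modeled
    r[i] = x[mask[i]];
  }
  return r;
}

// Hypothetical model of movmsk on the vector bitcast to 2 x i64.
// Assumes little-endian lanes: the sign bit of 64-bit lane j is the
// top bit of 32-bit element 2*j+1.
unsigned movmsk64(const V4i32 &x) {
  unsigned m = 0;
  for (int j = 0; j < 2; ++j)
    m |= (x[2 * j + 1] >> 31) << j;
  return m;
}
```

With only `x[1]` having its sign bit set, `movmsk64(x)` has bit 0 set, while `movmsk64(shuffle32(x, {1,0,3,2}))` is zero, so dropping the shuffle would change the result. The mask `{1,1,3,3}` leaves the result unchanged in this model.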
The fix is to ensure that the shuffle mask can be scaled to the
element width of the `movmsk` instruction. This ensures that the
"high" elements stay "high". This is overly conservative, as it
misses cases like `(1,1,3,3)` where the "high" elements stay
intact despite the mask not being scalable, but for a relatively edge-case
optimization that should generally be handled during
simplifyDemandedBits, it seems okay.
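
One way to picture the "can be scaled" condition is the sketch below. It is a hypothetical standalone check, not the helper used in the patch: `scaleMask` and its signature are assumptions made for illustration, and undef (negative) mask entries are simply rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: can a mask over narrow elements be rewritten as a
// mask over elements `scale` times wider? It succeeds only if every
// group of `scale` entries selects one whole, aligned wide element in
// order. On success, `wide` holds the scaled mask.
bool scaleMask(const std::vector<int> &mask, int scale,
               std::vector<int> &wide) {
  assert(scale > 0 && mask.size() % static_cast<std::size_t>(scale) == 0);
  wide.clear();
  for (std::size_t g = 0; g < mask.size(); g += scale) {
    int base = mask[g];
    // The group must start on a wide-element boundary.
    if (base < 0 || base % scale != 0)
      return false;
    // The remaining entries must continue the same wide element.
    for (int k = 1; k < scale; ++k)
      if (mask[g + k] != base + k)
        return false;
    wide.push_back(base / scale);
  }
  return true;
}
```

Under this check, `(2,3,0,1)` with a scale of 2 becomes the wide mask `(1,0)`, while both `(1,0,3,2)` and `(1,1,3,3)` are rejected. The second rejection is the conservative case described above.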
(cherry picked from commit 1684c65)
1 parent 496b174 commit be4016e
File tree
2 files changed, +22 -6 lines changed
- llvm
  - lib/Target/X86
  - test/CodeGen/X86
File 1: one hunk, original lines 48539-48551 become 48539-48566 (+17 -2)
File 2: one hunk, original lines 4458-4470 become 4458-4471 (+5 -4)