@paulwalker-arm
Collaborator

The LangRef currently defines out-of-range stepvector values as poison. This property is at odds with both the expansion used for fixed-length vectors and the equivalent ISD node, both of which implicitly truncate out-of-range values.

NOTE: To keep the PR mostly NFC, I would like to defer the following extensions to separate PRs.

  1. The new definition means the "8-bit" restriction can be lifted
    because that only existed due to problematic cases like
    <vscale x n x i1> stepvector(), which under the old definition is
    mostly poison. Deferring because I'm unsure of the code generation
    support for smaller types; as a minimum we're missing test coverage.

  2. The instcombine can fire in many more cases, and the current constant
    handling can be a simplification rather than a combine.
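
To make the change in definition concrete, here is a minimal sketch of the old and new per-lane semantics in plain C++. It is a model only, not LLVM code: the helper names `maxElementValue`, `stepvectorLaneOld` and `stepvectorLaneNew` are hypothetical, poison is modelled as an empty `std::optional`, and element widths are assumed to be between 1 and 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): the largest value an unsigned
// element of width `bits` can hold.
constexpr uint64_t maxElementValue(unsigned bits) {
  return bits >= 64 ? UINT64_MAX : ((uint64_t{1} << bits) - 1);
}

// Old LangRef wording: a lane whose sequence value does not fit in the
// element type is poison, modelled here as std::nullopt.
std::optional<uint64_t> stepvectorLaneOld(uint64_t lane, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && "model only covers 1..64-bit elements");
  if (lane > maxElementValue(bits))
    return std::nullopt;
  return lane;
}

// New LangRef wording: the sequence value is truncated to the element
// width, so the sequence wraps around instead of becoming poison.
uint64_t stepvectorLaneNew(uint64_t lane, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && "model only covers 1..64-bit elements");
  return lane & maxElementValue(bits);
}
```

Under this model, lane 256 of an i8 stepvector was poison before and is 0 now, and the first four lanes of an i1 stepvector are 0, 1, 0, 1.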

@llvmbot added the llvm:instcombine, llvm:ir and llvm:transforms labels Dec 24, 2025
@llvmbot
Member

llvmbot commented Dec 24, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: Paul Walker (paulwalker-arm)


Full diff: https://github.com/llvm/llvm-project/pull/173494.diff

3 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+1-1)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vscale_extractelement.ll (+5-2)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index d99280f05e73f..5b462b87acb0f 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -20828,7 +20828,7 @@ of integers whose elements contain a linear sequence of values starting from 0
 with a step of 1. This intrinsic can only be used for vectors with integer
 elements that are at least 8 bits in size. If the sequence value exceeds
 the allowed limit for the element type then the result for that lane is
-a poison value.
+truncated.
 
 These intrinsics work for both fixed and scalable vectors. While this intrinsic
 supports all vector types, the recommended way to express this operation for
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp b/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
index 98e2d9ebe4fc2..f5e8d341e3493 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
@@ -448,7 +448,7 @@ Instruction *InstCombinerImpl::visitExtractElementInst(ExtractElementInst &EI) {
         if (IndexC->getValue().getActiveBits() <= BitWidth)
           Idx = ConstantInt::get(Ty, IndexC->getValue().zextOrTrunc(BitWidth));
         else
-          Idx = PoisonValue::get(Ty);
+          return nullptr;
         return replaceInstUsesWith(EI, Idx);
       }
     }
diff --git a/llvm/test/Transforms/InstCombine/vscale_extractelement.ll b/llvm/test/Transforms/InstCombine/vscale_extractelement.ll
index 9ac8a92abb689..ec58307a253d1 100644
--- a/llvm/test/Transforms/InstCombine/vscale_extractelement.ll
+++ b/llvm/test/Transforms/InstCombine/vscale_extractelement.ll
@@ -214,12 +214,15 @@ entry:
   ret i64 %1
 }
 
-; Check that poison is returned when the extracted element has wrapped.
+; TODO: stepvector now wraps rather than poisons elements when the value does
+; not fit, so this should return 0.
 
 define i8 @ext_lane256_from_stepvec() {
 ; CHECK-LABEL: @ext_lane256_from_stepvec(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    ret i8 poison
+; CHECK-NEXT:    [[TMP0:%.*]] = call <vscale x 512 x i8> @llvm.stepvector.nxv512i8()
+; CHECK-NEXT:    [[TMP1:%.*]] = extractelement <vscale x 512 x i8> [[TMP0]], i64 256
+; CHECK-NEXT:    ret i8 [[TMP1]]
 ;
 entry:
   %0 = call <vscale x 512 x i8> @llvm.stepvector.nxv512i8()
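
The decision made by the patched branch in the InstCombine hunk above can be sketched outside LLVM as follows. This is a simplified model under stated assumptions: `activeBits` stands in for what `APInt::getActiveBits()` reports, `foldExtractFromStepvector` is a hypothetical name, "no fold" (the `return nullptr` path) is modelled as an empty `std::optional`, and any other preconditions the real code checks are ignored.

```cpp
#include <cstdint>
#include <optional>

// Number of bits needed to represent `value` (0 when value == 0).
// Stand-in for the quantity APInt::getActiveBits() reports.
unsigned activeBits(uint64_t value) {
  unsigned n = 0;
  while (value != 0) {
    ++n;
    value >>= 1;
  }
  return n;
}

// Hypothetical model of the patched fold for a constant index:
// - if the index fits in the element type, the extract folds to the index;
// - otherwise the combine now bails out instead of producing poison.
std::optional<uint64_t> foldExtractFromStepvector(uint64_t index,
                                                  unsigned elementBits) {
  if (activeBits(index) <= elementBits)
    return index;
  return std::nullopt; // no fold: the extractelement is left in place
}
```

For the test case in the diff, index 256 needs 9 active bits, which does not fit in i8, so the model reports no fold. That matches the updated CHECK lines, where the stepvector call and the extractelement are kept; per the TODO, a later change should fold this case to 0.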

unsigned BitWidth = Ty->getIntegerBitWidth();
Value *Idx;
// Return index when its value does not exceed the allowed limit
// for the element type of the vector, otherwise return undefined.
Contributor

This comment is outdated now.

Why not drop the if entirely here instead of keeping the nullptr return?

Collaborator Author

Why not drop the if entirely here instead of keeping the nullptr return?

That's point two in the message. This whole functionality belongs in simplifyExtractElementInst, so I figured it's better to do the work once in a separate PR?

Contributor

I'm fine with that. Maybe worth noting that InstCombine could handle the generalized case of a variable index extract (where a zext/trunc instruction may have to be generated).
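
A rough sketch of that generalized case, again as a plain C++ model rather than LLVM code: with truncating semantics, extracting lane `index` from a stepvector yields the index converted to the element width, so the only question is which cast is needed. The names `IndexCast`, `castNeeded`, `lowBits` and `extractLane` are hypothetical, and the model assumes the index is in bounds for the vector (an out-of-bounds extractelement is poison regardless).

```cpp
#include <cstdint>

// Which conversion would be needed to turn the index into the element type.
enum class IndexCast { None, ZExt, Trunc };

IndexCast castNeeded(unsigned indexBits, unsigned elementBits) {
  if (indexBits < elementBits)
    return IndexCast::ZExt;
  if (indexBits > elementBits)
    return IndexCast::Trunc;
  return IndexCast::None;
}

// Keep only the low `bits` bits of `value`.
uint64_t lowBits(uint64_t value, unsigned bits) {
  return bits >= 64 ? value : value & ((uint64_t{1} << bits) - 1);
}

// Value of extractelement(stepvector, index) under truncating semantics,
// assuming `index` is an in-bounds unsigned value of width `indexBits`.
uint64_t extractLane(uint64_t index, unsigned indexBits,
                     unsigned elementBits) {
  uint64_t unsignedIndex = lowBits(index, indexBits);
  return lowBits(unsignedIndex, elementBits);
}
```

For an i64 index into an i8 stepvector the model picks a trunc, and lane 300 evaluates to 44; for an i32 index into an i64 stepvector it picks a zext and the value is unchanged.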

Contributor

@nikic nikic left a comment

LGTM

Collaborator

@topperc topperc left a comment

LGTM. The RISC-V instruction vid.v that we use for stepvector has the truncating behavior.

Contributor

@fhahn fhahn left a comment

LGTM thanks

@paulwalker-arm paulwalker-arm merged commit 3da3934 into llvm:main Dec 25, 2025
11 checks passed
@paulwalker-arm paulwalker-arm deleted the change-step-vector-def branch December 25, 2025 10:44
@llvm-ci
Collaborator

llvm-ci commented Dec 25, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64le-linux-test-suite running on ppc64le-clang-test-suite while building llvm at step 4 "cmake-configure".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/19539

Here is the relevant piece of the build log for the reference
Step 4 (cmake-configure) failure: cmake (failure) (timed out)

@llvm-ci
Collaborator

llvm-ci commented Dec 25, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building llvm at step 3 "clean-build-dir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/43267

Here is the relevant piece of the build log for the reference
Step 3 (clean-build-dir) failure: Delete failed. (failure) (timed out)
Step 4 (cmake-configure) failure: cmake (failure) (timed out)
command timed out: 1200 seconds without output running [b'cmake', b'-DLLVM_TARGETS_TO_BUILD=PowerPC', b'-DLLVM_INSTALL_UTILS=ON', b'-DCMAKE_CXX_STANDARD=17', b'-DLLVM_LIT_ARGS=-vj 256', b'-DFLANG_ENABLE_WERROR=ON', b'-DLLVM_ENABLE_ASSERTIONS=ON', b'-DCMAKE_C_COMPILER_LAUNCHER=ccache', b'-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', b'-DLLVM_ENABLE_PROJECTS=flang;llvm;mlir;clang', b'-DLLVM_ENABLE_RUNTIMES=flang-rt;openmp', b'-DCMAKE_BUILD_TYPE=Release', b'-GNinja', b'../llvm-project/llvm'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1200.342612

@llvm-ci
Collaborator

llvm-ci commented Dec 25, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-mlir-rhel-clang running on ppc64le-mlir-rhel-test while building llvm at step 3 "clean-build-dir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/35525

Here is the relevant piece of the build log for the reference
Step 3 (clean-build-dir) failure: Delete failed. (failure) (timed out)
Step 4 (cmake-configure) failure: cmake (failure) (timed out)
command timed out: 1200 seconds without output running [b'cmake', b'-DLLVM_TARGETS_TO_BUILD=PowerPC', b'-DLLVM_INSTALL_UTILS=ON', b'-DCMAKE_CXX_STANDARD=17', b'-DLLVM_ENABLE_PROJECTS=mlir', b'-DLLVM_LIT_ARGS=-vj 256', b'-DCMAKE_C_COMPILER_LAUNCHER=ccache', b'-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', b'-DCMAKE_BUILD_TYPE=Release', b'-DLLVM_ENABLE_ASSERTIONS=ON', b'-GNinja', b'../llvm-project/llvm'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1200.314314

@llvm-ci
Collaborator

llvm-ci commented Dec 25, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-android running on sanitizer-buildbot-android while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/14985

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[       OK ] AddressSanitizer.AtoiAndFriendsOOBTest (2261 ms)
[ RUN      ] AddressSanitizer.HasFeatureAddressSanitizerTest
[       OK ] AddressSanitizer.HasFeatureAddressSanitizerTest (0 ms)
[ RUN      ] AddressSanitizer.CallocReturnsZeroMem
[       OK ] AddressSanitizer.CallocReturnsZeroMem (13 ms)
[ DISABLED ] AddressSanitizer.DISABLED_TSDTest
[ RUN      ] AddressSanitizer.IgnoreTest
[       OK ] AddressSanitizer.IgnoreTest (0 ms)
[ RUN      ] AddressSanitizer.SignalTest
[       OK ] AddressSanitizer.SignalTest (199 ms)
[ RUN      ] AddressSanitizer.ReallocTest
[       OK ] AddressSanitizer.ReallocTest (38 ms)
[ RUN      ] AddressSanitizer.WrongFreeTest
[       OK ] AddressSanitizer.WrongFreeTest (111 ms)
[ RUN      ] AddressSanitizer.LongJmpTest
[       OK ] AddressSanitizer.LongJmpTest (0 ms)
[ RUN      ] AddressSanitizer.ThreadStackReuseTest
[       OK ] AddressSanitizer.ThreadStackReuseTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_MemIntrinsicUnalignedAccessTest
[ DISABLED ] AddressSanitizer.DISABLED_LargeFunctionSymbolizeTest
[ DISABLED ] AddressSanitizer.DISABLED_MallocFreeUnwindAndSymbolizeTest
[ RUN      ] AddressSanitizer.UseThenFreeThenUseTest
[       OK ] AddressSanitizer.UseThenFreeThenUseTest (126 ms)
[ RUN      ] AddressSanitizer.FileNameInGlobalReportTest
[       OK ] AddressSanitizer.FileNameInGlobalReportTest (123 ms)
[ DISABLED ] AddressSanitizer.DISABLED_StressStackReuseAndExceptionsTest
[ RUN      ] AddressSanitizer.MlockTest
[       OK ] AddressSanitizer.MlockTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadedTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowIn
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowLeft
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowRight
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFHigh
[ DISABLED ] AddressSanitizer.DISABLED_DemoOOM
[ DISABLED ] AddressSanitizer.DISABLED_DemoDoubleFreeTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoNullDerefTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoFunctionStaticTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoTooMuchMemoryTest
[ RUN      ] AddressSanitizer.LongDoubleNegativeTest
[       OK ] AddressSanitizer.LongDoubleNegativeTest (0 ms)
[----------] 19 tests from AddressSanitizer (27864 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 2 test suites ran. (27873 ms total)
[  PASSED  ] 22 tests.

  YOU HAVE 1 DISABLED TEST

Step 24 (run instrumented asan tests [aarch64/aosp_coral-userdebug/AOSP.MASTER]) failure: run instrumented asan tests [aarch64/aosp_coral-userdebug/AOSP.MASTER] (failure)
...
[ RUN      ] AddressSanitizer.HasFeatureAddressSanitizerTest
[       OK ] AddressSanitizer.HasFeatureAddressSanitizerTest (0 ms)
[ RUN      ] AddressSanitizer.CallocReturnsZeroMem
[       OK ] AddressSanitizer.CallocReturnsZeroMem (8 ms)
[ DISABLED ] AddressSanitizer.DISABLED_TSDTest
[ RUN      ] AddressSanitizer.IgnoreTest
[       OK ] AddressSanitizer.IgnoreTest (0 ms)
[ RUN      ] AddressSanitizer.SignalTest
[       OK ] AddressSanitizer.SignalTest (315 ms)
[ RUN      ] AddressSanitizer.ReallocTest
[       OK ] AddressSanitizer.ReallocTest (20 ms)
[ RUN      ] AddressSanitizer.WrongFreeTest
[       OK ] AddressSanitizer.WrongFreeTest (253 ms)
[ RUN      ] AddressSanitizer.LongJmpTest
[       OK ] AddressSanitizer.LongJmpTest (0 ms)
[ RUN      ] AddressSanitizer.ThreadStackReuseTest
[       OK ] AddressSanitizer.ThreadStackReuseTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_MemIntrinsicUnalignedAccessTest
[ DISABLED ] AddressSanitizer.DISABLED_LargeFunctionSymbolizeTest
[ DISABLED ] AddressSanitizer.DISABLED_MallocFreeUnwindAndSymbolizeTest
[ RUN      ] AddressSanitizer.UseThenFreeThenUseTest
[       OK ] AddressSanitizer.UseThenFreeThenUseTest (314 ms)
[ RUN      ] AddressSanitizer.FileNameInGlobalReportTest
[       OK ] AddressSanitizer.FileNameInGlobalReportTest (280 ms)
[ DISABLED ] AddressSanitizer.DISABLED_StressStackReuseAndExceptionsTest
[ RUN      ] AddressSanitizer.MlockTest
[       OK ] AddressSanitizer.MlockTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadedTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowIn
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowLeft
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowRight
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFHigh
[ DISABLED ] AddressSanitizer.DISABLED_DemoOOM
[ DISABLED ] AddressSanitizer.DISABLED_DemoDoubleFreeTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoNullDerefTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoFunctionStaticTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoTooMuchMemoryTest
[ RUN      ] AddressSanitizer.LongDoubleNegativeTest
[       OK ] AddressSanitizer.LongDoubleNegativeTest (0 ms)
[----------] 19 tests from AddressSanitizer (71698 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 2 test suites ran. (71702 ms total)
[  PASSED  ] 22 tests.

  YOU HAVE 1 DISABLED TEST

Serial 17031FQCB00176

nikic added a commit that referenced this pull request Dec 25, 2025
LV can create step vectors that wrap around, e.g. `step-vector i1` with
VF>2. Allow truncation when creating the vector constant to avoid an
assertion failure with #171456.

After #173494 the definition of
the llvm.stepvector intrinsic has been changed to make it have wrapping
semantics, so the semantics for the fixed and scalable case match now.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Dec 25, 2025
…#173229)

LV can create step vectors that wrap around, e.g. `step-vector i1` with
VF>2. Allow truncation when creating the vector constant to avoid an
assertion failure with llvm/llvm-project#171456.

After llvm/llvm-project#173494 the definition of
the llvm.stepvector intrinsic has been changed to make it have wrapping
semantics, so the semantics for the fixed and scalable case match now.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jan 6, 2026
…cated. (llvm#173494)

The LangRef currently defines out-of-range stepvector values as poison.
This property is at odds with both the expansion used for fixed-length
vectors and the equivalent ISD node, both of which implicitly truncate
out-of-range values.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jan 6, 2026
LV can create step vectors that wrap around, e.g. `step-vector i1` with
VF>2. Allow truncation when creating the vector constant to avoid an
assertion failure with llvm#171456.

After llvm#173494 the definition of
the llvm.stepvector intrinsic has been changed to make it have wrapping
semantics, so the semantics for the fixed and scalable case match now.