Revert "Move torch pin from the 2.11 to the 2026-04-09 nightly, and drop deprecated CUDA versions from CI" #19160
Conversation
This reverts commit d7f8718.
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19160
Pull request overview
This PR reverts ExecuTorch’s PyTorch dependency back toward the 2.11 release line (away from the 2026-04-09 nightly pin) to reduce CI instability, while updating related CI/config scripts and vendored c10 headers accordingly.
Changes:
- Re-pin PyTorch to 2.11-era versions and switch several install paths from nightly to the PyTorch “test” wheel index.
- Adjust vendored header-only c10 macros/guards (CUDA/ROCm warp size + BFloat16 CUDA guards) to match the reverted pin.
- Update CI workflows and Docker/Windows CUDA cross-compile settings (CUDA matrix expansion; Windows CUDA baseline moved to 12.8).
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| torch_pin.py | Reverts TORCH_VERSION to 2.11.0 and comments out the nightly pin. |
| runtime/core/portable_type/c10/torch/headeronly/util/BFloat16.h | Simplifies CUDA-vs-ROCm guards around __nv_bfloat16 usage. |
| runtime/core/portable_type/c10/torch/headeronly/macros/Macros.h | Updates the ROCm warp-size macro implementation and compiler-version gating logic. |
| install_requirements.py | Switches to the PyTorch "test" index and hard-pins torch/vision/audio versions when "nightly" is enabled. |
| examples/models/moshi/mimi/install_requirements.sh | Pins torchcodec to 0.11.0 and uses the PyTorch "test" index. |
| .github/workflows/docker-builds.yml | Changes the docker build matrix (notably runner sizing behavior for some images). |
| .github/workflows/cuda.yml | Expands the CUDA build matrix to include 12.8 and 12.9. |
| .github/workflows/cuda-windows.yml | Moves the CUDA Windows workflow baseline from 12.6 to 12.8. |
| .ci/scripts/utils.sh | Pins torchaudio/torchvision installs to release branches. |
| .ci/scripts/test_wheel_package_qnn.sh | Changes the torch install to use a "test" index and removes nightly version concatenation. |
| .ci/scripts/test_model_e2e.sh | Pins torchcodec to 0.11.0 and uses the PyTorch "test" index. |
| .ci/docker/common/install_pytorch.sh | Updates torchaudio/torchvision refs to release branches. |
| .ci/docker/common/install_cuda_windows_cross_compile.sh | Updates supported CUDA versions and installer naming for Windows toolkit downloads. |
| .ci/docker/ci_commit_pins/pytorch.txt | Changes the PyTorch pin from a SHA to release/2.11. |
| .ci/docker/build.sh | Updates the CUDA Windows cross-compile Docker build to CUDA 12.8. |
```yaml
matrix:
  runner: [linux.4xlarge]
  docker-image-name: [
    executorch-ubuntu-22.04-gcc11,
    executorch-ubuntu-22.04-gcc9-nopytorch,
```
The default runner for this matrix is linux.4xlarge, and this list now includes executorch-ubuntu-22.04-gcc11. That image builds PyTorch from source during the Docker build (SKIP_PYTORCH is not set for gcc11), which is very memory-intensive and has previously required larger runners to avoid OOM. Consider restoring an include: override to run the gcc11 image on a larger runner (or skipping the PyTorch-from-source build in this image).
```yaml
    executorch-ubuntu-22.04-linter,
    executorch-ubuntu-22.04-arm-sdk,
    executorch-ubuntu-22.04-zephyr-sdk,
    executorch-ubuntu-22.04-qnn-sdk,
    executorch-ubuntu-22.04-mediatek-sdk,
```
executorch-ubuntu-22.04-zephyr-sdk is also on the default linux.4xlarge runner now, but it builds PyTorch from source during the Docker build (SKIP_PYTORCH is unset for zephyr-sdk). This is likely to OOM on 4xlarge and fail the docker-build workflow. Consider restoring a per-image runner override (e.g. linux.12xlarge) for this image, or otherwise avoiding the from-source PyTorch build here.
```bash
# Install torch based on the pinned PyTorch version, preferring the PyTorch test index
"$PIPBIN" install torch=="${TORCH_VERSION}" --extra-index-url "https://download.pytorch.org/whl/test"
```
--extra-index-url https://download.pytorch.org/whl/test does not "prefer" the test index: pip still prefers the primary index (PyPI) when the same version exists there. Other scripts also use the per-platform subindex (e.g. .../whl/test/cpu or .../whl/test/cu126), and .../whl/test may not be a valid simple index for these wheels. If the goal is to force the test index, use --index-url with the correct subindex (optionally adding PyPI as an extra index), or update the comment to match the actual behavior.
```diff
-# Install torch based on the pinned PyTorch version, preferring the PyTorch test index
-"$PIPBIN" install torch=="${TORCH_VERSION}" --extra-index-url "https://download.pytorch.org/whl/test"
+# Install torch from the PyTorch test CPU wheel index, while allowing PyPI for dependencies
+"$PIPBIN" install torch=="${TORCH_VERSION}" --index-url "https://download.pytorch.org/whl/test/cpu" --extra-index-url "https://pypi.org/simple"
```
```diff
@@ -1 +1 @@
-358117c166b75167a09bca81ac9925940feda339
+release/2.11
\ No newline at end of file
```
Pinning to a moving ref (release/2.11) makes builds less reproducible and also conflicts with the existing pin-bump automation (.github/scripts/update_pytorch_pin.py) which writes a commit SHA into this file. Consider pinning to a specific commit hash (or at least a tag) and updating/pausing the automation if the intent is to stay on a stable release branch long-term.
```diff
-release/2.11
+v2.11.0
```
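If the intent is to track the release line while keeping builds reproducible, the pin-update automation could resolve the moving ref to an immutable commit SHA before writing it to the file. A hedged sketch, where the helper name is illustrative and the PyTorch repo URL is only an example target:

```python
# Hedged sketch: resolve a moving git ref (branch or tag) to the commit SHA
# it currently points at, so the pin file can store an immutable SHA.
import subprocess


def resolve_ref(repo: str, ref: str) -> str:
    """Return the commit SHA that `ref` currently points at in `repo`."""
    out = subprocess.run(
        ["git", "ls-remote", repo, ref],
        check=True, capture_output=True, text=True,
    ).stdout
    if not out:
        raise ValueError(f"ref {ref!r} not found in {repo!r}")
    return out.split()[0]  # first column of ls-remote output is the SHA


# Example (requires network access):
# resolve_ref("https://github.com/pytorch/pytorch.git", "refs/tags/v2.11.0")
```

This keeps release/2.11 (or a tag) as the human-facing intent while the pin file itself stays a fixed SHA, compatible with the existing update_pytorch_pin.py flow.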
```diff
 # Setting use_pytorch_nightly to false to test the pinned PyTorch commit. Note
 # that we don't need to set any version number there because they have already
 # been installed on CI before this step, so pip won't reinstall them
-(
-    f"torch=={TORCH_VERSION}.{NIGHTLY_VERSION}"
-    if use_pytorch_nightly
-    else "torch"
-),
+("torch==2.11.0" if use_pytorch_nightly else "torch"),
 ]
```
This hard-codes the torch version string in two places (torch_pin.py and here). That duplication can drift over time and create inconsistent local vs CI installs. Consider importing/reading TORCH_VERSION from torch_pin.py and using it here instead of repeating 2.11.0.
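A minimal sketch of the deduplication this comment suggests. The `TORCH_VERSION` constant here stands in for the value defined in torch_pin.py, and the helper name is illustrative, not the repo's actual API:

```python
# Illustrative sketch: in the real tree this would be
# `from torch_pin import TORCH_VERSION` instead of redefining it here.
TORCH_VERSION = "2.11.0"


def torch_requirement(use_pytorch_nightly: bool) -> str:
    # Install the exact pinned release when testing the pin; otherwise keep
    # whatever torch CI already installed (pip won't reinstall bare "torch").
    return f"torch=={TORCH_VERSION}" if use_pytorch_nightly else "torch"


print(torch_requirement(True))   # torch==2.11.0
print(torch_requirement(False))  # torch
```

With this shape, bumping the pin in torch_pin.py updates both local and CI installs in one place.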
```diff
 print("Installing torch domain libraries")
 DOMAIN_LIBRARIES = [
-    (
-        f"torchvision==0.27.0.{NIGHTLY_VERSION}"
-        if use_pytorch_nightly
-        else "torchvision"
-    ),
-    (
-        f"torchaudio==2.11.0.{NIGHTLY_VERSION}"
-        if use_pytorch_nightly
-        else "torchaudio"
-    ),
+    ("torchvision==0.26.0" if use_pytorch_nightly else "torchvision"),
+    ("torchaudio==2.11.0" if use_pytorch_nightly else "torchaudio"),
 ]
```
These domain library versions are hard-coded here, which can drift from whatever torch/CI pin is intended and makes upgrades easy to miss. Consider centralizing the torchvision/torchaudio pins alongside the torch pin (or reusing the existing pin mechanism) so version bumps are consistent across scripts.
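One way to realize the centralization this comment asks for, as a hedged sketch: the `PINS` table and `requirement` helper are hypothetical names (the values mirror the versions this PR hard-codes), not the repo's existing mechanism.

```python
# Hypothetical single source of truth for release pins; in practice this
# table would live next to torch_pin.py so one bump covers every script.
PINS = {
    "torch": "2.11.0",
    "torchvision": "0.26.0",
    "torchaudio": "2.11.0",
}


def requirement(package: str, pinned: bool) -> str:
    # Exact pin when testing the pinned release, bare name otherwise.
    return f"{package}=={PINS[package]}" if pinned else package


DOMAIN_LIBRARIES = [requirement(p, True) for p in ("torchvision", "torchaudio")]
print(DOMAIN_LIBRARIES)  # ['torchvision==0.26.0', 'torchaudio==2.11.0']
```

Shell scripts could read the same table (e.g. via a tiny `python -c` shim) so the CI and local pins cannot drift apart.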
Reverts #19072.
Too many failures on https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50 (lots of AOTI/CUDA/Metal failures).