
Conversation


@njzjz njzjz commented Dec 3, 2025

CUDA 11 is very old. TensorFlow, PyTorch, and JAX dropped support for CUDA 11 long ago.

Summary by CodeRabbit

  • Chores

    • Removed CUDA 11 support across the build matrix; CUDA 12.8 is now the minimum supported version.
    • Dropped TensorFlow 2.14 support from C library builds; only TensorFlow 2.18 is available.
    • Simplified the build toolchain and updated base image specifications.
  • Documentation

    • Removed CUDA 11.8 installation guides and references.
    • Updated installation documentation to reflect current CUDA and TensorFlow requirements.


@github-actions github-actions bot added the Docs label Dec 3, 2025

codecov bot commented Dec 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.28%. Comparing base (2d5fa3c) to head (3562745).
⚠️ Report is 8 commits behind head on devel.

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #5080   +/-   ##
=======================================
  Coverage   84.28%   84.28%           
=======================================
  Files         709      709           
  Lines       70561    70561           
  Branches     3618     3619    +1     
=======================================
+ Hits        59472    59473    +1     
  Misses       9923     9923           
+ Partials     1166     1165    -1     

☔ View full report in Codecov by Sentry.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@ustc.edu.cn>
@njzjz njzjz marked this pull request as ready for review December 4, 2025 16:51
Copilot AI review requested due to automatic review settings December 4, 2025 16:51

Copilot AI left a comment


Pull request overview

This PR removes support for CUDA 11 pre-built wheels as CUDA 11 is outdated and no longer supported by major frameworks like TensorFlow, PyTorch, and JAX.

  • Removed CUDA 11 dependency groups and wheel build configurations
  • Cleaned up CI/CD workflows to remove CUDA 11 build matrices and setup steps
  • Updated documentation to remove CUDA 11 installation instructions

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
source/install/docker/Dockerfile Removed conditional PyTorch backend selection for CUDA 11 during wheel installation
pyproject.toml Removed cu11 dependency group, updated manylinux image to latest, and cleaned up CUDA 11-specific build configuration
doc/install/install-from-source.md Removed CUDA 11.8/cu118 PaddlePaddle installation instructions and Paddle C++ inference library link
doc/install/install-from-c-library.md Updated to reflect only CUDA 12.2 library availability
doc/install/easy-install.md Removed CUDA 11/11.8 installation tabs for TensorFlow, PyTorch, and PaddlePaddle backends
doc/install/easy-install-dev.md Removed reference to devel_cu11 Docker tag
backend/find_tensorflow.py Removed CUDA 11 version detection logic and TensorFlow 2.14.1 requirement
backend/find_pytorch.py Removed CUDA 11 version detection logic and PyTorch 2.3.1 version pinning
.github/workflows/package_c.yml Removed TensorFlow 2.14 build matrix entry for CUDA 11 C library
.github/workflows/build_wheel.yml Removed CUDA 11.8 build matrix entry and QEMU/setuptools_scm setup steps
.github/workflows/build_cc.yml Removed CUDA 11.8 variant from build matrix and associated CUDA toolkit installation



coderabbitai bot commented Dec 4, 2025

📝 Walkthrough

Systematically removes CUDA 11 support across CI workflows, backend version detection logic, documentation, and build configuration. Consolidates the build matrix to CUDA 12.x and CPU variants, removes conditional CUDA 11 branching from PyTorch/TensorFlow version selectors, and eliminates related build steps (QEMU, UV, AlmaLinux RPM imports).

Changes

Cohort / File(s) Summary
CI Workflow Updates
.github/workflows/build_cc.yml
Removes CUDA variant from build matrix and eliminates CUDA-specific installation step block with conditional guard.
CI Workflow Updates
.github/workflows/build_wheel.yml
Bumps the Linux CUDA version from 12.2 to 12.8, removes the cu11 variant and its CUDA 11 build flow from the build matrix, and removes the QEMU setup and related environment steps.
CI Workflow Updates
.github/workflows/package_c.yml
Removes TensorFlow 2.14 matrix entry, retains only TensorFlow 2.18 configuration for C library builds.
Backend Version Detection
backend/find_pytorch.py, backend/find_tensorflow.py
Removes the CUDA >=11,<12 branching that previously selected PyTorch 2.3.1 and TensorFlow 2.14.1; CUDA 11.x now falls through to the unsupported-version error path.
Installation Documentation
doc/install/easy-install-dev.md
Removes development installation guidance for the CUDA 11.8 devel_cu11 tag.
Installation Documentation
doc/install/easy-install.md
Removes CUDA 11 and CUDA 11.8 instruction blocks under pip install section; retains only CUDA 12 and CPU variants.
Installation Documentation
doc/install/install-from-c-library.md
Updates pre-compiled library description to specify CUDA 12.2 only; removes CUDA 11.8 variant.
Installation Documentation
doc/install/install-from-source.md
Removes Paddle cu118 release/nightly entries and CUDA 11.8 from weekly-build section; adds explicit CPU entry alongside CUDA 12.3.
Build Configuration
pyproject.toml
Removes the cu11 NVIDIA CUDA 11 dependency group, replaces the pinned manylinux image with the generic manylinux_2_28 tag, and removes the AlmaLinux RPM import and UV-related installation/build steps.
Docker Build
source/install/docker/Dockerfile
Removes the CUDA_VERSION=11 conditional workaround for UV_TORCH_BACKEND=cu118 and adds a python -m deepmd -h command to the final RUN sequence.
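For context, a minimal shell sketch of the simplified install-and-check sequence described above for the Dockerfile; the package name and extras are quoted from this thread, and the actual Dockerfile may differ:

```bash
# Install the wheel with the extras discussed in this PR, then smoke-test the CLI entry point.
uv pip install --system "deepmd-kit[gpu,cu12,lmp,ipi,torch]"
python -m deepmd -h
```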

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Backend files (find_pytorch.py, find_tensorflow.py) require verification that removed CUDA 11.x branches correctly flow to unsupported error paths and don't break existing fallback logic
  • Multiple workflow files follow consistent removal pattern but span different build contexts (CUDA versions, platform handling, matrix configurations)
  • Configuration changes (pyproject.toml) modify both optional dependencies and build image specifications; verify manylinux tag compatibility and that UV removal doesn't break downstream build environments

Possibly related PRs

  • PR #4228 — Modifies backend/find_pytorch.py and backend/find_tensorflow.py for CUDA version handling and PyTorch/TensorFlow selection logic
  • PR #4841 — Modifies Dockerfile CUDA conditional logic and UV torch backend environment setup
  • PR #4887 — Modifies installation documentation (easy-install.md, install-from-source.md) and Paddle installation sections

Suggested labels

breaking change, Python

Suggested reviewers

  • iProzd
  • wanghan-iapcm

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'breaking: stop providing CUDA 11 pre-built wheels' directly and clearly summarizes the main change across all modified files, which consistently remove CUDA 11 support from build workflows, documentation, and dependency configurations.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d5fa3c and 3562745.

📒 Files selected for processing (11)
  • .github/workflows/build_cc.yml (0 hunks)
  • .github/workflows/build_wheel.yml (1 hunks)
  • .github/workflows/package_c.yml (0 hunks)
  • backend/find_pytorch.py (0 hunks)
  • backend/find_tensorflow.py (0 hunks)
  • doc/install/easy-install-dev.md (0 hunks)
  • doc/install/easy-install.md (0 hunks)
  • doc/install/install-from-c-library.md (1 hunks)
  • doc/install/install-from-source.md (1 hunks)
  • pyproject.toml (1 hunks)
  • source/install/docker/Dockerfile (1 hunks)
💤 Files with no reviewable changes (6)
  • .github/workflows/package_c.yml
  • doc/install/easy-install-dev.md
  • .github/workflows/build_cc.yml
  • doc/install/easy-install.md
  • backend/find_pytorch.py
  • backend/find_tensorflow.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-14T07:11:51.357Z
Learnt from: njzjz
Repo: deepmodeling/deepmd-kit PR: 4884
File: .github/workflows/test_cuda.yml:46-46
Timestamp: 2025-08-14T07:11:51.357Z
Learning: As of PyTorch 2.8 (August 2025), the default wheel on PyPI installed by `pip install torch` is CPU-only. CUDA-enabled wheels are available on PyPI for Linux x86 and Windows x86 platforms, but require explicit specification via index URLs or variant-aware installers. For CUDA support, use `--index-url https://download.pytorch.org/whl/cu126` (or appropriate CUDA version).

Applied to files:

  • doc/install/install-from-source.md
  • pyproject.toml
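For illustration, a hedged example of the explicit index-URL installation this learning describes (cu126 is the index quoted above; substitute the CUDA version you target):

```bash
# Default behavior as of PyTorch 2.8: CPU-only wheel from PyPI.
pip install torch
# CUDA-enabled wheel via an explicit index URL (example: cu126).
pip install torch --index-url https://download.pytorch.org/whl/cu126
```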
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL analysis (python)
  • GitHub Check: Agent
🔇 Additional comments (8)
doc/install/install-from-c-library.md (1)

15-15: LGTM!

Documentation correctly updated to reflect CUDA 12.2 as the sole pre-compiled C library variant. Aligns with the PR's objective to drop CUDA 11 support.

doc/install/install-from-source.md (2)

352-356: Update to Paddle weekly-build section is clear and correct.

Documentation now states only CUDA 12.3 and CPU variants, with updated links reflecting the removal of CUDA 11 support. Header text clearly indicates "CUDA 12.3/CPU".


101-112: LGTM with minor clarification needed.

Paddle installation instructions correctly remove the CUDA 11.8 (cu118) variants and retain only the CUDA 12.6 (cu126) and CPU variants, in line with the CUDA 11 deprecation. Verification confirms that the stable package index URLs are currently active and accessible:

  • https://www.paddlepaddle.org.cn/packages/stable/cu126/ provides paddlepaddle-gpu==3.1.1
  • https://www.paddlepaddle.org.cn/packages/stable/cpu/ provides paddlepaddle CPU variants

However, the original review contains inconsistencies: it mentions verifying "lines 354-356" while the snippet is at lines 101-112, and requests verification of "weekly-build links for CUDA 12.3" that do not appear in the provided code snippet. Clarify whether additional sections of the file require verification.
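As a hedged illustration of the commands these index URLs support (version and URLs as reported in the verification above; confirm against the installation docs before relying on them):

```bash
# GPU build from the stable cu126 index.
pip install paddlepaddle-gpu==3.1.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
# CPU build from the stable CPU index.
pip install paddlepaddle -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
```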

.github/workflows/build_wheel.yml (2)

32-32: LGTM!

CUDA version bump from 12.2 to 12.8 for Linux x86_64 builds is a reasonable minor version upgrade and should be compatible with the ecosystem. No compatibility concerns for this update.
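A small, hedged sanity check one might run on a build host to confirm the installed toolkit matches the new baseline (assumes the CUDA toolkit is on PATH):

```bash
# Print the installed CUDA toolkit release; expect "release 12.8" after this change.
nvcc --version | grep -i release
```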


113-122: The Docker build matrix correctly matches CUDA 12.8 wheels.

The pattern cibw-*-manylinux_x86_64-cu12* will successfully match artifacts tagged cu12.8 because the glob * wildcard matches any sequence of characters following cu12. This is standard Unix glob behavior and works as expected.
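A quick shell illustration of that glob behavior (artifact names here are hypothetical):

```bash
for name in cibw-0-manylinux_x86_64-cu12.8 cibw-1-manylinux_x86_64-cu11.8; do
  case "$name" in
    cibw-*-manylinux_x86_64-cu12*) echo "$name: matched" ;;
    *) echo "$name: not matched" ;;
  esac
done
# Expected output: the cu12.8 artifact matches; the cu11.8 artifact does not.
```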

pyproject.toml (2)

248-249: LGTM!

Updating manylinux-x86_64-image from a pinned, timestamped version to the generic "manylinux_2_28" tag improves maintainability and reduces coupling to a specific image build. Both x86_64 and aarch64 now use a consistent manylinux_2_28 baseline.
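As a hedged aside, one way to confirm that wheels built on the generic image still meet the intended baseline (assumes auditwheel is installed and a built wheel exists under dist/):

```bash
# Report the manylinux policy a built wheel conforms to; expect manylinux_2_28 here.
auditwheel show dist/deepmd_kit-*.whl
```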


120-129: Verify extras coverage for Dockerfile dependencies.

The cu12 extra is defined in pyproject.toml (lines 120-129), but the Dockerfile attempts to install with extras [gpu,cu12,lmp,ipi,torch]. Only cu12, lmp, and ipi from this group are actually defined as valid optional-dependencies. The gpu and torch extras do not exist in [tool.deepmd_build_backend.optional-dependencies].

To resolve the Docker build failure flagged in the Dockerfile review, either:

  1. Add missing torch and/or gpu extras to [tool.deepmd_build_backend.optional-dependencies], or
  2. Update the Dockerfile to remove invalid extras.
source/install/docker/Dockerfile (1)

11-15: No action required; review comment is incorrect.

The extras specified in the uv pip install command [gpu,cu${CUDA_VERSION},lmp,ipi,torch] are all valid optional-dependencies. According to deepmd-kit's official documentation and pyproject.toml configuration, gpu, torch, cu12, lmp, and ipi are all defined as valid optional-dependencies in [tool.deepmd_build_backend.optional-dependencies]. The Docker build will succeed without modification.

Likely an incorrect or invalid review comment.



@njzjz njzjz requested a review from wanghan-iapcm December 5, 2025 02:14
@njzjz njzjz added this pull request to the merge queue Dec 12, 2025
Merged via the queue into deepmodeling:devel with commit 09486c5 Dec 12, 2025
62 checks passed
@njzjz njzjz deleted the drop-cuda11 branch December 12, 2025 16:51