
Conversation

@amdfaa (Contributor) commented Nov 5, 2025

Purpose

This PR fixes the AMD MI300: Language Models Tests (Hybrid) %N job, which was failing with uv: command not found.
See the failing build:
https://buildkite.com/vllm/ci/builds/37631/steps/canvas?sid=019a5264-347a-407b-8a34-6d54a2f8e82f

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
@mergify bot added the ci/build and rocm (Related to AMD ROCm) labels Nov 5, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request fixes a CI failure in the 'AMD MI300: Language Models Tests' job by installing uv before it is used. The change is correct and directly addresses the uv: command not found error. I have one suggestion to pin the version of uv to ensure build reproducibility and prevent potential breakages from future uv releases. This is a standard practice for maintaining stable CI environments.

commands:
  # Install fast path packages for testing against transformers
  # Note: also needed to run plamo2 model in vLLM
  - pip install uv
Severity: high

For better reproducibility and to prevent unexpected build failures due to updates in uv, it's recommended to pin the version of uv being installed. This ensures that the CI environment remains stable and predictable over time. Please pin it to a recent, stable version that is known to work.

    - pip install uv==0.2.21

@gshtras added the ready (ONLY add when PR is ready to merge/full CI is needed) label Nov 5, 2025
Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
@zhewenl (Collaborator) commented Nov 5, 2025

Let's fix it in test-amd.yaml? Also, I think installing uv in the Docker image would be more appropriate.

cc @Alexei-V-Ivanov-AMD @gshtras

@amdfaa (Contributor, Author) commented Nov 5, 2025

> Let's fix it in test-amd.yaml? Also, I think installing uv in the Docker image would be more appropriate.
>
> cc @Alexei-V-Ivanov-AMD @gshtras

That makes sense. I'll add it to the requirements file and remove the current changes. @zhewenl
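For context, a minimal sketch of what adding uv to a test dependencies file might look like; the file path below (requirements/rocm-test.txt) is an assumption for illustration and may not match the file actually touched in this PR.

    # requirements/rocm-test.txt (hypothetical path; the actual file may differ)
    # uv is needed so CI test steps can invoke `uv pip install ...`
    uv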

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Removed specific version pinning for uv package.

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
@zhewenl (Collaborator) commented Nov 5, 2025

> That makes sense. I'll add it to the requirements file and remove the current changes. @zhewenl

I am not sure adding it to the deps list is sufficient - maybe trigger a test on AMD CI to check whether the uv issue is gone?
Here is how Nvidia sets up uv in Docker: https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L90-L91, and I had a draft PR as an example: #27716
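For reference, a minimal sketch of the Docker-image approach being discussed, assuming a pip-based install in an image that already has python3 and pip; this is illustrative only and not the exact content of the linked Dockerfile lines or the draft PR.

    # Dockerfile sketch (illustrative; not the exact lines from docker/Dockerfile)
    # Install uv once at image-build time so later layers and test steps can use it
    RUN python3 -m pip install --no-cache-dir uv
    # Subsequent steps can then run e.g. `uv pip install -r requirements/test.txt`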

@amdfaa (Contributor, Author) commented Nov 6, 2025

> That makes sense. I'll add it to the requirements file and remove the current changes. @zhewenl
>
> I am not sure adding it to the deps list is sufficient - maybe trigger a test on AMD CI to check whether the uv issue is gone? Here is how Nvidia sets up uv in Docker: https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L90-L91, and I had a draft PR as an example: #27716

This is a recent run based on my commit adding the package to the dependencies, and it looks like it worked. Can you confirm and merge if you think it's good, @zhewenl?

https://buildkite.com/vllm/ci/builds/37752/steps/canvas?sid=019a55dd-7925-4543-bff0-be282fbbbec0

@zhewenl (Collaborator) commented Nov 7, 2025

@amdfaa I think you need to trigger a test in AMD CI; the test you linked is running on Nvidia. Example: #28156

cc @Alexei-V-Ivanov-AMD

@amdfaa amdfaa requested a review from tjtanaa as a code owner November 10, 2025 19:25
amdfaa and others added 3 commits November 10, 2025 14:27
Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) November 13, 2025 02:58
@DarkLight1337 DarkLight1337 merged commit a7791ea into vllm-project:main Nov 13, 2025
31 checks passed
@gshtras (Collaborator) commented Nov 13, 2025

A bit late, but rather than changing the build process for the production image in order to fix the tests (potentially locking the run scenarios to the root user), isn't it better to address the test issue in the test itself?
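One way to read this suggestion is to keep the fix in the test step rather than in the production image, for example by guarding the install. A minimal sketch, assuming a Buildkite-style command list in test-amd.yaml (the exact step and file are assumptions):

    # test step sketch: install uv only if it is not already on PATH
    - command -v uv >/dev/null 2>&1 || pip install uv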

geodavic pushed a commit to geodavic/vllm that referenced this pull request (vllm-project#28142) Nov 16, 2025

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: zhewenli <zhewenli@meta.com>
Signed-off-by: George D. Torres <gdavtor@gmail.com>
bwasti pushed a commit to bwasti/vllm that referenced this pull request (vllm-project#28142) Nov 17, 2025

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: zhewenli <zhewenli@meta.com>
Signed-off-by: Bram Wasti <bwasti@meta.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request (vllm-project#28142) Nov 29, 2025

Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: zhewenli <zhewenli@meta.com>
