Collaborator

@zhewenl zhewenl commented Nov 5, 2025

Purpose

See more details in #27442. Encoder-decoder models (Whisper, T5, BART, vision-language models) fail on AMD ROCm with NotImplementedError because all ROCm-specific attention backends only support decoder-only models.

As a result, every test that uses these models fails on AMD CI (example), mainly in Entrypoints Integration Test (API Server) and Entrypoints Integration Test (Pooling):

NotImplementedError: Encoder self-attention and encoder/decoder cross-attention
are not implemented for TritonAttentionImpl
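
To illustrate the failure mode, here is a minimal stand-alone sketch of a backend that rejects anything other than decoder self-attention. The class and enum names (`AttnType`, `DecoderOnlyAttentionImpl`) are hypothetical stand-ins invented for this example and are not the actual vLLM classes; only the wording of the error message is taken from the log above.

```python
from enum import Enum


class AttnType(Enum):
    """Stand-in for the attention kinds named in the error message."""

    DECODER = "decoder"
    ENCODER = "encoder"
    ENCODER_DECODER = "encoder_decoder"  # encoder/decoder cross-attention


class DecoderOnlyAttentionImpl:
    """Hypothetical backend that only handles decoder self-attention."""

    def __init__(self, attn_type: AttnType = AttnType.DECODER) -> None:
        # Any non-decoder attention type is rejected at construction time,
        # which is why a model with an encoder fails before running.
        if attn_type is not AttnType.DECODER:
            raise NotImplementedError(
                "Encoder self-attention and encoder/decoder cross-attention "
                f"are not implemented for {type(self).__name__}"
            )
        self.attn_type = attn_type
```

Under this assumption, a decoder-only model constructs its attention layers without issue, while any model that needs encoder or cross-attention layers raises as soon as those layers are built.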

In this PR:

  • Tests that contain BOTH encoder and decoder models are marked with @pytest.mark.encoder_decoder on the specific tests/variants; some tests that were not using parametrized fixtures are also fixed.
  • Tests that contain ONLY encoder models are skipped entirely at the CI level in test-amd.yaml.
  • Tests that cannot easily be skipped or given a skip decorator are patched ad hoc with simple logic that skips them on the ROCm platform.
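
One way the marker-based approach could be wired up is sketched below as a `conftest.py`. This is an illustration under stated assumptions, not the code from this PR: the helpers `is_rocm` and `should_skip` are hypothetical names, and ROCm is detected here by checking `torch.version.hip`, which may differ from how the project detects the platform. Only the marker name `encoder_decoder` comes from the description above.

```python
# conftest.py (illustrative sketch, not the code from this PR)

ENC_DEC_MARKER = "encoder_decoder"  # marker name from the PR description


def is_rocm() -> bool:
    """Hypothetical helper: True when the installed PyTorch is a ROCm/HIP build."""
    try:
        import torch
    except ImportError:
        return False
    return getattr(torch.version, "hip", None) is not None


def should_skip(marker_names, on_rocm: bool) -> bool:
    """Skip only when running on ROCm and the test carries the marker."""
    return on_rocm and ENC_DEC_MARKER in marker_names


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", f"{ENC_DEC_MARKER}: test exercises an encoder-decoder model"
    )


def pytest_collection_modifyitems(config, items):
    import pytest  # imported here so the helpers above do not depend on pytest

    on_rocm = is_rocm()
    skip = pytest.mark.skip(
        reason="encoder-decoder attention is not supported on ROCm"
    )
    for item in items:
        names = {mark.name for mark in item.iter_markers()}
        if should_skip(names, on_rocm):
            item.add_marker(skip)
```

With a hook like this, a test or a single parametrized variant decorated with `@pytest.mark.encoder_decoder` is skipped on ROCm and runs everywhere else. The same marker can also be deselected from the command line with `pytest -m "not encoder_decoder"`.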

Test Plan

CI: https://buildkite.com/vllm/amd-ci/builds/1020

Signed-off-by: zhewenli <zhewenli@meta.com>
@mergify mergify bot added the ci/build and rocm (Related to AMD ROCm) labels Nov 5, 2025
Member

ywang96 commented Nov 6, 2025

cc @russellb @robertgshaw2-redhat

AFAIR we have reached an agreement that the only enc-dec model we're going to support in V1 is Whisper, so we should deprecate the other models.

@zhewenl zhewenl marked this pull request as ready for review November 7, 2025 00:00
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@DarkLight1337 DarkLight1337 added the ready (ONLY add when PR is ready to merge/full CI is needed) label Nov 7, 2025
Collaborator Author

zhewenl commented Nov 7, 2025

@DarkLight1337 Thanks for catching this! Updated the list and the relevant tests.

Collaborator Author

zhewenl commented Nov 10, 2025

We will hold this PR, as #28346 and #28376 attempt to fix encoder-decoder support on ROCm.

@zhewenl zhewenl marked this pull request as draft November 10, 2025 19:45