feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families #4460

Open
lapy wants to merge 1 commit into InternLM:main from lapy:split/qwen-vision-backend

Conversation

@lapy (Contributor) commented Mar 24, 2026

Add TurboMind support for the Qwen3-VL and Qwen3.5 vision encoders.

PR Testing

Scope

This change was validated on the TurboMind path only. The PyTorch backend was not modified or tested.

Regression Test

Ran the TurboMind-focused regression test:

PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend \
pytest -q tests/test_lmdeploy/test_vl/test_qwen_vl_family.py

Result:

8 passed

Optional Qwen3.5 Fast Path Dependency

Installed the matching prebuilt causal-conv1d wheel for this environment:

python -m pip install --break-system-packages \
  'https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.6.1.post4/causal_conv1d-1.6.1%2Bcu12torch2.10cxx11abiTRUE-cp312-cp312-linux_x86_64.whl'
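
The wheel filename above encodes the build matrix the wheel was compiled against (package version, CUDA major, torch version, C++11 ABI flag, CPython tag), and all of these have to match the target environment. As an illustration only, a small hypothetical parser for that naming scheme:

```python
import re

# Hypothetical helper: pull the build tags out of a causal-conv1d wheel
# filename so the matching prebuilt wheel can be picked for an environment.
WHEEL_RE = re.compile(
    r"causal_conv1d-(?P<version>[\w.]+?)\+"
    r"cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<abi>TRUE|FALSE)"
    r"-(?P<py>cp\d+)-cp\d+-(?P<platform>[\w_]+)\.whl"
)

def parse_wheel_name(name: str) -> dict:
    m = WHEEL_RE.match(name)
    if not m:
        raise ValueError(f"unrecognized wheel name: {name}")
    return m.groupdict()

info = parse_wheel_name(
    "causal_conv1d-1.6.1+cu12torch2.10cxx11abiTRUE-cp312-cp312-linux_x86_64.whl"
)
print(info)
```

Here `cu12`, `torch2.10`, and `cp312` must agree with the installed CUDA toolkit, torch build, and Python interpreter, which is why the exact wheel URL matters.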

Verified the optional Qwen3.5 fast path is available after install:

python - <<'PY'
from transformers.models.qwen3_5 import modeling_qwen3_5 as m
print('is_fast_path_available', m.is_fast_path_available)
print('causal_conv1d_fn', m.causal_conv1d_fn is not None)
print('causal_conv1d_update', m.causal_conv1d_update is not None)
print('chunk_gated_delta_rule', m.chunk_gated_delta_rule is not None)
print('fused_recurrent_gated_delta_rule', m.fused_recurrent_gated_delta_rule is not None)
PY
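
Flags like the ones printed above typically come from a guarded-import pattern: each optional kernel package is probed at import time, and the fast path is reported available only when every probe succeeds. A minimal stdlib sketch of that pattern (the package names below are assumptions for illustration, not transformers internals):

```python
import importlib.util

# Hypothetical sketch of the guarded-import pattern behind flags such as
# is_fast_path_available: probe each optional kernel package without
# importing it, and report the fast path only if all probes succeed.
OPTIONAL_KERNELS = ["causal_conv1d", "fla"]  # assumed package names

def fast_path_status(names=OPTIONAL_KERNELS):
    found = {n: importlib.util.find_spec(n) is not None for n in names}
    return all(found.values()), found

available, detail = fast_path_status()
print("is_fast_path_available", available)
for name, ok in detail.items():
    print(name, "found" if ok else "missing")
```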

End-to-End TurboMind Inference

Validated image inference with the smallest Qwen3-VL model:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend:/root/lmdeploy/lmdeploy/lib \
TM_LOG_LEVEL=ERROR \
python - <<'PY'
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

model = 'Qwen/Qwen3-VL-2B-Instruct'
backend_config = TurbomindEngineConfig(
    tp=1,
    max_batch_size=1,
    session_len=4096,
    cache_max_entry_count=0.05,
)
gen_config = GenerationConfig(max_new_tokens=48, do_sample=False, temperature=0.0)

img = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
pipe = pipeline(model, backend_config=backend_config, log_level='ERROR')
print('backend', pipe.async_engine.backend)
out = pipe(('Describe the image in one sentence.', img), gen_config=gen_config)
print(out.text)
pipe.close()
PY

Observed:

backend turbomind
A majestic tiger with a striking orange coat and black stripes rests peacefully on a vibrant green lawn, its gaze fixed directly at the camera.

Validated image inference with the smallest Qwen3.5 VL model:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend:/root/lmdeploy/lmdeploy/lib \
TM_LOG_LEVEL=ERROR \
python - <<'PY'
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

model = 'Qwen/Qwen3.5-0.8B'
backend_config = TurbomindEngineConfig(
    tp=1,
    max_batch_size=1,
    session_len=4096,
    cache_max_entry_count=0.05,
)
gen_config = GenerationConfig(max_new_tokens=48, do_sample=False, temperature=0.0)

img = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
pipe = pipeline(model, backend_config=backend_config, log_level='ERROR')
print('backend', pipe.async_engine.backend)
out = pipe(('Describe the image in one sentence.', img), gen_config=gen_config)
print(out.text)
pipe.close()
PY

Observed:

backend turbomind
A majestic tiger lies peacefully on a sunlit grassy field, its powerful eyes fixed forward and its powerful body relaxed.
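
Since both runs use greedy decoding (`do_sample=False`), the captions are stable, so a cheap keyword check can guard these end-to-end runs against empty or degenerate output. A hypothetical smoke check (the keyword and length threshold are arbitrary choices, not part of the test suite):

```python
# Hypothetical smoke check for the captions above: the tiger image
# should yield a non-trivial sentence that actually mentions a tiger.
def looks_like_tiger_caption(text: str) -> bool:
    t = text.lower()
    return "tiger" in t and len(t.split()) >= 5

observed = [
    "A majestic tiger with a striking orange coat and black stripes rests "
    "peacefully on a vibrant green lawn, its gaze fixed directly at the camera.",
    "A majestic tiger lies peacefully on a sunlit grassy field, its powerful "
    "eyes fixed forward and its powerful body relaxed.",
]
assert all(looks_like_tiger_caption(c) for c in observed)
print("smoke check passed")
```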

@lapy changed the title from "feat: implement Turbomind vision encoder support for Qwen3VL/3.5 fami…" to "feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families" on Mar 24, 2026
@lapy force-pushed the split/qwen-vision-backend branch from b33b54e to 539e55c on March 24, 2026 at 22:11
@lapy force-pushed the split/qwen-vision-backend branch from f177f41 to 13f0ae2 on March 24, 2026 at 23:14
@lvhan028 (Collaborator) commented:

@lapy you are on fire!
