feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families #4460

Open
lapy wants to merge 1 commit into InternLM:main from lapy:split/qwen-vision-backend

Conversation

@lapy (Contributor) commented Mar 24, 2026

Add TurboMind support for the Qwen3-VL and Qwen3.5 vision encoders.

PR Testing

Scope

This change was validated on the TurboMind path only. The PyTorch backend was not modified or tested.

Regression Test

Ran the TurboMind-focused regression test:

PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend \
pytest -q tests/test_lmdeploy/test_vl/test_qwen_vl_family.py

Result:

8 passed

Optional Qwen3.5 Fast Path Dependency

Installed the matching prebuilt causal-conv1d wheel for this environment:

python -m pip install --break-system-packages \
  'https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.6.1.post4/causal_conv1d-1.6.1%2Bcu12torch2.10cxx11abiTRUE-cp312-cp312-linux_x86_64.whl'
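
The wheel filename above encodes the build matrix the wheel was compiled against (package version, CUDA major, torch version, C++11 ABI flag, CPython tag), and all of these have to match the target environment. As an illustration only, a small hypothetical parser for that naming scheme:

```python
import re

# Hypothetical helper: pull the build tags out of a causal-conv1d wheel
# filename so the matching prebuilt wheel can be picked for an environment.
WHEEL_RE = re.compile(
    r"causal_conv1d-(?P<version>[\w.]+?)\+"
    r"cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<abi>TRUE|FALSE)"
    r"-(?P<py>cp\d+)-cp\d+-(?P<platform>[\w_]+)\.whl"
)

def parse_wheel_name(name: str) -> dict:
    m = WHEEL_RE.match(name)
    if not m:
        raise ValueError(f"unrecognized wheel name: {name}")
    return m.groupdict()

info = parse_wheel_name(
    "causal_conv1d-1.6.1+cu12torch2.10cxx11abiTRUE-cp312-cp312-linux_x86_64.whl"
)
print(info)
```

Here `cu12`, `torch2.10`, and `cp312` must agree with the installed CUDA toolkit, torch build, and Python interpreter, which is why the exact wheel URL matters.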

Verified the optional Qwen3.5 fast path is available after install:

python - <<'PY'
from transformers.models.qwen3_5 import modeling_qwen3_5 as m
print('is_fast_path_available', m.is_fast_path_available)
print('causal_conv1d_fn', m.causal_conv1d_fn is not None)
print('causal_conv1d_update', m.causal_conv1d_update is not None)
print('chunk_gated_delta_rule', m.chunk_gated_delta_rule is not None)
print('fused_recurrent_gated_delta_rule', m.fused_recurrent_gated_delta_rule is not None)
PY
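
Flags like the ones printed above typically come from a guarded-import pattern: each optional kernel package is probed at import time, and the fast path is reported available only when every probe succeeds. A minimal stdlib sketch of that pattern (the package names below are assumptions for illustration, not transformers internals):

```python
import importlib.util

# Hypothetical sketch of the guarded-import pattern behind flags such as
# is_fast_path_available: probe each optional kernel package without
# importing it, and report the fast path only if all probes succeed.
OPTIONAL_KERNELS = ["causal_conv1d", "fla"]  # assumed package names

def fast_path_status(names=OPTIONAL_KERNELS):
    found = {n: importlib.util.find_spec(n) is not None for n in names}
    return all(found.values()), found

available, detail = fast_path_status()
print("is_fast_path_available", available)
for name, ok in detail.items():
    print(name, "found" if ok else "missing")
```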

End-to-End TurboMind Inference

Validated image inference with the smallest Qwen3-VL model:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend:/root/lmdeploy/lmdeploy/lib \
TM_LOG_LEVEL=ERROR \
python - <<'PY'
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

model = 'Qwen/Qwen3-VL-2B-Instruct'
backend_config = TurbomindEngineConfig(
    tp=1,
    max_batch_size=1,
    session_len=4096,
    cache_max_entry_count=0.05,
)
gen_config = GenerationConfig(max_new_tokens=48, do_sample=False, temperature=0.0)

img = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
pipe = pipeline(model, backend_config=backend_config, log_level='ERROR')
print('backend', pipe.async_engine.backend)
out = pipe(('Describe the image in one sentence.', img), gen_config=gen_config)
print(out.text)
pipe.close()
PY

Observed:

backend turbomind
A majestic tiger with a striking orange coat and black stripes rests peacefully on a vibrant green lawn, its gaze fixed directly at the camera.

Validated image inference with the smallest Qwen3.5 VL model:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=/root/lmdeploy-prs/split-qwen-vision-backend:/root/lmdeploy/lmdeploy/lib \
TM_LOG_LEVEL=ERROR \
python - <<'PY'
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

model = 'Qwen/Qwen3.5-0.8B'
backend_config = TurbomindEngineConfig(
    tp=1,
    max_batch_size=1,
    session_len=4096,
    cache_max_entry_count=0.05,
)
gen_config = GenerationConfig(max_new_tokens=48, do_sample=False, temperature=0.0)

img = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
pipe = pipeline(model, backend_config=backend_config, log_level='ERROR')
print('backend', pipe.async_engine.backend)
out = pipe(('Describe the image in one sentence.', img), gen_config=gen_config)
print(out.text)
pipe.close()
PY

Observed:

backend turbomind
A majestic tiger lies peacefully on a sunlit grassy field, its powerful eyes fixed forward and its powerful body relaxed.
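
Since both runs use greedy decoding (`do_sample=False`), the captions are stable, so a cheap keyword check can guard these end-to-end runs against empty or degenerate output. A hypothetical smoke check (the keyword and length threshold are arbitrary choices, not part of the test suite):

```python
# Hypothetical smoke check for the captions above: the tiger image
# should yield a non-trivial sentence that actually mentions a tiger.
def looks_like_tiger_caption(text: str) -> bool:
    t = text.lower()
    return "tiger" in t and len(t.split()) >= 5

observed = [
    "A majestic tiger with a striking orange coat and black stripes rests "
    "peacefully on a vibrant green lawn, its gaze fixed directly at the camera.",
    "A majestic tiger lies peacefully on a sunlit grassy field, its powerful "
    "eyes fixed forward and its powerful body relaxed.",
]
assert all(looks_like_tiger_caption(c) for c in observed)
print("smoke check passed")
```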

@lapy changed the title from "feat: implement Turbomind vision encoder support for Qwen3VL/3.5 fami…" to "feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families" on Mar 24, 2026
@lapy force-pushed the split/qwen-vision-backend branch from b33b54e to 539e55c on March 24, 2026 at 22:11
@lapy force-pushed the split/qwen-vision-backend branch from f177f41 to 13f0ae2 on March 24, 2026 at 23:14
@lvhan028 (Collaborator) commented:

@lapy you are on fire!
