
Cannot deploy PaddleOCR-VL on the Iluvatar (天数) BI-150S #7461

@megemini

Description


Deploying PaddleOCR-VL in an aistudio Iluvatar (天数) BI-150S environment with the following command:

python -m fastdeploy.entrypoints.openai.api_server \
    --model /home/aistudio/baidu/PaddleOCR-VL-1.5 \
    --port 8185 \
    --metrics-port 8186 \
    --engine-worker-queue-port 8187 \
    --max-model-len 16384 \
    --max-num-batched-tokens 16384 \
    --gpu-memory-utilization 0.8 \
    --max-num-seqs 256

The launch fails with:

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
INFO     2026-04-17 14:15:14,144 344971 engine.py[line:151] Waiting for worker processes to be ready...
Loading Weights:   0%|                                                                                                                                     | 0/100 [00:16<?, ?it/s]
ERROR    2026-04-17 14:15:35,184 344971 engine.py[line:160] Failed to launch worker processes, check log/workerlog.* for more details.
ERROR    2026-04-17 14:15:41,788 344971 engine.py[line:452] Error extracting sub services: [Errno 3] No such process, Traceback (most recent call last):
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/engine.py", line 449, in _exit_sub_services
    pgid = os.getpgid(self.worker_proc.pid)
ProcessLookupError: [Errno 3] No such process
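As an aside, the `huggingface/tokenizers` fork warning at the top of the log is unrelated to the crash; it can be silenced before launching by setting the environment variable the message itself suggests:

```shell
# Disable tokenizers' thread parallelism so that forking worker
# processes cannot deadlock (this also silences the warning).
export TOKENIZERS_PARALLELISM=false
```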

log/workerlog.0 reports the following error:

INFO     2026-04-17 14:11:37,472 339066 sampler.py[line:132] GuidedDecoding max_num_seqs=256 fill_bitmask_parallel_batch_size=4 is_cuda_platform=False max_workers=64.0
INFO     2026-04-17 14:11:37,479 339066 input_batch.py[line:346] Enabled logits processors: []
INFO     2026-04-17 14:11:37,480 339066 iluvatar.py[line:29] Using ixinfer MHA backend instead of append attention
Traceback (most recent call last):
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/../worker/worker_process.py", line 1333, in <module>
    run_worker_proc()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/base/dygraph/base.py", line 406, in _decorate_function
    return func(*args, **kwargs)
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/../worker/worker_process.py", line 1313, in run_worker_proc
    worker_proc.init_device()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/worker_process.py", line 726, in init_device
    self.worker.init_device()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_worker.py", line 68, in init_device
    self.model_runner: IluvatarModelRunner = IluvatarModelRunner(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_model_runner.py", line 54, in __init__
    super(IluvatarModelRunner, self).__init__(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/gpu_model_runner.py", line 226, in __init__
    self._initialize_attn_backend()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_model_runner.py", line 94, in _initialize_attn_backend
    attn_backend = attn_cls(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/model_executor/layers/backends/iluvatar/attention/mha_attn_backend.py", line 81, in __init__
    assert self.block_size == 16, "Iluvatar paged attn requires block_size must be 16."
AssertionError: Iluvatar paged attn requires block_size must be 16.
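The assertion is raised by the Iluvatar MHA attention backend (`mha_attn_backend.py`), which only supports a KV-cache block size of 16, while the launch command above leaves the block size at its default. A possible workaround, assuming the API server exposes a `--block-size` flag (this flag name is an assumption and should be verified against `--help`), is to pin the block size explicitly:

```shell
# Same launch as above, with the KV-cache block size pinned to 16 --
# the only value the Iluvatar paged-attention backend accepts.
# NOTE: --block-size is assumed here; confirm the exact flag name with
#   python -m fastdeploy.entrypoints.openai.api_server --help
python -m fastdeploy.entrypoints.openai.api_server \
    --model /home/aistudio/baidu/PaddleOCR-VL-1.5 \
    --port 8185 \
    --metrics-port 8186 \
    --engine-worker-queue-port 8187 \
    --max-model-len 16384 \
    --max-num-batched-tokens 16384 \
    --gpu-memory-utilization 0.8 \
    --max-num-seqs 256 \
    --block-size 16
```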
