Commit 5cd2377

Fix missing parallelism in pipeline-parallel (PP) mode
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
1 parent 315068e commit 5cd2377

1 file changed: +6 -1 lines changed

vllm/v1/core/sched/scheduler.py

Lines changed: 6 additions & 1 deletion
@@ -355,7 +355,12 @@ def schedule(self) -> SchedulerOutput:
             while self.waiting and token_budget > 0:
                 if len(self.running) == self.max_num_running_reqs:
                     break
-
+                if len(scheduled_resumed_reqs) + len(scheduled_new_reqs) >= max(
+                    1,
+                    self.max_num_running_reqs
+                    // self.parallel_config.pipeline_parallel_size,
+                ):
+                    break
                 request = self.waiting.peek_request()
 
                 # KVTransfer: skip request if still waiting for remote kvs.
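
The added guard stops a single `schedule()` call from admitting more than `max(1, max_num_running_reqs // pipeline_parallel_size)` waiting requests. One plausible reading is that this leaves requests in the waiting queue for later `schedule()` calls, so that each pipeline stage can be given its own batch. The toy model below is only a sketch of that admission rule, not vLLM code: `per_step_admission_cap` and `simulate_admission` are hypothetical helpers, and the model ignores the token budget, KV-cache limits, and request completion.

```python
def per_step_admission_cap(max_num_running_reqs: int,
                           pipeline_parallel_size: int) -> int:
    """Same expression as the guard in the diff: an even share of the
    running-request limit per pipeline stage, but never less than 1."""
    return max(1, max_num_running_reqs // pipeline_parallel_size)


def simulate_admission(num_waiting: int,
                       max_num_running_reqs: int,
                       pipeline_parallel_size: int,
                       num_steps: int) -> list[int]:
    """Hypothetical toy model: how many waiting requests each of
    `num_steps` consecutive schedule() calls would admit."""
    cap = per_step_admission_cap(max_num_running_reqs,
                                 pipeline_parallel_size)
    running = 0
    admitted_per_step = []
    for _ in range(num_steps):
        admitted = 0  # stands in for len(resumed) + len(new) in this step
        while num_waiting > 0:
            if running == max_num_running_reqs:
                break  # existing limit on total running requests
            if admitted >= cap:
                break  # guard added by this commit
            num_waiting -= 1
            running += 1
            admitted += 1
        admitted_per_step.append(admitted)
    return admitted_per_step


# With pipeline_parallel_size=1 the cap equals the running limit, so the
# first call takes everything; with 4 stages the work is spread out.
print(simulate_admission(8, 8, 1, 4))  # [8, 0, 0, 0]
print(simulate_admission(8, 8, 4, 4))  # [2, 2, 2, 2]
```

Under these assumptions, the `max(1, ...)` term matters when `max_num_running_reqs` is smaller than `pipeline_parallel_size`: the integer division would give 0, and without the floor of 1 no waiting request could ever be admitted.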
