Support router as replica with pipelines by Bihan · Pull Request #3721 · dstackai/dstack

Bihan · 2026-03-31T12:32:47Z

The design document for this PR is here.

r4victor · 2026-04-08T06:19:13Z

+
+
+class ServiceRouterWorkerSyncFetcher(Fetcher[ServiceRouterWorkerSyncPipelineItem]):
+    @sentry_utils.instrument_named_task("pipeline_tasks.ServiceRouterWorkerSyncFetcher.fetch")


I recently added @sentry_utils.instrument_pipeline_task – use it to avoid hardcoding pipeline_tasks prefix.

r4victor · 2026-04-08T06:28:33Z

+            run_model = sync_row.run
+            if run_model is None:
+                await session.delete(sync_row)
+                await session.commit()
+                return


How can run_model be None here?

I was thinking about the case where the run row is hard-deleted, so that sync_row.run becomes None. If that's not possible, we can delete this block.

But you defined run_id as non-optional with ondelete="CASCADE" - how can it be possible?

You are right. I can delete this block.
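
The point made above, that a non-optional foreign key with ON DELETE CASCADE cannot leave an orphaned sync row, can be illustrated with a small sketch. It uses the standard-library sqlite3 module rather than the server's SQLAlchemy models, and the table and column names are simplified stand-ins, not the actual dstack schema.

```python
import sqlite3


def cascade_demo() -> int:
    """Return how many sync rows survive after their run is hard-deleted."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE router_worker_sync ("
        " id INTEGER PRIMARY KEY,"
        " run_id INTEGER NOT NULL REFERENCES runs(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO runs (id) VALUES (1)")
    conn.execute("INSERT INTO router_worker_sync (id, run_id) VALUES (10, 1)")
    # Hard-delete the parent run; the child row is removed together with it.
    conn.execute("DELETE FROM runs WHERE id = 1")
    (remaining,) = conn.execute(
        "SELECT COUNT(*) FROM router_worker_sync"
    ).fetchone()
    conn.close()
    return remaining
```

With the cascade in place, a sync row never outlives its run, so a `run is None` branch would be unreachable under these assumptions.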

r4victor · 2026-04-08T06:34:42Z

+                .options(
+                    selectinload(RunModel.project),
+                    selectinload(RunModel.jobs).selectinload(JobModel.project),
+                    selectinload(RunModel.jobs)
+                    .selectinload(JobModel.instance)
+                    .selectinload(InstanceModel.project),
+                )
+            )


This is potentially a very inefficient select – a run can have thousands of job submissions. Select only the jobs that the processing needs, i.e. only the router replica job. Also, every selectinload will be a separate query here – not sure if that's justified. joinedload may be better suited for a one-to-one relationship. Also, try to avoid loading all of the models' columns and use load_only to select only the necessary ones.

Please check whether the query proposed below addresses these concerns.

Avoid loading thousands of job submissions: no longer load RunModel.jobs unconditionally. The selectinload(RunModel.jobs.and_(...)) restricts the loaded jobs to only RUNNING + registered replicas, which are the only ones sync_router_workers_for_run_model() can use (router job selection and worker list building both ignore non‑running / unregistered jobs).

selectinload is intentional: RunModel.jobs is a one‑to‑many collection; using joinedload would duplicate the RunModel row per job.

joinedload for one-to-one/many-to-one: RunModel.project, JobModel.project, JobModel.instance, and InstanceModel.project are loaded with joinedload because these are scalar relationships from the run, job, and instance.

Use load_only: this limits the loaded columns to those required by sync_router_workers_for_run_model(run_for_sync) and _get_service_replica_client(job_model).

    res = await session.execute(
        select(RunModel)
        .where(RunModel.id == item.run_id)
        .options(
            load_only(RunModel.id, RunModel.run_spec),
            selectinload(
                RunModel.jobs.and_(
                    JobModel.status == JobStatus.RUNNING,
                    JobModel.registered == true(),
                )
            )
            .load_only(
                JobModel.id,
                JobModel.status,
                JobModel.registered,
                JobModel.job_spec_data,
                JobModel.job_provisioning_data,
                JobModel.job_runtime_data,
            )
            .options(
                joinedload(JobModel.project).load_only(ProjectModel.id, ProjectModel.ssh_private_key),
                joinedload(JobModel.instance)
                .load_only(InstanceModel.id, InstanceModel.remote_connection_info)
                .joinedload(InstanceModel.project)
                .load_only(ProjectModel.id, ProjectModel.ssh_private_key),
            ),
        )
    )

looks good, at least at a glance
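
The core idea of this thread, filtering rows and columns in the database instead of loading every job submission, can be shown independently of SQLAlchemy. The following is a minimal sketch using the standard-library sqlite3 module; the `jobs` schema, the `big_unused_blob` column, and both function names are hypothetical and exist only for illustration.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory jobs table with a mix of relevant and irrelevant rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, run_id INTEGER, status TEXT,"
        " registered INTEGER, job_spec_data TEXT, big_unused_blob TEXT)"
    )
    rows = [
        (1, 1, "running", 1, "router", "x" * 1000),
        (2, 1, "running", 0, "worker-a", "x" * 1000),  # not registered yet
        (3, 1, "failed", 1, "worker-b", "x" * 1000),   # not running
        (4, 1, "running", 1, "worker-c", "x" * 1000),
        (5, 2, "running", 1, "other-run", "x" * 1000), # belongs to another run
    ]
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def load_registered_running_jobs(
    conn: sqlite3.Connection, run_id: int
) -> List[Tuple[int, str]]:
    """Fetch only the rows and columns the sync step needs for one run."""
    return conn.execute(
        "SELECT id, job_spec_data FROM jobs"
        " WHERE run_id = ? AND status = 'running' AND registered = 1"
        " ORDER BY id",
        (run_id,),
    ).fetchall()
```

The WHERE clause plays the role of the `and_(...)` filter on the relationship, and the explicit column list plays the role of `load_only`: the large unused column never leaves the database.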

r4victor · 2026-04-08T06:39:31Z

+    router_jobs = [
+        j
+        for j in run_model.jobs
+        if job_belongs_to_group(j, group_name) and j.status == JobStatus.RUNNING
+    ]
+    if not router_jobs or not is_replica_registered(router_jobs):
+        return None
+    return router_jobs[0]


Can there be multiple router jobs? If so, how does that work?

For the first iteration, I suggest restricting the router replica group to count: 1 via configuration validation. The current sync logic effectively assumes a single active router job. We can extend this later to support multiple router replicas for HA.

it's worth a comment!
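
A sketch of what the suggested validation and comment could look like is given below. The `Job` dataclass and both function names are hypothetical stand-ins, not the PR's code, and the per-job `registered` check simplifies the original `is_replica_registered(router_jobs)` call.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:  # hypothetical stand-in for JobModel
    group: str
    status: str
    registered: bool


def validate_router_group_count(count: int) -> None:
    """Reject router replica groups that are not exactly one replica."""
    if count != 1:
        raise ValueError("router replica group must have count: 1")


def pick_router_job(jobs: List[Job], group_name: str) -> Optional[Job]:
    # The sync logic assumes a single active router job: configuration
    # validation restricts the router replica group to count: 1, so taking
    # the first match is safe. Supporting several routers (HA) would
    # require revisiting this selection.
    candidates = [
        j
        for j in jobs
        if j.group == group_name and j.status == "running" and j.registered
    ]
    return candidates[0] if candidates else None
```

Keeping the assumption next to the `[0]` lookup makes it visible to whoever later relaxes the count restriction.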

r4victor · 2026-04-08T06:43:05Z

+def run_spec_has_router_replica_group(run_spec: RunSpec) -> bool:
+    if run_spec.configuration.type != "service":
+        return False
+    cfg = run_spec.configuration
+    if not isinstance(cfg, ServiceConfiguration):
+        return False
+    return any(g.router is not None for g in cfg.replica_groups)
+
+
+async def ensure_service_router_worker_sync_row(


Why put these router-specific functions at the top of runs services?

I kept it there because they are used by run lifecycle. Should I shift them to src/dstack/_internal/server/services/router_worker_sync.py?

I mean at least they should not be at the top of the file.

r4victor · 2026-04-08T06:46:34Z

+    if not run_spec_has_router_replica_group(run_spec):
+        return
+    res = await session.execute(
+        select(ServiceRouterWorkerSyncModel.id).where(
+            ServiceRouterWorkerSyncModel.run_id == run_model.id
+        )
+    )
+    if res.scalar_one_or_none() is not None:
+        return


How can it be that ServiceRouterWorkerSyncModel already exists for a run if ensure_service_router_worker_sync_row is called only on run submit?

r4victor · 2026-04-08T06:48:48Z

+                return
+            run_model = sync_row.run
+            if run_model is None:
+                await session.delete(sync_row)


We generally use soft deletes in dstack server for easier debugging and to keep historical data. Assuming there will be very few ServiceRouterWorkerSyncModel rows (one per service replica router), I'd also soft-delete it for consistency.
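
For readers unfamiliar with the pattern, a soft delete marks a row as deleted instead of removing it, and regular queries filter on that marker. The sketch below uses the standard-library sqlite3 module; the `deleted` column and the function names are assumptions for illustration, not the actual dstack model.

```python
import sqlite3
from typing import List


def make_sync_table() -> sqlite3.Connection:
    """Create an in-memory table with two active sync rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE router_worker_sync ("
        " id INTEGER PRIMARY KEY, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany("INSERT INTO router_worker_sync (id) VALUES (?)", [(1,), (2,)])
    return conn


def soft_delete_sync_row(conn: sqlite3.Connection, row_id: int) -> None:
    """Mark the row as deleted; the row itself stays for debugging and history."""
    conn.execute("UPDATE router_worker_sync SET deleted = 1 WHERE id = ?", (row_id,))


def active_sync_row_ids(conn: sqlite3.Connection) -> List[int]:
    """Return ids of rows that have not been soft-deleted."""
    rows = conn.execute(
        "SELECT id FROM router_worker_sync WHERE deleted = 0 ORDER BY id"
    )
    return [r[0] for r in rows]
```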

r4victor · 2026-04-08T06:50:11Z

    )


+class ServiceRouterWorkerSyncModel(PipelineModelMixin, BaseModel):


Let's put it somewhere near the end of the file so that "core" models come first.

r4victor · 2026-04-08T06:52:14Z

@@ -0,0 +1,49 @@
+"""SSH-tunneled async HTTP client to a job's service port (same path as probes)."""


put this file in jobs services?

r4victor · 2026-04-08T06:53:05Z

@@ -0,0 +1,345 @@
+"""Reconcile SGLang router /workers with dstack's registered worker replicas (async, SSH-tunneled)."""


put this file in runs services

r4victor

Did a quick review of the pipeline code. Haven't looked into the worker sync logic.

jvstme · 2026-04-09T23:22:41Z

+async def _stream_response_body_bytes(resp: Response, max_bytes: int) -> bytes:
+    buf = bytearray()
+    async for chunk in resp.aiter_bytes():
+        buf.extend(chunk)
+        if len(buf) > max_bytes:
+            raise _ResponseTooLargeError()
+    return bytes(buf)


(nit) We have the join_byte_stream_checked function that appears to do the same thing

jvstme · 2026-04-14T20:03:32Z

+fleets: [pd-disagg]

-# Custom probe is required for PD disaggregation
+# Custom probe is required for PD disaggregation.


(nit) By the way, is it still required? I thought sync_router_workers_for_run_model can gracefully handle the router or workers not being ready, and perform the registration eventually, once they become ready

Yes, this is still required. The default probe queries /v1/chat/completions to register the job, but the router fails to serve /v1/chat/completions until workers are registered. Meanwhile, the router-worker sync pipeline only considers RUNNING jobs that are also registered=True.

Oh, I see, so our default probe is the problem. But I assume it's possible to work around it by either setting probes: [], or not setting model. If that's the case, a custom probe is more of a recommendation, not a strict requirement.

Anyways, I think we were going to improve the UX here by introducing a different default probe for services with the SGLang router. Not in this PR, of course.

jvstme

Looks good to me overall, but the following may require more attention:

Forbid unsupported in-place updates (thread)
Fix the path whitelist in the in-server proxy (thread)

These may be uncommon cases, but they are security-related, so I would prefer to address them before merging, or at least before the release.

jvstme · 2026-04-15T11:13:12Z

                set_processed_update_map_fields(early_cleanup_update_map)
                set_unlock_update_map_fields(early_cleanup_update_map)
                now = get_current_datetime()
                resolve_now_placeholders(early_cleanup_update_map, now=now)
-                await session.execute(
-                    update(ServiceRouterWorkerSyncModel)
-                    .where(
-                        ServiceRouterWorkerSyncModel.id == item.id,
-                        ServiceRouterWorkerSyncModel.lock_token == item.lock_token,
-                    )
-                    .values(**early_cleanup_update_map)
+                await _update_sync_row_or_log_lock_token_changed(


(nit) Identical set_processed_update_map_fields, set_unlock_update_map_fields, and resolve_now_placeholders calls are also repeated in three places in this method. It's worth moving them inside _update_sync_row_or_log_lock_token_changed
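
One way to read this suggestion: the helper performs the shared bookkeeping itself and then applies the update only if the lock token still matches. The sketch below shows that shape with the standard-library sqlite3 module. The table, the column names, and `finalize_and_update` are hypothetical; the real helpers (`set_processed_update_map_fields` and the others) are not reproduced here.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Any, Dict


def make_locked_row(token: str) -> sqlite3.Connection:
    """Create an in-memory table holding one row locked with the given token."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE router_worker_sync (id INTEGER PRIMARY KEY, lock_token TEXT,"
        " last_processed_at TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "INSERT INTO router_worker_sync (id, lock_token) VALUES (1, ?)", (token,)
    )
    return conn


def finalize_and_update(
    conn: sqlite3.Connection, row_id: int, lock_token: str, update_map: Dict[str, Any]
) -> bool:
    """Add the shared bookkeeping fields, then update if the token still matches."""
    values = dict(update_map)
    # Bookkeeping that would otherwise be repeated at every call site.
    values["last_processed_at"] = datetime.now(timezone.utc).isoformat()
    values["lock_token"] = None  # release the lock
    # Keys must be trusted column names; only the values are bound as parameters.
    assignments = ", ".join(f"{col} = ?" for col in values)
    cur = conn.execute(
        f"UPDATE router_worker_sync SET {assignments}"
        " WHERE id = ? AND lock_token = ?",
        (*values.values(), row_id, lock_token),
    )
    # rowcount == 0 means the lock token changed and nothing was written.
    return cur.rowcount == 1
```

A caller can log the lock-token change when the function returns False, which keeps each call site down to building its own update map.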

Bihan · 2026-04-15T17:14:19Z

@jvstme Done

Bihan force-pushed the support_router_replica_with_pipelines branch from 2fe5e14 to bafd2d9 Compare April 1, 2026 07:22

Bihan requested review from jvstme and r4victor April 7, 2026 10:33

r4victor reviewed Apr 8, 2026

Comment thread src/dstack/_internal/server/services/runs/__init__.py

r4victor reviewed Apr 8, 2026

r4victor requested changes Apr 8, 2026

Bihan force-pushed the support_router_replica_with_pipelines branch from e155d17 to 7b268cb Compare April 9, 2026 10:36

jvstme reviewed Apr 10, 2026

Comment thread src/dstack/_internal/server/background/pipeline_tasks/service_router_worker_sync.py

jvstme reviewed Apr 12, 2026

Bihan Rana added 8 commits April 13, 2026 13:03

Resolve Merge Conflict

2e46b95

Resolve pyright test

14bab7a

Resolve tests

f99bdd2

Optimize ServiceRouterWorkerSyncWorkerProcess select query

35120a3

Use soft_delete for ServiceRouterWorkerSyncModel

8481bd3

Remove worker registration to gateway in PD

f04999e

Resolve review comments and add ServiceRouterWorkerSyncPipeline test

b349f2c

Resolve Migration Conflict

8fe01e5

Bihan force-pushed the support_router_replica_with_pipelines branch from 3bc04df to 8fe01e5 Compare April 13, 2026 07:33

Bihan Rana added 2 commits April 13, 2026 21:11

Resolve review comments

c5a6716

Resolve Comments

37a1c5a

Bihan Rana added 3 commits April 14, 2026 11:54

Resolve all review comments

397cf98

Resolve all review comments

cbb13f0

Update docs for router as replica

59d246b

Bihan changed the title ~~[Draft PR] Support router as replica with pipelines~~ Support router as replica with pipelines Apr 14, 2026

jvstme reviewed Apr 14, 2026

Bihan Rana added 2 commits April 15, 2026 12:47

Resolve review comments

274ad08

Resolve build error

0f8f1b6

jvstme approved these changes Apr 15, 2026

Resolve review comments

34baf13

Bihan merged commit 46ec81f into dstackai:master Apr 15, 2026
28 checks passed


