feat(phase-9b): async optimize jobs via FastAPI BackgroundTasks #27
Open
harsh-pandhe wants to merge 1 commit into
The synchronous /optimize/run and POST /htmx/run endpoints block the
HTTP request thread for the full duration of a GA run. Fine for
synthetic 60-generation demos; unacceptable for a DEM-driven
120-generation production run that takes 30+ seconds.
New module src/ropeway/server/jobs.py adds a thin in-process job
queue: POST a run, get an immediate job_id, poll for status. Uses
FastAPI's BackgroundTasks (one worker thread per job, no external
broker). Swapping in Celery/Redis is straightforward: the JobStore
interface is the contract, and the in-memory store is one
implementation.
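
For readers without the diff at hand, here is a minimal sketch of what a JobStore contract and an in-memory implementation could look like. Only the names JobStore, submit, get and update and the job_id, status, result and error fields come from this PR. The Job dataclass, the seq counter, list_jobs, the uuid-based ids and all signatures are assumptions made for illustration, not the code in jobs.py.

```python
from __future__ import annotations

import itertools
import threading
import uuid
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class Job:
    """One optimize run (hypothetical shape; fields mirror the route payloads)."""
    job_id: str
    seq: int                      # submission order, used for newest-first listing
    status: str = "PENDING"
    result: Optional[dict] = None
    error: Optional[str] = None


class JobStore(Protocol):
    """Contract that a Celery/Redis-backed store would also need to satisfy."""
    def submit(self) -> Job: ...
    def get(self, job_id: str) -> Optional[Job]: ...
    def update(self, job_id: str, **fields: Any) -> None: ...
    def list_jobs(self) -> list[Job]: ...


class InMemoryJobStore:
    """Dict-backed store; the lock guards access from worker threads."""

    def __init__(self) -> None:
        self._jobs: dict[str, Job] = {}
        self._seq = itertools.count()
        self._lock = threading.Lock()

    def submit(self) -> Job:
        with self._lock:
            job = Job(job_id=uuid.uuid4().hex, seq=next(self._seq))
            self._jobs[job.job_id] = job
            return job

    def get(self, job_id: str) -> Optional[Job]:
        with self._lock:
            return self._jobs.get(job_id)   # None for an unknown id

    def update(self, job_id: str, **fields: Any) -> None:
        with self._lock:
            job = self._jobs[job_id]        # KeyError for an unknown id
            for name, value in fields.items():
                if not hasattr(job, name):
                    raise AttributeError(f"Job has no field {name!r}")
                setattr(job, name, value)   # mutates the stored object in place

    def list_jobs(self) -> list[Job]:
        with self._lock:
            return sorted(self._jobs.values(), key=lambda j: j.seq, reverse=True)
```

Because the route handlers would depend only on the protocol, a broker-backed store could be substituted without touching them, which is the swap-out argument made above.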
Routes:
POST /optimize/async      -> {job_id, status}
GET  /optimize/jobs/{id}  -> {status, result | null, error | null}
GET  /optimize/jobs       -> list, newest first
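
From the client side, the submit-then-poll flow over these routes might look like the sketch below. The fetch callable is a hypothetical stand-in for whatever performs GET /optimize/jobs/{id} and returns the decoded JSON body. The interval and timeout defaults are illustrative, and comparing the status case-insensitively is an assumption, since the exact serialized casing is not stated here.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"DONE", "FAILED"}


def wait_for_job(
    fetch: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 0.5,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches DONE or FAILED, then return its body.

    `fetch` is a hypothetical wrapper around GET /optimize/jobs/{id}.
    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(job_id)
        if str(body["status"]).upper() in TERMINAL_STATES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {body['status']} after {timeout}s")
        sleep(interval)
```

Injecting fetch and sleep keeps the loop independent of any particular HTTP client and lets it be exercised without a running server.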
Status state machine: PENDING -> RUNNING -> DONE | FAILED. Every
exception in the background task is caught and surfaced as the job's
error field so the client always reaches a terminal state.
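
The catch-everything behaviour can be illustrated with a small wrapper of the kind that might be handed to BackgroundTasks.add_task. This is a sketch under assumptions: run_job, the plain-dict job record and the error string format are hypothetical, and the real module may record failures differently.

```python
from typing import Any, Callable, Dict


def run_job(job: Dict[str, Any], work: Callable[[], Dict[str, Any]]) -> None:
    """Drive one job through PENDING -> RUNNING -> DONE | FAILED.

    `job` is a mutable record with status/result/error keys (assumed shape);
    `work` is the long-running call, e.g. the GA optimize run.
    """
    job["status"] = "RUNNING"
    try:
        result = work()
    except Exception as exc:  # any failure becomes data, never an unhandled crash
        job["error"] = f"{type(exc).__name__}: {exc}"
        job["status"] = "FAILED"
    else:
        job["result"] = result
        job["status"] = "DONE"
```

Setting the status last in each branch means a poller that sees DONE or FAILED also sees the matching result or error. Note that this sketch catches Exception rather than BaseException, so interpreter-level signals such as KeyboardInterrupt still propagate.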
The existing /optimize/run (synchronous, JWT-authenticated) is
untouched — it remains the deterministic synchronous path for the
test suite and pre-9b callers.
Tests: tests/test_server_async_jobs.py (8 new tests)
- submit returns job_id + pending/running
- async optimize runs to DONE with metrics
- unknown job_id -> 404
- jobs list includes the submitted id
- invalid system -> 400 (validates body before queueing)
- JobStore.get returns None for unknown id
- JobStore.submit assigns unique ids
- JobStore.update mutates fields in place
Full suite 216 -> 224, zero regressions.
Summary
The synchronous /optimize/run and POST /htmx/run endpoints block the
HTTP request thread for the full duration of a GA run. Fine for
synthetic 60-gen demos; unacceptable for a DEM-driven 120-gen
production run that takes 30+ seconds.
New module src/ropeway/server/jobs.py adds a thin in-process job
queue: POST a run, get an immediate job_id, poll for status. Uses
FastAPI's BackgroundTasks (one worker thread per job, no external
broker). Swapping in Celery/Redis is straightforward: the JobStore
interface is the contract, and the in-memory store is one
implementation.
Routes
POST /optimize/async      -> {job_id, status}
GET  /optimize/jobs/{id}  -> {status, result | null, error | null}
GET  /optimize/jobs       -> list, newest first
Status state machine: PENDING -> RUNNING -> DONE | FAILED. Every
exception in the background task is caught and surfaced as the job's
error field so the client always reaches a terminal state.
The existing /optimize/run (synchronous, JWT-authenticated) is
untouched. It remains the deterministic synchronous path for the
test suite and pre-9b callers.
Test plan
tests/test_server_async_jobs.py: 8 new tests (submit returns
id+pending, async runs to DONE with metrics, unknown id 404, jobs
list, invalid system 400, JobStore internals (3))