feat(phase-9b): async optimize jobs via FastAPI BackgroundTasks #27

Open
harsh-pandhe wants to merge 1 commit into main from feat/phase-9b-async-jobs

Summary

The synchronous /optimize/run and POST /htmx/run endpoints block the HTTP request thread for the full duration of a GA run. That is fine for synthetic 60-generation demos, but unacceptable for a DEM-driven 120-generation production run that takes 30+ seconds.

New module src/ropeway/server/jobs.py adds a thin in-process job queue: POST a run, get an immediate job_id, poll for status. It uses FastAPI's BackgroundTasks (one worker thread per job, no external broker). Swapping in Celery/Redis is straightforward: the JobStore interface is the contract, and the in-memory store is one implementation.
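A minimal sketch of what "the JobStore interface is the contract" could look like. The method names submit, get and update come from the test list in this PR. The Job fields, the list_jobs name, the lock, and the uuid-based ids are assumptions for illustration and may differ from the actual jobs.py.

```python
from __future__ import annotations

import threading
import uuid
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class Job:
    # Hypothetical record shape; field names mirror the route payloads.
    id: str
    status: str = "PENDING"
    result: Optional[dict[str, Any]] = None
    error: Optional[str] = None


class JobStore(Protocol):
    """The contract a Celery/Redis-backed store would also have to satisfy."""

    def submit(self) -> Job: ...
    def get(self, job_id: str) -> Optional[Job]: ...
    def update(self, job_id: str, **fields: Any) -> None: ...
    def list_jobs(self) -> list[Job]: ...


class InMemoryJobStore:
    """Process-local store; all jobs are lost when the process restarts."""

    def __init__(self) -> None:
        self._jobs: dict[str, Job] = {}
        self._lock = threading.Lock()  # background tasks run on other threads

    def submit(self) -> Job:
        job = Job(id=uuid.uuid4().hex)  # random hex id, unique per submission
        with self._lock:
            self._jobs[job.id] = job
        return job

    def get(self, job_id: str) -> Optional[Job]:
        with self._lock:
            return self._jobs.get(job_id)  # None for an unknown id

    def update(self, job_id: str, **fields: Any) -> None:
        with self._lock:
            job = self._jobs[job_id]
            for name, value in fields.items():
                setattr(job, name, value)  # mutate the stored record in place

    def list_jobs(self) -> list[Job]:
        with self._lock:
            # dicts keep insertion order, so reversing yields newest first
            return list(self._jobs.values())[::-1]
```

Because the store lives in process memory, this sketch only works with a single server process; running several workers would need a shared backend behind the same interface.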

Routes

Method  Path                  Returns
POST    /optimize/async       {job_id, status}
GET     /optimize/jobs/{id}   {status, result | null, error | null}
GET     /optimize/jobs        list, newest first
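From the client's side, the routes above suggest a submit-then-poll loop. The following is a hedged sketch of the polling half using only the standard library. The function names, the polling interval, the timeout, and the case-insensitive status comparison are assumptions, and any authentication headers the server may require are omitted.

```python
from __future__ import annotations

import json
import time
import urllib.request
from typing import Any, Callable

TERMINAL_STATES = {"DONE", "FAILED"}


def http_get_json(url: str) -> dict[str, Any]:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_job(
    base_url: str,
    job_id: str,
    fetch: Callable[[str], dict[str, Any]] = http_get_json,
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll GET /optimize/jobs/{id} until the job is DONE or FAILED."""
    url = f"{base_url}/optimize/jobs/{job_id}"
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(url)
        # Assumption: compare case-insensitively, since the exact casing
        # of the status string on the wire is not stated in this PR.
        if str(job["status"]).upper() in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

The fetch and sleep parameters are injectable so the loop can be exercised in tests without a running server.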

Status state machine: PENDING -> RUNNING -> DONE | FAILED. Every exception in the background task is caught and surfaced as the job's error field so the client always reaches a terminal state.
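The state machine and the catch-all error handling can be sketched as a small wrapper around the actual work. This is an illustration only: run_job is a hypothetical name, and the job is modelled as a plain dict rather than the PR's real record type.

```python
from __future__ import annotations

from typing import Any, Callable


def run_job(job: dict[str, Any], work: Callable[[], Any]) -> None:
    """Drive one job through PENDING -> RUNNING -> DONE | FAILED."""
    job["status"] = "RUNNING"
    try:
        job["result"] = work()
    except Exception as exc:
        # Any failure is recorded on the job instead of propagating, so a
        # polling client sees FAILED rather than a job stuck in RUNNING.
        job["error"] = f"{type(exc).__name__}: {exc}"
        job["status"] = "FAILED"
    else:
        job["status"] = "DONE"


def new_job() -> dict[str, Any]:
    # Hypothetical initial record, mirroring the route payload fields.
    return {"status": "PENDING", "result": None, "error": None}


ok = new_job()
run_job(ok, lambda: {"best_cost": 1.0})  # placeholder work, not the real GA

failed = new_job()
run_job(failed, lambda: 1 / 0)  # raises ZeroDivisionError inside the worker
```

Under FastAPI, a wrapper like this would typically be scheduled with `background_tasks.add_task(run_job, job, work)`, which runs it after the response carrying the job_id has been sent.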

The existing /optimize/run (synchronous, JWT-authenticated) is untouched — it remains the deterministic synchronous path for the test suite and pre-9b callers.

Test plan

  • tests/test_server_async_jobs.py: 8 new tests (submit returns id + pending; async run reaches DONE with metrics; unknown id returns 404; jobs list; invalid system returns 400; 3 JobStore internals tests)
  • Full suite 216 → 224 passing, zero regressions

Tests: tests/test_server_async_jobs.py — 8 new
  - submit returns job_id + pending/running
  - async optimize runs to DONE with metrics
  - unknown job_id -> 404
  - jobs list includes the submitted id
  - invalid system -> 400 (validates body before queueing)
  - JobStore.get returns None for unknown id
  - JobStore.submit assigns unique ids
  - JobStore.update mutates fields in place

Full suite 216 -> 224, zero regressions.