From fe32e7bea0dda4d3a49849233bbdbab8311479f8 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Fri, 12 Jun 2026 18:38:54 +0200
Subject: [PATCH 01/32] docs(docs): add showcase workspace runbook and domain
 model entries (#401)

---
 docs/_base/DOMAIN_MODEL.md | 13 +++++++++++++
 docs/_base/RUNBOOKS.md     | 10 ++++++++++
 2 files changed, 23 insertions(+)

diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index 25e2927b..06ba2d30 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -55,6 +55,16 @@
   - JSONB columns are persisted via `model_dump(mode="json")` so `date`/`datetime` serialise to ISO strings.
   - An agent-saved plan (`source='agent'`) is persisted ONLY after the human approves it through the HITL gate — it always carries the approval audit trail.
 
+### `showcase_workspace` (Demo)
+- **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run.
+- **Status state machine:** `running` → `completed` | `failed` (CHECK-constrained; the finalize hook settles the row even on mid-run failure).
+- **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
+- **Invariants:**
+  - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) are sufficient for a verbatim Replay through the normal run path — replay never mutates the original row; it creates a NEW row.
+  - `name` is deliberately NON-unique; `workspace_id` (UUID hex) is the unique handle.
+  - `created_objects` carries SOFT references only — **no ForeignKeys by design**. The workspace row is an audit record, not an ownership root: the referenced runs/plans/aliases are independently operator-deletable, and a workspace must never block (or cascade) their deletion.
+  - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`).
+
 ## Key Invariants — NEVER violate
 
 1. **Time safety in features.** `app/features/featuresets/` uses only data at or before `cutoff_date`. Lags via `shift(positive)`, rolling via `shift(1).rolling(...)`, all `groupby` entity-aware. The test `app/features/featuresets/tests/test_leakage.py` is the spec — it MUST keep passing.
@@ -89,6 +99,7 @@
 | `model_exogenous` | The scenario `method` where a regression baseline genuinely re-forecasts through the assumptions — as opposed to the `heuristic` post-forecast multiplier | re-trained model (the baseline is not re-trained, only re-run) |
 | `future feature frame` | The leakage-safe `X_future` matrix `feature_frame.py` builds — long-lag, calendar, and exogenous columns the regression model consumes to re-forecast a scenario | feature matrix (that is the training-time term) |
 | `scenario tag` | A free-text label on a saved `scenario_plan` (its own queryable JSONB-array column) for filtering and grouping the library | seeder `scenario` preset, registry `alias` |
+| `workspace` (showcase) | A saved showcase-run record (`showcase_workspace` row) — replay config + soft references to everything the run created | seeder `scenario` (a preset), `scenario plan` (a saved what-if), agent `session` |
 
 ## Event Taxonomy
 
@@ -113,6 +124,8 @@ agent_session ──owns──► message_history (JSONB) ──may-contain─
 job ──may-reference──► model_run (for train/backtest jobs)
 
 scenario_plan ──built-from──► model artifact (a baseline run_id) ──embeds──► comparison snapshot (JSONB)
+
+showcase_workspace ──soft-references──► model_run / scenario_plan / run_alias / agent_session / batch (JSONB ids, NO FK)
 ```
 
 ## Glossary (cross-cutting)
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index df636648..b54bf7e1 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -144,6 +144,16 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Notes:** the `POST /demo/run` body and `WS /demo/stream` events are documented in `docs/_base/API_CONTRACTS.md`. The pipeline mirrors `scripts/run_demo.py`; the per-step diagnosis for `make demo` above applies to the same steps. PRP-38 added the `scenario` field on `DemoRunRequest` (defaults to `demo_minimal`) and the additive `phase_name` / `phase_index` / `phase_total` fields on every `StepEvent`. PRP-39 added four new steps (`champion_compat_compare`, `stale_alias_trigger`, `safer_promote_flow`, `batch_preset`) and a new `portfolio` phase between `decision` and `verify`. PRP-40 added the `planning` + `knowledge` phases (5 steps inserted after `portfolio`, before `verify`) and the additive `IndexProjectDocsRequest.path_prefix` field on the RAG slice. PRP-41 — design Z renames the legacy `agent` phase to `agents`, swaps the legacy `step_agent` for `agent_hitl_flow` (HITL approval round-trip), and appends a new `ops` phase carrying `ops_snapshot` immediately before `cleanup`. Total: 24 rows / 10 phases on `showcase_rich`; demo_minimal / sparse keep the 11-row layout under the unified `agents` phase id. The frontend's `DemoPhasePanel.tsx` now carries `onValueChange` (issue #311) and the Showcase page adds a KPI strip + Run-history strip + Stop button + Inspect-Artifacts panel + one-click Approve button on the HITL step card. E2 (#391) — the Scenario control is a card grid exposing all 8 `ScenarioPreset` values with per-preset demo seed profiles (`_SCENARIO_SEED_PROFILE` is exhaustive over the enum; `holiday_rush` seeds a pinned Oct–Dec 2024 window); the 5 newly exposed presets keep the legacy 11-row layout.
 
+### Showcase workspace — preserve/restore/replay semantics (E1–E4, umbrella #389)
+**Surface:** the `/showcase` "Save as workspace" controls + **Saved workspaces** panel; `GET /demo/workspaces(/{id})`; `showcase_workspace` table. Endpoint contracts live in `docs/_base/API_CONTRACTS.md` — this entry covers the operational traps only.
+
+1. **Replay is verbatim — replaying a `reset=true` workspace WIPES the database.** Replay re-submits the recorded config exactly (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"`. A workspace saved from a Reset-database run therefore wipes + reseeds on every Replay; the panel styles such rows with a `DESTRUCTIVE` marker. These are the designed E4 semantics (#393), not a bug; there is deliberately no confirm dialog, for consistency with the Reset checkbox's severity styling.
+2. **Names are non-unique by design.** Every Replay creates a NEW `showcase_workspace` row; same-named rows accumulate (the replay regression test itself leaves two `replay-regression` rows). Disambiguate by `workspace_id` or `created_at` (panel lists newest first).
+3. **Rows accumulate — there is no DELETE endpoint yet** (a future epic; deletion was out of #393's scope). Rows are harmless audit records. `created_objects` ids are SOFT references (deliberately no FKs): an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists.
+4. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
+
+**Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`.
+
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.
 **Root cause:** `gh pr merge --merge` uses the **PR title** as the merge-commit subject. If that subject is a valid conventional commit of a non-bumping type (`chore`, `docs`, `refactor`, `test`, `ci`), release-please reads it at face value, classifies the whole merge as non-bumping, and stops. Prior dev→main merges done via the GitHub web UI used the default `Merge pull request #N from <branch>` subject — non-conventional — so release-please traversed to the underlying commits and bumped correctly.

From 5f3507a40d69d7da7bcec0fcd485284b5f96c14b Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Fri, 12 Jun 2026 18:38:54 +0200
Subject: [PATCH 02/32] docs(repo): track showcase workspace e5 prp (#401)

---
 .../PRP-showcase-workspace-E5-release-gate.md | 642 ++++++++++++++++++
 1 file changed, 642 insertions(+)
 create mode 100644 PRPs/PRP-showcase-workspace-E5-release-gate.md

diff --git a/PRPs/PRP-showcase-workspace-E5-release-gate.md b/PRPs/PRP-showcase-workspace-E5-release-gate.md
new file mode 100644
index 00000000..2c28cdf5
--- /dev/null
+++ b/PRPs/PRP-showcase-workspace-E5-release-gate.md
@@ -0,0 +1,642 @@
+name: "PRP showcase-workspace-E5 — release gate: 8-preset dogfood + workspace-mode dogfood + doc sweep + umbrella close-out"
+description: |
+  Issue #401 (epic E5 of umbrella #389, milestone showcase-workspace).
+  Release-gate epic: NO production code. Deliverables are (a) an executed
+  verification — a per-preset dogfood matrix across all 8 ScenarioPreset cards
+  on /showcase plus a workspace-mode (preservation=keep) dogfood with
+  list/Load/Replay + tag retrieval, on a fresh-DB stack; (b) a tracked docs
+  sweep — a Showcase-workspace section in docs/_base/RUNBOOKS.md and a
+  showcase_workspace aggregate + ubiquitous-language entry in
+  docs/_base/DOMAIN_MODEL.md; (c) evidence recorded on #401, umbrella #389
+  ticked + closed. If any dogfood check fails OUTSIDE the documented
+  expected-outcome matrix, the gate STOPS and files a fix issue — it never
+  fixes forward inside this epic.
+
+---
+
+## Goal
+
+Close umbrella #389 (showcase workspace — preserve, restore, replay) on **proof,
+not per-epic merges**. E1 #390, E2 #391, E3 #392, E4 #393 are all CLOSED and
+shipped in v0.2.22; nothing has yet verified their *combined* behavior across
+all 8 presets, nor the workspace keep-path on `showcase_rich` (E4's manual
+dogfood covered `demo_minimal` only), and the deferred RUNBOOKS/DOMAIN_MODEL
+documentation never landed.
+
+1. **8-preset dogfood matrix** — fresh-DB stack, then one `/showcase` run per
+   `ScenarioPreset` card with **Re-seed first** ticked (+ **Reset database**
+   where the matrix below requires it). Record green / expected-skip /
+   expected-fail per the RUNBOOKS entry-28 matrix; any deviation → STOP RULE.
+2. **Workspace-mode dogfood** — one `preservation="keep"` run each on
+   `demo_minimal` and `showcase_rich` (these double as those presets' matrix
+   rows). Verify: exactly one new `showcase_workspace` row per run, list/detail
+   endpoints, UI **Load** (config + artifacts re-attach) and **Replay** (new
+   row, green pipeline, no 409/500), and `GET /scenarios?tags=workspace:<name>`
+   returns the showcase-saved plans (E3, `showcase_rich` only).
+3. **Docs sweep** — add the Showcase-workspace operational section to
+   `docs/_base/RUNBOOKS.md` and the `showcase_workspace` aggregate +
+   `workspace` ubiquitous-language entry to `docs/_base/DOMAIN_MODEL.md`
+   (both currently have ZERO "workspace" mentions — verified 2026-06-12).
+4. **Regression coverage (verify only)** —
+   `tests/test_e2e_demo.py::test_demo_replay_same_config_twice` green in CI
+   (CI runs the full pytest incl. integration; latest dev run 27427250799 ✅)
+   and green in a targeted local re-run.
+5. **Close-out** — evidence comment on #401; tick ALL satisfied checkboxes on
+   #389 (live body has 11/11 unticked — drift) and fix the E5 line
+   ("not yet created" → "#401"); close #389; close #401 last.
+
+**End state**: #389 and #401 CLOSED with linked evidence; the two `docs/_base/`
+files document workspace semantics; this PRP file committed (`docs(repo)`
+precedent b1c8593).
+
+## Why
+
+- Every umbrella #389 success criterion is implemented but **none is ticked
+  with evidence**, and three of six are only provable by a live multi-preset
+  dogfood (8-preset green/skip matrix; restore/replay without 409/500;
+  workspace-tag retrieval).
+- E4's dogfood covered `demo_minimal` keep-runs only. The `showcase_rich`
+  keep-path is the one that exercises E3 tagging (planning phase exists only
+  there) and the 24-step `created_objects` recording — untested live as a
+  whole.
+- The umbrella explicitly deferred RUNBOOKS/DOMAIN_MODEL documentation to E5;
+  operators currently have no runbook for replay-of-`reset=true` destructive
+  semantics, non-unique names, row accumulation (no DELETE), or the
+  `holiday_rush` union-window replay trap.
+
+## What
+
+A verification campaign plus a docs-only repo change. No `app/`, `frontend/`,
+or `alembic/` change is in scope. Tracked changes: this PRP file +
+`docs/_base/RUNBOOKS.md` + `docs/_base/DOMAIN_MODEL.md`, one branch
+(`docs/showcase-workspace-e5-gate`), one PR into `dev`.
+
+### Success Criteria (mirror of #401 sub-tasks)
+
+- [ ] Fresh-DB stack built via the **DROP/CREATE DATABASE** procedure (NOT
+      `down -v` — see Known Gotchas) + `alembic upgrade head` clean.
+- [ ] 8/8 preset matrix executed and recorded; every outcome matches the
+      expected-outcome matrix (Known Gotchas) — zero undocumented ❌.
+- [ ] `demo_minimal` keep-run: 1 new workspace row (status `completed`),
+      listed in **Saved workspaces**, Load restores config + artifacts panel,
+      Replay completes green with a NEW distinct `workspace_id`.
+- [ ] `showcase_rich` keep-run: same as above PLUS `created_objects` carries
+      `winning_run_id`/`v2_run_id`/`alias`/`scenario_plan_ids`/`batch_id`, and
+      `GET /scenarios?tags=workspace:<name>` returns ≥1 plan tagged
+      `["showcase", …, "source:showcase", "workspace:<name>"]`.
+- [ ] Legacy frame back-compat re-confirmed: one run WITHOUT workspace fields
+      behaves as today (no workspace row created for it).
+- [ ] `test_demo_replay_same_config_twice` green: targeted local run + CI
+      citation.
+- [ ] RUNBOOKS.md gains the Showcase-workspace section (4 mandated topics);
+      DOMAIN_MODEL.md gains the aggregate + ubiquitous-language row.
+- [ ] Five validation gates green on the docs branch (ruff, format, mypy,
+      pyright, unit pytest) + targeted frontend vitest for the workspace
+      components.
+- [ ] Evidence on #401; #389 checkboxes ticked + E5 line fixed; #389 closed;
+      #401 closed.
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# ── The gate's contract ──────────────────────────────────────────────────────
+- issue: "#401 — gh issue view 401"
+  why: The epic's six sub-tasks this PRP encodes verbatim.
+
+- issue: "#389 — gh issue view 389 --json body"
+  why: "Umbrella. DRIFT (verified 2026-06-12): ALL 11 checkboxes unticked
+       (5 decomposition + 6 success criteria); the E5 decomposition line still
+       reads 'not yet created'. Tick every satisfied box, update the E5 line to
+       '#401', close with a close-out comment. E1-E4 = #390 #391 #392 #393,
+       all CLOSED, shipped v0.2.22."
+
+- file: PRPs/PRP-reliability-E6-release-gate.md
+  why: "The release-gate precedent this PRP mirrors (STOP rule, evidence
+       format, close-out order). ONE CORRECTION: its 'docker compose down -v'
+       fresh-stack step is superseded — see the fresh-stack procedure below."
+
+- file: PRPs/PRP-showcase-workspace-E4-restore-replay.md
+  why: "Restore-vs-Replay designed semantics (replay is always keep; config
+       verbatim incl. reset/skip_seed; no provenance column; no DELETE) — the
+       semantics the RUNBOOKS section must document."
+
+# ── What 'green' means per preset ────────────────────────────────────────────
+- file: docs/_base/RUNBOOKS.md
+  why: "'Showcase page (/showcase) pipeline fails at step X' items 1-28.
+       Entry 28 is THE per-preset expected-outcome matrix (sparse may-fail,
+       holiday_rush pinned window + union-range trap, others green). Items
+       9-26 list acceptable ⏭️/⚠️ on showcase_rich. ALSO the doc-sweep target:
+       add the new '### Showcase workspace …' section AFTER this incident
+       section's closing Notes paragraph."
+
+- file: app/features/demo/pipeline.py
+  why: "_phase_table(scenario) (line 2528): showcase_rich = 24 steps / 10
+       phases (data 7, modeling 2, decision 5, portfolio 1, planning 2,
+       knowledge 3, verify 1, agents 1, ops 1, cleanup 1); ALL other presets =
+       the legacy 11-step / 6-phase table. _SCENARIO_SEED_PROFILE (lines
+       513-538): showcase_rich/retail_standard/high_variance/stockout_heavy
+       5×15×180d, new_launches 5×25×180d, holiday_rush PINNED
+       2024-10-01..2024-12-31. READ-ONLY."
+
+# ── Workspace surface (what the keep-runs must prove) ────────────────────────
+- file: app/features/demo/models.py
+  why: "showcase_workspace table (line 57): workspace_id String(32) UNIQUE,
+       status CHECK ∈ {running, completed, failed} (lines 32-34), name
+       NON-unique, seed/scenario/reset/skip_seed config columns,
+       store_id/product_id/date_start/date_end grain columns, created_objects
+       JSONB (sparse keys: winning_run_id, v2_run_id, v2_model_path, alias,
+       agent_session_id, batch_id, scenario_plan_ids, scenario_artifact_key,
+       train_model_types, stale_alias_run_id), result_summary JSONB. Soft
+       references only — NO FKs. This is the DOMAIN_MODEL aggregate source."
+
+- file: app/features/demo/routes.py
+  why: "GET /demo/workspaces (lines 70-97; response {workspaces:[…]}, newest
+       first, limit 1-100/offset), GET /demo/workspaces/{id} (100-125; 404
+       problem+json when missing), POST /demo/run (41-67), WS /demo/stream
+       (128-156; workspace_name without preservation='keep' → one error
+       event). No DELETE endpoint exists — verified; that's a runbook fact."
+
+- file: app/features/demo/schemas.py
+  why: "DemoRunRequest defaults: seed=42, reset=false, skip_seed=true,
+       scenario='demo_minimal', preservation='ephemeral', workspace_name=None;
+       workspace_name pattern ^[a-z0-9][a-z0-9\\-_]*$, ≤100 chars. ScenarioPreset
+       = the 8 enum values. WorkspaceListItem vs WorkspaceDetailResponse
+       (detail adds grain/window + created_objects)."
+
+- file: tests/test_e2e_demo.py
+  why: "test_demo_replay_same_config_twice (line ~561, @pytest.mark.integration):
+       POSTs the IDENTICAL keep-body (seed 42, reset=true, skip_seed=false,
+       demo_minimal, workspace_name='replay-regression') twice against a
+       SUBPROCESS uvicorn on :8124; asserts both pass, distinct workspace_ids,
+       both listed completed. The #146/#324 regression guard. NOTE: it RESETS
+       the DB — never run it mid-dogfood."
+
+# ── Frontend dogfood surface ─────────────────────────────────────────────────
+- file: frontend/src/pages/showcase.tsx
+  why: "Controls: scenario card grid (8 presets), 'Re-seed first' →
+       skip_seed=false, 'Reset database' → reset=true, seed input, 'Save as
+       workspace' checkbox (line 332) + name input (344, mirrors the backend
+       pattern), Run/Stop. handleReplayWorkspace (174-186) re-submits the
+       recorded config VERBATIM with preservation='keep' (+ recorded name).
+       Starting any run detaches a loaded workspace (140)."
+
+- file: frontend/src/components/demo/WorkspacePanel.tsx
+  why: "'Saved workspaces' panel — Load (restore config + artifacts, no run)
+       and Replay buttons per row; reset=true rows render a destructive-styled
+       marker (line ~38/94). Vitest: WorkspacePanel.test.tsx,
+       WorkspaceArtifactsPanel.test.tsx, RunHistoryStrip.test.tsx,
+       ScenarioPicker.test.tsx."
+
+- file: frontend/src/components/demo/ScenarioPicker.tsx
+  why: "The 8 preset cards with wall-clock estimates; sparse carries
+       caveatKind='expected-skip' ('May fail at features/backtest (NaN WAPE) —
+       expected; see runbook', line ~66-72)."
+
+# ── Doc-sweep targets ────────────────────────────────────────────────────────
+- file: docs/_base/DOMAIN_MODEL.md
+  why: "Sweep target. Add '### showcase_workspace (Demo)' under Core
+       Aggregates (mirror the scenario_plan entry's shape: Root / JSONB fields
+       / Invariants), one Ubiquitous Language row ('workspace' vs seeder
+       'scenario' vs 'scenario plan'), and one Entity Relationship Summary
+       line (soft-references, no FK)."
+
+- file: docs/_base/API_CONTRACTS.md
+  why: "READ-ONLY here — E4 already documented the workspace endpoints + WS
+       fields (commit ee844f1). Cross-check the docs sweep against it; do NOT
+       duplicate endpoint tables into RUNBOOKS."
+
+# ── Close-out mechanics ──────────────────────────────────────────────────────
+- file: .claude/rules/umbrella-issue.md
+  why: "Write discipline for gh mutations: dry-run echo → idempotent check →
+       approval gate → confirm. Applies to the #389 body edit + closes."
+
+- file: .claude/rules/output-formatting.md
+  why: "Evidence-comment format: emoji status indicators, box separators,
+       ≤40 lines."
+```
+
+### Current Codebase tree (verification-relevant subset)
+
+```bash
+app/features/demo/models.py          # showcase_workspace ORM (E1)
+app/features/demo/pipeline.py        # _phase_table + _SCENARIO_SEED_PROFILE
+app/features/demo/routes.py          # /demo/run, /demo/workspaces[,/{id}], WS
+app/features/demo/schemas.py         # DemoRunRequest, ScenarioPreset, Workspace*
+app/features/demo/tests/             # test_workspace.py, test_routes.py, test_pipeline.py
+tests/test_e2e_demo.py               # test_demo_replay_same_config_twice (:561)
+frontend/src/pages/showcase.tsx      # dogfood entry point
+frontend/src/components/demo/        # WorkspacePanel, ScenarioPicker, … (+ vitest)
+docs/_base/RUNBOOKS.md               # sweep target 1 (zero 'workspace' today)
+docs/_base/DOMAIN_MODEL.md           # sweep target 2 (zero 'workspace' today)
+docker-compose.gpu.yml               # GPU overlay — REQUIRED for ollama legs
+docker-compose.lan.yml               # untracked local overlay — NOT used here
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+PRPs/PRP-showcase-workspace-E5-release-gate.md   # ADD — this file
+docs/_base/RUNBOOKS.md                           # MOD — +'### Showcase workspace' section
+docs/_base/DOMAIN_MODEL.md                       # MOD — +aggregate, +UL row, +ER line
+# No app/, frontend/, or alembic/ change is in scope.
+```
+
+### Known Gotchas & Environment Quirks
+
+```python
+# ── STOP RULE (governs the whole epic) ───────────────────────────────────────
+# If ANY preset run or workspace check deviates from the expected-outcome
+# matrix below: capture evidence (step table / screenshot / response body),
+# open a NEW fix issue referencing #389 + #401, comment the failure on #401,
+# and STOP the close-out. The docs sweep (Task 7) still lands — it documents
+# already-shipped E1-E4 semantics and is independent of dogfood outcomes.
+# A DOCUMENTED expected-fail (sparse) or sanctioned ⏭️/⚠️ is NOT a deviation.
+
+# ── Fresh stack — SUPERSEDES the reliability-E6 procedure ────────────────────
+# NEVER `docker compose down -v`: it removes ALL named volumes incl.
+# forecastlab_ollama_models (pulled gemma4/qwen3 models, expensive to rebuild).
+# Fresh-DB equivalent (memory: fresh-stack-gate-procedure, hit 2026-06-12):
+#   docker compose --profile gpu down --remove-orphans
+#   docker compose -f docker-compose.yml -f docker-compose.gpu.yml --profile gpu up -d
+#   docker compose exec -T postgres psql -U forecastlab -d postgres \
+#     -c "DROP DATABASE IF EXISTS forecastlab WITH (FORCE);" \
+#     -c "CREATE DATABASE forecastlab OWNER forecastlab;"
+#   uv run alembic upgrade head        # cold-boot proof on the empty DB
+# GOTCHA: WITHOUT the gpu overlay, ollama runs CPU-only and the showcase
+# rag_index_subset step HARD-FAILS (probe says reachable=True but the cold
+# qwen3-embedding:4b load exceeds the 60s embedding ReadTimeout → 502).
+# Verify `docker exec forecastlab-ollama nvidia-smi` works, then WARM the
+# embedder before any showcase_rich run (~41s cold-on-GPU, ~2.4s warm):
+#   curl -s localhost:11434/api/embed -d '{"model":"qwen3-embedding:4b","input":"warmup"}'
+# GOTCHA: the fresh DB wipes app_config runtime overrides — agent model
+# reverts to .env (agent_default_model=ollama:gemma4-agent on this host).
+# Re-check GET /config/ai after boot.
+# GOTCHA: a stale uvicorn from a prior session can hold :8123 — curl then hits
+# OLD code. lsof -iTCP:8123 -sTCP:LISTEN and kill stale PIDs first.
+# Run the backend as LOCAL uvicorn from the REPO ROOT (host-filesystem
+# artifacts for verify/feature-metadata; docs/ visible to rag_index_subset —
+# the compose backend image lacks docs/, which is why docker-compose.lan.yml
+# exists; do NOT use that overlay here). pnpm 11 depsStatusCheck can stall
+# `pnpm dev` — start Vite directly: cd frontend && ./node_modules/.bin/vite --host 0.0.0.0
+
+# ── Per-preset expected-outcome matrix (RUNBOOKS entry 28 — the gate's spec) ─
+# Every run: 'Re-seed first' TICKED (skip_seed=false). seed=42.
+#   demo_minimal      11 steps  GREEN (this run = the demo_minimal keep-run)
+#   retail_standard   11 steps  GREEN
+#   high_variance     11 steps  GREEN
+#   stockout_heavy    11 steps  GREEN
+#   new_launches      11 steps  GREEN
+#   sparse            11 steps  GREEN **or documented FAIL** at features/
+#                     backtest (50% missing grains / all-NaN WAPE gate) —
+#                     the card carries the expected-skip badge; either
+#                     outcome = matrix-conformant; record which occurred
+#   holiday_rush      11 steps  GREEN — tick **Reset database** TOO (pinned
+#                     2024-10-01..12-31 window; re-seed without reset ADDS
+#                     rows → /seeder/status reports the union range)
+#   showcase_rich     24 steps / 10 phases GREEN — run LAST, tick **Reset
+#                     database** TOO (clears holiday_rush's pinned window so
+#                     the 180d today-anchored window seeds clean; also clears
+#                     accumulated model_run rows). This run = the
+#                     showcase_rich keep-run.
+# ACCEPTABLE non-green steps on showcase_rich (RUNBOOKS items 9-26):
+#   agent_hitl_flow ⏭️ (KNOWN on this host: gemma4-agent 2B reliably skips —
+#   no Approve button appears; memory showcase-crypto-randomuuid-lan-crash),
+#   rag_index_subset / rag_retrieve_probe ⏭️ (provider unreachable/rejected —
+#   should NOT happen with the GPU overlay + warm-up; investigate if hit),
+#   verify ⏭️ (V2 prophet_like winner — artifact roots differ),
+#   champion_compat_compare / safer_promote_flow ⏭️ (missing V1/V2 — should
+#   NOT happen with Re-seed; investigate), batch_preset ⚠️ (90s poll timeout),
+#   ops_snapshot ⚠️. ANY other ❌/⏭️ = deviation → STOP RULE.
+# Only ONE pipeline at a time (module asyncio.Lock; 2nd start → one error
+# event / 409; Stop releases the lock in ~5s). Budget: ~90s-3min per 11-step
+# run, 5-10 min showcase_rich; whole matrix ~25-40 min.
+
+# ── Workspace-mode mechanics ─────────────────────────────────────────────────
+# workspace_name pattern ^[a-z0-9][a-z0-9\-_]*$ (lowercase!), ≤100 chars; use
+# e5-gate-minimal / e5-gate-rich. The UI cannot send a name unless the 'Save
+# as workspace' checkbox is ticked; over raw WS, workspace_name without
+# preservation='keep' → one error event (negative probe, optional).
+# Replay re-submits reset/skip_seed VERBATIM: replaying a reset=true row IS
+# DESTRUCTIVE (wipes + reseeds) — that's designed semantics (E4) and a
+# mandated RUNBOOKS topic, not a bug. Names are NON-unique by design; every
+# replay creates a NEW row. Rows accumulate; there is NO DELETE endpoint.
+# localStorage run-history ('forecastlab.showcase.runs.v1', FIFO 5) EXCLUDES
+# workspace runs — keep-runs appear only in the server-backed panel.
+# GET /scenarios?tags=workspace:<label> — label is the workspace_name when
+# provided, else workspace_id; JSONB containment, ALL listed tags must match.
+# Planning phase (scenario plans) exists ONLY on showcase_rich — the tag
+# check is meaningless on demo_minimal keep-runs.
+# The /scenarios run_id field is the ARTIFACT KEY, not registry run_id —
+# different ID spaces (memory: scenario-run-id-vs-registry-run-id).
+
+# ── Tests / gates ────────────────────────────────────────────────────────────
+# test_demo_replay_same_config_twice spins its OWN uvicorn on :8124 but hits
+# the SAME compose Postgres and RESETS it (reset=true) — run it ONLY in
+# Task 6, after the dogfood matrix, never concurrently with a :8123 run.
+# NEVER run the full integration suite as a gate — known shared-state
+# pollution (memory: integration-suite-shared-state-pollution). Targeted only.
+# `pnpm tsc --noEmit` is VACUOUS (solution-style tsconfig, 0 files) and
+# `tsc -b` has pre-existing dev failures — frontend evidence = targeted vitest.
+# Seeder does NOT reset ID sequences — discover store/product IDs via
+# /dimensions/* if any manual curl needs them; never assume id=1.
+# Playwright MCP + `playwright install` fail on this host — use native Python
+# Playwright with executable_path="/snap/bin/chromium" (symlink verified
+# present) or the agent-browser skill. localhost:5173 is fine (no E3
+# secure-context requirement in this gate).
+
+# ── Docs sweep ───────────────────────────────────────────────────────────────
+# Repo has MIXED CRLF/LF line endings (memory: repo-line-endings-crlf) —
+# after editing the two docs, check `git diff --stat` for whole-file noise
+# before committing; touch only the lines you mean to.
+# RUNBOOKS insertion point: a NEW '### Showcase workspace …' incident-style
+# section AFTER the '### Showcase page (/showcase) pipeline fails at step X'
+# section's closing **Notes** paragraph (before '### release-please skipped…').
+# DOMAIN_MODEL: mirror the scenario_plan aggregate's structure; the no-FK
+# rationale: created_objects are SOFT references because the referenced
+# objects (runs, plans, aliases) are independently operator-deletable — the
+# workspace row is an audit record, not an ownership root.
+# These files are imported into every agent session's context — keep both
+# additions tight (~25-35 lines RUNBOOKS, ~15-20 lines DOMAIN_MODEL).
+
+# ── Third-party API claims ───────────────────────────────────────────────────
+# None. This PRP cites no new library attributes; every verification command
+# is first-party (curl/pytest/grep/gh) and listed inline. (Policy per #258.)
+
+# ── GitHub close-out ─────────────────────────────────────────────────────────
+# Write discipline (.claude/rules/umbrella-issue.md): echo each gh mutation
+# before running it.
+# #389 body edit: fetch with `gh issue view 389 --json body`, tick the 5
+# Decomposition boxes + all 6 Success-criteria boxes, change the E5 line's
+# 'not yet created' → '#401', push back via `gh issue edit 389 --body-file`.
+# Preserve everything else byte-identical. Do NOT pattern-match checkbox text
+# from this PRP — edit the fetched live markdown.
+# Close order: PR opened first → evidence comment on #401 → tick #389 →
+# close #389 (comment links the #401 evidence) → close #401 last.
+# The PR needs 1 approving review + CI — it will NOT merge autonomously;
+# opening it is enough to proceed (reliability-E6 precedent).
+```
+
+## Implementation Blueprint
+
+### Data models and structure
+
+None. Zero schemas, zero migrations, zero source changes. The only authored
+content is two markdown sections (Task 7) whose required topics are fixed by
+issue #401.
+
+### List of tasks in execution order
+
+```yaml
+Task 0 — Preflight:
+  VERIFY branch: git switch dev && git pull → clean, up to date.
+  VERIFY no stale server: lsof -iTCP:8123 -sTCP:LISTEN → kill stale PIDs.
+  VERIFY chromium: ls -la /snap/bin/chromium (else plan agent-browser skill).
+  VERIFY epics: for n in 390 391 392 393; do gh issue view "$n" --json state; done → all CLOSED (re-confirm).
+  RECORD: git rev-parse HEAD → the SHA all evidence refers to.
+
+Task 1 — Fresh-DB stack (memory-corrected procedure; NEVER down -v):
+  RUN: docker compose --profile gpu down --remove-orphans
+  RUN: docker compose -f docker-compose.yml -f docker-compose.gpu.yml --profile gpu up -d
+  VERIFY: docker exec forecastlab-ollama nvidia-smi → GPU visible
+  RUN: docker compose exec -T postgres psql -U forecastlab -d postgres \
+         -c "DROP DATABASE IF EXISTS forecastlab WITH (FORCE);" \
+         -c "CREATE DATABASE forecastlab OWNER forecastlab;"
+  RUN: uv run alembic upgrade head           # MUST exit 0 on the empty DB
+  WARM embedder: curl -s localhost:11434/api/embed \
+         -d '{"model":"qwen3-embedding:4b","input":"warmup"}'  # expect <60s
+  START backend: uv run uvicorn app.main:app --port 8123  (background, repo
+         root, log to file); VERIFY curl -s localhost:8123/health → {"status":"ok"}
+  VERIFY config: curl -s localhost:8123/config/ai → agent model is the .env
+         value (ollama:gemma4-agent); providers health as expected.
+  START frontend: cd frontend && ./node_modules/.bin/vite --host 0.0.0.0
+         (background); VERIFY curl -sI localhost:5173 → 200.
+
+Task 2 — Legacy-frame back-compat probe (cheap, before the matrix):
+  DRIVE one run with NO workspace fields (UI defaults: demo_minimal,
+  Re-seed first ticked, Save-as-workspace UNticked) → green 11 steps.
+  ASSERT: curl -s 'localhost:8123/demo/workspaces?limit=100' → zero rows
+  (fresh DB + ephemeral run created none). This is the byte-compat evidence.
+  NOTE: this same run seeds demo data; it is NOT the demo_minimal matrix row
+  (that one is the keep-run in Task 3).
+
+Task 3 — Workspace keep-run #1 (= demo_minimal matrix row):
+  UI: scenario=demo_minimal, Re-seed first ✓, Save as workspace ✓,
+      name=e5-gate-minimal → Run → green 11 steps.
+  ASSERT (curl): GET /demo/workspaces → exactly 1 row named e5-gate-minimal,
+      status=completed; GET /demo/workspaces/{id} → seed=42, scenario,
+      reset=false, skip_seed=false, created_objects.winning_run_id set.
+  UI: 'Saved workspaces' panel lists the row → click Load → config
+      repopulates + WorkspaceArtifactsPanel renders (links resolve) →
+      click Replay → green pipeline → a SECOND distinct row appears.
+  CAPTURE: screenshot of panel with both rows + the step table.
+
+Task 4 — Preset matrix (the six remaining 11-step presets: five plain, then holiday_rush):
+  FOR preset IN [retail_standard, high_variance, stockout_heavy,
+                 new_launches, sparse]:
+    UI: select card, Re-seed first ✓ (no Reset, no workspace) → Run.
+    RECORD: per-step outcome table; expected GREEN for the first four;
+            sparse = GREEN or the documented features/backtest FAIL
+            (record which; a sparse fail is matrix-conformant, NOT a stop).
+  THEN holiday_rush: Re-seed first ✓ AND Reset database ✓ → Run → GREEN;
+    RECORD /seeder/status date range == 2024-10-01..2024-12-31 (pinned,
+    no union range because Reset was ticked).
+  ON ANY non-conformant outcome: STOP RULE (RUNBOOKS items 1-28 give the
+  per-step diagnosis; file the fix issue; docs sweep still proceeds).
+
+Task 5 — Workspace keep-run #2 (= showcase_rich matrix row; E3 tag proof):
+  UI: scenario=showcase_rich, Re-seed first ✓, Reset database ✓ (clears the
+      holiday_rush pinned window), Save as workspace ✓, name=e5-gate-rich
+      → Run → 24 steps / 10 phases, zero ❌ (acceptable ⏭️/⚠️ per matrix);
+      if the HITL Approve button appears within its 90s window, click it
+      (a ⏭️ skip is acceptable — KNOWN on this host).
+  ASSERT (curl): GET /demo/workspaces/{id} → created_objects carries
+      winning_run_id, v2_run_id, alias, scenario_plan_ids (≥1), batch_id.
+  ASSERT (curl): GET '/scenarios?tags=workspace:e5-gate-rich' → ≥1 plan;
+      its tags ⊇ ["showcase", "source:showcase", "workspace:e5-gate-rich"].
+  UI: Load + Replay the e5-gate-rich row → green re-run, NEW distinct row
+      (replay survives accumulated model_run rows — the live #146/#324 proof
+      on the 24-step path). NOTE the row renders reset=true destructively
+      styled, and the replay re-seeds — expected designed semantics.
+  CAPTURE: full-page screenshot + step table + the two curl bodies.
+
+Task 6 — Replay regression test (verify-only sub-task):
+  CITE CI: gh run list --workflow ci.yml --branch dev --limit 1 → success
+      (run 27427250799 at gate time; re-cite the current latest).
+  RUN targeted (AFTER the dogfood — the test RESETS the shared DB):
+      uv run pytest "tests/test_e2e_demo.py::test_demo_replay_same_config_twice" -v -m integration
+  EXPECT: pass in ≤ ~8 min (two 240s-budget runs on :8124).
+
+Task 7 — Docs sweep (lands regardless of dogfood outcome):
+  BRANCH: git switch -c docs/showcase-workspace-e5-gate  (off dev)
+  MODIFY docs/_base/RUNBOOKS.md — ADD '### Showcase workspace (E1-E4 #389)'
+    AFTER the showcase-incident section's Notes paragraph, covering exactly:
+    (1) Replay is verbatim incl. reset — replaying a reset=true workspace is
+        DESTRUCTIVE (wipes + reseeds); the panel styles such rows
+        destructively; this is designed (E4), not a bug.
+    (2) Names are non-unique by design — every replay creates a NEW row;
+        disambiguate by workspace_id / created_at.
+    (3) Rows accumulate — no DELETE endpoint yet (future epic); harmless
+        audit records; created_objects are soft references that may dangle
+        if an operator deletes the underlying run/plan/alias.
+    (4) holiday_rush replay: the row replays the pinned 2024 window; without
+        Reset the re-seed ADDS rows → /seeder/status reports the union
+        range; tick Reset for a clean pinned window (cross-ref entry 28).
+  MODIFY docs/_base/DOMAIN_MODEL.md —
+    ADD '### showcase_workspace (Demo)' under Core Aggregates (mirror the
+      scenario_plan entry): Root ShowcaseWorkspace(workspace_id, status);
+      status machine running → completed | failed; JSONB created_objects
+      (sparse soft-reference keys) + result_summary; invariants: name
+      non-unique, config columns (seed/scenario/reset/skip_seed) sufficient
+      for verbatim replay, NO FKs (audit record, not ownership root — the
+      referenced objects are independently deletable).
+    ADD Ubiquitous Language row: `workspace` = a saved showcase run record
+      (config + soft references) | NOT: seeder `scenario` (a preset), NOT
+      `scenario plan` (a saved what-if).
+    ADD ER summary line:
+      showcase_workspace ──soft-references──► model_run / scenario_plan /
+      run_alias / job artifacts (JSONB ids, no FK)
+  CHECK: git diff --stat → only intended lines (CRLF/LF noise guard).
+  COMMIT 1: docs(docs): add showcase workspace runbook and domain model entries (#401)
+  COMMIT 2: docs(repo): track showcase workspace e5 prp (#401)   # this file
+  PUSH; OPEN PR into dev (needs 1 review + CI; opening suffices to proceed).
+
+Task 8 — Five validation gates (on the docs branch):
+  RUN: uv run ruff check . && uv run ruff format --check .
+  RUN: uv run mypy app/ && uv run pyright app/
+  RUN: uv run pytest -v -m "not integration"
+  PLUS frontend workspace evidence:
+       cd frontend && pnpm test --run src/components/demo/
+  ALL must pass. A failure on an untouched surface = regression → STOP RULE.
+
+Task 9 — Evidence + close-out (gh write discipline: echo each command first;
+         ONLY if Tasks 1-6 were fully matrix-conformant):
+  COMMENT on #401: evidence block per output-formatting.md — HEAD SHA,
+    fresh-DB proof, the 8-preset matrix table (preset / steps / outcome /
+    skips with reasons), workspace keep-run + Load/Replay + tag-retrieval
+    results, replay-test + CI citation, gate results, screenshot paths,
+    PR link for the docs sweep.
+  EDIT #389 body: tick the 5 Decomposition boxes + all 6 Success-criteria
+    boxes; update the E5 line 'not yet created' → '#401'. Byte-preserve the
+    rest (fetch live body; never retype it).
+  CLOSE #389: gh issue close 389 --comment "<close-out linking the #401
+    evidence + epics #390 #391 #392 #393 + v0.2.22>"
+  CLOSE #401: gh issue close 401 --comment "<gate complete — evidence above;
+    docs PR <link> lands through normal review>"
+
+Task 10 — Teardown:
+  STOP the background uvicorn + vite processes.
+  LEAVE the seeded DB + workspace rows in place (operator-visible artifacts).
+  LEAVE the compose stack (postgres + GPU ollama) up — shared session state.
+```
+
+### Integration Points
+
+```yaml
+GITHUB:
+  - issue #401: evidence comment + close
+  - issue #389: body checkbox tick + E5-line fix + close-out comment + close
+  - PR: docs branch (RUNBOOKS + DOMAIN_MODEL + this PRP) into dev
+
+RUNTIME (consumers only — no code integration):
+  - compose Postgres :5433 + GPU ollama :11434 (gpu overlay, warmed embedder)
+  - local uvicorn :8123 (repo root), Vite :5173
+  - test-owned uvicorn :8124 (Task 6 only)
+```
+
+## Validation Loop
+
+### Level 1 — environment sanity (before anything else)
+
+```bash
+git status --short && git rev-parse --abbrev-ref HEAD      # dev, clean
+lsof -iTCP:8123 -sTCP:LISTEN                                # must be empty
+docker compose ps                                           # postgres healthy
+docker exec forecastlab-ollama nvidia-smi | head -3         # GPU overlay active
+curl -s http://localhost:8123/health                        # {"status":"ok"} after Task 1
+```
+
+### Level 2 — targeted committed proofs
+
+```bash
+# Workspace slice units (fast, no DB):
+uv run pytest app/features/demo/tests/ -v -m "not integration"
+# Replay regression (Task 6 ONLY — resets the shared DB; integration):
+uv run pytest "tests/test_e2e_demo.py::test_demo_replay_same_config_twice" -v -m integration
+# Frontend workspace components:
+cd frontend && pnpm test --run src/components/demo/ && cd ..
+```
+
+### Level 3 — live system (the dogfood matrix + workspace probes)
+
+```bash
+# Matrix: 8 preset runs at http://localhost:5173/showcase per Tasks 3-5 (Task 2 is the legacy probe, not a matrix row).
+# Workspace API probes:
+curl -s 'http://localhost:8123/demo/workspaces?limit=100' | python3 -m json.tool | head -40
+curl -s "http://localhost:8123/demo/workspaces/<id>" | python3 -m json.tool
+curl -s 'http://localhost:8123/scenarios?tags=workspace:e5-gate-rich' | python3 -m json.tool | head -40
+# Negative probe (404 problem+json):
+curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8123/demo/workspaces/nonexistent000
+```
+
+### Level 4 — repo gates (docs branch)
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+uv run pytest -v -m "not integration"
+```
+
+## Final validation Checklist
+
+- [ ] Fresh DB via DROP/CREATE (NOT down -v); `alembic upgrade head` clean;
+      GPU ollama up + embedder warmed
+- [ ] Legacy-frame run green; zero workspace rows created by it
+- [ ] 8/8 matrix rows recorded; outcomes conformant (sparse fail OK if it
+      matches the documented mode; holiday_rush pinned window verified)
+- [ ] demo_minimal keep-run: row completed; Load + Replay green; distinct ids
+- [ ] showcase_rich keep-run: 24 steps zero ❌; created_objects populated;
+      `tags=workspace:e5-gate-rich` retrieval returns tagged plans;
+      Load + Replay green with a new row
+- [ ] `test_demo_replay_same_config_twice` green locally + CI run cited
+- [ ] RUNBOOKS section (4 topics) + DOMAIN_MODEL aggregate/UL/ER added;
+      `git diff --stat` shows only intended lines
+- [ ] Five gates green + frontend demo-component vitest green
+- [ ] Evidence on #401; #389 ticked (11 boxes) + E5 line fixed; #389 closed;
+      #401 closed; docs PR open into dev
+- [ ] Background servers stopped; compose stack + seeded DB left in place
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't `docker compose down -v` — it destroys the Ollama models volume;
+     use the DROP/CREATE DATABASE procedure
+- ❌ Don't run showcase_rich with CPU-only Ollama or a cold embedder —
+     rag_index_subset hard-fails (502 ReadTimeout), polluting the matrix
+- ❌ Don't fix forward inside the gate — a non-conformant outcome files a new
+     issue and STOPS the close-out (the docs sweep still lands)
+- ❌ Don't treat the documented sparse fail or RUNBOOKS-sanctioned ⏭️/⚠️ as a
+     deviation — but don't hand-wave an undocumented ❌ either
+- ❌ Don't run `test_demo_replay_same_config_twice` (or the full integration
+     suite) mid-dogfood — both mutate the shared DB
+- ❌ Don't skip Reset on holiday_rush or on the showcase_rich run after it —
+     the union-window trap corrupts both rows of the matrix
+- ❌ Don't uppercase the workspace name — the pattern rejects it at 422
+- ❌ Don't retype #389's body — fetch, tick, push back byte-preserved
+- ❌ Don't duplicate API_CONTRACTS endpoint tables into RUNBOOKS — link them
+- ❌ Don't `gh pr merge` anything dev→main here — the release cut is a
+     separate stop-and-ask decision
+
+## Confidence Score: 8.5/10
+
+One-pass success likelihood is high: every check maps to a named committed
+test, an exact curl, or a UI control pinned to file:line; the per-preset
+expected-outcome matrix is lifted verbatim from RUNBOOKS entry 28; the
+fresh-stack procedure incorporates the hard-won 2026-06-12 corrections
+(Ollama volume, GPU overlay, embedder warm-up); and the umbrella-drift state
+was verified live. Residual risk (−1.5): the matrix has non-deterministic legs
+(sparse's two sanctioned outcomes, agent_hitl_flow timing, batch_preset on a
+loaded laptop) that may force a re-run or RUNBOOKS triage, and browser
+automation on snap chromium remains the most fragile dependency.

From 863fc1ad19f0f43d9ddb3dbede2ea1ba2a44c4a8 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Fri, 12 Jun 2026 18:38:54 +0200
Subject: [PATCH 03/32] chore(repo): sync uv.lock to v0.2.22 (#401)

---
 uv.lock | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/uv.lock b/uv.lock
index 61725802..f7cf5bdf 100644
--- a/uv.lock
+++ b/uv.lock
@@ -821,7 +821,7 @@ wheels = [
 
 [[package]]
 name = "forecastlabai"
-version = "0.2.21"
+version = "0.2.22"
 source = { editable = "." }
 dependencies = [
     { name = "alembic" },

From 967588e4c805b0c5b118eaafc60f3c586699033a Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Fri, 12 Jun 2026 19:24:25 +0200
Subject: [PATCH 04/32] feat(api,ui): add showcase workspace delete endpoint
 and panel action (#404)

---
 app/features/demo/routes.py                   |  44 +++++++-
 app/features/demo/tests/test_routes.py        |  96 ++++++++++++++++
 app/features/demo/tests/test_workspace.py     |  18 +++
 app/features/demo/workspace.py                |  28 ++++-
 docs/_base/API_CONTRACTS.md                   |   1 +
 docs/_base/DOMAIN_MODEL.md                    |   6 +-
 docs/_base/RUNBOOKS.md                        |  13 ++-
 .../components/demo/WorkspacePanel.test.tsx   |  96 +++++++++++++++-
 .../src/components/demo/WorkspacePanel.tsx    | 104 ++++++++++++++++--
 frontend/src/hooks/use-workspaces.test.ts     |  90 +++++++++++++++
 frontend/src/hooks/use-workspaces.ts          |  19 +++-
 frontend/src/pages/showcase.tsx               |   7 +-
 12 files changed, 502 insertions(+), 20 deletions(-)
 create mode 100644 frontend/src/hooks/use-workspaces.test.ts

diff --git a/app/features/demo/routes.py b/app/features/demo/routes.py
index 6d3284c4..d2881acb 100644
--- a/app/features/demo/routes.py
+++ b/app/features/demo/routes.py
@@ -3,8 +3,10 @@
 Exposes:
 - ``POST /demo/run``    -- synchronous; runs the whole pipeline, returns a result.
 - ``WS   /demo/stream`` -- streams one StepEvent per step for the live UI.
-- ``GET  /demo/workspaces``                 -- E4 (#393): list saved workspaces.
-- ``GET  /demo/workspaces/{workspace_id}``  -- E4 (#393): one workspace's detail.
+- ``GET    /demo/workspaces``                 -- E4 (#393): list saved workspaces.
+- ``GET    /demo/workspaces/{workspace_id}``  -- E4 (#393): one workspace's detail.
+- ``DELETE /demo/workspaces/{workspace_id}``  -- delete the workspace METADATA
+  row only; the run's created objects are soft references and stay untouched.
 
 The run/stream handlers obtain the live FastAPI app from ``request.app`` /
 ``websocket.app`` and pass it into the pipeline -- the slice never imports
@@ -16,7 +18,15 @@
 
 import json
 
-from fastapi import APIRouter, Depends, Query, Request, WebSocket, WebSocketDisconnect
+from fastapi import (
+    APIRouter,
+    Depends,
+    Query,
+    Request,
+    WebSocket,
+    WebSocketDisconnect,
+    status,
+)
 from pydantic import ValidationError
 from sqlalchemy.ext.asyncio import AsyncSession
 
@@ -125,6 +135,34 @@ async def get_showcase_workspace(
     return WorkspaceDetailResponse.model_validate(row)
 
 
+@router.delete(
+    "/workspaces/{workspace_id}",
+    status_code=status.HTTP_204_NO_CONTENT,
+    summary="Delete a saved showcase workspace",
+    description=(
+        "Delete one saved workspace METADATA row. Everything the run created "
+        "(model runs, scenario plans, aliases, jobs, artifacts) is a soft "
+        "reference and is NOT deleted."
+    ),
+)
+async def delete_showcase_workspace(
+    workspace_id: str,
+    db: AsyncSession = Depends(get_db),
+) -> None:
+    """Delete a saved showcase workspace metadata row.
+
+    Args:
+        workspace_id: External identifier of the workspace.
+        db: Async database session from dependency.
+
+    Raises:
+        NotFoundError: When no workspace matches ``workspace_id``.
+    """
+    deleted = await workspace.delete_workspace(db, workspace_id)
+    if not deleted:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+
+
 @router.websocket("/stream")
 async def stream_demo_pipeline(websocket: WebSocket) -> None:
     """Stream one StepEvent per pipeline step over a WebSocket.
diff --git a/app/features/demo/tests/test_routes.py b/app/features/demo/tests/test_routes.py
index 016049db..6fd5b84a 100644
--- a/app/features/demo/tests/test_routes.py
+++ b/app/features/demo/tests/test_routes.py
@@ -316,6 +316,41 @@ async def fake_get(_db, workspace_id: str) -> SimpleNamespace:
     assert body["date_end"] == "2026-03-31"
 
 
+# =============================================================================
+# DELETE /demo/workspaces/{workspace_id} (unit)
+# =============================================================================
+
+
+async def test_delete_workspace_204(client, monkeypatch):
+    """A deleted workspace row yields 204 with an empty body."""
+    seen: dict[str, str] = {}
+
+    async def fake_delete(_db, workspace_id: str) -> bool:
+        seen["workspace_id"] = workspace_id
+        return True
+
+    monkeypatch.setattr(workspace, "delete_workspace", fake_delete)
+
+    resp = await client.delete("/demo/workspaces/" + "c" * 32)
+    assert resp.status_code == 204
+    assert resp.content == b""
+    assert seen["workspace_id"] == "c" * 32
+
+
+async def test_delete_workspace_404(client, monkeypatch):
+    """An unknown workspace_id is a 404 problem+json."""
+
+    async def fake_delete(_db, _workspace_id: str) -> bool:
+        return False
+
+    monkeypatch.setattr(workspace, "delete_workspace", fake_delete)
+
+    resp = await client.delete("/demo/workspaces/" + "0" * 32)
+    assert resp.status_code == 404
+    assert resp.headers["content-type"].startswith("application/problem+json")
+    assert "Workspace not found" in resp.json()["detail"]
+
+
 # =============================================================================
 # E4 (#393) -- workspace GET routes against real Postgres (integration)
 # =============================================================================
@@ -368,3 +403,64 @@ async def test_get_workspace_integration_round_trip(client, db_session: AsyncSes
     missing = await client.get("/demo/workspaces/" + "f" * 32)
     assert missing.status_code == 404
     assert missing.headers["content-type"].startswith("application/problem+json")
+
+
+@pytest.mark.integration
+async def test_delete_workspace_integration_round_trip(client, db_session: AsyncSession):
+    """DELETE removes exactly the target metadata row; a re-delete is 404."""
+    kept = await workspace.create_workspace(
+        DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": "del-kept"})
+    )
+    target = await workspace.create_workspace(
+        DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": "del-target"})
+    )
+    assert kept is not None and target is not None
+
+    resp = await client.delete(f"/demo/workspaces/{target}")
+    assert resp.status_code == 204
+
+    # The deleted row is gone; the sibling row is untouched.
+    assert (await client.get(f"/demo/workspaces/{target}")).status_code == 404
+    survivors = await client.get("/demo/workspaces")
+    assert [w["workspace_id"] for w in survivors.json()["workspaces"]] == [kept]
+
+    # Deleting again is a 404 problem+json (no idempotent 204).
+    again = await client.delete(f"/demo/workspaces/{target}")
+    assert again.status_code == 404
+    assert again.headers["content-type"].startswith("application/problem+json")
+
+
+@pytest.mark.integration
+async def test_delete_workspace_integration_keeps_created_objects(client, db_session: AsyncSession):
+    """Deleting a workspace never deletes (or resolves) its soft references.
+
+    The workspace references one REAL cross-slice object (an agent session)
+    plus one dangling run id -- the delete must succeed without touching the
+    former or resolving the latter (no-FK soft-reference contract).
+    """
+    session_resp = await client.post("/agents/sessions", json={"agent_type": "experiment"})
+    assert session_resp.status_code == 201
+    agent_session_id = session_resp.json()["session_id"]
+    try:
+        workspace_id = await workspace.create_workspace(
+            DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": "del-softref"})
+        )
+        assert workspace_id is not None
+        row = await workspace.get_workspace(db_session, workspace_id)
+        assert row is not None
+        row.created_objects = {
+            "agent_session_id": agent_session_id,
+            "winning_run_id": "run-dangling-never-created",
+        }
+        await db_session.commit()
+
+        resp = await client.delete(f"/demo/workspaces/{workspace_id}")
+        assert resp.status_code == 204
+
+        # The metadata row is gone...
+        assert (await client.get(f"/demo/workspaces/{workspace_id}")).status_code == 404
+        # ...but the soft-referenced agent session still exists.
+        still_there = await client.get(f"/agents/sessions/{agent_session_id}")
+        assert still_there.status_code == 200
+    finally:
+        await client.delete(f"/agents/sessions/{agent_session_id}")
diff --git a/app/features/demo/tests/test_workspace.py b/app/features/demo/tests/test_workspace.py
index 110254c4..0b002be3 100644
--- a/app/features/demo/tests/test_workspace.py
+++ b/app/features/demo/tests/test_workspace.py
@@ -164,3 +164,21 @@ async def test_list_workspaces_newest_first_limit_offset(db_session: AsyncSessio
 async def test_get_workspace_missing_returns_none(db_session: AsyncSession) -> None:
     """get_workspace returns None for an unknown id."""
     assert await workspace.get_workspace(db_session, "0" * 32) is None
+
+
+async def test_delete_workspace_removes_only_target_row(db_session: AsyncSession) -> None:
+    """delete_workspace removes exactly the matching metadata row."""
+    id_a = await workspace.create_workspace(_keep_request(workspace_name="it-del-a"))
+    id_b = await workspace.create_workspace(_keep_request(workspace_name="it-del-b"))
+    assert id_a is not None and id_b is not None
+
+    assert await workspace.delete_workspace(db_session, id_a) is True
+
+    assert await workspace.get_workspace(db_session, id_a) is None
+    remaining = await workspace.list_workspaces(db_session)
+    assert [r.workspace_id for r in remaining] == [id_b]
+
+
+async def test_delete_workspace_missing_returns_false(db_session: AsyncSession) -> None:
+    """delete_workspace returns False (no raise) for an unknown id."""
+    assert await workspace.delete_workspace(db_session, "0" * 32) is False
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index 40b20807..b0e65dad 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -14,7 +14,8 @@
 
 :func:`get_workspace` / :func:`list_workspaces` / :func:`count_workspaces` are
 routed since E4 (epic #393) by ``GET /demo/workspaces`` and
-``GET /demo/workspaces/{workspace_id}`` in ``app/features/demo/routes.py``.
+``GET /demo/workspaces/{workspace_id}`` in ``app/features/demo/routes.py``;
+:func:`delete_workspace` backs ``DELETE /demo/workspaces/{workspace_id}``.
 """
 
 from __future__ import annotations
@@ -195,6 +196,31 @@ async def list_workspaces(
     return list(result.scalars().all())
 
 
+async def delete_workspace(db: AsyncSession, workspace_id: str) -> bool:
+    """Delete a workspace METADATA row; return ``True`` when a row was removed.
+
+    Deletes ONLY the ``showcase_workspace`` row. Everything the run created --
+    model runs, scenario plans, aliases, jobs, agent sessions, artifacts -- is
+    carried as OPAQUE SOFT REFERENCES in ``created_objects`` (no ForeignKeys
+    by design, see ``app/features/demo/models.py``) and is deliberately left
+    untouched: the workspace is an audit record, never an ownership root.
+
+    Args:
+        db: An open async session (caller-owned).
+        workspace_id: The external id of the row to delete.
+
+    Returns:
+        ``True`` when a row was deleted, ``False`` when none matched.
+    """
+    row = await get_workspace(db, workspace_id)
+    if row is None:
+        return False
+    await db.delete(row)
+    await db.commit()
+    logger.info("demo.workspace_deleted", workspace_id=workspace_id)
+    return True
+
+
 async def count_workspaces(db: AsyncSession) -> int:
     """Count all workspace rows (E4, issue #393).
 
diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index 7947f13a..d307d2a9 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -62,6 +62,7 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page |
 | demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table |
 | demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing |
+| demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
 | config | GET | `/config/ai` | Effective AI-model config (agent LLM + RAG embeddings); API keys masked, never raw |
 | config | PATCH | `/config/ai` | Persist + apply AI-model changes live (no restart). `409` if an embedding-dimension change would orphan indexed RAG chunks (resend with `force=true`) |
 | config | GET | `/config/providers/health` | Per-provider connectivity — Ollama probed live, cloud providers by API-key presence |
diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index 06ba2d30..1ec3200b 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -56,14 +56,18 @@
   - An agent-saved plan (`source='agent'`) is persisted ONLY after the human approves it through the HITL gate — it always carries the approval audit trail.
 
 ### `showcase_workspace` (Demo)
-- **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run.
+- **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run. Ephemeral runs (the default) write no row; a `workspace_name` merely labels a keep-run row (names are non-unique).
 - **Status state machine:** `running` → `completed` | `failed` (CHECK-constrained; the finalize hook settles the row even on mid-run failure).
+- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the two JSONB payloads below.
 - **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
+- **Relationship to demo pipeline runs:** one workspace row per kept pipeline run — `create_workspace` inserts it as `running` before the first step; `finalize_workspace` settles it with the run's collected ids. NOT a seeder `scenario`: a preset is a reusable data-generation recipe; a workspace is the record of ONE concrete run (which preset it used, with what seed, and what it produced).
 - **Invariants:**
   - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) are sufficient for a verbatim Replay through the normal run path — replay never mutates the original row; it creates a NEW row.
   - `name` is deliberately NON-unique; `workspace_id` (UUID hex) is the unique handle.
   - `created_objects` carries SOFT references only — **no ForeignKeys by design**. The workspace row is an audit record, not an ownership root: the referenced runs/plans/aliases are independently operator-deletable, and a workspace must never block (or cascade) their deletion.
+  - Deletion is METADATA-ONLY, symmetric with the no-FK design: `DELETE /demo/workspaces/{id}` removes the `showcase_workspace` row and nothing else — the soft-referenced model runs, scenario plans, aliases, jobs, agent sessions, and artifacts survive, and a workspace whose references already dangle still deletes cleanly.
   - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`).
+- **Out of scope (deliberately not modeled yet):** a `replayed_from` provenance column, export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture, advanced seed config, and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace.
 
 ## Key Invariants — NEVER violate
 
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index b54bf7e1..007176be 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -144,16 +144,21 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Notes:** the `POST /demo/run` body and `WS /demo/stream` events are documented in `docs/_base/API_CONTRACTS.md`. The pipeline mirrors `scripts/run_demo.py`; the per-step diagnosis for `make demo` above applies to the same steps. PRP-38 added the `scenario` field on `DemoRunRequest` (defaults to `demo_minimal`) and the additive `phase_name` / `phase_index` / `phase_total` fields on every `StepEvent`. PRP-39 added four new steps (`champion_compat_compare`, `stale_alias_trigger`, `safer_promote_flow`, `batch_preset`) and a new `portfolio` phase between `decision` and `verify`. PRP-40 added the `planning` + `knowledge` phases (5 steps inserted after `portfolio`, before `verify`) and the additive `IndexProjectDocsRequest.path_prefix` field on the RAG slice. PRP-41 — design Z renames the legacy `agent` phase to `agents`, swaps the legacy `step_agent` for `agent_hitl_flow` (HITL approval round-trip), and appends a new `ops` phase carrying `ops_snapshot` immediately before `cleanup`. Total: 24 rows / 10 phases on `showcase_rich`; demo_minimal / sparse keep the 11-row layout under the unified `agents` phase id. The frontend's `DemoPhasePanel.tsx` now carries `onValueChange` (issue #311) and the Showcase page adds a KPI strip + Run-history strip + Stop button + Inspect-Artifacts panel + one-click Approve button on the HITL step card. E2 (#391) — the Scenario control is a card grid exposing all 8 `ScenarioPreset` values with per-preset demo seed profiles (`_SCENARIO_SEED_PROFILE` is exhaustive over the enum; `holiday_rush` seeds a pinned Oct–Dec 2024 window); the 5 newly exposed presets keep the legacy 11-row layout.
 
-### Showcase workspace — preserve/restore/replay semantics (E1–E4, umbrella #389)
-**Surface:** the `/showcase` "Save as workspace" controls + **Saved workspaces** panel; `GET /demo/workspaces(/{id})`; `showcase_workspace` table. Endpoint contracts live in `docs/_base/API_CONTRACTS.md` — this entry covers the operational traps only.
+### Showcase workspace — preserve/restore/replay/delete semantics (E1–E4, umbrella #389)
+**Surface:** the `/showcase` "Save as workspace" controls + **Saved workspaces** panel; `GET/DELETE /demo/workspaces(/{id})`; `showcase_workspace` table. Endpoint contracts live in `docs/_base/API_CONTRACTS.md` — this entry covers the operational traps only.
+
+**Lifecycle modes:** an **ephemeral** run (the default) writes no workspace row — it lives only in the localStorage Run-history strip. A **keep** run (`preservation="keep"` / the "Save as workspace" checkbox) records a `showcase_workspace` row with the run's replay config and soft references to what it created. A **named** keep run additionally carries the operator-supplied `workspace_name` label (non-unique). Kept rows back the panel's **Load** (restore config + artifact links, read-only), **Replay** (re-run verbatim), and **Delete** (remove the saved record) actions.
 
 1. **Replay is verbatim — replaying a `reset=true` workspace WIPES the database.** Replay re-submits the recorded config exactly (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"`. A workspace saved from a Reset-database run therefore wipes + reseeds on every Replay; the panel styles such rows with a `DESTRUCTIVE` marker. This is designed E4 semantics (#393), not a bug — there is deliberately no confirm dialog (consistency with the Reset checkbox's severity styling).
 2. **Names are non-unique by design.** Every Replay creates a NEW `showcase_workspace` row; same-named rows accumulate (the replay regression test itself leaves two `replay-regression` rows). Disambiguate by `workspace_id` or `created_at` (panel lists newest first).
-3. **Rows accumulate — there is no DELETE endpoint yet** (a future epic; deletion was out of #393's scope). Rows are harmless audit records. `created_objects` ids are SOFT references (deliberately no FKs): an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists.
-4. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
+3. **Rows accumulate unless deleted.** `DELETE /demo/workspaces/{workspace_id}` (and the panel's per-row **Delete** button, behind a confirmation dialog) removes a saved row; a request for a missing id returns an RFC 7807 404. Rows that are never deleted are harmless audit records.
+4. **Deleting a workspace deletes METADATA ONLY.** The delete removes just the `showcase_workspace` row — the model runs, scenario plans, aliases, jobs, agent sessions, and on-disk artifacts the run created are NOT touched (and the seeded data is not reverted). `created_objects` ids are SOFT references (deliberately no FKs), so deletion in either direction never cascades: an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists.
+5. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`.
 
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** advanced seed configuration on `/showcase` (beyond seed/scenario/reset/skip_seed); export bundles under `artifacts/showcase/<workspace>/`; a `replayed_from` provenance column (replays are indistinguishable from fresh keep-runs except by name/timestamp); RAG-event and approval-decision capture on the workspace row; full phase-level interactive configuration.
+
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.
 **Root cause:** `gh pr merge --merge` uses the **PR title** as the merge-commit subject. If that subject is a valid conventional commit of a non-bumping type (`chore`, `docs`, `refactor`, `test`, `ci`), release-please reads it at face value, classifies the whole merge as non-bumping, and stops. Prior dev→main merges done via the GitHub web UI used the default `Merge pull request #N from <branch>` subject — non-conventional — so release-please traversed to the underlying commits and bumped correctly.
diff --git a/frontend/src/components/demo/WorkspacePanel.test.tsx b/frontend/src/components/demo/WorkspacePanel.test.tsx
index 2d08aa40..843415f0 100644
--- a/frontend/src/components/demo/WorkspacePanel.test.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.test.tsx
@@ -1,9 +1,24 @@
 import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
-import { cleanup, fireEvent, render } from '@testing-library/react'
-import { afterEach, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest'
+import { toast } from 'sonner'
 import { WorkspacePanel } from './WorkspacePanel'
+import { ApiError } from '@/lib/api'
 import type { WorkspaceListItem, WorkspaceListResponse } from '@/types/api'
 
+beforeAll(() => {
+  // Radix AlertDialog needs these in jsdom (pattern: cancel-run-dialog.test.tsx).
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+  if (!Element.prototype.hasPointerCapture) {
+    Element.prototype.hasPointerCapture = () => false
+  }
+})
+
 afterEach(() => {
   cleanup()
   vi.clearAllMocks()
@@ -26,8 +41,18 @@ let mockResponse: { data: WorkspaceListResponse | undefined; isLoading: boolean
   isLoading: false,
 }
 
+let mockDeleteResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } = {
+  mutate: vi.fn(),
+  isPending: false,
+}
+
 vi.mock('@/hooks/use-workspaces', () => ({
   useWorkspaces: () => mockResponse,
+  useDeleteWorkspace: () => mockDeleteResult,
+}))
+
+vi.mock('sonner', () => ({
+  toast: { success: vi.fn(), error: vi.fn() },
 }))
 
 function renderPanel(props: Partial<Parameters<typeof WorkspacePanel>[0]> = {}) {
@@ -103,3 +128,70 @@ describe('WorkspacePanel', () => {
     expect(buttons.every((b) => b.disabled)).toBe(true)
   })
 })
+
+describe('WorkspacePanel — delete', () => {
+  function openDeleteDialog() {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    mockDeleteResult = { mutate: vi.fn(), isPending: false }
+    const result = renderPanel({ onDeleted: vi.fn() })
+    fireEvent.click(screen.getByLabelText('Delete workspace e4-panel'))
+    return result
+  }
+
+  it('renders a Delete action for each saved workspace row', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const { container } = renderPanel()
+    const buttons = Array.from(container.querySelectorAll('button'))
+    expect(buttons.some((b) => (b.textContent ?? '').includes('Delete'))).toBe(true)
+  })
+
+  it('shows a confirmation whose copy makes metadata-only deletion clear', () => {
+    openDeleteDialog()
+    // The mutation must not fire before confirmation.
+    expect(mockDeleteResult.mutate).not.toHaveBeenCalled()
+    // Radix renders the dialog in a portal — read the whole document.
+    const copy = document.body.textContent ?? ''
+    expect(copy).toContain('Delete workspace "e4-panel"?')
+    expect(copy).toContain('only the saved workspace record')
+    expect(copy).toContain('NOT deleted')
+  })
+
+  it('confirming deletes the row and notifies the page on success', () => {
+    const onDeleted = vi.fn()
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    mockDeleteResult = { mutate: vi.fn(), isPending: false }
+    renderPanel({ onDeleted })
+    fireEvent.click(screen.getByLabelText('Delete workspace e4-panel'))
+    fireEvent.click(screen.getByTestId('workspace-delete-confirm'))
+
+    expect(mockDeleteResult.mutate).toHaveBeenCalledTimes(1)
+    const [workspaceId, options] = mockDeleteResult.mutate.mock.calls[0] as [
+      string,
+      { onSuccess: () => void; onError: (error: unknown) => void },
+    ]
+    expect(workspaceId).toBe(baseItem.workspace_id)
+
+    // Success path: the page hook is told so it can drop a loaded workspace;
+    // the list refetch itself lives in useDeleteWorkspace (hook test).
+    options.onSuccess()
+    expect(onDeleted).toHaveBeenCalledWith(baseItem.workspace_id)
+    expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('were kept'))
+  })
+
+  it('cancelling the dialog never fires the mutation', () => {
+    openDeleteDialog()
+    fireEvent.click(screen.getByText('Keep workspace'))
+    expect(mockDeleteResult.mutate).not.toHaveBeenCalled()
+  })
+
+  it('surfaces a failed delete via the error toast', () => {
+    openDeleteDialog()
+    fireEvent.click(screen.getByTestId('workspace-delete-confirm'))
+    const [, options] = mockDeleteResult.mutate.mock.calls[0] as [
+      string,
+      { onSuccess: () => void; onError: (error: unknown) => void },
+    ]
+    options.onError(new ApiError('Workspace not found: ' + 'a'.repeat(32), 404))
+    expect(toast.error).toHaveBeenCalledWith(expect.stringContaining('Delete failed'))
+  })
+})
diff --git a/frontend/src/components/demo/WorkspacePanel.tsx b/frontend/src/components/demo/WorkspacePanel.tsx
index 6638b597..3231cf14 100644
--- a/frontend/src/components/demo/WorkspacePanel.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.tsx
@@ -1,22 +1,37 @@
 /**
  * E4 (#393) — server-backed saved-workspaces panel for the Showcase page.
  *
- * Lists `showcase_workspace` rows (newest first) with two actions per row:
+ * Lists `showcase_workspace` rows (newest first) with three actions per row:
  * - Load   — re-attach: the page repopulates the run controls + renders the
  *            artifact deep-link cards. Read-only; no run starts.
  * - Replay — re-run: the page re-submits the recorded config verbatim through
  *            the existing WS run path with preservation="keep".
+ * - Delete — remove the saved workspace METADATA row only (confirmed via
+ *            dialog). The run's created objects — model runs, scenario plans,
+ *            aliases, jobs, artifacts — are soft references and stay intact.
  *
  * The panel stays dumb: it hands the LIST item to the page callbacks; detail
  * fetching (created_objects) lives in the page via useWorkspace.
  */
 
-import { useEffect } from 'react'
+import { useEffect, useState } from 'react'
 import { useQueryClient } from '@tanstack/react-query'
-import { FolderOpen, Play } from 'lucide-react'
+import { FolderOpen, Play, Trash2 } from 'lucide-react'
+import { toast } from 'sonner'
+import {
+  AlertDialog,
+  AlertDialogAction,
+  AlertDialogCancel,
+  AlertDialogContent,
+  AlertDialogDescription,
+  AlertDialogFooter,
+  AlertDialogHeader,
+  AlertDialogTitle,
+} from '@/components/ui/alert-dialog'
 import { Button } from '@/components/ui/button'
 import { Card, CardContent } from '@/components/ui/card'
-import { useWorkspaces } from '@/hooks/use-workspaces'
+import { useDeleteWorkspace, useWorkspaces } from '@/hooks/use-workspaces'
+import { ApiError, getErrorMessage } from '@/lib/api'
 import type { WorkspaceListItem } from '@/types/api'
 
 interface WorkspacePanelProps {
@@ -24,7 +39,9 @@ interface WorkspacePanelProps {
   onLoad: (ws: WorkspaceListItem) => void
   /** Called when the operator clicks Replay — re-run the recorded config. */
   onReplay: (ws: WorkspaceListItem) => void
-  /** Disables both actions while a pipeline run is in flight. */
+  /** Called after a workspace row was deleted — lets the page drop a loaded one. */
+  onDeleted?: (workspaceId: string) => void
+  /** Disables all actions while a pipeline run is in flight. */
   isRunning: boolean
   /** summary.workspaceId of the latest kept run — triggers a list refetch. */
   lastWorkspaceId: string | null
@@ -46,9 +63,43 @@ function winnerOf(ws: WorkspaceListItem): string | null {
   return typeof winner === 'string' ? winner : null
 }
 
-export function WorkspacePanel({ onLoad, onReplay, isRunning, lastWorkspaceId }: WorkspacePanelProps) {
+function labelOf(ws: WorkspaceListItem): string {
+  return ws.name ?? ws.workspace_id.slice(0, 8)
+}
+
+export function WorkspacePanel({
+  onLoad,
+  onReplay,
+  onDeleted,
+  isRunning,
+  lastWorkspaceId,
+}: WorkspacePanelProps) {
   const { data, isLoading } = useWorkspaces()
   const queryClient = useQueryClient()
+  const deleteWorkspace = useDeleteWorkspace()
+  // The row awaiting confirmation — one shared dialog instead of one per row.
+  const [pendingDelete, setPendingDelete] = useState<WorkspaceListItem | null>(null)
+
+  const handleConfirmDelete = () => {
+    const ws = pendingDelete
+    if (!ws) return
+    setPendingDelete(null)
+    deleteWorkspace.mutate(ws.workspace_id, {
+      onSuccess: () => {
+        toast.success(
+          `Workspace "${labelOf(ws)}" deleted — its model runs, scenarios, and artifacts were kept.`
+        )
+        onDeleted?.(ws.workspace_id)
+      },
+      onError: (error) => {
+        toast.error(`Delete failed: ${getErrorMessage(error)}`)
+        // A 404 means the row is already gone server-side — drop the stale entry.
+        if (error instanceof ApiError && error.status === 404) {
+          void queryClient.invalidateQueries({ queryKey: ['workspaces'] })
+        }
+      },
+    })
+  }
 
   // Refetch the list once the latest kept run settles — syncing React state to
   // an external system (the server-backed list) is the sanctioned effect use.
@@ -85,7 +136,7 @@ export function WorkspacePanel({ onLoad, onReplay, isRunning, lastWorkspaceId }:
                 className="flex flex-wrap items-center justify-between gap-2 rounded-md border px-3 py-2 text-xs"
               >
                 <div className="flex flex-wrap items-center gap-3 font-mono">
-                  <span className="font-semibold">{ws.name ?? ws.workspace_id.slice(0, 8)}</span>
+                  <span className="font-semibold">{labelOf(ws)}</span>
                   <span className="rounded bg-muted px-2 py-0.5">{ws.scenario}</span>
                   <span>seed {ws.seed}</span>
                   <span className={statusClass(ws.status)}>{ws.status.toUpperCase()}</span>
@@ -118,12 +169,51 @@ export function WorkspacePanel({ onLoad, onReplay, isRunning, lastWorkspaceId }:
                     <Play className="mr-1 h-3 w-3" />
                     Replay
                   </Button>
+                  <Button
+                    size="sm"
+                    variant="ghost"
+                    className="text-destructive"
+                    disabled={isRunning || deleteWorkspace.isPending}
+                    onClick={() => setPendingDelete(ws)}
+                    aria-label={`Delete workspace ${labelOf(ws)}`}
+                  >
+                    <Trash2 className="mr-1 h-3 w-3" />
+                    Delete
+                  </Button>
                 </div>
               </li>
             ))}
           </ul>
         )}
       </CardContent>
+
+      {/* Shared confirmation dialog for the row pending deletion. */}
+      <AlertDialog
+        open={pendingDelete !== null}
+        onOpenChange={(open) => {
+          if (!open) setPendingDelete(null)
+        }}
+      >
+        <AlertDialogContent>
+          <AlertDialogHeader>
+            <AlertDialogTitle>
+              Delete workspace {pendingDelete ? `"${labelOf(pendingDelete)}"` : ''}?
+            </AlertDialogTitle>
+            <AlertDialogDescription>
+              This removes only the saved workspace record — its replay config
+              and artifact links. The model runs, scenario plans, aliases, jobs,
+              and artifacts the run created are NOT deleted and remain available
+              elsewhere in the app. This cannot be undone.
+            </AlertDialogDescription>
+          </AlertDialogHeader>
+          <AlertDialogFooter>
+            <AlertDialogCancel>Keep workspace</AlertDialogCancel>
+            <AlertDialogAction onClick={handleConfirmDelete} data-testid="workspace-delete-confirm">
+              Delete workspace
+            </AlertDialogAction>
+          </AlertDialogFooter>
+        </AlertDialogContent>
+      </AlertDialog>
     </Card>
   )
 }
diff --git a/frontend/src/hooks/use-workspaces.test.ts b/frontend/src/hooks/use-workspaces.test.ts
new file mode 100644
index 00000000..d804f96b
--- /dev/null
+++ b/frontend/src/hooks/use-workspaces.test.ts
@@ -0,0 +1,90 @@
+/**
+ * Unit tests for the use-workspaces hooks (``useDeleteWorkspace``).
+ *
+ * Stubs ``fetch`` to assert the hook issues a DELETE to the workspace
+ * endpoint and invalidates the workspaces list on success; no real backend
+ * is exercised (pattern: ``use-batches.test.ts``).
+ */
+import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
+import { act, renderHook, waitFor } from '@testing-library/react'
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { createElement, type ReactNode } from 'react'
+
+import { useDeleteWorkspace } from './use-workspaces'
+import { ApiError } from '@/lib/api'
+
+function makeWrapper(client: QueryClient) {
+  return function Wrapper({ children }: { children: ReactNode }) {
+    return createElement(QueryClientProvider, { client }, children)
+  }
+}
+
+afterEach(() => {
+  vi.unstubAllGlobals()
+})
+
+describe('useDeleteWorkspace', () => {
+  it('issues a DELETE to /demo/workspaces/{id} and invalidates the list', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(new Response(null, { status: 204 }))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({
+      defaultOptions: { queries: { retry: false } },
+    })
+    const invalidateSpy = vi.spyOn(client, 'invalidateQueries')
+    const { result } = renderHook(() => useDeleteWorkspace(), {
+      wrapper: makeWrapper(client),
+    })
+
+    const workspaceId = 'a'.repeat(32)
+    await act(async () => {
+      result.current.mutate(workspaceId)
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    expect(fetchMock).toHaveBeenCalledTimes(1)
+    const call = fetchMock.mock.calls[0]!
+    expect(String(call[0])).toContain(`/demo/workspaces/${workspaceId}`)
+    expect((call[1] as RequestInit).method).toBe('DELETE')
+
+    // Success invalidates every ['workspaces', ...] query — the panel list
+    // refetches and the deleted row disappears.
+    expect(invalidateSpy).toHaveBeenCalledWith({ queryKey: ['workspaces'] })
+  })
+
+  it('surfaces an RFC 7807 404 as ApiError on the mutation', async () => {
+    const problem = {
+      type: '/errors/not-found',
+      title: 'Not Found',
+      status: 404,
+      detail: 'Workspace not found: ' + 'f'.repeat(32),
+      code: 'NOT_FOUND',
+    }
+    vi.stubGlobal(
+      'fetch',
+      vi.fn().mockResolvedValue(
+        new Response(JSON.stringify(problem), {
+          status: 404,
+          headers: { 'content-type': 'application/problem+json' },
+        }),
+      ),
+    )
+
+    const client = new QueryClient({
+      defaultOptions: { queries: { retry: false } },
+    })
+    const { result } = renderHook(() => useDeleteWorkspace(), {
+      wrapper: makeWrapper(client),
+    })
+
+    await act(async () => {
+      result.current.mutate('f'.repeat(32))
+    })
+    await waitFor(() => expect(result.current.isError).toBe(true))
+
+    const error = result.current.error
+    expect(error).toBeInstanceOf(ApiError)
+    expect((error as ApiError).status).toBe(404)
+    expect((error as ApiError).message).toContain('Workspace not found')
+  })
+})
diff --git a/frontend/src/hooks/use-workspaces.ts b/frontend/src/hooks/use-workspaces.ts
index 8fc02054..76fd01bd 100644
--- a/frontend/src/hooks/use-workspaces.ts
+++ b/frontend/src/hooks/use-workspaces.ts
@@ -1,4 +1,4 @@
-import { useQuery } from '@tanstack/react-query'
+import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
 import { api } from '@/lib/api'
 import type { WorkspaceDetail, WorkspaceListResponse } from '@/types/api'
 
@@ -23,3 +23,20 @@ export function useWorkspace(workspaceId: string, enabled = true) {
     enabled: enabled && !!workspaceId,
   })
 }
+
+/**
+ * Delete a saved workspace METADATA row; invalidates the workspaces list on
+ * success. Server-side this removes only the `showcase_workspace` record —
+ * the run's created objects (model runs, scenario plans, aliases, jobs,
+ * artifacts) are soft references and stay untouched.
+ */
+export function useDeleteWorkspace() {
+  const queryClient = useQueryClient()
+  return useMutation({
+    mutationFn: (workspaceId: string) =>
+      api<void>(`/demo/workspaces/${workspaceId}`, { method: 'DELETE' }),
+    onSuccess: () => {
+      void queryClient.invalidateQueries({ queryKey: ['workspaces'] })
+    },
+  })
+}
diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index 61d5b947..b7eb4444 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -241,10 +241,15 @@ export default function ShowcasePage() {
         scenario={scenario}
       />
 
-      {/* E4 (#393) — server-backed saved workspaces (Load + Replay). */}
+      {/* E4 (#393) — server-backed saved workspaces (Load + Replay + Delete). */}
       <WorkspacePanel
         onLoad={handleLoadWorkspace}
         onReplay={handleReplayWorkspace}
+        onDeleted={(workspaceId) => {
+          // Deleting the currently loaded workspace detaches its artifacts
+          // panel — the metadata row backing it is gone (created objects stay).
+          if (selectedWorkspaceId === workspaceId) setSelectedWorkspaceId(null)
+        }}
         isRunning={isRunning}
         lastWorkspaceId={summary?.workspaceId ?? null}
       />
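
The metadata-only delete semantics documented in RUNBOOKS.md (items 3 and 4 of the Showcase workspace entry) can be sketched in a few lines of Python. This is a minimal illustration that assumes a toy in-memory store; `Store`, `delete_workspace`, and `dangling_refs` are hypothetical names and do not reflect the project's actual service code or schema.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy in-memory stand-in for the database (hypothetical, not the real schema)."""

    runs: dict[str, dict] = field(default_factory=dict)
    workspaces: dict[str, dict] = field(default_factory=dict)


def delete_workspace(store: Store, workspace_id: str) -> None:
    """Remove only the workspace row; referenced objects are never touched."""
    if workspace_id not in store.workspaces:
        # In the real API this case is reported as an RFC 7807 404.
        raise KeyError(f"Workspace not found: {workspace_id}")
    del store.workspaces[workspace_id]


def dangling_refs(store: Store, workspace_id: str) -> list[str]:
    """List run-id soft references on a workspace that no longer resolve."""
    created = store.workspaces[workspace_id]["created_objects"]
    run_keys = ("winning_run_id", "v2_run_id")
    return [k for k in run_keys if k in created and created[k] not in store.runs]


store = Store()
store.runs["run-1"] = {"model_type": "naive"}
store.workspaces["a" * 32] = {"created_objects": {"winning_run_id": "run-1"}}
store.workspaces["b" * 32] = {"created_objects": {"winning_run_id": "run-1"}}

# Deleting a workspace leaves the run it points at in place.
delete_workspace(store, "a" * 32)
assert "run-1" in store.runs

# Deleting the run leaves the other workspace row in place, with a dangling link.
del store.runs["run-1"]
assert dangling_refs(store, "b" * 32) == ["winning_run_id"]
```

Because the references are plain ids rather than foreign keys, neither delete needs to know about the other table, which is why a dangling deep link on a loaded workspace is expected rather than an error.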

From fc390bfad5841a38c41e6473a3b4b3e776dba340 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Fri, 12 Jun 2026 23:42:27 +0200
Subject: [PATCH 05/32] docs(repo): track showcase-completion e1-e5 prps (#406)

---
 ...pletion-E1-metadata-provenance-backbone.md | 1031 ++++++++++++++
 ...ase-completion-E2-safe-replay-lifecycle.md | 1247 +++++++++++++++++
 ...howcase-completion-E3-seed-config-scope.md | 1080 ++++++++++++++
 ...completion-E4-run-config-phase-controls.md |  820 +++++++++++
 ...e-completion-E5-agent-rag-story-capture.md | 1185 ++++++++++++++++
 5 files changed, 5363 insertions(+)
 create mode 100644 PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
 create mode 100644 PRPs/PRP-showcase-completion-E2-safe-replay-lifecycle.md
 create mode 100644 PRPs/PRP-showcase-completion-E3-seed-config-scope.md
 create mode 100644 PRPs/PRP-showcase-completion-E4-run-config-phase-controls.md
 create mode 100644 PRPs/PRP-showcase-completion-E5-agent-rag-story-capture.md
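
The partial-update rules that the E1 PRP in this patch specifies for `PATCH /demo/workspaces/{workspace_id}` can be sketched as below. This is a hedged, stdlib-only illustration, not the planned Pydantic schema: `apply_patch`, `PatchValidationError`, and the constants are hypothetical names, the name-pattern check is left out because the PRP excerpt does not spell the pattern out, and rejecting `null` for `tags` / `archived` / `pinned` is an assumption (the PRP only says that explicit `null` clears `name` / `notes`).

```python
from typing import Any

PATCHABLE = {"name", "notes", "tags", "archived", "pinned"}
NULLABLE = {"name", "notes"}  # explicit null clears only these fields
MAX_TAGS = 20


class PatchValidationError(ValueError):
    """Stands in for a 422 application/problem+json response."""


def apply_patch(row: dict[str, Any], body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `row` with only the provided fields changed."""
    unknown = set(body) - PATCHABLE
    if unknown:
        # Covers `status` too: the finalize hook owns it, so it is not patchable.
        raise PatchValidationError(f"unknown keys: {sorted(unknown)}")
    for key, value in body.items():
        if value is None and key not in NULLABLE:
            # Assumption: null is only meaningful for the clearable fields.
            raise PatchValidationError(f"{key} cannot be null")
    tags = body.get("tags")
    if tags is not None and len(tags) > MAX_TAGS:
        raise PatchValidationError(f"at most {MAX_TAGS} tags allowed")
    # Keys absent from the body keep their current value; an empty body is a no-op.
    return {**row, **body}


row = {"name": "demo", "notes": "first run", "tags": [], "archived": False,
       "pinned": False, "status": "completed"}
assert apply_patch(row, {}) == row
updated = apply_patch(row, {"notes": None, "pinned": True})
assert updated["notes"] is None and updated["pinned"] is True
assert updated["name"] == "demo"
```

The point of the sketch is the distinction between an absent key (leave the field alone) and an explicit `null` (clear it), which is what lets `{}` act as a 200 no-op.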

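The PRP's rule for the additive `replayed_from_workspace_id` request field (32 lowercase hex characters, and only valid together with `preservation="keep"`) can be illustrated with a small stdlib check. This is a sketch under stated assumptions: `validate_run_request` is a hypothetical helper operating on a plain dict, whereas the PRP plans a Pydantic validator on `DemoRunRequest`.

```python
import re

# Same character class and length as the PRP's `^[0-9a-f]{32}$`; fullmatch
# anchors both ends, so a trailing newline is rejected as well.
WORKSPACE_ID_RE = re.compile(r"[0-9a-f]{32}")


def validate_run_request(body: dict) -> dict:
    """Reject a lineage pointer that is malformed or that no row would record."""
    replayed_from = body.get("replayed_from_workspace_id")
    if replayed_from is None:
        return body  # legacy request without the new field: unchanged
    if not isinstance(replayed_from, str) or WORKSPACE_ID_RE.fullmatch(replayed_from) is None:
        raise ValueError("replayed_from_workspace_id must be 32 lowercase hex characters")
    if body.get("preservation") != "keep":
        # An ephemeral run writes no workspace row, so there is nothing to attach lineage to.
        raise ValueError('replayed_from_workspace_id requires preservation="keep"')
    return body


legacy = {"seed": 42, "scenario": "demo_minimal"}
assert validate_run_request(legacy) == legacy

replay = {"seed": 42, "preservation": "keep", "replayed_from_workspace_id": "a" * 32}
assert validate_run_request(replay) == replay
```

Returning the body untouched when the field is absent mirrors the PRP's requirement that requests without any new field behave exactly as before.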
diff --git a/PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md b/PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
new file mode 100644
index 00000000..101fdf00
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
@@ -0,0 +1,1031 @@
+name: "PRP — Showcase Completion E1: Workspace Metadata & Provenance Backbone (issue #407)"
+description: |
+
+## Purpose
+
+Implement the Foundation epic of the showcase-completion initiative (umbrella #406):
+one Alembic migration extends `showcase_workspace` with lifecycle + provenance columns
+(`replayed_from_workspace_id`, `archived`, `pinned`, `notes`, `tags`,
+`config_schema_version`) and six documented JSONB story-slot columns
+(`seed_overrides`, `user_scope`, `approval_events`, `rag_events`, `job_ids`,
+`phase_summaries`); a `PATCH /demo/workspaces/{id}` lifecycle endpoint
+(rename/notes/tags/archive/pin) lands with its Pydantic schema surface; and Replay
+writes `replayed_from_workspace_id`. Every Parallel epic (#408–#412) writes into or
+reads from this surface, so it ships first. Blocks E2 #408, E3 #409, E4 #410,
+E5 #411, E6 #412.
+
+## Core Principles
+
+1. **Context is King**: every reference below was verified against the live code on 2026-06-12 (branch `dev` @ `bdf85f6`).
+2. **Validation Loops**: each level is executable as written.
+3. **Information Dense**: patterns cite exact file:line.
+4. **Progressive Success**: model+migration → schemas → service helpers → PATCH route → replay wiring → tests → docs.
+5. **Global rules**: follow CLAUDE.md / AGENTS.md; all five CI gates must pass; all changes ADDITIVE.
+
+---
+
+## Goal
+
+The `showcase_workspace` table gains the metadata + provenance backbone every other
+epic of umbrella #406 consumes:
+
+- **Lifecycle columns**: `archived` (bool), `pinned` (bool), `notes` (free text),
+  `tags` (queryable JSONB string array, GIN-indexed — exact `scenario_plan.tags`
+  pattern), `config_schema_version` (int, schema-evolution marker).
+- **Provenance column**: `replayed_from_workspace_id` — a SOFT reference (String(32),
+  indexed, deliberately **no ForeignKey**, not even self-referential) recorded when a
+  run is a Replay of a saved workspace.
+- **Six documented JSONB story slots** as dedicated nullable JSONB columns:
+  `seed_overrides`, `user_scope`, `approval_events`, `rag_events`, `job_ids`,
+  `phase_summaries`. E1 ships the columns + the documented per-slot schema; E1 writes
+  NONE of them (all stay NULL) — E3 (#409) writes `seed_overrides` + `user_scope`,
+  E5 (#411) writes `approval_events` + `rag_events`, later parallel epics write
+  `job_ids` + `phase_summaries`.
+- **`PATCH /demo/workspaces/{workspace_id}`** — partial-update lifecycle endpoint:
+  rename / notes / tags / archive / pin. Missing id → RFC 7807 404. Returns the
+  updated `WorkspaceDetailResponse`.
+- **Replay provenance**: `DemoRunRequest` gains an additive Optional
+  `replayed_from_workspace_id` field; the frontend Replay handler sends the source
+  row's `workspace_id`; `create_workspace` records it on the NEW row.
+
+A run/request without any new field behaves **byte-identically to today** (legacy WS
+start frames and HTTP bodies unchanged). One migration applies AND downgrades cleanly
+on a fresh DB.
+
+**Deliverable** (all additive):
+
+- `app/features/demo/models.py` — 12 new columns on `ShowcaseWorkspace` + tags GIN index + replayed-from index.
+- `alembic/versions/<new>_add_showcase_workspace_metadata_provenance.py` — `down_revision = "324a2fa37fcc"`; add-columns + indexes; clean downgrade.
+- `app/features/demo/schemas.py` — `DemoRunRequest.replayed_from_workspace_id`; new `WorkspaceUpdateRequest`; `WorkspaceListItem` / `WorkspaceDetailResponse` additive response fields.
+- `app/features/demo/workspace.py` — `create_workspace` records `replayed_from_workspace_id`; new `update_workspace` helper.
+- `app/features/demo/routes.py` — `PATCH /demo/workspaces/{workspace_id}`.
+- `frontend/src/types/api.ts` + `frontend/src/pages/showcase.tsx` — two-line additive Replay wiring (see "Why the (ui) sliver" below).
+- Tests: schema unit tests, model constraint/roundtrip integration tests, workspace-helper integration tests, PATCH route tests (2xx + 404 + 422), migration up/down.
+- Docs: `docs/_base/API_CONTRACTS.md` + `docs/_base/DOMAIN_MODEL.md` additive notes (the documented story-slot schema lives in DOMAIN_MODEL — umbrella #406 risk mitigation).
+
+**Success definition**: all Success Criteria below check off; the five CI gates are
+green; integration suite green; a manual Replay from the `/showcase` Saved-workspaces
+panel produces a new row whose `replayed_from_workspace_id` equals the source row's
+`workspace_id`; `PATCH /demo/workspaces/{id}` round-trips rename/notes/tags/archive/pin.
+
+## Why
+
+- Umbrella #406: today workspaces cannot be renamed/archived/annotated/searched, and
+  each row lacks replay lineage, seed overrides, user scope, approval history, and RAG
+  events. E1 is the Foundation — **every** Parallel epic writes into or reads from
+  the columns added here, so the frozen column/slot contract ships first.
+- Replays are currently indistinguishable from fresh keep-runs except by
+  name/timestamp (documented gap, `docs/_base/RUNBOOKS.md` § Showcase workspace,
+  "Explicitly out of scope" — the `replayed_from` provenance column is this epic).
+- The umbrella's junk-drawer risk ("JSONB story slots become a junk drawer") is
+  mitigated here by `config_schema_version` + a documented per-slot schema in
+  `docs/_base/DOMAIN_MODEL.md`.
+
+### Why the (ui) sliver in an (api,db) epic
+
+"Replay writes `replayed_from_workspace_id`" is a frozen epic-level success
+criterion, and Replay is frontend-initiated: `handleReplayWorkspace`
+(`frontend/src/pages/showcase.tsx:174-186`) re-submits the recorded config through
+the WS start frame. Without the sender including the field, the backend has nothing
+to record. The wiring is two additive lines (one TS interface field + one start-frame
+key) — deliberately included here so the criterion is verifiable in E1; the lineage
+*rendering* (badge + chain) stays in E2 (#408).
+
+## What
+
+### User-visible behavior
+
+- `PATCH /demo/workspaces/{workspace_id}` accepts a partial body of
+  `{name?, notes?, tags?, archived?, pinned?}`; only provided fields change; explicit
+  `null` clears `name` / `notes`. Missing id → `404 application/problem+json`. A
+  malformed body (bad name pattern, unknown key, >20 tags) → `422
+  application/problem+json`. Empty body `{}` → `200` no-op returning the current row
+  (mirrors the `RunUpdate` precedent — see Decisions).
+- `POST /demo/run` and the `WS /demo/stream` start frame accept an additive Optional
+  `replayed_from_workspace_id: str | null` (`^[0-9a-f]{32}$`); supplying it without
+  `preservation="keep"` is a 422 (a lineage pointer is meaningless when no row is
+  written — same validator pattern as `workspace_name`).
+- Clicking **Replay** on the Saved-workspaces panel now records the source
+  `workspace_id` on the new row. The original row is never mutated (E4 #393
+  invariant preserved).
+- `GET /demo/workspaces` list items additively carry `archived`, `pinned`, `tags`,
+  `replayed_from_workspace_id`; the detail response additively carries those plus
+  `notes`, `config_schema_version`, and the six story slots. **List behavior is
+  otherwise unchanged in E1** — archived rows are still listed; default-filtering /
+  search / sort is E2 (#408).
+
+### Technical requirements
+
+- One Alembic migration off head `324a2fa37fcc` (verified `uv run alembic heads`,
+  2026-06-12). Forward-only: a NEW revision — never edit
+  `324a2fa37fcc_create_showcase_workspace_table.py`.
+- Every new column is nullable OR carries a `server_default` so the migration applies
+  on a table with existing rows; downgrade drops indexes then columns, cleanly.
+- **No ForeignKeys anywhere** — `replayed_from_workspace_id` is an opaque soft
+  reference, consistent with the table-wide invariant
+  (`docs/_base/DOMAIN_MODEL.md` § `showcase_workspace`: "`created_objects` carries
+  SOFT references only — no ForeignKeys by design"). Even a *self-referential* FK is
+  ruled out: ancestor workspace rows must remain independently deletable
+  (metadata-only delete, #404) without cascading to or blocking descendants. State
+  this in the model docstring.
+- `status` is NOT patchable — the pipeline finalize hook owns the
+  running/completed/failed lifecycle; `archived` is an orthogonal boolean so the
+  existing `ck_showcase_workspace_status` CHECK is untouched.
+- Vertical slice: all backend changes inside `app/features/demo/` +
+  `alembic/versions/`; no cross-slice imports (demo imports only `app.core.*`,
+  `app.shared.*`, stdlib/3rd-party).
+- RFC 7807 errors only — `NotFoundError` from `app/core/exceptions.py` (the demo
+  routes' existing pattern, `routes.py:134`), never bare `HTTPException`.
+- Pydantic v2 `ConfigDict(strict=True)` on the new request body. All new fields are
+  JSON-native (`str`/`bool`/`list[str]`) → NO `Field(strict=False)` override needed;
+  the AST policy walker (`app/core/tests/test_strict_mode_policy.py`) only fires on
+  date/datetime/time/UUID/Decimal.
+- Warn-and-continue invariant untouched: `create_workspace` / `finalize_workspace`
+  keep swallowing all DB errors. The new `update_workspace` helper is
+  request-scoped (caller-owned session, raises normally) — it backs an HTTP
+  endpoint, not the pipeline.
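+
+The "nullable OR `server_default`" requirement can be demonstrated with the stdlib `sqlite3` module. This is only a stand-in for the real Alembic/Postgres migration, and the dialects differ: SQLite rejects a default-less NOT NULL column unconditionally, whereas Postgres rejects it only when the table already holds rows. The simplified table and the `TEXT` stand-in for JSONB are assumptions of the sketch.
+
+```python
+import sqlite3
+
+conn = sqlite3.connect(":memory:")
+conn.execute("CREATE TABLE showcase_workspace (workspace_id TEXT PRIMARY KEY)")
+conn.execute("INSERT INTO showcase_workspace VALUES ('pre-existing-row')")
+
+# A NOT NULL column without a default has no value for the existing row.
+try:
+    conn.execute("ALTER TABLE showcase_workspace ADD COLUMN archived BOOLEAN NOT NULL")
+    rejected = False
+except sqlite3.OperationalError:
+    rejected = True
+
+# With a default, the existing row is backfilled; nullable columns need none.
+conn.execute("ALTER TABLE showcase_workspace ADD COLUMN archived BOOLEAN NOT NULL DEFAULT 0")
+conn.execute("ALTER TABLE showcase_workspace ADD COLUMN tags TEXT NOT NULL DEFAULT '[]'")
+conn.execute("ALTER TABLE showcase_workspace ADD COLUMN notes TEXT")
+
+row = conn.execute("SELECT archived, tags, notes FROM showcase_workspace").fetchone()
+```
+
+After the three successful `ALTER TABLE` statements the pre-existing row reads back with the defaults for `archived` and `tags` and NULL for `notes`, which is the behavior the migration's success criterion checks against Postgres.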
+
+### Success Criteria
+
+- [ ] Migration applies AND downgrades cleanly on a fresh DB (`upgrade head` →
+  `downgrade -1` → `upgrade head`); applies on a DB with pre-existing
+  `showcase_workspace` rows (server defaults backfill `archived=false`,
+  `pinned=false`, `tags=[]`, `config_schema_version=1`).
+- [ ] `DemoRunRequest()` (no args) serializes identically to today plus
+  `replayed_from_workspace_id=None`; a legacy start frame (no new keys) validates;
+  `replayed_from_workspace_id` without `preservation="keep"` → 422; a non-32-hex
+  value → 422.
+- [ ] A keep-run with `replayed_from_workspace_id="<32hex>"` produces a row whose
+  `replayed_from_workspace_id` column equals that value; the source row is unread
+  and unmodified (the value is recorded verbatim — no existence check, it is a soft
+  reference).
+- [ ] Frontend Replay sends `replayed_from_workspace_id: ws.workspace_id`;
+  `pnpm tsc -b` introduces no NEW errors (see gotcha on the pre-existing-failure
+  baseline) and `pnpm test --run` green.
+- [ ] `PATCH /demo/workspaces/{id}`: happy path updates exactly the provided fields
+  and returns the updated detail; `{}` is a 200 no-op; missing id → 404
+  problem+json; bad name pattern / unknown key / 21 tags → 422 problem+json.
+- [ ] `tags` round-trips as a JSONB string array and is GIN-indexed
+  (`ix_showcase_workspace_tags_gin`); a `.contains(["x"])` containment query works
+  (E2 will route it — E1 proves it in an integration test).
+- [ ] All six story-slot columns exist, default NULL, and round-trip a JSONB payload
+  in an integration test; E1 production code writes none of them.
+- [ ] `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ &&
+  uv run pyright app/ && uv run pytest -v -m "not integration"` all green;
+  integration suite green against docker-compose Postgres;
+  `test_strict_mode_policy.py` green.
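+
+For the tags criterion, it may help to pin down what containment means before writing the integration test. The sketch below models JSONB array containment (the `@>` operator that `.contains()` compiles to) for arrays of strings in plain Python; `jsonb_array_contains` is a hypothetical name and the function is a model of the semantics, not a query.
+
+```python
+def jsonb_array_contains(stored: list[str], wanted: list[str]) -> bool:
+    """Model of `stored @> wanted` for JSONB arrays of strings.
+
+    Every wanted element must appear in the stored array; order and
+    duplicates are ignored, and an empty `wanted` matches any array.
+    """
+    return all(tag in stored for tag in wanted)
+
+
+row_tags = ["demo", "q3"]
+matches_single = jsonb_array_contains(row_tags, ["demo"])         # subset of the row's tags
+matches_all = jsonb_array_contains(row_tags, ["q3", "demo"])      # order does not matter
+misses_extra = jsonb_array_contains(row_tags, ["demo", "other"])  # one tag is absent
+```
+
+An integration test against Postgres would assert the same three outcomes through the real column.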
+
+## Decisions (the open questions this PRP resolves)
+
+> These are FROZEN for the parallel epics. #408–#412 PRP authors: consume, don't re-decide.
+
+1. **`tags` representation — CONFIRMED: mirror `scenario_plan.tags` exactly.**
+   A dedicated JSONB string-array column, `nullable=False`,
+   `server_default=text("'[]'::jsonb")`, with a GIN index
+   (`ix_showcase_workspace_tags_gin`). Verified in code:
+   `app/features/scenarios/models.py:74-76,97` (column + index), migration
+   `alembic/versions/bb8c4587ef1d_add_scenario_library_columns.py:26-45`
+   (add_column + GIN), and the containment query
+   `app/features/scenarios/service.py:464` (`ScenarioPlan.tags.contains(tags)`).
+   No deviation: the pattern is proven, queryable, and E2's tag filter reuses the
+   same `.contains()` shape. Tags are free-text strings (scenario precedent has no
+   per-item pattern); the PATCH boundary caps the list at 20 items
+   (`Field(max_length=20)` — same cap as `ScenarioCreateRequest.tags`,
+   `app/features/scenarios/schemas.py:203-206`).
+
+2. **Story slots — six dedicated nullable JSONB columns** (NOT keys inside one
+   `story` blob, NOT keys inside `created_objects`). Rationale: the existing
+   precedent is purpose-named JSONB columns with documented internal schemas
+   (`created_objects`, `result_summary` — `app/features/demo/models.py:77-81`);
+   each slot has a different writer epic and a different write moment
+   (create-time vs mid-run append vs finalize), and separate columns keep each
+   write isolated, independently nullable (NULL = "never written", distinct from
+   empty), individually typed in the ORM (`dict[str, Any] | None` vs
+   `list[dict] | None`), and trivially additive in responses. A single `story`
+   column would force read-modify-write of one blob across four epics and would
+   itself need a documented sub-schema anyway — more coupling, zero benefit on a
+   low-cardinality audit table. Per-slot documented schema: see the Data-models
+   blueprint below + the DOMAIN_MODEL doc task.
+
+3. **`replayed_from_workspace_id` — SOFT reference, no FK, confirmed.** String(32)
+   nullable, btree index (`ix_showcase_workspace_replayed_from`), NO ForeignKey —
+   including no self-referential FK: `docs/_base/DOMAIN_MODEL.md` pins
+   "deletion in either direction never cascades", and an FK (even `ON DELETE SET
+   NULL`) would couple delete behavior to lineage. Dangling lineage pointers after
+   an ancestor delete are expected and harmless (same semantics as every
+   `created_objects` id). Recorded verbatim from the request — no existence
+   validation (a replay of a just-deleted workspace still records the id it came
+   from; E2's liveness check surfaces dangles).
+
+4. **PATCH semantics — `exclude_unset` partial update, `extra="forbid"`, empty body
+   = no-op 200.** `model_dump(exclude_unset=True)` distinguishes absent from
+   explicit-null (runtime-verified, see Gotchas); explicit `null` clears `name` /
+   `notes`; `extra="forbid"` catches typo'd field names (the `RunUpdate` precedent,
+   `app/features/registry/schemas.py:113-123`); an empty body is a valid no-op
+   (mirrors `RunUpdate`, which has no min-fields validator). `archived`/`pinned`
+   accept only `true`/`false` and `tags` accepts only a list (not null — all
+   three back NOT NULL columns; send `[]` to clear tags). Explicit `null` on any
+   of the three is rejected at the schema boundary (422), never reaching
+   `setattr` → IntegrityError 500.
+
+5. **E1 writes no story slot.** `seed_overrides`/`user_scope` writers land in E3
+   (#409), `approval_events`/`rag_events` in E5 (#411), `job_ids`/
+   `phase_summaries` in the remaining parallel epics (E2 #408 health summary /
+   E4 #410 run-config echo — whichever lands first follows the documented schema).
+   E1 ships columns + schema docs + roundtrip tests only.
+
+6. **`config_schema_version` starts at 1.** Integer NOT NULL,
+   `server_default=text("1")`, ORM `default=1`. It versions the *workspace config + story-slot
+   schema* as a whole; any epic that changes a documented slot shape bumps the
+   ORM default and documents the delta in DOMAIN_MODEL. E1 does not branch on it.
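+
+The PATCH semantics in Decision 4 can be modeled without Pydantic. The sketch below is a stdlib-only approximation under stated assumptions: `validate_patch` and `apply_patch` are hypothetical helpers, a `ValueError` stands in for a 422, and "absent" versus "explicit null" is distinguished by whether the key is present in the parsed JSON body.
+
+```python
+import json
+import re
+
+NAME_RE = re.compile(r"^[a-z0-9][a-z0-9\-_]*$")
+CLEARABLE = {"name", "notes"}               # explicit null clears the value
+NOT_NULL = {"tags", "archived", "pinned"}   # explicit null is rejected
+
+
+def validate_patch(body: dict[str, object]) -> dict[str, object]:
+    """Hypothetical stand-in for the request schema; ValueError ~ HTTP 422."""
+    unknown = set(body) - CLEARABLE - NOT_NULL
+    if unknown:
+        raise ValueError(f"unknown field(s): {sorted(unknown)}")
+    for key, value in body.items():
+        if value is None:
+            if key in NOT_NULL:
+                raise ValueError(f"{key} does not accept null")
+            continue
+        if key == "name":
+            ok = isinstance(value, str) and len(value) <= 100 and bool(NAME_RE.fullmatch(value))
+        elif key == "notes":
+            ok = isinstance(value, str) and len(value) <= 2000
+        elif key == "tags":
+            ok = (
+                isinstance(value, list)
+                and len(value) <= 20
+                and all(isinstance(tag, str) for tag in value)
+            )
+        else:  # archived / pinned
+            ok = isinstance(value, bool)
+        if not ok:
+            raise ValueError(f"invalid value for {key}")
+    return dict(body)
+
+
+def apply_patch(row: dict[str, object], raw_body: str) -> dict[str, object]:
+    """Overlay only the keys present in the body; absent keys keep their value."""
+    return {**row, **validate_patch(json.loads(raw_body))}
+
+
+row = {"name": "run-1", "notes": "first try", "tags": ["demo"], "archived": False, "pinned": False}
+unchanged = apply_patch(row, "{}")               # empty body: no-op
+cleared = apply_patch(row, '{"notes": null}')    # explicit null clears notes
+retagged = apply_patch(row, '{"tags": []}')      # [] clears tags
+```
+
+Bodies such as `{"archived": null}`, `{"tags": null}`, or one with a misspelled key raise in `validate_patch`, so they never reach the overlay step.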
+
+### Assumptions (explicit, decided without user input)
+
+- `notes` is `sa.Text()` in the DB with a 2000-char cap enforced at the Pydantic
+  boundary only (no DB CHECK) — matches the repo's boundary-validation style
+  (`RunUpdate.error_message` caps at the schema layer, `registry/schemas.py:123`).
+- Renaming via PATCH uses the same `^[a-z0-9][a-z0-9\-_]*$` / ≤100 pattern as
+  `DemoRunRequest.workspace_name` (`demo/schemas.py:72-77`) — names stay
+  non-unique by design (E4 #393 invariant).
+- The PATCH route reuses `WorkspaceDetailResponse` as its response model (the
+  updated row, full detail) rather than introducing a new response shape.
+- Pin/archive carry NO behavioral semantics in E1 (no list reordering, no
+  default-filtering) — E2 (#408) wires the UX. E1 just persists the booleans.
+- The umbrella's "destructive-replay confirmation" is E2 (#408) — NOT here.
+  E1's replay change is provenance-recording only.
+- `replayed_from_workspace_id` requires `preservation="keep"`: a lineage pointer
+  on an ephemeral run has no row to land on. (The frontend Replay always sends
+  `preservation: 'keep'` — `showcase.tsx:179-185` — so this constraint is
+  invisible to the shipped UI.)
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# MUST READ — codebase patterns (all verified 2026-06-12, branch dev @ bdf85f6)
+
+- file: app/features/demo/models.py
+  why: |
+    THE file you extend. ShowcaseWorkspace at line 37; status constants 32-34;
+    JSONB precedent created_objects/result_summary at 77-81; __table_args__ with
+    named CheckConstraint + composite index at 83-89. Module docstring documents
+    the no-FK soft-reference decision — extend that docstring for
+    replayed_from_workspace_id. GOTCHA in docstring: SQLAlchemy reserves the
+    attr name `metadata`.
+
+- file: alembic/versions/324a2fa37fcc_create_showcase_workspace_table.py
+  why: |
+    CURRENT HEAD (verified `uv run alembic heads` → 324a2fa37fcc). Your
+    down_revision. Header/docstring format, typing (`revision: str`,
+    `down_revision: str | None`), op.f() index-naming convention to mirror.
+    NEVER edit this file — forward-only.
+
+- file: alembic/versions/bb8c4587ef1d_add_scenario_library_columns.py
+  why: |
+    THE add-columns migration to mirror: op.add_column with JSONB
+    server_default text("'[]'::jsonb") (lines 26-34), GIN index creation
+    (39-45), downgrade drops index-then-columns (48-52) incl. the
+    postgresql_using='gin' kwarg on drop_index.
+
+- file: app/features/scenarios/models.py
+  why: |
+    tags JSONB-array pattern (lines 74-76: Mapped[list[str]], nullable=False,
+    default=list, server_default=text("'[]'::jsonb")) + GIN index in
+    __table_args__ (line 97). This is the tags representation E1 mirrors
+    verbatim (Decision 1).
+
+- file: app/features/scenarios/service.py
+  why: |
+    Line 464: `ScenarioPlan.tags.contains(tags)` — the JSONB containment query
+    shape the tags column must support (prove it in an integration test; E2
+    routes it).
+
+- file: app/features/demo/schemas.py
+  why: |
+    DemoRunRequest at 29-85: ConfigDict(strict=True) line 40; the
+    workspace_name pattern + model_validator _workspace_name_requires_keep
+    (72-85) — copy this exact validator shape for replayed_from_workspace_id.
+    WorkspaceListItem (169-189) / WorkspaceDetailResponse (192-203) /
+    WorkspaceListResponse (205-213) — the response models you extend
+    additively. Response models are plain BaseModel + from_attributes (NOT
+    strict) — keep that split.
+
+- file: app/features/demo/workspace.py
+  why: |
+    create_workspace (46-79): the insert you extend with one kwarg
+    (replayed_from_workspace_id=req.replayed_from_workspace_id). get_workspace
+    (158-171) — reuse inside update_workspace. delete_workspace (199-221) —
+    the caller-owned-session + commit + logger.info shape update_workspace
+    mirrors. NOTE the split: create/finalize open their OWN sessions
+    (pipeline-scoped, warn-and-continue); get/list/delete take a caller-owned
+    AsyncSession (request-scoped, raise normally) — update_workspace is the
+    second kind.
+
+- file: app/features/demo/routes.py
+  why: |
+    The router you extend. delete_showcase_workspace (138-163) — the exact
+    route shape for PATCH: Depends(get_db), NotFoundError on missing (RFC 7807
+    via registered handler), docstring style. get_showcase_workspace (110-135)
+    — WorkspaceDetailResponse return shape.
+
+- file: app/features/registry/schemas.py
+  why: |
+    RunUpdate (113-123) — THE partial-update request precedent:
+    ConfigDict(extra="forbid"), all-Optional fields, no min-fields validator
+    (empty body = no-op). E1's WorkspaceUpdateRequest adds strict=True on top
+    (post-PRP-14 request-body policy; RunUpdate predates it).
+
+- file: app/features/demo/pipeline.py
+  why: |
+    DemoContext workspace fields at 258-263; the keep-branch create hook at
+    2652-2657; finalize hook at 2741-2746. E1 does NOT touch the pipeline —
+    create_workspace reads the new field straight off `req`. Read only to
+    confirm no hook change is needed.
+
+- file: app/core/exceptions.py
+  why: |
+    NotFoundError (line 72) → RFC 7807 404 via registered handler. The 422s
+    come FREE from Pydantic validation at the boundary (FastAPI → 422
+    problem+json).
+
+- file: app/features/demo/tests/test_schemas.py
+  why: |
+    Existing DemoRunRequest tests INCLUDING the mandatory JSON-dict path
+    (Model.model_validate({...}) per .claude/rules/security-patterns.md
+    § strict mode). Extend for the new field + add a WorkspaceUpdateRequest
+    block.
+
+- file: app/features/demo/tests/test_workspace.py
+  why: |
+    Integration-test patterns for create/finalize/get/list/delete — session
+    fixture, @pytest.mark.integration, row-cleanup conventions. Extend with
+    update_workspace + replayed_from cases.
+
+- file: app/features/demo/tests/test_models.py
+  why: |
+    Constraint/roundtrip integration tests for ShowcaseWorkspace — extend with
+    new-column defaults, tags containment, story-slot roundtrip.
+
+- file: app/features/demo/tests/test_routes.py
+  why: |
+    Route-test conventions: ASGITransport client from conftest, workspace
+    module monkeypatched for unit-shaped route tests, integration-marked tests
+    for DB-backed paths. The DELETE 404 test is the template for PATCH 404.
+
+- file: frontend/src/pages/showcase.tsx
+  why: |
+    handleReplayWorkspace at 174-186 — the start() call that gains ONE key:
+    `replayed_from_workspace_id: ws.workspace_id`. handleLoadWorkspace
+    (160-168) stays untouched (Load is read-only).
+
+- file: frontend/src/types/api.ts
+  why: |
+    DemoRunRequest interface at 778-788 — add
+    `replayed_from_workspace_id?: string` with an `// E1 (#407)` comment in
+    the existing style.
+
+- file: docs/_base/DOMAIN_MODEL.md
+  why: |
+    § showcase_workspace aggregate — additively document the new columns, the
+    six story-slot schemas, the config_schema_version semantics, and restate
+    that replayed_from_workspace_id is a soft reference (no FK). This is the
+    umbrella's junk-drawer risk mitigation — non-optional.
+
+- file: docs/_base/API_CONTRACTS.md
+  why: |
+    The /demo rows + "WebSocket Events (/demo/stream)" section — additive
+    notes for the PATCH endpoint, the new request field, and the response
+    additions, in the established "E1 (#407) — ..." style.
+
+# Issue / initiative context
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/407
+  why: The epic this PRP implements (Foundation; frozen column/slot/endpoint contract).
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/406
+  why: Umbrella — success criteria, out-of-scope list, risk table (junk-drawer mitigation = config_schema_version + documented slot schema).
+
+# Exemplar PRPs (style + validation-gate conventions)
+- file: PRPs/PRP-showcase-workspace-E1-persistence-backbone.md
+  why: Closest analog — created the table this PRP extends; task style, gates, anti-patterns.
+- file: PRPs/PRP-showcase-workspace-E4-restore-replay.md
+  why: Replay flow context — verbatim re-submission through the WS path; original row never mutated.
+```
+
+### Current Codebase tree (relevant subset)
+
+```bash
+app/features/demo/
+├── models.py          # ShowcaseWorkspace @37 (16 columns today)
+├── workspace.py       # create @46 / finalize @106 / get @158 / list @174 / delete @199 / count @224
+├── schemas.py         # DemoRunRequest @29; WorkspaceListItem @169; WorkspaceDetailResponse @192
+├── routes.py          # GET list @80; GET detail @110; DELETE @138; POST /run @51; WS @166
+├── pipeline.py        # keep-branch create hook @2652; finalize hook @2741 (NO E1 changes)
+├── service.py         # (NO E1 changes)
+└── tests/             # conftest, test_models, test_workspace, test_schemas, test_routes, test_pipeline
+alembic/
+├── env.py             # demo models import already present @19
+└── versions/          # head: 324a2fa37fcc
+frontend/src/
+├── pages/showcase.tsx # handleReplayWorkspace @174
+└── types/api.ts       # DemoRunRequest @778
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+app/features/demo/
+├── models.py                # MOD — +12 columns, +2 indexes, extended docstring
+├── schemas.py               # MOD — DemoRunRequest +replayed_from_workspace_id (+validator);
+│                            #       NEW WorkspaceUpdateRequest; ListItem/Detail additive fields
+├── workspace.py             # MOD — create_workspace records replayed_from; NEW update_workspace
+├── routes.py                # MOD — PATCH /demo/workspaces/{workspace_id}
+└── tests/
+    ├── test_schemas.py      # MOD — new-field + WorkspaceUpdateRequest unit tests
+    ├── test_models.py       # MOD — column defaults, tags containment, slot roundtrip (integration)
+    ├── test_workspace.py    # MOD — replayed_from recording; update_workspace semantics (integration)
+    └── test_routes.py       # MOD — PATCH 200/404/422 (+ list/detail field passthrough)
+alembic/versions/<rev>_add_showcase_workspace_metadata_provenance.py   # NEW
+frontend/src/types/api.ts    # MOD — +replayed_from_workspace_id?: string
+frontend/src/pages/showcase.tsx  # MOD — one start-frame key in handleReplayWorkspace
+docs/_base/API_CONTRACTS.md  # MOD — additive contract notes
+docs/_base/DOMAIN_MODEL.md   # MOD — columns + documented story-slot schemas
+```
+
+### Known Gotchas & Library Quirks
+
+```python
+# CRITICAL — forward-only migrations: down_revision = "324a2fa37fcc" (verified
+#   `uv run alembic heads` → 324a2fa37fcc, 2026-06-12). NEVER edit the merged
+#   create-table migration. Revision ids are hand-written 12-hex continuing the
+#   chain (or keep an `alembic revision -m ...` generated id).
+
+# CRITICAL — every new NOT NULL column needs a server_default or the migration
+#   fails on tables with existing rows: archived/pinned text("false"),
+#   config_schema_version text("1"), tags text("'[]'::jsonb"). All six story
+#   slots + notes + replayed_from_workspace_id are nullable (no default needed).
+
+# CRITICAL — strict-mode policy: WorkspaceUpdateRequest and the new
+#   DemoRunRequest field are all JSON-native (str/bool/list[str]) → NO
+#   Field(strict=False) override. The AST walker
+#   (app/core/tests/test_strict_mode_policy.py) only fires on
+#   date/datetime/time/UUID/Decimal — nothing here triggers it.
+
+# CRITICAL — do NOT add extra="forbid" to DemoRunRequest (unknown-key tolerance
+#   is the WS forward/backward-compat contract, routes.py:182). DO add it to
+#   WorkspaceUpdateRequest (HTTP-only body; typo'd PATCH fields must 422, not
+#   silently no-op — RunUpdate precedent).
+
+# CRITICAL — JSONB change detection: always ASSIGN whole values
+#   (row.tags = [...]), never mutate in place (row.tags.append(...)) — in-place
+#   mutation is invisible to SQLAlchemy without flag_modified. The existing
+#   finalize_workspace assigns; keep that style in update_workspace.
+
+# GOTCHA — SQLAlchemy reserves the declarative attr name `metadata`
+#   (demo/models.py docstring). None of the new names collide — keep it that way.
+
+# GOTCHA — `status` stays out of WorkspaceUpdateRequest; the CHECK constraint
+#   ck_showcase_workspace_status is untouched. `archived` is orthogonal.
+
+# GOTCHA — update_workspace is caller-owned-session + raises normally (it backs
+#   an HTTP route). Do NOT wrap it in the warn-and-continue pattern — that
+#   contract is for the PIPELINE-scoped create/finalize only.
+
+# GOTCHA — repo has mixed CRLF/LF line endings; run `git diff --stat` before
+#   committing — Edit/Write emit LF, so verify schema/route/model diffs are
+#   surgical, not whole-file noise.
+
+# GOTCHA — frontend type gate: `pnpm tsc --noEmit` is vacuous (solution-style
+#   tsconfig checks zero files) and `pnpm tsc -b` already fails on dev with
+#   pre-existing errors. Gate on "no NEW errors vs the dev baseline" +
+#   `pnpm lint` + `pnpm test --run`.
+
+# GOTCHA — mypy --strict AND pyright --strict gate merge: full annotations incl.
+#   `-> None` on tests and typed fixtures.
+
+# CONVENTION — branch: feat/showcase-completion-e1-metadata-provenance (off dev).
+#   Commits reference #407, e.g. `feat(db): ... (#407)` for the migration,
+#   `feat(api): ... (#407)` for slice code, `feat(ui): ... (#407)` for the
+#   replay wiring (or `feat(api,ui)` if combined). NO AI trailer (hook-enforced).
+
+# RUNTIME-VERIFICATION LOG (per prp-create step 3 — re-run on library upgrade):
+#   1. `uv run alembic heads` → 324a2fa37fcc (2026-06-12).
+#   2. Pydantic exclude_unset distinguishes absent vs explicit-null, pattern
+#      constraint skips the None arm of `str | None`, extra="forbid" 422s
+#      unknown keys, strict=True accepts list[str] and rejects a bare str:
+#      uv run python -c "
+#      from pydantic import BaseModel, ConfigDict, Field
+#      class P(BaseModel):
+#          model_config = ConfigDict(strict=True, extra='forbid')
+#          name: str | None = Field(default=None, max_length=100, pattern=r'^[a-z0-9][a-z0-9\-_]*$')
+#          notes: str | None = Field(default=None, max_length=2000)
+#          tags: list[str] | None = Field(default=None, max_length=20)
+#      p = P.model_validate({'notes': None}); assert p.model_fields_set == {'notes'}
+#      assert p.model_dump(exclude_unset=True) == {'notes': None}
+#      assert P.model_validate({'name': None}).name is None        # null clears
+#      assert P.model_validate({'tags': ['a','b']}).tags == ['a','b']
+#      "
+#      → verified on pydantic in-repo (2026-06-12).
+#   3. SQLAlchemy 2.0.46: Boolean/Integer/JSONB server_default DDL compiles as
+#      expected (`DEFAULT false NOT NULL`, `DEFAULT 1 NOT NULL`,
+#      `DEFAULT '[]'::jsonb NOT NULL`):
+#      uv run python -c "import sqlalchemy as sa; from sqlalchemy.dialects import postgresql; from sqlalchemy.schema import CreateTable; md=sa.MetaData(); t=sa.Table('x',md, sa.Column('archived',sa.Boolean(),nullable=False,server_default=sa.text('false')), sa.Column('v',sa.Integer(),nullable=False,server_default=sa.text('1')), sa.Column('tags',postgresql.JSONB(),nullable=False,server_default=sa.text(\"'[]'::jsonb\"))); print(CreateTable(t).compile(dialect=postgresql.dialect()))"
+#      → verified (2026-06-12).
+#   4. JSONB .contains() containment is already production code in this repo
+#      (scenarios/service.py:464) — no external claim to probe.
+```
+
+## Implementation Blueprint
+
+### Data models and structure
+
+```python
+# app/features/demo/models.py — ADD after result_summary (line 81), keep the
+# existing __table_args__ entries and append the two new indexes.
+
+    # ── E1 (#407) — lifecycle metadata ────────────────────────────────────
+    # Orthogonal to `status` (which the pipeline owns): archive/pin are
+    # operator curation flags, PATCH-mutable, default false.
+    archived: Mapped[bool] = mapped_column(
+        nullable=False, default=False, server_default=text("false")
+    )
+    pinned: Mapped[bool] = mapped_column(
+        nullable=False, default=False, server_default=text("false")
+    )
+    # Free-text operator annotation; length capped at the Pydantic boundary (2000).
+    notes: Mapped[str | None] = mapped_column(Text, nullable=True)
+    # Queryable JSONB string array — EXACT scenario_plan.tags pattern
+    # (app/features/scenarios/models.py:74-76); GIN-indexed below.
+    tags: Mapped[list[str]] = mapped_column(
+        JSONB, nullable=False, default=list, server_default=text("'[]'::jsonb")
+    )
+    # Version of the workspace config + story-slot schema (umbrella #406
+    # junk-drawer mitigation). Bump the ORM default when a slot shape changes.
+    config_schema_version: Mapped[int] = mapped_column(
+        Integer, nullable=False, default=1, server_default=text("1")
+    )
+
+    # ── E1 (#407) — replay provenance ─────────────────────────────────────
+    # SOFT reference to the workspace this run replayed (uuid4().hex of the
+    # source row). Deliberately NO ForeignKey — not even self-referential:
+    # ancestor rows must stay independently deletable (metadata-only delete),
+    # and dangling lineage pointers are expected, like every created_objects id.
+    replayed_from_workspace_id: Mapped[str | None] = mapped_column(
+        String(32), nullable=True
+    )
+
+    # ── E1 (#407) — documented JSONB story slots ──────────────────────────
+    # Six dedicated nullable JSONB columns (precedent: created_objects /
+    # result_summary). NULL = "slot never written" (distinct from empty).
+    # E1 writes NONE of them; documented schema per slot (authoritative copy
+    # in docs/_base/DOMAIN_MODEL.md):
+    #   seed_overrides   (E3 #409 writes) — dict: the curated seeder-override
+    #                    payload from the start frame, stored verbatim
+    #                    (model_dump(mode="json")); replay echoes it.
+    #   user_scope       (E3 #409 writes) — dict: operator-selected focus,
+    #                    {"store_id": int, "product_id": int} (additive keys
+    #                    allowed later).
+    #   approval_events  (E5 #411 writes) — list[dict], append-only:
+    #                    {"action_id": str, "tool_name": str,
+    #                     "decision": "approved"|"rejected",
+    #                     "decided_at": iso8601-str, "session_id": str}.
+    #   rag_events       (E5 #411 writes) — list[dict], append-only:
+    #                    {"event": "index"|"retrieve"|"skip", "detail": str,
+    #                     "count": int, "occurred_at": iso8601-str}.
+    #   job_ids          (later parallel epic) — list[str]: job / batch
+    #                    sub-job ids the run submitted (soft references).
+    #   phase_summaries  (later parallel epic) — list[dict], one per phase:
+    #                    {"phase_name": str, "status": "pass"|"fail"|"warn"|"skip",
+    #                     "steps": int, "duration_ms": float}.
+    seed_overrides: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+    user_scope: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+    approval_events: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+    rag_events: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+    job_ids: Mapped[list[str] | None] = mapped_column(JSONB, nullable=True)
+    phase_summaries: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+
+    # __table_args__ — APPEND (keep existing CheckConstraint + composite index):
+    #   Index("ix_showcase_workspace_tags_gin", "tags", postgresql_using="gin"),
+    #   Index("ix_showcase_workspace_replayed_from", "replayed_from_workspace_id"),
+    # imports to extend: Text from sqlalchemy (others already imported).
+```
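+
+The slot shapes documented in the comments above could also be written down as `TypedDict`s for static checking. This is a sketch only: the class names `ApprovalEvent` and `PhaseSummary` are hypothetical, the sample values are invented for illustration, and the authoritative schema remains the one in `docs/_base/DOMAIN_MODEL.md`.
+
+```python
+import json
+from typing import Literal, TypedDict
+
+
+class ApprovalEvent(TypedDict):
+    """One entry of the append-only approval_events slot."""
+
+    action_id: str
+    tool_name: str
+    decision: Literal["approved", "rejected"]
+    decided_at: str  # ISO 8601 string
+    session_id: str
+
+
+class PhaseSummary(TypedDict):
+    """One entry of the phase_summaries slot (one per phase)."""
+
+    phase_name: str
+    status: Literal["pass", "fail", "warn", "skip"]
+    steps: int
+    duration_ms: float
+
+
+# Illustrative values only; none of them come from a real run.
+event: ApprovalEvent = {
+    "action_id": "act-1",
+    "tool_name": "example_tool",
+    "decision": "approved",
+    "decided_at": "2026-06-12T18:38:54+02:00",
+    "session_id": "sess-1",
+}
+
+# A JSON column stores plain JSON, so the payload must survive a JSON round trip.
+stored = json.loads(json.dumps([event]))
+```
+
+Keeping every value JSON-native (strings, numbers, booleans, lists, dicts) is what makes the round trip lossless; timestamps therefore travel as ISO 8601 strings.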
+
+```python
+# app/features/demo/schemas.py — DemoRunRequest addition (after workspace_name,
+# line 78) + validator extension.
+
+    # E1 (#407): replay provenance. The frontend Replay handler sends the
+    # SOURCE row's workspace_id; create_workspace records it verbatim on the
+    # NEW row (soft reference — no existence check). JSON-native str → no
+    # Field(strict=False) needed.
+    replayed_from_workspace_id: str | None = Field(
+        default=None,
+        pattern=r"^[0-9a-f]{32}$",   # uuid4().hex shape of workspace_id
+        description="workspace_id this run replays; requires preservation='keep'.",
+    )
+
+    @model_validator(mode="after")
+    def _replayed_from_requires_keep(self) -> DemoRunRequest:
+        """Reject a lineage pointer on a run that writes no workspace row."""
+        if self.replayed_from_workspace_id is not None and self.preservation != "keep":
+            raise ValueError("replayed_from_workspace_id requires preservation='keep'")
+        return self
+
+
+# NEW request model — place after DemoRunRequest.
+# (add `field_validator` to the pydantic import at schemas.py:14 — the file
+#  currently imports only BaseModel/ConfigDict/Field/model_validator)
+class WorkspaceUpdateRequest(BaseModel):
+    """Partial lifecycle update for PATCH /demo/workspaces/{workspace_id}.
+
+    exclude_unset semantics: only fields present in the body are applied;
+    explicit ``null`` clears ``name`` / ``notes``. Explicit ``null`` on
+    ``archived`` / ``pinned`` / ``tags`` is rejected (422) — they back NOT NULL
+    columns; send ``[]`` to clear tags. ``extra="forbid"`` so a typo'd field
+    422s instead of silently no-opping (RunUpdate precedent,
+    app/features/registry/schemas.py:113). All fields JSON-native -> the
+    model-level strict=True needs no per-field override. ``status`` is
+    deliberately absent — the pipeline owns the run lifecycle.
+    """
+
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    name: str | None = Field(
+        default=None,
+        max_length=100,
+        pattern=r"^[a-z0-9][a-z0-9\-_]*$",   # same as workspace_name
+        description="Rename the workspace; explicit null clears the label.",
+    )
+    notes: str | None = Field(
+        default=None, max_length=2000,
+        description="Free-text annotation; explicit null clears it.",
+    )
+    tags: list[str] | None = Field(
+        default=None, max_length=20,
+        description="Replace the full tag list (not a merge).",
+    )
+    archived: bool | None = Field(default=None, description="Archive flag.")
+    pinned: bool | None = Field(default=None, description="Pin flag.")
+
+    @field_validator("archived", "pinned", "tags")
+    @classmethod
+    def _reject_explicit_null(cls, v: bool | list[str] | None) -> bool | list[str]:
+        # Fires only on explicitly provided values (pydantic skips validators for
+        # defaults unless validate_default=True), so absent stays None/unset while
+        # an explicit {"archived": null} / {"tags": null} 422s instead of reaching
+        # the NOT NULL column via exclude_unset -> setattr -> IntegrityError 500.
+        # tags: send [] to clear, never null.
+        if v is None:
+            raise ValueError(
+                "archived/pinned accept only true/false and tags accepts a list "
+                "(send [] to clear) — explicit null is not allowed"
+            )
+        return v
+
+
+# Response additions (additive — keep from_attributes, NOT strict):
+# WorkspaceListItem  += archived: bool, pinned: bool, tags: list[str]
+#                       (default_factory=list), replayed_from_workspace_id: str | None
+# WorkspaceDetailResponse += notes: str | None, config_schema_version: int,
+#                       seed_overrides / user_scope: dict[str, Any] | None,
+#                       approval_events / rag_events / phase_summaries:
+#                       list[dict[str, Any]] | None, job_ids: list[str] | None
+```
+
+```python
+# app/features/demo/workspace.py — update_workspace (NEW; caller-owned session,
+# raises normally — this backs an HTTP route, NOT the pipeline).
+async def update_workspace(
+    db: AsyncSession,
+    workspace_id: str,
+    update: WorkspaceUpdateRequest,
+) -> ShowcaseWorkspace | None:
+    """Apply a partial lifecycle update; return the row or None when missing."""
+    row = await get_workspace(db, workspace_id)
+    if row is None:
+        return None
+    changes = update.model_dump(exclude_unset=True)   # absent != explicit null
+    for field, value in changes.items():
+        setattr(row, field, value)                    # whole-value ASSIGNMENT (JSONB gotcha)
+    await db.commit()
+    await db.refresh(row)
+    logger.info("demo.workspace_updated", workspace_id=workspace_id, fields=sorted(changes))
+    return row
+
+# create_workspace — ONE added kwarg in the ShowcaseWorkspace(...) constructor:
+#     replayed_from_workspace_id=req.replayed_from_workspace_id,
+```
+
+```python
+# app/features/demo/routes.py — PATCH route (mirror the DELETE shape @138).
+@router.patch(
+    "/workspaces/{workspace_id}",
+    response_model=WorkspaceDetailResponse,
+    summary="Update a saved showcase workspace's lifecycle metadata",
+    description=(
+        "Partial update: rename / notes / tags / archive / pin. Only fields "
+        "present in the body change; explicit null clears name/notes. The run "
+        "lifecycle status is not patchable."
+    ),
+)
+async def update_showcase_workspace(
+    workspace_id: str,
+    update: WorkspaceUpdateRequest,
+    db: AsyncSession = Depends(get_db),
+) -> WorkspaceDetailResponse:
+    row = await workspace.update_workspace(db, workspace_id, update)
+    if row is None:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+    return WorkspaceDetailResponse.model_validate(row)
+```
+
+### List of tasks (dependency order)
+
+```yaml
+Task 1 — branch & issue hygiene:
+  RUN: git switch dev && git pull && git switch -c feat/showcase-completion-e1-metadata-provenance
+  VERIFY: gh issue view 407 --json state   # open
+  NOTE: git status shows untracked docker-compose.lan.yml on this host — leave it alone.
+
+Task 2 — MODIFY app/features/demo/models.py:
+  - ADD the 12 columns per the blueprint (lifecycle block, provenance column, six slots)
+  - ADD `Text` to the sqlalchemy import line (others already imported)
+  - APPEND the two indexes to __table_args__ (tags GIN + replayed_from btree)
+  - EXTEND the module docstring: replayed_from_workspace_id is a soft reference
+    (no FK, not even self-referential); story slots NULL until their writer epic lands
+  - PRESERVE: existing columns, constants, CheckConstraint, composite index — untouched
+
+Task 3 — CREATE alembic/versions/<rev>_add_showcase_workspace_metadata_provenance.py:
+  - down_revision = "324a2fa37fcc"
+  - MIRROR: bb8c4587ef1d_add_scenario_library_columns.py (add_column + GIN + downgrade order)
+  - upgrade(): op.add_column x12 (server_defaults: archived/pinned text("false"),
+    config_schema_version text("1"), tags text("'[]'::jsonb"); the rest nullable),
+    then op.create_index("ix_showcase_workspace_tags_gin", ..., postgresql_using="gin")
+    and op.create_index("ix_showcase_workspace_replayed_from", ...)
+  - downgrade(): drop the two indexes (GIN drop with postgresql_using="gin",
+    matching bb8c4587ef1d:50), then drop the 12 columns in reverse order
+  - VERIFY: docker compose up -d &&
+    uv run alembic upgrade head && uv run alembic downgrade -1 && uv run alembic upgrade head
+
+Task 4 — MODIFY app/features/demo/schemas.py:
+  - ADD DemoRunRequest.replayed_from_workspace_id + _replayed_from_requires_keep
+    validator (blueprint); UPDATE the docstring sentence listing JSON-native fields
+  - ADD WorkspaceUpdateRequest (blueprint) — placed after DemoRunRequest
+  - EXTEND WorkspaceListItem (+archived/pinned/tags/replayed_from_workspace_id)
+    and WorkspaceDetailResponse (+notes/config_schema_version/six slots) additively
+
+Task 5 — MODIFY app/features/demo/workspace.py:
+  - create_workspace: add replayed_from_workspace_id=req.replayed_from_workspace_id
+    to the ShowcaseWorkspace(...) constructor (one line; warn-and-continue untouched)
+  - ADD update_workspace (blueprint) + the WorkspaceUpdateRequest import
+  - UPDATE module docstring routing note (PATCH now routed too)
+
+Task 6 — MODIFY app/features/demo/routes.py:
+  - ADD the PATCH route (blueprint) between GET detail and DELETE
+  - ADD WorkspaceUpdateRequest to the schemas import block
+  - UPDATE the module docstring endpoint list
+
+Task 7 — MODIFY frontend (two additive lines):
+  - frontend/src/types/api.ts DemoRunRequest (@778): add
+    `// E1 (#407) — replay provenance: the source workspace_id a Replay re-runs.`
+    `replayed_from_workspace_id?: string`
+  - frontend/src/pages/showcase.tsx handleReplayWorkspace start() call (@179-185):
+    add `replayed_from_workspace_id: ws.workspace_id,`
+  - DO NOT touch handleLoadWorkspace (Load is read-only) or WorkspacePanel
+
+Task 8 — tests (full matrix in Validation Loop):
+  - MODIFY tests/test_schemas.py   (unit)
+  - MODIFY tests/test_models.py    (@pytest.mark.integration)
+  - MODIFY tests/test_workspace.py (@pytest.mark.integration)
+  - MODIFY tests/test_routes.py    (PATCH 200/404/422; unit-shaped via monkeypatched
+    workspace.update_workspace where the existing file does so, integration otherwise —
+    follow whichever convention the existing GET/DELETE tests use)
+
+Task 9 — docs (additive):
+  - docs/_base/API_CONTRACTS.md:
+    * NEW row: `demo | PATCH | /demo/workspaces/{workspace_id} | E1 (#407) — partial
+      lifecycle update (name/notes/tags/archived/pinned; exclude_unset, explicit null
+      clears name/notes; status NOT patchable); 404 problem+json when missing; 422 on
+      unknown keys / bad name pattern / >20 tags; empty body = 200 no-op`
+    * POST /demo/run row + WS /demo/stream section: additive Optional
+      `replayed_from_workspace_id` (`^[0-9a-f]{32}$`, requires preservation='keep');
+      Replay now sends it; recorded verbatim as a soft reference
+    * GET /demo/workspaces rows: note the additive response fields
+  - docs/_base/DOMAIN_MODEL.md § showcase_workspace:
+    * Stored metadata: add lifecycle columns + config_schema_version semantics
+    * JSONB fields: add the six story slots WITH their documented schemas (copy the
+      model-comment schemas verbatim — this is the authoritative copy)
+    * Invariants: replayed_from_workspace_id is a SOFT reference (no FK, dangles OK);
+      status not patchable; archived orthogonal to status
+    * Trim the "Out of scope" line that lists `replayed_from` as not-modeled (now shipped)
+  - docs/_base/RUNBOOKS.md § Showcase workspace: remove `replayed_from` from the
+    "Explicitly out of scope" list (one-line edit; the full runbook sweep is E7)
+
+Task 10 — gates, commit, PR:
+  - RUN the full Validation Loop (Levels 1-4)
+  - git diff --stat   # surgical diffs only (CRLF noise check)
+  - COMMITS (reference #407, no AI trailer), e.g.:
+      feat(db): extend showcase_workspace with metadata and provenance columns (#407)
+      feat(api): add workspace patch lifecycle endpoint and replay provenance (#407)
+      feat(ui): send replayed_from_workspace_id on showcase replay (#407)
+      docs(repo): document workspace story slots and patch contract (#407)
+  - PR into dev; title `feat(api,db): showcase-completion E1 — workspace metadata & provenance backbone (#407)`
+```
+
+### Integration Points
+
+```yaml
+DATABASE:
+  - migration: 12 add_column on showcase_workspace + ix_showcase_workspace_tags_gin (GIN)
+    + ix_showcase_workspace_replayed_from (btree); clean downgrade
+  - registration: alembic/env.py already imports demo models (line 19) — NO change
+
+CONFIG: none — no new settings, no env vars.
+
+ROUTES: PATCH /demo/workspaces/{workspace_id} on the existing demo router — no
+  app/main.py change (router already wired).
+
+PIPELINE: none — create_workspace reads the new field straight off req; the
+  keep-branch hook (pipeline.py:2652) and finalize hook (2741) are untouched.
+
+FRONTEND: two additive lines (Task 7). No new components; lineage badge/chain is E2.
+
+DOCS: API_CONTRACTS + DOMAIN_MODEL (+ one-line RUNBOOKS trim). Full sweep is E7.
+```
+
+## Validation Loop
+
+### Level 1: Syntax & Style
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+# Expected: clean. Both type checkers are --strict and gate merge.
+```
+
+### Level 2: Unit Tests (no DB)
+
+```python
+# tests/test_schemas.py — add:
+def test_demo_run_request_replayed_from_default_none() -> None: ...
+    # DemoRunRequest() -> replayed_from_workspace_id is None; legacy frame
+    # model_validate({"seed": 7}) still validates
+
+def test_demo_run_request_replayed_from_json_path() -> None: ...
+    # MANDATORY json-dict path (security-patterns.md § strict mode):
+    # model_validate({"preservation": "keep", "replayed_from_workspace_id": "a"*32})
+
+def test_demo_run_request_replayed_from_requires_keep() -> None: ...
+    # pytest.raises(ValidationError): model_validate({"replayed_from_workspace_id": "a"*32})
+
+def test_demo_run_request_replayed_from_pattern_rejected() -> None: ...
+    # "not-hex!", "ABC..." (uppercase), 31-char and 33-char values all raise
+
+def test_workspace_update_request_partial_fields_set() -> None: ...
+    # model_validate({"notes": None}).model_dump(exclude_unset=True) == {"notes": None}
+    # model_validate({}).model_dump(exclude_unset=True) == {}
+
+def test_workspace_update_request_rejects_unknown_key() -> None: ...
+    # model_validate({"status": "archived"}) raises (extra="forbid" — status not patchable)
+
+def test_workspace_update_request_name_pattern_and_tags_cap() -> None: ...
+    # "Bad Name!" raises; 21 tags raises; ["workspace:x", "demo"] passes
+
+def test_workspace_update_request_rejects_explicit_null_flags() -> None: ...
+    # pytest.raises(ValidationError): model_validate({"archived": None})
+    # pytest.raises(ValidationError): model_validate({"pinned": None})
+    # pytest.raises(ValidationError): model_validate({"tags": None})
+    # model_validate({"tags": []}) passes (the sanctioned clear path)
+    # (NOT NULL columns — explicit null must 422, never reach setattr)
+
+# tests/test_routes.py — add (follow the file's existing GET/DELETE conventions):
+async def test_patch_workspace_happy_path(...) -> None: ...
+    # PATCH {"name": "renamed", "pinned": true, "tags": ["t1"]} -> 200; response
+    # echoes the changes and the untouched fields
+async def test_patch_workspace_missing_404_problem_json(...) -> None: ...
+    # status 404; content-type application/problem+json
+async def test_patch_workspace_unknown_field_422(...) -> None: ...
+    # body {"bogus": 1} -> 422 problem+json
+async def test_patch_workspace_explicit_null_archived_422(...) -> None: ...
+    # body {"archived": null} -> 422 problem+json (NOT NULL column guard)
+async def test_patch_workspace_empty_body_noop_200(...) -> None: ...
+async def test_run_demo_rejects_replayed_from_without_keep_422(...) -> None: ...
+```
+
+```bash
+uv run pytest app/features/demo -v -m "not integration"
+uv run pytest app/core/tests/test_strict_mode_policy.py -v   # AST walker still green
+```
+
+### Level 3: Integration (real Postgres)
+
+```python
+# tests/test_models.py — @pytest.mark.integration, extend:
+#   - insert with NO new kwargs -> archived=False, pinned=False, tags=[],
+#     config_schema_version=1, all six slots None, replayed_from None
+#     (server_default + ORM default agreement)
+#   - tags JSONB roundtrip + containment: insert tags=["workspace:x","demo"];
+#     select(...).where(ShowcaseWorkspace.tags.contains(["demo"])) finds it
+#     (scenarios/service.py:464 query shape)
+#   - story-slot roundtrip: write a dict into seed_overrides and a list[dict]
+#     into approval_events; read back identical
+#   - status CHECK still enforced (regression — constraint untouched)
+
+# tests/test_workspace.py — @pytest.mark.integration, extend:
+#   - create_workspace with req.replayed_from_workspace_id set -> column recorded
+#     verbatim; without it -> None (legacy identical)
+#   - update_workspace partial: set name+pinned only -> other fields untouched;
+#     explicit name=None clears; tags replaced whole (not merged);
+#     missing workspace_id -> returns None (route maps to 404)
+#   - update_workspace empty request -> no-op, row returned
+```
+
+```bash
+docker compose up -d
+uv run alembic upgrade head
+uv run alembic downgrade -1 && uv run alembic upgrade head   # downgrade is clean
+uv run pytest app/features/demo -v -m integration
+```
+
+### Level 4: Manual smoke (seeded local stack, uvicorn on :8123 + vite)
+
+```bash
+# 1. Keep-run, then PATCH lifecycle round-trip:
+WS=$(curl -s -X POST http://localhost:8123/demo/run -H 'Content-Type: application/json' \
+  -d '{"skip_seed": true, "preservation": "keep", "workspace_name": "e1-smoke"}' \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['workspace_id'])")
+echo "WS=$WS"   # the new workspace_id; the commands below reuse it
+curl -s -X PATCH http://localhost:8123/demo/workspaces/$WS \
+  -H 'Content-Type: application/json' \
+  -d '{"name": "e1-renamed", "notes": "smoke", "tags": ["smoke"], "pinned": true}' | python3 -m json.tool
+curl -s -X PATCH http://localhost:8123/demo/workspaces/deadbeef -H 'Content-Type: application/json' -d '{}' \
+  | python3 -m json.tool    # 404 problem+json
+
+# 2. Replay provenance (browser): /showcase -> Saved workspaces -> Replay on
+#    the e1-renamed row; after the run:
+docker exec forecastlab-postgres psql -U forecastlab -d forecastlab -c \
+  "SELECT workspace_id, name, replayed_from_workspace_id FROM showcase_workspace ORDER BY created_at DESC LIMIT 2;"
+# Expect: newest row's replayed_from_workspace_id == $WS; the $WS row unchanged.
+
+# 3. Frontend gates:
+cd frontend && pnpm lint && pnpm test --run
+# pnpm tsc -b — confirm no NEW errors vs the dev baseline (gate is vacuous-aware,
+# see Known Gotchas).
+```
+
+## Final validation Checklist
+
+- [ ] All five gates green: `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ && uv run pyright app/ && uv run pytest -v -m "not integration"`
+- [ ] Integration suite green: `uv run pytest -v -m integration` (fresh docker-compose DB; reset first if the shared DB is polluted)
+- [ ] Migration upgrade + downgrade clean on a fresh DB AND applies on a DB with existing workspace rows
+- [ ] Legacy surfaces byte-identical: start frame without new keys, GET list/detail for old rows (new fields all default/null), `test_strict_mode_policy.py` green
+- [ ] PATCH 200 / 404 / 422 paths verified (Level 2 + Level 4)
+- [ ] Replay records `replayed_from_workspace_id`; source row untouched (Level 4 step 2)
+- [ ] `git diff --stat` shows surgical diffs (no CRLF whole-file noise)
+- [ ] docs/_base/API_CONTRACTS.md + DOMAIN_MODEL.md updated additively (slot schemas documented); RUNBOOKS out-of-scope line trimmed
+- [ ] Commits `feat(db)/feat(api)/feat(ui)/docs(repo): ... (#407)`, no AI trailer; PR into dev
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't add ANY ForeignKey — not even self-referential on `replayed_from_workspace_id`. Soft references only.
+- ❌ Don't edit `324a2fa37fcc_create_showcase_workspace_table.py` — new revision off head `324a2fa37fcc`.
+- ❌ Don't make `status` patchable or widen `ck_showcase_workspace_status` — `archived` is the orthogonal flag.
+- ❌ Don't add `extra="forbid"` to `DemoRunRequest` (WS compat) — but DO add it to `WorkspaceUpdateRequest`.
+- ❌ Don't write any story slot from E1 production code — columns + docs + roundtrip tests only.
+- ❌ Don't validate that `replayed_from_workspace_id` points at an existing row — it's a soft reference; dangles are designed.
+- ❌ Don't wrap `update_workspace` in warn-and-continue — that contract is pipeline-only; HTTP helpers raise.
+- ❌ Don't add list filtering/sorting/search or archive-hiding — that's E2 (#408).
+- ❌ Don't add a replay confirmation dialog or lineage UI — E2 (#408).
+- ❌ Don't mutate JSONB values in place — always assign whole values.
+- ❌ Don't import another feature slice from `app/features/demo/` — core/shared only.
+
+## Notes for parallel-epic PRP authors (#408–#412)
+
+- The column set, slot names, and per-slot schemas above are the frozen E1 contract.
+  `job_ids` / `phase_summaries` have a documented schema but NO assigned writer in
+  E1 — E2 (#408, health summary) and E4 (#410, config echo) should agree on which
+  populates which and follow the documented shapes.
+- Slot writes that happen DURING a pipeline run inherit the warn-and-continue
+  invariant (extend `finalize_workspace` / add sibling helpers in `workspace.py`);
+  slot writes via HTTP go through caller-owned-session helpers like
+  `update_workspace`.
+- Tag filtering on `GET /demo/workspaces` (E2) should reuse the
+  `ShowcaseWorkspace.tags.contains([...])` containment shape proven in E1's
+  integration test, mirroring `GET /scenarios?tags=` (scenarios/routes.py:180).
+- A schema change to any slot bumps `config_schema_version` (ORM default) and
+  documents the delta in DOMAIN_MODEL.
+
+## Confidence Score
+
+**9/10** for one-pass implementation success. Every element has a verified in-repo
+precedent: the add-columns+GIN migration (`bb8c4587ef1d`), the tags column
+(`scenarios/models.py:74`), the partial-update schema (`registry RunUpdate`), the
+404-on-missing route shape (the demo DELETE), and the request-field+validator pattern
+(`workspace_name`, same file). The three judgment calls (tags representation, slot
+shape, no-FK soft reference) are resolved and frozen above, and all changes are
+additive — a wrong slot-schema guess costs a documented `config_schema_version` bump,
+not a rework. The −1: the PATCH route tests must match whichever
+unit-vs-integration convention `test_routes.py` currently uses for the workspace
+GET/DELETE endpoints (read it first), and the frontend type-gate baseline is fuzzy
+on this host (`tsc -b` has pre-existing dev failures — gate on "no NEW errors").
diff --git a/PRPs/PRP-showcase-completion-E2-safe-replay-lifecycle.md b/PRPs/PRP-showcase-completion-E2-safe-replay-lifecycle.md
new file mode 100644
index 00000000..6adfe994
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E2-safe-replay-lifecycle.md
@@ -0,0 +1,1247 @@
+name: "PRP — Showcase Completion E2: Safe Replay & Workspace Lifecycle (issue #408)"
+description: |
+
+## Purpose
+
+Implement the safe-replay + workspace-lifecycle epic of the showcase-completion
+initiative (umbrella #406): an explicit confirmation step (with preview/diff)
+before every replay — destructive copy when `reset=true` — lineage rendering of
+the E1 `replayed_from_workspace_id` chain, full lifecycle management on the
+saved-workspaces panel (rename / archive / pin / notes / tags / search /
+filter / sort / multi-select delete), a two-workspace compare view, and the
+folded-in ops slice: artifact-link liveness checks with dead-link warnings on
+soft references plus a per-workspace health summary (partial-run warning
+included). Parallel epic after Foundation E1 (#407) — **execution starts only
+AFTER E1 merges**; this PRP treats E1's epic body as a frozen contract (every
+dependency on it is tagged `CONTRACT(E1)` below).
+
+## Core Principles
+
+1. **Context is King**: every reference below was verified against the live code on 2026-06-12 (branch `dev`, post-#404/#405 merge — E1 #407 NOT yet merged; see the E1-reconciliation task).
+2. **Validation Loops**: each level is executable as written.
+3. **Information Dense**: patterns cite exact file:line.
+4. **Progressive Success**: backend list-filters + health endpoint → frontend types/hooks → confirm/diff dialog → lifecycle panel rework → lineage → compare page → docs.
+5. **Global rules**: follow CLAUDE.md / AGENTS.md; all five CI gates must pass; UI work follows `.claude/rules/ui-design.md` + `.claude/rules/shadcn-ui.md`.
+
+---
+
+## Goal
+
+An operator on `/showcase` can:
+
+- (a) **Replay safely** — clicking Replay opens a confirmation dialog showing a
+  preview/diff: the recorded config (seed / scenario / reset / skip_seed /
+  name) side-by-side with the exact `DemoRunRequest` about to be sent, any
+  divergence highlighted. When the recorded config has `reset=true`, the
+  dialog carries explicit destructive copy ("Replaying this workspace WIPES
+  the database") and a destructive-styled confirm button. No replay starts
+  without confirmation.
+- (b) **See lineage** — a workspace created by a replay carries a "replay"
+  badge in the list; the loaded-workspace view renders the
+  `replayed_from_workspace_id` chain (newest → original), with dangling
+  ancestors (deleted rows) marked rather than erroring.
+- (c) **Manage the library** — per-row actions: rename, edit notes, edit tags,
+  pin/unpin, archive/unarchive (all via the E1 `PATCH /demo/workspaces/{id}`),
+  plus the existing single delete. The list gains a search box (name), a
+  show-archived toggle (archived hidden by default), a tag filter, and an
+  allow-listed sort; pinned rows always sort first.
+- (d) **Multi-select delete** — checkbox per row, "Delete selected (N)" behind
+  one confirmation dialog, implemented as N sequential single
+  `DELETE /demo/workspaces/{id}` calls. **No new bulk endpoint** (metadata-only
+  singles; vision-compatible — no "wipe everything" operation).
+- (e) **Compare two workspaces** — select exactly two rows → Compare navigates
+  to a new deep-linkable page (`/showcase/compare?a=&b=`) mirroring the
+  run-compare two-picker pattern: config diff, result-summary diff (winner /
+  WAPE delta / wall-clock), created-objects presence matrix, lineage relation.
+- (f) **See link health** — loading a workspace probes its soft references
+  (model runs, scenario plans, alias, batch, agent session, E1 `job_ids`)
+  through a new backend aggregation endpoint
+  `GET /demo/workspaces/{id}/health`; dead references render a warning marker
+  on the artifact cards and a per-workspace health summary chip shows
+  alive/dead counts plus a partial-run warning when the run never completed.
+
+**Deliverable** (all additive — no migration in E2; the schema delta is E1's):
+
+- `app/features/demo/workspace.py` — `list_workspaces` / `count_workspaces`
+  gain filter/sort parameters (`q`, `tags`, `include_archived`, `sort_by`,
+  `sort_order`; pinned-first ordering).
+- `app/features/demo/link_health.py` — NEW: in-process soft-reference probe
+  module (httpx `ASGITransport`, mirroring `pipeline._Client`).
+- `app/features/demo/schemas.py` — `WorkspaceRefHealth`,
+  `WorkspaceHealthResponse` response models (plain BaseModel, NOT strict).
+- `app/features/demo/routes.py` — query params on `GET /demo/workspaces`;
+  NEW `GET /demo/workspaces/{workspace_id}/health`.
+- `frontend/src/types/api.ts` — lifecycle fields on the workspace types
+  (verify-or-add per CONTRACT(E1)), health types, list-params type,
+  `WorkspaceUpdate` type.
+- `frontend/src/hooks/use-workspaces.ts` — params-aware `useWorkspaces`,
+  `usePatchWorkspace`, `useWorkspaceHealth`, `useWorkspaceLineage`.
+- `frontend/src/components/demo/ReplayConfirmDialog.tsx` — NEW confirm +
+  preview/diff dialog.
+- `frontend/src/components/demo/WorkspaceEditDialog.tsx` — NEW
+  rename/notes/tags editor.
+- `frontend/src/components/demo/WorkspaceLineageStrip.tsx` — NEW lineage chain.
+- `frontend/src/components/demo/WorkspacePanel.tsx` — reworked: toolbar
+  (search / show-archived / sort), row badges (pinned, archived, replay),
+  per-row actions dropdown, multi-select + delete-selected + compare-selected.
+- `frontend/src/components/demo/WorkspaceArtifactsPanel.tsx` — health-aware
+  cards (dead-link warnings) + health summary chip.
+- `frontend/src/pages/workspace-compare.tsx` — NEW two-workspace compare page;
+  route + `ROUTES.SHOWCASE_COMPARE` constant.
+- `frontend/src/pages/showcase.tsx` — replay-confirm flow, lineage strip +
+  health wiring, `replayed_from_workspace_id` on the replay start frame.
+- Tests: backend route + module unit tests, integration tests for list filters
+  and health; frontend vitest for every new/changed component + hook.
+- `docs/_base/API_CONTRACTS.md` + `docs/_base/RUNBOOKS.md` — additive updates
+  (incl. superseding the "deliberately no confirm dialog" note).
+
+**Success definition**: all Success Criteria below check off, the five backend
+CI gates and the frontend gates are green, and a manual browser dogfood on a
+seeded stack walks: save → search/sort → rename/pin/archive → replay (confirm
+dialog with diff, destructive variant on a reset workspace) → lineage chain
+visible → two-workspace compare → delete a referenced run → health shows the
+dead link.
+
+## Why
+
+- Umbrella #406 success criteria commit: "a `reset=true` replay requires an
+  explicit confirmation step before it runs" and "Workspaces can be renamed,
+  archived, pinned, annotated (notes/tags), searched, filtered, sorted, and
+  multi-select-deleted (metadata-only) from the saved-workspaces panel".
+- Today a replay of a `reset=true` workspace wipes the database with **no
+  confirmation**. This is documented, by-design behavior
+  (`docs/_base/RUNBOOKS.md` § "Showcase workspace", item 1: "there is
+  deliberately no confirm dialog") that #406 explicitly reverses.
+- E1 (#407) ships the storage + PATCH surface but no UI consumes it; E2 is the
+  delivery surface that makes lifecycle, lineage, and provenance visible.
+- `created_objects` ids are soft references by design — operator deletes leave
+  dangling deep links ("expected; the workspace row records what WAS created,
+  not what still exists", RUNBOOKS § Showcase workspace item 4). Link health
+  turns that silent staleness into a visible, per-workspace signal — the novel
+  ops slice #406 folded into this epic.
+
+## What
+
+### Decisions locked here (so implementation doesn't re-litigate)
+
+These were the open questions this PRP owns; the decisions below are final for E2.
+
+1. **Replay-policy picker (exact / safe-keep / modified): OUT OF SCOPE.**
+   Replay stays verbatim (`E4 #393` semantics). Rationale: the umbrella
+   commits only confirm + preview/diff; a "modified replay" already exists as
+   Load → edit controls → Run (the Load path repopulates every control); a
+   policy enum would add request-surface + backend validation for zero new
+   capability. The confirm dialog's footer carries a one-line hint —
+   "Want to change the config first? Use Load instead." Document the
+   deferral in the PR description.
+2. **Confirmation applies to EVERY replay, not just `reset=true`.** The
+   preview/diff panel needs a pre-flight surface, and a sometimes-there dialog
+   is worse UX than an always-there one. The `reset=true` variant escalates:
+   destructive copy + destructive-styled action button. This satisfies the
+   umbrella's requirement (explicit confirmation before any `reset=true`
+   replay) as a strict superset. The direct Run button (operator-configured
+   runs) is unchanged: confirmation guards replays only.
+3. **Link-health architecture: BACKEND aggregation endpoint**
+   (`GET /demo/workspaces/{id}/health`), implemented by probing the public
+   API **in-process** via `httpx.ASGITransport` — the exact mechanism
+   `pipeline._Client` already uses from inside a request context
+   (`app/features/demo/pipeline.py:141-148`; `POST /demo/run` passes
+   `request.app` into the pipeline at `routes.py:75`). Justification:
+   (a) the demo slice may NOT import registry/scenarios/jobs/agents services
+   (vertical-slice rule), and in-process HTTP through the public surface is
+   the slice's established cross-slice seam; (b) one workspace can carry 10+
+   references (3 runs + N plans + alias + batch + session + M jobs) — a
+   frontend-probed design costs 1+N browser round-trips per workspace and
+   duplicates existence semantics per artifact type; (c) a backend endpoint
+   gives the health summary a single testable contract and a place for the
+   partial-run flag. Probes run concurrently (`asyncio.gather`), classify
+   2xx→`alive`, 404→`dead`, anything else→`unknown`, and run only
+   on demand (for the loaded workspace, never for every list row).
+4. **Compare view: FRONTEND-ONLY page.** A workspace compare is a plain field
+   diff over two already-served `WorkspaceDetail` payloads — no new backend
+   endpoint (contrast: `GET /registry/compare/{a}/{b}` exists because metric
+   diffing has server-side logic). New page `/showcase/compare?a=&b=`
+   mirroring `frontend/src/pages/explorer/run-compare.tsx` (two `Select`
+   pickers + `useSearchParams` deep-linking).
+5. **Multi-select delete = N sequential single DELETEs.** The existing
+   `DELETE /demo/workspaces/{id}` is called once per selected row behind one
+   confirmation dialog. NO new bulk endpoint — product-vision guardrail ("no
+   wipe-everything operations"); failures are collected and toasted, the list
+   refetches once at the end.
+6. **Search/filter/sort: SERVER-SIDE additive query params** on
+   `GET /demo/workspaces`, mirroring established precedents: name search →
+   `dimensions` `search` ILIKE pattern (`app/features/dimensions/routes.py:65`),
+   tags → `scenarios` repeated-`tags` JSONB containment
+   (`app/features/scenarios/routes.py:180`, `service.py:462-465`), sort →
+   allow-listed `sort_by`/`sort_order` with silent fallback to default
+   (`dimensions/routes.py:70-75`). `include_archived=false` is the default
+   (archived rows hidden). Pinned rows always order first
+   (`ORDER BY pinned DESC, <sort>`). Server-side keeps the panel honest as
+   rows accumulate and gives the filter a route-test contract.
+
+### Frozen contract — CONTRACT(E1) (#407 ships these; E2 consumes, never re-decides)
+
+Every assumption below MUST be reconciled against the merged E1 diff before
+implementation (Task 1). Where E1's PRP chose different names, adapt E2's code
+to E1's names — never the reverse.
+
+- `CONTRACT(E1)-1` — `showcase_workspace` columns exist post-migration:
+  `replayed_from_workspace_id` (nullable String(32), soft reference — NO FK,
+  consistent with `models.py` no-FK doctrine), `archived` (bool, default
+  false), `pinned` (bool, default false), `notes` (nullable text), `tags`
+  (JSONB string array, default `[]`), `config_schema_version` (int).
+- `CONTRACT(E1)-2` — `tags` representation is a JSONB string array with a GIN
+  index, mirroring `scenario_plan.tags`
+  (`app/features/scenarios/models.py:74,97`), so SQLAlchemy
+  `.contains([tag])` containment filtering works.
+- `CONTRACT(E1)-3` — `PATCH /demo/workspaces/{workspace_id}` exists with an
+  all-Optional partial-update body (rename/notes/tags/archive/pin — assumed
+  schema name `WorkspaceUpdateRequest`, semantics mirroring registry
+  `RunUpdate`, `app/features/registry/schemas.py:113-121`: absent field =
+  unchanged), returns the updated workspace (assumed
+  `WorkspaceDetailResponse`), 404 problem+json on a missing id.
+- `CONTRACT(E1)-4` — the GET list/detail response schemas expose the new
+  columns (`WorkspaceListItem` += `archived`, `pinned`, `tags`,
+  `replayed_from_workspace_id`; `WorkspaceDetailResponse` += `notes`,
+  `config_schema_version` and the JSONB story slots it serves). **Defensive
+  rule**: if E1 did NOT extend the GET responses, E2 adds the fields
+  additively in Task 3 (they are required reading surface for this epic).
+- `CONTRACT(E1)-5` — replay provenance mechanism: `DemoRunRequest` (and the
+  WS start frame) carries an additive Optional
+  `replayed_from_workspace_id: str | None` that `workspace.create_workspace`
+  persists onto the new row (E1's epic body: "Replay writes
+  `replayed_from_workspace_id`"). NOTE: E1's PRP itself wires the frontend
+  send (handleReplayWorkspace sends `ws.workspace_id` — an E1 success
+  criterion), so E2 PRESERVES the field through the executeReplay refactor
+  rather than adding it; if E1 instead derived it server-side, E2 adapts.
+- `CONTRACT(E1)-6` — the `job_ids` JSONB story slot is a `list[str]` of job
+  ids; the health endpoint probes each via `GET /jobs/{job_id}` when the slot
+  is non-empty (and silently skips when absent/empty — pre-E1-backfill rows).
+- `CONTRACT(E1)-7` — E1 does NOT add filtering/sorting to
+  `GET /demo/workspaces` (its scope is migration + PATCH + schemas); the list
+  query params are E2's to add. If E1's merged code already added any of
+  them, reuse instead of duplicating.
+
+### User-visible behavior
+
+- **Replay confirm/diff**: Replay button → dialog titled "Replay workspace
+  \"name\"?" with a two-column table (Recorded / Will send) over seed,
+  scenario, reset, skip_seed, workspace name, preservation (always `keep`),
+  replayed-from (the source workspace id). Rows where the two values differ
+  are highlighted (defensive — verbatim replay means they normally match).
+  `reset=true` → red warning block + destructive confirm button labeled
+  "Replay & wipe database"; otherwise a default confirm labeled "Replay".
+  Cancel never starts a run.
+- **Lineage**: list rows with `replayed_from_workspace_id != null` show an
+  outline `Badge` "replay". The loaded-workspace view renders a breadcrumb
+  strip: `this ← parent ← grandparent …` (depth-capped at 5), each ancestor
+  clickable (loads it); a deleted ancestor renders as
+  "(original deleted)" — dangling soft references are expected, never errors.
+- **Lifecycle panel**: toolbar = search `Input` (filters by name,
+  debounced/enter-applied), "Show archived" `Checkbox`, sort `Select`
+  (Newest / Oldest / Name / Status). Rows: pin icon (filled when pinned),
+  muted styling + "archived" badge on archived rows, tags rendered as small
+  chips (clicking a chip filters the list by that tag; an active tag filter
+  shows as a clearable chip in the toolbar). Per-row `DropdownMenu` (lucide
+  `MoreHorizontal`): Pin/Unpin, Archive/Unarchive, Edit details…, Delete….
+  "Edit details…" opens `WorkspaceEditDialog` (name input with the
+  `^[a-z0-9][a-z0-9\-_]*$` client validation already used by the run controls,
+  notes `Textarea`, comma-separated tags input).
+- **Multi-select**: leading `Checkbox` per row + header select-all; selection
+  shows "N selected" with **Delete selected** (AlertDialog: "Delete N
+  workspace records? Their created objects are NOT deleted.") and **Compare**
+  (enabled only when exactly 2 selected → navigates to the compare page).
+- **Compare page** (`/showcase/compare?a=&b=`): back-link to `/showcase`, two
+  workspace `Select` pickers (deep-linkable URL params), then: config table
+  (seed/scenario/reset/skip_seed/name/tags, mismatches highlighted),
+  result-summary table (winner, WAPE with the `DeltaCell` sign-only
+  indicator, wall-clock), created-objects presence matrix (per soft-reference
+  key: recorded A / recorded B), lineage note when one side is a replay of
+  the other, partial-run badge per side when `status != "completed"`.
+- **Link health**: loading a workspace fires
+  `GET /demo/workspaces/{id}/health`; the artifacts panel shows a summary
+  chip — `✓ N live · ✕ M dead` (plus "partial run" warning chip when the
+  row's status is not `completed`) — and each card whose reference probed
+  `dead` gets a lucide `AlertTriangle` + tooltip "This object no longer
+  exists — it was deleted after the run." `unknown` references render
+  without a marker (no false alarms on transient 5xx).
+
+### Technical requirements
+
+- All five backend gates green; frontend `pnpm lint && pnpm test --run` green.
+- New/changed endpoints: route tests covering 2xx + at least one error path
+  (`.claude/rules/test-requirements.md`).
+- RFC 7807 for every error path (`NotFoundError` from `app/core/exceptions.py:72`).
+- Response models stay plain `BaseModel` (+`from_attributes` where ORM-built)
+  — strict mode is request-body-only policy (`demo/schemas.py:88-95` precedent).
+- The demo slice imports NO other feature slice — link health goes through
+  in-process HTTP (`request.app` + `ASGITransport`), never a service import.
+- Frontend: TanStack Query for all IO; shadcn/ui new-york primitives only
+  (everything needed is already installed — see gotchas); lucide icons;
+  semantic tokens only (`text-destructive`, `bg-muted`, …) — no raw colors.
+- Legacy behavior byte-identical: a client that never touches the new query
+  params / endpoints sees today's responses (new list params all default to
+  today's semantics EXCEPT archived-hidden and unconditional pinned-first
+  ordering; see the gotchas on `include_archived` and the sort allow-list).
+
+### Success Criteria
+
+- [ ] Replay (panel button) always opens the confirm dialog with the
+      recorded-vs-sent preview; confirming a `reset=true` workspace requires
+      the destructive-styled button; Cancel starts nothing. No code path
+      starts a replay without the dialog.
+- [ ] A confirmed replay sends the recorded config verbatim +
+      `preservation="keep"` + the recorded name + `replayed_from_workspace_id`
+      (CONTRACT(E1)-5); the new row carries the provenance id and the list
+      shows its "replay" badge; the loaded view renders the ancestor chain,
+      tolerating deleted ancestors.
+- [ ] Rename / notes / tags / pin / archive each round-trip through
+      `PATCH /demo/workspaces/{id}` and re-render without a manual refresh
+      (query invalidation on list + detail).
+- [ ] `GET /demo/workspaces` supports `q` (name ILIKE), `tags` (repeated,
+      containment), `include_archived` (default false), allow-listed
+      `sort_by`/`sort_order` (unknown → default `created_at desc`); pinned
+      rows order first; `total` respects the active filters; route tests
+      cover each param + the bad-param paths.
+- [ ] Multi-select delete removes N metadata rows via N single DELETEs behind
+      one confirmation; created objects untouched; NO new bulk endpoint exists.
+- [ ] `/showcase/compare?a=&b=` deep-links two workspaces and renders config
+      diff, result diff, created-objects matrix, lineage note, partial-run
+      badges; invalid/missing ids degrade to the picker (no crash).
+- [ ] `GET /demo/workspaces/{id}/health` returns per-reference
+      `alive`/`dead`/`unknown` + counts + `partial_run`; 404 problem+json on a
+      missing workspace; integration test proves a bogus reference probes
+      `dead` and a real one probes `alive`.
+- [ ] Loaded-workspace artifact cards show dead-link warnings + the health
+      summary chip; the partial-run warning renders for non-completed rows.
+- [ ] Legacy list calls (no new params) return archived-free, pinned-first,
+      newest-first pages; all pre-existing demo tests still pass.
+- [ ] `uv run ruff check . && uv run ruff format --check . && uv run mypy app/
+      && uv run pyright app/ && uv run pytest -v -m "not integration"` green;
+      integration suite green; `cd frontend && pnpm lint && pnpm test --run`
+      green.
+
+## Assumptions (no user available — documented, not asked)
+
+1. E1 (#407) merges before E2 execution begins (implementation-order gate from
+   the umbrella). This PRP is authored against pre-E1 `dev`; Task 1
+   reconciles every CONTRACT(E1) point against E1's actual merged shape.
+2. Exact E1 schema/endpoint names (`WorkspaceUpdateRequest`, field names as
+   listed in CONTRACT(E1)) — adapt to E1's real names on divergence.
+3. Archived-by-default-hidden is the correct list semantics (that is what
+   "archive" means for a library); the only consumer of `GET /demo/workspaces`
+   is the Showcase panel (verified — no other frontend or backend caller), so
+   the default-flip is safe.
+4. Health probing is acceptable on-demand-only (loaded workspace), not for
+   every list row — probing N rows × M references on list render would be a
+   self-inflicted thundering herd through the in-process transport.
+5. The lineage chain depth cap of 5 is sufficient (a replay-of-a-replay chain
+   deeper than 5 is a pathological case; the strip renders "…" beyond it).
+6. `sonner` `toast` (already used by `WorkspacePanel.tsx:20`) is the
+   feedback surface for mutation success/failure — no new notification system.
+7. Tag editing via a comma-separated text input is acceptable UX for a
+   single-operator tool (no tag-autocomplete component is installed; building
+   one is out of scope).
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# MUST READ — issues (the contract stack)
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/408
+  why: The epic this PRP implements — scope list is exhaustive (this PRP covers all of it).
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/406
+  why: Umbrella — success criteria rows 2 & 3 are E2's acceptance bar; out-of-scope list (no replay-policy infra beyond confirm+diff).
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/407
+  why: Foundation epic body = the frozen CONTRACT(E1) surface (columns, JSONB slots, PATCH endpoint, replay provenance write).
+- file: PRPs/PRP-showcase-workspace-E4-restore-replay.md
+  why: Closest-analog predecessor PRP — the E4 restore/replay semantics E2 hardens; its "decisions locked" #2/#3 (no confirm dialog, no provenance) are the two designed behaviors #406/#407 now reverse.
+
+# MUST READ — backend (verified 2026-06-12, dev pre-E1)
+- file: app/features/demo/routes.py
+  why: |
+    Current surface: POST /run @51 (passes request.app into the pipeline @75 —
+    the request-context app handle the health route also needs), GET
+    /workspaces @80-107 (limit/offset only — EXTEND with filters), GET
+    /workspaces/{workspace_id} @110-135 (NotFoundError 404 pattern @133-134),
+    DELETE @138-163, WS /stream @166. Router prefix="/demo" @48. Health route
+    lands between the GET detail and DELETE.
+- file: app/features/demo/workspace.py
+  why: |
+    list_workspaces @174-196 (order created_at.desc, id.desc @192) and
+    count_workspaces @224-234 — the two functions E2 extends with q/tags/
+    include_archived/sort_by/sort_order. get_workspace @158, delete_workspace
+    @199. All take caller-owned AsyncSession. create_workspace @46 is E1's to
+    extend (replayed_from) — DO NOT touch unless E1 missed it.
+- file: app/features/demo/models.py
+  why: |
+    ShowcaseWorkspace @37; current columns @59-81; CHECK + composite index
+    @83-89. E1 adds the lifecycle/provenance columns here — E2 reads them,
+    never migrates. No-FK doctrine in the module docstring @4-11 (the health
+    feature exists BECAUSE of this doctrine).
+- file: app/features/demo/schemas.py
+  why: |
+    DemoRunRequest @29 (strict=True @40; preservation @68; workspace_name
+    pattern @72-78; requires-keep validator @80-85 — the model E1 extends with
+    replayed_from_workspace_id). Response-model non-strict precedent: StepEvent
+    docstring @88-95, WorkspaceListItem @169 (from_attributes @177),
+    WorkspaceDetailResponse @192, WorkspaceListResponse @205. Append the two
+    health models here.
+- file: app/features/demo/pipeline.py
+  why: |
+    THE in-process probe mechanism to copy into link_health.py: _Client
+    @127-204 — httpx.AsyncClient(transport=httpx.ASGITransport(app=app,
+    raise_app_exceptions=False), base_url cosmetic, timeout @98) and
+    request() status handling @188-200. link_health needs a SIMPLER client:
+    status-code classification only, no _StepError. DO NOT modify pipeline.py
+    in E2 (E1 owns the provenance write; replay flows through unchanged).
+- file: app/features/demo/tests/test_routes.py
+  why: |
+    Route-test conventions to extend: unit tests monkeypatch the workspace
+    module functions (list @236-251, pagination pass-through @253-276, 404
+    @286-298, delete @324-347); integration tests @359+ use the db_session
+    fixture and seed real rows. New filter/health tests follow these shapes.
+- file: app/features/demo/tests/conftest.py
+  why: client fixture (ASGITransport over app.main.app) + db_session fixture
+    (real Postgres, wipes showcase_workspace on teardown).
+- file: app/features/scenarios/routes.py
+  why: |
+    Repeated-tags Query param precedent @168-195 (tags: list[str] | None =
+    Query(default=None)) — copy for the workspace list. GET detail 404 style
+    @198-223.
+- file: app/features/scenarios/service.py
+  why: list_plans @436-472 — tags containment filter @462-465
+    (stmt.where(ScenarioPlan.tags.contains(tags))) applied to BOTH count and
+    rows statements; total respects filters. Mirror exactly.
+- file: app/features/scenarios/models.py
+  why: tags JSONB string-array column @70-74 + GIN index @97 — the
+    representation CONTRACT(E1)-2 assumes for workspace tags.
+- file: app/features/dimensions/routes.py
+  why: |
+    search + allow-listed sort precedent @65-105 (search Query min-2-chars,
+    sort_by Query with allow-list note "unknown values use default order",
+    sort_order asc|desc). Mirror the docstring + silent-fallback style.
+- file: app/features/registry/schemas.py
+  why: RunUpdate @113-121 — the all-Optional partial-update body shape
+    CONTRACT(E1)-3 assumes for WorkspaceUpdateRequest (extra="forbid").
+- file: app/features/registry/routes.py
+  why: |
+    PATCH precedent @235; probe targets for link health: GET /registry/runs/
+    {run_id} @200-201, GET /registry/aliases/{alias_name} @503-504.
+- file: app/features/jobs/routes.py
+  why: probe target GET /jobs/{job_id} @219-220.
+- file: app/features/batch/routes.py
+  why: probe target GET /batch/{batch_id} @55-62 (NotFoundError on miss).
+- file: app/features/agents/routes.py
+  why: probe target GET /agents/sessions/{session_id} @80-104 — 404 via plain
+    HTTPException (status code is all the probe reads; body shape irrelevant).
+- file: app/core/exceptions.py
+  why: NotFoundError @72 (RFC 7807 404). No new exception classes needed.
+
+# MUST READ — frontend (verified 2026-06-12)
+- file: frontend/src/pages/showcase.tsx
+  why: |
+    453 lines. State block @118-131 (seed/keepWorkspace/workspaceName/
+    selectedWorkspaceId + useWorkspace detail resolution @128-131); handleRun
+    @139-156; handleLoadWorkspace @160-168; handleReplayWorkspace @174-186 —
+    THE function the confirm dialog intercepts (today it calls start()
+    directly); WorkspacePanel mount @245-255; name-pattern client validation
+    @26 + @135-137 (reuse in WorkspaceEditDialog); WorkspaceArtifactsPanel
+    mount @448-450 (gets health props).
+- file: frontend/src/components/demo/WorkspacePanel.tsx
+  why: |
+    219 lines — the component this epic reworks. Props @37-48; statusClass
+    @50-59 (semantic-token status colors); DESTRUCTIVE marker @144-148
+    (text-destructive span); per-row buttons @153-183; the AlertDialog
+    delete-confirm pattern @191-216 (open-state via pendingDelete, one shared
+    dialog for all rows, data-testid on the action) — COPY this pattern
+    for ReplayConfirmDialog + the multi-delete confirm; list invalidation
+    effect @106-110.
+- file: frontend/src/components/demo/WorkspacePanel.test.tsx
+  why: vitest conventions for this component family (mock use-workspaces
+    hooks via vi.mock, fire dialog actions, assert mutation calls).
+- file: frontend/src/components/demo/WorkspaceArtifactsPanel.tsx
+  why: |
+    157 lines. ArtifactCard shape @15-20, buildCards key mapping @30-107
+    (winning_run_id/v2_run_id/scenario_plan_ids/batch_id/alias/
+    agent_session_id + grain), disabled-card opacity-50 + title tooltip
+    @128-149. Health markers extend buildCards: each card gains an optional
+    `dead: boolean` resolved from the health response keyed by reference id.
+- file: frontend/src/hooks/use-workspaces.ts
+  why: |
+    43 lines — extend in place. useWorkspaces @10-16 (queryKey ['workspaces',
+    {limit}] — params object grows), useWorkspace @19-25, useDeleteWorkspace
+    @33-42 (invalidate ['workspaces'] on success — same invalidation for
+    usePatchWorkspace). useWorkspaceHealth + useWorkspaceLineage are new
+    siblings here.
+- file: frontend/src/pages/explorer/run-compare.tsx
+  why: |
+    THE compare-page pattern (370 lines): useSearchParams a/b @87-89,
+    selectRun setParams updater @103-109, RunPicker Select @56-84, DeltaCell
+    sign-only indicator @33-54, side-by-side Card/Table layout @114+. The
+    workspace compare page mirrors all of it with useWorkspace×2 instead of
+    useCompareRuns (frontend-only diff — Decision 4).
+- file: frontend/src/lib/constants.ts
+  why: ROUTES.SHOWCASE='/showcase' @4, ROUTES.EXPLORER.RUN_COMPARE @20 — add
+    SHOWCASE_COMPARE='/showcase/compare' beside SHOWCASE.
+- file: frontend/src/App.tsx
+  why: lazy-page + Suspense route registration pattern (ShowcasePage @12,
+    @54-61; RunComparePage @21, @119-126) — register WorkspaceComparePage
+    identically.
+- file: frontend/src/lib/api.ts
+  why: api<T>(endpoint, {params, method, body}) wrapper; ApiError carries
+    status (WorkspacePanel.tsx:97 shows instanceof usage); getErrorMessage.
+- file: frontend/src/types/api.ts
+  why: workspace types block @806-831 (WorkspaceListItem @806, WorkspaceDetail
+    @819, WorkspaceListResponse @828); DemoRunRequest @778-787 — extend here.
+- file: frontend/src/hooks/use-demo-pipeline.ts
+  why: start(req) signature + the picker-desync gotcha (start() does not sync
+    the scenario picker — Replay must setScenario first; already handled in
+    handleReplayWorkspace, keep that ordering inside the confirmed path).
+
+# Project docs to update (additive)
+- file: docs/_base/API_CONTRACTS.md
+  why: GET /demo/workspaces row gains the filter params; new health-endpoint
+    row; WS section note for replayed_from (if E1 didn't already add it).
+- file: docs/_base/RUNBOOKS.md
+  why: § "Showcase workspace — preserve/restore/replay/delete semantics" item 1
+    says "there is deliberately no confirm dialog" — E2 supersedes this
+    (update the item; keep the DESTRUCTIVE-marker sentence). Items 2-4 gain
+    one-line pointers to lineage badges / metadata-only multi-delete / health.
+- file: docs/_base/DOMAIN_MODEL.md
+  why: showcase_workspace § "Out of scope" lists the replayed_from column —
+    E1's PRP owns that doc edit; E2 only verifies it happened (do not double-edit).
+```
+
+### Current Codebase tree (relevant subset, pre-E1)
+
+```bash
+app/features/demo/
+├── link_health.py     # DOES NOT EXIST — E2 creates
+├── models.py          # ShowcaseWorkspace @37 (E1 extends; E2 reads)
+├── pipeline.py        # 2771 lines; _Client @127 — UNTOUCHED in E2
+├── routes.py          # POST /run @51; GETs @80,@110; DELETE @138; WS @166
+├── schemas.py         # 214 lines; workspace response models @169-213
+├── service.py         # lock + PipelineBusyError — untouched
+├── workspace.py       # 235 lines; list @174 / count @224 — E2 extends
+└── tests/             # conftest, test_{models,pipeline,routes,schemas,workspace}.py
+frontend/src/
+├── pages/showcase.tsx                       # 453 lines
+├── pages/explorer/run-compare.tsx           # 370 lines — compare pattern
+├── components/demo/WorkspacePanel.tsx       # 219 lines — reworked in E2
+├── components/demo/WorkspaceArtifactsPanel.tsx  # 157 lines — health-aware in E2
+├── hooks/use-workspaces.ts                  # 43 lines — extended in E2
+├── types/api.ts                             # workspace block @806-831
+└── components/ui/                           # 27 primitives incl. alert-dialog,
+                                             # dialog, dropdown-menu, textarea,
+                                             # table, select, tooltip, badge
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+app/features/demo/
+├── link_health.py                           # NEW — probe targets + probe_workspace_links()
+├── schemas.py                               # MOD — +WorkspaceRefHealth +WorkspaceHealthResponse
+├── workspace.py                             # MOD — list/count filters + sort
+├── routes.py                                # MOD — list query params; +GET /workspaces/{id}/health
+└── tests/
+    ├── test_link_health.py                  # NEW — probe classification vs a stub ASGI app
+    ├── test_routes.py                       # MOD — filter/sort/health unit + integration tests
+    └── test_workspace.py                    # MOD — list/count filter unit coverage (db-less where possible)
+frontend/src/
+├── types/api.ts                             # MOD — lifecycle fields (verify-or-add), health types, params, update type
+├── hooks/use-workspaces.ts                  # MOD — params-aware list; +usePatchWorkspace +useWorkspaceHealth +useWorkspaceLineage
+├── hooks/use-workspaces.test.ts             # MOD — new hooks covered
+├── components/demo/ReplayConfirmDialog.tsx       # NEW (+ .test.tsx)
+├── components/demo/WorkspaceEditDialog.tsx       # NEW (+ .test.tsx)
+├── components/demo/WorkspaceLineageStrip.tsx     # NEW (+ .test.tsx)
+├── components/demo/WorkspacePanel.tsx       # MOD — toolbar/badges/dropdown/multi-select (+ test MOD)
+├── components/demo/WorkspaceArtifactsPanel.tsx   # MOD — health markers + summary chip (+ test MOD)
+├── components/demo/index.ts                 # MOD — barrel exports
+├── pages/workspace-compare.tsx              # NEW (+ workspace-compare.test.tsx)
+├── pages/showcase.tsx                       # MOD — confirm flow, lineage, health, provenance field
+├── lib/constants.ts                         # MOD — ROUTES.SHOWCASE_COMPARE
+└── App.tsx                                  # MOD — compare route registration
+docs/_base/API_CONTRACTS.md                  # MOD — list params + health endpoint
+docs/_base/RUNBOOKS.md                       # MOD — supersede "no confirm dialog"; lifecycle notes
+```
+
+### Known Gotchas & Library Quirks
+
+```python
+# CRITICAL — EXECUTION GATE: do not start until E1 (#407) is merged to dev.
+#   Task 1 reconciles every CONTRACT(E1) point against the real merged code
+#   (git log --oneline --grep "#407"; read the E1 PRP + diff). Adapt E2 to
+#   E1's names; flag (don't silently fix) any E1 contract gap in the PR body.
+
+# CRITICAL — NO migration, NO models.py edit, NO pipeline.py edit in E2.
+#   The schema delta and the provenance/PATCH plumbing are E1's. If a column
+#   you need is missing post-E1, STOP and surface it — don't ship a stealth
+#   migration under E2.
+
+# CRITICAL — no cross-slice imports from app/features/demo/. Link health MUST
+#   go through in-process HTTP (request.app + httpx.ASGITransport — precedent
+#   pipeline.py:141-148 driven from a request context via routes.py:75).
+#   Importing RegistryService/ScenarioService/etc. fails the architecture rule.
+
+# CRITICAL — health probe classification: 2xx -> "alive", 404 -> "dead",
+#   EVERYTHING else (5xx, timeout, transport error) -> "unknown". Never let a
+#   probe exception escape the endpoint (asyncio.gather(..., return_exceptions=
+#   True) or per-probe try/except) — a flaky slice must not 500 the health
+#   route. raise_app_exceptions=False is REQUIRED on the ASGITransport (an
+#   unhandled error in a probed endpoint must surface as a 500 *response*).
+
+# CRITICAL — multi-select delete is N SINGLE DELETEs (existing endpoint).
+#   Adding POST /demo/workspaces/bulk-delete or DELETE /demo/workspaces is a
+#   product-vision violation (no bulk-wipe operations) — do not create it.
+
+# CRITICAL — the `total` returned by the filtered list MUST respect the active
+#   filters (scenarios precedent: BOTH count_stmt and rows_stmt get the same
+#   .where chain, scenarios/service.py:462-465). A filter-blind total breaks
+#   the "showing X of Y" header.
+
+# GOTCHA — include_archived default false flips list semantics for archived
+#   rows. Pre-E1 rows have archived=false (E1 migration default), so legacy
+#   lists are unchanged; route tests must still pin: no-param call returns
+#   only archived=false rows, include_archived=true returns both.
+
+# GOTCHA — sort allow-list: {created_at, name, seed, status}; unknown sort_by
+#   silently falls back to created_at desc (dimensions precedent — NOT a 422).
+#   Pinned-first is unconditional: ORDER BY pinned DESC, <sort>, id DESC
+#   tiebreak. name sort: NULLS LAST (unnamed rows sink) — use
+#   sqlalchemy .nulls_last() on the asc/desc expression.
+
+# GOTCHA — tags Query param: list[str] | None = Query(default=None) gives
+#   repeated-param parsing (?tags=a&tags=b). JSONB containment via
+#   ShowcaseWorkspace.tags.contains(tags) requires CONTRACT(E1)-2 (JSONB array
+#   column). Frontend sends ONE tag at a time (chip filter) — a single
+#   `tags` param serializes fine through api()'s params.
+
+# GOTCHA — q search: mirror dimensions ILIKE (case-insensitive, escape % and _
+#   if the precedent does; check dimensions/service.py before writing).
+#   Search NAME only (workspace_id prefixes are copy-paste handles, not search).
+
+# GOTCHA — strict-mode policy: the new health/response models are response
+#   models -> plain BaseModel, NO ConfigDict(strict=True). The AST walker
+#   (app/core/tests/test_strict_mode_policy.py) only inspects strict=True
+#   request models — keep it that way.
+
+# GOTCHA — agents GET /agents/sessions/{id} 404s via plain HTTPException (not
+#   NotFoundError) — irrelevant to the probe (status code only), but do NOT
+#   "fix" the agents slice as a drive-by.
+
+# GOTCHA — an EXPIRED-but-existing agent session returns 200 (row exists) ->
+#   "alive". That is correct link-health semantics (the row is the link
+#   target); the artifacts card blurb already says "the recorded session has
+#   likely expired".
+
+# GOTCHA — ReplayConfirmDialog destructive styling: AlertDialogAction renders
+#   buttonVariants default; pass className="bg-destructive text-destructive-
+#   foreground hover:bg-destructive/90" (semantic tokens — NEVER raw colors
+#   like bg-red-500). Copy the shared-dialog open-state pattern from
+#   WorkspacePanel.tsx:191-216 (pendingX state, one dialog for all rows).
+
+# GOTCHA — confirm-dialog flow ordering: the confirmed replay must run the
+#   EXISTING handleReplayWorkspace body (setScenario BEFORE start() — the
+#   picker-desync gotcha from E4 still applies). Refactor: handleReplayWorkspace
+#   becomes "setPendingReplay(ws)"; a new executeReplay(ws) holds the old body
+#   + the CONTRACT(E1)-5 replayed_from_workspace_id field.
+
+# GOTCHA — lineage walking: a deleted ancestor's GET returns 404 (ApiError
+#   .status === 404) — render "(original deleted)" and STOP the walk; never
+#   throw. Implement as one useQuery whose queryFn loops (await api(...) per
+#   ancestor, depth cap 5), queryKey ['workspaces', id, 'lineage'] — N
+#   serial fetches inside one query keeps cache + loading states simple.
+
+# GOTCHA — useWorkspaces signature change (limit -> params object) touches its
+#   existing call sites + use-workspaces.test.ts — update them in the same
+#   commit; keep queryKey shape ['workspaces', paramsObject] so the blanket
+#   invalidateQueries({queryKey: ['workspaces']}) keeps matching everything.
+
+# GOTCHA — pnpm tsc --noEmit is VACUOUS (solution-style tsconfig, zero files)
+#   and `tsc -b` fails on dev with PRE-EXISTING errors (known issue — memory
+#   [[frontend-tsc-noemit-gate-vacuous]]). Do NOT chase those. JS gates that
+#   must be green: pnpm lint && pnpm test --run. Optionally verify ONLY your
+#   new files compile via their vitest imports.
+
+# GOTCHA — every shadcn primitive needed (alert-dialog, dialog, dropdown-menu,
+#   checkbox, input, textarea, select, table, tooltip, badge, card, button) is
+#   ALREADY in frontend/src/components/ui/ (verified 2026-06-12). Do NOT run
+#   `shadcn add`. If you believe a new primitive is required, stop and recheck
+#   (.claude/rules/shadcn-ui.md; memory [[shadcn-cli-version-pin]]).
+
+# GOTCHA — never call crypto.randomUUID directly (issue #332; ESLint guard) —
+#   safeRandomUUID from @/lib/uuid-utils if any client id is needed.
+
+# GOTCHA — repo has mixed CRLF/LF; Write/Edit emit LF. New files fine; for
+#   showcase.tsx / WorkspacePanel.tsx / routes.py edits run `git diff --stat`
+#   and confirm surgical line counts before committing.
+
+# GOTCHA — mypy --strict AND pyright (strict mode) gate merge: full annotations on
+#   the new probe module (TypedDict/dataclass or Pydantic for probe targets),
+#   `-> None` on tests, annotated fixtures.
+
+# COORDINATION — E3 (#409), E4 (#410), E5 (#411), E6 (#412) are open parallel
+#   epics. Shared-file risk: schemas.py / routes.py / showcase.tsx /
+#   API_CONTRACTS.md. Keep every edit additive + self-contained; rebase on dev
+#   before the PR.
+
+# RUNTIME-VERIFICATION LOG (per prp-create step 3):
+#   - demo routes/handlers + line refs — read routes.py (2026-06-12)
+#   - list/count signatures + ordering — read workspace.py:174-234
+#   - ShowcaseWorkspace pre-E1 columns — read models.py:59-89
+#   - response-model non-strict precedent — read schemas.py:88-95,169-213
+#   - ASGITransport in-process pattern — read pipeline.py:127-204
+#   - scenario tags containment + GIN — read scenarios/service.py:462-465, models.py:74,97
+#   - dimensions search/sort params — grep dimensions/routes.py:65-105
+#   - probe targets exist: /registry/runs/{run_id} (registry/routes.py:200),
+#     /registry/aliases/{alias_name} (:503), /jobs/{job_id} (jobs/routes.py:219),
+#     /batch/{batch_id} (batch/routes.py:55), /agents/sessions/{session_id}
+#     (agents/routes.py:80), /scenarios/{scenario_id} (scenarios/routes.py:198)
+#   - RunUpdate partial-update shape — read registry/schemas.py:113-121
+#   - frontend: WorkspacePanel AlertDialog pattern (191-216), run-compare
+#     useSearchParams pattern (87-109), installed ui primitives (ls), api.ts
+#     ApiError usage (WorkspacePanel.tsx:97)
+#   - E1 #407 OPEN / unmerged as of 2026-06-12 — CONTRACT(E1) tags mark every
+#     dependency; no third-party API claims beyond in-repo working patterns
+#     (httpx ASGITransport, sqlalchemy .contains, TanStack useQuery/useMutation
+#     — all already exercised in this repo; .nulls_last is standard
+#     SQLAlchemy 2.0 API but has NO in-repo precedent — verify at impl time).
+```
+
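+The sort gotcha above can be sketched as a pure function, kept free of
+SQLAlchemy so it can be unit-tested without a database. The names
+`ALLOWED_SORT_COLUMNS`, `DEFAULT_SORT`, and `resolve_ordering` are
+hypothetical, and treating an unrecognised `sort_order` as `desc` is an
+assumption rather than something the precedent confirms.
+
+```python
+# Allow-list copied from the sort gotcha; all names here are hypothetical.
+ALLOWED_SORT_COLUMNS = frozenset({"created_at", "name", "seed", "status"})
+DEFAULT_SORT = ("created_at", "desc")
+
+
+def resolve_ordering(sort_by: str | None, sort_order: str | None) -> list[tuple[str, str]]:
+    """Return (column, direction) pairs in ORDER BY precedence."""
+    if sort_by in ALLOWED_SORT_COLUMNS:
+        column = sort_by
+        # Assumption: anything other than "asc" is treated as "desc".
+        direction = "asc" if sort_order == "asc" else "desc"
+    else:
+        # Unknown or missing sort_by: silent fallback, never a 422.
+        column, direction = DEFAULT_SORT
+    # Pinned rows always lead; id DESC is the final tiebreak.
+    return [("pinned", "desc"), (column, direction), ("id", "desc")]
+```
+
+The service layer would then translate each pair into a column expression
+(applying `.nulls_last()` to the `name` sort), so the allow-list check stays
+in one place and raw user input never reaches `ORDER BY`.
+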
+## Implementation Blueprint
+
+### Data models and structure
+
+```python
+# app/features/demo/schemas.py — APPEND (response models; NOT strict)
+
+RefHealthStatus = Literal["alive", "dead", "unknown"]
+RefType = Literal["model_run", "scenario_plan", "alias", "batch", "agent_session", "job"]
+
+
+class WorkspaceRefHealth(BaseModel):
+    """Liveness of one soft reference recorded on a workspace (E2, #408)."""
+
+    key: str = Field(..., description="created_objects key, e.g. 'winning_run_id' or 'scenario_plan_ids[0]'.")
+    ref_type: RefType = Field(..., description="Kind of referenced object.")
+    ref_id: str = Field(..., description="The recorded soft-reference id.")
+    status: RefHealthStatus = Field(..., description="alive (2xx) / dead (404) / unknown (other).")
+    probe_path: str = Field(..., description="The public API path probed.")
+
+
+class WorkspaceHealthResponse(BaseModel):
+    """Per-workspace link-health summary (E2, #408)."""
+
+    workspace_id: str
+    workspace_status: str = Field(..., description="running / completed / failed.")
+    partial_run: bool = Field(..., description="True when workspace_status != 'completed'.")
+    references: list[WorkspaceRefHealth] = Field(default_factory=list)
+    alive: int = Field(..., ge=0)
+    dead: int = Field(..., ge=0)
+    unknown: int = Field(..., ge=0)
+    checked_at: datetime = Field(default_factory=_utc_now)
+```
+
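+As a rough illustration of how the count fields on `WorkspaceHealthResponse`
+relate to the per-reference statuses, the aggregation can be sketched with the
+standard library alone. `summarize_health` is a hypothetical helper, not a
+name from the codebase, and it returns a plain dict rather than the Pydantic
+model.
+
+```python
+from collections import Counter
+
+
+def summarize_health(workspace_status: str, statuses: list[str]) -> dict[str, object]:
+    """Fold per-reference statuses into the summary fields (hypothetical helper)."""
+    counts = Counter(statuses)  # missing keys read as 0
+    return {
+        "workspace_status": workspace_status,
+        # A row that never reached "completed" is flagged as a partial run.
+        "partial_run": workspace_status != "completed",
+        "alive": counts["alive"],
+        "dead": counts["dead"],
+        "unknown": counts["unknown"],
+    }
+```
+
+A workspace with no probeable references yields zero for all three counts, and
+`partial_run` still reflects the row status independently of the probes.
+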
+```python
+# app/features/demo/link_health.py — NEW (sketch; CRITICAL details only)
+
+import asyncio
+from dataclasses import dataclass
+
+import httpx
+from fastapi import FastAPI
+
+from app.features.demo.models import ShowcaseWorkspace
+from app.features.demo.schemas import WorkspaceHealthResponse
+
+@dataclass(frozen=True)
+class _ProbeTarget:
+    key: str          # e.g. "scenario_plan_ids[1]"
+    ref_type: str     # RefType value
+    ref_id: str
+    probe_path: str   # e.g. f"/registry/runs/{ref_id}"
+
+def build_probe_targets(ws: ShowcaseWorkspace) -> list[_ProbeTarget]:
+    # created_objects keys (workspace.py:_collect_created_objects:82-103):
+    #   winning_run_id / v2_run_id / stale_alias_run_id -> /registry/runs/{id}
+    #   scenario_plan_ids[i]                            -> /scenarios/{id}
+    #   alias                                           -> /registry/aliases/{name}
+    #   batch_id                                        -> /batch/{id}
+    #   agent_session_id                                -> /agents/sessions/{id}
+    # CONTRACT(E1)-6: job_ids JSONB slot [i]            -> /jobs/{id}
+    # NON-probeable keys (v2_model_path, scenario_artifact_key,
+    # train_model_types) are SKIPPED — no HTTP identity to check.
+    ...
+
+async def probe_workspace_links(app: FastAPI, ws: ShowcaseWorkspace) -> WorkspaceHealthResponse:
+    targets = build_probe_targets(ws)
+    async with httpx.AsyncClient(
+        transport=httpx.ASGITransport(app=app, raise_app_exceptions=False),
+        base_url="http://demo.internal",
+        timeout=httpx.Timeout(10.0, connect=5.0),
+    ) as client:
+        results = await asyncio.gather(
+            *(_probe_one(client, t) for t in targets), return_exceptions=False
+        )  # _probe_one NEVER raises: try/except httpx.HTTPError/OSError -> "unknown"
+    # classify: 200<=s<300 alive; s==404 dead; else unknown
+    # partial_run = ws.status != WORKSPACE_STATUS_COMPLETED
+    ...
+```
+
+```typescript
+// frontend/src/types/api.ts — extend the workspace block (806-831)
+
+// CONTRACT(E1)-4 — verify E1 added these; add additively if not:
+export interface WorkspaceListItem {
+  /* existing fields ... */
+  archived: boolean
+  pinned: boolean
+  tags: string[]
+  replayed_from_workspace_id: string | null
+}
+export interface WorkspaceDetail extends WorkspaceListItem {
+  /* existing fields ... */
+  notes: string | null
+  config_schema_version: number
+}
+
+// E2 (#408) — lifecycle PATCH body (CONTRACT(E1)-3 shape; adapt to E1 names):
+export interface WorkspaceUpdate {
+  name?: string | null
+  notes?: string | null
+  tags?: string[]
+  archived?: boolean
+  pinned?: boolean
+}
+
+export interface WorkspaceListParams {
+  limit?: number
+  offset?: number
+  q?: string
+  tags?: string
+  include_archived?: boolean
+  sort_by?: 'created_at' | 'name' | 'seed' | 'status'
+  sort_order?: 'asc' | 'desc'
+}
+
+export type RefHealthStatus = 'alive' | 'dead' | 'unknown'
+export interface WorkspaceRefHealth {
+  key: string
+  ref_type: 'model_run' | 'scenario_plan' | 'alias' | 'batch' | 'agent_session' | 'job'
+  ref_id: string
+  status: RefHealthStatus
+  probe_path: string
+}
+export interface WorkspaceHealth {
+  workspace_id: string
+  workspace_status: 'running' | 'completed' | 'failed'
+  partial_run: boolean
+  references: WorkspaceRefHealth[]
+  alive: number
+  dead: number
+  unknown: number
+  checked_at: string
+}
+```
+
+### List of tasks (dependency order)
+
+```yaml
+Task 1 — gate, branch & E1 reconciliation:
+  VERIFY: gh issue view 407 --json state  -> MUST be closed (E1 merged) before continuing
+  RUN: git switch dev && git pull && git switch -c feat/showcase-completion-e2-safe-replay-lifecycle
+  VERIFY: gh issue view 408 --json state   # open
+  RECONCILE every CONTRACT(E1) tag against the merged code:
+    - read app/features/demo/models.py    -> column names (CONTRACT(E1)-1/-2)
+    - read app/features/demo/schemas.py   -> PATCH body + GET response fields (CONTRACT(E1)-3/-4)
+    - read app/features/demo/routes.py    -> PATCH route exists
+    - grep replayed_from app/features/demo/ -> provenance mechanism (CONTRACT(E1)-5)
+    - read PRPs/PRP-showcase-completion-E1-*.md (whatever E1's PRP file is named)
+  ADAPT all names below to E1's reality; note any E1 gap in the PR body.
+
+Task 2 — MODIFY app/features/demo/workspace.py (filters + sort):
+  - EXTEND list_workspaces(db, *, limit=50, offset=0, q=None, tags=None,
+      include_archived=False, sort_by=None, sort_order="desc"):
+      # base stmt; if not include_archived: .where(~ShowcaseWorkspace.archived)
+      # if q: .where(ShowcaseWorkspace.name.ilike(f"%{q}%"))   (name only)
+      # if tags: .where(ShowcaseWorkspace.tags.contains(tags)) (CONTRACT(E1)-2)
+      # sort: allow-list {created_at,name,seed,status}; unknown -> created_at
+      #   desc; name uses .nulls_last(); ALWAYS ORDER BY pinned.desc() first,
+      #   then the sort expr, then id.desc() tiebreak
+  - EXTEND count_workspaces(db, *, q=None, tags=None, include_archived=False)
+      # SAME where-chain as list (scenarios/service.py:462-465 precedent) —
+      # extract a shared _apply_filters(stmt, ...) helper to keep them in sync
+  - Update module docstring (E2 routes the filters).
+
+Task 3 — MODIFY app/features/demo/schemas.py:
+  - APPEND WorkspaceRefHealth + WorkspaceHealthResponse (blueprint above);
+    docstring notes: response models, NOT strict (StepEvent precedent @88-95).
+  - CONTRACT(E1)-4 defensive check: if E1 did not expose archived/pinned/tags/
+    replayed_from_workspace_id on WorkspaceListItem (+notes/
+    config_schema_version on WorkspaceDetailResponse), ADD them here
+    additively (from_attributes picks them up from the ORM row).
+
+Task 4 — CREATE app/features/demo/link_health.py:
+  - build_probe_targets(ws) + probe_workspace_links(app, ws) per the blueprint.
+  - MIRROR pipeline._Client transport flags exactly (raise_app_exceptions=False).
+  - _probe_one catches (httpx.HTTPError, OSError) -> "unknown"; NEVER raises.
+  - Full --strict annotations; module docstring states the no-cross-slice-
+    import rationale (Decision 3) and the 2xx/404/other classification table.
+
+Task 5 — MODIFY app/features/demo/routes.py:
+  - EXTEND GET /workspaces signature with q / tags / include_archived /
+    sort_by / sort_order Query params (mirror dimensions/routes.py:65-75 +
+    scenarios/routes.py:180 styles; document the allow-list + silent fallback
+    in the docstring); pass through to workspace.list_workspaces /
+    count_workspaces (same filter args to BOTH).
+  - ADD GET /workspaces/{workspace_id}/health -> WorkspaceHealthResponse:
+      # async def get_workspace_health(workspace_id: str, request: Request,
+      #                                db: AsyncSession = Depends(get_db)):
+      #   row = await workspace.get_workspace(db, workspace_id)
+      #   if row is None: raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+      #   return await link_health.probe_workspace_links(request.app, row)
+      # Place between the GET detail (@110) and DELETE (@138). No path
+      # collision: a {workspace_id} path param never matches '/', so /health is unambiguous.
+  - Update the module docstring route inventory.
+
+Task 6 — backend tests:
+  - CREATE app/features/demo/tests/test_link_health.py (unit, no DB):
+      # build a THROWAWAY FastAPI stub app with routes returning 200 / 404 /
+      # 500 at the probed paths; construct a ShowcaseWorkspace instance
+      # in-memory (not persisted) with created_objects covering every key +
+      # job_ids slot; assert classification alive/dead/unknown + counts +
+      # partial_run on status='failed'; assert non-probeable keys skipped;
+      # assert empty created_objects -> empty references, partial_run logic.
+  - MODIFY app/features/demo/tests/test_routes.py:
+      UNIT (monkeypatch app.features.demo.routes.workspace / .link_health):
+        - list passes q/tags/include_archived/sort args through (capture kwargs)
+        - list rejects bad limit (existing) — keep green
+        - health 404 on missing workspace (problem+json content-type)
+        - health 200 happy path (monkeypatched probe returns canned response)
+      INTEGRATION (@pytest.mark.integration, db_session):
+        - seed rows: named/unnamed, archived, pinned, tagged ->
+          default list hides archived; include_archived=true shows it;
+          q matches name substring case-insensitively; tags containment;
+          sort_by=name asc with NULLS LAST; pinned row first regardless of sort;
+          total respects filters
+        - health integration: insert a workspace whose created_objects carry
+          one REAL reference (insert a scenario_plan row via its ORM, or use a
+          bogus-vs-real registry pair) + one bogus id -> assert alive + dead
+  - MODIFY app/features/demo/tests/test_workspace.py: filter unit coverage of
+    _apply_filters where practical (or fold into the integration tests above).
+
+Task 7 — MODIFY frontend/src/types/api.ts:
+  - Lifecycle fields per CONTRACT(E1)-4 (verify-or-add), WorkspaceUpdate,
+    WorkspaceListParams, WorkspaceRefHealth/WorkspaceHealth (blueprint above).
+  - DemoRunRequest: verify E1 added replayed_from_workspace_id?: string
+    (CONTRACT(E1)-5); add if missing.
+
+Task 8 — MODIFY frontend/src/hooks/use-workspaces.ts (+ test):
+  - useWorkspaces(params: WorkspaceListParams = {}, enabled = true):
+      queryKey ['workspaces', params]; api('/demo/workspaces', { params })
+      # update existing call site: WorkspacePanel.tsx:77 (the sole useWorkspaces
+      # caller — showcase.tsx never calls it directly)
+  - ADD usePatchWorkspace():
+      mutationFn: ({workspaceId, update}: {workspaceId: string; update: WorkspaceUpdate}) =>
+        api<WorkspaceDetail>(`/demo/workspaces/${workspaceId}`, { method: 'PATCH', body: update })
+      onSuccess: invalidate ['workspaces']   # blanket key matches list+detail
+  - ADD useWorkspaceHealth(workspaceId: string, enabled = true):
+      queryKey ['workspaces', workspaceId, 'health']; staleTime 30_000
+  - ADD useWorkspaceLineage(workspaceId: string | null):
+      one useQuery; queryFn walks replayed_from_workspace_id via sequential
+      api<WorkspaceDetail>() calls, depth cap 5; a 404 (ApiError.status===404)
+      terminates the walk with a {deleted: true} sentinel entry; returns
+      Array<{workspace_id, name, deleted}> oldest-last.
+  - MODIFY use-workspaces.test.ts: params serialization, PATCH invalidation,
+    lineage walk incl. 404 termination (mock api module).
+
+Task 9 — CREATE frontend/src/components/demo/ReplayConfirmDialog.tsx (+ test):
+  - Props: { workspace: WorkspaceListItem | null,        # null = closed
+             requestPreview: DemoRunRequest | null,      # built by the page
+             onConfirm: () => void, onCancel: () => void }
+  - AlertDialog (open={workspace !== null}; onOpenChange close -> onCancel) —
+    copy the shared-dialog pattern from WorkspacePanel.tsx:191-216.
+  - Body: 3-column table (Field / Recorded / Will send) over seed, scenario,
+    reset, skip_seed, name, preservation, replayed_from; per-row mismatch
+    highlight (font-semibold text-destructive on the "Will send" cell when
+    values differ — defensive; verbatim replay normally matches).
+  - reset=true -> warning block (AlertTriangle + "Replaying this workspace
+    WIPES the database and reseeds it.") + AlertDialogAction
+    className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
+    label "Replay & wipe database"; else label "Replay".
+  - Footer hint: "Want to change the config first? Use Load instead." (muted).
+  - data-testid="replay-confirm" on the action (WorkspacePanel test precedent).
+  - Test: renders preview values; destructive copy/label only when reset;
+    confirm fires onConfirm once; cancel fires onCancel; mismatch highlight.
+
+Task 10 — CREATE frontend/src/components/demo/WorkspaceEditDialog.tsx (+ test):
+  - Props: { workspace: WorkspaceListItem | null, onClose: () => void }
+  - Dialog (ui/dialog.tsx — form dialog, not AlertDialog) with: name Input
+    (reuse WORKSPACE_NAME_PATTERN from showcase.tsx:26 — export it from a
+    shared location, e.g. components/demo/workspace-name.ts, instead of
+    duplicating), notes Textarea, tags Input (comma-separated -> trimmed
+    string[]; render current tags as chips above the input).
+  - Save -> usePatchWorkspace().mutate({workspaceId, update}); toast on
+    success/failure (sonner pattern WorkspacePanel.tsx:88-99); close on success.
+  - Send ONLY changed fields (partial update — CONTRACT(E1)-3 semantics).
+  - Test: pattern violation disables Save with inline hint; save sends only
+    dirty fields; success closes + toasts (mock usePatchWorkspace).
+
+Task 11 — CREATE frontend/src/components/demo/WorkspaceLineageStrip.tsx (+ test):
+  - Props: { workspaceId: string, onLoadAncestor: (id: string) => void }
+  - useWorkspaceLineage(workspaceId); render breadcrumb: current ← parent ←
+    … oldest; ancestors as Button variant="link" size="sm" (click ->
+    onLoadAncestor); deleted sentinel renders muted "(original deleted)";
+    depth-cap overflow renders trailing "…". Render nothing (null) when the
+    workspace has no replayed_from_workspace_id.
+  - Test: chain render order, deleted sentinel, null when no lineage.
+
+Task 12 — MODIFY frontend/src/components/demo/WorkspacePanel.tsx (+ test):
+  - Toolbar row above the list: search Input (icon lucide Search; applies as
+    `q` on Enter/debounce), "Show archived" Checkbox, sort Select
+    (Newest/Oldest/Name/Status -> sort_by+sort_order pairs), active-tag chip
+    (clearable) when a tag filter is set.
+  - Panel owns the list-params state and calls useWorkspaces(params).
+  - Row additions: leading multi-select Checkbox; Pin icon button (lucide Pin
+    / PinOff, fires usePatchWorkspace toggle); archived rows: opacity-60 +
+    outline Badge "archived"; replay Badge (outline, "replay") when
+    replayed_from_workspace_id != null; tags as clickable chips (sets the tag
+    filter); DropdownMenu (MoreHorizontal): Pin/Unpin, Archive/Unarchive,
+    Edit details…, Delete… (Delete keeps the existing pendingDelete dialog).
+  - Replay button now calls a NEW prop onRequestReplay(ws) (the page owns the
+    confirm dialog) — RENAME the old onReplay prop to make the break explicit.
+  - Selection footer: "N selected" + Delete selected (AlertDialog confirm ->
+    sequential `for (const id of selected) await deleteWorkspace.mutateAsync(id)`
+    with per-failure collection -> one summary toast; clear selection) +
+    Compare button (disabled unless exactly 2; useNavigate ->
+    `${ROUTES.SHOWCASE_COMPARE}?a=${id1}&b=${id2}`).
+  - Keep the component lean: extract WorkspaceToolbar + WorkspaceRow as
+    file-local components if the file passes ~300 lines.
+  - Tests: search/sort/archived params flow into useWorkspaces (mock + assert
+    last call args); multi-select count + delete-selected confirm calls N
+    mutateAsync; compare disabled at 1 and 3 selections; pin/archive fire
+    PATCH mutations; replay fires onRequestReplay (NOT start).
+
+Task 13 — MODIFY frontend/src/components/demo/WorkspaceArtifactsPanel.tsx (+ test):
+  - Props += { health?: WorkspaceHealth | null }
+  - buildCards gains the refId per card; a card whose refId matches a
+    health.references entry with status==='dead' renders AlertTriangle
+    (h-3 w-3 text-destructive) beside the label + title tooltip "This object
+    no longer exists — it was deleted after the run." ('unknown' -> no marker).
+  - Header chip row: `✓ {alive} live` (text-success) + `✕ {dead} dead`
+    (text-destructive, only when dead>0) + outline Badge "partial run" when
+    health.partial_run (tooltip: "This run never completed — artifacts may be
+    missing."). Skeleton/silent when health undefined (query in flight/disabled).
+  - Test: dead marker on matching card; summary chip counts; partial-run badge.
+
+Task 14 — MODIFY frontend/src/pages/showcase.tsx:
+  - State += pendingReplay: WorkspaceListItem | null.
+  - handleReplayWorkspace(ws) -> setPendingReplay(ws)  (no start()).
+  - NEW executeReplay(ws): the post-E1 body (showcase.tsx:174-186 today —
+    setScenario first; E1 shifts these anchors and adds
+    replayed_from_workspace_id: ws.workspace_id, which executeReplay PRESERVES
+    — CONTRACT(E1)-5, preserve-not-add); clear pendingReplay.
+  - buildReplayRequest(ws): pure helper producing the DemoRunRequest preview
+    passed to the dialog AND used by executeReplay (single source — the diff
+    can never lie about what's sent). Export for unit testing.
+  - Mount <ReplayConfirmDialog workspace={pendingReplay}
+      requestPreview={pendingReplay && buildReplayRequest(pendingReplay)}
+      onConfirm={() => pendingReplay && executeReplay(pendingReplay)}
+      onCancel={() => setPendingReplay(null)} />
+  - Health: const health = useWorkspaceHealth(selectedWorkspaceId ?? '',
+      !!selectedWorkspaceId); pass health.data into WorkspaceArtifactsPanel.
+  - Lineage: mount <WorkspaceLineageStrip workspaceId={selectedWorkspaceId}
+      onLoadAncestor={(id) => { /* fetch list item via detail + handleLoad */ }} />
+      inside the loaded-workspace block (@448-450 region); simplest
+    onLoadAncestor: setSelectedWorkspaceId(id) + repopulate controls from the
+    lineage entry's detail (the strip's hook already has the details — pass
+    the full WorkspaceDetail up instead of just the id if cleaner).
+  - WorkspacePanel prop rename: onRequestReplay={handleReplayWorkspace}.
+
+Task 15 — CREATE frontend/src/pages/workspace-compare.tsx (+ test) + routing:
+  - MODIFY frontend/src/lib/constants.ts: SHOWCASE_COMPARE: '/showcase/compare'
+    (beside SHOWCASE @4).
+  - MODIFY frontend/src/App.tsx: lazy WorkspaceComparePage + <Route> (mirror
+    RunComparePage @21, @119-126). '/showcase/compare' and '/showcase' are
+    distinct paths — no nesting needed.
+  - Page mirrors run-compare.tsx: useSearchParams a/b (@87-109 pattern);
+    pickers = Select over useWorkspaces({limit: 100, include_archived: true})
+    items (label: name ?? id.slice(0,8) · scenario · status); two
+    useWorkspace(a/b) detail queries; render:
+      * config table — seed/scenario/reset/skip_seed/name/tags; mismatch rows
+        highlighted (font-semibold)
+      * results table — winner_model_type, winner_wape (DeltaCell-style
+        sign-only delta — copy the component from run-compare.tsx:33-54
+        file-locally), wall_clock_s
+      * created-objects matrix — union of soft-reference keys × (A: ✓/—,
+        B: ✓/—)
+      * lineage note — "B is a replay of A" (or inverse) when
+        replayed_from_workspace_id links them
+      * partial-run outline Badge per side when status !== 'completed'
+    Missing/invalid id -> that side renders the picker + muted "select a
+    workspace" (no crash; ApiError 404 -> same fallback).
+  - Test: renders diff for two mocked details; mismatch highlight; lineage
+    note; 404 side falls back to picker state.
+
+Task 16 — barrel + docs:
+  - MODIFY frontend/src/components/demo/index.ts — export the three new
+    components.
+  - MODIFY docs/_base/API_CONTRACTS.md:
+      * GET /demo/workspaces row: append "E2 (#408) — `q` name search, `tags`
+        containment filter, `include_archived` (default false), allow-listed
+        `sort_by`/`sort_order`; pinned rows first; `total` respects filters"
+      * NEW row: | demo | GET | `/demo/workspaces/{workspace_id}/health` |
+        E2 (#408) — probe the workspace's soft references in-process; per-ref
+        alive/dead/unknown + counts + `partial_run`; `404` when missing |
+  - MODIFY docs/_base/RUNBOOKS.md § "Showcase workspace — …":
+      * item 1: replace "there is deliberately no confirm dialog" with the E2
+        reality (every panel Replay confirms; reset=true gets destructive
+        copy; the DESTRUCTIVE row marker stays)
+      * item 3/4: one-line additions — multi-select delete = N metadata-only
+        singles; dead links now SURFACE via the health summary instead of
+        silently dangling
+  - VERIFY (not edit) DOMAIN_MODEL.md replayed_from note was updated by E1.
+
+Task 17 — gates, dogfood, commits, PR:
+  - Backend gates + integration suite (Validation Loop below).
+  - Frontend: cd frontend && pnpm lint && pnpm test --run.
+  - Browser dogfood via the webapp-testing skill (CLAUDE.md workflow step 4):
+    seeded stack -> save 3 workspaces (one reset=true, one tagged, one
+    replayed) -> search/sort/archive/pin -> replay with confirm (destructive
+    variant) -> lineage chain -> compare page -> delete a referenced scenario
+    plan -> reload workspace -> dead-link warning + health chip.
+  - git diff --stat (CRLF surgical-diff check on edited files).
+  - COMMITS (reference #408, no AI trailer), e.g.:
+      feat(api): add workspace list filters and link-health endpoint (#408)
+      feat(ui): add replay confirmation with config diff to showcase (#408)
+      feat(ui): add workspace lifecycle controls and lineage rendering (#408)
+      feat(ui): add two-workspace compare page (#408)
+      test(api): cover workspace filters and link-health probes (#408)
+      docs(api): document workspace lifecycle and health contracts (#408)
+  - PR into dev; title `feat(api,ui): showcase-completion E2 — safe replay &
+    workspace lifecycle (#408)`; body notes the replay-policy-picker deferral
+    (Decision 1) + any CONTRACT(E1) reconciliation deltas.
+```
+
+### Integration Points
+
+```yaml
+DATABASE: none in E2 — reads the E1-migrated table; NO new migration.
+
+CONFIG: none — no new settings or env vars (probe timeout is a module constant).
+
+ROUTES: existing demo router only (app/main.py wiring unchanged): extended GET
+  /demo/workspaces + new GET /demo/workspaces/{id}/health. PATCH is E1's.
+
+FRONTEND ROUTES: one new React Router page at ROUTES.SHOWCASE_COMPARE
+  ('/showcase/compare'); registered in App.tsx beside the existing pages.
+
+DOCS: API_CONTRACTS.md + RUNBOOKS.md (Task 16). Full doc sweep belongs to the
+  E7 release gate — keep E2's edits additive and minimal.
+```
+
+## Validation Loop
+
+### Level 1: Syntax & Style
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+cd frontend && pnpm lint
+# Expected: clean. Both Python type checkers are --strict and gate merge.
+# (pnpm tsc --noEmit is vacuous; tsc -b fails with PRE-EXISTING errors — do
+# not chase them. lint + vitest are the JS gates.)
+```
+
+### Level 2: Unit Tests (no DB)
+
+```bash
+uv run pytest app/features/demo -v -m "not integration"
+uv run pytest app/core/tests/test_strict_mode_policy.py -v   # AST walker still green
+cd frontend && pnpm test --run
+# New/changed: test_link_health (stub-app probe classification), test_routes
+# filter/health unit tests, use-workspaces hooks, ReplayConfirmDialog,
+# WorkspaceEditDialog, WorkspaceLineageStrip, WorkspacePanel rework,
+# WorkspaceArtifactsPanel health markers, workspace-compare page.
+```
+
+### Level 3: Integration (real Postgres)
+
+```bash
+docker compose up -d && uv run alembic upgrade head
+uv run pytest app/features/demo -v -m integration
+# List filters against seeded rows (archived hidden / shown, q, tags,
+# sort + pinned-first, filtered total) + health probe (real + bogus refs).
+```
+
+### Level 4: Manual smoke + browser dogfood (seeded local stack, uvicorn :8123)
+
+```bash
+# 1. Filtered list + health round-trip
+curl -s "http://localhost:8123/demo/workspaces?q=demo&sort_by=name&sort_order=asc" | python3 -m json.tool | head -30
+curl -s "http://localhost:8123/demo/workspaces?include_archived=true" | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['total'])"
+WS_ID=$(curl -s -X POST http://localhost:8123/demo/run -H 'Content-Type: application/json' \
+  -d '{"skip_seed": true, "preservation": "keep", "workspace_name": "e2-smoke"}' \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['workspace_id'])")
+curl -s "http://localhost:8123/demo/workspaces/${WS_ID}/health" | python3 -m json.tool
+curl -s -o /dev/null -w "%{http_code} %{content_type}\n" \
+  http://localhost:8123/demo/workspaces/deadbeefdeadbeefdeadbeefdeadbeef/health   # 404 problem+json
+
+# 2. Dead-link proof: delete a referenced scenario plan, re-probe
+#    (set PLAN_ID to a scenario_plan_id from the workspace detail's created_objects)
+curl -s -X DELETE "http://localhost:8123/scenarios/${PLAN_ID}" -o /dev/null -w "%{http_code}\n"
+curl -s "http://localhost:8123/demo/workspaces/${WS_ID}/health" \
+  | python3 -c "import sys,json; print([r for r in json.load(sys.stdin)['references'] if r['status']=='dead'])"
+
+# 3. Browser dogfood (webapp-testing skill / agent-browser):
+#    /showcase -> save workspaces -> toolbar search/sort/show-archived ->
+#    pin (row jumps first) -> archive (vanishes until toggle) -> Edit details
+#    (rename + tags chips) -> Replay -> confirm dialog shows the diff table ->
+#    a reset=true workspace shows destructive copy + red button -> confirmed
+#    replay goes green, new row carries the "replay" badge -> Load it ->
+#    lineage strip shows the chain -> select 2 rows -> Compare page diff ->
+#    multi-select 2 -> Delete selected -> rows gone, created objects intact ->
+#    loaded workspace with the deleted plan shows the dead-link warning + chip.
+```
+
+## Final Validation Checklist
+
+- [ ] All five gates green: `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ && uv run pyright app/ && uv run pytest -v -m "not integration"`
+- [ ] Integration suite green: `uv run pytest -v -m integration` (fresh docker-compose DB)
+- [ ] Frontend gates green: `cd frontend && pnpm lint && pnpm test --run`
+- [ ] No replay path bypasses the confirm dialog; reset=true shows destructive variant (vitest + dogfood)
+- [ ] List filters: archived hidden by default, q/tags/sort behave, pinned-first, filtered total (route tests + curl)
+- [ ] Health endpoint classifies alive/dead/unknown; dead-link warning + partial-run chip render (integration + dogfood step 2/3)
+- [ ] Lineage chain renders incl. deleted-ancestor sentinel
+- [ ] Compare page deep-links `?a=&b=` and degrades gracefully on bad ids
+- [ ] Multi-select delete = N single DELETEs; **no new bulk endpoint in the diff**
+- [ ] Legacy list calls + all pre-existing demo tests unchanged-green
+- [ ] CONTRACT(E1) reconciliation notes in the PR body; replay-policy deferral noted
+- [ ] `git diff --stat` surgical (no CRLF whole-file noise)
+- [ ] docs/_base/API_CONTRACTS.md + RUNBOOKS.md updated additively
+- [ ] Commits `type(scope): description (#408)`, no AI trailer; PR into dev; browser dogfood evidence per `.claude/rules/ui-design.md`
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't start before E1 (#407) merges; don't re-implement E1 surface (migration, PATCH, provenance write).
+- ❌ Don't import another feature slice from `app/features/demo/` — link health is in-process HTTP only.
+- ❌ Don't add a bulk-delete endpoint or any "wipe everything" operation — N singles, period.
+- ❌ Don't add a replay-policy picker (exact/safe-keep/modified) — explicitly deferred (Decision 1).
+- ❌ Don't make health/response models strict — strict mode is request-body policy.
+- ❌ Don't probe health for every list row — loaded workspace only.
+- ❌ Don't let a probe exception 500 the health route — classify as `unknown`.
+- ❌ Don't mutate the original workspace row on replay — replay still creates a NEW row (provenance points back).
+- ❌ Don't duplicate the name pattern regex — share it between run controls and the edit dialog.
+- ❌ Don't run `shadcn add` — every needed primitive is installed; don't use raw colors — semantic tokens only.
+- ❌ Don't call `crypto.randomUUID` directly — `safeRandomUUID` (ESLint-enforced).
+- ❌ Don't chase pre-existing `tsc -b` errors — lint + vitest are the JS gates.
+
+## Confidence Score
+
+**7.5/10** for one-pass implementation success. The backend half (list filters
++ health endpoint) is a composition of three verified in-repo precedents
+(dimensions search/sort, scenarios tags containment, pipeline ASGITransport)
+with clear test shapes. The deductions: (a) E2 is authored against a frozen
+but UNMERGED E1 contract — seven CONTRACT(E1) points must reconcile against
+E1's real merged shape, and any naming/shape divergence costs an adaptation
+pass (mitigated by Task 1's reconciliation gate and verify-or-add fallbacks);
+(b) the WorkspacePanel rework is the single largest UI delta of the showcase
+initiative so far (toolbar + badges + dropdown + multi-select + confirm
+rerouting in one component) where an interaction miss costs an iteration; and
+(c) four parallel epics share `schemas.py` / `routes.py` / `showcase.tsx`,
+so rebase friction is plausible even with additive-only edits.
diff --git a/PRPs/PRP-showcase-completion-E3-seed-config-scope.md b/PRPs/PRP-showcase-completion-E3-seed-config-scope.md
new file mode 100644
index 00000000..e5a0df6e
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E3-seed-config-scope.md
@@ -0,0 +1,1080 @@
+name: "PRP — Showcase Completion E3: Advanced Seed Config MVP + Store/Product Scope Selection (issue #409)"
+description: |
+
+## Purpose
+
+Implement Parallel epic E3 of the showcase-completion initiative (umbrella #406):
+an additive, allow-listed nested override schema on the seeder HTTP contract
+(7 curated knobs), an additive `seed_overrides` field on `DemoRunRequest` / the
+WS start frame, a store/product focus-pair selector with pre-run preview on the
+Showcase page, frontend + backend validation of every knob, and persistence of
+overrides + user-selected scope into the workspace row (E1 #407 story slots) so
+replay honors them verbatim.
+
+**Execution gate:** this epic is Parallel after Foundation — implementation
+starts ONLY after E1 #407 merges to `dev` (its migration ships the
+`seed_overrides` / `user_scope` JSONB story slots E3 writes into). Every
+dependency on E1's surface is tagged `CONTRACT(E1):` below; re-verify each tag
+against the merged E1 code before starting Task 1.
+
+## Core Principles
+
+1. **Context is King**: every file reference below was verified against the live code on 2026-06-12 (branch dev @ bdf85f6, post-E4/#404 merge — PRE-E1-#407; line numbers will drift slightly after E1 merges, re-anchor by symbol name).
+2. **Validation Loops**: each level is executable as written.
+3. **Information Dense**: patterns cite exact file:line (or symbol when post-E1 drift is likely).
+4. **Progressive Success**: shared override schema → seeder contract → demo start frame → pipeline consumption → workspace persistence → frontend → docs → browser dogfood.
+5. **Global rules**: follow CLAUDE.md / AGENTS.md; all five backend CI gates must pass; UI work follows `.claude/rules/ui-design.md` + `.claude/rules/shadcn-ui.md`.
+
+---
+
+## Goal
+
+A user on `/showcase` ticking **Re-seed first** can open an **Advanced seed
+config** panel and turn 7 curated knobs (store count, product count, window
+days, sparsity, promotion intensity, stockout intensity, noise sigma) before
+running; independently, the user can pick an explicit **store/product focus
+pair** (with a pre-run preview of the selected entities and the seeded window)
+that the pipeline models instead of the auto-discovered first pair. Both the
+overrides and the scope persist into a kept workspace row and are re-submitted
+verbatim on Replay. A start frame without the new fields behaves
+byte-identically to today.
+
+**Deliverable** (all additive; ZERO migrations — E1 #407 owns the schema):
+
+- `app/shared/seeder/overrides.py` — NEW: `SeederOverrides` Pydantic model (the single shared allow-list, `extra="forbid"`), importable by both the seeder and demo slices through `app/shared/` (vertical-slice-legal).
+- `app/features/seeder/schemas.py` — `GenerateParams.overrides: SeederOverrides | None = None` (additive nested optional object on the EXISTING endpoint — decision rationale below).
+- `app/features/seeder/service.py` — `_apply_seed_overrides(config, overrides)` applied LAST in `_build_config_from_params` (wins over the legacy scalar `stores`/`products`/`sparsity`), mapping each knob onto its `SeederConfig` sub-dataclass via `dataclasses.replace`.
+- `app/features/demo/schemas.py` — `DemoRunRequest.seed_overrides: SeederOverrides | None` + `DemoRunRequest.user_scope: UserScope | None` (NEW small model) + two cross-field validators.
+- `app/features/demo/pipeline.py` — `DemoContext` carries both; `step_seed` forwards `overrides` to `POST /seeder/generate`; `step_status` honors `user_scope` (validate via `/dimensions/*/{id}`; warn + fallback to discovery on a dangling pair).
+- `app/features/demo/workspace.py` — `create_workspace` writes the two E1 story slots; list/detail response schemas expose them (replay reads list rows).
+- `frontend/src` — `SeedConfigPanel.tsx` + `ScopeSelector.tsx` (composed from installed shadcn primitives), `lib/workspace-replay.ts` pure replay-frame builder, `types/api.ts` additions, `showcase.tsx` wiring.
+- Tests: seeder schema/route/service tests (incl. out-of-bounds 422 + unknown-knob 422), demo schema JSON-path tests, pipeline `_RecordingClient` forwarding tests, workspace slot persistence tests, replay-verbatim regression (backend integration + frontend pure-helper test), component vitests.
+- Docs: `docs/_base/API_CONTRACTS.md` (3 rows), `docs/_base/RUNBOOKS.md` (new incident entry + workspace-section update), `docs/_base/DOMAIN_MODEL.md` (slot schema documentation).
+
+**Success definition**: all Success Criteria check off, the five backend gates +
+frontend lint/test are green, and a real-browser dogfood shows: an
+overridden re-seed run (e.g. 8 stores × 20 products, promo 0.3) goes green with
+the seed card echoing the overrides; a scope-selected run models the chosen
+pair; a kept run replays both verbatim.
+
+## Why
+
+- Umbrella #406: today the showcase accepts only `seed`/`scenario`/`reset`/`skip_seed`; the preset's behavioral character (noise, promos, stockouts, sparsity) is take-it-or-leave-it, and the modeled grain is always the first discovered `(store, product)` pair (`app/features/demo/pipeline.py:582-631`) — the operator cannot tell the story of a specific SKU.
+- The seeder HTTP contract already accepts 25+ FLAT scalar/flag fields (`app/features/seeder/schemas.py:78-298`) — the umbrella's top risk is that surface growing unbounded. A curated nested object with `extra="forbid"` is the documented mitigation: 7 knobs, mechanically allow-listed, everything else stays preset-driven.
+- E1 #407 reserves `seed_overrides` + `user_scope` JSONB story slots on `showcase_workspace` precisely so this epic's config survives into Replay — without persistence, replay of an overridden run would silently regenerate different data.
+- E3 is Parallel after Foundation: it can land independently of E2 #408 / E4 #410 / E5 #411 / E6 #412 (no shared files beyond additive edits to `showcase.tsx` / `workspace.py` — coordinate merge order if simultaneous).
+
+## What
+
+### Open question resolved — seeder override contract shape
+
+**DECISION: expand `GenerateParams` with an additive nested optional object
+(`overrides: SeederOverrides | None = None`). NO new endpoint.** Rationale,
+researched against the current code:
+
+1. **The layering already exists.** `_build_config_from_params` (`app/features/seeder/service.py:202-247`) is a layered override pipeline: preset → scalar dims/window/sparsity → `_apply_phase1_overrides` (:74-137) → `_apply_phase2_overrides` (:139-199). A `_apply_seed_overrides` applied last is a fourth override layer on top of the preset, following an established pattern; a new endpoint would have to reimplement or call into this exact function anyway.
+2. **A new endpoint duplicates load-bearing guards.** `POST /seeder/generate` carries `_check_seeder_enabled()` (production guard, `routes.py:21-33`), the ValueError→400 / Exception→500 RFC 7807 envelope (`routes.py:114-136`), and the seeder-is-the-only-bulk-mutation-path invariant. A second generate-shaped endpoint doubles that audit surface for zero contract benefit.
+3. **Back-compat is free.** Absent field = `None` = byte-identical behavior — the exact precedent the Phase 1/Phase 2 field comments in `schemas.py:121-123,175-177` already promise and test.
+4. **Nested (not more flat scalars) is the allow-list mechanism.** `ConfigDict(extra="forbid")` on the nested model makes an unknown knob a 422 — the umbrella's "contract grows unbounded" mitigation becomes machine-enforced, and the 7 curated knobs stay visually distinct from the 25+ legacy scalars.
+5. **One schema serves both slices.** The demo start frame forwards the same object verbatim; placing `SeederOverrides` in `app/shared/seeder/overrides.py` lets `app/features/seeder/schemas.py` and `app/features/demo/schemas.py` both import it without a cross-slice import (precedent: `demo/schemas.py:16` already imports `ScenarioPreset` from `app/shared/seeder/config`).
+
+Trade-off accepted: `extra="forbid"` means a FUTURE knob sent by a newer client
+to an older backend errors loudly instead of being ignored. That asymmetry vs.
+the top-level start frame (unknown TOP-LEVEL keys remain ignored) is
+deliberate — silent knob-dropping would fake-honor a config the run never used.
+
+### Allow-listed knob → config-field mapping (the complete MVP surface)
+
+| Knob (wire name) | Type / bounds | Maps to (via `dataclasses.replace`) | Preset reference values |
+|---|---|---|---|
+| `stores` | `int`, ge=1 le=100 | `config.dimensions.stores` (`DimensionConfig.stores`, `app/shared/seeder/config.py:118`) | demo profiles 3–5; scalar `GenerateParams.stores` caps 100 |
+| `products` | `int`, ge=1 le=500 | `config.dimensions.products` (`DimensionConfig.products`, config.py:119) | demo profiles 10–25; scalar caps 500 |
+| `window_days` | `int`, ge=75 le=365 | `config.start_date = config.end_date - timedelta(days=window_days)` (end_date untouched) | ≥75 keeps the `historical_backfill` gate clear (`pipeline.py` gate = `3*(14+1)+30 = 75`); ≤365 = `DEFAULT_SEED_SPAN_DAYS` |
+| `sparsity` | `float`, ge=0.0 le=0.9 | `config.sparsity = replace(config.sparsity, missing_combinations_pct=v)` (`SparsityConfig.missing_combinations_pct`, config.py:141) — `replace` PRESERVES the preset's `random_gaps_*` fields | sparse preset uses 0.5; 1.0 would seed zero series (hard-fail), hence the 0.9 cap |
+| `promotion_intensity` | `float`, ge=0.0 le=0.5 | `config.retail = replace(config.retail, promotion_probability=v)` (`RetailPatternConfig.promotion_probability`, config.py:101) | preset max 0.25 (holiday_rush); 0.5 cap = 2× headroom |
+| `stockout_intensity` | `float`, ge=0.0 le=0.5 | `config.retail = replace(config.retail, stockout_probability=v)` (config.py:102) | preset max 0.25 (stockout_heavy); higher values risk NaN-WAPE (documented expected-fail, mirrors sparse) |
+| `noise_sigma` | `float`, ge=0.0 le=0.5 | `config.time_series = replace(config.time_series, noise_sigma=v)` (`TimeSeriesConfig.noise_sigma`, config.py:72) | preset max 0.4 (high_variance) |
+
+Precedence (document in the field description AND a service test): nested
+`overrides` is applied LAST in `_build_config_from_params` and therefore WINS
+over the legacy scalar `stores` / `products` / `sparsity` when both are sent.
+`window_days` recomputes `start_date` from the (scalar-or-default) `end_date`.
+The pipeline keeps sending `sparsity=0.0` as the scalar (preserves preset
+character per the `if params.sparsity > 0` guard at `service.py:225-226`);
+`overrides.sparsity` is the only way the demo overrides sparsity.
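+
+The mapping table and precedence rule above can be sketched as follows. This is a minimal illustration, not the real `_apply_seed_overrides`: the dataclasses are hypothetical stand-ins that model only the fields named in the table (with invented defaults), and the overrides are passed as a plain sparse dict rather than a `SeederOverrides` instance.
+
+```python
+from __future__ import annotations
+
+from dataclasses import dataclass, field, replace
+from datetime import date, timedelta
+from typing import Any
+
+
+# Hypothetical stand-ins for the SeederConfig sub-dataclasses: only the
+# fields named in the mapping table are modeled, and defaults are invented.
+@dataclass(frozen=True)
+class DimensionConfig:
+    stores: int = 5
+    products: int = 25
+
+
+@dataclass(frozen=True)
+class SparsityConfig:
+    missing_combinations_pct: float = 0.0
+    random_gaps_per_series: int = 0
+
+
+@dataclass(frozen=True)
+class RetailPatternConfig:
+    promotion_probability: float = 0.1
+    stockout_probability: float = 0.05
+
+
+@dataclass(frozen=True)
+class TimeSeriesConfig:
+    noise_sigma: float = 0.1
+
+
+@dataclass
+class SeederConfig:
+    start_date: date
+    end_date: date
+    dimensions: DimensionConfig = field(default_factory=DimensionConfig)
+    sparsity: SparsityConfig = field(default_factory=SparsityConfig)
+    retail: RetailPatternConfig = field(default_factory=RetailPatternConfig)
+    time_series: TimeSeriesConfig = field(default_factory=TimeSeriesConfig)
+
+
+def apply_seed_overrides(config: SeederConfig, overrides: dict[str, Any] | None) -> SeederConfig:
+    """Apply only the knobs that were sent; everything else keeps the preset value."""
+    if not overrides:
+        return config  # absent or empty overrides: config is left untouched
+    if "stores" in overrides:
+        config.dimensions = replace(config.dimensions, stores=overrides["stores"])
+    if "products" in overrides:
+        config.dimensions = replace(config.dimensions, products=overrides["products"])
+    if "window_days" in overrides:
+        # end_date stays fixed; only the start of the window moves.
+        config.start_date = config.end_date - timedelta(days=overrides["window_days"])
+    if "sparsity" in overrides:
+        # replace() keeps sibling fields such as random_gaps_per_series.
+        config.sparsity = replace(config.sparsity, missing_combinations_pct=overrides["sparsity"])
+    if "promotion_intensity" in overrides:
+        config.retail = replace(config.retail, promotion_probability=overrides["promotion_intensity"])
+    if "stockout_intensity" in overrides:
+        config.retail = replace(config.retail, stockout_probability=overrides["stockout_intensity"])
+    if "noise_sigma" in overrides:
+        config.time_series = replace(config.time_series, noise_sigma=overrides["noise_sigma"])
+    return config
+```
+
+Because this layer runs after the scalar and phase layers, a config built from a sparse-style preset and then passed `{"sparsity": 0.3}` ends up with `missing_combinations_pct == 0.3` while its `random_gaps_per_series` value is unchanged.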
+
+### `seed_overrides` / `user_scope` slot schemas (THIS PRP's contract to define)
+
+E1 #407 reserves the slots; the JSON inside them is defined HERE:
+
+```jsonc
+// showcase_workspace.seed_overrides  (JSONB; NULL when the run had none)
+// = SeederOverrides.model_dump(mode="json", exclude_none=True) — SPARSE:
+//   only operator-set knobs appear; {} never stored (None instead).
+{
+  "stores": 8,                  // int 1..100, optional
+  "products": 20,               // int 1..500, optional
+  "window_days": 120,           // int 75..365, optional
+  "sparsity": 0.3,              // float 0.0..0.9, optional
+  "promotion_intensity": 0.3,   // float 0.0..0.5, optional
+  "stockout_intensity": 0.1,    // float 0.0..0.5, optional
+  "noise_sigma": 0.25           // float 0.0..0.5, optional
+}
+
+// showcase_workspace.user_scope  (JSONB; NULL when no pair was picked)
+// = UserScope.model_dump(mode="json") — both keys always present when non-null:
+{
+  "store_id": 12,               // int ge=1 — REAL discovered id (sequences
+  "product_id": 47              // int ge=1    never reset; ids are NOT 1-based)
+}
+```
+
+Replay semantics: the slots record the REQUESTED config (replay-verbatim
+contract, mirrors the E1 seed/scenario/reset/skip_seed columns). The EFFECTIVE
+grain a run actually modeled is already recorded separately by
+`finalize_workspace` into the `store_id` / `product_id` columns
+(`workspace.py:136-137`) — when a replayed `user_scope` dangles (warn+fallback,
+below), the two will legitimately differ; that divergence is visible, not
+hidden.
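+
+The sparse-storage rule for the `seed_overrides` slot (only operator-set knobs appear, and an empty object is stored as NULL) can be sketched in plain Python. The helper name `to_seed_overrides_slot` is hypothetical; in the real code the sparse dict would come from `SeederOverrides.model_dump(mode="json", exclude_none=True)`.
+
+```python
+from __future__ import annotations
+
+from typing import Any
+
+
+def to_seed_overrides_slot(knobs: dict[str, Any] | None) -> dict[str, Any] | None:
+    """Drop unset (None) knobs and collapse an empty result to None (stored as NULL)."""
+    if knobs is None:
+        return None
+    sparse = {name: value for name, value in knobs.items() if value is not None}
+    # An empty dict is falsy, so "{} never stored" falls out of `or None`.
+    # A knob explicitly set to 0 or 0.0 is kept: the dict itself is non-empty.
+    return sparse or None
+```
+
+Note that a knob set to `0.0` (for example `sparsity`) is an operator choice and must survive the dump; only `None` means "not set".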
+
+### User-visible behavior
+
+- **Advanced seed config panel** (`/showcase`): a collapsible "Advanced seed config" section appears under the run controls, enabled ONLY while **Re-seed first** is ticked (overrides are meaningless on `skip_seed=true` and the backend rejects the combination). It has 7 controls with the bounds above; a "live summary" line echoes the effective config (e.g. "8 stores × 20 products × 120 days · promo 0.30"); a caveat notes that high sparsity/stockout values can legitimately fail the backtest (NaN WAPE, the same documented semantics as the `sparse` preset). The `window_days` control is disabled with an explanatory tooltip when the `holiday_rush` preset is selected (calendar-pinned window).
+- **Store/product focus-pair selector**: two dropdowns (stores, products — fed by `GET /dimensions/stores` / `GET /dimensions/products`, `page_size=100`) plus a pre-run preview card showing the chosen store (code/name/region/type), product (sku/name/category/brand) and the currently seeded window (from `GET /seeder/status`). Works WITHOUT re-seeding (scope selection on the existing dataset is the primary use). Ticking **Reset database** clears the selection with a caveat ("a wipe re-issues ids — re-pick after the run"), because Postgres sequences never reset (memory anchor: seeder-does-not-reset-id-sequences).
+- **Run**: the start frame carries `seed_overrides` (only when re-seeding and ≥1 knob set) and `user_scope` (when a pair is picked). The seed step card echoes the overridden dims; the status step card says "user-selected pair" vs "discovered pair".
+- **Replay** of a kept run re-submits recorded `seed_overrides` + `user_scope` verbatim alongside the existing 4 config fields. Load repopulates the panel + selector.
+- **Legacy behavior**: a start frame without the new fields is byte-identical to today (contract test).
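+
+The scope behavior described above (use a valid picked pair, otherwise warn and fall back to discovery) can be sketched as a small pure function. Everything here is illustrative: `resolve_scope`, the two callables, and the `"ok"` status value are assumptions; the real `step_status` is part of the async pipeline and validates ids through `/dimensions/*/{id}`.
+
+```python
+from __future__ import annotations
+
+from collections.abc import Callable
+
+Pair = tuple[int, int]  # (store_id, product_id)
+
+
+def resolve_scope(
+    user_scope: Pair | None,
+    pair_exists: Callable[[int, int], bool],
+    discover_first_pair: Callable[[], Pair],
+) -> tuple[Pair, str, str]:
+    """Return (pair, step_status, card_label) for the status step."""
+    if user_scope is None:
+        # No pair picked: today's behavior, unchanged.
+        return discover_first_pair(), "ok", "discovered pair"
+    store_id, product_id = user_scope
+    if pair_exists(store_id, product_id):
+        return user_scope, "ok", "user-selected pair"
+    # Dangling pair (e.g. ids re-issued after a wipe): warn, do not fail the run.
+    return discover_first_pair(), "warn", "discovered pair"
+```
+
+Returning `"warn"` rather than `"fail"` is the design point: a replayed workspace whose recorded pair no longer exists still runs, and the difference between the requested and the effective pair stays visible.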
+
+### Technical requirements
+
+- All new request fields are additive `Optional` with `None` defaults; the WS start frame keeps ignoring unknown TOP-LEVEL keys (`DemoRunRequest` keeps Pydantic's default `extra="ignore"`); the nested models use `extra="forbid"` (allow-list enforcement).
+- `SeederOverrides` and `UserScope` carry `ConfigDict(strict=True, extra="forbid")`. All fields are JSON-native (`int`/`float`) → NO `Field(strict=False)` override needed and the strict-mode AST policy test (`app/core/tests/test_strict_mode_policy.py`) stays green. Runtime-verified on pydantic 2.12.5: a nested-model field under a `strict=True` parent validates from the JSON-parsed dict (FastAPI's `validate_python` path) — see verification log.
+- All config is start-frame-time. NOTHING is configurable mid-run — the pipeline is strictly linear under the module-level `asyncio.Lock` (design invariant from umbrella #406; do not add any mid-run mutation channel).
+- The demo slice must not import `app/features/seeder/*` — `SeederOverrides` lives in `app/shared/seeder/overrides.py`; `UserScope` lives in `app/features/demo/schemas.py` (demo-only concept). `pipeline.py` may import both (`app.shared.*` + own-slice schemas are already imported at `pipeline.py:43-45`).
+- The seeder stays the only bulk-mutation path; no new wipe semantics; `_check_seeder_enabled` untouched.
+- E3 ships ZERO Alembic migrations. CONTRACT(E1): the `seed_overrides` + `user_scope` JSONB slots exist on `showcase_workspace` (E1 #407 migration) before this epic executes.
+
+### Success Criteria
+
+- [ ] `POST /seeder/generate` accepts `{"overrides": {"stores": 8, "promotion_intensity": 0.3}}` → 201, and the generated config reflects the knobs (service unit test); `{"overrides": {"stores": 0}}` → 422; `{"overrides": {"bogus_knob": 1}}` → 422; a body WITHOUT `overrides` produces a byte-identical `SeederConfig` to today (regression test).
+- [ ] `DemoRunRequest.model_validate({...})` JSON-path tests: `seed_overrides` with `skip_seed=true` → ValidationError; `window_days` with `scenario="holiday_rush"` → ValidationError; legacy 4-field frame still validates; `user_scope` happy path.
+- [ ] `step_seed` forwards `overrides` in the `/seeder/generate` POST body (`_RecordingClient` assertion); `step_status` uses a valid `user_scope` pair (asserts the GET-by-id calls + ctx fields), and WARNS + falls back to discovery on a 404 pair.
+- [ ] A `preservation="keep"` run records `seed_overrides` + `user_scope` into the E1 story slots; `GET /demo/workspaces` list items AND `/{id}` detail expose both; the e2e replay regression (`tests/test_e2e_demo.py::test_demo_replay_same_config_twice` extended or sibling test) proves a replayed row carries identical slot JSON.
+- [ ] Frontend: panel renders 7 bounded controls only when Re-seed is ticked; selector previews the chosen pair; `workspaceToRunRequest(ws)` unit test proves replay-verbatim including the new fields; `pnpm lint && pnpm test --run` green; no NEW `tsc -b` errors in touched files.
+- [ ] Legacy start frames byte-identical (backend contract test + existing demo tests untouched-green).
+- [ ] Backend gates green: `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ && uv run pyright app/ && uv run pytest -v -m "not integration"`.
+- [ ] Docs updated additively: API_CONTRACTS (seeder + demo + WS rows), RUNBOOKS (new showcase incident entry + workspace-section note), DOMAIN_MODEL (slot schemas under the `showcase_workspace` aggregate).
+- [ ] Real-browser dogfood (Level 4) performed.
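+
+The two cross-field rules in the criteria above (`seed_overrides` with `skip_seed=true`, and `window_days` with `scenario="holiday_rush"`) can be sketched as a plain-Python check. This is a stand-in for the two `model_validator` methods on `DemoRunRequest`, not their implementation: `check_start_frame` is a hypothetical name, the frame is a raw dict, and the `skip_seed` default of `False` is an assumption.
+
+```python
+from __future__ import annotations
+
+from typing import Any
+
+
+def check_start_frame(frame: dict[str, Any]) -> None:
+    """Raise ValueError when seed_overrides conflicts with the rest of the frame."""
+    overrides = frame.get("seed_overrides")
+    if overrides is None:
+        return  # legacy frame or no overrides: nothing to cross-check
+    if frame.get("skip_seed", False):
+        # The seed step is skipped, so the overrides would be a silent no-op.
+        raise ValueError("seed_overrides requires skip_seed=false")
+    if overrides.get("window_days") is not None and frame.get("scenario") == "holiday_rush":
+        # holiday_rush has a calendar-pinned window; shifting it drops the spikes.
+        raise ValueError("window_days cannot be combined with scenario='holiday_rush'")
+```
+
+A legacy 4-field frame passes untouched, and `holiday_rush` with overrides that do not include `window_days` is still accepted; only the two named combinations are rejected.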
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# MUST READ — codebase patterns (verified 2026-06-12, dev @ bdf85f6 — PRE-E1;
+# re-anchor line numbers by symbol after E1 #407 merges)
+
+- file: app/features/seeder/schemas.py
+  why: |
+    GenerateParams at 78-298 — the contract to extend. Note the Phase 1
+    comment block at 121-123 ("All flags default off so existing scenarios
+    remain byte-identical") — copy that promise onto the new field. The model
+    is plain BaseModel (NO ConfigDict(strict=True)) — do NOT add strict mode
+    to GenerateParams itself (it has date fields start_date/end_date; only
+    the NEW nested SeederOverrides model is strict).
+    ChangepointEventParam at 51-64 is the existing nested-model-in-params
+    precedent (list[ChangepointEventParam] at 153-156).
+
+- file: app/features/seeder/service.py
+  why: |
+    _build_config_from_params at 202-247 — THE integration point. Scalar
+    overrides at 218-226 (dataclasses.replace on dimensions; sparsity only
+    when > 0); _apply_phase1_overrides at 74-137 and _apply_phase2_overrides
+    at 139-199 are the mutate-config-in-place pattern to mirror for
+    _apply_seed_overrides. APPLY THE NEW LAYER LAST (after :241) so nested
+    wins over scalars. from dataclasses import replace already imported (:7).
+
+- file: app/shared/seeder/config.py
+  why: |
+    The override targets: TimeSeriesConfig.noise_sigma :72,
+    RetailPatternConfig.promotion_probability/stockout_probability :101-102,
+    DimensionConfig.stores/products :118-119,
+    SparsityConfig.missing_combinations_pct :141 (+ random_gaps fields to
+    PRESERVE via replace). ScenarioPreset :37-47. holiday_rush pinned window
+    :553-579 (the reason window_days is rejected for that preset).
+    DEFAULT_SEED_SPAN_DAYS=365 :10. NO Pydantic here — config.py stays
+    dataclasses; the new Pydantic model goes in a NEW sibling module
+    app/shared/seeder/overrides.py.
+
+- file: app/features/seeder/routes.py
+  why: |
+    POST /seeder/generate at 85-136 — NO route-code change needed (the body
+    model change flows through); read for the _check_seeder_enabled guard
+    (21-33) and the error envelope you must NOT duplicate (the
+    no-new-endpoint rationale).
+
+- file: app/features/demo/schemas.py
+  why: |
+    DemoRunRequest at 29-85 — the model to extend. The model_validator
+    _workspace_name_requires_keep at 80-85 is the EXACT cross-field-rule
+    pattern for the two new validators. The docstring at 30-38 explains the
+    strict-mode policy; scenario's strict=False override at 59-63 (enum) —
+    nested BaseModel fields need NO such override (runtime-verified).
+    WorkspaceListItem at 169-190 / WorkspaceDetailResponse at 192-203 — add
+    seed_overrides + user_scope to BOTH (replay reads LIST rows:
+    showcase.tsx:174-186). CONTRACT(E1): E1's PRP may already have surfaced
+    the story slots on these response models — if so, verify shape
+    (dict[str, Any] | None) and skip the duplicate edit.
+
+- file: app/features/demo/pipeline.py
+  why: |
+    DemoContext at 212-263 — add seed_overrides/user_scope fields (follow the
+    PRP-38/39/40 additive-Optional comment style). step_seed at 541-579 —
+    extend the POST body; _SCENARIO_SEED_PROFILE at 513-538 supplies the
+    defaults overrides partially replace. step_status at 582-631 — the
+    first-pair discovery to branch around for user_scope (its docstring
+    already states ids are NOT 1-based). run_pipeline ctx construction at
+    2646-2651 — thread the two new req fields. StepStatus literal includes
+    "warn" (schemas.py:19) and only "fail" stops the run (:2729-2738) — the
+    warn+fallback path is safe. CRITICAL header rule :18-19: pipeline must
+    NOT import app.features.* outside its own slice — app.shared.* is fine.
+
+- file: app/features/demo/workspace.py
+  why: |
+    create_workspace at 46-79 — add the two slot writes on the
+    ShowcaseWorkspace(...) constructor; warn-and-continue contract at 10-13
+    (a slot-write failure must never break the run — the try/except already
+    guarantees it). finalize_workspace at 106-155 — NO change for the slots
+    (recorded at create); note store_id/product_id columns at 136-137 record
+    the EFFECTIVE grain (divergence-visible design).
+    CONTRACT(E1): E1 refactors create_workspace to write its new columns —
+    rebase this edit onto E1's merged version.
+
+- file: app/features/demo/models.py
+  why: |
+    ShowcaseWorkspace ORM — E3 does NOT edit this file. CONTRACT(E1): after
+    E1 merges it carries seed_overrides/user_scope as JSONB story slots;
+    verify the exact attribute names/types there before writing
+    workspace.py code. (Assumed shape: nullable JSONB columns mirroring the
+    created_objects precedent at 77-79.)
+
+- file: app/features/demo/tests/test_pipeline.py
+  why: |
+    _RecordingClient at 1025-1068 (records (method, path, json_body) per
+    call, canned responses keyed by (method, path-prefix)); _as_client cast
+    at 1070+. Reuse for: overrides-forwarding, user_scope GET-by-id calls,
+    warn+fallback (404 canned response).
+
+- file: app/features/demo/tests/test_schemas.py
+  why: |
+    The JSON-path test conventions: test_demo_run_request_json_path_keep_
+    with_name :67, test_demo_run_request_legacy_frame_still_validates :75,
+    test_demo_run_request_workspace_name_requires_keep :83 — mirror all
+    three shapes for the new fields.
+
+- file: app/features/seeder/tests/test_routes.py
+  why: |
+    Route-test harness: client fixture :15 (TestClient + mocked settings,
+    seeder_allow_production=True), TestGenerate :96 — add overrides 201 /
+    422-bounds / 422-unknown-knob cases here. test_generate_validation_error
+    :157 is the 422 pattern.
+
+- file: app/features/seeder/tests/test_service.py
+  why: |
+    Service-test patterns for _build_config_from_params — add: knob→field
+    mapping, precedence-over-scalars, window_days math, preset-character
+    preservation (e.g. sparse preset's random_gaps survive an overrides.
+    sparsity replace), and the no-overrides byte-identical regression.
+
+- file: tests/test_e2e_demo.py
+  why: |
+    test_demo_replay_same_config_twice at 561-609 — the replay-regression
+    guard to extend (or sibling): a keep-run with seed_overrides+user_scope,
+    replayed, must produce a second row with identical slot JSON.
+
+- file: frontend/src/pages/showcase.tsx
+  why: |
+    Wiring surface. handleRun start frame at 139-156 (conditional-spread
+    pattern for optional fields — reuse for seed_overrides/user_scope);
+    handleLoadWorkspace at 160-168 (repopulate panel+selector);
+    handleReplayWorkspace at 174-186 (REPLACE its inline object with the new
+    workspaceToRunRequest helper); controls block at 269-363 (panel +
+    selector land after the existing checkboxes); reset checkbox at 301-311
+    (hook the scope-clearing caveat here).
+
+- file: frontend/src/types/api.ts
+  why: |
+    DemoRunRequest at 778-788 (+ seed_overrides?/user_scope?);
+    WorkspaceListItem at 806-816 and WorkspaceDetail at 819-825 (+ both
+    fields, `| null`); add SeedOverrides + UserScope interfaces near the
+    demo block. WARNING: MIXED CRLF/LF line endings — surgical edits only;
+    verify `git diff --stat` stays small.
+
+- file: frontend/src/hooks/use-stores.ts
+  why: |
+    useStores at 16-43 (TanStack Query over /dimensions/stores with
+    page/page_size/enabled) — the selector's data source; use-products.ts
+    mirrors it (useProducts :16, useProduct :45). page_size hard cap is 100
+    (app/features/dimensions/routes.py:62,187).
+
+- file: frontend/src/hooks/use-seeder.ts
+  why: useSeederStatus :15 — the seeded-window source for the preview card.
+
+- file: frontend/src/hooks/use-demo-pipeline.ts
+  why: |
+    start(req) at 241-249 sends the req object as the WS start frame
+    verbatim — generic over the widened DemoRunRequest; NO change needed
+    (read to confirm). RunHistoryStrip replays stored req objects, so
+    localStorage replays inherit the new fields for free.
+
+- file: frontend/src/components/demo/ScenarioPicker.test.tsx
+  why: |
+    The vitest + @testing-library/react + afterEach(cleanup) harness pattern
+    for the two new component test files.
+
+- file: frontend/src/components/ui/
+  why: |
+    Installed primitives: collapsible.tsx, select.tsx, slider.tsx, input.tsx,
+    badge.tsx, card.tsx, tooltip.tsx, checkbox.tsx — the panel + selector
+    compose from these; NO new shadcn install required. If one becomes
+    necessary anyway: pin `pnpm dlx shadcn@4.7.0 add ...` (5.x writes a stub
+    pnpm-workspace.yaml and skips the component) and use per-component
+    @radix-ui/react-X imports, never the radix barrel.
+
+- file: docs/_base/RUNBOOKS.md
+  why: |
+    "Showcase page (/showcase) pipeline fails at step X" — numbered entries
+    end at 28; append entry 29 (overrides/scope incident matrix) in the same
+    bold-trigger/Cause/Fix format. The "Showcase workspace —
+    preserve/restore/replay/delete semantics" section's "Explicitly out of
+    scope" list says advanced seed configuration is NOT implemented — E3
+    DELIVERS it: rewrite that bullet (move seed_overrides/user_scope to the
+    documented surface; phase-level config stays out of scope).
+
+- file: docs/_base/API_CONTRACTS.md
+  why: |
+    Rows to extend additively: the /seeder/* row (mention the overrides
+    object on POST /seeder/generate), POST /demo/run, and the WS
+    /demo/stream start-frame bullet (E1/E2 notes were just added — append an
+    "E3 (#409)" note, don't disturb them).
+
+- file: docs/_base/DOMAIN_MODEL.md
+  why: |
+    showcase_workspace aggregate section — document the seed_overrides /
+    user_scope slot JSON schemas (the umbrella's "JSONB story slots become a
+    junk drawer" mitigation requires documented slot schemas here).
+
+- file: PRPs/PRP-showcase-workspace-E2-preset-exposure.md
+  why: |
+    Closest predecessor (preset exposure + seed profiles) — its gotcha block
+    (holiday_rush pinning, seeder precedence, sparse NaN-WAPE, frontend tsc
+    gate) all recur in E3; this PRP inherits and extends them.
+
+# Issue / initiative context
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/409
+  why: The epic this PRP implements (Parallel after Foundation E1 #407).
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/406
+  why: |
+    Umbrella — Approach ("all configuration is start-frame-time", "no new
+    router outside existing slices"), Risks table row 1 (the allow-list
+    mitigation this PRP implements), out-of-scope list (NO mid-run controls,
+    NO embedded scenario-builder).
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/407
+  why: |
+    Foundation epic whose contract is GIVEN: JSONB story slots incl.
+    seed_overrides + user_scope; columns replayed_from_workspace_id /
+    archived / pinned / notes / tags / config_schema_version; PATCH
+    /demo/workspaces/{id}. E3 builds on, never re-decides, this surface.
+
+# External references
+- url: https://docs.pydantic.dev/latest/concepts/strict_mode/
+  why: |
+    Strict-mode semantics for nested models: a model-typed field validates
+    dict input using the NESTED model's own config — confirmed empirically
+    (verification log) so no doc-faith is required. NOTE: the docs site
+    301-redirects and anchors have drifted; the runtime verification in the
+    Known Gotchas log is the authoritative claim, not this URL.
+- url: https://docs.pydantic.dev/latest/api/config/#pydantic.config.ConfigDict.extra
+  why: extra="forbid" → unknown nested keys raise ValidationError (the 422 allow-list mechanism).
+```
+
+### Current Codebase tree (relevant subset, pre-E1)
+
+```bash
+app/shared/seeder/
+├── config.py                 # dataclasses; override TARGETS (no Pydantic here)
+├── core.py / generators/     # consume SeederConfig — untouched by E3
+app/features/seeder/
+├── schemas.py                # GenerateParams @78 (25+ flat fields)
+├── service.py                # _build_config_from_params @202; _apply_phaseN @74/@139
+├── routes.py                 # POST /generate @85 (guard @21; no route change)
+└── tests/                    # test_routes.py, test_service.py, test_schemas.py
+app/features/demo/
+├── schemas.py                # DemoRunRequest @29; Workspace* responses @169
+├── pipeline.py               # DemoContext @212; step_seed @541; step_status @582; run_pipeline @2618
+├── workspace.py              # create_workspace @46; finalize_workspace @106
+├── models.py                 # ShowcaseWorkspace (E1 adds the story slots — not edited here)
+└── tests/                    # test_pipeline.py (_RecordingClient @1025), test_schemas.py, test_workspace.py
+tests/test_e2e_demo.py        # replay regression @561
+frontend/src/
+├── pages/showcase.tsx        # handleRun @139; handleLoad @160; handleReplay @174; controls @269
+├── types/api.ts              # DemoRunRequest @778; WorkspaceListItem @806 (MIXED CRLF/LF)
+├── hooks/use-stores.ts, use-products.ts, use-seeder.ts, use-demo-pipeline.ts
+└── components/demo/          # ScenarioPicker, WorkspacePanel, ... (+ index.ts barrel)
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+app/shared/seeder/overrides.py            # NEW — SeederOverrides (strict, extra=forbid, 7 knobs)
+app/shared/seeder/tests/test_overrides.py # NEW — bounds, forbid, JSON-path, sparse-dump tests
+app/features/seeder/schemas.py            # MOD — GenerateParams.overrides: SeederOverrides | None
+app/features/seeder/service.py            # MOD — _apply_seed_overrides, wired LAST in _build_config_from_params
+app/features/seeder/tests/test_service.py # MOD — mapping/precedence/window/byte-identical tests
+app/features/seeder/tests/test_routes.py  # MOD — 201-with-overrides, 422-bounds, 422-unknown-knob
+app/features/demo/schemas.py              # MOD — UserScope; DemoRunRequest fields + validators; Workspace* responses
+app/features/demo/pipeline.py             # MOD — DemoContext fields; step_seed forward; step_status scope branch
+app/features/demo/workspace.py            # MOD — create_workspace writes both slots
+app/features/demo/tests/test_schemas.py   # MOD — JSON-path + validator tests
+app/features/demo/tests/test_pipeline.py  # MOD — forwarding + scope + warn/fallback tests
+app/features/demo/tests/test_workspace.py # MOD — slot persistence tests
+tests/test_e2e_demo.py                    # MOD — replay-verbatim regression incl. slots (integration)
+frontend/src/types/api.ts                 # MOD — SeedOverrides, UserScope, DemoRunRequest, Workspace* (surgical)
+frontend/src/lib/workspace-replay.ts      # NEW — workspaceToRunRequest(ws) pure helper
+frontend/src/lib/workspace-replay.test.ts # NEW — replay-verbatim FE regression
+frontend/src/components/demo/SeedConfigPanel.tsx        # NEW — collapsible 7-knob panel
+frontend/src/components/demo/SeedConfigPanel.test.tsx   # NEW
+frontend/src/components/demo/ScopeSelector.tsx          # NEW — pair selector + preview card
+frontend/src/components/demo/ScopeSelector.test.tsx     # NEW
+frontend/src/components/demo/index.ts     # MOD — export the two new components (match barrel style)
+frontend/src/pages/showcase.tsx           # MOD — wiring (state, panel, selector, start frames)
+docs/_base/API_CONTRACTS.md               # MOD — seeder overrides + /demo/run + WS start-frame E3 notes
+docs/_base/RUNBOOKS.md                    # MOD — showcase incident 29 + workspace-section scope update
+docs/_base/DOMAIN_MODEL.md                # MOD — slot schemas on the showcase_workspace aggregate
+```
+
+### Known Gotchas & Library Quirks
+
+```python
+# CRITICAL — EXECUTION ORDER: do not start until E1 #407 is merged to dev.
+#   E3 writes JSONB slots that E1's migration creates. First action of Task 1:
+#   re-read app/features/demo/models.py + workspace.py on the post-E1 dev and
+#   re-anchor every CONTRACT(E1) tag in this PRP.
+
+# CRITICAL — pydantic strict + nested models (runtime-verified 2026-06-12 on
+#   pydantic 2.12.5; re-run on lib upgrade):
+#   uv run python -c "
+#   from pydantic import BaseModel, ConfigDict, Field
+#   class N(BaseModel):
+#       model_config = ConfigDict(strict=True, extra='forbid')
+#       stores: int | None = Field(default=None, ge=1, le=100)
+#   class P(BaseModel):
+#       model_config = ConfigDict(strict=True)
+#       seed_overrides: N | None = None
+#   print(P.model_validate({'seed_overrides': {'stores': 5}}))          # OK — dict→model under strict
+#   P.model_validate({'seed_overrides': {'stores': 999}})               # ValidationError (bounds)
+#   "
+#   and N.model_validate({'stores': 5, 'bogus': 1}) → ValidationError (forbid).
+#   Conclusions baked into the design: NO Field(strict=False) needed on the
+#   nested field; extra='forbid' IS the allow-list; FastAPI's validate_python
+#   path (the JSON dict) works. All knobs are int/float → the strict-mode AST
+#   policy test (app/core/tests/test_strict_mode_policy.py) does not fire.
+
+# CRITICAL — do NOT add ConfigDict(strict=True) to GenerateParams itself: it
+#   has date fields (start_date/end_date) and is deliberately non-strict today.
+#   Only the NEW nested models are strict.
+
+# CRITICAL — seeder override precedence (service.py:213-226 + the new layer):
+#   preset → scalar stores/products/window/sparsity → phase1 → phase2 →
+#   overrides (LAST, wins). Use dataclasses.replace for every sub-config so
+#   preset-customized sibling fields survive (e.g. sparse preset's
+#   random_gaps_per_series when overrides.sparsity is set; scenario-customized
+#   region/category lists when overrides.stores is set — same reason the
+#   existing scalar override at :218-222 uses replace).
+
+# CRITICAL — holiday_rush is CALENDAR-PINNED (config.py:553-579): its
+#   HolidayConfig spikes are fixed 2024 dates. seed_overrides.window_days on
+#   scenario='holiday_rush' must be REJECTED at DemoRunRequest validation
+#   (clear ValueError message), not silently ignored — a shifted window
+#   silently drops every holiday spike. Direct /seeder/generate callers who
+#   combine them are out of scope (the preset docstring already documents
+#   explicit-dates-to-shift).
+
+# CRITICAL — seed_overrides requires skip_seed=False. The seed step is skipped
+#   on skip_seed=true (pipeline.py:543-544) so overrides would be a silent
+#   no-op; reject in a model_validator (mirror _workspace_name_requires_keep,
+#   schemas.py:80-85). The frontend enforces the same by gating the panel on
+#   the Re-seed checkbox.
+
+# CRITICAL — ids are NOT 1-based (step_status docstring, pipeline.py:585-587;
+#   memory anchor seeder-does-not-reset-id-sequences). The scope selector MUST
+#   be fed from live /dimensions data, never synthesized ids. user_scope can
+#   dangle after reset+reseed → step_status WARN + fallback to discovery (the
+#   replay path of a reset=true workspace would otherwise hard-fail forever).
+#   "warn" does NOT stop the run (only "fail" does — pipeline.py:2729-2738).
+
+# CRITICAL — high stockout_intensity / sparsity overrides can legitimately
+#   FAIL the backtest (all-NaN WAPE → step_backtest FAIL by design; same
+#   semantics as the sparse preset, RUNBOOKS incident 28). Do NOT add a
+#   graceful-skip; ship the panel caveat + runbook entry 29 instead.
+
+# CRITICAL — workspace writes stay warn-and-continue (workspace.py:10-13).
+#   The slot writes go INSIDE the existing try/except in create_workspace; a
+#   failure yields workspace_id=None and a green run, never an exception.
+
+# GOTCHA — replay reads WorkspaceListItem (the LIST row — showcase.tsx:174):
+#   seed_overrides/user_scope must be on the LIST response, not detail-only.
+#   CONTRACT(E1): if E1 already exposed the slots detail-only, ADD them to the
+#   list item here (cheap; sparse JSONB).
+
+# GOTCHA — frontend type gates: `pnpm tsc --noEmit` is vacuous (solution-style
+#   tsconfig) and `pnpm tsc -b` fails with ~24 PRE-EXISTING errors on dev,
+#   none in demo components. Gate on `pnpm lint && pnpm test --run` plus:
+#   cd frontend && pnpm tsc -b 2>&1 | grep -E "SeedConfigPanel|ScopeSelector|workspace-replay|types/api|pages/showcase"  # expect empty
+
+# GOTCHA — frontend/src/types/api.ts has MIXED CRLF/LF line endings; repo-wide
+#   files are inconsistently CRLF/LF. Keep edits surgical; check
+#   `git diff --stat` before committing (Edit/Write emit LF — avoid whole-file
+#   noise diffs).
+
+# GOTCHA — shadcn: compose from INSTALLED primitives (collapsible, select,
+#   slider, input, badge, tooltip — frontend/src/components/ui/). Semantic
+#   tokens only (text-muted-foreground, border-primary, text-destructive for
+#   the reset caveat — mirrors showcase.tsx:309). Never raw colors.
+
+# GOTCHA — mypy --strict AND pyright --strict gate every backend edit. The
+#   DemoContext additions need full annotations (SeederOverrides | None);
+#   pipeline.py imports them from app.shared.seeder.overrides (NOT from the
+#   seeder feature slice — vertical-slice rule, pipeline.py:18-19).
+
+# GOTCHA — step_seed currently derives the detail line from profile dims
+#   (pipeline.py:577). With overrides, compute effective stores/products =
+#   override-or-profile for BOTH the POST scalars and the detail string so
+#   the card tells the truth; keep scalar sparsity=0.0 (preset-character
+#   guard); the nested object carries the operator's sparsity.
+
+# CONVENTION — commits (every one references #409; no AI trailer; scopes from
+#   .claude/rules/commit-format.md — seeder slice ⊂ `data`, demo slice ⊂ `api`):
+#   feat(data): add allow-listed nested seed overrides to seeder contract (#409)
+#   feat(api): thread seed overrides and user scope through demo pipeline (#409)
+#   feat(ui): add advanced seed config panel and scope selector to showcase (#409)
+#   test(api): cover replay-verbatim seed overrides and scope slots (#409)
+#   docs(docs): document seed override contract and workspace slots (#409)
+#   docs(repo): track showcase completion e3 prp (#409)
+#   Branch off dev: feat/showcase-completion-e3-seed-config-scope (45 chars ≤ 50).
+
+# RUNTIME-VERIFICATION LOG (per prp-create step 3):
+#   - pydantic 2.12.5 nested-strict + extra=forbid + bounds behavior verified
+#     with the command in the CRITICAL block above (all four assertions pass).
+#   - Seeder precedence semantics read directly from service.py:202-247 (not
+#     inferred); the `if params.sparsity > 0` guard confirmed at :225-226.
+#   - dimensions page_size cap 100 confirmed at
+#     app/features/dimensions/routes.py:62 and :187.
+#   - `pnpm tsc -b` pre-existing-failure state re-confirmed by the E2 PRP log
+#     (2026-06-12); no demo-component errors.
+#   - No other third-party API claims — everything else cites in-repo code.
+```
+
+## Implementation Blueprint
+
+### Data models and structure
+
+```python
+# app/shared/seeder/overrides.py  (NEW)
+"""Curated, allow-listed seed-override schema (E3, issue #409).
+
+Shared between the seeder slice (GenerateParams.overrides) and the demo slice
+(DemoRunRequest.seed_overrides) — app/shared is the sanctioned cross-slice
+home (vertical-slice rule). extra='forbid' IS the allow-list: any knob not
+listed here is a 422 at the HTTP boundary (umbrella #406 risk mitigation —
+the full 25+ knob surface stays preset-driven).
+"""
+from pydantic import BaseModel, ConfigDict, Field
+
+class SeederOverrides(BaseModel):
+    # strict=True catches JSON-native coercion bugs ("5" → 5); every field is
+    # int/float so no Field(strict=False) override is needed (security-patterns.md).
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    stores: int | None = Field(default=None, ge=1, le=100, description="Store count → DimensionConfig.stores; wins over the scalar `stores` param.")
+    products: int | None = Field(default=None, ge=1, le=500, description="Product count → DimensionConfig.products; wins over the scalar `products` param.")
+    window_days: int | None = Field(default=None, ge=75, le=365, description="Seeded window length; start_date = end_date - window_days. >=75 keeps the showcase historical_backfill gate clear. Rejected on the calendar-pinned holiday_rush preset (demo surface).")
+    sparsity: float | None = Field(default=None, ge=0.0, le=0.9, description="Missing (store,product) grain fraction → SparsityConfig.missing_combinations_pct; preserves the preset's gap config. 1.0 disallowed (zero series).")
+    promotion_intensity: float | None = Field(default=None, ge=0.0, le=0.5, description="→ RetailPatternConfig.promotion_probability (preset max 0.25).")
+    stockout_intensity: float | None = Field(default=None, ge=0.0, le=0.5, description="→ RetailPatternConfig.stockout_probability. High values can legitimately NaN-WAPE-fail the backtest (documented).")
+    noise_sigma: float | None = Field(default=None, ge=0.0, le=0.5, description="→ TimeSeriesConfig.noise_sigma (preset max 0.4).")
+
+    def is_empty(self) -> bool:
+        """True when no knob is set ({} on the wire) — treated as None everywhere."""
+        return not self.model_dump(exclude_none=True)
+```
+
+```python
+# app/features/demo/schemas.py — additions (demo-only concept stays in-slice)
+class UserScope(BaseModel):
+    """Operator-selected (store, product) focus pair (E3, issue #409).
+
+    Ids are REAL discovered ids (sequences never reset — ids are not 1-based);
+    step_status validates them and warn-falls-back to discovery when dangling.
+    """
+    model_config = ConfigDict(strict=True, extra="forbid")
+    store_id: int = Field(..., ge=1)
+    product_id: int = Field(..., ge=1)
+
+# DemoRunRequest — two additive Optional fields + two validators:
+#   seed_overrides: SeederOverrides | None = None   (import from app.shared.seeder.overrides)
+#   user_scope: UserScope | None = None
+#
+# @model_validator(mode="after") _seed_overrides_require_reseed:
+#   if self.seed_overrides is not None and not self.seed_overrides.is_empty()
+#      and self.skip_seed:
+#       raise ValueError("seed_overrides requires skip_seed=false (Re-seed first)")
+#   # normalize: an empty overrides object collapses to None
+#   if self.seed_overrides is not None and self.seed_overrides.is_empty():
+#       self.seed_overrides = None      # NOTE: mode="after" may mutate self
+#   return self                         # an "after" validator must return the instance
+#
+# @model_validator(mode="after") _window_days_forbidden_on_holiday_rush (also ends with `return self`):
+#   if (self.seed_overrides is not None
+#       and self.seed_overrides.window_days is not None
+#       and self.scenario is ScenarioPreset.HOLIDAY_RUSH):
+#       raise ValueError("window_days cannot override the calendar-pinned holiday_rush window")
+#
+# WorkspaceListItem (+ WorkspaceDetailResponse inherits):
+#   seed_overrides: dict[str, Any] | None = Field(default=None, ...)
+#   user_scope: dict[str, Any] | None = Field(default=None, ...)
+#   (from_attributes=True already set — ORM JSONB maps straight through.
+#    CONTRACT(E1): skip if E1's PRP already added them; ensure LIST exposure.)
+```
+
+```python
+# app/features/seeder/service.py — the new layer (mirror _apply_phase2_overrides)
+def _apply_seed_overrides(config: SeederConfig, overrides: SeederOverrides | None) -> None:
+    """Apply the curated nested overrides LAST — wins over scalar params.
+
+    dataclasses.replace is field-precise: preset-customized sibling fields
+    (region/category lists, random_gaps_*) survive every knob.
+    """
+    if overrides is None:
+        return
+    if overrides.stores is not None or overrides.products is not None:
+        config.dimensions = replace(
+            config.dimensions,
+            stores=overrides.stores if overrides.stores is not None else config.dimensions.stores,
+            products=overrides.products if overrides.products is not None else config.dimensions.products,
+        )
+    if overrides.window_days is not None:
+        config.start_date = config.end_date - timedelta(days=overrides.window_days)
+    if overrides.sparsity is not None:
+        config.sparsity = replace(config.sparsity, missing_combinations_pct=overrides.sparsity)
+    if overrides.promotion_intensity is not None or overrides.stockout_intensity is not None:
+        config.retail = replace(
+            config.retail,
+            promotion_probability=(overrides.promotion_intensity
+                                   if overrides.promotion_intensity is not None
+                                   else config.retail.promotion_probability),
+            stockout_probability=(overrides.stockout_intensity
+                                  if overrides.stockout_intensity is not None
+                                  else config.retail.stockout_probability),
+        )
+    if overrides.noise_sigma is not None:
+        config.time_series = replace(config.time_series, noise_sigma=overrides.noise_sigma)
+# Wire-in (one line, AFTER _apply_phase2_overrides at :241):
+#   _apply_seed_overrides(config, params.overrides)
+```
+
+```python
+# app/features/demo/pipeline.py — step changes (sketch)
+
+# DemoContext additions (after workspace_name, with an E3 #409 comment):
+#   seed_overrides: SeederOverrides | None = None
+#   user_scope: UserScope | None = None
+# run_pipeline ctx construction: thread req.seed_overrides / req.user_scope.
+
+# step_seed — effective dims + verbatim forward:
+#   stores = ctx.seed_overrides.stores if (ctx.seed_overrides and ctx.seed_overrides.stores) else profile.stores
+#   products = ... same for products ...
+#   window: if ctx.seed_overrides and ctx.seed_overrides.window_days:
+#       seed_end = datetime.now(UTC).date(); seed_start = seed_end - timedelta(days=ctx.seed_overrides.window_days)
+#   elif profile.window is not None: ... (existing pinned branch; validator already
+#       guarantees window_days is never set on holiday_rush)
+#   json_body gains: **({"overrides": ctx.seed_overrides.model_dump(exclude_none=True)}
+#                      if ctx.seed_overrides else {})
+#   detail line + data echo the effective dims and "overrides" keys applied.
+
+# step_status — user-scope branch BEFORE first-pair discovery:
+#   if ctx.user_scope is not None:
+#       try:
+#           store_body = await client.request("status[scope-store]", "GET",
+#               f"/dimensions/stores/{ctx.user_scope.store_id}")
+#           product_body = await client.request("status[scope-product]", "GET",
+#               f"/dimensions/products/{ctx.user_scope.product_id}")
+#       except _StepError:
+#           scope_warn = ("user_scope (store=%d, product=%d) not found — fell back "
+#                         "to discovered pair" % (...))   # WARN, never fail (replay safety)
+#       else:
+#           ctx.store_id, ctx.product_id = ctx.user_scope.store_id, ctx.user_scope.product_id
+#           -> return ("pass", f"... store_id={..} product_id={..} (user-selected)",
+#                      {..., "user_scope_applied": True})
+#   # fallback / no-scope path: existing discovery (582-631) unchanged; when the
+#   # scope dangled return ("warn", scope_warn + discovery detail,
+#   #                       {..., "user_scope_applied": False}).
+```
+
+```python
+# app/features/demo/workspace.py — create_workspace constructor additions
+#   (INSIDE the existing try; attribute names per the merged E1 model —
+#    CONTRACT(E1): assumed `seed_overrides` / `user_scope` nullable JSONB):
+#   seed_overrides=(req.seed_overrides.model_dump(mode="json", exclude_none=True)
+#                   if req.seed_overrides else None),
+#   user_scope=(req.user_scope.model_dump(mode="json") if req.user_scope else None),
+```
+
+```tsx
+// frontend/src/lib/workspace-replay.ts (NEW) — replay-verbatim in ONE place
+import type { DemoRunRequest, WorkspaceListItem } from '@/types/api'
+
+/** Build the verbatim replay start frame for a saved workspace (E4 semantics
+ *  + E3 #409 slots). Omits absent optionals so legacy rows replay byte-
+ *  identically to today. */
+export function workspaceToRunRequest(ws: WorkspaceListItem): DemoRunRequest {
+  return {
+    seed: ws.seed,
+    scenario: ws.scenario,
+    reset: ws.reset,
+    skip_seed: ws.skip_seed,
+    preservation: 'keep',
+    // CONTRACT(E1): replay provenance — post-E1, handleReplayWorkspace's inline
+    // object sends this field (an E1 frozen success criterion); this helper
+    // REPLACES that object and must preserve it or lineage silently regresses.
+    replayed_from_workspace_id: ws.workspace_id,
+    ...(ws.name ? { workspace_name: ws.name } : {}),
+    ...(ws.seed_overrides ? { seed_overrides: ws.seed_overrides } : {}),
+    ...(ws.user_scope ? { user_scope: ws.user_scope } : {}),
+  }
+}
+
+// types/api.ts additions (surgical):
+//   export interface SeedOverrides { stores?: number; products?: number;
+//     window_days?: number; sparsity?: number; promotion_intensity?: number;
+//     stockout_intensity?: number; noise_sigma?: number }
+//   export interface UserScope { store_id: number; product_id: number }
+//   DemoRunRequest += seed_overrides?: SeedOverrides; user_scope?: UserScope
+//   WorkspaceListItem += seed_overrides: SeedOverrides | null; user_scope: UserScope | null
+
+// SeedConfigPanel.tsx — props: { value: SeedOverrides | null; onChange(v: SeedOverrides | null): void;
+//   disabled?: boolean; windowLocked?: boolean /* holiday_rush */ }
+//   <Collapsible> "Advanced seed config"; Inputs (stores 1..20 UI-range, products 1..50,
+//   window_days 75..365) + Sliders (sparsity 0..0.9 step .05, promo/stockout 0..0.5,
+//   noise 0..0.5); live summary line; NaN-WAPE caveat <Badge>; emits null when all unset.
+//   UI ranges are TIGHTER than the API bounds (laptop-scale); the API bounds are the law.
+
+// ScopeSelector.tsx — props: { value: UserScope | null; onChange(v: UserScope | null): void;
+//   disabled?: boolean }
+//   two shadcn <Select>s fed by useStores/useProducts({ page: 1, pageSize: 100 });
+//   preview <Card>: store code/name/region/type · product sku/name/category/brand ·
+//   seeded window from useSeederStatus(); "Clear" button → onChange(null).
+
+// showcase.tsx wiring:
+//   const [seedOverrides, setSeedOverrides] = useState<SeedOverrides | null>(null)
+//   const [userScope, setUserScope] = useState<UserScope | null>(null)
+//   - panel rendered when `reseed` ticked (windowLocked={scenario === 'holiday_rush'});
+//     unticking Re-seed clears overrides (validator parity).
+//   - ticking Reset database clears userScope + shows the re-pick caveat
+//     (text-destructive, mirrors :309).
+//   - handleRun spread: ...(reseed && seedOverrides ? { seed_overrides: seedOverrides } : {}),
+//                       ...(userScope ? { user_scope: userScope } : {})
+//   - handleLoadWorkspace: setSeedOverrides(ws.seed_overrides ?? null); setUserScope(ws.user_scope ?? null)
+//   - handleReplayWorkspace: start(workspaceToRunRequest(ws))  // replaces the inline object
+```
+
+### List of tasks (dependency order)
+
+```yaml
+Task 0 — E1 gate & re-anchor (BLOCKING):
+  VERIFY: gh issue view 407 --json state   # must be CLOSED (E1 merged)
+  RUN: git switch dev && git pull
+  READ on the post-E1 dev: app/features/demo/models.py (slot attribute names/types),
+    app/features/demo/workspace.py (create_workspace shape), app/features/demo/schemas.py
+    (whether E1 surfaced slots on Workspace* responses), frontend/src/types/api.ts,
+    AND frontend/src/pages/showcase.tsx handleReplayWorkspace — E1 wires
+    replayed_from_workspace_id into the inline replay object that Task 9's
+    workspaceToRunRequest replaces; confirm the helper preserves it.
+  RESOLVE every CONTRACT(E1) tag in this PRP against reality; adjust attribute
+    names below if E1's PRP chose different ones (e.g. a single story JSONB).
+  RUN: git switch -c feat/showcase-completion-e3-seed-config-scope
+  VERIFY: gh issue view 409 --json state   # open
+
+Task 1 — CREATE app/shared/seeder/overrides.py (+ tests):
+  - SeederOverrides per the blueprint (strict, extra=forbid, 7 bounded knobs, is_empty()).
+  - CREATE app/shared/seeder/tests/test_overrides.py (the shared/seeder/tests dir exists):
+      bounds (each knob low/high rejection), unknown-knob forbid, JSON-path
+      model_validate({...}) happy path, model_dump(exclude_none=True) sparseness,
+      is_empty() truth table.
+  - Optionally re-export from app/shared/seeder/__init__.py (match how
+    ScenarioPreset/SeederConfig are exported there — service.py:32 imports them
+    from the package).
+
+Task 2 — MODIFY app/features/seeder/schemas.py + service.py:
+  - GenerateParams: ADD `overrides: SeederOverrides | None = Field(default=None,
+    description="Curated nested overrides (E3 #409); applied LAST — wins over the
+    scalar stores/products/sparsity. Absent = byte-identical legacy behavior.")`
+    (import from app.shared.seeder.overrides; do NOT touch strict-mode config).
+  - service.py: ADD _apply_seed_overrides (blueprint); CALL it after
+    _apply_phase2_overrides(config, params) in _build_config_from_params.
+  - timedelta already imported in service.py (:8).
+
+Task 3 — seeder tests:
+  - test_service.py: (a) each knob maps to its config field; (b) overrides.stores
+    beats params.stores (precedence); (c) window_days math
+    (config.start_date == config.end_date - timedelta(days=N)); (d) sparse-preset
+    character preserved (overrides.sparsity set → random_gaps_per_series still 3);
+    (e) REGRESSION: params without overrides → config equal to today's output.
+  - test_routes.py (TestGenerate class): 201 with {"overrides": {"stores": 8,
+    "promotion_intensity": 0.3}}; 422 on {"overrides": {"stores": 0}};
+    422 on {"overrides": {"bogus_knob": 1}} (extra=forbid).
+
+Task 4 — MODIFY app/features/demo/schemas.py:
+  - ADD UserScope; ADD DemoRunRequest.seed_overrides / .user_scope; ADD the two
+    model_validators (blueprint). Update the class docstring's strict-mode note
+    (nested models are JSON-native — cite the runtime verification).
+  - ADD seed_overrides/user_scope to WorkspaceListItem (Detail inherits) —
+    CONTRACT(E1): skip/merge if E1 already exposed them; ensure LIST exposure.
+
+Task 5 — demo schema tests (app/features/demo/tests/test_schemas.py):
+  - JSON-path: DemoRunRequest.model_validate({"skip_seed": False,
+    "seed_overrides": {"stores": 8}}) OK; seed_overrides + skip_seed True →
+    ValidationError; empty overrides {} normalizes to None; window_days +
+    scenario "holiday_rush" → ValidationError; user_scope happy path +
+    extra-key forbid + ge=1 bounds; LEGACY 4-field frame still validates
+    (extend the sibling test of test_demo_run_request_legacy_frame_still_validates).
+  - WorkspaceListItem from_attributes round-trip with slot dicts and with NULLs.
+
+Task 6 — MODIFY app/features/demo/pipeline.py:
+  - DemoContext: + seed_overrides / user_scope (typed, E3 #409 comment block).
+  - run_pipeline: thread req.seed_overrides / req.user_scope into ctx (:2646-2651).
+  - step_seed: effective dims + window_days branch + "overrides" body key
+    (blueprint); detail/data echo.
+  - step_status: user-scope validate/adopt/warn-fallback branch (blueprint);
+    data gains "user_scope_applied".
+
+Task 7 — pipeline tests (test_pipeline.py, _RecordingClient @1025):
+  - test_step_seed_forwards_seed_overrides: ctx with overrides; assert POST
+    /seeder/generate body["overrides"] == {"stores": 8, ...}, body["stores"] == 8
+    (effective), sparsity scalar stays 0.0.
+  - test_step_seed_window_days_overrides_profile_window: 120-day delta between
+    posted start/end.
+  - test_step_status_honors_user_scope: canned 200s for
+    /dimensions/stores/{id} + /dimensions/products/{id}; assert ctx.store_id/
+    product_id == scope, status "pass", data["user_scope_applied"] is True.
+  - test_step_status_dangling_scope_warns_and_falls_back: canned 404 for the
+    store GET + normal discovery responses; assert status "warn",
+    ctx ids == discovered pair, data["user_scope_applied"] is False.
+  - test_run_pipeline_threads_new_fields (ctx construction).
+
+Task 8 — MODIFY app/features/demo/workspace.py + tests:
+  - create_workspace: write both slots (blueprint; INSIDE the try —
+    warn-and-continue intact).
+  - test_workspace.py: keep-run with overrides+scope persists sparse JSON;
+    keep-run without them persists NULLs; create failure still returns None
+    (existing warn-and-continue test stays green).
+  - tests/test_e2e_demo.py (integration): extend test_demo_replay_same_config_twice
+    (or add a sibling test_demo_replay_preserves_seed_overrides_and_scope):
+    keep-run with seed_overrides + user_scope (skip_seed=False so the validator
+    passes — use the smallest overrides, e.g. {"stores": 3, "products": 10},
+    to keep wall-clock sane); replay via a second run with the row's recorded
+    config; assert both rows' seed_overrides/user_scope JSON identical.
+
+Task 9 — frontend types + replay helper:
+  - types/api.ts: SeedOverrides + UserScope interfaces; DemoRunRequest +2
+    optional fields; WorkspaceListItem +2 nullable fields (surgical — CRLF trap).
+  - CREATE lib/workspace-replay.ts + workspace-replay.test.ts:
+    legacy row (null slots) → frame WITHOUT the E3 keys (seed_overrides/
+    user_scope) but ALWAYS WITH replayed_from_workspace_id = ws.workspace_id
+    (CONTRACT(E1): deep-equal to the POST-E1 inline object, not the pre-E1
+    shape); slotted row → frame includes both E3 keys verbatim; named/unnamed.
+
+Task 10 — CREATE SeedConfigPanel.tsx + ScopeSelector.tsx (+ tests, + barrel):
+  - Blueprint above; compose from installed primitives; semantic tokens only.
+  - SeedConfigPanel.test.tsx: renders 7 controls; emits a sparse object (only
+    touched knobs); emits null when cleared; disabled state; windowLocked
+    disables the window control; caveat badge visible at high stockout/sparsity.
+  - ScopeSelector.test.tsx: renders options from mocked useStores/useProducts
+    (mock the hooks via vi.mock — keep the harness light per
+    test-requirements.md); selection fires onChange with real ids; preview
+    shows store/product names; Clear → onChange(null).
+  - components/demo/index.ts: export both (match barrel style).
+
+Task 11 — MODIFY frontend/src/pages/showcase.tsx:
+  - State + wiring per the blueprint; handleReplayWorkspace uses
+    workspaceToRunRequest; handleLoadWorkspace repopulates panel + selector;
+    Reset-database tick clears userScope (+ caveat); Re-seed untick clears
+    seedOverrides.
+
+Task 12 — docs:
+  - API_CONTRACTS.md: seeder row — "E3 (#409) — POST /seeder/generate accepts an
+    additive Optional `overrides` object (allow-listed knobs: stores, products,
+    window_days, sparsity, promotion_intensity, stockout_intensity, noise_sigma;
+    `extra=forbid` → unknown knob 422; applied last, wins over the scalar
+    stores/products/sparsity)". POST /demo/run row + WS start-frame bullet —
+    "E3 (#409) — additive Optional `seed_overrides` (same object; requires
+    skip_seed=false; window_days rejected on holiday_rush) and `user_scope`
+    ({store_id, product_id}; validated by the status step, warn+fallback on a
+    dangling pair); both persist to the workspace row and replay verbatim."
+  - RUNBOOKS.md: showcase incident 29 — overrides/scope failure matrix:
+    (a) 422 "seed_overrides requires skip_seed=false" → tick Re-seed first;
+    (b) 422 window_days on holiday_rush → expected, pinned window;
+    (c) status step ⚠️ "user_scope ... not found" → expected after reset/reseed
+    (ids re-issued; sequences never reset) — re-pick the pair;
+    (d) backtest ❌ NaN WAPE on high stockout/sparsity overrides → documented
+    expected outcome (mirrors incident 28's sparse row).
+    Workspace section: move "advanced seed configuration" out of the
+    "Explicitly out of scope" list (now shipped: seed_overrides + user_scope;
+    phase-level config remains out of scope) and note replay-verbatim covers
+    the two new slots.
+  - DOMAIN_MODEL.md: showcase_workspace aggregate — document both slot JSON
+    schemas (the table above) + the requested-vs-effective-grain distinction.
+
+Task 13 — gates, dogfood, commit, PR:
+  - Validation Loop below (all levels).
+  - Level 4 browser dogfood (mandatory per .claude/rules/ui-design.md).
+  - git diff --stat surgical check (types/api.ts CRLF trap).
+  - Commits per the convention block; PR into dev titled
+    "feat(api,ui): showcase advanced seed config and scope selection (#409)".
+```
+
+### Integration Points
+
+```yaml
+DATABASE: none in E3 — the seed_overrides/user_scope JSONB slots ship in E1
+  #407's migration. CONTRACT(E1): verify slots exist before Task 1.
+CONFIG: none — no new settings or env vars.
+ROUTES: none new — POST /seeder/generate, POST /demo/run, WS /demo/stream all
+  extend via request-model changes only (umbrella: "no new router outside
+  existing slices").
+SHARED: app/shared/seeder/overrides.py is the one new module — the sanctioned
+  cross-slice seam (both slices already import app/shared/seeder).
+WS CONTRACT: start frame gains two additive optional keys; event stream shape
+  unchanged (step data dicts gain echo keys only).
+WORKSPACE ROW: create_workspace writes the slots; finalize untouched;
+  PATCH /demo/workspaces/{id} (E1) deliberately NOT extended — overrides/scope
+  are immutable run records, not patchable metadata.
+FRONTEND: 2 new components + 1 lib helper + types + showcase wiring; WorkspacePanel /
+  RunHistoryStrip / use-demo-pipeline are generic over the widened types (no edits).
+```
+
+## Validation Loop
+
+### Level 1: Syntax & Style
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/          # both --strict, gate merge
+(cd frontend && pnpm lint)                       # subshell: cwd stays at the repo root
+# Types: no NEW errors mentioning touched files (pre-existing tsc -b failures exist on dev):
+(cd frontend && pnpm tsc -b 2>&1) | grep -E "SeedConfigPanel|ScopeSelector|workspace-replay|types/api|pages/showcase" ; echo "exit=$? (1 = no matches = good)"
+```
+
+### Level 2: Unit Tests
+
+```bash
+uv run pytest app/shared/seeder app/features/seeder app/features/demo -v -m "not integration"
+(cd frontend && pnpm test --run src/components/demo/ src/lib/)
+(cd frontend && pnpm test --run)                    # full frontend suite
+```
+
+### Level 3: Integration (real Postgres — E1's migrated schema)
+
+```bash
+docker compose up -d && uv run alembic upgrade head
+# CAVEAT: destructive seeder tests pollute the shared DB mid-suite — reset to a
+# fresh DB before trusting Level-3 results (DROP/CREATE DATABASE, never `down -v`).
+uv run pytest -v -m integration -k "demo or seeder"   # incl. the replay-slot regression
+# Manual contract probes:
+curl -s -X POST localhost:8123/seeder/generate -H 'content-type: application/json' \
+  -d '{"scenario":"demo_minimal","stores":3,"products":10,"overrides":{"promotion_intensity":0.3,"noise_sigma":0.25}}' | head -c 300
+curl -s -X POST localhost:8123/seeder/generate -H 'content-type: application/json' \
+  -d '{"overrides":{"bogus":1}}' -o /dev/null -w '%{http_code}\n'        # 422
+curl -s -X POST localhost:8123/demo/run -H 'content-type: application/json' \
+  -d '{"skip_seed":true,"seed_overrides":{"stores":5}}' -o /dev/null -w '%{http_code}\n'  # 422
+```
+
+### Level 4: Browser dogfood (uvicorn :8123 + vite :5173)
+
+```bash
+uv run uvicorn app.main:app --port 8123 &
+cd frontend && ./node_modules/.bin/vite --host 0.0.0.0 &   # bypasses pnpm 11 depsStatusCheck
+# Real browser (webapp-testing / agent-browser; on this host Playwright needs
+# executable_path=/snap/bin/chromium):
+#  1. /showcase: tick "Re-seed first" → Advanced seed config panel appears;
+#     untick → panel collapses and overrides clear.
+#  2. Set stores=8, products=20, promo=0.3 → Run: green; seed card detail
+#     echoes "8 stores x 20 products"; /seeder/status confirms dims.
+#  3. Pick a focus pair in the ScopeSelector (preview shows names + window) →
+#     Run (skip_seed): status card says "(user-selected)"; train/backtest
+#     Inspect links target the chosen pair.
+#  4. Save as workspace + Run → workspace panel row → Replay: the replayed run
+#     uses the same overrides + scope (status card user-selected; second
+#     workspace row's slots identical — check GET /demo/workspaces).
+#  5. Tick "Reset database" → scope selection clears with the caveat.
+#  6. Pick holiday_rush + Re-seed → window_days control disabled (tooltip).
+#  7. Legacy path: no overrides, no scope → run is indistinguishable from today.
+```
+
+## Final Validation Checklist
+
+- [ ] Backend gates: `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ && uv run pyright app/ && uv run pytest -v -m "not integration"`
+- [ ] Frontend: `pnpm lint && pnpm test --run` green; no NEW tsc -b errors in touched files
+- [ ] Seeder: overrides 201 / bounds 422 / unknown-knob 422 / no-overrides byte-identical (tests enforce)
+- [ ] Demo validators: seed_overrides×skip_seed and window_days×holiday_rush rejected; legacy frame green (JSON-path tests)
+- [ ] Pipeline: overrides forwarded; user_scope honored; dangling scope WARNS + falls back (tests enforce)
+- [ ] Workspace: slots persisted sparse/NULL; replay-verbatim regression green (integration)
+- [ ] Replay helper: workspaceToRunRequest covers legacy + slotted rows (FE test)
+- [ ] Browser dogfood (Level 4) performed in a real browser — not just tests
+- [ ] `git diff --stat` surgical (types/api.ts CRLF trap)
+- [ ] API_CONTRACTS + RUNBOOKS 29 + workspace-section + DOMAIN_MODEL slot schemas updated additively
+- [ ] Commits reference #409, scopes from the allow-list, no AI trailer; PR into dev
+- [ ] Every CONTRACT(E1) tag was re-verified against the merged E1 code (Task 0)
+
+---
+
+## Assumptions (explicit — no user clarification was available)
+
+1. **CONTRACT(E1):** `showcase_workspace` carries `seed_overrides` and `user_scope` as TWO separate nullable JSONB columns (precedent: `created_objects` / `result_summary`, `models.py:77-81`). If E1's PRP instead nests all story slots under one JSONB column, only the `create_workspace` write and the response-schema mapping change (Task 0 re-anchor).
+2. **CONTRACT(E1):** E1's migration ships the slots; E3 ships ZERO migrations. If E1 somehow deferred a slot, E3 must STOP and add it to E1, not ship its own migration.
+3. **CONTRACT(E1):** `config_schema_version` semantics are E1's; populating reserved slots does NOT bump it (assumed to stay at E1's initial value). E3 writes nothing to that column.
+4. **CONTRACT(E1):** workspace API responses — E3 requires `seed_overrides` + `user_scope` on the LIST item (replay reads list rows, `showcase.tsx:174-186`). If E1 exposed them detail-only (or not at all), E3 adds them to `WorkspaceListItem`; if E1 already added them, Task 4 merges instead of duplicating.
+5. **CONTRACT(E1):** replay provenance (`replayed_from_workspace_id`) is written by the E1/E2 replay surface; E3's replay-verbatim test must tolerate (not assert away) that column being populated.
+6. **CONTRACT(E1):** `PATCH /demo/workspaces/{id}` exists (E1) and is deliberately NOT extended by E3 — overrides/scope are immutable run records.
+7. Knob names (`promotion_intensity`, `stockout_intensity`, `noise_sigma`, `window_days`) are this PRP's choice — business-friendly on the wire, mapped to the internal dataclass names in one documented table. Renaming costs a constant, not a rework.
+8. Bounds are this PRP's choice (table above), justified against preset reference values; the UI constrains tighter (laptop-scale) than the API.
+9. `user_scope` dangling resolution = WARN + fallback (not fail): chosen so that replaying a `reset=true` workspace can never become permanently unrunnable; divergence stays visible via the requested-slot vs effective-columns split.
+10. The seeder-side field is named `overrides` (the slice context makes `seed_` redundant); the demo-side field is `seed_overrides` (epic-specified name). The pipeline maps one to the other in `step_seed`.
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't create a new seeder endpoint — the decision above is final for E3; the nested object rides the existing contract.
+- ❌ Don't widen the knob allow-list beyond the 7 — the umbrella names this the top risk; everything else stays preset-driven (`extra="forbid"` enforces it).
+- ❌ Don't add any mid-run configuration channel — all config is start-frame-time; the single-`asyncio.Lock` linear pipeline is a design invariant.
+- ❌ Don't import `app/features/seeder/*` from the demo slice (or vice versa) — the shared schema lives in `app/shared/seeder/overrides.py`.
+- ❌ Don't add `ConfigDict(strict=True)` to `GenerateParams` (it has date fields) — only the new nested models are strict.
+- ❌ Don't make a dangling `user_scope` fail the run — warn + fallback (replay safety); equally, don't silently adopt it without validation.
+- ❌ Don't let a workspace slot write break the pipeline — slot writes stay inside the warn-and-continue try/except.
+- ❌ Don't ship a migration — E1 owns the schema.
+- ❌ Don't NaN-WAPE-proof the backtest for extreme overrides — document the expected fail (runbook 29), mirroring the sparse-preset decision in E2/#391.
+- ❌ Don't hand-roll new UI primitives or install shadcn components when collapsible/select/slider/input/badge/tooltip already exist; if forced, pin `shadcn@4.7.0`.
+- ❌ Don't ship the UI without a real-browser check — `.claude/rules/ui-design.md` makes that a hard requirement.
+
+## Confidence Score
+
+**8/10** for one-pass implementation success. Every backend change extends a
+verified, line-cited in-repo pattern (the seeder's layered override pipeline,
+the `DemoRunRequest` cross-field validators, `_RecordingClient` step tests, the
+warn-and-continue workspace writes), the pydantic strict/nested/forbid
+behavior was runtime-verified rather than assumed, and the riskiest judgment
+calls (contract shape, knob mapping, bounds, dangling-scope semantics, slot
+schemas) are decided with rationale and pinned by tests. The −2: (a) this PRP
+is authored PRE-E1 — six CONTRACT(E1) tags must survive a cross-check against
+the merged E1 code, and attribute-name drift there would touch 3 files (Task 0
+exists precisely to absorb this); (b) the two new frontend components are the
+usual UI-iteration surface (styling/dogfood may need a second pass), and
+`showcase.tsx` is a merge hotspot shared with parallel epics E2/E4/E5.
diff --git a/PRPs/PRP-showcase-completion-E4-run-config-phase-controls.md b/PRPs/PRP-showcase-completion-E4-run-config-phase-controls.md
new file mode 100644
index 00000000..85826e58
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E4-run-config-phase-controls.md
@@ -0,0 +1,820 @@
+name: "PRP — showcase-completion E4: run-config phase controls (model set + backtest params in start frame)"
+issue: "#410 (epic) · umbrella #406 · depends on E1 #407 (Foundation — MUST be merged first)"
+branch: "feat/showcase-run-config-phase-controls (off dev)"
+description: |
+  Start-frame-time run configuration for the showcase pipeline: a model-family
+  picker (baselines + feature-aware, with opt-in lightgbm/xgboost/random_forest
+  toggles surfaced ONLY when the matching `forecast_enable_*` flag is on),
+  backtest configuration (horizon, split strategy, min train size, n_splits,
+  gap, ranking metric WAPE/MAE/RMSE), a train-candidate preview before launch,
+  and the chosen config echoed into the workspace row and visible on the run.
+  NO mid-run re-entry — the linear single-`asyncio.Lock` pipeline is preserved;
+  all configuration happens in the start frame.
+
+## Core Principles
+
+1. **Context is King** — every file/line cited below was verified on 2026-06-12.
+2. **Validation Loops** — Levels 1–4 below are executable; Level 4 browser dogfood is MANDATORY (UI work, `.claude/rules/ui-design.md`).
+3. **Additive only** — a legacy start frame (no new fields) behaves **byte-identically** to today. This is a frozen umbrella #406 success criterion.
+4. **Global rules** — CLAUDE.md / AGENTS.md / `.claude/rules/*` apply. Commits: `feat(api,db): … (#410)` for backend+migration, `feat(ui): … (#410)` for frontend (or one `feat(api,ui): … (#410)`).
+
+---
+
+## Goal
+
+**Feature Goal**: An operator on `/showcase` can, before launching a run, (a) pick which forecasting models the pipeline trains/backtests, (b) tune the backtest split (horizon / strategy / n_splits / min_train_size / gap) and the winner-ranking metric (WAPE / MAE / RMSE), (c) see a train-candidate preview of exactly what will run, and (d) find that config recorded on the saved workspace row and honored verbatim on Replay.
+
+**Deliverable** (all additive):
+
+- `app/features/demo/schemas.py` — new `DemoBacktestConfig`; `DemoRunRequest` gains `train_model_types: list[str] | None` + `backtest: DemoBacktestConfig | None`; `WorkspaceListItem` gains `run_config: dict | None`.
+- `app/shared/model_taxonomy.py` — public `KNOWN_MODEL_TYPES` frozenset (validation allow-list source of truth).
+- `app/features/demo/models.py` + one Alembic migration — nullable `run_config` JSONB column on `showcase_workspace` (a **replay-input column** like `seed`/`scenario`, NOT an E1 story slot — see Decision D1).
+- `app/features/demo/workspace.py` — `create_workspace` records `run_config`.
+- `app/features/demo/pipeline.py` — `DemoContext` carries the resolved run config; `step_train` / `step_backtest` / `step_v2_train` honor it; `_select_winner` gains a metric parameter; `pipeline_complete` echoes the config.
+- `app/features/model_selection/` — `CandidateModelInfo` gains `enabled: bool` (settings overlay in the service; `capabilities.py` stays pure) so the frontend knows which opt-in toggles to surface.
+- `frontend/` — `RunConfigPanel` (collapsible advanced section on `/showcase`): model picker (reuses `CandidateModelPicker` with an enabled-filtered catalog), `DemoBacktestSettingsForm` (mirrors the champion-selector form), train-candidate preview; start-frame wiring with a dirty-only inclusion rule; Load/Replay honor `run_config`; WorkspacePanel shows a config summary.
+- Docs: `docs/_base/API_CONTRACTS.md`, `docs/_base/DOMAIN_MODEL.md`, `docs/_base/RUNBOOKS.md` additive notes.
+- Tests at every layer (schema, taxonomy drift-lock, pipeline, workspace, migration, catalog overlay, vitest).
+
+**Success Definition**: all Success Criteria are checked off; all five CI gates are green; the integration suite is green; and a Level-4 dogfood run launches a custom-config run from `/showcase`, the preview matches what ran, the workspace row carries `run_config`, and Replay re-runs it verbatim.
+
+## Why
+
+- Umbrella #406 success criterion: *"The start frame accepts model-set + backtest config; the chosen config is echoed into the workspace row and visible on the run."*
+- Today `DEMO_MODEL_TYPES` (`pipeline.py:67`) hard-codes 3 baselines and `DEMO_HORIZON`/`DEMO_BACKTEST_SPLITS`/`DEMO_MIN_TRAIN_SIZE` (`pipeline.py:54-56`) hard-code the split — the showcase cannot demonstrate the 11-model zoo (PRP-36) or metric-driven champion selection it actually ships.
+- Replay (E4 #393) is verbatim-by-design; without recording the run config, a custom run could not be replayed faithfully — breaking the workspace story the whole umbrella is about.
+- Brainstorm Round 5 (`.flow/brainstorm-log.md`): mid-run/per-phase re-run was explicitly DEFERRED ("re-architects locked linear pipeline") — start-frame-only is the negotiated scope. Do not add mid-run controls.
+
+## What
+
+### User-visible behavior
+
+1. `/showcase` controls card gains a collapsible **"Run configuration (advanced)"** section (collapsed by default — untouched = legacy behavior):
+   - **Model picker**: checkboxes grouped by family (Baseline / Additive / Tree-based), fed by `GET /model-selection/models`. Opt-in models (`lightgbm`, `xgboost`, `random_forest`) appear **only when** their `forecast_enable_*` flag is on (new catalog `enabled` field). Default selection: `naive`, `seasonal_naive`, `moving_average` (the legacy trio). Cap 10, min 1.
+   - **Backtest settings**: ranking metric (WAPE default / MAE / RMSE), horizon (1–90, default 14), and an "Advanced split settings" collapsible: strategy (expanding/sliding), splits (2–20, default 3), min train (≥7, default 30), gap (0–30, default 0). Inline validation mirrors backend bounds; soft warning when `min_train_size + n_splits×(horizon+gap)` exceeds the scenario's seeded window.
+   - **Train-candidate preview**: a read-only chip list of exactly which models will train (selection, plus `prophet_like (V2)` appended on `showcase_rich`), with family badges and a count.
+2. The WS start frame / `POST /demo/run` body carry `train_model_types` + `backtest` **only when the operator changed something** (dirty-only rule → untouched UI sends a byte-identical legacy frame).
+3. The pipeline trains/backtests the selected models; the winner is the best **configured metric**; `pipeline_complete.data.run_config` echoes the config; the train/backtest step cards show what was requested.
+4. On `preservation="keep"` runs the workspace row records `run_config`; the Saved-workspaces panel shows a compact config summary; **Load** repopulates the controls; **Replay** re-submits it verbatim.
+5. A request naming a disabled/unknown model fails fast with an actionable message (422 on unknown at validation; a clear `fail` step detail on disabled-flag models).
+
+### Technical requirements
+
+- Pydantic v2 strict-mode policy respected (all new request fields JSON-native; nested model validated from a plain dict — add the JSON-path test).
+- Vertical-slice rule: the demo slice NEVER imports `app/features/model_selection` (or any sibling) in Python — the model allow-list comes from `app/shared/model_taxonomy.py`; the frontend talks to the catalog over HTTP.
+- Migration is additive (one nullable column); it applies and downgrades cleanly on a fresh DB.
+- Workspace writes stay warn-and-continue (must never break a green run).
+
+### Success Criteria
+
+- [ ] `DemoRunRequest` accepts `train_model_types` + `backtest` (additive Optional); a legacy frame validates byte-identically (existing `test_demo_run_request_legacy_frame_still_validates` extended).
+- [ ] Unknown model type → 422 / WS `error` event; duplicate model types rejected; `gap >= horizon` rejected; selection size 1–10 enforced.
+- [ ] A run with `train_model_types=["naive","seasonal_average"]` trains exactly those models; `step_backtest` sends the configured `split_config` and picks the winner by the configured metric (unit-asserted against the canned `_Client` request bodies).
+- [ ] A disabled opt-in model in the selection fails the `train` step with a detail naming the flag (`forecast_enable_lightgbm=false …`).
+- [ ] `GET /model-selection/models` items carry `enabled`; `enabled=false` exactly when the matching `forecast_enable_*` flag is off (lightgbm/xgboost/random_forest), `true` for all always-on models.
+- [ ] `showcase_workspace.run_config` records the config on keep-runs (NULL when defaults were used); migration up+down clean.
+- [ ] `/showcase` advanced section renders the picker (opt-ins hidden when disabled), backtest form, and preview; Load/Replay honor `run_config`; untouched controls send a legacy frame (vitest-asserted).
+- [ ] `pipeline_complete.data.run_config` echo present on custom runs, absent (None) on legacy runs.
+- [ ] All five CI gates green; integration tests green; Level-4 dogfood evidence captured.
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# ── The work order ───────────────────────────────────────────────────────────
+- issue: "#410"
+  why: "Epic scope (verbatim). Parallel after Foundation E1 #407."
+- issue: "#406"
+  why: Umbrella — approach ("additive-only delta", "start-frame-time only"), success criteria, risk table.
+- file: PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
+  why: |
+    The Foundation this epic builds on. CRITICAL: E1 defines six JSONB story
+    slots (seed_overrides, user_scope, approval_events, rag_events, job_ids,
+    phase_summaries) — NONE of them is a run-config slot, and E1 assigns no
+    slot to E4. See Decision D1 below. Also the migration-task pattern to
+    MIRROR (down_revision discovery, up/down test) and config_schema_version
+    semantics (lines 25-70, 228-236, 560-610 of that PRP).
+
+# ── Backend: demo slice (primary surface) ────────────────────────────────────
+- file: app/features/demo/schemas.py
+  why: |
+    DemoRunRequest (lines 29-86) — the additive-field pattern to MIRROR exactly:
+    PRP-38 scenario field (enum-on-wire strict=False override, lines 51-63),
+    E1 #390 preservation/workspace_name + model_validator (lines 64-85).
+    WorkspaceListItem lines 169-189 (from_attributes response pattern).
+- file: app/features/demo/pipeline.py
+  why: |
+    THE file. Constants to make configurable: DEMO_HORIZON=14,
+    DEMO_BACKTEST_SPLITS=3, DEMO_MIN_TRAIN_SIZE=30 (54-56), DEMO_MODEL_TYPES
+    (67). _model_config_payload (271-286) — extend. _select_winner (446-460)
+    — hard-codes "wape"; gains metric param. step_train (669-703) — gather
+    over DEMO_MODEL_TYPES, train tail = date_end - DEMO_HORIZON. step_backtest
+    (731-836) — two branches (SHOWCASE_RICH single-call include_baselines=True
+    at 743-788 vs legacy per-model loop at 789-818); split_config bodies at
+    753-760/801-808. step_v2_train (998-1090) — V2 train tail also uses
+    DEMO_HORIZON (1021). run_pipeline (2618-2771) — ctx construction (2646),
+    create_workspace keep-branch (2655-2657), pipeline_complete data (2758-2770).
+    DemoContext dataclass (212-264) — where resolved config fields land.
+- file: app/features/demo/workspace.py
+  why: |
+    create_workspace (46-79) — records replay inputs at insert time; E4 adds
+    run_config here (NOT in finalize: it is an input, known before step 1).
+    Warn-and-continue pattern is load-bearing.
+- file: app/features/demo/models.py
+  why: ShowcaseWorkspace ORM (37-89). run_config column lands next to the
+       "Run configuration -- replay inputs" block (lines 65-69 comment).
+- file: app/features/demo/routes.py
+  why: WS start-frame parse (166-194) — ValidationError → one error event +
+       close. No route changes needed beyond what schemas give for free
+       (POST /demo/run + WS validate via DemoRunRequest).
+- file: app/features/demo/tests/test_schemas.py
+  why: Test naming + the legacy-frame contract test to extend
+       (test_demo_run_request_legacy_frame_still_validates, line 75).
+- file: app/features/demo/tests/test_pipeline.py
+  why: Canned-_Client mocking pattern for step unit tests (assert on captured
+       request bodies — exactly how the split_config assertion should work).
+- file: app/features/demo/tests/test_workspace.py
+  why: Integration-test pattern for create/finalize roundtrips.
+
+# ── Backend: contracts the pipeline drives ───────────────────────────────────
+- file: app/features/forecasting/schemas.py
+  why: |
+    TrainRequest (441-525): store/product/dates/config (+feature_frame_version,
+    feature_groups — leave at V1 defaults for E4 training). ModelConfig is a
+    discriminated union; ALL 11 members validate from a minimal
+    {"model_type": X} payload (runtime-verified, see Gotchas). season_length
+    default 7 (line 86), window_size default 7 (line 107).
+- file: app/features/forecasting/routes.py
+  why: 'Flag gates: lightgbm (lines 76-81) and xgboost (82-86) raise
+       BadRequestError 400 "… is disabled. Set forecast_enable_…". NOTE:
+       random_forest is gated deeper (forecasting/models.py:1761), another
+       reason step_train must pre-check flags itself for a clean message.'
+- file: app/core/config.py
+  why: "forecast_enable_lightgbm / forecast_enable_xgboost /
+       forecast_enable_random_forest — all default False (lines 118-120)."
+- file: app/features/backtesting/schemas.py
+  why: |
+    SplitConfig (24-73) — the canonical bounds DemoBacktestConfig MUST mirror:
+    strategy Literal["expanding","sliding"] def "expanding"; n_splits 2-20
+    def 5; min_train_size ge=7 def 30; gap 0-30 def 0; horizon 1-90 def 14;
+    validator horizon > gap. BacktestConfig (81-108): split_config +
+    model_config_main + include_baselines + store_fold_details.
+    aggregated_metrics keys: mae, smape, wape, bias, rmse (PRP-36;
+    rmse verified at app/features/backtesting/metrics.py:349).
+- file: app/shared/model_taxonomy.py
+  why: _MODEL_FAMILY_MAP — the 11 known model types. E4 adds public
+       KNOWN_MODEL_TYPES here (one-way import app/features/* → app/shared OK).
+
+# ── Backend: model catalog (flag exposure) ───────────────────────────────────
+- file: app/features/model_selection/capabilities.py
+  why: build_model_catalog (line 126) — pure/static by design (module
+       docstring). Do NOT read settings here; overlay in the service.
+- file: app/features/model_selection/service.py
+  why: 'get_model_catalog (113-119) is a thin pass-through; the enabled overlay
+       goes here (model_copy(update={"enabled": …}) per item).'
+- file: app/features/model_selection/schemas.py
+  why: "CandidateModelInfo (412-429) + ModelCatalogResponse (431): add
+       `enabled: bool = True` (additive, defaulted for back-compat)."
+- file: app/features/model_selection/routes.py
+  why: GET /model-selection/models (74-86) — no route change; response model
+       picks up the new field automatically.
+
+# ── Frontend ─────────────────────────────────────────────────────────────────
+- file: frontend/src/pages/showcase.tsx
+  why: |
+    453 lines. Start-frame construction handleRun (139-156) — the
+    spread-only-when-set pattern for byte-compat; handleLoadWorkspace
+    (160-168) + handleReplayWorkspace (174-186) — must consume run_config;
+    controls card (257-371) — the advanced section slots after the
+    workspace-name block (line 362).
+- file: frontend/src/components/champion-selector/candidate-model-picker.tsx
+  why: REUSE this component (family-grouped checkbox grid, cap badge,
+       extra/feature-aware badges). Feed it an enabled-filtered catalog.
+- file: frontend/src/components/champion-selector/backtest-settings-form.tsx
+  why: "MIRROR for DemoBacktestSettingsForm: Field helper, metric Select,
+       Collapsible advanced split knobs, splitConfigErrors display. Differences:
+       horizon is EDITABLE here (champion's is locked), metric list is
+       wape/mae/rmse (champion's is wape/smape/mae/bias)."
+- file: frontend/src/components/champion-selector/split-config.ts
+  why: splitConfigErrors — REUSE as-is (field names match DemoBacktestConfig).
+- file: frontend/src/hooks/use-model-selection.ts
+  why: useModelCatalog (line 30) — REUSE for the picker's data.
+- file: frontend/src/types/api.ts
+  why: DemoRunRequest (778-788), WorkspaceListItem (805-815),
+       CandidateModelInfo (1279-1290), SplitConfig comment block (~1262-1268).
+- file: frontend/src/hooks/use-demo-pipeline.ts
+  why: start(req) serializes DemoRunRequest verbatim into the WS start frame —
+       no hook change needed; the dirty-only rule lives in showcase.tsx.
+- file: frontend/src/components/demo/ScenarioPicker.tsx
+  why: "Disabled-while-running prop pattern; the scenario value feeds the
+       preview (windowDays map mirrors pipeline.py _SCENARIO_SEED_PROFILE,
+       513-538: demo_minimal/sparse/holiday_rush = 92d window, others = 180d)."
+- file: frontend/src/components/demo/WorkspacePanel.tsx
+  why: Row layout to extend with the compact run-config summary line/badge.
+- file: frontend/src/components/demo/RunHistoryStrip.test.tsx
+  why: Representative vitest + RTL pattern for the new component tests.
+
+# ── Project docs to update (additive) ────────────────────────────────────────
+- file: docs/_base/API_CONTRACTS.md
+  why: DemoRunRequest/WS start-frame field docs + catalog `enabled` +
+       workspace run_config (follow the existing E1/E2/PRP-38 annotation style).
+- file: docs/_base/DOMAIN_MODEL.md
+  why: showcase_workspace aggregate — document run_config as a replay-input
+       column (explicitly NOT a story slot; D1 rationale).
+- file: docs/_base/RUNBOOKS.md
+  why: § Showcase runbook — two new numbered incidents (disabled-model fail;
+       aggressive split → NaN/insufficient-fold fail is a documented outcome,
+       sparse-preset precedent in incident 28).
+```
+
+### Current Codebase tree (relevant subset)
+
+```bash
+app/
+├── core/config.py                      # forecast_enable_* flags (118-120)
+├── shared/model_taxonomy.py            # ModelFamily + _MODEL_FAMILY_MAP (11 types)
+└── features/
+    ├── demo/
+    │   ├── models.py                   # ShowcaseWorkspace (89 lines)
+    │   ├── schemas.py                  # DemoRunRequest / StepEvent / Workspace* (213)
+    │   ├── pipeline.py                 # orchestrator + steps (2771)
+    │   ├── workspace.py                # create/finalize/list/get/delete helpers
+    │   ├── routes.py                   # POST /demo/run, WS /demo/stream, workspaces CRUD
+    │   ├── service.py                  # run lock + sync/stream wrappers
+    │   └── tests/                      # test_{schemas,pipeline,workspace,models,routes}.py
+    ├── forecasting/{schemas,routes,models}.py   # TrainRequest, flag gates
+    ├── backtesting/schemas.py          # SplitConfig / BacktestConfig
+    └── model_selection/
+        ├── capabilities.py             # build_model_catalog (pure)
+        ├── service.py                  # get_model_catalog pass-through
+        ├── schemas.py                  # CandidateModelInfo / ModelCatalogResponse
+        └── routes.py                   # GET /model-selection/models
+alembic/versions/                       # head TODAY = 324a2fa37fcc; E1 #407 adds one on top
+frontend/src/
+├── pages/showcase.tsx                  # controls card + start frame + load/replay
+├── hooks/{use-demo-pipeline,use-model-selection,use-workspaces}.ts
+├── components/demo/                    # ScenarioPicker, WorkspacePanel, … (+ tests)
+├── components/champion-selector/       # candidate-model-picker, backtest-settings-form, split-config
+└── types/api.ts                        # DemoRunRequest, WorkspaceListItem, CandidateModelInfo
+```
+
+### Desired Codebase tree (files added/changed)
+
+```bash
+app/shared/model_taxonomy.py                          # MODIFY: + KNOWN_MODEL_TYPES frozenset
+app/shared/tests/test_model_taxonomy.py               # MODIFY (or create if missing): drift-lock test
+app/features/demo/schemas.py                          # MODIFY: + DemoBacktestConfig; DemoRunRequest fields; WorkspaceListItem.run_config
+app/features/demo/models.py                           # MODIFY: + run_config JSONB column
+alembic/versions/<rev>_add_showcase_workspace_run_config.py   # CREATE: add/drop run_config
+app/features/demo/workspace.py                        # MODIFY: create_workspace records run_config
+app/features/demo/pipeline.py                         # MODIFY: ResolvedRunConfig, ctx, steps, winner metric, echo
+app/features/model_selection/schemas.py               # MODIFY: CandidateModelInfo.enabled
+app/features/model_selection/service.py               # MODIFY: enabled settings-overlay
+app/features/model_selection/tests/test_capabilities.py  # MODIFY: overlay unit tests (patched settings)
+app/features/demo/tests/test_schemas.py               # MODIFY: new-field + legacy-frame + JSON-path tests
+app/features/demo/tests/test_pipeline.py              # MODIFY: selection/flag/split/metric/echo tests
+app/features/demo/tests/test_workspace.py             # MODIFY: run_config persistence (integration)
+app/features/demo/tests/test_models.py                # MODIFY: column roundtrip
+frontend/src/types/api.ts                             # MODIFY: DemoBacktestConfig, DemoRunRequest, WorkspaceListItem, CandidateModelInfo.enabled
+frontend/src/components/demo/run-config-utils.ts      # CREATE: defaults, isDefault*, buildTrainPlan, windowDays
+frontend/src/components/demo/run-config-utils.test.ts # CREATE
+frontend/src/components/demo/DemoBacktestSettingsForm.tsx       # CREATE (mirror champion form)
+frontend/src/components/demo/DemoBacktestSettingsForm.test.tsx  # CREATE
+frontend/src/components/demo/RunConfigPanel.tsx       # CREATE: collapsible section composing picker+form+preview
+frontend/src/components/demo/RunConfigPanel.test.tsx  # CREATE
+frontend/src/pages/showcase.tsx                       # MODIFY: state + dirty-rule + load/replay + panel mount
+frontend/src/components/demo/WorkspacePanel.tsx       # MODIFY: config summary line
+docs/_base/{API_CONTRACTS,DOMAIN_MODEL,RUNBOOKS}.md   # MODIFY: additive notes
+```
+
+### Design Decisions (locked — do not re-litigate during implementation)
+
+```text
+D1 — run_config is a DEDICATED nullable JSONB COLUMN, not an E1 story slot.
+     E1 (#407) defines six slots and assigns writers for all of them to E3/E5/
+     "later epics" — none is a run-config slot. The model set + backtest params
+     are REPLAY INPUTS (same class as the existing seed/scenario/reset/skip_seed
+     columns, models.py:65-69), not run-story output. So: one additive column
+     `run_config JSONB NULL`, written by create_workspace at insert time,
+     consumed by Load/Replay. config_schema_version is NOT bumped — E1 defines
+     it as the STORY-SLOT schema marker; run_config presence is detectable by
+     NULL-check and carries its own documented shape in DOMAIN_MODEL.md.
+     NOTE: the E1 PRP (~line 230) loosely names "E4 #410 run-config echo" as a
+     candidate writer for job_ids/phase_summaries — this PRP supersedes that
+     phrasing: neither slot is run-config-shaped; job_ids/phase_summaries
+     writing stays with the later parallel epics (E2 #408 / E5 #411).
+
+D2 — E1 #407 MUST merge before this epic's migration is authored.
+     E4's migration down_revision = the head AT IMPLEMENTATION TIME (E1's
+     revision). Discover with `uv run alembic heads` — do NOT hardcode
+     324a2fa37fcc (that is today's pre-E1 head).
+
+D3 — Flag exposure rides the EXISTING catalog endpoint.
+     CandidateModelInfo gains `enabled: bool = True`; the model_selection
+     SERVICE overlays get_settings() (lightgbm→forecast_enable_lightgbm,
+     xgboost→forecast_enable_xgboost, random_forest→forecast_enable_random_forest,
+     everything else True). capabilities.build_model_catalog stays pure/static
+     (its module docstring is a contract). No new /config endpoint.
+
+D4 — Selection semantics in step_backtest:
+     • train_model_types is None → BOTH branches byte-identical to today
+       (SHOWCASE_RICH single call include_baselines=True; legacy per-model loop).
+     • train_model_types provided → ONE unified per-model loop over
+       selection ∪ ({prophet_like} when scenario==SHOWCASE_RICH), each call
+       include_baselines=False; bucketed_aggregated_metrics captured from the
+       prophet_like call's main_model_results when present. prophet_like is
+       appended because v2_train trains/registers it unconditionally on
+       SHOWCASE_RICH — it must stay in the competition or the V2 story breaks.
+
+D5 — Winner metric: Literal["wape","mae","rmse"], default "wape", all
+     lower-is-better; _select_winner(results, metric=…) skips missing/NaN.
+     (smape/bias deliberately excluded — issue #410 names WAPE/MAE/RMSE.)
+
+D6 — Flag enforcement is fail-fast in step_train (clear detail naming the
+     flag), NOT in the Pydantic schema. Settings reads inside schemas caused
+     the documented ".env-bleed" test incidents (RUNBOOKS § Settings tests);
+     schemas validate only against the static KNOWN_MODEL_TYPES allow-list.
+
+D7 — Dirty-only start-frame inclusion: showcase.tsx omits train_model_types /
+     backtest keys when they equal the defaults (legacy trio + default split).
+     Untouched UI ⇒ byte-identical legacy frame (umbrella criterion).
+
+D8 — The configured horizon drives ONLY the modeling steps: step_train /
+     step_v2_train train-tail reservation and step_backtest split_config.
+     Planning/scenario steps keep DEMO_HORIZON (out of scope; document).
+```
+
+### Known Gotchas & Library Quirks
+
+```python
+# VERIFIED 2026-06-12 (re-run these on library/schema upgrades):
+#
+# 1. ALL 11 ModelConfig union members validate from a minimal {"model_type": X}:
+#    uv run python -c "
+#    from pydantic import TypeAdapter
+#    from app.features.forecasting.schemas import TrainRequest
+#    ta = TypeAdapter(TrainRequest.model_fields['config'].annotation)
+#    [ta.validate_python({'model_type': t}) for t in (
+#      'naive','seasonal_naive','moving_average','weighted_moving_average',
+#      'seasonal_average','trend_regression_baseline','regression',
+#      'prophet_like','lightgbm','xgboost','random_forest')]"
+#    → _model_config_payload can fall back to {"model_type": t} for new types.
+#    KEEP the explicit seasonal_naive(season_length=7)/moving_average(window_size=7)
+#    branches — they match schema defaults but are load-bearing for config_hash
+#    stability of existing registry rows.
+#
+# 2. "rmse" IS in aggregated_metrics (backtesting/metrics.py:349, PRP-36) —
+#    alongside mae/smape/wape/bias. Do not invent other keys.
+#
+# 3. forecast_enable_{lightgbm,xgboost,random_forest} all default False
+#    (app/core/config.py:118-120). lightgbm/xgboost are gated at the train
+#    ROUTE (BadRequestError 400, routes.py:76-86); random_forest only deep in
+#    the model factory (models.py:1761) → without the D6 pre-check a
+#    random_forest request fails with a less readable error. Also: flag ON but
+#    extra NOT installed still raises ImportError (catalog requires_extra badge covers the UI hint).
+#
+# 4. Pydantic strict mode: ConfigDict(strict=True) on DemoRunRequest is fine
+#    for the new fields — list[str] and a nested BaseModel validated from a
+#    JSON dict are allowed under strict (strict forbids primitive coercion,
+#    not dict→model validation). STILL add the JSON-path test
+#    (Model.model_validate({...nested dict...})) per the repo strict-mode
+#    policy (docs/_base/SECURITY.md, test_strict_mode_policy.py precedent).
+#    All new fields are JSON-native → no Field(strict=False) needed anywhere.
+#
+# 5. Demo windows are finite: demo_minimal/sparse = 92d, holiday_rush = 92d
+#    pinned, others = 180d (pipeline.py:513-538 _SCENARIO_SEED_PROFILE). An aggressive
+#    split (e.g. h=28, n_splits=5, min_train=60) CANNOT fit → backtest NaN /
+#    splitter error → step fail. This is a DOCUMENTED OUTCOME (same policy as
+#    the sparse preset's expected-fail, RUNBOOKS incident 28) — the backend
+#    must NOT silently clamp; the frontend shows the soft warning.
+#
+# 6. Feature-aware models (regression/prophet_like/…) train fine through
+#    POST /forecasting/train with V1 defaults (feature_frame_version=1) — the
+#    service builds the feature frame internally. Do NOT set
+#    feature_frame_version=2 in step_train; V2 stays step_v2_train's job.
+#    Expect noticeably longer wall-clock when selected (no budget gate change).
+#
+# 7. step_register reuses _model_config_payload(winner) and the winner's
+#    train_results model_path — both work unchanged for any selected winner.
+#    BUT registry _find_duplicate accumulation across repeated identical runs
+#    is a known trap (RUNBOOKS showcase incident 2) — unchanged risk profile,
+#    just more reachable configs now. No action needed; be aware of it.
+#
+# 8. The demo slice may NOT import app/features/model_selection (vertical-slice
+#    rule). The model allow-list source is app/shared/model_taxonomy.py.
+#    Add KNOWN_MODEL_TYPES there + a drift-lock test asserting it equals
+#    _MODEL_FAMILY_MAP.keys() (precedent: forecasting's
+#    test_model_family_map_covers_every_known_model_type).
+#
+# 9. capabilities.build_model_catalog is PURE by contract (docstring: "No DB,
+#    no I/O… deterministic and unit-tested directly"). The enabled overlay
+#    belongs in ModelSelectionService.get_model_catalog (D3) via
+#    item.model_copy(update={"enabled": …}).
+#
+# 10. WS error path: a ValidationError on the start frame becomes ONE error
+#     StepEvent then close (routes.py:188-191) — new-field validation failures
+#     surface there for free; assert it in test_routes.py.
+#
+# 11. Repo quirks: mixed CRLF/LF — check `git diff --stat` for whole-file
+#     noise before committing. `pnpm tsc --noEmit` is VACUOUS (solution-style
+#     tsconfig): rely on `pnpm lint` + `pnpm test --run`, and treat the real
+#     `tsc -b` as informational only (it has pre-existing failures on dev). A stale
+#     uvicorn can squat :8123 during Level 3/4 — check `ps etime` first.
+#     NEVER `docker compose down -v` (kills the Ollama models volume).
+```
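
Illustrative sketch, not part of the patch: Gotcha 5 can be checked with plain arithmetic. The inequality is the one Task 8 states for `splitFitWarning`, and the window sizes are the ones quoted in Gotcha 5. The Python names (`window_days_for`, `required_days`, `split_fits`) are hypothetical and exist only for this example.

```python
# Sketch only: mirrors the documented inequality
#   min_train_size + n_splits * (horizon + gap) > window_days
# Names below are hypothetical; the real check lives in the frontend helper.
WINDOW_DAYS = {"demo_minimal": 92, "sparse": 92, "holiday_rush": 92}
DEFAULT_WINDOW_DAYS = 180  # "others = 180d" per Gotcha 5


def window_days_for(scenario: str) -> int:
    """Return the seeded window length in days for a scenario."""
    return WINDOW_DAYS.get(scenario, DEFAULT_WINDOW_DAYS)


def required_days(horizon: int, n_splits: int, min_train_size: int, gap: int = 0) -> int:
    """Days of history the split configuration needs."""
    return min_train_size + n_splits * (horizon + gap)


def split_fits(scenario: str, horizon: int, n_splits: int,
               min_train_size: int, gap: int = 0) -> bool:
    """True when the split fits inside the scenario's seeded window."""
    return required_days(horizon, n_splits, min_train_size, gap) <= window_days_for(scenario)
```

With the aggressive example from Gotcha 5 (h=28, n_splits=5, min_train=60) the requirement is 200 days, which exceeds even the 180-day window; the demo defaults (h=14, n_splits=3, min_train=30) need 72 days and fit the 92-day window.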
+
+## Implementation Blueprint
+
+### Data models and structure
+
+```python
+# ── app/shared/model_taxonomy.py (additive) ──────────────────────────────────
+KNOWN_MODEL_TYPES: frozenset[str] = frozenset(_MODEL_FAMILY_MAP)
+# Public allow-list for request validation across slices. Drift-locked by test.
+
+# ── app/features/demo/schemas.py (additive) ──────────────────────────────────
+class DemoBacktestConfig(BaseModel):
+    """Backtest knobs for the showcase pipeline (E4 #410).
+
+    Bounds MIRROR app/features/backtesting/schemas.py:SplitConfig exactly —
+    the pipeline forwards them verbatim into POST /backtesting/run.
+    """
+    model_config = ConfigDict(strict=True)
+
+    horizon: int = Field(default=14, ge=1, le=90)
+    strategy: Literal["expanding", "sliding"] = "expanding"
+    n_splits: int = Field(default=3, ge=2, le=20)        # demo default 3, NOT SplitConfig's 5
+    min_train_size: int = Field(default=30, ge=7)
+    gap: int = Field(default=0, ge=0, le=30)
+    metric: Literal["wape", "mae", "rmse"] = "wape"      # D5
+
+    @model_validator(mode="after")
+    def _gap_lt_horizon(self) -> "DemoBacktestConfig":
+        if self.gap >= self.horizon:
+            raise ValueError(f"horizon ({self.horizon}) must be greater than gap ({self.gap})")
+        return self
+
+class DemoRunRequest(BaseModel):
+    # ...existing fields unchanged...
+    # E4 (#410): additive run-config. None → legacy DEMO_MODEL_TYPES +
+    # legacy split constants, byte-identical behavior.
+    train_model_types: list[str] | None = Field(default=None, min_length=1, max_length=10)
+    backtest: DemoBacktestConfig | None = None
+
+    @field_validator("train_model_types")
+    @classmethod
+    def _known_unique_models(cls, v: list[str] | None) -> list[str] | None:
+        if v is None:
+            return v
+        unknown = [m for m in v if m not in KNOWN_MODEL_TYPES]
+        if unknown:
+            raise ValueError(f"Unknown model type(s): {unknown!r}. Valid: {sorted(KNOWN_MODEL_TYPES)}")
+        if len(set(v)) != len(v):
+            raise ValueError("train_model_types contains duplicates")
+        return v
+
+class WorkspaceListItem(BaseModel):
+    # ...existing...
+    # E4 (#410): replay-input echo; None on default-config / pre-E4 rows.
+    run_config: dict[str, Any] | None = Field(default=None)
+
+# ── app/features/demo/models.py (additive column) ────────────────────────────
+# E4 (#410) — replay-input column (NOT an E1 story slot, see PRP D1):
+# {"train_model_types": [...], "backtest": {...}} via model_dump(mode="json");
+# NULL when the run used defaults.
+run_config: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+
+# ── app/features/demo/pipeline.py (resolved config) ──────────────────────────
+@dataclass(frozen=True)
+class ResolvedRunConfig:
+    """req.train_model_types/backtest with legacy defaults filled in."""
+    model_types: tuple[str, ...] = DEMO_MODEL_TYPES
+    horizon: int = DEMO_HORIZON
+    strategy: str = "expanding"
+    n_splits: int = DEMO_BACKTEST_SPLITS
+    min_train_size: int = DEMO_MIN_TRAIN_SIZE
+    gap: int = 0
+    metric: str = "wape"
+    customized: bool = False     # True when the request carried either field
+
+# DemoContext gains: run_config: ResolvedRunConfig = field(default_factory=ResolvedRunConfig)
+
+# ── app/features/model_selection/schemas.py (additive) ───────────────────────
+class CandidateModelInfo(BaseModel):
+    # ...existing...
+    enabled: bool = True   # E4 #410 — forecast_enable_* overlay (service-set)
+```
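
Illustrative sketch, not part of the patch: one way the `_resolve_run_config` helper named in Task 5 could fill in legacy defaults. It takes plain values instead of a `DemoRunRequest` so it runs without Pydantic. The constant values are assumptions inferred from the demo defaults quoted in this PRP (three baseline models, horizon 14, 3 splits, min train 30); the real module constants may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional

# ASSUMED values for the legacy constants; check pipeline.py for the real ones.
DEMO_MODEL_TYPES = ("naive", "seasonal_naive", "moving_average")
DEMO_HORIZON = 14
DEMO_BACKTEST_SPLITS = 3
DEMO_MIN_TRAIN_SIZE = 30

_BACKTEST_KEYS = ("horizon", "strategy", "n_splits", "min_train_size", "gap", "metric")


@dataclass(frozen=True)
class ResolvedRunConfig:
    model_types: tuple[str, ...] = DEMO_MODEL_TYPES
    horizon: int = DEMO_HORIZON
    strategy: str = "expanding"
    n_splits: int = DEMO_BACKTEST_SPLITS
    min_train_size: int = DEMO_MIN_TRAIN_SIZE
    gap: int = 0
    metric: str = "wape"
    customized: bool = False


def resolve_run_config(
    train_model_types: Optional[list[str]],
    backtest: Optional[dict[str, Any]],
) -> ResolvedRunConfig:
    """Hypothetical resolver: None/None keeps the legacy defaults untouched."""
    if train_model_types is None and backtest is None:
        return ResolvedRunConfig()
    # Either field present marks the run as customized.
    fields: dict[str, Any] = {"customized": True}
    if train_model_types is not None:
        fields["model_types"] = tuple(train_model_types)
    if backtest is not None:
        fields.update({k: backtest[k] for k in _BACKTEST_KEYS if k in backtest})
    return ResolvedRunConfig(**fields)
```

Keeping the default path as a bare `ResolvedRunConfig()` makes the "byte-identical when untouched" rule easy to assert in a unit test.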
+
+### Tasks (dependency-ordered)
+
+```yaml
+Task 0 — PRE-FLIGHT (read-only):
+  - VERIFY E1 #407 is merged: `gh issue view 407 --json state` + `uv run alembic heads`
+    (head must be E1's revision, NOT 324a2fa37fcc). If E1 is not merged: STOP —
+    this epic is Parallel-after-Foundation.
+  - RE-RUN the three verification commands in Known Gotchas 1-3.
+  - READ: PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md (slot
+    contract), pipeline.py:40-90/440-470/660-840, schemas.py (demo), showcase.tsx.
+
+Task 1 — shared taxonomy allow-list:
+  MODIFY app/shared/model_taxonomy.py:
+    - ADD `KNOWN_MODEL_TYPES: frozenset[str] = frozenset(_MODEL_FAMILY_MAP)` below the map,
+      with a docstring naming it the cross-slice request-validation allow-list.
+  CREATE/EXTEND app/shared/tests/test_model_taxonomy.py:
+    - test_known_model_types_matches_family_map (drift-lock: == set(_MODEL_FAMILY_MAP)).
+    - test_known_model_types_contains_demo_trio.
+
+Task 2 — demo schemas:
+  MODIFY app/features/demo/schemas.py:
+    - ADD DemoBacktestConfig (exact shape above; place it after the definitions
+      DemoRunRequest already depends on and BEFORE DemoRunRequest itself).
+    - ADD train_model_types + backtest to DemoRunRequest (after workspace_name block,
+      comment-tagged "E4 (#410)"); field_validator as above; import KNOWN_MODEL_TYPES
+      from app.shared.model_taxonomy.
+    - ADD run_config to WorkspaceListItem (detail inherits).
+  EXTEND app/features/demo/tests/test_schemas.py (mirror existing naming):
+    - test_demo_run_request_run_config_defaults_none
+    - test_demo_run_request_accepts_model_selection_json_path  # model_validate on plain dicts
+    - test_demo_run_request_rejects_unknown_model_type
+    - test_demo_run_request_rejects_duplicate_model_types
+    - test_demo_run_request_rejects_empty_and_oversized_selection  # [] and 11 entries
+    - test_demo_backtest_config_defaults_and_bounds              # n_splits=1→err, gap>=horizon→err
+    - test_demo_run_request_legacy_frame_still_validates         # EXTEND: assert new fields None
+    - test_workspace_list_item_run_config_round_trip
+
+Task 3 — ORM column + migration:
+  MODIFY app/features/demo/models.py: run_config column (snippet above) inside the
+    "Run configuration -- replay inputs" block; extend class docstring Attributes.
+  CREATE alembic/versions/<rev>_add_showcase_workspace_run_config.py:
+    - revision = autogen id; down_revision = OUTPUT OF `uv run alembic heads` (D2).
+    - upgrade: op.add_column("showcase_workspace", sa.Column("run_config",
+      postgresql.JSONB(astext_type=sa.Text()), nullable=True))
+    - downgrade: op.drop_column. MIRROR the E1 migration's structure/comments.
+  EXTEND app/features/demo/tests/test_models.py: run_config JSONB roundtrip +
+    NULL-default assertions (integration-marked, same pattern as existing).
+
+Task 4 — workspace write:
+  MODIFY app/features/demo/workspace.py:
+    - ADD module-level `def _run_config_payload(req: DemoRunRequest) -> dict[str, Any] | None`:
+      returns None when BOTH fields are None; else
+      {"train_model_types": req.train_model_types,
+       "backtest": req.backtest.model_dump(mode="json") if req.backtest else None}.
+    - create_workspace: pass run_config=_run_config_payload(req) into ShowcaseWorkspace(...).
+  EXTEND app/features/demo/tests/test_workspace.py:
+    - test_create_workspace_records_run_config (custom req → JSONB persisted verbatim)
+    - test_create_workspace_run_config_null_on_defaults
+
+Task 5 — pipeline:
+  MODIFY app/features/demo/pipeline.py:
+    - ADD ResolvedRunConfig dataclass (near DemoContext) + a
+      `def _resolve_run_config(req: DemoRunRequest) -> ResolvedRunConfig` helper.
+    - DemoContext: ADD run_config field (default_factory=ResolvedRunConfig).
+    - run_pipeline (2646): ctx = DemoContext(..., run_config=_resolve_run_config(req)).
+    - _model_config_payload (271): ADD fallback branch
+      `if model_type in KNOWN_MODEL_TYPES: return {"model_type": model_type}`
+      BEFORE the raise; keep existing explicit branches untouched (Gotcha 1).
+    - _select_winner (446): signature → (backtest_results, metric="wape");
+      replace metrics.get("wape") with metrics.get(metric). ONE production call
+      site (pipeline.py:820); existing tests keep passing via the default.
+    - step_train (669): iterate ctx.run_config.model_types; train tail uses
+      ctx.run_config.horizon; PREPEND fail-fast flag check (D6):
+        settings = get_settings()
+        _FLAG_BY_MODEL = {"lightgbm": settings.forecast_enable_lightgbm,
+                          "xgboost": settings.forecast_enable_xgboost,
+                          "random_forest": settings.forecast_enable_random_forest}
+        disabled = [m for m in ctx.run_config.model_types if _FLAG_BY_MODEL.get(m) is False]
+        if disabled: return ("fail", f"model(s) {disabled} requested but the matching "
+                             "forecast_enable_* flag is off — enable it or deselect", {...})
+      step data: ADD "requested_models": list(ctx.run_config.model_types).
+    - step_backtest (731): implement D4. Extract one
+      `_backtest_body(ctx, model_type, *, include_baselines)` helper building the
+      request body from ctx.run_config (split_config: strategy/n_splits/
+      min_train_size/gap/horizon). Branching:
+        if not ctx.run_config.customized: → EXISTING two branches verbatim.
+        else: loop over models = list(ctx.run_config.model_types)
+              + ([SHOWCASE_V2_MODEL_TYPE] if scenario is SHOWCASE_RICH and not
+                 already in selection else [])
+              each include_baselines=False; capture bucketed metrics from the
+              SHOWCASE_V2_MODEL_TYPE call when present.
+      winner = _select_winner(ctx.backtest_results, ctx.run_config.metric)
+      step data: ADD "metric": ctx.run_config.metric.
+    - step_v2_train (1021): train tail uses ctx.run_config.horizon (D8).
+    - run_pipeline pipeline_complete data (2758): ADD
+      "run_config": ({"train_model_types": ..., "backtest": {...}} if customized else None).
+  EXTEND app/features/demo/tests/test_pipeline.py (canned-_Client pattern):
+    - test_resolve_run_config_defaults_and_custom
+    - test_model_config_payload_minimal_fallback_for_all_known_types
+    - test_select_winner_honors_metric (+ NaN/missing skip per metric)
+    - test_step_train_trains_selected_models (capture POSTed bodies)
+    - test_step_train_fails_fast_on_disabled_flag (patch get_settings)
+    - test_step_backtest_sends_configured_split_config (assert body verbatim)
+    - test_step_backtest_custom_selection_appends_prophet_like_on_showcase_rich
+    - test_step_backtest_legacy_path_unchanged_when_not_customized
+    - test_pipeline_complete_echoes_run_config (+ None on legacy run)
+
+Task 6 — catalog enabled overlay:
+  MODIFY app/features/model_selection/schemas.py: CandidateModelInfo.enabled: bool = True
+    (comment: "E4 #410 — runtime forecast_enable_* overlay; service-set").
+  MODIFY app/features/model_selection/service.py get_model_catalog:
+    base = build_model_catalog(); settings = get_settings()
+    flag = {"lightgbm": settings.forecast_enable_lightgbm,
+            "xgboost": settings.forecast_enable_xgboost,
+            "random_forest": settings.forecast_enable_random_forest}
+    return ModelCatalogResponse(
+        models=[m.model_copy(update={"enabled": flag.get(m.model_type, True)}) for m in base.models],
+        default_candidate_model_types=base.default_candidate_model_types)
+  EXTEND model_selection tests (mirror existing catalog tests):
+    - test_catalog_enabled_false_when_flags_off (default settings)
+    - test_catalog_enabled_true_when_flag_on (patched settings)
+    - test_capabilities_stays_pure (build_model_catalog items default enabled=True)
+
+Task 7 — frontend types:
+  MODIFY frontend/src/types/api.ts:
+    - ADD `export interface DemoBacktestConfig` (horizon/strategy/n_splits/
+      min_train_size/gap + `metric: DemoRankingMetric`) and
+      `export type DemoRankingMetric = 'wape' | 'mae' | 'rmse'`.
+    - DemoRunRequest: + `train_model_types?: string[]`, `backtest?: DemoBacktestConfig`
+      (comment-tagged E4 #410, mirror the E1 comment style at 783-787).
+    - WorkspaceListItem: + `run_config?: Record<string, unknown> | null`.
+    - CandidateModelInfo: + `enabled: boolean`.
+
+Task 8 — frontend run-config building blocks:
+  CREATE frontend/src/components/demo/run-config-utils.ts:
+    - DEFAULT_TRAIN_MODELS = ['naive','seasonal_naive','moving_average']
+    - DEFAULT_BACKTEST: DemoBacktestConfig = {horizon:14, strategy:'expanding',
+      n_splits:3, min_train_size:30, gap:0, metric:'wape'}
+    - isDefaultSelection(models) / isDefaultBacktest(cfg) (order-insensitive for models)
+    - buildTrainPlan(models, scenario): {model_type, family?, v2?}[] — appends
+      'prophet_like (V2)' marker on showcase_rich (skip if already selected)
+    - windowDaysFor(scenario): 92 for demo_minimal/sparse/holiday_rush, 180 others
+      (source of truth: pipeline.py _SCENARIO_SEED_PROFILE:513-538 — keep a sync comment)
+    - splitFitWarning(cfg, scenario): string | null when
+      min_train_size + n_splits*(horizon+gap) > windowDaysFor(scenario)
+  CREATE run-config-utils.test.ts covering all of the above.
+  CREATE frontend/src/components/demo/DemoBacktestSettingsForm.tsx:
+    - MIRROR champion-selector/backtest-settings-form.tsx structure (Field
+      helper, metric Select, Collapsible advanced knobs), DIFFERENCES:
+      editable horizon Input (1-90), metrics wape/mae/rmse, REUSE
+      splitConfigErrors from '@/components/champion-selector/split-config'
+      (field names align), plus the splitFitWarning line (amber, non-blocking).
+  CREATE DemoBacktestSettingsForm.test.tsx (mirror champion form's test).
+  CREATE frontend/src/components/demo/RunConfigPanel.tsx:
+    - Collapsible "Run configuration (advanced)" (collapsed default; chevron
+      pattern from backtest-settings-form.tsx:125-137); props: scenario,
+      disabled, selection+onSelectionChange, backtest+onBacktestChange.
+    - Inside: CandidateModelPicker fed `{...catalog, models: catalog.models
+      .filter(m => m.enabled)}` from useModelCatalog() (REUSE both);
+      DemoBacktestSettingsForm; train-candidate preview (Badge chips from
+      buildTrainPlan + count line).
+    - "Reset to defaults" ghost button (restores DEFAULT_* values).
+  CREATE RunConfigPanel.test.tsx:
+    - opt-in models hidden when enabled=false (mock catalog)
+    - preview appends prophet_like on showcase_rich only
+    - reset restores defaults
+
+Task 9 — showcase page wiring:
+  MODIFY frontend/src/pages/showcase.tsx:
+    - state: trainModels (DEFAULT_TRAIN_MODELS), backtestCfg (DEFAULT_BACKTEST).
+    - handleRun: spread-only-when-dirty (D7), mirroring the existing
+      preservation spread (149-154):
+        ...(isDefaultSelection(trainModels) ? {} : {train_model_types: trainModels}),
+        ...(isDefaultBacktest(backtestCfg) ? {} : {backtest: backtestCfg}),
+    - handleLoadWorkspace: when ws.run_config present, repopulate
+      trainModels/backtestCfg (fallback to defaults for missing keys);
+      when absent, reset to defaults.
+    - handleReplayWorkspace: forward ws.run_config fields verbatim into start()
+      (same omit-when-null rule).
+    - Mount <RunConfigPanel/> inside the controls CardContent below the
+      flex-wrap control row (after line 363), disabled={isRunning}.
+    - Run button disabled when trainModels.length === 0 (picker enforces ≥1
+      anyway via toggle, belt-and-braces).
+  MODIFY frontend/src/components/demo/WorkspacePanel.tsx:
+    - rows with run_config render a compact summary line, e.g.
+      "custom: 4 models · rmse · 5×h21" (Badge 'custom config' + muted text).
+  EXTEND showcase/WorkspacePanel vitest specs:
+    - untouched controls → start() called WITHOUT the new keys (dirty rule)
+    - changed metric → start() includes backtest
+    - replay of a run_config workspace forwards it verbatim
+    - WorkspacePanel renders the custom-config badge only when run_config set.
+
+Task 10 — docs sweep (docs(docs): … (#410) or fold into the feat commits):
+  - docs/_base/API_CONTRACTS.md: POST /demo/run + WS /demo/stream rows — E4
+    (#410) additive fields (shape, defaults, validation, dirty-rule note);
+    GET /demo/workspaces run_config field; GET /model-selection/models
+    `enabled` field.
+  - docs/_base/DOMAIN_MODEL.md: showcase_workspace — run_config replay-input
+    column + D1 rationale sentence ("NOT a story slot; config_schema_version
+    unaffected").
+  - docs/_base/RUNBOOKS.md § Showcase: two numbered incidents — (a) train step
+    fails "forecast_enable_* flag is off"; (b) custom split too aggressive for
+    the seeded window → backtest fail is a documented outcome (cite incident 28
+    sparse precedent).
+```
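
Illustrative sketch, not part of the patch: the metric-aware winner selection described in Task 5. The result shape (model type mapped to a metrics dict) is an assumption; the rules it applies come from this PRP: lower error wins, and NaN or missing values are skipped for the chosen metric. `select_winner` is a hypothetical stand-in for the real `_select_winner`.

```python
import math
from typing import Mapping, Optional


def select_winner(
    backtest_results: Mapping[str, Mapping[str, float]],
    metric: str = "wape",
) -> Optional[str]:
    """Return the model type with the lowest finite value for `metric`.

    Models whose metric is missing or NaN are skipped; returns None when
    no model has a usable value. Ties keep the first model seen.
    """
    best_model: Optional[str] = None
    best_value = math.inf
    for model_type, metrics in backtest_results.items():
        value = metrics.get(metric)
        if value is None or math.isnan(value):
            continue
        if value < best_value:
            best_model, best_value = model_type, value
    return best_model
```

Because `metric` defaults to `"wape"`, existing callers that pass only the results keep their current behaviour.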
+
+### Integration Points
+
+```yaml
+DATABASE:
+  - migration: add nullable JSONB run_config to showcase_workspace
+  - NO index (read path is by workspace_id; config is display/replay payload)
+CONFIG:
+  - none added; READS forecast_enable_* via get_settings() in step_train (D6)
+    and model_selection service (D3). Never os.environ.
+ROUTES:
+  - none added. DemoRunRequest changes flow through POST /demo/run + WS
+    /demo/stream automatically; catalog field flows through GET /model-selection/models.
+FRONTEND DATA:
+  - useModelCatalog() (existing) powers the picker; no new hooks.
+COMMITS (every one references #410, no AI trailers):
+  - feat(api,db): showcase run-config start-frame contract + workspace column (#410)
+  - feat(api): honor run config in demo pipeline + catalog enabled overlay (#410)
+  - feat(ui): showcase run-config panel, preview, and replay wiring (#410)
+  - docs(docs): document showcase run-config contract (#410)
+```
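
Illustrative sketch, not part of the patch: the D6 fail-fast check from Task 5, isolated from the pipeline. `FlagSettings` is a hypothetical stand-in for the object returned by `get_settings()`; only the three flag names and their `False` defaults are taken from this PRP.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FlagSettings:
    """Hypothetical stand-in for get_settings(); defaults mirror Gotcha 3."""
    forecast_enable_lightgbm: bool = False
    forecast_enable_xgboost: bool = False
    forecast_enable_random_forest: bool = False


def disabled_models(model_types: Iterable[str], settings: FlagSettings) -> list[str]:
    """Return the requested models whose forecast_enable_* flag is off.

    Models without a flag (the baselines) map to None and are never
    reported, which is why the comparison is `is False`.
    """
    flag_by_model = {
        "lightgbm": settings.forecast_enable_lightgbm,
        "xgboost": settings.forecast_enable_xgboost,
        "random_forest": settings.forecast_enable_random_forest,
    }
    return [m for m in model_types if flag_by_model.get(m) is False]
```

A non-empty result is the condition under which `step_train` would return its `"fail"` tuple before issuing any training request.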
+
+## Validation Loop
+
+### Level 1 — Syntax & Style (after every task)
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/          # both --strict, both gate merge
+cd frontend && pnpm lint                          # NOTE: pnpm tsc --noEmit is vacuous (memory)
+```
+
+### Level 2 — Unit tests (no DB)
+
+```bash
+uv run pytest app/shared/tests/ app/features/demo/tests/ app/features/model_selection/tests/ -v -m "not integration"
+uv run pytest -v -m "not integration"             # full unit suite before push
+cd frontend && pnpm test --run                    # vitest incl. new specs
+```
+
+### Level 3 — Integration (real Postgres; respect [[fresh-stack-gate-procedure]] — no `down -v`)
+
+```bash
+docker compose up -d && uv run alembic upgrade head
+uv run alembic downgrade -1 && uv run alembic upgrade head   # migration round-trip
+uv run pytest app/features/demo/tests/ -v -m integration
+# Live contract probe (backend on :8123 — kill stale uvicorn first, check ps etime):
+curl -s -X POST http://localhost:8123/demo/run -H 'Content-Type: application/json' -d '{
+  "skip_seed": true, "preservation": "keep", "workspace_name": "e4-probe",
+  "train_model_types": ["naive", "seasonal_average"],
+  "backtest": {"horizon": 14, "n_splits": 3, "min_train_size": 30, "gap": 0,
+               "strategy": "expanding", "metric": "rmse"}}' | python3 -m json.tool
+# Expect: steps green, winner picked by rmse, data.run_config echoed, workspace_id set.
+curl -s "http://localhost:8123/demo/workspaces?limit=1" | python3 -m json.tool   # run_config on the row
+curl -s http://localhost:8123/model-selection/models | python3 -c "
+import json,sys; [print(m['model_type'], m['enabled']) for m in json.load(sys.stdin)['models']]"
+# Error paths:
+curl -s -X POST http://localhost:8123/demo/run -d '{"train_model_types":["bogus"]}' \
+  -H 'Content-Type: application/json' | head -c 300    # 422 problem+json
+```
+
+### Level 4 — Browser dogfood (MANDATORY — UI change; webapp-testing / agent-browser per ui-design.md; [[playwright-dogfood-needs-snap-chromium]] on this host)
+
+```bash
+# Backend :8123 + vite :5173 up, then drive /showcase:
+# 1. Expand "Run configuration (advanced)" — opt-in models absent with default flags.
+# 2. Select naive + seasonal_average, metric RMSE → preview shows 2 chips (+V2 only on showcase_rich).
+# 3. Tick "Save as workspace", name e4-dogfood, Run → pipeline green, train card
+#    shows the 2 requested models, summary winner consistent with RMSE.
+# 4. Saved-workspaces panel: row shows the custom-config badge; Load repopulates
+#    the panel controls; Replay re-runs verbatim (watch the WS frame in devtools).
+# 5. Run once with UNTOUCHED controls → WS start frame has NO new keys (devtools).
+# Capture screenshots for the PR.
+```
+
+## Final Validation Checklist
+
+- [ ] `uv run ruff check . && uv run ruff format --check .` clean
+- [ ] `uv run mypy app/ && uv run pyright app/` clean (strict)
+- [ ] `uv run pytest -v -m "not integration"` green
+- [ ] `uv run pytest -v -m integration` green on a fresh stack (reset first — [[integration-suite-shared-state-pollution]])
+- [ ] Migration upgrade + downgrade + re-upgrade clean on fresh DB
+- [ ] `cd frontend && pnpm lint && pnpm test --run` green
+- [ ] Level-3 curl probes match expectations (incl. 422 path)
+- [ ] Level-4 dogfood evidence captured (screenshots + WS frame byte-compat check)
+- [ ] Legacy-frame byte-compat test extended and green (umbrella criterion)
+- [ ] Docs updated (API_CONTRACTS, DOMAIN_MODEL, RUNBOOKS)
+- [ ] `git diff --stat` shows no CRLF whole-file noise
+- [ ] Commits `type(scope): … (#410)`, no AI trailers; PR into dev
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't add mid-run / per-phase re-entry of any kind — explicitly DEFERRED scope (brainstorm Round 5); the single `asyncio.Lock` linear stream is preserved.
+- ❌ Don't write run_config into an E1 story slot or bump `config_schema_version` (D1).
+- ❌ Don't import `app/features/model_selection` (or any sibling slice) from the demo slice — allow-list lives in `app/shared/model_taxonomy.py`.
+- ❌ Don't read settings inside Pydantic schemas (`.env`-bleed incident class) — flags are enforced in `step_train` and overlaid in the catalog service.
+- ❌ Don't make `capabilities.build_model_catalog` impure — overlay in the service.
+- ❌ Don't clamp/auto-fix an aggressive split server-side — fail honestly (sparse-preset policy precedent).
+- ❌ Don't send the new start-frame keys when the controls are untouched — byte-compat is a frozen criterion.
+- ❌ Don't hand-roll new UI primitives — reuse `CandidateModelPicker`, mirror `BacktestSettingsForm`, shadcn components only (`.claude/rules/shadcn-ui.md`).
+- ❌ Don't weaken or touch `test_leakage.py`, merged migrations, or the champion-selector's existing behavior beyond the additive `enabled` field.
+
+---
+
+## Confidence Score: 8.5/10
+
+One-pass implementation likelihood. **+** Every contract was read and runtime-verified today (minimal model-config payloads, rmse key, flag names/defaults, SplitConfig bounds, catalog purity, start-frame parse path); the additive-field, migration, and byte-compat patterns have three shipped precedents in this exact slice (PRP-38 scenario, E1 #390, E2 #391); the frontend reuses two existing, tested components. **−0.5** D4's unified-loop branch in `step_backtest` is the one genuinely new control-flow path (showcase_rich + custom selection interplay with bucketed metrics). **−0.5** Pre-flight dependency: E1 #407 must merge first and its final migration revision id is unknowable today (mitigated by the `alembic heads` instruction in Task 0/D2).
diff --git a/PRPs/PRP-showcase-completion-E5-agent-rag-story-capture.md b/PRPs/PRP-showcase-completion-E5-agent-rag-story-capture.md
new file mode 100644
index 00000000..f522467b
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E5-agent-rag-story-capture.md
@@ -0,0 +1,1185 @@
+name: "PRP — Showcase Completion E5: Agent/HITL + RAG Story Capture (issue #411)"
+description: |
+
+## Purpose
+
+Implement Parallel epic E5 of the showcase-completion initiative (umbrella #406):
+persist the HITL approval story (decision approved/rejected/timed_out, action ids,
+tool-call summary, transcript summary) into the workspace row's `approval_events`
+slot; add a **Reject** button to the Showcase HITL step card alongside Approve —
+and make both genuinely clickable by streaming the intermediate
+`awaiting_approval` event DURING the decision window (today it flushes only after
+the step ends, so the button can never render in time); render approval history
+on Showcase and `/ops`; capture RAG events (probe/index/retrieve with provider
+state) into `rag_events`; and mark on replay whether the knowledge/agent story
+was reproduced. Capture is warn-and-continue — it must never fail a green
+pipeline. **No widening of `agent_require_approval`. No agents-slice changes.**
+
+## Core Principles
+
+1. **Context is King**: every reference below was verified against live code on 2026-06-12 (branch `dev` @ `bdf85f6`).
+2. **Validation Loops**: each level is executable as written.
+3. **Information Dense**: patterns cite exact file:line.
+4. **Progressive Success**: hitl relay module → pipeline capture → workspace writes → routes → frontend → tests → docs.
+5. **Global rules**: follow CLAUDE.md / AGENTS.md; all five CI gates must pass; all changes ADDITIVE.
+
+---
+
+## ⛔ BLOCKED BY — E1 #407 (Foundation)
+
+This epic writes the `approval_events` + `rag_events` JSONB story slots that the
+E1 migration (`PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md`)
+creates, reads `replayed_from_workspace_id` for the reproduction marker, and
+follows E1's frozen Decisions (slot-per-column, soft references, documented slot
+schema, `config_schema_version` bump rule). **Do not start until E1 #407 is
+merged to `dev`.** Verify before branching:
+
+```bash
+gh issue view 407 --json state          # must be CLOSED
+grep -n "approval_events\|rag_events\|replayed_from_workspace_id" app/features/demo/models.py
+# all three column names must exist on ShowcaseWorkspace
+```
+
+If E1 landed with deviations from its PRP (column names, slot shapes, response
+fields), **the merged code wins** — re-anchor the blueprint below to it.
+
+---
+
+## Goal
+
+A `showcase_rich` keep-run records its agent and knowledge story on the
+workspace row, the operator can genuinely approve OR reject the HITL action
+from the step card, and the story is visible afterwards:
+
+- **`approval_events` capture**: `step_agent_hitl_flow` appends one entry per
+  resolved approval (operator approve, operator reject, window-lapse
+  auto-approve, hard timeout) carrying the E1-frozen base keys plus E5's
+  documented additive keys (auto_approved, reason, execution_status,
+  tool_call_summary, transcript_summary, tokens_used, tool_calls_count).
+  `finalize_workspace` writes the list to the row (warn-and-continue).
+- **`rag_events` capture**: the three knowledge steps
+  (`embedding_provider_probe`, `rag_index_subset`, `rag_retrieve_probe`) append
+  one entry each — event kind, status, detail, count, provider state, timestamp.
+- **Interactive Reject (and a real Approve)**: a new in-demo decision relay —
+  `POST /demo/hitl-decision` + a single-slot in-memory store — makes the
+  PIPELINE the sole caller of `/agents/sessions/{id}/approve`. The step card's
+  Approve/Reject buttons relay operator intent through the demo slice; the
+  pipeline forwards the real decision to the agents HITL gate. The decision
+  window grows 3 s → 10 s so a human can actually click.
+- **Timely intermediate events**: `run_pipeline` drains the intermediate-event
+  sink concurrently with the in-flight step (today it drains only after the
+  step returns — `pipeline.py:2701-2715` — so the FE sees `awaiting_approval`
+  only after the auto-approve already fired).
+- **Approval history surfaces**: `GET /demo/approval-events` flattens recent
+  workspaces' `approval_events` newest-first; the `/ops` page renders it as an
+  "Approval History" table (frontend-only — no ops-slice backend change); the
+  Showcase loaded-workspace view renders the full story (approval events + RAG
+  events + reproduction marker).
+- **Replay reproduction marker**: on a replay keep-run
+  (`replayed_from_workspace_id` set), `finalize_workspace` compares the source
+  row's story slots against the new run's capture and records
+  `result_summary.story_reproduction = {"agent": ..., "knowledge": ...,
+  "source_workspace_id": ...}` with values
+  `reproduced | not_reproduced | not_applicable | unknown`.
+
+A run/request without the new surfaces behaves byte-identically (ephemeral
+runs, `demo_minimal`/`sparse` runs, legacy WS frames). **No Alembic migration**
+— E1 shipped every column E5 touches.
+
+**Deliverable** (all additive):
+
+- `app/features/demo/hitl.py` — NEW single-slot in-memory decision relay
+  (register / wait / resolve / clear), safe under the single-flight pipeline lock.
+- `app/features/demo/pipeline.py` — DemoContext `approval_events`/`rag_events`
+  accumulators; `step_agent_hitl_flow` rework (decision window, relay wait,
+  reject path, event entry); RAG-event appends in the three knowledge steps;
+  concurrent intermediate-event drain in `run_pipeline`.
+- `app/features/demo/workspace.py` — `finalize_workspace` writes both slots +
+  `story_reproduction`; NEW `list_approval_events` helper.
+- `app/features/demo/schemas.py` — `HitlDecisionRequest`,
+  `ApprovalEventItem`, `ApprovalEventsResponse`.
+- `app/features/demo/routes.py` — `POST /demo/hitl-decision`,
+  `GET /demo/approval-events`.
+- `app/features/demo/models.py` — `config_schema_version` ORM default 1 → 2
+  (slot-shape delta; E1 Decision 6 rule) + slot-schema comment delta.
+- Frontend — `HitlDecisionButtons` (Approve + Reject) on the step card;
+  `WorkspaceStoryPanel` on Showcase; "Approval History" section on `/ops`;
+  `use-approval-events` hook; types.
+- Tests: hitl-relay unit tests, HITL-step path tests, drain-ordering test,
+  RAG-event capture tests, route tests, finalize/reproduction integration
+  tests, FE component/hook tests.
+- Docs: `docs/_base/API_CONTRACTS.md`, `docs/_base/DOMAIN_MODEL.md` (slot-schema
+  v2 delta), `docs/_base/RUNBOOKS.md` (HITL incidents 23-25 + workspace section).
+
+**Success definition**: all Success Criteria below check off; five CI gates
+green; integration suite green; a manual `showcase_rich` keep-run lets the
+operator click **Reject** within the 10 s window, the run stays green, the
+workspace row carries the rejected `approval_events` entry + three `rag_events`
+entries, `/ops` lists the event, and a Replay of that workspace records a
+`story_reproduction` marker.
+
+## Why
+
+- Umbrella #406 success criterion: "HITL approval decisions (approve AND the
+  new Reject path) and RAG events are captured on the workspace row and
+  rendered as history on Showcase and /ops".
+- The workspace row today records WHAT a run created but not the agent/HITL or
+  knowledge STORY — the demo's most distinctive moments are unrecoverable
+  after the run ends (RUNBOOKS § Showcase workspace, "Explicitly out of scope":
+  "RAG-event and approval-decision capture on the workspace row" — this epic).
+- The PRP-41 Approve button is effectively decorative: the intermediate
+  `awaiting_approval` event is buffered in a plain list that `run_pipeline`
+  drains only AFTER the step function returns (`pipeline.py:2660-2715`), and the
+  step auto-approves after a 3 s sleep — so the browser learns about the
+  approval window only once it has closed. E5's Reject button is meaningless
+  without fixing this.
+- No approval audit trail exists anywhere today: `AgentService.approve_action`
+  clears `pending_action`, logs, and returns — nothing durable records the
+  decision (`app/features/agents/service.py:825-907`). E5 is the first capture
+  (brainstorm Round 5, `.flow/brainstorm-log.md`).
+
+## What
+
+### User-visible behavior
+
+- The HITL step card on `/showcase` (scenario `showcase_rich`) shows **Approve**
+  and **Reject** buttons while awaiting, with a live "auto-approve in Ns"
+  countdown (10 s window). Either click resolves the action; no click
+  auto-approves at window end. A reject keeps the pipeline GREEN — the step
+  passes with detail `rejected by operator`, and the gated `save_scenario`
+  never executes (no scenario_plan row is written).
+- `POST /demo/hitl-decision` accepts `{action_id, decision: "approved"|"rejected",
+  reason?}` and returns `204` on success; `404 application/problem+json` when no
+  matching action is pending; `409` when the action was already decided; `422`
+  on a malformed body.
+- `GET /demo/approval-events?limit=N` returns recent approval events flattened
+  across saved workspaces, newest-workspace-first; `200` + empty list when none.
+- The `/ops` page gains an "Approval History" card (table: decision badge, tool,
+  workspace, transcript snippet, when). The Showcase loaded-workspace view gains
+  a story panel: approval events, RAG events (with provider state), and — on
+  replay rows — a "story reproduced / not reproduced" marker.
+- Ephemeral runs and `demo_minimal` / `sparse` runs are unchanged; legacy WS
+  start frames are byte-identical (no new request fields on `DemoRunRequest`).
+
+### Technical requirements
+
+- **No agents-slice changes.** The pipeline remains the only writer of the
+  approve POST in the showcase path; `agent_require_approval` is untouched;
+  no agents migration, no `AgentSession` column. (The durable per-session
+  approval audit is deliberately deferred — see Decisions D8.)
+- **No Alembic migration** — E1 (#407) shipped `approval_events`, `rag_events`,
+  `replayed_from_workspace_id`, `config_schema_version`.
+- **Warn-and-continue invariant**: all capture writes ride inside the existing
+  `finalize_workspace` try/except (`workspace.py:147-154`); a capture failure
+  must never break a green run. ctx accumulators always append in-memory (cheap,
+  cannot fail); only the DB write is fallible.
+- **Single-flight safety**: the in-memory decision relay is correct because at
+  most one pipeline runs per process (`service.py:19` `_pipeline_lock`) and the
+  HITL step registers at most one pending action per run. The relay is
+  module-level state in the demo slice (precedent: `_pipeline_lock`).
+- **Vertical slice**: all backend changes inside `app/features/demo/`; the
+  `/ops` approval-history surface is FRONTEND-ONLY (the ops page queries the
+  demo endpoint — no ops-slice import of demo code, no cross-slice edge).
+- RFC 7807 errors only — `NotFoundError` / `ConflictError` from
+  `app/core/exceptions.py` (demo routes precedent, `routes.py:34,76,134`).
+- Pydantic v2 `ConfigDict(strict=True, extra="forbid")` on `HitlDecisionRequest`
+  (HTTP-only body; all fields JSON-native → no `Field(strict=False)`; the AST
+  policy walker `app/core/tests/test_strict_mode_policy.py` only fires on
+  date/datetime/time/UUID/Decimal).
+- `StepEvent` data additions are additive dict keys only (legacy clients ignore
+  unknown keys — the WS forward-compat contract).
+
+### Success Criteria
+
+- [ ] `run_pipeline` yields buffered intermediate events while the step is
+  still executing: an orchestrator-level test proves the `awaiting_approval`
+  event is received BEFORE the HITL step's terminal `step_complete` in wall
+  time (not just stream order).
+- [ ] Operator approve within the window → approve POST `approved=true`,
+  `approval_events` entry `decision="approved"`, `auto_approved=false`.
+- [ ] Operator reject within the window → approve POST `approved=false`, step
+  terminal `pass` with detail `rejected by operator`, entry
+  `decision="rejected"` (+ optional `reason`), pipeline green, NO scenario_plan
+  row written by the agent.
+- [ ] No decision → auto-approve at 10 s, entry `decision="approved"`,
+  `auto_approved=true`. Hard timeout (90 s) → entry `decision="timed_out"`,
+  step skips (existing semantics preserved).
+- [ ] Each knowledge step appends exactly one `rag_events` entry on every
+  outcome path (pass / warn / skip / auth-skip), carrying `provider` state.
+- [ ] `finalize_workspace` writes both slots (NULL when empty — never `[]`),
+  and on a replay row writes `result_summary.story_reproduction` with the
+  documented values incl. `unknown` for a dangling source.
+- [ ] `POST /demo/hitl-decision`: 204 happy path; 404 no-pending; 409
+  already-decided; 422 bad body (each error response is problem+json).
+- [ ] `GET /demo/approval-events`: 200 + empty list on empty table; flattened
+  entries carry `workspace_id` / `workspace_name`.
+- [ ] FE: Reject button renders alongside Approve, both POST the demo relay via
+  `lib/api.ts` `api()` (not bare `fetch`), countdown reads
+  `data.decision_window_s`; `/ops` Approval History table renders; Showcase
+  story panel renders events + reproduction marker.
+- [ ] Legacy byte-compat: `DemoRunRequest` unchanged; `demo_minimal`/`sparse`
+  emit no relay events and write no slots; `config_schema_version` ORM default
+  is 2 (new rows) while old rows keep 1.
+- [ ] `uv run ruff check . && uv run ruff format --check . && uv run mypy app/
+  && uv run pyright app/ && uv run pytest -v -m "not integration"` green;
+  integration suite green; `cd frontend && pnpm lint && pnpm test --run` green
+  with no NEW `tsc -b` errors vs the dev baseline.
+
+## Decisions (the open questions this PRP resolves)
+
+> Frozen for execution. E7 (release gate) authors: consume, don't re-decide.
+
+1. **D1 — Decision relay: the pipeline is the SOLE approver.** The FE buttons
+   POST `/demo/hitl-decision` (demo slice, in-memory single-slot store); the
+   HITL step waits on the relay up to the window, then POSTs
+   `/agents/sessions/{id}/approve` with the operator's decision (or
+   `approved=true` on window lapse). Rationale: `approve_action` persists NO
+   decision record (`agents/service.py:868-871` clears `pending_action`, logs,
+   returns), so the PRP-41 pattern — FE pre-empts the agents endpoint directly
+   and the pipeline absorbs the 4xx as "executed" (`pipeline.py:2357-2366`) —
+   cannot distinguish an FE approve from an FE reject. Routing intent through
+   the demo slice gives the pipeline ground truth with zero agents-slice
+   changes and zero migrations. The 4xx-absorb stays as belt-and-braces for an
+   operator curl-ing `/agents/.../approve` directly mid-run (recorded as
+   `decision="approved"`, `execution_status="external_4xx"` — honest about the
+   residual ambiguity).
+2. **D2 — Concurrent intermediate-event drain in `run_pipeline`.** Replace
+   `await fn(ctx, client)` with `task = asyncio.ensure_future(fn(ctx, client))`
+   plus a `while` loop that awaits `asyncio.wait({task}, timeout=0.25)` and
+   flushes the sink each tick (stamping index/phase fields exactly as the
+   existing post-step drain does, `pipeline.py:2707-2714`). This is NOT a pipeline
+   re-architecture: steps still run strictly one at a time under the same lock;
+   only event flushing overlaps the in-flight step. The post-step drain block
+   stays (final flush). Exception mapping moves to `task.result()` inside the
+   same try/except ladder (`pipeline.py:2681-2699`).
+3. **D3 — Decision window 10 s.** `_APPROVAL_DISPLAY_DELAY_S = 3.0`
+   (`pipeline.py:317`) is replaced by `_APPROVAL_DECISION_WINDOW_S = 10.0`.
+   3 s is unclickable by a human; 10 s keeps the showcase brisk (well under the
+   90 s hard timeout and the 180 s soft budget) and is emitted to the FE as
+   `data.decision_window_s` so the countdown never hardcodes it.
+4. **D4 — Slot-schema delta ⇒ `config_schema_version` ORM default 1 → 2.**
+   E5 widens the E1-frozen `approval_events.decision` enum
+   (`"approved"|"rejected"` → `+"timed_out"`), adds additive entry keys, and
+   adds `"probe"` to the `rag_events.event` enum + additive keys. Per E1
+   Decision 6 ("any epic that changes a documented slot shape bumps the ORM
+   default and documents the delta") this is a bump; E3's PRP explicitly does
+   NOT bump (its CONTRACT(E1) note: populating verbatim ≠ shape change), so no
+   collision is expected — but if another parallel epic bumped first, take the
+   next integer and update DOMAIN_MODEL accordingly. ORM `default=` only —
+   `server_default` stays `text("1")` (no migration; old rows legitimately
+   read 1).
+5. **D5 — Reject keeps the pipeline GREEN.** A human rejection is a SUCCESSFUL
+   demonstration of the HITL gate, not an error: terminal `("pass", "rejected
+   by operator", {..., "approval_decision": "rejected"})`. Only transport/5xx
+   failures keep the existing skip semantics. `step_cleanup` still closes the
+   session either way.
+6. **D6 — Approval history endpoint lives in the DEMO slice.** "Render on
+   /ops" is a frontend statement: the ops PAGE queries
+   `GET /demo/approval-events`. Putting the endpoint in the ops slice would
+   force a cross-slice demo import for data the demo slice owns
+   (`showcase_workspace.approval_events`). Flattening is Python-side over the
+   newest ≤50 rows with a non-NULL slot — a low-cardinality audit table; no
+   `jsonb_array_elements` SQL needed.
+7. **D7 — Reproduction marker lives in `result_summary`** (an existing
+   demo-owned JSONB whose shape is not E1-frozen), NOT a new column and not a
+   slot entry: `{"story_reproduction": {"agent": V, "knowledge": V,
+   "source_workspace_id": str}}` with `V ∈ reproduced | not_reproduced |
+   not_applicable | unknown`. `agent`: source row had ≥1 approval event →
+   compare with the new run (`reproduced`/`not_reproduced`); source had none →
+   `not_applicable`; source row missing (soft reference dangles) → `unknown`.
+   `knowledge`: same logic over `rag_events` entries whose `event` is
+   `index`/`retrieve` with `status != "skip"`. Computed inside
+   `finalize_workspace` (one extra `get`-by-id select in the same session,
+   inside the existing warn-and-continue try).
+8. **D8 — No durable approval audit on `agent_session` (deferred).** The
+   architecturally complete fix (an `approval_history` JSONB on the agents
+   aggregate) needs an agents migration + schema surface — out of this epic
+   per the umbrella approach ("additive-only delta on the existing demo +
+   seeder slices") and the epic's own scope line ("No widening of
+   agent_require_approval"; agents untouched). If E7 review wants it, it is a
+   follow-up issue, not scope creep here.
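+
+The drain described in D2, combined with the cancel-on-any-exit guard that the
+Stop-button gotcha in Known Gotchas calls for, can be sketched as a small async
+generator. `drain_step`, the plain-list `sink`, and the string events are
+stand-ins invented for illustration; the real loop would stamp index/phase
+fields on `StepEvent` objects and keep the existing except ladder around
+`task.result()`.
+
+```python
+from __future__ import annotations
+
+import asyncio
+from collections.abc import AsyncIterator, Awaitable, Callable
+
+
+async def drain_step(
+    fn: Callable[[list[str]], Awaitable[str]],
+    sink: list[str],
+    tick_s: float = 0.25,
+) -> AsyncIterator[str]:
+    """Run one step as a task and yield buffered events while it executes."""
+    task = asyncio.ensure_future(fn(sink))
+    try:
+        while not task.done():
+            # Wake at least once per tick so buffered events reach the
+            # consumer before the step itself returns.
+            await asyncio.wait({task}, timeout=tick_s)
+            while sink:
+                yield sink.pop(0)
+        while sink:  # final flush, mirroring the retained post-step drain
+            yield sink.pop(0)
+        # task.result() re-raises whatever the step raised, so the caller's
+        # except ladder still sees the same exception types.
+        yield f"terminal:{task.result()}"
+    finally:
+        # Runs on GeneratorExit, CancelledError, or any raise, so the
+        # in-flight step is never left orphaned.
+        if not task.done():
+            task.cancel()
+```
+
+Steps still run one at a time here; only the flushing overlaps the in-flight
+step, which is the property the drain-ordering test is meant to pin.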
+
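+One possible reading of the D7 rules, written over plain dicts. This is a
+sketch under a stated assumption: "compare with the new run" is taken to mean
+"the new run also produced at least one qualifying event", which the text
+above does not spell out. `story_reproduction`, `_verdict`, and `_counted_rag`
+are hypothetical helper names.
+
+```python
+from __future__ import annotations
+
+from typing import Any
+
+
+def _verdict(source_had: bool, new_has: bool) -> str:
+    """Map presence in source/new run to a marker value (assumed rule)."""
+    if not source_had:
+        return "not_applicable"
+    return "reproduced" if new_has else "not_reproduced"
+
+
+def _counted_rag(events: list[dict[str, Any]] | None) -> bool:
+    """True when any index/retrieve event has a status other than skip."""
+    return any(
+        e.get("event") in ("index", "retrieve") and e.get("status") != "skip"
+        for e in events or []
+    )
+
+
+def story_reproduction(
+    source_row: dict[str, Any] | None,
+    source_workspace_id: str,
+    new_approval_events: list[dict[str, Any]],
+    new_rag_events: list[dict[str, Any]],
+) -> dict[str, str]:
+    """Build the result_summary.story_reproduction payload."""
+    if source_row is None:  # the soft reference dangles
+        agent = knowledge = "unknown"
+    else:
+        agent = _verdict(
+            bool(source_row.get("approval_events")), bool(new_approval_events)
+        )
+        knowledge = _verdict(
+            _counted_rag(source_row.get("rag_events")), _counted_rag(new_rag_events)
+        )
+    return {
+        "agent": agent,
+        "knowledge": knowledge,
+        "source_workspace_id": source_workspace_id,
+    }
+```
+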
+### Assumptions (explicit, decided without user input)
+
+- `tool_call_summary` carries `{"description": str, "arguments_keys":
+  list[str]}` from `pending_action` — argument KEYS only, never values
+  (security-patterns.md: never echo full payloads; values may embed
+  user-supplied text).
+- `transcript_summary` is the agent's chat `message` truncated to 200 chars
+  (precedent: the #335 failure-detail 300-char cap).
+- The relay rejects decisions for an `action_id` that is not the registered
+  one with 404 (not 409): a mismatched id is "nothing pending under that id".
+- `GET /demo/approval-events` scans the newest 50 workspace rows with a
+  non-NULL slot and caps the flattened list at `limit` (1-200, default 50).
+  No offset/pagination — audit-glance surface, not a browse API.
+- The live-run Showcase surface for history is the step card itself (terminal
+  detail + `HitlFlowSummary`); the story PANEL renders for loaded workspaces.
+- FE buttons disable after either click; 404/409 responses are absorbed
+  silently (the auto-approve raced) — same UX contract as the PRP-41 button.
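+
+The first, second, and fourth assumptions above can be sketched with plain
+dicts. Treating `pending_action` and the workspace rows as dicts, sorting the
+argument keys, and the helper names themselves are assumptions made for
+illustration; the real code works on Pydantic / ORM objects.
+
+```python
+from __future__ import annotations
+
+from typing import Any
+
+_TRANSCRIPT_CAP = 200  # chars, per the transcript_summary assumption
+
+
+def tool_call_summary(pending_action: dict[str, Any]) -> dict[str, Any]:
+    """Keep the description and argument KEYS only; values are never copied."""
+    arguments = pending_action.get("arguments") or {}
+    return {
+        "description": str(pending_action.get("description", "")),
+        "arguments_keys": sorted(arguments),  # ordering is an assumption
+    }
+
+
+def transcript_summary(message: str) -> str:
+    """Truncate the agent's chat message to the 200-char cap."""
+    return message[:_TRANSCRIPT_CAP]
+
+
+def flatten_events(
+    rows: list[dict[str, Any]], limit: int = 50
+) -> list[dict[str, Any]]:
+    """Flatten per-workspace approval events, newest workspace first.
+
+    `rows` is assumed to arrive sorted newest-first, each carrying
+    workspace_id / name / approval_events.
+    """
+    flat: list[dict[str, Any]] = []
+    for row in rows:
+        for entry in row.get("approval_events") or []:
+            flat.append(
+                {
+                    **entry,
+                    "workspace_id": row["workspace_id"],
+                    "workspace_name": row["name"],
+                }
+            )
+            if len(flat) >= limit:
+                return flat
+    return flat
+```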
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# MUST READ — codebase patterns (verified 2026-06-12, dev @ bdf85f6)
+
+- file: PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
+  why: |
+    THE frozen upstream contract: slot columns + per-slot documented schemas
+    (approval_events / rag_events base keys), Decision 2 (slot-per-column),
+    Decision 5 (E5 writes these two slots), Decision 6 (config_schema_version
+    bump rule), and "Notes for parallel-epic PRP authors" (warn-and-continue
+    for pipeline-time slot writes; HTTP writes go through caller-owned-session
+    helpers). Re-verify the merged code matches before relying on line numbers.
+
+- file: app/features/demo/pipeline.py
+  why: |
+    THE file you rework. _Client.__init__ event_sink @136-155 and
+    yield_event @163-174 (plain-list sink, silently dropped when None);
+    DemoContext @213-263 (PRP-41 approval fields @254-257 — the comment style
+    your new accumulator fields follow); _llm_key_present @289;
+    HITL constants @314-322 (_APPROVAL_DISPLAY_DELAY_S=3.0 @317 — replaced;
+    _APPROVAL_HARD_TIMEOUT_S=90.0 @318 — kept; _HITL_PROMPT @319);
+    _embedding_provider_reachable @390; _is_embedding_auth_error @431;
+    step_embedding_provider_probe @1449-1468; step_rag_index_subset
+    @1471-1525 (note the auth-skip path @1493-1501); step_rag_retrieve_probe
+    @1528-1576 (warn-on-zero-hits @1552-1560); step_agent_hitl_flow
+    @2192-2394 (every outcome path you extend — skip-no-key @2222, skip-no-
+    pending @2269, intermediate event @2295-2318, display sleep @2320-2324,
+    hard-timeout @2326-2341, approve POST + 4xx absorb @2343-2377, terminal
+    @2381-2394); _phase_table @2528 (knowledge steps @2589-2593, agents step
+    @2560-2564 — registry unchanged in E5); run_pipeline @2618-2771
+    (intermediate_events buffer @2660-2663, step await + except ladder
+    @2681-2699, post-step drain @2701-2715 — THE BLOCK D2 generalizes,
+    finalize hook @2744-2747).
+
+- file: app/features/demo/workspace.py
+  why: |
+    finalize_workspace @106-155 — the warn-and-continue write you extend with
+    the two slot assignments + story_reproduction (whole-value assignment,
+    inside the existing try). get_workspace @158-171 (reuse the select shape
+    for the source-row read INSIDE finalize's own session — do NOT call
+    get_workspace, it takes a caller-owned session). list_workspaces @174-196
+    — the newest-first select your list_approval_events mirrors (add
+    .where(ShowcaseWorkspace.approval_events.isnot(None))).
+    CONTRACT in module docstring @10-13: create/finalize swallow all errors.
+
+- file: app/features/demo/service.py
+  why: |
+    _pipeline_lock @19 — the single-flight guarantee that makes the in-memory
+    relay safe. PipelineBusyError @22 + the 409 mapping (routes.py:74-77) —
+    the error-translation pattern the hitl-decision route mirrors.
+
+- file: app/features/demo/routes.py
+  why: |
+    Router you extend. delete_showcase_workspace @138-163 — NotFoundError
+    shape; run_demo_pipeline @74-77 — ConflictError shape; list_showcase_
+    workspaces @80-107 — Query(ge/le) param + list response shape for
+    GET /demo/approval-events; WS handler @166-194 (unchanged in E5).
+
+- file: app/features/demo/schemas.py
+  why: |
+    DemoRunRequest @29-85 (UNCHANGED in E5 — no new request fields);
+    StepEvent @88-127 (data is dict[str, Any] — additive keys free);
+    WorkspaceListItem @169-189 / WorkspaceDetailResponse @192-203 — E1 adds
+    the slot fields to Detail; E5 only READS them. New models follow the
+    response-model split: plain BaseModel, from_attributes only where built
+    from ORM rows (ApprovalEventItem is built from dicts — no from_attributes).
+
+- file: app/features/demo/models.py
+  why: |
+    ShowcaseWorkspace — after E1: config_schema_version (ORM default you bump
+    to 2), approval_events / rag_events slot columns + the documented
+    per-slot schema comments (extend with the E5 delta; DOMAIN_MODEL carries
+    the authoritative copy).
+
+- file: app/features/agents/schemas.py
+  why: |
+    PendingAction @170-190 (action_id / action_type / description / arguments
+    — the tool_call_summary source); ApprovalRequest @192-206 (action_id,
+    approved: bool, reason ≤500 — REJECT ALREADY EXISTS in the agents API;
+    the pipeline just sends approved=false); ApprovalResponse @208+ (status:
+    "executed"|"rejected"|"expired" — mapped into execution_status; NOTE an
+    approved-but-failed execution also reports "rejected", see
+    frontend/src/lib/approval-report.ts:10-16).
+
+- file: app/features/agents/service.py
+  why: |
+    approve_action @825-907 — READ ONLY: proves no decision is persisted
+    (pending_action cleared @868, status → ACTIVE @869, returns the response)
+    and that a consumed action raises NoApprovalPendingError → the 400 the
+    4xx-absorb handles. DO NOT MODIFY (D8).
+
+- file: app/features/demo/tests/test_pipeline.py
+  why: |
+    _make_hitl_client @1838-1921 — THE fake-client harness for HITL step
+    tests; extend it (approve body capture, decision injection).
+    test_agent_hitl_flow_happy_path @1959 + the FOUR monkeypatches of
+    _APPROVAL_DISPLAY_DELAY_S @1973/2047/2063/2081 — every one must move to
+    _APPROVAL_DECISION_WINDOW_S (set 0.0 so tests don't sleep). Phase-table
+    test @629-674 pins the 24-row layout (unchanged).
+
+- file: app/features/demo/tests/conftest.py
+  why: |
+    client fixture (ASGITransport, monkeypatched-service unit route tests) +
+    db_session fixture (integration; wipes showcase_workspace on teardown) —
+    reuse both; do not invent new fixtures.
+
+- file: frontend/src/components/demo/demo-step-card.tsx
+  why: |
+    ApproveButton @371-421 — REPLACED by HitlDecisionButtons. Note the bare
+    relative fetch(approvalUrl) @393 — only works when SPA origin == API
+    origin; the replacement MUST use lib/api.ts api() (API_BASE_URL-prefixed,
+    frontend/src/lib/api.ts:3,23-26). Render condition @496-505 (keep shape;
+    swap component). HitlFlowSummary mount @494.
+
+- file: frontend/src/pages/showcase.tsx
+  why: |
+    Page wiring: loadedWorkspace detail query @128-131; WorkspaceArtifacts
+    Panel mount @448-450 — mount WorkspaceStoryPanel beside it (same
+    `phase !== 'running' && loadedWorkspace` guard). handleReplayWorkspace
+    @174-186 (E1 adds replayed_from_workspace_id here — E5 does not touch it).
+
+- file: frontend/src/pages/ops.tsx
+  why: |
+    "Needs Attention" section @394-446 — THE Card+Table pattern (empty-state
+    paragraph, StatusBadge, formatWhen) the Approval History section mirrors.
+    Place the new section directly after Needs Attention.
+
+- file: frontend/src/hooks/use-workspaces.ts + frontend/src/hooks/use-ops.ts
+  why: |
+    TanStack patterns: queryKey arrays, api<T>() calls, refetchInterval
+    choices. use-approval-events.ts mirrors useWorkspaces (no polling — the
+    table changes only when a run finishes; document that in the hook docstring
+    like useRetrainingCandidates does).
+
+- file: frontend/src/types/api.ts
+  why: |
+    StepEvent @760 / DemoRunRequest @778 / WorkspaceListItem @806 /
+    WorkspaceDetail @819 — add ApprovalEventItem / ApprovalEventsResponse and
+    (if E1 did not already) the WorkspaceDetail slot fields E5 reads
+    (approval_events, rag_events). Comment style: `// E5 (#411) — ...`.
+
+- file: frontend/src/lib/approval-report.ts
+  why: |
+    Documents the executed/rejected/expired semantics of ApprovalResponse
+    (incl. approved-but-execution-failed → "rejected") — the mapping
+    execution_status follows.
+
+- file: docs/_base/DOMAIN_MODEL.md
+  why: |
+    § showcase_workspace — E1 documents the frozen slot schemas; E5 appends
+    the v2 delta (decision enum widening, additive keys, "probe" event,
+    story_reproduction in result_summary) and the config_schema_version=2
+    note. Authoritative slot-schema copy lives HERE.
+
+- file: docs/_base/RUNBOOKS.md
+  why: |
+    Incidents 23-25 (agent_hitl_flow) — update for the 10 s window, the
+    Reject path, and the relay endpoint; § Showcase workspace — trim
+    "RAG-event and approval-decision capture" from the out-of-scope list.
+
+- file: PRPs/PRP-showcase-completion-E3-seed-config-scope.md
+  why: |
+    Parallel-epic coordination: E3 also extends DemoContext and touches
+    create_workspace-time writes. Expect textual merge conflicts in
+    DemoContext / workspace.py if E3 lands first — both additions are
+    independent; resolve by keeping both blocks. E3's CONTRACT(E1) note
+    (line 1031) confirms E3 does NOT bump config_schema_version — E5 does (D4).
+
+# Issue / initiative context
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/411
+  why: The epic this PRP implements.
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/406
+  why: Umbrella — success criteria, out-of-scope list, warn-and-continue risk row.
+- url: https://github.com/w7-mgfcode/ForecastLabAI/issues/407
+  why: Foundation epic (BLOCKING) — frozen column/slot/endpoint contract.
+
+# Exemplar PRPs (style + validation-gate conventions)
+- file: PRPs/PRP-41-showcase-agent-ops-polish.md
+  why: Authored the HITL step + intermediate-event sink E5 reworks.
+- file: PRPs/ai_docs/prp-41-contract-probe-report.md
+  why: Verified agents HITL contracts (chat pending_approval shape, approve 400 on consumed action).
+```
+
+### Current Codebase tree (relevant subset, post-E1)
+
+```bash
+app/features/demo/
+├── models.py          # ShowcaseWorkspace + E1 columns (approval_events/rag_events/config_schema_version/replayed_from_workspace_id)
+├── pipeline.py        # _Client sink @136; DemoContext @213; HITL constants @314; knowledge steps @1449-1576; step_agent_hitl_flow @2192; run_pipeline @2618
+├── workspace.py       # create @46; finalize @106; get @158; list @174; delete @199; count @224 (+ E1 update_workspace)
+├── schemas.py         # DemoRunRequest @29; StepEvent @88; Workspace* @169-213 (+ E1 WorkspaceUpdateRequest, slot fields on Detail)
+├── routes.py          # POST /run @51; GET/PATCH/DELETE /workspaces @80-163; WS /stream @166
+├── service.py         # _pipeline_lock @19 (UNCHANGED)
+└── tests/             # conftest, test_models, test_pipeline (_make_hitl_client @1838), test_routes, test_schemas, test_workspace
+frontend/src/
+├── components/demo/demo-step-card.tsx   # ApproveButton @371; render condition @496
+├── components/demo/WorkspaceArtifactsPanel.tsx
+├── pages/showcase.tsx                   # loadedWorkspace @128; panels @244-450
+├── pages/ops.tsx                        # Needs Attention table @394-446
+├── hooks/use-workspaces.ts / use-ops.ts
+├── lib/api.ts                           # api<T>() @23 (API_BASE_URL @3)
+└── types/api.ts                         # StepEvent @760; Workspace* @806-830
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+app/features/demo/
+├── hitl.py                       # NEW — single-slot decision relay (register/wait/resolve/clear)
+├── pipeline.py                   # MOD — ctx accumulators; HITL step rework; rag-event appends; D2 drain
+├── workspace.py                  # MOD — finalize slot writes + story_reproduction; NEW list_approval_events
+├── schemas.py                    # MOD — HitlDecisionRequest; ApprovalEventItem; ApprovalEventsResponse
+├── routes.py                     # MOD — POST /hitl-decision; GET /approval-events
+├── models.py                     # MOD — config_schema_version ORM default 2; slot-comment delta
+└── tests/
+    ├── test_hitl.py              # NEW — relay unit tests (asyncio)
+    ├── test_pipeline.py          # MOD — HITL paths, rag capture, drain-ordering, constant rename
+    ├── test_routes.py            # MOD — hitl-decision 204/404/409/422; approval-events 200
+    ├── test_schemas.py           # MOD — HitlDecisionRequest (+ JSON path); response models
+    └── test_workspace.py         # MOD — finalize slots + story_reproduction (integration)
+frontend/src/
+├── components/demo/demo-step-card.tsx      # MOD — HitlDecisionButtons (Approve+Reject, api(), countdown)
+├── components/demo/demo-step-card.test.tsx # MOD — Reject render + POST body
+├── components/demo/WorkspaceStoryPanel.tsx       # NEW — approval + rag events + reproduction marker
+├── components/demo/WorkspaceStoryPanel.test.tsx  # NEW
+├── components/demo/index.ts                # MOD — export
+├── pages/showcase.tsx                      # MOD — mount WorkspaceStoryPanel
+├── pages/ops.tsx                           # MOD — Approval History section
+├── hooks/use-approval-events.ts            # NEW — useApprovalEvents
+├── hooks/use-approval-events.test.ts       # NEW
+├── hooks/index.ts                          # MOD — export
+└── types/api.ts                            # MOD — ApprovalEventItem/Response (+ Detail slot fields if E1 didn't)
+docs/_base/API_CONTRACTS.md                 # MOD — 2 endpoints + WS data-key notes
+docs/_base/DOMAIN_MODEL.md                  # MOD — slot-schema v2 delta + story_reproduction
+docs/_base/RUNBOOKS.md                      # MOD — HITL incidents + out-of-scope trim
+```
+
+### Known Gotchas & Library Quirks
+
+```python
+# CRITICAL — intermediate events flush only AFTER the step returns today
+#   (run_pipeline drains the list sink post-await, pipeline.py:2701-2715).
+#   PRP-41's Approve button therefore never renders during the window. D2's
+#   concurrent drain is LOAD-BEARING for this whole epic — implement and test
+#   it FIRST (Task 3) or every FE-interaction test downstream is meaningless.
+
+# CRITICAL — the relay wait must use asyncio primitives, not polling sleeps:
+#   asyncio.wait_for(event.wait(), timeout=...) raises TimeoutError on lapse
+#   (stdlib-verified 2026-06-12:
+#   uv run python -c "import asyncio
+#   async def m():
+#       ev=asyncio.Event()
+#       async def r(): await asyncio.sleep(0.05); ev.set()
+#       t=asyncio.ensure_future(r())
+#       await asyncio.wait_for(ev.wait(), timeout=1.0); print(True); await t
+#   asyncio.run(m())"  -> True).
+
+# CRITICAL — warn-and-continue: ALL new finalize_workspace logic (slot writes,
+#   source-row read, story_reproduction) goes INSIDE the existing try block
+#   (workspace.py:126-146). Never add a second commit path; never let a
+#   malformed source row raise out.
+
+# CRITICAL — JSONB whole-value assignment: build the full list on ctx, then
+#   row.approval_events = ctx.approval_events or None. NEVER append to a
+#   loaded row's JSONB in place (invisible to SQLAlchemy without
+#   flag_modified). Empty list -> None (E1: NULL = "slot never written").
+
+# CRITICAL — the relay is process-global mutable state. It is safe ONLY
+#   because service._pipeline_lock enforces one run at a time. Guard anyway:
+#   register() overwrites any stale slot (a crashed run must not wedge the
+#   next one) and the step clears it in a finally block.
+
+# CRITICAL — do NOT modify app/features/agents/** (D8) and do NOT touch
+#   agent_require_approval. The reject path is expressed entirely through the
+#   EXISTING ApprovalRequest.approved=false contract (agents/schemas.py:192).
+
+# GOTCHA — FE ApproveButton today uses bare fetch(approvalUrl) with a
+#   RELATIVE url (demo-step-card.tsx:393) — breaks when VITE_API_BASE_URL
+#   points off-origin. HitlDecisionButtons must use api() from lib/api.ts.
+
+# GOTCHA — tests monkeypatch pipeline._APPROVAL_DISPLAY_DELAY_S at FOUR
+#   sites: test_pipeline.py:1973, 2047, 2063, 2081. Renaming the constant
+#   without sweeping all four fails loudly (monkeypatch.setattr
+#   AttributeError). The "auto-approve in 3 s" detail string lives only in
+#   pipeline.py:2306 (no test asserts it); the FE countdown copy at
+#   demo-step-card.tsx:415 computes from the 90 s HARD timeout, not the
+#   window — replace it with the decision_window_s countdown.
+#   Grep "_APPROVAL_DISPLAY_DELAY_S\|auto-approve in" before renaming.
+
+# GOTCHA — StepEvent attribute stamping (ev.step_index = index) relies on
+#   Pydantic validate_assignment being OFF (default) — existing production
+#   behavior (pipeline.py:2708); keep the drained-event stamping identical.
+
+# GOTCHA — D2's task wrapper changes exception flow: _StepError raised inside
+#   the step now surfaces from task.result(). Keep the EXACT except ladder
+#   (_StepError -> fail / httpx.HTTPError|OSError -> transport fail /
+#   Exception -> unexpected fail).
+#   CRITICAL sub-case: the Stop button closes the WebSocket -> the async
+#   generator is CLOSED, which throws **GeneratorExit** (a BaseException) into
+#   the frame at D2's new mid-step `yield ev` suspension point. Neither
+#   `except asyncio.CancelledError` nor `except Exception` catches it, so the
+#   in-flight step task would be orphaned ("Task was destroyed but it is
+#   pending") with the _Client context exiting underneath it. The drain loop
+#   MUST therefore sit inside `try/finally: if not task.done(): task.cancel()`
+#   — cancellation on ANY exit (GeneratorExit, CancelledError, or a raise),
+#   never only on CancelledError. Verify the Stop path by hand in Level 4.
+
+# GOTCHA — parallel-epic merge friction: E3 (#409) extends DemoContext and
+#   the create-time workspace write; E2/E4 may touch finalize for
+#   job_ids/phase_summaries. All additions are disjoint — resolve conflicts
+#   by keeping both hunks; re-run the full demo test file after any rebase.
+
+# GOTCHA — repo has mixed CRLF/LF line endings; run `git diff --stat` before
+#   committing (Edit/Write emit LF — whole-file noise diffs are a regression).
+
+# GOTCHA — frontend type gate: `pnpm tsc --noEmit` is vacuous and `tsc -b`
+#   already fails on dev with pre-existing errors. Gate on "no NEW errors vs
+#   the dev baseline" + `pnpm lint` + `pnpm test --run`.
+
+# GOTCHA — mypy --strict AND pyright --strict gate merge: full annotations
+#   incl. `-> None` on tests, typed module-level relay state
+#   (e.g. _slot: _PendingDecision | None), and dataclass field types.
+
+# CONVENTION — branch: feat/showcase-completion-e5-agent-rag-story (off dev,
+#   slug ≤50). Commits reference #411: feat(api): ... (#411) for slice code,
+#   feat(ui): ... (#411) for frontend, docs(repo)/docs(docs): ... (#411).
+#   NO AI trailer (hook-enforced).
+
+# RUNTIME-VERIFICATION LOG (per prp-create step 3 — re-run on upgrade):
+#   1. asyncio.Event + wait_for timeout semantics — verified 2026-06-12
+#      (command above, prints True).
+#   2. No NEW third-party API claims: httpx-ASGITransport step client,
+#      SQLAlchemy JSONB whole-value writes, Pydantic strict-literal bodies,
+#      and TanStack query/mutation shapes are all existing production code in
+#      this repo (pipeline.py / workspace.py / schemas.py / use-workspaces.ts).
+#   3. agents approve contract probed in PRPs/ai_docs/prp-41-contract-probe-
+#      report.md (approved=false path + 400-on-consumed) — re-verify only if
+#      the agents slice changes upstream.
+```
+
+## Implementation Blueprint
+
+### Data shapes (documented slot-schema v2 — authoritative copy goes to DOMAIN_MODEL)
+
+```python
+# approval_events entry (list[dict], append-only). E1-frozen base keys first;
+# E5 additive keys below the marker. decision enum WIDENED in v2.
+{
+    "action_id": str,
+    "tool_name": str,                  # pending_action.action_type
+    "decision": "approved" | "rejected" | "timed_out",
+    "decided_at": str,                 # iso8601 UTC
+    "session_id": str,
+    # -- E5 (#411) additive, config_schema_version >= 2 --
+    "auto_approved": bool,             # True when the window lapsed
+    "reason": str | None,              # operator-supplied (Reject), <=500
+    "execution_status": str | None,    # ApprovalResponse.status: executed|rejected|expired;
+                                       # "external_4xx" on the absorbed pre-empt edge; None on timed_out
+    "tool_call_summary": {"description": str, "arguments_keys": list[str]},
+    "transcript_summary": str,         # agent chat message, <=200 chars
+    "tokens_used": int,
+    "tool_calls_count": int,
+}
+
+# rag_events entry (list[dict], append-only). "probe" event ADDED in v2.
+{
+    "event": "probe" | "index" | "retrieve" | "skip",
+    "status": "pass" | "warn" | "skip",   # E5 additive
+    "detail": str,
+    "count": int,                      # chunks indexed / results returned / 0
+    "occurred_at": str,                # iso8601 UTC
+    "provider": str | None,            # E5 additive — embedding provider name
+    "reachable": bool | None,          # E5 additive — probe only
+}
+
+# result_summary additive key (replay keep-runs only):
+{"story_reproduction": {"agent": V, "knowledge": V, "source_workspace_id": str | None}}
+# V ∈ "reproduced" | "not_reproduced" | "not_applicable" | "unknown"; the id is None when the source row is gone
+```
+
+```python
+# app/features/demo/hitl.py — NEW. Single-slot in-memory decision relay.
+# Safe because service._pipeline_lock enforces one pipeline per process and
+# the HITL step registers at most one action per run (precedent for
+# module-level state: service.py:19).
+"""HITL decision relay for the showcase pipeline (E5, issue #411). ..."""
+from __future__ import annotations
+import asyncio
+from dataclasses import dataclass, field
+from typing import Literal
+
+Decision = Literal["approved", "rejected"]
+ResolveOutcome = Literal["applied", "already_decided", "not_found"]
+
+@dataclass
+class _PendingDecision:
+    action_id: str
+    event: asyncio.Event = field(default_factory=asyncio.Event)
+    decision: Decision | None = None
+    reason: str | None = None
+
+_slot: _PendingDecision | None = None   # module-level; one pipeline at a time
+
+def register(action_id: str) -> None:
+    """Open the decision window; overwrites any stale slot from a dead run."""
+    global _slot
+    _slot = _PendingDecision(action_id=action_id)
+
+def resolve(action_id: str, decision: Decision, reason: str | None = None) -> ResolveOutcome:
+    """Record the operator's decision; called by POST /demo/hitl-decision."""
+    if _slot is None or _slot.action_id != action_id:
+        return "not_found"
+    if _slot.decision is not None:
+        return "already_decided"
+    _slot.decision = decision
+    _slot.reason = reason
+    _slot.event.set()
+    return "applied"
+
+async def wait_for_decision(action_id: str, timeout: float) -> tuple[Decision, str | None] | None:
+    """Block up to ``timeout`` for an operator decision; None on lapse."""
+    if _slot is None or _slot.action_id != action_id:
+        return None
+    try:
+        await asyncio.wait_for(_slot.event.wait(), timeout=timeout)
+    except TimeoutError:
+        return None
+    if _slot.decision is None:   # defensive: set() without decision
+        return None
+    return (_slot.decision, _slot.reason)
+
+def clear() -> None:
+    """Close the window (step's finally)."""
+    global _slot
+    _slot = None
+```
+
+```python
+# app/features/demo/pipeline.py — DemoContext additions (after workspace_name,
+# E3-comment style):
+    # E5 (#411) -- story-capture accumulators. Appended by step_agent_hitl_flow
+    # and the knowledge steps on SHOWCASE_RICH; finalize_workspace persists
+    # them to the workspace slots (empty -> slot stays NULL).
+    approval_events: list[dict[str, Any]] = field(default_factory=list)
+    rag_events: list[dict[str, Any]] = field(default_factory=list)
+
+# Constants — REPLACE _APPROVAL_DISPLAY_DELAY_S (line 317):
+_APPROVAL_DECISION_WINDOW_S = 10.0   # D3 — operator decision window
+
+# RAG-event helper (near the knowledge steps):
+def _record_rag_event(ctx, *, event, status, detail, count=0, provider=None, reachable=None) -> None:
+    ctx.rag_events.append({... per the v2 shape, "occurred_at": datetime.now(UTC).isoformat()})
+# Call once on EVERY return path of the three knowledge steps:
+#   probe   -> event="probe",   status="pass", provider=, reachable=
+#   index   -> event="index"|"skip", count=total_chunks
+#   retrieve-> event="retrieve"|"skip", status="warn" on zero hits, count=results_count
+# (provider for index/retrieve: reuse the probe's provider via a ctx echo or
+#  re-read get_settings().rag_embedding_provider — settings read is simplest.)
+
+# step_agent_hitl_flow rework (between the intermediate event and terminal):
+#   - intermediate event data ADDS: "decision_window_s": _APPROVAL_DECISION_WINDOW_S
+#     and "decision_url": "/demo/hitl-decision"; detail becomes
+#     f"awaiting approval (auto-approve in {int(_APPROVAL_DECISION_WINDOW_S)} s)"
+#   - hitl.register(action_id) BEFORE yielding the intermediate event;
+#     try/finally hitl.clear() around everything after registration.
+#   - replace the sleep @2320-2324 with:
+#       remaining = max(0.0, _APPROVAL_DECISION_WINDOW_S - (time.monotonic() - started_at))
+#       operator = await hitl.wait_for_decision(action_id, timeout=remaining)
+#   - keep the hard-timeout check @2326-2341; on timed_out ALSO append the
+#     approval_events entry (decision="timed_out", execution_status=None).
+#   - approved = operator is None or operator[0] == "approved"
+#     reason = operator[1] if operator else None
+#     POST /approve with {"action_id": ..., "approved": approved,
+#     **({"reason": reason} if reason else {})}
+#   - 4xx absorb stays; record execution_status="external_4xx" on that edge.
+#   - append the approval_events entry on EVERY resolved path, then terminal:
+#       reject -> ("pass", "rejected by operator", {..., "approval_decision": "rejected"})
+#       approve -> existing pass shape (+ "auto_approved" key in data)
+
+# run_pipeline D2 drain — replace `status, detail, data = await fn(ctx, client)`
+# (and its except ladder) with:
+    task = asyncio.ensure_future(fn(ctx, client))
+    try:
+        while True:
+            done, _ = await asyncio.wait({task}, timeout=0.25)
+            for ev in intermediate_events:           # same stamping as today
+                ev.step_index = index; ev.total_steps = total
+                ev.phase_index = phase_index; ev.phase_total = phase_total
+                ev.phase_name = phase_name
+                yield ev                              # NEW suspension point —
+                                                      # generator close lands HERE
+            intermediate_events.clear()
+            if done:
+                break
+        status, detail, data = task.result()
+    except _StepError as exc: ...                    # EXACT existing ladder
+    finally:
+        # LOAD-BEARING (quality-gate Finding 3): the Stop button closes the
+        # async generator, throwing GeneratorExit (BaseException) into the
+        # mid-step `yield ev` above — no except clause sees it. The finally
+        # is the ONLY hook that runs on every exit path; without it the
+        # in-flight step task is orphaned while _Client closes under it.
+        if not task.done():
+            task.cancel()
+# The existing post-task drain block @2701-2715 stays as the final flush.
+```
+
+```python
+# app/features/demo/workspace.py — inside finalize_workspace's try, after
+# row.result_summary assignment (@141-145):
+            row.approval_events = ctx.approval_events or None   # E5 (#411)
+            row.rag_events = ctx.rag_events or None
+            summary: dict[str, Any] = {... existing three keys ...}
+            if row.replayed_from_workspace_id:                  # D7 marker
+                src = (await db.execute(select(ShowcaseWorkspace).where(
+                    ShowcaseWorkspace.workspace_id == row.replayed_from_workspace_id
+                ))).scalar_one_or_none()
+                summary["story_reproduction"] = _story_reproduction(src, ctx)
+            row.result_summary = summary
+
+def _story_reproduction(src: ShowcaseWorkspace | None, ctx: DemoContext) -> dict[str, Any]:
+    """D7 — compare the source row's story slots against this run's capture."""
+    if src is None:
+        return {"agent": "unknown", "knowledge": "unknown",
+                "source_workspace_id": None}
+    def _verdict(source_had: bool, new_has: bool) -> str:
+        if not source_had: return "not_applicable"
+        return "reproduced" if new_has else "not_reproduced"
+    src_knowledge = any(e.get("event") in ("index", "retrieve") and e.get("status") != "skip"
+                        for e in (src.rag_events or []))
+    new_knowledge = any(e.get("event") in ("index", "retrieve") and e.get("status") != "skip"
+                        for e in ctx.rag_events)
+    return {
+        "agent": _verdict(bool(src.approval_events), bool(ctx.approval_events)),
+        "knowledge": _verdict(src_knowledge, new_knowledge),
+        "source_workspace_id": src.workspace_id,
+    }
+
+async def list_approval_events(db: AsyncSession, *, limit: int = 50) -> list[dict[str, Any]]:
+    """Flatten approval_events across the newest rows that carry the slot."""
+    result = await db.execute(
+        select(ShowcaseWorkspace)
+        .where(ShowcaseWorkspace.approval_events.isnot(None))
+        .order_by(ShowcaseWorkspace.created_at.desc(), ShowcaseWorkspace.id.desc())
+        .limit(limit)   # every selected row carries >= 1 event, so `limit` rows always suffice
+    )
+    events: list[dict[str, Any]] = []
+    for row in result.scalars():
+        for entry in row.approval_events or []:
+            events.append({"workspace_id": row.workspace_id,
+                           "workspace_name": row.name, **entry})
+            if len(events) >= limit:
+                return events
+    return events
+```
+
+```python
+# app/features/demo/schemas.py — new models.
+class HitlDecisionRequest(BaseModel):
+    """Operator decision relay for the showcase HITL step (E5, #411). ..."""
+    model_config = ConfigDict(strict=True, extra="forbid")
+    action_id: str = Field(..., min_length=1, description="Pending action to decide.")
+    decision: Literal["approved", "rejected"] = Field(..., description="Operator decision.")
+    reason: str | None = Field(default=None, max_length=500,
+                               description="Optional reason (mirrors agents ApprovalRequest.reason).")
+
+class ApprovalEventItem(BaseModel):
+    """One flattened approval event (built from JSONB dicts — tolerant typing)."""
+    workspace_id: str
+    workspace_name: str | None = None
+    action_id: str | None = None
+    tool_name: str | None = None
+    decision: str | None = None
+    decided_at: str | None = None
+    session_id: str | None = None
+    auto_approved: bool | None = None
+    reason: str | None = None
+    execution_status: str | None = None
+    transcript_summary: str | None = None
+
+class ApprovalEventsResponse(BaseModel):
+    events: list[ApprovalEventItem]
+    total: int   # number returned (flattened cap), not a table count
+```
+
+```python
+# app/features/demo/routes.py — two routes.
+@router.post("/hitl-decision", status_code=status.HTTP_204_NO_CONTENT,
+             summary="Relay an operator decision to the in-flight HITL step", ...)
+async def submit_hitl_decision(body: HitlDecisionRequest) -> None:
+    outcome = hitl.resolve(body.action_id, body.decision, body.reason)
+    if outcome == "not_found":
+        raise NotFoundError(message=f"No pending HITL action: {body.action_id}")
+    if outcome == "already_decided":
+        raise ConflictError(message=f"Action already decided: {body.action_id}")
+
+@router.get("/approval-events", response_model=ApprovalEventsResponse,
+            summary="Recent HITL approval events across saved workspaces", ...)
+async def list_hitl_approval_events(
+    db: AsyncSession = Depends(get_db),
+    limit: int = Query(default=50, ge=1, le=200),
+) -> ApprovalEventsResponse:
+    events = await workspace.list_approval_events(db, limit=limit)
+    return ApprovalEventsResponse(
+        events=[ApprovalEventItem.model_validate(e) for e in events],
+        total=len(events),
+    )
+```
+
+```tsx
+// frontend — HitlDecisionButtons replaces ApproveButton (demo-step-card.tsx):
+//  - props: actionId, decisionWindowS (from step.data.decision_window_s ?? 10)
+//  - api('/demo/hitl-decision', { method: 'POST', body: { action_id, decision } })
+//    via lib/api.ts (NOT bare fetch); absorb 404/409 silently, surface 5xx.
+//  - Approve: variant "default"; Reject: variant "destructive" + size "sm";
+//    both disable after either click ("Approving…"/"Rejecting…").
+//  - countdown: `auto-approve in ${remaining}s` ticking from decisionWindowS.
+// WorkspaceStoryPanel (new): Card titled "Run story"; sections —
+//  Approval history (decision StatusBadge + tool + transcript snippet + when),
+//  Knowledge events (event/status/provider/count), Reproduction marker chips
+//  (from result_summary.story_reproduction; render only when present).
+//  Render nothing when the workspace has no slots (legacy rows).
+// ops.tsx: "Approval History" Card+Table after Needs Attention, fed by
+//  useApprovalEvents() (hooks/use-approval-events.ts; queryKey
+//  ['demo','approval-events',limit]; no polling); empty-state paragraph.
+```
+
+### List of tasks (dependency order)
+
+```yaml
+Task 0 — preconditions:
+  VERIFY: gh issue view 407 --json state -> CLOSED; the three E1 columns exist
+    on ShowcaseWorkspace; re-anchor blueprint line numbers if E1/E3 moved code.
+  RUN: git switch dev && git pull && git switch -c feat/showcase-completion-e5-agent-rag-story
+
+Task 1 — CREATE app/features/demo/hitl.py:
+  - IMPLEMENT the relay per blueprint (typed module state, register/resolve/
+    wait_for_decision/clear, structlog on resolve)
+  - CREATE tests/test_hitl.py: resolve-before-wait, wait-then-resolve,
+    timeout->None, wrong-action->not_found, double-resolve->already_decided,
+    register-overwrites-stale-slot, clear()
+
+Task 2 — MODIFY app/features/demo/models.py:
+  - config_schema_version ORM default 1 -> 2 (server_default UNCHANGED)
+  - EXTEND the slot-schema comments with the v2 delta (blueprint shapes)
+
+Task 3 — MODIFY pipeline.py run_pipeline (D2 drain) — FIRST pipeline change:
+  - task wrapper + 0.25s asyncio.wait flush loop per blueprint; preserve the
+    exact except ladder; `finally: if not task.done(): task.cancel()` so a
+    generator close (Stop button -> GeneratorExit at the mid-step yield)
+    never orphans the in-flight step task
+  - ADD orchestrator tests: (a) a stub step that emits an intermediate event
+    then blocks on an asyncio.Event; assert the intermediate event is yielded
+    while the step is still pending, then release and assert terminal order;
+    (b) close the generator (aclose()) while the stub step is in-flight and
+    assert the step task ends cancelled (no "destroyed but pending" warning)
+
+Task 4 — MODIFY pipeline.py HITL step + constants:
+  - REPLACE _APPROVAL_DISPLAY_DELAY_S with _APPROVAL_DECISION_WINDOW_S = 10.0
+    (sweep tests: monkeypatches @1973/2047/2063 + "auto-approve in 3 s" asserts)
+  - DemoContext: + approval_events / rag_events accumulators (E5 comment block)
+  - step_agent_hitl_flow: hitl.register before the intermediate event;
+    intermediate data += decision_window_s + decision_url; wait_for_decision;
+    reject path (approved=false POST, terminal pass "rejected by operator");
+    approval_events entry on every resolved path (incl. timed_out);
+    try/finally hitl.clear()
+  - EXTEND _make_hitl_client: capture approve POST json_body; tests for
+    operator-approve / operator-reject (resolve via hitl.resolve before the
+    wait) / window-lapse auto-approve / hard-timeout entry / skip paths
+    append nothing
+
+Task 5 — MODIFY pipeline.py knowledge steps:
+  - _record_rag_event helper + one call per return path of probe/index/retrieve
+  - tests: each path appends the right entry (pass/skip/auth-skip/warn),
+    provider populated, demo_minimal run leaves ctx.rag_events empty
+
+Task 6 — MODIFY workspace.py:
+  - finalize_workspace: slot writes + story_reproduction per blueprint
+    (all inside the existing try); _story_reproduction helper
+  - NEW list_approval_events helper
+  - tests/test_workspace.py (@integration): finalize writes slots (and NULL
+    when empty); replay row vs source-with-story -> reproduced; source-empty
+    -> not_applicable; dangling source -> unknown; list_approval_events
+    flattens newest-first and respects limit
+
+Task 7 — MODIFY schemas.py + routes.py:
+  - HitlDecisionRequest / ApprovalEventItem / ApprovalEventsResponse
+  - POST /demo/hitl-decision (204/404/409) + GET /demo/approval-events
+  - tests/test_schemas.py: JSON-dict path (security-patterns.md § strict mode):
+    HitlDecisionRequest.model_validate({"action_id": "a", "decision": "rejected"});
+    extra-key 422; bad decision literal 422; reason >500 422
+  - tests/test_routes.py: decision 204 (hitl registered via monkeypatch/
+    direct register) / 404 / 409 / 422; approval-events 200 empty + populated
+    (monkeypatch workspace.list_approval_events for the unit-shaped test,
+    follow the file's existing convention)
+
+Task 8 — frontend:
+  - types/api.ts: ApprovalEventItem/ApprovalEventsResponse (+ WorkspaceDetail
+    approval_events/rag_events fields IF E1 didn't add them); `// E5 (#411)` comments
+  - demo-step-card.tsx: HitlDecisionButtons per blueprint (replace
+    ApproveButton; keep the @496-505 render condition shape); update
+    demo-step-card.test.tsx (Reject renders, POST body, countdown text)
+  - hooks/use-approval-events.ts + test; export from hooks/index.ts
+  - components/demo/WorkspaceStoryPanel.tsx + test; export from index.ts;
+    mount in showcase.tsx beside WorkspaceArtifactsPanel (@448-450 guard)
+  - pages/ops.tsx: Approval History section after Needs Attention (@446),
+    mirroring its Card+Table+empty-state pattern
+  - GATES: pnpm lint && pnpm test --run; tsc -b no NEW errors
+
+Task 9 — docs (additive):
+  - API_CONTRACTS.md: two new demo rows (POST /demo/hitl-decision incl.
+    204/404/409 semantics; GET /demo/approval-events); WS /demo/stream section:
+    intermediate HITL event now streams DURING the window and data gains
+    decision_window_s/decision_url; note the 10 s window and the Reject path
+  - DOMAIN_MODEL.md § showcase_workspace: slot-schema v2 delta (decision enum,
+    additive keys, "probe" event), config_schema_version=2 note,
+    story_reproduction in result_summary
+  - RUNBOOKS.md: incidents 23-25 — window now 10 s, Reject button exists, a
+    rejected run is GREEN by design; § Showcase workspace — trim "RAG-event
+    and approval-decision capture" from the out-of-scope list
+
+Task 10 — gates, dogfood, PR:
+  - full Validation Loop (Levels 1-4); git diff --stat (CRLF noise check)
+  - COMMITS (reference #411, no AI trailer), e.g.:
+      feat(api): add hitl decision relay and story capture to demo pipeline (#411)
+      feat(ui): add reject button, run story panel and ops approval history (#411)
+      docs(docs): document approval and rag story capture contracts (#411)
+  - PR into dev; title:
+      feat(api,ui): showcase-completion E5 — agent/hitl + rag story capture (#411)
+```
+
+### Integration Points
+
+```yaml
+DATABASE: none — no migration (E1 shipped the columns). ORM default bump only.
+CONFIG: none — no new settings; the window is a module constant emitted to the FE.
+ROUTES: two additions on the existing demo router — no app/main.py change.
+AGENTS SLICE: untouched (D8). The pipeline keeps using the public
+  /agents/sessions/{id}/approve contract (approved=true|false + reason).
+OPS SLICE: untouched — /ops approval history is a frontend section over the
+  demo endpoint.
+FRONTEND: one replaced component, one new panel, one new ops section, one hook.
+PARALLEL EPICS: E3 touches DemoContext + create-time writes; E2/E4 may write
+  job_ids/phase_summaries in finalize — keep-both merge resolution; whoever
+  lands after a slot-shape change rebases the config_schema_version default.
+```
+
+## Validation Loop
+
+### Level 1: Syntax & Style
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+```
+
+### Level 2: Unit Tests (no DB)
+
+```bash
+uv run pytest app/features/demo -v -m "not integration"
+uv run pytest app/core/tests/test_strict_mode_policy.py -v
+# Key new/updated cases (see Tasks 1,3,4,5,7):
+#   test_hitl.py — relay semantics incl. timeout + already_decided
+#   test_pipeline.py — drain-ordering (intermediate BEFORE step completion);
+#     HITL operator-approve / operator-reject / auto-approve / timed-out
+#     entries; reject terminal is pass + green; rag-event appends per path;
+#     demo_minimal leaves both accumulators empty
+#   test_routes.py — hitl-decision 204/404/409/422; approval-events 200
+#   test_schemas.py — HitlDecisionRequest JSON path + extra=forbid
+```
+
+### Level 3: Integration (real Postgres)
+
+```bash
+docker compose up -d && uv run alembic upgrade head
+uv run pytest app/features/demo -v -m integration
+# test_workspace.py — finalize writes approval_events/rag_events (NULL when
+# empty); story_reproduction matrix (reproduced / not_applicable / unknown);
+# list_approval_events flatten + limit
+```
+
+### Level 4: Manual smoke (seeded local stack, uvicorn :8123 + vite)
+
+```bash
+# 1. showcase_rich keep-run from /showcase with "Save as workspace" ticked.
+#    During the agents phase the HITL card must show Approve + Reject with a
+#    ticking "auto-approve in Ns" — click REJECT. Expect: step flips to pass
+#    with "rejected by operator"; run finishes GREEN.
+# 2. Verify capture:
+curl -s "http://localhost:8123/demo/approval-events?limit=5" | python3 -m json.tool
+#    -> one entry, decision="rejected", workspace_id set
+docker exec forecastlab-postgres psql -U forecastlab -d forecastlab -c \
+  "SELECT name, jsonb_array_length(approval_events) AS approvals, \
+          jsonb_array_length(rag_events) AS rag, config_schema_version \
+   FROM showcase_workspace ORDER BY created_at DESC LIMIT 1;"
+#    -> approvals=1, rag=3, config_schema_version=2
+# 3. Decision relay error paths:
+curl -s -X POST http://localhost:8123/demo/hitl-decision \
+  -H 'Content-Type: application/json' \
+  -d '{"action_id": "bogus", "decision": "approved"}' | python3 -m json.tool   # 404 problem+json
+# 4. Replay the kept workspace (Replay button) and let it auto-approve; then:
+#    GET /demo/workspaces/{new_id} -> result_summary.story_reproduction.agent
+#    == "reproduced" (source had an approval event; replay produced one too).
+# 5. /ops page shows the Approval History table; Showcase Load on the kept
+#    workspace renders the story panel (events + reproduction chips).
+# 6. Regression: run demo_minimal — no buttons, no relay calls, slots NULL.
+```
+
+## Final validation Checklist
+
+- [ ] Five gates green: `uv run ruff check . && uv run ruff format --check . && uv run mypy app/ && uv run pyright app/ && uv run pytest -v -m "not integration"`
+- [ ] Integration suite green on a fresh docker-compose DB (reset first if the shared DB is polluted)
+- [ ] Drain-ordering test proves intermediate events stream mid-step; Stop button still cancels cleanly (generator close raises GeneratorExit at the mid-step yield; the `finally` cancels the in-flight step task)
+- [ ] Reject path: green run, entry captured, no scenario_plan written by the agent
+- [ ] Slots NULL on empty capture; `config_schema_version`=2 on new rows, old rows still 1
+- [ ] `POST /demo/hitl-decision` 204/404/409/422 and `GET /demo/approval-events` verified (Levels 2+4)
+- [ ] story_reproduction matrix covered (reproduced / not_reproduced / not_applicable / unknown)
+- [ ] Frontend: `pnpm lint` + `pnpm test --run` green; no NEW `tsc -b` errors; manual browser pass (Level 4 steps 1, 5)
+- [ ] No agents-slice diff; no migration; `git diff --stat` surgical (no CRLF noise)
+- [ ] Docs updated (API_CONTRACTS, DOMAIN_MODEL v2 slot delta, RUNBOOKS trim)
+- [ ] Commits `feat(api)/feat(ui)/docs(...): ... (#411)`, no AI trailer; PR into dev
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't touch `app/features/agents/**` or `agent_require_approval` — the reject path uses the EXISTING `approved=false` contract (D1/D8).
+- ❌ Don't let a reject (or any capture failure) fail the pipeline — reject is `pass`; capture rides warn-and-continue.
+- ❌ Don't re-architect the pipeline beyond the D2 drain loop — steps stay strictly sequential under the single lock; no mid-run frame reading on the WS.
+- ❌ Don't make the FE call `/agents/sessions/{id}/approve` directly anymore — the relay is the single intent channel (keep emitting `approval_url` for back-compat, but the buttons use the relay).
+- ❌ Don't echo tool-call argument VALUES or full transcripts into `approval_events` — keys + 200-char summary only.
+- ❌ Don't write `[]` into a slot — empty capture leaves the column NULL (E1: NULL = "never written").
+- ❌ Don't add a migration or change `server_default` for the version bump — ORM `default=` only.
+- ❌ Don't put the approval-history endpoint in the ops slice — demo owns the data; /ops renders client-side.
+- ❌ Don't validate that the replay source row exists — dangles are designed; `unknown` is the honest verdict.
+- ❌ Don't mutate row JSONB in place — whole-value assignment only.
+- ❌ Don't add list pagination/filtering to approval-events — audit glance, not a browse API (E7 can extend).
+
+## Notes for the release-gate epic (E7)
+
+- E5 bumps the documented slot schema to v2 (D4) — the DOMAIN_MODEL delta is
+  the authoritative copy; verify E2/E4/E6 didn't race the same default.
+- The D2 drain generalizes intermediate-event streaming for ANY step — if a
+  later epic wants mid-step progress (e.g. batch sub-job ticks), the plumbing
+  now exists; document it if used.
+- The deferred durable approval audit on `agent_session` (D8) is a candidate
+  follow-up issue if the chat surface (non-showcase) needs history too.
+
+## Confidence Score
+
+**8/10** for one-pass implementation success. Every write path has a verified
+in-repo precedent: the slot columns + warn-and-continue hook (E1/`workspace.py`),
+the module-level single-flight state (`service.py:19`), the HITL step's fake-client
+test harness (`test_pipeline.py:1838`), the agents `approved=false` contract
+(`agents/schemas.py:192` — no agents change needed), and the ops Card+Table /
+TanStack patterns. The two judgment calls with real risk are resolved and frozen:
+D1 (relay; eliminates the unknowable FE-pre-empt decision) and D2 (concurrent
+drain; the one structural change, contained to a single loop body with a
+dedicated ordering test). The −2: (a) D2 touches the orchestrator's exception/
+cancellation flow — the Stop-button path (`WebSocketDisconnect` → generator
+close → GeneratorExit) must be re-verified by hand in Level 4; (b) this PRP is
+written against E1's PRP rather than E1's merged code, and E3 may land in
+parallel — Task 0's re-anchoring step and the keep-both merge note mitigate but
+can't eliminate rebase friction.

From a5f72533775b790c5d10808c20ccd368e91e76b3 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 00:12:39 +0200
Subject: [PATCH 06/32] feat(db): extend showcase_workspace with metadata and
 provenance columns (#407)

---
 ..._showcase_workspace_metadata_provenance.py | 137 ++++++++++++++++++
 app/features/demo/models.py                   |  88 ++++++++++-
 2 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 alembic/versions/d45cf40dfe47_add_showcase_workspace_metadata_provenance.py

diff --git a/alembic/versions/d45cf40dfe47_add_showcase_workspace_metadata_provenance.py b/alembic/versions/d45cf40dfe47_add_showcase_workspace_metadata_provenance.py
new file mode 100644
index 00000000..0ba9d043
--- /dev/null
+++ b/alembic/versions/d45cf40dfe47_add_showcase_workspace_metadata_provenance.py
@@ -0,0 +1,137 @@
+"""add showcase_workspace metadata and provenance columns
+
+Revision ID: d45cf40dfe47
+Revises: 324a2fa37fcc
+Create Date: 2026-06-12 12:00:00.000000
+
+E1 of the showcase-completion initiative (umbrella #406, epic #407). Extends
+``showcase_workspace`` with the metadata + provenance backbone every parallel
+epic consumes: lifecycle columns (``archived`` / ``pinned`` / ``notes`` /
+``tags`` / ``config_schema_version``), the replay-provenance soft reference
+``replayed_from_workspace_id`` (deliberately NO ForeignKey -- not even
+self-referential; ancestor rows stay independently deletable), and six
+documented JSONB story slots (``seed_overrides`` / ``user_scope`` /
+``approval_events`` / ``rag_events`` / ``job_ids`` / ``phase_summaries``)
+that stay NULL until their writer epic lands. NOT NULL columns carry server
+defaults so the migration applies on tables with existing rows. Forward-only.
+"""
+
+from collections.abc import Sequence
+
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision: str = "d45cf40dfe47"
+down_revision: str | None = "324a2fa37fcc"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def upgrade() -> None:
+    """Add the lifecycle, provenance, and story-slot columns plus indexes."""
+    op.add_column(
+        "showcase_workspace",
+        sa.Column(
+            "archived",
+            sa.Boolean(),
+            nullable=False,
+            server_default=sa.text("false"),
+        ),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column(
+            "pinned",
+            sa.Boolean(),
+            nullable=False,
+            server_default=sa.text("false"),
+        ),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("notes", sa.Text(), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column(
+            "tags",
+            postgresql.JSONB(astext_type=sa.Text()),
+            nullable=False,
+            server_default=sa.text("'[]'::jsonb"),
+        ),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column(
+            "config_schema_version",
+            sa.Integer(),
+            nullable=False,
+            server_default=sa.text("1"),
+        ),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("replayed_from_workspace_id", sa.String(length=32), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("seed_overrides", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("user_scope", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("approval_events", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("rag_events", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("job_ids", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("phase_summaries", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+    op.create_index(
+        "ix_showcase_workspace_tags_gin",
+        "showcase_workspace",
+        ["tags"],
+        unique=False,
+        postgresql_using="gin",
+    )
+    op.create_index(
+        "ix_showcase_workspace_replayed_from",
+        "showcase_workspace",
+        ["replayed_from_workspace_id"],
+        unique=False,
+    )
+
+
+def downgrade() -> None:
+    """Drop the two indexes, then the twelve columns (reverse order)."""
+    op.drop_index("ix_showcase_workspace_replayed_from", table_name="showcase_workspace")
+    op.drop_index(
+        "ix_showcase_workspace_tags_gin",
+        table_name="showcase_workspace",
+        postgresql_using="gin",
+    )
+    op.drop_column("showcase_workspace", "phase_summaries")
+    op.drop_column("showcase_workspace", "job_ids")
+    op.drop_column("showcase_workspace", "rag_events")
+    op.drop_column("showcase_workspace", "approval_events")
+    op.drop_column("showcase_workspace", "user_scope")
+    op.drop_column("showcase_workspace", "seed_overrides")
+    op.drop_column("showcase_workspace", "replayed_from_workspace_id")
+    op.drop_column("showcase_workspace", "config_schema_version")
+    op.drop_column("showcase_workspace", "tags")
+    op.drop_column("showcase_workspace", "notes")
+    op.drop_column("showcase_workspace", "pinned")
+    op.drop_column("showcase_workspace", "archived")
diff --git a/app/features/demo/models.py b/app/features/demo/models.py
index 30ad586e..4a50eb4a 100644
--- a/app/features/demo/models.py
+++ b/app/features/demo/models.py
@@ -10,6 +10,17 @@
 references the run). E1 of the showcase-workspace initiative (umbrella #389,
 epic #390).
 
+E1 of the showcase-completion initiative (umbrella #406, epic #407) adds the
+metadata + provenance backbone: lifecycle columns (``archived`` / ``pinned`` /
+``notes`` / ``tags`` / ``config_schema_version``), the replay-provenance
+column ``replayed_from_workspace_id`` -- ALSO a soft reference, deliberately
+no ForeignKey, not even self-referential: ancestor rows must stay
+independently deletable (metadata-only delete) without cascading to or
+blocking descendants, so dangling lineage pointers are expected -- and six
+documented JSONB story slots (``seed_overrides`` / ``user_scope`` /
+``approval_events`` / ``rag_events`` / ``job_ids`` / ``phase_summaries``)
+that stay NULL until their writer epic lands (#408-#412).
+
 GOTCHA: SQLAlchemy reserves the declarative attribute name ``metadata``; the
 JSONB columns are therefore named ``created_objects`` and ``result_summary``.
 """
@@ -19,7 +30,7 @@
 import datetime as _dt
 from typing import Any
 
-from sqlalchemy import CheckConstraint, Date, Index, Integer, String, text
+from sqlalchemy import CheckConstraint, Date, Index, Integer, String, Text, text
 from sqlalchemy.dialects.postgresql import JSONB
 from sqlalchemy.orm import Mapped, mapped_column
 
@@ -52,6 +63,18 @@ class ShowcaseWorkspace(TimestampMixin, Base):
         date_end: Seeded data window end; NULL when unknown.
         created_objects: Soft-reference ids of everything the run created (JSONB).
         result_summary: Winner / WAPE / wall-clock display payload (JSONB).
+        archived: Operator curation flag -- archived rows are still listed in E1.
+        pinned: Operator curation flag -- no behavioral semantics in E1.
+        notes: Free-text operator annotation (capped at the Pydantic boundary).
+        tags: Queryable JSONB string array, GIN-indexed (scenario_plan pattern).
+        config_schema_version: Version of the config + story-slot schema (starts at 1).
+        replayed_from_workspace_id: Soft reference to the replayed source row.
+        seed_overrides: Story slot (E3 #409 writes) -- NULL until written.
+        user_scope: Story slot (E3 #409 writes) -- NULL until written.
+        approval_events: Story slot (E5 #411 writes) -- NULL until written.
+        rag_events: Story slot (E5 #411 writes) -- NULL until written.
+        job_ids: Story slot (later parallel epic writes) -- NULL until written.
+        phase_summaries: Story slot (later parallel epic writes) -- NULL until written.
     """
 
     __tablename__ = "showcase_workspace"
@@ -80,10 +103,73 @@ class ShowcaseWorkspace(TimestampMixin, Base):
     # winner_model_type / winner_wape / wall_clock_s -- display payload.
     result_summary: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
 
+    # ── E1 (#407) — lifecycle metadata ────────────────────────────────────
+    # Orthogonal to ``status`` (which the pipeline owns): archive/pin are
+    # operator curation flags, PATCH-mutable, default false.
+    archived: Mapped[bool] = mapped_column(
+        nullable=False, default=False, server_default=text("false")
+    )
+    pinned: Mapped[bool] = mapped_column(
+        nullable=False, default=False, server_default=text("false")
+    )
+    # Free-text operator annotation; length capped at the Pydantic boundary (2000).
+    notes: Mapped[str | None] = mapped_column(Text, nullable=True)
+    # Queryable JSONB string array -- EXACT scenario_plan.tags pattern
+    # (app/features/scenarios/models.py); GIN-indexed below.
+    tags: Mapped[list[str]] = mapped_column(
+        JSONB, nullable=False, default=list, server_default=text("'[]'::jsonb")
+    )
+    # Version of the workspace config + story-slot schema (umbrella #406
+    # junk-drawer mitigation). Bump the ORM default when a slot shape changes.
+    config_schema_version: Mapped[int] = mapped_column(
+        Integer, nullable=False, default=1, server_default=text("1")
+    )
+
+    # ── E1 (#407) — replay provenance ─────────────────────────────────────
+    # SOFT reference to the workspace this run replayed (uuid4().hex of the
+    # source row). Deliberately NO ForeignKey -- not even self-referential:
+    # ancestor rows must stay independently deletable (metadata-only delete),
+    # and dangling lineage pointers are expected, like every created_objects id.
+    replayed_from_workspace_id: Mapped[str | None] = mapped_column(String(32), nullable=True)
+
+    # ── E1 (#407) — documented JSONB story slots ──────────────────────────
+    # Six dedicated nullable JSONB columns (precedent: created_objects /
+    # result_summary). NULL = "slot never written" (distinct from empty).
+    # E1 writes NONE of them; documented schema per slot (authoritative copy
+    # in docs/_base/DOMAIN_MODEL.md):
+    #   seed_overrides   (E3 #409 writes) — dict: the curated seeder-override
+    #                    payload from the start frame, stored verbatim
+    #                    (model_dump(mode="json")); replay echoes it.
+    #   user_scope       (E3 #409 writes) — dict: operator-selected focus,
+    #                    {"store_id": int, "product_id": int} (additive keys
+    #                    allowed later).
+    #   approval_events  (E5 #411 writes) — list[dict], append-only:
+    #                    {"action_id": str, "tool_name": str,
+    #                     "decision": "approved"|"rejected",
+    #                     "decided_at": iso8601-str, "session_id": str}.
+    #   rag_events       (E5 #411 writes) — list[dict], append-only:
+    #                    {"event": "index"|"retrieve"|"skip", "detail": str,
+    #                     "count": int, "occurred_at": iso8601-str}.
+    #   job_ids          (later parallel epic) — list[str]: job / batch
+    #                    sub-job ids the run submitted (soft references).
+    #   phase_summaries  (later parallel epic) — list[dict], one per phase:
+    #                    {"phase_name": str, "status": "pass"|"fail"|"warn"|"skip",
+    #                     "steps": int, "duration_ms": float}.
+    seed_overrides: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+    user_scope: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+    approval_events: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+    rag_events: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+    job_ids: Mapped[list[str] | None] = mapped_column(JSONB, nullable=True)
+    phase_summaries: Mapped[list[dict[str, Any]] | None] = mapped_column(JSONB, nullable=True)
+
     __table_args__ = (
         CheckConstraint(
             "status IN ('running', 'completed', 'failed')",
             name="ck_showcase_workspace_status",
         ),
         Index("ix_showcase_workspace_status_created", "status", "created_at"),
+        # E1 (#407) — tag containment queries (scenario_plan GIN precedent).
+        Index("ix_showcase_workspace_tags_gin", "tags", postgresql_using="gin"),
+        # E1 (#407) — lineage lookups ("which runs replayed this workspace?").
+        Index("ix_showcase_workspace_replayed_from", "replayed_from_workspace_id"),
     )

From 9e12aadc1246f07691d461a7db0308e8b2990f43 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 00:12:39 +0200
Subject: [PATCH 07/32] feat(api): add workspace patch lifecycle endpoint and
 replay provenance (#407)

---
 app/features/demo/routes.py               |  37 ++++++
 app/features/demo/schemas.py              | 105 ++++++++++++++++-
 app/features/demo/tests/test_models.py    |  82 +++++++++++++
 app/features/demo/tests/test_routes.py    | 121 ++++++++++++++++++++
 app/features/demo/tests/test_schemas.py   | 127 +++++++++++++++++++++
 app/features/demo/tests/test_workspace.py | 133 +++++++++++++++++++++-
 app/features/demo/workspace.py            |  45 +++++++-
 7 files changed, 646 insertions(+), 4 deletions(-)
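
Reviewer note (not part of the patch): the PATCH semantics introduced here
(only present fields change, explicit null clears name/notes, explicit null
on tags/archived/pinned is rejected, unknown keys are rejected) can be
sketched without Pydantic. This is a simplified illustration, assuming plain
dicts stand in for the ORM row and the parsed JSON body; apply_patch is a
hypothetical helper, and type, length and pattern validation are not modelled.

```python
from typing import Any

# Assumption: field groups mirror the WorkspaceUpdateRequest docstring.
NULLABLE = {"name", "notes"}               # explicit null clears these
NOT_NULL = {"tags", "archived", "pinned"}  # explicit null is rejected


def apply_patch(row: dict[str, Any], body: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: return a copy of ``row`` with ``body`` applied."""
    unknown = set(body) - NULLABLE - NOT_NULL
    if unknown:
        # Mirrors extra="forbid": a typo'd key (or "status") fails loudly.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    for key in NOT_NULL & set(body):
        if body[key] is None:
            # Mirrors the explicit-null validator: these back NOT NULL columns.
            raise ValueError(f"{key} does not accept null")
    # Only keys present in the body change; absent keys are left alone.
    return {**row, **body}


row = {
    "name": "e1-patch",
    "notes": "keep",
    "tags": ["demo"],
    "archived": False,
    "pinned": False,
    "status": "running",
}

cleared = apply_patch(row, {"notes": None, "tags": []})
unchanged = apply_patch(row, {})
```

An empty body is a no-op, and tags are cleared by sending an empty list
rather than null, which matches the route and schema tests in this patch.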

diff --git a/app/features/demo/routes.py b/app/features/demo/routes.py
index d2881acb..87584247 100644
--- a/app/features/demo/routes.py
+++ b/app/features/demo/routes.py
@@ -5,6 +5,8 @@
 - ``WS   /demo/stream`` -- streams one StepEvent per step for the live UI.
 - ``GET    /demo/workspaces``                 -- E4 (#393): list saved workspaces.
 - ``GET    /demo/workspaces/{workspace_id}``  -- E4 (#393): one workspace's detail.
+- ``PATCH  /demo/workspaces/{workspace_id}``  -- E1 (#407): partial lifecycle
+  update (rename / notes / tags / archive / pin); ``status`` is not patchable.
 - ``DELETE /demo/workspaces/{workspace_id}``  -- delete the workspace METADATA
   row only; the run's created objects are soft references and stay untouched.
 
@@ -41,6 +43,7 @@
     WorkspaceDetailResponse,
     WorkspaceListItem,
     WorkspaceListResponse,
+    WorkspaceUpdateRequest,
 )
 
 logger = get_logger(__name__)
@@ -135,6 +138,40 @@ async def get_showcase_workspace(
     return WorkspaceDetailResponse.model_validate(row)
 
 
+@router.patch(
+    "/workspaces/{workspace_id}",
+    response_model=WorkspaceDetailResponse,
+    summary="Update a saved showcase workspace's lifecycle metadata",
+    description=(
+        "Partial update: rename / notes / tags / archive / pin. Only fields "
+        "present in the body change; explicit null clears name/notes. The run "
+        "lifecycle status is not patchable."
+    ),
+)
+async def update_showcase_workspace(
+    workspace_id: str,
+    update: WorkspaceUpdateRequest,
+    db: AsyncSession = Depends(get_db),
+) -> WorkspaceDetailResponse:
+    """Update a saved showcase workspace's lifecycle metadata (E1, #407).
+
+    Args:
+        workspace_id: External identifier of the workspace.
+        update: Partial-update body; only provided fields are applied.
+        db: Async database session from dependency.
+
+    Returns:
+        The full updated workspace row.
+
+    Raises:
+        NotFoundError: When no workspace matches ``workspace_id``.
+    """
+    row = await workspace.update_workspace(db, workspace_id, update)
+    if row is None:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+    return WorkspaceDetailResponse.model_validate(row)
+
+
 @router.delete(
     "/workspaces/{workspace_id}",
     status_code=status.HTTP_204_NO_CONTENT,
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index cad7d32e..66bf202b 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -11,7 +11,7 @@
 from datetime import UTC, date, datetime
 from typing import Any, Literal
 
-from pydantic import BaseModel, ConfigDict, Field, model_validator
+from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
 
 from app.shared.seeder.config import ScenarioPreset
 
@@ -76,6 +76,15 @@ class DemoRunRequest(BaseModel):
         pattern=r"^[a-z0-9][a-z0-9\-_]*$",
         description="Optional workspace label; requires preservation='keep'.",
     )
+    # E1 (#407): replay provenance. The frontend Replay handler sends the
+    # SOURCE row's workspace_id; create_workspace records it verbatim on the
+    # NEW row (soft reference -- no existence check). JSON-native str -> no
+    # Field(strict=False) needed.
+    replayed_from_workspace_id: str | None = Field(
+        default=None,
+        pattern=r"^[0-9a-f]{32}$",  # uuid4().hex shape of workspace_id
+        description="workspace_id this run replays; requires preservation='keep'.",
+    )
 
     @model_validator(mode="after")
     def _workspace_name_requires_keep(self) -> DemoRunRequest:
@@ -84,6 +93,67 @@ def _workspace_name_requires_keep(self) -> DemoRunRequest:
             raise ValueError("workspace_name requires preservation='keep'")
         return self
 
+    @model_validator(mode="after")
+    def _replayed_from_requires_keep(self) -> DemoRunRequest:
+        """Reject a lineage pointer on a run that writes no workspace row."""
+        if self.replayed_from_workspace_id is not None and self.preservation != "keep":
+            raise ValueError("replayed_from_workspace_id requires preservation='keep'")
+        return self
+
+
+class WorkspaceUpdateRequest(BaseModel):
+    """Partial lifecycle update for ``PATCH /demo/workspaces/{workspace_id}``.
+
+    exclude_unset semantics: only fields present in the body are applied;
+    explicit ``null`` clears ``name`` / ``notes``. Explicit ``null`` on
+    ``archived`` / ``pinned`` / ``tags`` is rejected (422) -- they back NOT
+    NULL columns; send ``[]`` to clear tags. ``extra="forbid"`` so a typo'd
+    field 422s instead of silently no-opping (RunUpdate precedent,
+    ``app/features/registry/schemas.py``). All fields JSON-native -> the
+    model-level ``strict=True`` needs no per-field override. ``status`` is
+    deliberately absent -- the pipeline owns the run lifecycle.
+    """
+
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    name: str | None = Field(
+        default=None,
+        max_length=100,
+        pattern=r"^[a-z0-9][a-z0-9\-_]*$",  # same as workspace_name
+        description="Rename the workspace; explicit null clears the label.",
+    )
+    notes: str | None = Field(
+        default=None,
+        max_length=2000,
+        description="Free-text annotation; explicit null clears it.",
+    )
+    tags: list[str] | None = Field(
+        default=None,
+        max_length=20,
+        description="Replace the full tag list (not a merge).",
+    )
+    archived: bool | None = Field(default=None, description="Archive flag.")
+    pinned: bool | None = Field(default=None, description="Pin flag.")
+
+    @field_validator("archived", "pinned", "tags")
+    @classmethod
+    def _reject_explicit_null(cls, v: bool | list[str] | None) -> bool | list[str]:
+        """Reject an explicit ``null`` on the NOT NULL-backed optional fields.
+
+        Fires only on explicitly provided values (pydantic skips validators
+        for defaults unless ``validate_default=True``), so an absent field
+        stays unset while an explicit ``{"archived": null}`` / ``{"tags":
+        null}`` 422s instead of reaching the NOT NULL column via
+        ``exclude_unset`` -> ``setattr`` -> IntegrityError 500. tags: send
+        ``[]`` to clear, never ``null``.
+        """
+        if v is None:
+            raise ValueError(
+                "archived/pinned accept only true/false and tags accepts a list "
+                "(send [] to clear) — explicit null is not allowed"
+            )
+        return v
+
 
 class StepEvent(BaseModel):
     """One streamed pipeline event.
@@ -187,6 +257,15 @@ class WorkspaceListItem(BaseModel):
         default=None, description="Winner / WAPE / wall-clock display payload."
     )
     created_at: datetime = Field(..., description="When the run was recorded (UTC).")
+    # E1 (#407) -- additive lifecycle + provenance fields (defaults so
+    # pre-E1 ORM-shaped stand-ins keep validating).
+    archived: bool = Field(default=False, description="Operator archive flag.")
+    pinned: bool = Field(default=False, description="Operator pin flag.")
+    tags: list[str] = Field(default_factory=list, description="Operator tags.")
+    replayed_from_workspace_id: str | None = Field(
+        default=None,
+        description="workspace_id this run replayed (soft reference; may dangle).",
+    )
 
 
 class WorkspaceDetailResponse(WorkspaceListItem):
@@ -200,6 +279,30 @@ class WorkspaceDetailResponse(WorkspaceListItem):
         default_factory=dict,
         description="Soft-reference ids of everything the run created.",
     )
+    # E1 (#407) -- additive lifecycle metadata + the six story slots
+    # (NULL until their writer epic lands; defaults keep pre-E1 stand-ins valid).
+    notes: str | None = Field(default=None, description="Free-text operator annotation.")
+    config_schema_version: int = Field(
+        default=1, description="Version of the config + story-slot schema."
+    )
+    seed_overrides: dict[str, Any] | None = Field(
+        default=None, description="Story slot (E3 #409 writes): seeder-override payload."
+    )
+    user_scope: dict[str, Any] | None = Field(
+        default=None, description="Story slot (E3 #409 writes): operator-selected focus."
+    )
+    approval_events: list[dict[str, Any]] | None = Field(
+        default=None, description="Story slot (E5 #411 writes): HITL approval audit."
+    )
+    rag_events: list[dict[str, Any]] | None = Field(
+        default=None, description="Story slot (E5 #411 writes): RAG event audit."
+    )
+    job_ids: list[str] | None = Field(
+        default=None, description="Story slot (later epic): submitted job/batch ids."
+    )
+    phase_summaries: list[dict[str, Any]] | None = Field(
+        default=None, description="Story slot (later epic): per-phase outcome summary."
+    )
 
 
 class WorkspaceListResponse(BaseModel):
diff --git a/app/features/demo/tests/test_models.py b/app/features/demo/tests/test_models.py
index 91c9d0e5..28791caa 100644
--- a/app/features/demo/tests/test_models.py
+++ b/app/features/demo/tests/test_models.py
@@ -11,6 +11,7 @@
 from datetime import date
 
 import pytest
+from sqlalchemy import select
 from sqlalchemy.exc import IntegrityError
 from sqlalchemy.ext.asyncio import AsyncSession
 
@@ -109,3 +110,84 @@ async def test_showcase_workspace_status_check_violation(db_session: AsyncSessio
     with pytest.raises(IntegrityError):
         await db_session.commit()
     await db_session.rollback()
+
+
+# =============================================================================
+# E1 (#407) -- metadata + provenance backbone
+# =============================================================================
+
+
+async def test_showcase_workspace_e1_defaults_applied(db_session: AsyncSession) -> None:
+    """A minimal insert gets the E1 defaults (ORM + server defaults agree)."""
+    row = _make_row()
+    db_session.add(row)
+    await db_session.commit()
+
+    loaded = await get_workspace(db_session, row.workspace_id)
+    assert loaded is not None
+    assert loaded.archived is False
+    assert loaded.pinned is False
+    assert loaded.notes is None
+    assert loaded.tags == []
+    assert loaded.config_schema_version == 1
+    assert loaded.replayed_from_workspace_id is None
+    # All six story slots stay NULL until their writer epic lands.
+    assert loaded.seed_overrides is None
+    assert loaded.user_scope is None
+    assert loaded.approval_events is None
+    assert loaded.rag_events is None
+    assert loaded.job_ids is None
+    assert loaded.phase_summaries is None
+
+
+async def test_showcase_workspace_tags_containment_query(db_session: AsyncSession) -> None:
+    """tags round-trips as a JSONB string array and answers .contains()."""
+    tagged = _make_row(tags=["workspace:x", "demo"])
+    untagged = _make_row(tags=["other"])
+    db_session.add_all([tagged, untagged])
+    await db_session.commit()
+
+    result = await db_session.execute(
+        select(ShowcaseWorkspace).where(ShowcaseWorkspace.tags.contains(["demo"]))
+    )
+    matches = [r.workspace_id for r in result.scalars().all()]
+    assert tagged.workspace_id in matches
+    assert untagged.workspace_id not in matches
+
+    loaded = await get_workspace(db_session, tagged.workspace_id)
+    assert loaded is not None
+    assert loaded.tags == ["workspace:x", "demo"]
+
+
+async def test_showcase_workspace_story_slot_roundtrip(db_session: AsyncSession) -> None:
+    """A dict slot and a list[dict] slot round-trip through JSONB intact."""
+    seed_overrides = {"noise_sigma": 0.2, "promo_intensity": "high"}
+    approval_events = [
+        {
+            "action_id": "act-1",
+            "tool_name": "save_scenario",
+            "decision": "approved",
+            "decided_at": "2026-06-12T12:00:00+00:00",
+            "session_id": "sess-1",
+        }
+    ]
+    row = _make_row(seed_overrides=seed_overrides, approval_events=approval_events)
+    db_session.add(row)
+    await db_session.commit()
+
+    loaded = await get_workspace(db_session, row.workspace_id)
+    assert loaded is not None
+    assert loaded.seed_overrides == seed_overrides
+    assert loaded.approval_events == approval_events
+
+
+async def test_showcase_workspace_replayed_from_recorded(db_session: AsyncSession) -> None:
+    """replayed_from_workspace_id stores a verbatim soft reference (may dangle)."""
+    dangling_source = uuid.uuid4().hex  # no such row -- dangles by design
+    row = _make_row(replayed_from_workspace_id=dangling_source)
+    db_session.add(row)
+    await db_session.commit()
+
+    loaded = await get_workspace(db_session, row.workspace_id)
+    assert loaded is not None
+    assert loaded.replayed_from_workspace_id == dangling_source
diff --git a/app/features/demo/tests/test_routes.py b/app/features/demo/tests/test_routes.py
index 6fd5b84a..1934c018 100644
--- a/app/features/demo/tests/test_routes.py
+++ b/app/features/demo/tests/test_routes.py
@@ -351,6 +351,92 @@ async def fake_delete(_db, _workspace_id: str) -> bool:
     assert "Workspace not found" in resp.json()["detail"]
 
 
+# =============================================================================
+# E1 (#407) -- PATCH /demo/workspaces/{workspace_id} (unit)
+# =============================================================================
+
+
+async def test_patch_workspace_happy_path(client, monkeypatch):
+    """E1 (#407) -- provided fields update; response echoes the full detail."""
+    seen: dict[str, object] = {}
+
+    async def fake_update(_db, workspace_id: str, update) -> SimpleNamespace:
+        seen["workspace_id"] = workspace_id
+        seen["changes"] = update.model_dump(exclude_unset=True)
+        return _orm_like_row(
+            workspace_id=workspace_id,
+            name="renamed",
+            pinned=True,
+            tags=["t1"],
+        )
+
+    monkeypatch.setattr(workspace, "update_workspace", fake_update)
+
+    resp = await client.patch(
+        "/demo/workspaces/" + "a" * 32,
+        json={"name": "renamed", "pinned": True, "tags": ["t1"]},
+    )
+    assert resp.status_code == 200
+    assert seen["workspace_id"] == "a" * 32
+    assert seen["changes"] == {"name": "renamed", "pinned": True, "tags": ["t1"]}
+    body = resp.json()
+    assert body["name"] == "renamed"
+    assert body["pinned"] is True
+    assert body["tags"] == ["t1"]
+    # Untouched fields ride through from the row.
+    assert body["status"] == "completed"
+    assert body["seed"] == 42
+
+
+async def test_patch_workspace_missing_404_problem_json(client, monkeypatch):
+    """E1 (#407) -- an unknown workspace_id is a 404 problem+json."""
+
+    async def fake_update(_db, _workspace_id: str, _update) -> None:
+        return None
+
+    monkeypatch.setattr(workspace, "update_workspace", fake_update)
+
+    resp = await client.patch("/demo/workspaces/" + "0" * 32, json={"pinned": True})
+    assert resp.status_code == 404
+    assert resp.headers["content-type"].startswith("application/problem+json")
+    assert "Workspace not found" in resp.json()["detail"]
+
+
+async def test_patch_workspace_unknown_field_422(client):
+    """E1 (#407) -- extra='forbid': a typo'd field is a 422 problem+json."""
+    resp = await client.patch("/demo/workspaces/" + "a" * 32, json={"bogus": 1})
+    assert resp.status_code == 422
+    assert resp.headers["content-type"].startswith("application/problem+json")
+
+
+async def test_patch_workspace_explicit_null_archived_422(client):
+    """E1 (#407) -- explicit null on a NOT NULL-backed field is a 422."""
+    resp = await client.patch("/demo/workspaces/" + "a" * 32, json={"archived": None})
+    assert resp.status_code == 422
+    assert resp.headers["content-type"].startswith("application/problem+json")
+
+
+async def test_patch_workspace_empty_body_noop_200(client, monkeypatch):
+    """E1 (#407) -- an empty body is a 200 no-op returning the current row."""
+
+    async def fake_update(_db, workspace_id: str, update) -> SimpleNamespace:
+        assert update.model_dump(exclude_unset=True) == {}
+        return _orm_like_row(workspace_id=workspace_id)
+
+    monkeypatch.setattr(workspace, "update_workspace", fake_update)
+
+    resp = await client.patch("/demo/workspaces/" + "a" * 32, json={})
+    assert resp.status_code == 200
+    assert resp.json()["workspace_id"] == "a" * 32
+
+
+async def test_run_demo_rejects_replayed_from_without_keep_422(client):
+    """E1 (#407) -- a lineage pointer without preservation='keep' is a 422."""
+    resp = await client.post("/demo/run", json={"replayed_from_workspace_id": "a" * 32})
+    assert resp.status_code == 422
+    assert resp.headers["content-type"].startswith("application/problem+json")
+
+
 # =============================================================================
 # E4 (#393) -- workspace GET routes against real Postgres (integration)
 # =============================================================================
@@ -405,6 +491,41 @@ async def test_get_workspace_integration_round_trip(client, db_session: AsyncSes
     assert missing.headers["content-type"].startswith("application/problem+json")
 
 
+@pytest.mark.integration
+async def test_patch_workspace_integration_round_trip(client, db_session: AsyncSession):
+    """E1 (#407) -- PATCH round-trips rename/notes/tags/archive/pin on a real row."""
+    workspace_id = await workspace.create_workspace(
+        DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": "e1-patch"})
+    )
+    assert workspace_id is not None
+
+    resp = await client.patch(
+        f"/demo/workspaces/{workspace_id}",
+        json={
+            "name": "e1-renamed",
+            "notes": "kept for review",
+            "tags": ["smoke", "workspace:e1"],
+            "archived": True,
+            "pinned": True,
+        },
+    )
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["name"] == "e1-renamed"
+    assert body["notes"] == "kept for review"
+    assert body["tags"] == ["smoke", "workspace:e1"]
+    assert body["archived"] is True
+    assert body["pinned"] is True
+    # The pipeline-owned lifecycle status is untouched.
+    assert body["status"] == "running"
+
+    # The change persisted -- the detail endpoint reads it back.
+    detail = await client.get(f"/demo/workspaces/{workspace_id}")
+    assert detail.status_code == 200
+    assert detail.json()["name"] == "e1-renamed"
+    assert detail.json()["archived"] is True
+
+
 @pytest.mark.integration
 async def test_delete_workspace_integration_round_trip(client, db_session: AsyncSession):
     """DELETE removes exactly the target metadata row; a re-delete is 404."""
diff --git a/app/features/demo/tests/test_schemas.py b/app/features/demo/tests/test_schemas.py
index c4e120f2..866f708c 100644
--- a/app/features/demo/tests/test_schemas.py
+++ b/app/features/demo/tests/test_schemas.py
@@ -13,6 +13,7 @@
     WorkspaceDetailResponse,
     WorkspaceListItem,
     WorkspaceListResponse,
+    WorkspaceUpdateRequest,
 )
 from app.shared.seeder.config import ScenarioPreset
 
@@ -102,6 +103,89 @@ def test_demo_run_request_rejects_unknown_preservation():
         DemoRunRequest.model_validate({"preservation": "archive"})
 
 
+# =============================================================================
+# E1 (#407) -- replayed_from_workspace_id (replay provenance)
+# =============================================================================
+
+
+def test_demo_run_request_replayed_from_default_none():
+    """E1 (#407) -- default None; a legacy frame without the key validates."""
+    assert DemoRunRequest().replayed_from_workspace_id is None
+    legacy = DemoRunRequest.model_validate({"seed": 7})
+    assert legacy.replayed_from_workspace_id is None
+
+
+def test_demo_run_request_replayed_from_json_path():
+    """E1 (#407) -- the JSON wire form (validate_python on a parsed dict, the
+    path FastAPI uses) accepts keep + a 32-hex lineage pointer."""
+    req = DemoRunRequest.model_validate(
+        {"preservation": "keep", "replayed_from_workspace_id": "a" * 32}
+    )
+    assert req.replayed_from_workspace_id == "a" * 32
+
+
+def test_demo_run_request_replayed_from_requires_keep():
+    """E1 (#407) -- a lineage pointer without preservation='keep' is rejected."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"replayed_from_workspace_id": "a" * 32})
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate(
+            {"preservation": "ephemeral", "replayed_from_workspace_id": "a" * 32}
+        )
+
+
+def test_demo_run_request_replayed_from_pattern_rejected():
+    """E1 (#407) -- values off the uuid4().hex shape are rejected."""
+    for bad in ("not-hex!" + "0" * 24, "A" * 32, "a" * 31, "a" * 33):
+        with pytest.raises(ValidationError):
+            DemoRunRequest.model_validate(
+                {"preservation": "keep", "replayed_from_workspace_id": bad}
+            )
+
+
+# =============================================================================
+# E1 (#407) -- WorkspaceUpdateRequest (PATCH body)
+# =============================================================================
+
+
+def test_workspace_update_request_partial_fields_set():
+    """E1 (#407) -- exclude_unset distinguishes absent from explicit null."""
+    cleared = WorkspaceUpdateRequest.model_validate({"notes": None})
+    assert cleared.model_dump(exclude_unset=True) == {"notes": None}
+    empty = WorkspaceUpdateRequest.model_validate({})
+    assert empty.model_dump(exclude_unset=True) == {}
+
+
+def test_workspace_update_request_rejects_unknown_key():
+    """E1 (#407) -- extra='forbid': status (and any typo) is not patchable."""
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"status": "archived"})
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"archvied": True})
+
+
+def test_workspace_update_request_name_pattern_and_tags_cap():
+    """E1 (#407) -- name pattern + the 20-item tag cap are enforced."""
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"name": "Bad Name!"})
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"tags": [f"t{i}" for i in range(21)]})
+    ok = WorkspaceUpdateRequest.model_validate({"tags": ["workspace:x", "demo"]})
+    assert ok.tags == ["workspace:x", "demo"]
+
+
+def test_workspace_update_request_rejects_explicit_null_flags():
+    """E1 (#407) -- explicit null on the NOT NULL-backed fields is a 422."""
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"archived": None})
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"pinned": None})
+    with pytest.raises(ValidationError):
+        WorkspaceUpdateRequest.model_validate({"tags": None})
+    # The sanctioned clear path: an empty list, never null.
+    assert WorkspaceUpdateRequest.model_validate({"tags": []}).tags == []
+
+
 def test_step_event_json_round_trip():
     event = StepEvent(
         event_type="step_complete",
@@ -266,6 +350,49 @@ def test_workspace_detail_tolerates_running_row_nulls():
     assert detail.result_summary is None
 
 
+def test_workspace_responses_default_e1_fields_for_pre_e1_rows():
+    """E1 (#407) -- pre-E1 ORM-shaped rows (no new attrs) still validate;
+    the additive fields fall back to their defaults."""
+    item = WorkspaceListItem.model_validate(_orm_like_workspace_row())
+    assert item.archived is False
+    assert item.pinned is False
+    assert item.tags == []
+    assert item.replayed_from_workspace_id is None
+
+    detail = WorkspaceDetailResponse.model_validate(_orm_like_workspace_row())
+    assert detail.notes is None
+    assert detail.config_schema_version == 1
+    assert detail.seed_overrides is None
+    assert detail.user_scope is None
+    assert detail.approval_events is None
+    assert detail.rag_events is None
+    assert detail.job_ids is None
+    assert detail.phase_summaries is None
+
+
+def test_workspace_detail_passes_e1_fields_through():
+    """E1 (#407) -- populated lifecycle + slot values ride through verbatim."""
+    detail = WorkspaceDetailResponse.model_validate(
+        _orm_like_workspace_row(
+            archived=True,
+            pinned=True,
+            tags=["demo", "workspace:x"],
+            replayed_from_workspace_id="b" * 32,
+            notes="kept for the quarterly review",
+            config_schema_version=1,
+            seed_overrides={"noise_sigma": 0.2},
+            job_ids=["job-1", "job-2"],
+        )
+    )
+    assert detail.archived is True
+    assert detail.pinned is True
+    assert detail.tags == ["demo", "workspace:x"]
+    assert detail.replayed_from_workspace_id == "b" * 32
+    assert detail.notes == "kept for the quarterly review"
+    assert detail.seed_overrides == {"noise_sigma": 0.2}
+    assert detail.job_ids == ["job-1", "job-2"]
+
+
 def test_workspace_list_response_shape():
     """E4 (#393) -- page shape mirrors the scenarios list (items + total)."""
     item = WorkspaceListItem.model_validate(_orm_like_workspace_row())
diff --git a/app/features/demo/tests/test_workspace.py b/app/features/demo/tests/test_workspace.py
index 0b002be3..cb28dea2 100644
--- a/app/features/demo/tests/test_workspace.py
+++ b/app/features/demo/tests/test_workspace.py
@@ -20,7 +20,7 @@
     WORKSPACE_STATUS_RUNNING,
 )
 from app.features.demo.pipeline import DemoContext
-from app.features.demo.schemas import DemoRunRequest
+from app.features.demo.schemas import DemoRunRequest, WorkspaceUpdateRequest
 from app.shared.seeder.config import ScenarioPreset
 
 pytestmark = pytest.mark.integration
@@ -182,3 +182,134 @@ async def test_delete_workspace_removes_only_target_row(db_session: AsyncSession
 async def test_delete_workspace_missing_returns_false(db_session: AsyncSession) -> None:
     """delete_workspace returns False (no raise) for an unknown id."""
     assert await workspace.delete_workspace(db_session, "0" * 32) is False
+
+
+# =============================================================================
+# E1 (#407) -- replay provenance recording
+# =============================================================================
+
+
+async def test_create_workspace_records_replayed_from(db_session: AsyncSession) -> None:
+    """create_workspace records the lineage pointer verbatim on the NEW row."""
+    source_id = "a" * 32  # soft reference -- no row needs to exist
+    workspace_id = await workspace.create_workspace(
+        _keep_request(replayed_from_workspace_id=source_id)
+    )
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.replayed_from_workspace_id == source_id
+
+
+async def test_create_workspace_without_replayed_from_is_none(db_session: AsyncSession) -> None:
+    """A fresh keep-run (no lineage pointer) records NULL -- legacy identical."""
+    workspace_id = await workspace.create_workspace(_keep_request())
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.replayed_from_workspace_id is None
+    assert row.archived is False
+    assert row.pinned is False
+    assert row.tags == []
+    assert row.config_schema_version == 1
+
+
+# =============================================================================
+# E1 (#407) -- update_workspace (PATCH helper)
+# =============================================================================
+
+
+async def test_update_workspace_partial_leaves_other_fields(db_session: AsyncSession) -> None:
+    """Only provided fields change; everything else is untouched."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-upd"))
+    assert workspace_id is not None
+
+    row = await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"name": "it-upd-renamed", "pinned": True}),
+    )
+    assert row is not None
+    assert row.name == "it-upd-renamed"
+    assert row.pinned is True
+    # Untouched fields keep their values.
+    assert row.archived is False
+    assert row.notes is None
+    assert row.tags == []
+    assert row.seed == 7
+    assert row.status == WORKSPACE_STATUS_RUNNING
+
+
+async def test_update_workspace_explicit_null_clears_name(db_session: AsyncSession) -> None:
+    """An explicit null clears name/notes (exclude_unset keeps it in changes)."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-clear"))
+    assert workspace_id is not None
+    await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"notes": "temporary"}),
+    )
+
+    row = await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"name": None, "notes": None}),
+    )
+    assert row is not None
+    assert row.name is None
+    assert row.notes is None
+
+
+async def test_update_workspace_tags_replaced_whole(db_session: AsyncSession) -> None:
+    """tags is replaced as a whole list (never merged); [] clears it."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-tags"))
+    assert workspace_id is not None
+
+    row = await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"tags": ["a", "b"]}),
+    )
+    assert row is not None
+    assert row.tags == ["a", "b"]
+
+    row = await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"tags": ["c"]}),
+    )
+    assert row is not None
+    assert row.tags == ["c"]  # replaced, not merged
+
+    row = await workspace.update_workspace(
+        db_session,
+        workspace_id,
+        WorkspaceUpdateRequest.model_validate({"tags": []}),
+    )
+    assert row is not None
+    assert row.tags == []
+
+
+async def test_update_workspace_missing_returns_none(db_session: AsyncSession) -> None:
+    """update_workspace returns None for an unknown id (route maps to 404)."""
+    result = await workspace.update_workspace(
+        db_session,
+        "0" * 32,
+        WorkspaceUpdateRequest.model_validate({"pinned": True}),
+    )
+    assert result is None
+
+
+async def test_update_workspace_empty_request_noop(db_session: AsyncSession) -> None:
+    """An empty request is a no-op that still returns the row."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-noop"))
+    assert workspace_id is not None
+
+    row = await workspace.update_workspace(
+        db_session, workspace_id, WorkspaceUpdateRequest.model_validate({})
+    )
+    assert row is not None
+    assert row.name == "it-noop"
+    assert row.status == WORKSPACE_STATUS_RUNNING
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index b0e65dad..0af35a50 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -15,7 +15,10 @@
 :func:`get_workspace` / :func:`list_workspaces` / :func:`count_workspaces` are
 routed since E4 (epic #393) by ``GET /demo/workspaces`` and
 ``GET /demo/workspaces/{workspace_id}`` in ``app/features/demo/routes.py``;
-:func:`delete_workspace` backs ``DELETE /demo/workspaces/{workspace_id}``.
+:func:`delete_workspace` backs ``DELETE /demo/workspaces/{workspace_id}``;
+:func:`update_workspace` backs ``PATCH /demo/workspaces/{workspace_id}``
+(E1, #407). The request-scoped helpers take a caller-owned session and raise
+normally -- the warn-and-continue contract is pipeline-only.
 """
 
 from __future__ import annotations
@@ -33,7 +36,7 @@
     WORKSPACE_STATUS_FAILED,
     ShowcaseWorkspace,
 )
-from app.features.demo.schemas import DemoRunRequest
+from app.features.demo.schemas import DemoRunRequest, WorkspaceUpdateRequest
 
 if TYPE_CHECKING:
     # NOTE: pipeline imports this module at runtime; importing DemoContext
@@ -65,6 +68,9 @@ async def create_workspace(req: DemoRunRequest) -> str | None:
                     scenario=req.scenario.value,
                     reset=req.reset,
                     skip_seed=req.skip_seed,
+                    # E1 (#407): replay provenance, recorded verbatim (soft
+                    # reference -- no existence check; dangling is by design).
+                    replayed_from_workspace_id=req.replayed_from_workspace_id,
                 )
             )
             await db.commit()
@@ -171,6 +177,41 @@ async def get_workspace(db: AsyncSession, workspace_id: str) -> ShowcaseWorkspac
     return result.scalar_one_or_none()
 
 
+async def update_workspace(
+    db: AsyncSession,
+    workspace_id: str,
+    update: WorkspaceUpdateRequest,
+) -> ShowcaseWorkspace | None:
+    """Apply a partial lifecycle update; return the row or ``None`` when missing.
+
+    ``exclude_unset`` distinguishes absent fields from explicit ``null`` --
+    only fields present in the request body are applied (explicit ``null``
+    clears ``name`` / ``notes``; the schema rejects ``null`` on the NOT NULL
+    columns). JSONB values are assigned WHOLE (never mutated in place) so
+    SQLAlchemy change detection fires. An empty request is a no-op that still
+    returns the row.
+
+    Args:
+        db: An open async session (caller-owned; this backs an HTTP route,
+            NOT the pipeline -- it raises normally, no warn-and-continue).
+        workspace_id: The external id of the row to update.
+        update: The validated partial-update request.
+
+    Returns:
+        The updated row, or ``None`` when no row matched (route maps to 404).
+    """
+    row = await get_workspace(db, workspace_id)
+    if row is None:
+        return None
+    changes = update.model_dump(exclude_unset=True)  # absent != explicit null
+    for field, value in changes.items():
+        setattr(row, field, value)  # whole-value assignment (JSONB gotcha)
+    await db.commit()
+    await db.refresh(row)
+    logger.info("demo.workspace_updated", workspace_id=workspace_id, fields=sorted(changes))
+    return row
+
+
 async def list_workspaces(
     db: AsyncSession,
     *,

From e26de84e5aec87898f0b0b5b5c9d2955dc6dd072 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 00:12:52 +0200
Subject: [PATCH 08/32] feat(ui): send replayed_from_workspace_id on showcase
 replay (#407)

---
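Reviewer note (illustrative, not part of the change): the payload built in showcase.tsx has to satisfy the `replayed_from_workspace_id` contract documented elsewhere in this series, namely a `^[0-9a-f]{32}$` value that is only accepted together with `preservation='keep'`. The stdlib-only sketch below mirrors that rule for quick reasoning. `build_replay_request` and `check_lineage` are hypothetical names, and the real validation lives in the Pydantic `DemoRunRequest` schema, which may differ in detail.

```python
import re

# Pattern quoted from the API contract in this series (a uuid4().hex value).
WORKSPACE_ID_RE = re.compile(r"^[0-9a-f]{32}$")


def build_replay_request(ws: dict) -> dict:
    """Hypothetical Python mirror of the Replay payload built in showcase.tsx."""
    payload = {
        "seed": ws["seed"],
        "scenario": ws["scenario"],
        "reset": ws["reset"],
        "skip_seed": ws["skip_seed"],
        "preservation": "keep",
        # Lineage pointer: the source row's id, recorded on the NEW row.
        "replayed_from_workspace_id": ws["workspace_id"],
    }
    if ws.get("name"):
        # The recorded name is only sent when the source row has one.
        payload["workspace_name"] = ws["name"]
    return payload


def check_lineage(payload: dict) -> None:
    """Raise ValueError where the documented contract answers with a 422."""
    source_id = payload.get("replayed_from_workspace_id")
    if source_id is None:
        return  # no lineage pointer: nothing to check
    if payload.get("preservation", "ephemeral") != "keep":
        raise ValueError("replayed_from_workspace_id requires preservation='keep'")
    if not WORKSPACE_ID_RE.fullmatch(source_id):
        raise ValueError("replayed_from_workspace_id must be 32 lowercase hex chars")
```

Under these assumptions, any payload produced by `build_replay_request` from a stored workspace row passes `check_lineage`, because stored ids are uuid4 hex strings and the payload always carries `preservation='keep'`.
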
 frontend/src/pages/showcase.tsx | 2 ++
 frontend/src/types/api.ts       | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index b7eb4444..9643de1a 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -181,6 +181,8 @@ export default function ShowcasePage() {
       reset: ws.reset,
       skip_seed: ws.skip_seed,
       preservation: 'keep',
+      // E1 (#407) — record replay lineage on the NEW row (soft reference).
+      replayed_from_workspace_id: ws.workspace_id,
       ...(ws.name ? { workspace_name: ws.name } : {}),
     })
   }
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index 93de98cc..1232e991 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -785,6 +785,8 @@ export interface DemoRunRequest {
   // Omit both to keep the legacy ephemeral behavior byte-identical.
   preservation?: 'ephemeral' | 'keep'
   workspace_name?: string
+  // E1 (#407) — replay provenance: the source workspace_id a Replay re-runs.
+  replayed_from_workspace_id?: string
 }
 
 // Aggregate result returned by the synchronous POST /demo/run.

From 493a9a436cce5810e68d0053f7cb23d494edf95b Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 00:12:52 +0200
Subject: [PATCH 09/32] docs(docs): document workspace story slots and patch
 contract (#407)

---
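Reviewer note (illustrative, not part of the change): the PATCH contract documented in this patch separates three cases per field: absent (untouched), explicit `null` (clears `name`/`notes`, rejected for `archived`/`pinned`/`tags`), and a value (replaces the field as a whole). The sketch below models that on plain dicts, where key presence stands in for Pydantic's `exclude_unset`. `apply_patch` is a hypothetical helper, it does no type checking, and reusing the `workspace_name` pattern for `name` is an assumption.

```python
import re

PATCHABLE = {"name", "notes", "tags", "archived", "pinned"}
NOT_NULLABLE = {"tags", "archived", "pinned"}  # NOT NULL-backed columns
# Assumption: PATCH `name` uses the same pattern as `workspace_name`.
NAME_RE = re.compile(r"^[a-z0-9][a-z0-9\-_]*$")
MAX_TAGS = 20


def apply_patch(row: dict, body: dict) -> dict:
    """Return a copy of `row` with only the keys present in `body` applied."""
    unknown = set(body) - PATCHABLE
    if unknown:
        # Mirrors extra='forbid': `status` and typos are not patchable.
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    for field in NOT_NULLABLE & set(body):
        if body[field] is None:
            raise ValueError(f"{field} cannot be null (send [] to clear tags)")
    name = body.get("name")
    if name is not None and not NAME_RE.fullmatch(name):
        raise ValueError("name does not match the allowed pattern")
    tags = body.get("tags")
    if tags is not None and len(tags) > MAX_TAGS:
        raise ValueError("at most 20 tags")
    updated = dict(row)
    for field, value in body.items():
        # Whole-value replacement: lists are copied, never merged in place.
        updated[field] = list(value) if isinstance(value, list) else value
    return updated
```

An empty body returns an unchanged copy, matching the documented `200` no-op; `{"notes": None}` clears the note while leaving every other field alone.
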
 docs/_base/API_CONTRACTS.md | 11 ++++++-----
 docs/_base/DOMAIN_MODEL.md  | 15 ++++++++++++---
 docs/_base/RUNBOOKS.md      |  2 +-
 3 files changed, 19 insertions(+), 9 deletions(-)
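
Reviewer note (illustrative, not part of the change): the DOMAIN_MODEL.md invariant added here makes `replayed_from_workspace_id` a soft reference with no ForeignKey, so deleting an ancestor row must neither block nor cascade, and descendants simply keep a dangling pointer. The in-memory sketch below shows that behaviour. `WorkspaceStore` and its `lineage` walk are hypothetical; this series records the pointer only and defers rendering lineage to a later epic.

```python
from __future__ import annotations

from uuid import uuid4


class WorkspaceStore:
    """Hypothetical in-memory stand-in for the showcase_workspace table."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def create(self, replayed_from: str | None = None) -> str:
        # The pointer is stored verbatim: no existence check on the source.
        workspace_id = uuid4().hex
        self.rows[workspace_id] = {
            "workspace_id": workspace_id,
            "replayed_from_workspace_id": replayed_from,
        }
        return workspace_id

    def delete(self, workspace_id: str) -> bool:
        # Metadata-only delete: no other row is touched or consulted.
        return self.rows.pop(workspace_id, None) is not None

    def lineage(self, workspace_id: str) -> list[str]:
        """List ancestor ids, ending at the first dangling or missing pointer."""
        chain: list[str] = []
        seen: set[str] = set()
        current = self.rows.get(workspace_id)
        while current is not None:
            parent = current["replayed_from_workspace_id"]
            if parent is None or parent in seen:
                break  # root reached, or a cycle in hand-written ids
            chain.append(parent)
            seen.add(parent)
            current = self.rows.get(parent)  # None when the ancestor is gone
        return chain
```

Deleting the oldest ancestor in this model leaves every descendant row intact, and the recorded pointer still names the deleted id, which is the "expected and harmless" dangling state the invariant describes.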

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index d307d2a9..47bc7b6e 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -58,10 +58,11 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | agents | WS | `/agents/stream` | Token-by-token streaming + tool-call events |
 | seeder | (see `app/features/seeder/routes.py`) | `/seeder/*` | Trigger scenarios, status, customization |
 | seeder | POST | `/seeder/phase2-enrichment` | PRP-38 — run Phase 2 generators (lifecycle, replenishment, exogenous, returns) against the existing seeded data. `422 application/problem+json` on an empty database. |
-| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. |
+| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). |
 | demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page |
-| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table |
-| demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing |
+| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`; archived rows still list (default-filtering is E2 #408) |
+| demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing. **E1 (#407)** — response additively carries the list-item lifecycle fields plus `notes`, `config_schema_version`, and the six story slots (`seed_overrides` / `user_scope` / `approval_events` / `rag_events` / `job_ids` / `phase_summaries` — all `null` until their writer epic lands; schemas in `docs/_base/DOMAIN_MODEL.md`) |
+| demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
 | demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
 | config | GET | `/config/ai` | Effective AI-model config (agent LLM + RAG embeddings); API keys masked, never raw |
 | config | PATCH | `/config/ai` | Persist + apply AI-model changes live (no restart). `409` if an embedding-dimension change would orphan indexed RAG chunks (resend with `force=true`) |
@@ -86,7 +87,7 @@ Verified against `app/features/agents/websocket.py` and `app/features/agents/sch
 
 Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified against `app/features/demo/routes.py` and `app/features/demo/schemas.py` (`StepEvent`).
 
-- **Client → server (one start frame):** `{"seed": int, "reset": bool, "skip_seed": bool, "scenario"?: "demo_minimal" | "showcase_rich" | "sparse", "preservation"?: "ephemeral" | "keep", "workspace_name"?: str}` — all fields optional (`DemoRunRequest` supplies defaults `seed=42`, `reset=false`, `skip_seed=true`, `scenario="demo_minimal"`, `preservation="ephemeral"`, `workspace_name=null`). E1 (#390) — `workspace_name` requires `preservation="keep"` (else one `error` event from validation); unknown start-frame keys remain ignored (forward/backward compat). E2 (#391) — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. The pipeline runs once, then the server closes.
+- **Client → server (one start frame):** `{"seed": int, "reset": bool, "skip_seed": bool, "scenario"?: "demo_minimal" | "showcase_rich" | "sparse", "preservation"?: "ephemeral" | "keep", "workspace_name"?: str}` — all fields optional (`DemoRunRequest` supplies defaults `seed=42`, `reset=false`, `skip_seed=true`, `scenario="demo_minimal"`, `preservation="ephemeral"`, `workspace_name=null`). E1 (#390) — `workspace_name` requires `preservation="keep"` (else one `error` event from validation); unknown start-frame keys remain ignored (forward/backward compat). E2 (#391) — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. E1 (#407) — the start frame additively accepts `replayed_from_workspace_id?: str` (`^[0-9a-f]{32}$`, requires `preservation="keep"` else one `error` event from validation); the Showcase Replay button sends the source row's `workspace_id`, recorded verbatim on the NEW row as a soft reference. The pipeline runs once, then the server closes.
 - **Server → client (every frame):** Pydantic-serialized `StepEvent` — `{"event_type", "step_name", "step_index", "total_steps", "status", "detail", "duration_ms", "data", "timestamp", "phase_name"?, "phase_index"?, "phase_total"?}`. PRP-38 — the three `phase_*` fields are Optional + Nullable so legacy clients that don't render phases keep working.
 - **`event_type` values (Literal in `StepEvent`):**
   - `step_start` — a step began; `status` is `null`.
@@ -97,7 +98,7 @@ Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified ag
 - PRP-38 — `scenario="showcase_rich"` extends the data phase with `phase2_enrichment` + `historical_backfill` steps and the modeling phase with `v2_train` (one V2 `prophet_like` run). Phase ids are `data` / `modeling` / `decision` / `verify` / `agent` / `cleanup` (6 phases).
 - PRP-40 — `scenario="showcase_rich"` ALSO adds two phases inserted BEFORE `verify`: `planning` (2 steps — `scenario_simulate_and_save`, `multi_plan_compare`) and `knowledge` (3 steps — `embedding_provider_probe`, `rag_index_subset`, `rag_retrieve_probe`). Total step count: 19 for `showcase_rich`, 11 for `demo_minimal` and `sparse`. Phase ids on `showcase_rich` are `data` / `modeling` / `decision` / `planning` / `knowledge` / `verify` / `agent` / `cleanup` (8 phases). The knowledge steps SKIP gracefully when the embedding provider is unreachable; the pipeline still goes green.
 - E3 (#392) — the planning-phase steps tag the plans they save: pipeline-saved plans now carry `source:showcase` (alongside the legacy `showcase` + `price`/`holiday` tags), and on `preservation="keep"` runs additionally `workspace:<workspace_name|workspace_id>` — retrievable via `GET /scenarios?tags=workspace:<label>` (JSONB containment, all listed tags must match). The `scenario_simulate_and_save` step's `data` additively echoes the `tags` list it sent.
-- E4 (#393) — the start frame's E1 preservation fields are now exercised by the Showcase UI ("Save as workspace" checkbox + name + seed inputs). **Replay** re-submits a recorded workspace's config verbatim (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"` (+ the recorded `workspace_name`), creating a NEW `showcase_workspace` row each time — the original row is never mutated; names are non-unique by design. Saved rows are read back over `GET /demo/workspaces` (+ `/{workspace_id}`).
+- E4 (#393) — the start frame's E1 preservation fields are now exercised by the Showcase UI ("Save as workspace" checkbox + name + seed inputs). **Replay** re-submits a recorded workspace's config verbatim (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"` (+ the recorded `workspace_name`), creating a NEW `showcase_workspace` row each time — the original row is never mutated; names are non-unique by design. Saved rows are read back over `GET /demo/workspaces` (+ `/{workspace_id}`). E1 (#407) — the Replay start frame now also sends `replayed_from_workspace_id: <source workspace_id>`, so replays carry lineage (the rendering of that lineage is E2 #408).
 
 ## Async Events / Queues
 
diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index 1ec3200b..24137fc2 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -58,16 +58,25 @@
 ### `showcase_workspace` (Demo)
 - **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run. Ephemeral runs (the default) write no row; a `workspace_name` merely labels a keep-run row (names are non-unique).
 - **Status state machine:** `running` → `completed` | `failed` (CHECK-constrained; the finalize hook settles the row even on mid-run failure).
-- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the two JSONB payloads below.
+- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants).
 - **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
+- **JSONB story slots (E1 #407 — authoritative per-slot schema):** six dedicated nullable JSONB columns; `NULL` = "slot never written" (distinct from empty). E1 ships the columns only — each slot has an assigned writer epic:
+  - `seed_overrides` (E3 #409 writes) — dict: the curated seeder-override payload from the start frame, stored verbatim (`model_dump(mode="json")`); replay echoes it.
+  - `user_scope` (E3 #409 writes) — dict: operator-selected focus, `{"store_id": int, "product_id": int}` (additive keys allowed later).
+  - `approval_events` (E5 #411 writes) — list[dict], append-only: `{"action_id": str, "tool_name": str, "decision": "approved"|"rejected", "decided_at": iso8601-str, "session_id": str}`.
+  - `rag_events` (E5 #411 writes) — list[dict], append-only: `{"event": "index"|"retrieve"|"skip", "detail": str, "count": int, "occurred_at": iso8601-str}`.
+  - `job_ids` (later parallel epic — E2 #408 / E4 #410 agree on the writer) — list[str]: job / batch sub-job ids the run submitted (soft references).
+  - `phase_summaries` (later parallel epic) — list[dict], one per phase: `{"phase_name": str, "status": "pass"|"fail"|"warn"|"skip", "steps": int, "duration_ms": float}`.
 - **Relationship to demo pipeline runs:** one workspace row per kept pipeline run — `create_workspace` inserts it as `running` before the first step; `finalize_workspace` settles it with the run's collected ids. NOT a seeder `scenario`: a preset is a reusable data-generation recipe; a workspace is the record of ONE concrete run (which preset it used, with what seed, and what it produced).
 - **Invariants:**
   - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) are sufficient for a verbatim Replay through the normal run path — replay never mutates the original row; it creates a NEW row.
   - `name` is deliberately NON-unique; `workspace_id` (UUID hex) is the unique handle.
   - `created_objects` carries SOFT references only — **no ForeignKeys by design**. The workspace row is an audit record, not an ownership root: the referenced runs/plans/aliases are independently operator-deletable, and a workspace must never block (or cascade) their deletion.
   - Deletion is METADATA-ONLY, symmetric with the no-FK design: `DELETE /demo/workspaces/{id}` removes the `showcase_workspace` row and nothing else — the soft-referenced model runs, scenario plans, aliases, jobs, agent sessions, and artifacts survive, and a workspace whose references already dangle still deletes cleanly.
-  - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`).
-- **Out of scope (deliberately not modeled yet):** a `replayed_from` provenance column, export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture, advanced seed config, and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace.
+  - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`). The HTTP-backed helpers (`update_workspace` for PATCH, like get/list/delete) take a caller-owned session and raise normally — warn-and-continue is pipeline-only.
+  - E1 (#407): `replayed_from_workspace_id` is a SOFT reference — **no ForeignKey, not even self-referential**: ancestor workspace rows must stay independently deletable (metadata-only delete) without cascading to or blocking descendants. The value is recorded verbatim from the request (no existence check); dangling lineage pointers after an ancestor delete are expected and harmless, like every `created_objects` id.
+  - E1 (#407): `status` is NOT patchable — `PATCH /demo/workspaces/{id}` covers `name`/`notes`/`tags`/`archived`/`pinned` only; `archived` is an orthogonal curation flag and the `ck_showcase_workspace_status` CHECK is untouched.
+- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture (columns exist as E1 story slots; the writers are E5 #411), advanced seed config (slot exists; writer is E3 #409), and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace.
 
 ## Key Invariants — NEVER violate
 
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index 007176be..1cda5125 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -157,7 +157,7 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`.
 
-**Explicitly out of scope (not implemented; future epics, do not assume they exist):** advanced seed configuration on `/showcase` (beyond seed/scenario/reset/skip_seed); export bundles under `artifacts/showcase/<workspace>/`; a `replayed_from` provenance column (replays are indistinguishable from fresh keep-runs except by name/timestamp); RAG-event and approval-decision capture on the workspace row; full phase-level interactive configuration.
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** advanced seed configuration on `/showcase` (beyond seed/scenario/reset/skip_seed); export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); full phase-level interactive configuration. (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay.)
 
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.

From ab1871534dd6fdb310c6aab3b71f63d2ce22d5f4 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:02 +0200
Subject: [PATCH 10/32] feat(api): add workspace list filters and link-health
 endpoint (#408)

---
 app/features/demo/link_health.py | 178 +++++++++++++++++++++++++++++++
 app/features/demo/routes.py      | 101 ++++++++++++++++--
 app/features/demo/schemas.py     |  46 ++++++++
 app/features/demo/workspace.py   | 106 +++++++++++++++---
 4 files changed, 412 insertions(+), 19 deletions(-)
 create mode 100644 app/features/demo/link_health.py

diff --git a/app/features/demo/link_health.py b/app/features/demo/link_health.py
new file mode 100644
index 00000000..d3748fd4
--- /dev/null
+++ b/app/features/demo/link_health.py
@@ -0,0 +1,178 @@
+"""Soft-reference liveness probes for showcase workspaces (E2, issue #408).
+
+A workspace row records everything its run created as OPAQUE SOFT REFERENCES
+(no ForeignKeys -- see ``app/features/demo/models.py``), so referenced objects
+can be deleted out from under it by design. This module turns that silent
+staleness into a per-workspace health signal.
+
+The demo slice may NOT import another feature slice (vertical-slice rule), so
+liveness is checked through the public HTTP API **in-process** via
+``httpx.ASGITransport`` -- the exact mechanism ``pipeline._Client`` already
+uses (``app/features/demo/pipeline.py``). ``raise_app_exceptions=False`` is
+load-bearing: an unhandled error inside a probed endpoint must surface as a
+500 *response* (classified ``unknown``), never as a re-raised exception.
+
+Classification table:
+
+    2xx              -> "alive"    (the referenced object still exists)
+    404              -> "dead"     (deleted after the run -- expected, designed)
+    anything else    -> "unknown"  (5xx, timeout, transport error -- no false alarms)
+
+A probe NEVER raises -- a flaky slice must not 500 the health route.
+"""
+
+from __future__ import annotations
+
+import asyncio
+from dataclasses import dataclass
+from typing import TYPE_CHECKING, Any
+
+import httpx
+
+from app.features.demo.models import WORKSPACE_STATUS_COMPLETED, ShowcaseWorkspace
+from app.features.demo.schemas import (
+    RefHealthStatus,
+    RefType,
+    WorkspaceHealthResponse,
+    WorkspaceRefHealth,
+)
+
+if TYPE_CHECKING:
+    from fastapi import FastAPI
+
+# Probe budget -- generous for an in-process call; a hung dependency inside a
+# probed endpoint classifies as "unknown" instead of hanging the health route.
+_PROBE_TIMEOUT = httpx.Timeout(10.0, connect=5.0)
+
+
+@dataclass(frozen=True)
+class _ProbeTarget:
+    """One probeable soft reference resolved from a workspace row."""
+
+    key: str  # source key; list entries carry an index, e.g. "scenario_plan_ids[0]", "job_ids[0]"
+    ref_type: RefType
+    ref_id: str
+    probe_path: str  # public API path whose status code decides liveness
+
+
+def build_probe_targets(ws: ShowcaseWorkspace) -> list[_ProbeTarget]:
+    """Map a workspace's soft references to probeable public API paths.
+
+    Non-probeable ``created_objects`` keys (``v2_model_path``,
+    ``scenario_artifact_key``, ``train_model_types``) are skipped -- they have
+    no HTTP identity to check. The E1 ``job_ids`` story slot (CONTRACT(E1)-6)
+    probes through ``GET /jobs/{job_id}`` when present; pre-backfill rows
+    where the slot is NULL are silently skipped.
+    """
+    targets: list[_ProbeTarget] = []
+    objects = ws.created_objects or {}
+
+    def _str_value(key: str) -> str | None:
+        value = objects.get(key)
+        return value if isinstance(value, str) and value else None
+
+    for key in ("winning_run_id", "v2_run_id", "stale_alias_run_id"):
+        run_id = _str_value(key)
+        if run_id:
+            targets.append(_ProbeTarget(key, "model_run", run_id, f"/registry/runs/{run_id}"))
+
+    plan_ids = objects.get("scenario_plan_ids")
+    if isinstance(plan_ids, list):
+        for index, plan_id in enumerate(plan_ids):
+            if isinstance(plan_id, str) and plan_id:
+                targets.append(
+                    _ProbeTarget(
+                        f"scenario_plan_ids[{index}]",
+                        "scenario_plan",
+                        plan_id,
+                        f"/scenarios/{plan_id}",
+                    )
+                )
+
+    alias = _str_value("alias")
+    if alias:
+        targets.append(_ProbeTarget("alias", "alias", alias, f"/registry/aliases/{alias}"))
+
+    batch_id = _str_value("batch_id")
+    if batch_id:
+        targets.append(_ProbeTarget("batch_id", "batch", batch_id, f"/batch/{batch_id}"))
+
+    session_id = _str_value("agent_session_id")
+    if session_id:
+        targets.append(
+            _ProbeTarget(
+                "agent_session_id",
+                "agent_session",
+                session_id,
+                f"/agents/sessions/{session_id}",
+            )
+        )
+
+    # The ORM types job_ids as list[str], but JSONB enforces nothing at
+    # runtime -- treat entries as untrusted (mirrors the created_objects guards).
+    job_ids: list[Any] = list(ws.job_ids or [])
+    for index, job_id in enumerate(job_ids):
+        if isinstance(job_id, str) and job_id:
+            targets.append(_ProbeTarget(f"job_ids[{index}]", "job", job_id, f"/jobs/{job_id}"))
+
+    return targets
+
+
+async def _probe_one(client: httpx.AsyncClient, target: _ProbeTarget) -> WorkspaceRefHealth:
+    """Probe one reference; classify the status code. NEVER raises."""
+    status: RefHealthStatus
+    try:
+        response = await client.get(target.probe_path)
+    except (httpx.HTTPError, OSError):
+        status = "unknown"
+    else:
+        if 200 <= response.status_code < 300:
+            status = "alive"
+        elif response.status_code == 404:
+            status = "dead"
+        else:
+            status = "unknown"
+    return WorkspaceRefHealth(
+        key=target.key,
+        ref_type=target.ref_type,
+        ref_id=target.ref_id,
+        status=status,
+        probe_path=target.probe_path,
+    )
+
+
+async def probe_workspace_links(app: FastAPI, ws: ShowcaseWorkspace) -> WorkspaceHealthResponse:
+    """Probe every soft reference a workspace recorded; aggregate the counts.
+
+    Probes run concurrently. ``partial_run`` flags a row whose pipeline never
+    settled to ``completed`` -- its artifacts may be missing regardless of
+    what the probes find.
+
+    Args:
+        app: The live FastAPI app (``request.app`` -- the slice never imports
+            ``app.main``).
+        ws: The workspace row whose references are probed.
+
+    Returns:
+        The per-reference results plus alive/dead/unknown counts.
+    """
+    targets = build_probe_targets(ws)
+    references: list[WorkspaceRefHealth] = []
+    if targets:
+        async with httpx.AsyncClient(
+            transport=httpx.ASGITransport(app=app, raise_app_exceptions=False),
+            base_url="http://demo.internal",
+            timeout=_PROBE_TIMEOUT,
+        ) as client:
+            references = list(
+                await asyncio.gather(*(_probe_one(client, target) for target in targets))
+            )
+    return WorkspaceHealthResponse(
+        workspace_id=ws.workspace_id,
+        workspace_status=ws.status,
+        partial_run=ws.status != WORKSPACE_STATUS_COMPLETED,
+        references=references,
+        alive=sum(1 for ref in references if ref.status == "alive"),
+        dead=sum(1 for ref in references if ref.status == "dead"),
+        unknown=sum(1 for ref in references if ref.status == "unknown"),
+    )
diff --git a/app/features/demo/routes.py b/app/features/demo/routes.py
index 87584247..deaa8ac0 100644
--- a/app/features/demo/routes.py
+++ b/app/features/demo/routes.py
@@ -4,7 +4,12 @@
 - ``POST /demo/run``    -- synchronous; runs the whole pipeline, returns a result.
 - ``WS   /demo/stream`` -- streams one StepEvent per step for the live UI.
 - ``GET    /demo/workspaces``                 -- E4 (#393): list saved workspaces.
+  E2 (#408): ``q`` name search, repeated ``tags`` containment,
+  ``include_archived`` (default false), allow-listed ``sort_by``/``sort_order``;
+  pinned rows always order first.
 - ``GET    /demo/workspaces/{workspace_id}``  -- E4 (#393): one workspace's detail.
+- ``GET    /demo/workspaces/{workspace_id}/health`` -- E2 (#408): probe the
+  workspace's soft references in-process; per-ref alive/dead/unknown + counts.
 - ``PATCH  /demo/workspaces/{workspace_id}``  -- E1 (#407): partial lifecycle
   update (rename / notes / tags / archive / pin); ``status`` is not patchable.
 - ``DELETE /demo/workspaces/{workspace_id}``  -- delete the workspace METADATA
@@ -35,12 +40,13 @@
 from app.core.database import get_db
 from app.core.exceptions import ConflictError, NotFoundError
 from app.core.logging import get_logger
-from app.features.demo import service, workspace
+from app.features.demo import link_health, service, workspace
 from app.features.demo.schemas import (
     DemoRunRequest,
     DemoRunResult,
     StepEvent,
     WorkspaceDetailResponse,
+    WorkspaceHealthResponse,
     WorkspaceListItem,
     WorkspaceListResponse,
     WorkspaceUpdateRequest,
@@ -84,26 +90,70 @@ async def run_demo_pipeline(request: Request, params: DemoRunRequest) -> DemoRun
     "/workspaces",
     response_model=WorkspaceListResponse,
     summary="List saved showcase workspaces",
-    description="List saved showcase workspaces, newest first. Returns 200 + "
-    "an empty list when no workspaces exist.",
+    description=(
+        "List saved showcase workspaces, newest first (pinned rows always "
+        "order first). E2 (#408): `q` searches names case-insensitively, "
+        "repeated `tags` params filter by containment, archived rows are "
+        "hidden unless `include_archived=true`, and `sort_by`/`sort_order` "
+        "are allow-listed (unknown values use the default order). Returns "
+        "200 + an empty list when nothing matches."
+    ),
 )
 async def list_showcase_workspaces(
     db: AsyncSession = Depends(get_db),
     limit: int = Query(default=20, ge=1, le=100, description="Maximum workspaces to return."),
     offset: int = Query(default=0, ge=0, description="Number of workspaces to skip."),
+    q: str | None = Query(
+        default=None,
+        min_length=2,
+        description="Search in workspace name (case-insensitive).",
+    ),
+    tags: list[str] | None = Query(
+        default=None,
+        description="Repeatable tag filter -- a workspace matches when it "
+        "carries every listed tag.",
+    ),
+    include_archived: bool = Query(
+        default=False,
+        description="Include archived workspaces (hidden by default).",
+    ),
+    sort_by: str | None = Query(
+        default=None,
+        description="Sort column: created_at, name, seed, or status. "
+        "Unknown values fall back to the default order (created_at desc).",
+    ),
+    sort_order: str = Query(
+        default="desc",
+        pattern="^(asc|desc)$",
+        description="Sort direction: asc or desc.",
+    ),
 ) -> WorkspaceListResponse:
-    """List saved showcase workspaces (E4, issue #393).
+    """List saved showcase workspaces (E4 #393; filters/sort E2 #408).
 
     Args:
         db: Async database session from dependency.
         limit: Maximum workspaces to return (1-100).
         offset: Number of workspaces to skip.
+        q: Case-insensitive name search.
+        tags: Repeatable tag containment filter.
+        include_archived: Include archived workspaces.
+        sort_by: Allow-listed sort column (unknown values use default order).
+        sort_order: Sort direction (asc or desc).
 
     Returns:
-        A page of saved workspaces plus the total count.
+        A page of saved workspaces plus the filtered total count.
     """
-    rows = await workspace.list_workspaces(db, limit=limit, offset=offset)
-    total = await workspace.count_workspaces(db)
+    rows = await workspace.list_workspaces(
+        db,
+        limit=limit,
+        offset=offset,
+        q=q,
+        tags=tags,
+        include_archived=include_archived,
+        sort_by=sort_by,
+        sort_order=sort_order,
+    )
+    total = await workspace.count_workspaces(db, q=q, tags=tags, include_archived=include_archived)
     return WorkspaceListResponse(
         workspaces=[WorkspaceListItem.model_validate(row) for row in rows],
         total=total,
@@ -138,6 +188,43 @@ async def get_showcase_workspace(
     return WorkspaceDetailResponse.model_validate(row)
 
 
+@router.get(
+    "/workspaces/{workspace_id}/health",
+    response_model=WorkspaceHealthResponse,
+    summary="Probe a workspace's soft-reference link health",
+    description=(
+        "Probe every soft reference the workspace recorded (model runs, "
+        "scenario plans, alias, batch, agent session, job ids) through the "
+        "public API in-process. Each reference classifies as alive (2xx), "
+        "dead (404 -- deleted after the run), or unknown (anything else). "
+        "`partial_run` flags a row whose pipeline never completed."
+    ),
+)
+async def get_workspace_health(
+    workspace_id: str,
+    request: Request,
+    db: AsyncSession = Depends(get_db),
+) -> WorkspaceHealthResponse:
+    """Probe a saved workspace's soft references (E2, issue #408).
+
+    Args:
+        workspace_id: External identifier of the workspace.
+        request: The incoming request (used to obtain the live FastAPI app
+            for the in-process probes).
+        db: Async database session from dependency.
+
+    Returns:
+        Per-reference liveness plus aggregate counts.
+
+    Raises:
+        NotFoundError: When no workspace matches ``workspace_id``.
+    """
+    row = await workspace.get_workspace(db, workspace_id)
+    if row is None:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+    return await link_health.probe_workspace_links(request.app, row)
+
+
 @router.patch(
     "/workspaces/{workspace_id}",
     response_model=WorkspaceDetailResponse,
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index 66bf202b..58daf891 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -314,3 +314,49 @@ class WorkspaceListResponse(BaseModel):
         ..., description="Saved workspaces for the current page; empty when none."
     )
     total: int = Field(..., ge=0, description="Total saved workspaces.")
+
+
+# E2 (#408) -- link-health classification of one probed soft reference.
+RefHealthStatus = Literal["alive", "dead", "unknown"]
+# E2 (#408) -- kind of soft-referenced object a workspace can record.
+RefType = Literal["model_run", "scenario_plan", "alias", "batch", "agent_session", "job"]
+
+
+class WorkspaceRefHealth(BaseModel):
+    """Liveness of one soft reference recorded on a workspace (E2, #408).
+
+    Response model -- plain ``BaseModel``, NOT strict (``StepEvent``
+    precedent above; strict mode is request-body-only policy).
+    """
+
+    key: str = Field(
+        ...,
+        description="Source key, e.g. 'winning_run_id', 'scenario_plan_ids[0]', or 'job_ids[0]'.",
+    )
+    ref_type: RefType = Field(..., description="Kind of referenced object.")
+    ref_id: str = Field(..., description="The recorded soft-reference id.")
+    status: RefHealthStatus = Field(
+        ..., description="alive (2xx) / dead (404) / unknown (anything else)."
+    )
+    probe_path: str = Field(..., description="The public API path probed.")
+
+
+class WorkspaceHealthResponse(BaseModel):
+    """Per-workspace link-health summary (E2, #408).
+
+    Response model -- plain ``BaseModel``, NOT strict.
+    """
+
+    workspace_id: str = Field(..., description="The probed workspace's external id.")
+    workspace_status: str = Field(..., description="running / completed / failed.")
+    partial_run: bool = Field(
+        ..., description="True when workspace_status != 'completed' (running or failed)."
+    )
+    references: list[WorkspaceRefHealth] = Field(
+        default_factory=list,
+        description="Per-reference probe results; empty when nothing was recorded.",
+    )
+    alive: int = Field(..., ge=0, description="Count of references that probed alive.")
+    dead: int = Field(..., ge=0, description="Count of references that probed dead (404).")
+    unknown: int = Field(..., ge=0, description="Count of references whose probe was inconclusive.")
+    checked_at: datetime = Field(default_factory=_utc_now, description="When the probes ran (UTC).")
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index 0af35a50..364b64fd 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -17,8 +17,11 @@
 ``GET /demo/workspaces/{workspace_id}`` in ``app/features/demo/routes.py``;
 :func:`delete_workspace` backs ``DELETE /demo/workspaces/{workspace_id}``;
 :func:`update_workspace` backs ``PATCH /demo/workspaces/{workspace_id}``
-(E1, #407). The request-scoped helpers take a caller-owned session and raise
-normally -- the warn-and-continue contract is pipeline-only.
+(E1, #407). E2 (#408) adds server-side list filters (``q`` name search,
+``tags`` containment, ``include_archived``) and an allow-listed sort with
+unconditional pinned-first ordering. The request-scoped helpers take a
+caller-owned session and raise normally -- the warn-and-continue contract is
+pipeline-only.
 """
 
 from __future__ import annotations
@@ -26,8 +29,9 @@
 import uuid
 from typing import TYPE_CHECKING, Any
 
-from sqlalchemy import func, select
+from sqlalchemy import Select, func, select
 from sqlalchemy.ext.asyncio import AsyncSession
+from sqlalchemy.orm import InstrumentedAttribute
 
 from app.core.database import get_session_maker
 from app.core.logging import get_logger
@@ -45,6 +49,43 @@
 
 logger = get_logger(__name__)
 
+# E2 (#408) -- allow-listed sort columns for GET /demo/workspaces. sort_by is
+# user input; unknown values fall back to the default order (created_at desc)
+# rather than erroring (dimensions precedent, app/features/dimensions/service.py).
+_SORT_COLUMNS: dict[str, InstrumentedAttribute[Any]] = {
+    "created_at": ShowcaseWorkspace.created_at,
+    "name": ShowcaseWorkspace.name,
+    "seed": ShowcaseWorkspace.seed,
+    "status": ShowcaseWorkspace.status,
+}
+
+
+def _apply_filters[SelectT: Select[Any]](
+    stmt: SelectT,
+    *,
+    q: str | None = None,
+    tags: list[str] | None = None,
+    include_archived: bool = False,
+) -> SelectT:
+    """Apply the E2 list filters to a select statement.
+
+    Shared by :func:`list_workspaces` and :func:`count_workspaces` so the
+    page's ``total`` always respects the active filters (scenarios precedent:
+    ``app/features/scenarios/service.py`` applies the same ``.where`` chain to
+    both the count and rows statements).
+    """
+    if not include_archived:
+        stmt = stmt.where(ShowcaseWorkspace.archived.is_(False))
+    if q:
+        # Case-insensitive name search (dimensions ILIKE precedent). NAME only
+        # -- workspace_id prefixes are copy-paste handles, not search terms.
+        stmt = stmt.where(ShowcaseWorkspace.name.icontains(q, autoescape=True))
+    if tags:
+        # JSONB @> containment -- a workspace matches when it carries every
+        # listed tag (scenario_plan.tags precedent; GIN-indexed since E1 #407).
+        stmt = stmt.where(ShowcaseWorkspace.tags.contains(tags))
+    return stmt
+
 
 async def create_workspace(req: DemoRunRequest) -> str | None:
     """Insert a ``running`` workspace row for a ``preservation="keep"`` run.
@@ -217,20 +258,44 @@ async def list_workspaces(
     *,
     limit: int = 50,
     offset: int = 0,
+    q: str | None = None,
+    tags: list[str] | None = None,
+    include_archived: bool = False,
+    sort_by: str | None = None,
+    sort_order: str = "desc",
 ) -> list[ShowcaseWorkspace]:
-    """List workspace rows, newest first (tie-broken by id, descending).
+    """List workspace rows with E2 (#408) filters; pinned rows always first.
+
+    Default order is newest first (tie-broken by id, descending). ``sort_by``
+    is allow-listed (created_at / name / seed / status); unknown values fall
+    back to the default order. ``name`` sorts NULLS LAST so unnamed rows sink.
+    Pinned rows order first regardless of the active sort.
 
     Args:
         db: An open async session (caller-owned).
         limit: Maximum rows to return.
-        offset: Rows to skip from the newest end.
+        offset: Rows to skip from the sorted front.
+        q: Case-insensitive name search (ILIKE substring).
+        tags: Tag containment filter -- a row must carry every listed tag.
+        include_archived: Include archived rows (hidden by default).
+        sort_by: Allow-listed sort column; unknown values use the default order.
+        sort_order: Sort direction ("asc" or "desc").
 
     Returns:
-        The matching rows, newest first.
+        The matching rows in the requested order.
     """
+    sort_column = _SORT_COLUMNS.get(sort_by) if sort_by else None
+    if sort_column is not None:
+        order_expr = sort_column.desc() if sort_order == "desc" else sort_column.asc()
+        if sort_by == "name":
+            order_expr = order_expr.nulls_last()
+    else:
+        order_expr = ShowcaseWorkspace.created_at.desc()
+    stmt = _apply_filters(
+        select(ShowcaseWorkspace), q=q, tags=tags, include_archived=include_archived
+    )
     result = await db.execute(
-        select(ShowcaseWorkspace)
-        .order_by(ShowcaseWorkspace.created_at.desc(), ShowcaseWorkspace.id.desc())
+        stmt.order_by(ShowcaseWorkspace.pinned.desc(), order_expr, ShowcaseWorkspace.id.desc())
         .limit(limit)
         .offset(offset)
     )
@@ -262,14 +327,31 @@ async def delete_workspace(db: AsyncSession, workspace_id: str) -> bool:
     return True
 
 
-async def count_workspaces(db: AsyncSession) -> int:
-    """Count all workspace rows (E4, issue #393).
+async def count_workspaces(
+    db: AsyncSession,
+    *,
+    q: str | None = None,
+    tags: list[str] | None = None,
+    include_archived: bool = False,
+) -> int:
+    """Count workspace rows matching the active filters (E4 #393, E2 #408).
+
+    Applies the SAME filter chain as :func:`list_workspaces` (via
+    :func:`_apply_filters`) so a filtered page's ``total`` stays honest.
 
     Args:
         db: An open async session (caller-owned).
+        q: Case-insensitive name search (ILIKE substring).
+        tags: Tag containment filter -- a row must carry every listed tag.
+        include_archived: Include archived rows (hidden by default).
 
     Returns:
-        The total number of saved workspaces.
+        The number of saved workspaces matching the filters.
     """
-    count_stmt = select(func.count()).select_from(ShowcaseWorkspace)
+    count_stmt = _apply_filters(
+        select(func.count()).select_from(ShowcaseWorkspace),
+        q=q,
+        tags=tags,
+        include_archived=include_archived,
+    )
     return int(await db.scalar(count_stmt) or 0)

From 9f9993a8f6d576d236b075533196921516ae724a Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:02 +0200
Subject: [PATCH 11/32] test(api): cover workspace filters and link-health
 probes (#408)

---
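Reviewer note (illustrative only): the malformed-JSONB test relies on the probe key keeping the entry's ORIGINAL list position, since skipped entries do not renumber the survivors. The sketch below follows the `scenario_plan_ids` branch of `build_probe_targets`; `probeable_entries` is a hypothetical name, not a function in the patch.

```python
from typing import Any


def probeable_entries(key: str, values: Any) -> list[tuple[str, str]]:
    """Return (indexed key, id) pairs for the non-empty string entries.

    Hypothetical helper: anything that is not a list yields no entries,
    and non-string or empty items are skipped without raising.
    """
    if not isinstance(values, list):
        return []
    return [
        (f"{key}[{index}]", value)
        for index, value in enumerate(values)
        if isinstance(value, str) and value
    ]
```

With `[7, "sp-1", None, "", "sp-2"]` the surviving keys are `scenario_plan_ids[1]` and `scenario_plan_ids[4]`, so a health result can be traced back to the exact slot in the stored list.
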
 app/features/demo/tests/test_link_health.py | 204 ++++++++++++++++
 app/features/demo/tests/test_routes.py      | 253 +++++++++++++++++++-
 2 files changed, 449 insertions(+), 8 deletions(-)
 create mode 100644 app/features/demo/tests/test_link_health.py

diff --git a/app/features/demo/tests/test_link_health.py b/app/features/demo/tests/test_link_health.py
new file mode 100644
index 00000000..dd27f828
--- /dev/null
+++ b/app/features/demo/tests/test_link_health.py
@@ -0,0 +1,204 @@
+"""Unit tests for the link-health probe module (E2, issue #408).
+
+Probes run against a THROWAWAY FastAPI stub app -- no database, no real
+slices. The stub returns 200 / 404 / 500 (and one raising endpoint) at the
+probed paths so every classification branch is exercised. Workspace rows are
+constructed in memory (never persisted) -- Python-side column defaults apply
+at INSERT time, so every consumed field is passed explicitly.
+"""
+
+from typing import Any
+
+from fastapi import FastAPI, Response
+
+from app.features.demo import link_health
+from app.features.demo.link_health import _ProbeTarget, build_probe_targets
+from app.features.demo.models import ShowcaseWorkspace
+
+
+def _make_workspace(**overrides: Any) -> ShowcaseWorkspace:
+    """An in-memory (unpersisted) ShowcaseWorkspace with explicit fields."""
+    base: dict[str, Any] = {
+        "workspace_id": "a" * 32,
+        "name": "e2-health",
+        "status": "completed",
+        "seed": 42,
+        "scenario": "demo_minimal",
+        "reset": False,
+        "skip_seed": True,
+        "created_objects": {},
+        "job_ids": None,
+    }
+    base.update(overrides)
+    return ShowcaseWorkspace(**base)
+
+
+def _stub_app() -> FastAPI:
+    """A throwaway ASGI app standing in for the probed public surface."""
+    app = FastAPI()
+
+    @app.get("/registry/runs/{run_id}")
+    def get_run(run_id: str) -> Response:
+        if run_id == "run-alive":
+            return Response(status_code=200, content="{}", media_type="application/json")
+        return Response(status_code=404)
+
+    @app.get("/scenarios/{scenario_id}")
+    def get_scenario(scenario_id: str) -> Response:
+        return Response(status_code=404)
+
+    @app.get("/registry/aliases/{alias_name}")
+    def get_alias(alias_name: str) -> Response:
+        return Response(status_code=200, content="{}", media_type="application/json")
+
+    @app.get("/batch/{batch_id}")
+    def get_batch(batch_id: str) -> Response:
+        return Response(status_code=500)
+
+    @app.get("/agents/sessions/{session_id}")
+    def get_session(session_id: str) -> Response:
+        raise RuntimeError("probed endpoint blew up")  # -> 500 response, never re-raised
+
+    @app.get("/jobs/{job_id}")
+    def get_job(job_id: str) -> Response:
+        if job_id == "job-alive":
+            return Response(status_code=200, content="{}", media_type="application/json")
+        return Response(status_code=404)
+
+    return app
+
+
+# =============================================================================
+# build_probe_targets
+# =============================================================================
+
+
+def test_build_probe_targets_covers_every_probeable_key() -> None:
+    """Every probeable created_objects key + the job_ids slot map to a path."""
+    ws = _make_workspace(
+        created_objects={
+            "winning_run_id": "run-1",
+            "v2_run_id": "run-2",
+            "stale_alias_run_id": "run-3",
+            "scenario_plan_ids": ["sp-1", "sp-2"],
+            "alias": "demo-production",
+            "batch_id": "batch-1",
+            "agent_session_id": "sess-1",
+            # Non-probeable keys -- no HTTP identity; must be skipped.
+            "v2_model_path": "artifacts/models/model_x.joblib",
+            "scenario_artifact_key": "abc123",
+            "train_model_types": ["naive", "seasonal_naive"],
+        },
+        job_ids=["job-1", "job-2"],
+    )
+    targets = build_probe_targets(ws)
+    by_key = {t.key: t for t in targets}
+
+    assert by_key["winning_run_id"].probe_path == "/registry/runs/run-1"
+    assert by_key["winning_run_id"].ref_type == "model_run"
+    assert by_key["v2_run_id"].probe_path == "/registry/runs/run-2"
+    assert by_key["stale_alias_run_id"].probe_path == "/registry/runs/run-3"
+    assert by_key["scenario_plan_ids[0]"].probe_path == "/scenarios/sp-1"
+    assert by_key["scenario_plan_ids[1]"].probe_path == "/scenarios/sp-2"
+    assert by_key["scenario_plan_ids[0]"].ref_type == "scenario_plan"
+    assert by_key["alias"].probe_path == "/registry/aliases/demo-production"
+    assert by_key["batch_id"].probe_path == "/batch/batch-1"
+    assert by_key["agent_session_id"].probe_path == "/agents/sessions/sess-1"
+    assert by_key["job_ids[0]"].probe_path == "/jobs/job-1"
+    assert by_key["job_ids[1]"].probe_path == "/jobs/job-2"
+    assert by_key["job_ids[0]"].ref_type == "job"
+    # 3 run ids + 2 plans + alias + batch + session + 2 jobs -- and nothing
+    # for the non-probeable keys.
+    assert len(targets) == 10
+    assert not any("model_path" in t.key or "artifact" in t.key for t in targets)
+
+
+def test_build_probe_targets_empty_objects() -> None:
+    """No recorded references (and a NULL job_ids slot) -> no targets."""
+    assert build_probe_targets(_make_workspace()) == []
+
+
+def test_build_probe_targets_skips_non_string_values() -> None:
+    """Malformed JSONB values (non-strings, empties) are skipped, not raised."""
+    ws = _make_workspace(
+        created_objects={
+            "winning_run_id": 123,  # not a str
+            "alias": "",  # empty
+            "scenario_plan_ids": ["sp-1", 7, None, ""],
+            "batch_id": None,
+        },
+        job_ids=["job-1", 42],
+    )
+    targets = build_probe_targets(ws)
+    assert [t.key for t in targets] == ["scenario_plan_ids[0]", "job_ids[0]"]
+
+
+# =============================================================================
+# probe_workspace_links (against the stub app)
+# =============================================================================
+
+
+async def test_probe_classification_alive_dead_unknown() -> None:
+    """2xx -> alive, 404 -> dead, 5xx/exception -> unknown; counts add up."""
+    ws = _make_workspace(
+        created_objects={
+            "winning_run_id": "run-alive",  # 200 -> alive
+            "v2_run_id": "run-gone",  # 404 -> dead
+            "scenario_plan_ids": ["sp-gone"],  # 404 -> dead
+            "alias": "demo-production",  # 200 -> alive
+            "batch_id": "batch-1",  # 500 -> unknown
+            "agent_session_id": "sess-1",  # raises -> 500 response -> unknown
+        },
+        job_ids=["job-alive", "job-gone"],  # 200 + 404
+    )
+    health = await link_health.probe_workspace_links(_stub_app(), ws)
+
+    by_key = {r.key: r.status for r in health.references}
+    assert by_key["winning_run_id"] == "alive"
+    assert by_key["v2_run_id"] == "dead"
+    assert by_key["scenario_plan_ids[0]"] == "dead"
+    assert by_key["alias"] == "alive"
+    assert by_key["batch_id"] == "unknown"
+    assert by_key["agent_session_id"] == "unknown"
+    assert by_key["job_ids[0]"] == "alive"
+    assert by_key["job_ids[1]"] == "dead"
+
+    assert health.alive == 3
+    assert health.dead == 3
+    assert health.unknown == 2
+    assert health.workspace_id == ws.workspace_id
+    assert health.partial_run is False
+
+
+async def test_probe_empty_workspace_short_circuits() -> None:
+    """No references -> empty result, zero counts, no client construction."""
+    health = await link_health.probe_workspace_links(_stub_app(), _make_workspace())
+    assert health.references == []
+    assert (health.alive, health.dead, health.unknown) == (0, 0, 0)
+
+
+async def test_partial_run_flag_tracks_status() -> None:
+    """partial_run is True exactly when the row never reached 'completed'."""
+    app = _stub_app()
+    for status, expected in (("completed", False), ("failed", True), ("running", True)):
+        health = await link_health.probe_workspace_links(app, _make_workspace(status=status))
+        assert health.partial_run is expected
+        assert health.workspace_status == status
+
+
+async def test_probe_transport_error_classifies_unknown() -> None:
+    """A transport-level failure classifies as unknown -- never raises."""
+
+    class _ExplodingClient:
+        async def get(self, _path: str) -> Response:
+            raise OSError("transport down")
+
+    target = _ProbeTarget(
+        key="winning_run_id",
+        ref_type="model_run",
+        ref_id="run-1",
+        probe_path="/registry/runs/run-1",
+    )
+    result = await link_health._probe_one(_ExplodingClient(), target)  # type: ignore[arg-type]
+    assert result.status == "unknown"
+    assert result.ref_id == "run-1"
diff --git a/app/features/demo/tests/test_routes.py b/app/features/demo/tests/test_routes.py
index 1934c018..f5813d79 100644
--- a/app/features/demo/tests/test_routes.py
+++ b/app/features/demo/tests/test_routes.py
@@ -236,10 +236,10 @@ def _orm_like_row(workspace_id: str = "a" * 32, **overrides: object) -> SimpleNa
 async def test_list_workspaces_empty(client, monkeypatch):
     """E4 (#393) -- empty table yields 200 + an empty page (no 404)."""
 
-    async def fake_list(_db, *, limit: int, offset: int) -> list[SimpleNamespace]:
+    async def fake_list(_db, **_kwargs: object) -> list[SimpleNamespace]:
         return []
 
-    async def fake_count(_db) -> int:
+    async def fake_count(_db, **_kwargs: object) -> int:
         return 0
 
     monkeypatch.setattr(workspace, "list_workspaces", fake_list)
@@ -252,14 +252,13 @@ async def fake_count(_db) -> int:
 
 async def test_list_workspaces_passes_pagination(client, monkeypatch):
     """E4 (#393) -- limit/offset query params reach the helper."""
-    seen: dict[str, int] = {}
+    seen: dict[str, object] = {}
 
-    async def fake_list(_db, *, limit: int, offset: int) -> list[SimpleNamespace]:
-        seen["limit"] = limit
-        seen["offset"] = offset
+    async def fake_list(_db, **kwargs: object) -> list[SimpleNamespace]:
+        seen.update(kwargs)
         return [_orm_like_row()]
 
-    async def fake_count(_db) -> int:
+    async def fake_count(_db, **_kwargs: object) -> int:
         return 5
 
     monkeypatch.setattr(workspace, "list_workspaces", fake_list)
@@ -267,7 +266,8 @@ async def fake_count(_db) -> int:
 
     resp = await client.get("/demo/workspaces", params={"limit": 2, "offset": 3})
     assert resp.status_code == 200
-    assert seen == {"limit": 2, "offset": 3}
+    assert seen["limit"] == 2
+    assert seen["offset"] == 3
     body = resp.json()
     assert body["total"] == 5
     assert body["workspaces"][0]["workspace_id"] == "a" * 32
@@ -283,6 +283,136 @@ async def test_list_workspaces_rejects_bad_pagination(client):
     assert resp.status_code == 422
 
 
+# =============================================================================
+# E2 (#408) -- list filters / sort + GET /demo/workspaces/{id}/health (unit)
+# =============================================================================
+
+
+async def test_list_workspaces_passes_filters_and_sort(client, monkeypatch):
+    """E2 (#408) -- q/tags/include_archived/sort params reach BOTH helpers."""
+    seen_list: dict[str, object] = {}
+    seen_count: dict[str, object] = {}
+
+    async def fake_list(_db, **kwargs: object) -> list[SimpleNamespace]:
+        seen_list.update(kwargs)
+        return []
+
+    async def fake_count(_db, **kwargs: object) -> int:
+        seen_count.update(kwargs)
+        return 0
+
+    monkeypatch.setattr(workspace, "list_workspaces", fake_list)
+    monkeypatch.setattr(workspace, "count_workspaces", fake_count)
+
+    resp = await client.get(
+        "/demo/workspaces",
+        params=[
+            ("q", "demo"),
+            ("tags", "smoke"),
+            ("tags", "e2"),
+            ("include_archived", "true"),
+            ("sort_by", "name"),
+            ("sort_order", "asc"),
+        ],
+    )
+    assert resp.status_code == 200
+    assert seen_list["q"] == "demo"
+    assert seen_list["tags"] == ["smoke", "e2"]
+    assert seen_list["include_archived"] is True
+    assert seen_list["sort_by"] == "name"
+    assert seen_list["sort_order"] == "asc"
+    # The count helper gets the SAME filters -- total respects them.
+    assert seen_count["q"] == "demo"
+    assert seen_count["tags"] == ["smoke", "e2"]
+    assert seen_count["include_archived"] is True
+
+
+async def test_list_workspaces_defaults_hide_archived(client, monkeypatch):
+    """E2 (#408) -- a legacy no-param call defaults to include_archived=False."""
+    seen: dict[str, object] = {}
+
+    async def fake_list(_db, **kwargs: object) -> list[SimpleNamespace]:
+        seen.update(kwargs)
+        return []
+
+    async def fake_count(_db, **_kwargs: object) -> int:
+        return 0
+
+    monkeypatch.setattr(workspace, "list_workspaces", fake_list)
+    monkeypatch.setattr(workspace, "count_workspaces", fake_count)
+
+    resp = await client.get("/demo/workspaces")
+    assert resp.status_code == 200
+    assert seen["include_archived"] is False
+    assert seen["q"] is None
+    assert seen["tags"] is None
+    assert seen["sort_by"] is None
+
+
+async def test_list_workspaces_rejects_bad_sort_order(client):
+    """E2 (#408) -- sort_order is pattern-constrained (asc|desc only)."""
+    resp = await client.get("/demo/workspaces", params={"sort_order": "sideways"})
+    assert resp.status_code == 422
+    assert resp.headers["content-type"].startswith("application/problem+json")
+
+
+async def test_workspace_health_404(client, monkeypatch):
+    """E2 (#408) -- health on a missing workspace is a 404 problem+json."""
+
+    async def fake_get(_db, _workspace_id: str) -> None:
+        return None
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+
+    resp = await client.get("/demo/workspaces/" + "0" * 32 + "/health")
+    assert resp.status_code == 404
+    assert resp.headers["content-type"].startswith("application/problem+json")
+    assert "Workspace not found" in resp.json()["detail"]
+
+
+async def test_workspace_health_happy_path(client, monkeypatch):
+    """E2 (#408) -- the route resolves the row and returns the probe result."""
+    from app.features.demo import link_health
+    from app.features.demo.schemas import WorkspaceHealthResponse, WorkspaceRefHealth
+
+    row = _orm_like_row(status="failed")
+
+    async def fake_get(_db, _workspace_id: str) -> SimpleNamespace:
+        return row
+
+    async def fake_probe(_app, ws) -> WorkspaceHealthResponse:
+        assert ws is row  # the route passes the resolved ORM row through
+        return WorkspaceHealthResponse(
+            workspace_id="a" * 32,
+            workspace_status="failed",
+            partial_run=True,
+            references=[
+                WorkspaceRefHealth(
+                    key="winning_run_id",
+                    ref_type="model_run",
+                    ref_id="run-abc",
+                    status="dead",
+                    probe_path="/registry/runs/run-abc",
+                )
+            ],
+            alive=0,
+            dead=1,
+            unknown=0,
+        )
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    monkeypatch.setattr(link_health, "probe_workspace_links", fake_probe)
+
+    resp = await client.get("/demo/workspaces/" + "a" * 32 + "/health")
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["workspace_id"] == "a" * 32
+    assert body["partial_run"] is True
+    assert body["dead"] == 1
+    assert body["references"][0]["status"] == "dead"
+    assert body["references"][0]["probe_path"] == "/registry/runs/run-abc"
+
+
 async def test_get_workspace_404(client, monkeypatch):
     """E4 (#393) -- unknown workspace_id is a 404 problem+json."""
 
@@ -585,3 +715,110 @@ async def test_delete_workspace_integration_keeps_created_objects(client, db_ses
         assert still_there.status_code == 200
     finally:
         await client.delete(f"/agents/sessions/{agent_session_id}")
+
+
+# =============================================================================
+# E2 (#408) -- list filters / sort + health against real Postgres (integration)
+# =============================================================================
+
+
+@pytest.mark.integration
+async def test_list_workspaces_integration_filters_and_sort(client, db_session: AsyncSession):
+    """Filters, sort, pinned-first ordering, and filtered totals on real rows."""
+    ids: dict[str, str] = {}
+    # Creation order matters for the default created_at sort assertions.
+    for name in ("alpha-match", "beta", "zeta-pinned"):
+        workspace_id = await workspace.create_workspace(
+            DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": name})
+        )
+        assert workspace_id is not None
+        ids[name] = workspace_id
+    unnamed = await workspace.create_workspace(
+        DemoRunRequest.model_validate({"preservation": "keep"})
+    )
+    assert unnamed is not None
+
+    # Curate via the PATCH surface (E1): pin zeta, archive beta, tag alpha.
+    assert (
+        await client.patch(f"/demo/workspaces/{ids['zeta-pinned']}", json={"pinned": True})
+    ).status_code == 200
+    assert (
+        await client.patch(f"/demo/workspaces/{ids['beta']}", json={"archived": True})
+    ).status_code == 200
+    assert (
+        await client.patch(f"/demo/workspaces/{ids['alpha-match']}", json={"tags": ["smoke", "e2"]})
+    ).status_code == 200
+
+    # Default list: archived hidden, pinned first, then newest-first.
+    resp = await client.get("/demo/workspaces")
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["total"] == 3  # beta (archived) excluded from the total too
+    listed = [w["workspace_id"] for w in body["workspaces"]]
+    assert ids["beta"] not in listed
+    assert listed == [ids["zeta-pinned"], unnamed, ids["alpha-match"]]
+
+    # include_archived=true surfaces the archived row again.
+    resp = await client.get("/demo/workspaces", params={"include_archived": "true"})
+    assert resp.json()["total"] == 4
+    assert ids["beta"] in [w["workspace_id"] for w in resp.json()["workspaces"]]
+
+    # q: case-insensitive name substring; total respects the filter.
+    resp = await client.get("/demo/workspaces", params={"q": "ALPHA"})
+    body = resp.json()
+    assert body["total"] == 1
+    assert [w["workspace_id"] for w in body["workspaces"]] == [ids["alpha-match"]]
+
+    # tags: containment -- ALL listed tags must match.
+    resp = await client.get("/demo/workspaces", params=[("tags", "smoke"), ("tags", "e2")])
+    assert [w["workspace_id"] for w in resp.json()["workspaces"]] == [ids["alpha-match"]]
+    resp = await client.get("/demo/workspaces", params=[("tags", "smoke"), ("tags", "nope")])
+    assert resp.json()["total"] == 0
+
+    # sort_by=name asc: pinned row STILL first, unnamed row sinks (NULLS LAST).
+    resp = await client.get("/demo/workspaces", params={"sort_by": "name", "sort_order": "asc"})
+    names = [w["name"] for w in resp.json()["workspaces"]]
+    assert names == ["zeta-pinned", "alpha-match", None]
+
+    # Unknown sort_by silently falls back to the default order (no 422).
+    resp = await client.get("/demo/workspaces", params={"sort_by": "bogus"})
+    assert resp.status_code == 200
+    assert [w["workspace_id"] for w in resp.json()["workspaces"]] == [
+        ids["zeta-pinned"],
+        unnamed,
+        ids["alpha-match"],
+    ]
+
+
+@pytest.mark.integration
+async def test_workspace_health_integration_alive_and_dead(client, db_session: AsyncSession):
+    """A real reference probes alive; a bogus one probes dead (E2, #408)."""
+    session_resp = await client.post("/agents/sessions", json={"agent_type": "experiment"})
+    assert session_resp.status_code == 201
+    agent_session_id = session_resp.json()["session_id"]
+    try:
+        workspace_id = await workspace.create_workspace(
+            DemoRunRequest.model_validate({"preservation": "keep", "workspace_name": "e2-health"})
+        )
+        assert workspace_id is not None
+        row = await workspace.get_workspace(db_session, workspace_id)
+        assert row is not None
+        row.created_objects = {
+            "agent_session_id": agent_session_id,
+            "winning_run_id": "run-dangling-never-created",
+        }
+        await db_session.commit()
+
+        resp = await client.get(f"/demo/workspaces/{workspace_id}/health")
+        assert resp.status_code == 200
+        body = resp.json()
+        by_key = {r["key"]: r["status"] for r in body["references"]}
+        assert by_key["agent_session_id"] == "alive"
+        assert by_key["winning_run_id"] == "dead"
+        assert body["alive"] == 1
+        assert body["dead"] == 1
+        assert body["unknown"] == 0
+        # The row was inserted as 'running' (never finalized) -> partial run.
+        assert body["partial_run"] is True
+    finally:
+        await client.delete(f"/agents/sessions/{agent_session_id}")

From f26507f7f4e0d6a456f862d5e11125fe0b9b3a2e Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:10 +0200
Subject: [PATCH 12/32] feat(ui): add workspace lifecycle types and hooks
 (#408)

---
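Reviewer note (not part of the commit): the lineage walk in
useWorkspaceLineage can be read as the pure function sketched below. This is
a minimal illustration, not the hook itself. It assumes a synchronous
in-memory lookup in place of the serial `api()` fetches, and it treats a
missing row the way the hook treats a 404. The names `Row`, `Entry`,
`Lineage`, and `walkLineage` are hypothetical and do not exist in this patch.

```typescript
// Hypothetical, trimmed-down row shape: only the fields the walk needs.
interface Row {
  workspace_id: string
  name: string | null
  replayed_from_workspace_id: string | null
}

interface Entry {
  workspace_id: string
  name: string | null
  deleted: boolean
}

interface Lineage {
  entries: Entry[]
  truncated: boolean
}

// Walk newest -> original. `lookup` stands in for the HTTP fetch; returning
// undefined plays the role of a 404 on a deleted ancestor.
function walkLineage(
  start: string,
  lookup: (id: string) => Row | undefined,
  depthCap = 5,
): Lineage {
  const entries: Entry[] = []
  let current: string | null = start
  for (let depth = 0; depth < depthCap && current; depth += 1) {
    const row = lookup(current)
    if (row === undefined) {
      // Dangling pointer: record a deleted sentinel and stop walking.
      entries.push({ workspace_id: current, name: null, deleted: true })
      current = null
    } else {
      entries.push({ workspace_id: row.workspace_id, name: row.name, deleted: false })
      current = row.replayed_from_workspace_id
    }
  }
  // A non-null pointer after the loop means the cap cut the chain short.
  return { entries, truncated: current !== null }
}
```

Called with a lookup backed by a `Map`, a two-row chain yields two live
entries, a missing parent yields a trailing `deleted: true` entry, and a chain
longer than `depthCap` yields exactly `depthCap` entries with
`truncated: true`. These are the same three cases the hook tests in this patch
cover.
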
 frontend/src/hooks/use-workspaces.test.ts | 210 +++++++++++++++++++++-
 frontend/src/hooks/use-workspaces.ts      | 124 ++++++++++++-
 frontend/src/types/api.ts                 |  53 ++++++
 3 files changed, 377 insertions(+), 10 deletions(-)

diff --git a/frontend/src/hooks/use-workspaces.test.ts b/frontend/src/hooks/use-workspaces.test.ts
index d804f96b..66825d61 100644
--- a/frontend/src/hooks/use-workspaces.test.ts
+++ b/frontend/src/hooks/use-workspaces.test.ts
@@ -1,5 +1,5 @@
 /**
- * Unit tests for the use-workspaces hooks (``useDeleteWorkspace``).
+ * Unit tests for the use-workspaces hooks.
  *
  * Stubs ``fetch`` to assert the hook issues a DELETE to the workspace
  * endpoint and invalidates the workspaces list on success; no real backend
@@ -10,7 +10,13 @@ import { act, renderHook, waitFor } from '@testing-library/react'
 import { afterEach, describe, expect, it, vi } from 'vitest'
 import { createElement, type ReactNode } from 'react'
 
-import { useDeleteWorkspace } from './use-workspaces'
+import {
+  useDeleteWorkspace,
+  usePatchWorkspace,
+  useWorkspaceHealth,
+  useWorkspaceLineage,
+  useWorkspaces,
+} from './use-workspaces'
 import { ApiError } from '@/lib/api'
 
 function makeWrapper(client: QueryClient) {
@@ -88,3 +94,203 @@ describe('useDeleteWorkspace', () => {
     expect((error as ApiError).message).toContain('Workspace not found')
   })
 })
+
+// =============================================================================
+// E2 (#408) — params-aware list + PATCH + health + lineage
+// =============================================================================
+
+function jsonResponse(body: unknown, status = 200): Response {
+  return new Response(JSON.stringify(body), {
+    status,
+    headers: { 'content-type': 'application/json' },
+  })
+}
+
+function problemResponse(detail: string, status: number): Response {
+  return new Response(
+    JSON.stringify({ type: '/errors/not-found', title: 'Not Found', status, detail }),
+    { status, headers: { 'content-type': 'application/problem+json' } },
+  )
+}
+
+describe('useWorkspaces (E2 params)', () => {
+  it('serializes the list params onto the query string', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(jsonResponse({ workspaces: [], total: 0 }))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(
+      () =>
+        useWorkspaces({
+          q: 'demo',
+          tags: 'smoke',
+          include_archived: true,
+          sort_by: 'name',
+          sort_order: 'asc',
+        }),
+      { wrapper: makeWrapper(client) },
+    )
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const url = String(fetchMock.mock.calls[0]![0])
+    expect(url).toContain('/demo/workspaces')
+    expect(url).toContain('q=demo')
+    expect(url).toContain('tags=smoke')
+    expect(url).toContain('include_archived=true')
+    expect(url).toContain('sort_by=name')
+    expect(url).toContain('sort_order=asc')
+  })
+
+  it('omits unset params (legacy URL shape preserved)', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(jsonResponse({ workspaces: [], total: 0 }))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useWorkspaces(), { wrapper: makeWrapper(client) })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const url = String(fetchMock.mock.calls[0]![0])
+    expect(url).toContain('limit=20')
+    expect(url).not.toContain('q=')
+    expect(url).not.toContain('include_archived')
+    expect(url).not.toContain('sort_by')
+  })
+})
+
+describe('usePatchWorkspace', () => {
+  it('issues a PATCH with the partial body and invalidates the list', async () => {
+    const workspaceId = 'a'.repeat(32)
+    const fetchMock = vi
+      .fn()
+      .mockResolvedValue(jsonResponse({ workspace_id: workspaceId, pinned: true }))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const invalidateSpy = vi.spyOn(client, 'invalidateQueries')
+    const { result } = renderHook(() => usePatchWorkspace(), {
+      wrapper: makeWrapper(client),
+    })
+
+    await act(async () => {
+      result.current.mutate({ workspaceId, update: { pinned: true } })
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const call = fetchMock.mock.calls[0]!
+    expect(String(call[0])).toContain(`/demo/workspaces/${workspaceId}`)
+    const init = call[1] as RequestInit
+    expect(init.method).toBe('PATCH')
+    expect(JSON.parse(String(init.body))).toEqual({ pinned: true })
+    expect(invalidateSpy).toHaveBeenCalledWith({ queryKey: ['workspaces'] })
+  })
+})
+
+describe('useWorkspaceHealth', () => {
+  it('fetches the health endpoint for the loaded workspace', async () => {
+    const workspaceId = 'a'.repeat(32)
+    const health = {
+      workspace_id: workspaceId,
+      workspace_status: 'completed',
+      partial_run: false,
+      references: [],
+      alive: 0,
+      dead: 0,
+      unknown: 0,
+      checked_at: '2026-06-13T00:00:00Z',
+    }
+    const fetchMock = vi.fn().mockResolvedValue(jsonResponse(health))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useWorkspaceHealth(workspaceId), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+    expect(String(fetchMock.mock.calls[0]![0])).toContain(
+      `/demo/workspaces/${workspaceId}/health`,
+    )
+    expect(result.current.data).toEqual(health)
+  })
+
+  it('stays disabled without a workspace id', () => {
+    const fetchMock = vi.fn()
+    vi.stubGlobal('fetch', fetchMock)
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    renderHook(() => useWorkspaceHealth(''), { wrapper: makeWrapper(client) })
+    expect(fetchMock).not.toHaveBeenCalled()
+  })
+})
+
+describe('useWorkspaceLineage', () => {
+  const idA = 'a'.repeat(32)
+  const idB = 'b'.repeat(32)
+  const idC = 'c'.repeat(32)
+
+  function detailBody(id: string, name: string | null, parent: string | null) {
+    return {
+      workspace_id: id,
+      name,
+      replayed_from_workspace_id: parent,
+      tags: [],
+      archived: false,
+      pinned: false,
+    }
+  }
+
+  it('walks the chain newest → original and stops at the root', async () => {
+    const fetchMock = vi
+      .fn()
+      .mockResolvedValueOnce(jsonResponse(detailBody(idA, 'child', idB)))
+      .mockResolvedValueOnce(jsonResponse(detailBody(idB, 'origin', null)))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useWorkspaceLineage(idA), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const lineage = result.current.data!
+    expect(lineage.entries.map((e) => e.workspace_id)).toEqual([idA, idB])
+    expect(lineage.entries.map((e) => e.deleted)).toEqual([false, false])
+    expect(lineage.truncated).toBe(false)
+    expect(fetchMock).toHaveBeenCalledTimes(2)
+  })
+
+  it('terminates the walk with a deleted sentinel on a 404 ancestor', async () => {
+    const fetchMock = vi
+      .fn()
+      .mockResolvedValueOnce(jsonResponse(detailBody(idA, 'child', idC)))
+      .mockResolvedValueOnce(problemResponse(`Workspace not found: ${idC}`, 404))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useWorkspaceLineage(idA), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const lineage = result.current.data!
+    expect(lineage.entries).toHaveLength(2)
+    expect(lineage.entries[1]).toMatchObject({ workspace_id: idC, deleted: true, detail: null })
+    expect(lineage.truncated).toBe(false)
+  })
+
+  it('caps the walk depth and flags truncation', async () => {
+    // Every row points at another parent — an unbounded chain.
+    const fetchMock = vi.fn().mockImplementation((url: unknown) => {
+      const id = String(url).split('/').pop()!
+      return Promise.resolve(jsonResponse(detailBody(id, null, 'f'.repeat(32))))
+    })
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useWorkspaceLineage(idA), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    expect(result.current.data!.entries).toHaveLength(5)
+    expect(result.current.data!.truncated).toBe(true)
+  })
+})
diff --git a/frontend/src/hooks/use-workspaces.ts b/frontend/src/hooks/use-workspaces.ts
index 76fd01bd..610cefb8 100644
--- a/frontend/src/hooks/use-workspaces.ts
+++ b/frontend/src/hooks/use-workspaces.ts
@@ -1,16 +1,35 @@
 import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
-import { api } from '@/lib/api'
-import type { WorkspaceDetail, WorkspaceListResponse } from '@/types/api'
+import { api, ApiError } from '@/lib/api'
+import type {
+  WorkspaceDetail,
+  WorkspaceHealth,
+  WorkspaceListParams,
+  WorkspaceListResponse,
+  WorkspaceUpdate,
+} from '@/types/api'
 
 /**
- * E4 (#393) — list saved showcase workspaces, newest first. Server-backed
- * source of truth for `preservation="keep"` runs (the localStorage
- * RunHistoryStrip stays ephemeral-only).
+ * E4 (#393) — list saved showcase workspaces. Server-backed source of truth
+ * for `preservation="keep"` runs (the localStorage RunHistoryStrip stays
+ * ephemeral-only). E2 (#408) — params-aware: q name search, single-tag
+ * filter, include_archived (server default hides archived), allow-listed
+ * sort_by/sort_order. Pinned rows always order first server-side.
  */
-export function useWorkspaces(limit = 20, enabled = true) {
+export function useWorkspaces(params: WorkspaceListParams = {}, enabled = true) {
   return useQuery({
-    queryKey: ['workspaces', { limit }],
-    queryFn: () => api<WorkspaceListResponse>('/demo/workspaces', { params: { limit } }),
+    queryKey: ['workspaces', params],
+    queryFn: () =>
+      api<WorkspaceListResponse>('/demo/workspaces', {
+        params: {
+          limit: params.limit ?? 20,
+          offset: params.offset,
+          q: params.q,
+          tags: params.tags,
+          include_archived: params.include_archived,
+          sort_by: params.sort_by,
+          sort_order: params.sort_order,
+        },
+      }),
     enabled,
   })
 }
@@ -40,3 +59,92 @@ export function useDeleteWorkspace() {
     },
   })
 }
+
+/**
+ * E2 (#408) — partial lifecycle update (rename / notes / tags / pin /
+ * archive) through the E1 PATCH endpoint. Only provided fields change.
+ * Invalidates the blanket ['workspaces'] key so list + detail + lineage
+ * queries all refetch.
+ */
+export function usePatchWorkspace() {
+  const queryClient = useQueryClient()
+  return useMutation({
+    mutationFn: ({ workspaceId, update }: { workspaceId: string; update: WorkspaceUpdate }) =>
+      api<WorkspaceDetail>(`/demo/workspaces/${workspaceId}`, {
+        method: 'PATCH',
+        body: update,
+      }),
+    onSuccess: () => {
+      void queryClient.invalidateQueries({ queryKey: ['workspaces'] })
+    },
+  })
+}
+
+/**
+ * E2 (#408) — soft-reference link health for the LOADED workspace only
+ * (never probed per list row — the backend fans out one in-process probe
+ * per reference). staleTime keeps reloads from hammering the probe fan-out.
+ */
+export function useWorkspaceHealth(workspaceId: string, enabled = true) {
+  return useQuery({
+    queryKey: ['workspaces', workspaceId, 'health'],
+    queryFn: () => api<WorkspaceHealth>(`/demo/workspaces/${workspaceId}/health`),
+    enabled: enabled && !!workspaceId,
+    staleTime: 30_000,
+  })
+}
+
+/** One ancestor entry in a workspace's replay lineage chain (newest first). */
+export interface LineageEntry {
+  workspace_id: string
+  name: string | null
+  /** True when the ancestor row was deleted; dangling pointers are by design. */
+  deleted: boolean
+  detail: WorkspaceDetail | null
+}
+
+export interface WorkspaceLineage {
+  entries: LineageEntry[]
+  /** True when the chain continues past the depth cap. */
+  truncated: boolean
+}
+
+// A replay-of-a-replay chain deeper than this is pathological; the strip
+// renders a trailing ellipsis instead of walking forever.
+const LINEAGE_DEPTH_CAP = 5
+
+/**
+ * E2 (#408) — walk the replayed_from_workspace_id chain (newest → original)
+ * as ONE query of serial fetches. A 404 ancestor terminates the walk with a
+ * deleted sentinel — dangling lineage is expected, never an error.
+ */
+export function useWorkspaceLineage(workspaceId: string | null) {
+  return useQuery({
+    queryKey: ['workspaces', workspaceId, 'lineage'],
+    enabled: !!workspaceId,
+    queryFn: async (): Promise<WorkspaceLineage> => {
+      const entries: LineageEntry[] = []
+      let current: string | null = workspaceId
+      for (let depth = 0; depth < LINEAGE_DEPTH_CAP && current; depth += 1) {
+        try {
+          const detail = await api<WorkspaceDetail>(`/demo/workspaces/${current}`)
+          entries.push({
+            workspace_id: current,
+            name: detail.name,
+            deleted: false,
+            detail,
+          })
+          current = detail.replayed_from_workspace_id
+        } catch (error) {
+          if (error instanceof ApiError && error.status === 404) {
+            entries.push({ workspace_id: current, name: null, deleted: true, detail: null })
+            current = null
+          } else {
+            throw error
+          }
+        }
+      }
+      return { entries, truncated: Boolean(current) }
+    },
+  })
+}
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index 1232e991..f64ae24a 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -815,6 +815,11 @@ export interface WorkspaceListItem {
   skip_seed: boolean
   result_summary: Record<string, unknown> | null
   created_at: string
+  // E1 (#407) — lifecycle + provenance fields (consumed by E2 #408).
+  archived: boolean
+  pinned: boolean
+  tags: string[]
+  replayed_from_workspace_id: string | null
 }
 
 // Full row from GET /demo/workspaces/{workspace_id}.
@@ -824,6 +829,9 @@ export interface WorkspaceDetail extends WorkspaceListItem {
   date_start: string | null
   date_end: string | null
   created_objects: Record<string, unknown>
+  // E1 (#407) — operator annotation + schema version.
+  notes: string | null
+  config_schema_version: number
 }
 
 // Page shape of GET /demo/workspaces.
@@ -832,6 +840,51 @@ export interface WorkspaceListResponse {
   total: number
 }
 
+// E2 (#408) — partial-update body for PATCH /demo/workspaces/{workspace_id}
+// (E1 endpoint). Absent field = unchanged; explicit null clears name/notes.
+export interface WorkspaceUpdate {
+  name?: string | null
+  notes?: string | null
+  tags?: string[]
+  archived?: boolean
+  pinned?: boolean
+}
+
+// E2 (#408) — query params for GET /demo/workspaces. Archived rows are
+// hidden unless include_archived is true; unknown sort_by falls back to the default order.
+export interface WorkspaceListParams {
+  limit?: number
+  offset?: number
+  q?: string
+  tags?: string
+  include_archived?: boolean
+  sort_by?: 'created_at' | 'name' | 'seed' | 'status'
+  sort_order?: 'asc' | 'desc'
+}
+
+// E2 (#408) — link-health classification of one probed soft reference.
+export type RefHealthStatus = 'alive' | 'dead' | 'unknown'
+
+export interface WorkspaceRefHealth {
+  key: string
+  ref_type: 'model_run' | 'scenario_plan' | 'alias' | 'batch' | 'agent_session' | 'job'
+  ref_id: string
+  status: RefHealthStatus
+  probe_path: string
+}
+
+// E2 (#408) — GET /demo/workspaces/{workspace_id}/health response.
+export interface WorkspaceHealth {
+  workspace_id: string
+  workspace_status: 'running' | 'completed' | 'failed'
+  partial_run: boolean
+  references: WorkspaceRefHealth[]
+  alive: number
+  dead: number
+  unknown: number
+  checked_at: string
+}
+
 // === AI Model Configuration (/config) ===
 
 // Presence + masked preview of one provider API key (never the raw value).

From 7012fd08b2eb314aea372d5afbf931c95e3b640d Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:10 +0200
Subject: [PATCH 13/32] feat(ui): add safe replay and workspace lifecycle to
 showcase (#408)

---
 .../demo/ReplayConfirmDialog.test.tsx         | 110 ++++
 .../components/demo/ReplayConfirmDialog.tsx   | 158 ++++++
 .../demo/WorkspaceArtifactsPanel.test.tsx     |  86 +++-
 .../demo/WorkspaceArtifactsPanel.tsx          |  88 +++-
 .../demo/WorkspaceEditDialog.test.tsx         | 140 ++++++
 .../components/demo/WorkspaceEditDialog.tsx   | Bin 0 -> 6206 bytes
 .../demo/WorkspaceLineageStrip.test.tsx       |  89 ++++
 .../components/demo/WorkspaceLineageStrip.tsx |  64 +++
 .../components/demo/WorkspacePanel.test.tsx   | 277 +++++++++--
 .../src/components/demo/WorkspacePanel.tsx    | 470 +++++++++++++++---
 frontend/src/components/demo/index.ts         |   5 +
 .../components/demo/replay-request.test.ts    |  39 ++
 .../src/components/demo/replay-request.ts     |  19 +
 .../src/components/demo/workspace-name.ts     |   8 +
 frontend/src/pages/showcase.tsx               |  72 ++-
 15 files changed, 1460 insertions(+), 165 deletions(-)
 create mode 100644 frontend/src/components/demo/ReplayConfirmDialog.test.tsx
 create mode 100644 frontend/src/components/demo/ReplayConfirmDialog.tsx
 create mode 100644 frontend/src/components/demo/WorkspaceEditDialog.test.tsx
 create mode 100644 frontend/src/components/demo/WorkspaceEditDialog.tsx
 create mode 100644 frontend/src/components/demo/WorkspaceLineageStrip.test.tsx
 create mode 100644 frontend/src/components/demo/WorkspaceLineageStrip.tsx
 create mode 100644 frontend/src/components/demo/replay-request.test.ts
 create mode 100644 frontend/src/components/demo/replay-request.ts
 create mode 100644 frontend/src/components/demo/workspace-name.ts
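
Reviewer note: `WorkspaceEditDialog.tsx` arrives as a binary patch below, so its save logic cannot be read inline. The tests in `WorkspaceEditDialog.test.tsx` pin down the contract: only dirty fields are sent, and an emptied name goes out as an explicit `null`. The following is a minimal sketch of that contract, not the component's actual code. `FormState`, `parseTags`, and `buildWorkspaceUpdate` are hypothetical names, the local `WorkspaceUpdate` is a trimmed copy of the type added in `types/api`, and treating an emptied `notes` field like an emptied name is an assumption based on the type comment rather than on a test.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the patch.
interface WorkspaceUpdate {
  name?: string | null
  notes?: string | null
  tags?: string[]
}

interface FormState {
  name: string
  notes: string
  tags: string // comma-separated, as typed into the input
}

// Split a comma-separated tag input into trimmed, non-empty tags.
function parseTags(raw: string): string[] {
  return raw
    .split(',')
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0)
}

// Build a PATCH body containing only the fields that changed.
// An emptied name/notes field becomes an explicit null (clear);
// an untouched field is omitted (unchanged).
function buildWorkspaceUpdate(initial: FormState, current: FormState): WorkspaceUpdate {
  const update: WorkspaceUpdate = {}
  if (current.name !== initial.name) {
    update.name = current.name === '' ? null : current.name
  }
  if (current.notes !== initial.notes) {
    update.notes = current.notes === '' ? null : current.notes
  }
  const before = parseTags(initial.tags)
  const after = parseTags(current.tags)
  const tagsChanged =
    before.length !== after.length || before.some((tag, i) => tag !== after[i])
  if (tagsChanged) {
    update.tags = after
  }
  return update
}
```

An empty result corresponds to the "clean save" case in the tests, where the dialog closes without calling the mutation.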

diff --git a/frontend/src/components/demo/ReplayConfirmDialog.test.tsx b/frontend/src/components/demo/ReplayConfirmDialog.test.tsx
new file mode 100644
index 00000000..8c3d705d
--- /dev/null
+++ b/frontend/src/components/demo/ReplayConfirmDialog.test.tsx
@@ -0,0 +1,110 @@
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest'
+import { ReplayConfirmDialog } from './ReplayConfirmDialog'
+import { buildReplayRequest } from './replay-request'
+import type { WorkspaceListItem } from '@/types/api'
+
+beforeAll(() => {
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+})
+
+afterEach(() => {
+  cleanup()
+  vi.clearAllMocks()
+})
+
+const baseItem: WorkspaceListItem = {
+  workspace_id: 'a'.repeat(32),
+  name: 'replay-me',
+  status: 'completed',
+  seed: 7,
+  scenario: 'demo_minimal',
+  reset: false,
+  skip_seed: true,
+  result_summary: null,
+  created_at: '2026-06-01T12:00:00Z',
+  archived: false,
+  pinned: false,
+  tags: [],
+  replayed_from_workspace_id: null,
+}
+
+function renderDialog(workspace: WorkspaceListItem | null, handlers = {}) {
+  const onConfirm = vi.fn()
+  const onCancel = vi.fn()
+  render(
+    <ReplayConfirmDialog
+      workspace={workspace}
+      requestPreview={workspace ? buildReplayRequest(workspace) : null}
+      onConfirm={onConfirm}
+      onCancel={onCancel}
+      {...handlers}
+    />,
+  )
+  return { onConfirm, onCancel }
+}
+
+describe('ReplayConfirmDialog', () => {
+  it('renders nothing while no replay is pending', () => {
+    renderDialog(null)
+    expect(document.body.textContent).not.toContain('Replay workspace')
+  })
+
+  it('renders the recorded-vs-sent preview values', () => {
+    renderDialog(baseItem)
+    const copy = document.body.textContent ?? ''
+    expect(copy).toContain('Replay workspace “replay-me”?')
+    expect(copy).toContain('seed')
+    expect(copy).toContain('7')
+    expect(copy).toContain('demo_minimal')
+    expect(copy).toContain('keep')
+    // replayed_from points at the source row on both columns.
+    expect(copy).toContain(baseItem.workspace_id)
+    // The verbatim-replay hint for operators who want a different config.
+    expect(copy).toContain('Use Load instead')
+  })
+
+  it('uses a plain confirm label on a non-destructive replay', () => {
+    renderDialog(baseItem)
+    const action = screen.getByTestId('replay-confirm')
+    expect(action.textContent).toBe('Replay')
+    expect(document.body.textContent).not.toContain('WIPES the database')
+  })
+
+  it('escalates to destructive copy + label when reset=true', () => {
+    renderDialog({ ...baseItem, reset: true })
+    expect(document.body.textContent).toContain('WIPES the database')
+    const action = screen.getByTestId('replay-confirm')
+    expect(action.textContent).toBe('Replay & wipe database')
+    expect(action.className).toContain('bg-destructive')
+  })
+
+  it('confirm fires onConfirm once; cancel fires onCancel and never confirms', () => {
+    const { onConfirm } = renderDialog(baseItem)
+    fireEvent.click(screen.getByTestId('replay-confirm'))
+    expect(onConfirm).toHaveBeenCalledTimes(1)
+    cleanup()
+    const second = renderDialog(baseItem)
+    fireEvent.click(screen.getByText('Cancel'))
+    expect(second.onCancel).toHaveBeenCalledTimes(1)
+    expect(second.onConfirm).not.toHaveBeenCalled()
+  })
+
+  it('highlights a mismatching row (defensive — verbatim replays match)', () => {
+    render(
+      <ReplayConfirmDialog
+        workspace={baseItem}
+        requestPreview={{ ...buildReplayRequest(baseItem), seed: 99 }}
+        onConfirm={vi.fn()}
+        onCancel={vi.fn()}
+      />,
+    )
+    const mismatched = document.querySelector('td.font-semibold.text-destructive')
+    expect(mismatched?.textContent).toBe('99')
+  })
+})
diff --git a/frontend/src/components/demo/ReplayConfirmDialog.tsx b/frontend/src/components/demo/ReplayConfirmDialog.tsx
new file mode 100644
index 00000000..8395f446
--- /dev/null
+++ b/frontend/src/components/demo/ReplayConfirmDialog.tsx
@@ -0,0 +1,158 @@
+/**
+ * E2 (#408) — replay confirmation dialog with a recorded-vs-sent preview.
+ *
+ * Every panel Replay goes through this dialog (no code path starts a replay
+ * without it). The body renders a Field / Recorded / Will-send table; rows
+ * where the two values differ are highlighted (defensive — a verbatim replay
+ * normally matches). A reset=true workspace escalates: destructive warning
+ * copy + a destructive-styled confirm button ("Replay & wipe database").
+ *
+ * Replay policy stays verbatim by design — operators who want a different
+ * config use Load (which repopulates every control) and Run instead.
+ */
+
+import { AlertTriangle } from 'lucide-react'
+import {
+  AlertDialog,
+  AlertDialogAction,
+  AlertDialogCancel,
+  AlertDialogContent,
+  AlertDialogDescription,
+  AlertDialogFooter,
+  AlertDialogHeader,
+  AlertDialogTitle,
+} from '@/components/ui/alert-dialog'
+import {
+  Table,
+  TableBody,
+  TableCell,
+  TableHead,
+  TableHeader,
+  TableRow,
+} from '@/components/ui/table'
+import { cn } from '@/lib/utils'
+import type { DemoRunRequest, WorkspaceListItem } from '@/types/api'
+
+interface ReplayConfirmDialogProps {
+  /** The workspace pending replay — null keeps the dialog closed. */
+  workspace: WorkspaceListItem | null
+  /** The exact request the confirmed replay will send (single source). */
+  requestPreview: DemoRunRequest | null
+  onConfirm: () => void
+  onCancel: () => void
+}
+
+function fmt(value: unknown): string {
+  if (value === undefined || value === null || value === '') return '—'
+  return String(value)
+}
+
+interface PreviewRow {
+  field: string
+  recorded: unknown
+  willSend: unknown
+}
+
+function buildRows(ws: WorkspaceListItem, req: DemoRunRequest): PreviewRow[] {
+  return [
+    { field: 'seed', recorded: ws.seed, willSend: req.seed },
+    { field: 'scenario', recorded: ws.scenario, willSend: req.scenario },
+    { field: 'reset', recorded: ws.reset, willSend: req.reset },
+    { field: 'skip_seed', recorded: ws.skip_seed, willSend: req.skip_seed },
+    { field: 'name', recorded: ws.name, willSend: req.workspace_name ?? null },
+    { field: 'preservation', recorded: 'keep', willSend: req.preservation },
+    {
+      field: 'replayed_from',
+      recorded: ws.workspace_id,
+      willSend: req.replayed_from_workspace_id,
+    },
+  ]
+}
+
+export function ReplayConfirmDialog({
+  workspace,
+  requestPreview,
+  onConfirm,
+  onCancel,
+}: ReplayConfirmDialogProps) {
+  const rows =
+    workspace && requestPreview ? buildRows(workspace, requestPreview) : []
+  const destructive = workspace?.reset === true
+  const label = workspace?.name ?? workspace?.workspace_id.slice(0, 8) ?? ''
+
+  return (
+    <AlertDialog
+      open={workspace !== null}
+      onOpenChange={(open) => {
+        if (!open) onCancel()
+      }}
+    >
+      <AlertDialogContent>
+        <AlertDialogHeader>
+          <AlertDialogTitle>Replay workspace “{label}”?</AlertDialogTitle>
+          <AlertDialogDescription>
+            The recorded config is re-submitted verbatim as a new kept run —
+            the original workspace row is never changed.
+          </AlertDialogDescription>
+        </AlertDialogHeader>
+
+        {destructive && (
+          <div className="flex items-start gap-2 rounded-md border border-destructive/50 bg-destructive/10 p-3 text-sm text-destructive">
+            <AlertTriangle className="mt-0.5 h-4 w-4 shrink-0" />
+            <span>
+              Replaying this workspace <strong>WIPES the database</strong> and
+              reseeds it from scratch.
+            </span>
+          </div>
+        )}
+
+        <Table>
+          <TableHeader>
+            <TableRow>
+              <TableHead>Field</TableHead>
+              <TableHead>Recorded</TableHead>
+              <TableHead>Will send</TableHead>
+            </TableRow>
+          </TableHeader>
+          <TableBody>
+            {rows.map((row) => {
+              const mismatch = fmt(row.recorded) !== fmt(row.willSend)
+              return (
+                <TableRow key={row.field}>
+                  <TableCell className="font-medium">{row.field}</TableCell>
+                  <TableCell className="font-mono text-xs">{fmt(row.recorded)}</TableCell>
+                  <TableCell
+                    className={cn(
+                      'font-mono text-xs',
+                      mismatch && 'font-semibold text-destructive'
+                    )}
+                  >
+                    {fmt(row.willSend)}
+                  </TableCell>
+                </TableRow>
+              )
+            })}
+          </TableBody>
+        </Table>
+
+        <p className="text-xs text-muted-foreground">
+          Want to change the config first? Use Load instead.
+        </p>
+
+        <AlertDialogFooter>
+          <AlertDialogCancel>Cancel</AlertDialogCancel>
+          <AlertDialogAction
+            data-testid="replay-confirm"
+            onClick={onConfirm}
+            className={cn(
+              destructive &&
+                'bg-destructive text-destructive-foreground hover:bg-destructive/90'
+            )}
+          >
+            {destructive ? 'Replay & wipe database' : 'Replay'}
+          </AlertDialogAction>
+        </AlertDialogFooter>
+      </AlertDialogContent>
+    </AlertDialog>
+  )
+}
diff --git a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
index 8d1e60ce..8a6d0549 100644
--- a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
+++ b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
@@ -1,8 +1,8 @@
-import { cleanup, render } from '@testing-library/react'
+import { cleanup, render, screen } from '@testing-library/react'
 import { afterEach, describe, expect, it } from 'vitest'
 import { MemoryRouter } from 'react-router-dom'
 import { WorkspaceArtifactsPanel } from './WorkspaceArtifactsPanel'
-import type { WorkspaceDetail } from '@/types/api'
+import type { WorkspaceDetail, WorkspaceHealth } from '@/types/api'
 
 afterEach(() => cleanup())
 
@@ -28,12 +28,18 @@ const fullWorkspace: WorkspaceDetail = {
     agent_session_id: 'sess-1',
     scenario_plan_ids: ['sp-1', 'sp-2'],
   },
+  archived: false,
+  pinned: false,
+  tags: [],
+  replayed_from_workspace_id: null,
+  notes: null,
+  config_schema_version: 1,
 }
 
-function renderPanel(workspace: WorkspaceDetail) {
+function renderPanel(workspace: WorkspaceDetail, health: WorkspaceHealth | null = null) {
   return render(
     <MemoryRouter>
-      <WorkspaceArtifactsPanel workspace={workspace} />
+      <WorkspaceArtifactsPanel workspace={workspace} health={health} />
     </MemoryRouter>,
   )
 }
@@ -90,3 +96,75 @@ describe('WorkspaceArtifactsPanel', () => {
     expect(hrefs).toContain('/visualize/forecast?store_id=3&product_id=7')
   })
 })
+
+// =============================================================================
+// E2 (#408) — link-health markers + summary chip
+// =============================================================================
+
+const baseHealth: WorkspaceHealth = {
+  workspace_id: fullWorkspace.workspace_id,
+  workspace_status: 'completed',
+  partial_run: false,
+  references: [],
+  alive: 0,
+  dead: 0,
+  unknown: 0,
+  checked_at: '2026-06-13T00:00:00Z',
+}
+
+describe('WorkspaceArtifactsPanel — health', () => {
+  it('renders the summary chip with alive/dead counts', () => {
+    const health: WorkspaceHealth = { ...baseHealth, alive: 5, dead: 2 }
+    renderPanel(fullWorkspace, health)
+    const chip = screen.getByTestId('workspace-health-summary')
+    expect(chip.textContent).toContain('5 live')
+    expect(chip.textContent).toContain('2 dead')
+  })
+
+  it('hides the dead count at zero and the chip without health data', () => {
+    const { container, unmount } = renderPanel(fullWorkspace, { ...baseHealth, alive: 3 })
+    expect(container.textContent).toContain('3 live')
+    expect(container.textContent).not.toContain('dead')
+    unmount()
+    renderPanel(fullWorkspace, null)
+    expect(screen.queryByTestId('workspace-health-summary')).toBeNull()
+  })
+
+  it('marks a card whose reference probed dead — unknown gets no marker', () => {
+    const health: WorkspaceHealth = {
+      ...baseHealth,
+      alive: 4,
+      dead: 1,
+      unknown: 1,
+      references: [
+        {
+          key: 'scenario_plan_ids[0]',
+          ref_type: 'scenario_plan',
+          ref_id: 'sp-1',
+          status: 'dead',
+          probe_path: '/scenarios/sp-1',
+        },
+        {
+          key: 'batch_id',
+          ref_type: 'batch',
+          ref_id: 'batch-1',
+          status: 'unknown',
+          probe_path: '/batch/batch-1',
+        },
+      ],
+    }
+    renderPanel(fullWorkspace, health)
+    expect(screen.getByTestId('dead-link-sp-1')).toBeTruthy()
+    expect(screen.queryByTestId('dead-link-batch-1')).toBeNull()
+  })
+
+  it('renders the partial-run badge for a never-completed row', () => {
+    const health: WorkspaceHealth = {
+      ...baseHealth,
+      workspace_status: 'failed',
+      partial_run: true,
+    }
+    const { container } = renderPanel({ ...fullWorkspace, status: 'failed' }, health)
+    expect(container.textContent).toContain('partial run')
+  })
+})
diff --git a/frontend/src/components/demo/WorkspaceArtifactsPanel.tsx b/frontend/src/components/demo/WorkspaceArtifactsPanel.tsx
index 255d62fa..0246f7da 100644
--- a/frontend/src/components/demo/WorkspaceArtifactsPanel.tsx
+++ b/frontend/src/components/demo/WorkspaceArtifactsPanel.tsx
@@ -4,23 +4,33 @@
  * Mirrors InspectArtifactsPanel's card shape but reads the persisted
  * `created_objects` soft references + grain columns from the workspace row
  * instead of live step.data — the run is long gone; the row is the memory.
+ *
+ * E2 (#408) — health-aware: cards whose soft reference probed `dead` carry a
+ * warning marker, and a summary chip row shows alive/dead counts plus a
+ * partial-run warning for rows whose pipeline never completed. `unknown`
+ * references render without a marker (no false alarms on transient 5xx).
  */
 
 import { Link } from 'react-router-dom'
-import { ArrowUpRight } from 'lucide-react'
+import { AlertTriangle, ArrowUpRight } from 'lucide-react'
+import { Badge } from '@/components/ui/badge'
 import { Card, CardContent } from '@/components/ui/card'
 import { ROUTES } from '@/lib/constants'
-import type { WorkspaceDetail } from '@/types/api'
+import type { WorkspaceDetail, WorkspaceHealth } from '@/types/api'
 
 interface ArtifactCard {
   label: string
   blurb: string
   href: string | null
   disabledReason?: string
+  /** E2 (#408) — the soft-reference id backing this card, when probeable. */
+  refId?: string
 }
 
 interface WorkspaceArtifactsPanelProps {
   workspace: WorkspaceDetail
+  /** E2 (#408) — link-health result; null or undefined while loading / not probed. */
+  health?: WorkspaceHealth | null
 }
 
 function asString(value: unknown): string | null {
@@ -46,18 +56,21 @@ function buildCards(ws: WorkspaceDetail): ArtifactCard[] {
     blurb: 'Registry detail for the run this workspace promoted.',
     href: winningRunId ? `${ROUTES.EXPLORER.RUNS}/${winningRunId}` : null,
     disabledReason: 'The run never registered a winner.',
+    refId: winningRunId ?? undefined,
   })
   cards.push({
     label: 'V2 feature-frame run',
     blurb: 'The prophet_like V2 run with feature groups + safety classes.',
     href: v2RunId ? `${ROUTES.EXPLORER.RUNS}/${v2RunId}` : null,
     disabledReason: 'No V2 run recorded (demo_minimal flow or v2_train skipped).',
+    refId: v2RunId ?? undefined,
   })
   planIds.forEach((planId, index) => {
     cards.push({
       label: `Scenario plan ${index + 1}`,
       blurb: 'Saved what-if plan from the planning phase.',
       href: `${ROUTES.VISUALIZE.PLANNER}?scenario_id=${planId}`,
+      refId: planId,
     })
   })
   if (planIds.length === 0) {
@@ -73,12 +86,14 @@ function buildCards(ws: WorkspaceDetail): ArtifactCard[] {
     blurb: 'Run-by-run results for the batch preset sweep.',
     href: batchId ? `${ROUTES.VISUALIZE.BATCH}/${batchId}` : null,
     disabledReason: 'No batch recorded (demo_minimal flow or batch skipped).',
+    refId: batchId ?? undefined,
   })
   cards.push({
     label: 'Deployment alias',
     blurb: alias ? `Ops view of the ${alias} alias.` : 'Ops view of aliases.',
     href: alias ? ROUTES.OPS : null,
     disabledReason: 'No alias recorded.',
+    refId: alias ?? undefined,
   })
   cards.push({
     label: 'Forecast on grain',
@@ -101,22 +116,53 @@ function buildCards(ws: WorkspaceDetail): ArtifactCard[] {
     blurb: 'The chat surface — the recorded session has likely expired.',
     href: sessionId ? ROUTES.CHAT : null,
     disabledReason: 'No agent session recorded (no LLM key or step skipped).',
+    refId: sessionId ?? undefined,
   })
 
   return cards
 }
 
-export function WorkspaceArtifactsPanel({ workspace }: WorkspaceArtifactsPanelProps) {
+const DEAD_LINK_TOOLTIP = 'This object no longer exists — it was deleted after the run.'
+
+export function WorkspaceArtifactsPanel({ workspace, health }: WorkspaceArtifactsPanelProps) {
   const cards = buildCards(workspace)
+  // E2 (#408) — set of ref_ids that probed `dead`; only these produce a marker.
+  const deadRefIds = new Set(
+    (health?.references ?? [])
+      .filter((ref) => ref.status === 'dead')
+      .map((ref) => ref.ref_id)
+  )
   return (
     <Card>
       <CardContent className="space-y-3 p-4">
-        <h2 className="text-lg font-semibold">
-          Workspace artifacts
-          <span className="ml-2 font-mono text-sm text-muted-foreground">
-            {workspace.name ?? workspace.workspace_id.slice(0, 8)}
-          </span>
-        </h2>
+        <div className="flex flex-wrap items-center gap-3">
+          <h2 className="text-lg font-semibold">
+            Workspace artifacts
+            <span className="ml-2 font-mono text-sm text-muted-foreground">
+              {workspace.name ?? workspace.workspace_id.slice(0, 8)}
+            </span>
+          </h2>
+          {health && (
+            <div
+              className="flex flex-wrap items-center gap-2 text-xs"
+              data-testid="workspace-health-summary"
+            >
+              <span className="text-success">✓ {health.alive} live</span>
+              {health.dead > 0 && (
+                <span className="text-destructive">✕ {health.dead} dead</span>
+              )}
+              {health.partial_run && (
+                <Badge
+                  variant="outline"
+                  className="text-destructive"
+                  title="This run never completed — artifacts may be missing."
+                >
+                  partial run
+                </Badge>
+              )}
+            </div>
+          )}
+        </div>
         <p className="text-sm text-muted-foreground">
           Everything this kept run created, re-attached from its workspace row.
           Cards greyed out when the run did not record the matching object.
@@ -124,26 +170,38 @@ export function WorkspaceArtifactsPanel({ workspace }: WorkspaceArtifactsPanelPr
         <div className="grid grid-cols-2 gap-3 lg:grid-cols-4">
           {cards.map((card) => {
             const isActive = typeof card.href === 'string' && card.href.length > 0
+            const isDead = card.refId !== undefined && deadRefIds.has(card.refId)
+            const cardTitle = (
+              <div className="flex items-center justify-between gap-1">
+                <span className="flex items-center gap-1 text-sm font-semibold">
+                  {card.label}
+                  {isDead && (
+                    <AlertTriangle
+                      className="h-3 w-3 shrink-0 text-destructive"
+                      data-testid={`dead-link-${card.refId}`}
+                    />
+                  )}
+                </span>
+                {isActive && <ArrowUpRight className="h-3 w-3 shrink-0" />}
+              </div>
+            )
             return (
               <div
                 key={card.label}
                 className={isActive ? '' : 'opacity-50'}
-                title={isActive ? undefined : card.disabledReason}
+                title={isDead ? DEAD_LINK_TOOLTIP : isActive ? undefined : card.disabledReason}
               >
                 {isActive ? (
                   <Link
                     to={card.href!}
                     className="block h-full rounded-md border p-3 transition-colors hover:bg-muted"
                   >
-                    <div className="flex items-center justify-between gap-1">
-                      <span className="text-sm font-semibold">{card.label}</span>
-                      <ArrowUpRight className="h-3 w-3 shrink-0" />
-                    </div>
+                    {cardTitle}
                     <p className="mt-1 text-xs text-muted-foreground">{card.blurb}</p>
                   </Link>
                 ) : (
                   <div className="block h-full cursor-not-allowed rounded-md border p-3">
-                    <div className="text-sm font-semibold">{card.label}</div>
+                    {cardTitle}
                     <p className="mt-1 text-xs text-muted-foreground">{card.blurb}</p>
                   </div>
                 )}
diff --git a/frontend/src/components/demo/WorkspaceEditDialog.test.tsx b/frontend/src/components/demo/WorkspaceEditDialog.test.tsx
new file mode 100644
index 00000000..ca73e36b
--- /dev/null
+++ b/frontend/src/components/demo/WorkspaceEditDialog.test.tsx
@@ -0,0 +1,140 @@
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from 'vitest'
+import { toast } from 'sonner'
+import { WorkspaceEditDialog } from './WorkspaceEditDialog'
+import type { WorkspaceDetail, WorkspaceListItem } from '@/types/api'
+
+beforeAll(() => {
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+})
+
+afterEach(() => {
+  cleanup()
+  vi.clearAllMocks()
+})
+
+const baseItem: WorkspaceListItem = {
+  workspace_id: 'a'.repeat(32),
+  name: 'edit-me',
+  status: 'completed',
+  seed: 7,
+  scenario: 'demo_minimal',
+  reset: false,
+  skip_seed: true,
+  result_summary: null,
+  created_at: '2026-06-01T12:00:00Z',
+  archived: false,
+  pinned: false,
+  tags: ['smoke'],
+  replayed_from_workspace_id: null,
+}
+
+let mockDetail: {
+  data: Partial<WorkspaceDetail> | undefined
+  isSuccess: boolean
+  isError: boolean
+} = { data: undefined, isSuccess: false, isError: false }
+
+let mockPatchResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } = {
+  mutate: vi.fn(),
+  isPending: false,
+}
+
+vi.mock('@/hooks/use-workspaces', () => ({
+  useWorkspace: () => mockDetail,
+  usePatchWorkspace: () => mockPatchResult,
+}))
+
+vi.mock('sonner', () => ({
+  toast: { success: vi.fn(), error: vi.fn() },
+}))
+
+beforeEach(() => {
+  mockDetail = {
+    data: { ...baseItem, notes: 'old notes' },
+    isSuccess: true,
+    isError: false,
+  }
+  mockPatchResult = { mutate: vi.fn(), isPending: false }
+})
+
+function renderDialog(workspace: WorkspaceListItem | null = baseItem) {
+  const onClose = vi.fn()
+  render(<WorkspaceEditDialog workspace={workspace} onClose={onClose} />)
+  return { onClose }
+}
+
+describe('WorkspaceEditDialog', () => {
+  it('renders nothing when closed', () => {
+    renderDialog(null)
+    expect(document.body.textContent).not.toContain('Edit workspace details')
+  })
+
+  it('primes the form from the row + detail notes', () => {
+    renderDialog()
+    expect((screen.getByLabelText('Name') as HTMLInputElement).value).toBe('edit-me')
+    expect((screen.getByLabelText('Notes') as HTMLTextAreaElement).value).toBe('old notes')
+    expect(
+      (screen.getByLabelText(/Tags/) as HTMLInputElement).value,
+    ).toBe('smoke')
+  })
+
+  it('disables Save with an inline hint on a pattern violation', () => {
+    renderDialog()
+    fireEvent.change(screen.getByLabelText('Name'), { target: { value: 'Bad Name!' } })
+    expect(document.body.textContent).toContain('Lowercase letters/digits only')
+    expect((screen.getByTestId('workspace-edit-save') as HTMLButtonElement).disabled).toBe(true)
+    expect(mockPatchResult.mutate).not.toHaveBeenCalled()
+  })
+
+  it('sends ONLY dirty fields (partial-update semantics)', () => {
+    renderDialog()
+    fireEvent.change(screen.getByLabelText(/Tags/), { target: { value: 'smoke, e2' } })
+    fireEvent.click(screen.getByTestId('workspace-edit-save'))
+    expect(mockPatchResult.mutate).toHaveBeenCalledTimes(1)
+    const [payload] = mockPatchResult.mutate.mock.calls[0] as [
+      { workspaceId: string; update: Record<string, unknown> },
+      unknown,
+    ]
+    expect(payload.workspaceId).toBe(baseItem.workspace_id)
+    expect(payload.update).toEqual({ tags: ['smoke', 'e2'] })
+  })
+
+  it('clearing the name sends an explicit null', () => {
+    renderDialog()
+    fireEvent.change(screen.getByLabelText('Name'), { target: { value: '' } })
+    fireEvent.click(screen.getByTestId('workspace-edit-save'))
+    const [payload] = mockPatchResult.mutate.mock.calls[0] as [
+      { update: Record<string, unknown> },
+      unknown,
+    ]
+    expect(payload.update).toEqual({ name: null })
+  })
+
+  it('a clean save (no changes) just closes without a mutation', () => {
+    const { onClose } = renderDialog()
+    fireEvent.click(screen.getByTestId('workspace-edit-save'))
+    expect(mockPatchResult.mutate).not.toHaveBeenCalled()
+    expect(onClose).toHaveBeenCalledTimes(1)
+  })
+
+  it('success toasts and closes; failure toasts an error', () => {
+    const { onClose } = renderDialog()
+    fireEvent.change(screen.getByLabelText('Name'), { target: { value: 'renamed' } })
+    fireEvent.click(screen.getByTestId('workspace-edit-save'))
+    const [, options] = mockPatchResult.mutate.mock.calls[0] as [
+      unknown,
+      { onSuccess: () => void; onError: (error: unknown) => void },
+    ]
+    options.onSuccess()
+    expect(toast.success).toHaveBeenCalledWith('Workspace updated.')
+    expect(onClose).toHaveBeenCalled()
+    options.onError(new Error('boom'))
+    expect(toast.error).toHaveBeenCalledWith(expect.stringContaining('Update failed'))
+  })
+})
diff --git a/frontend/src/components/demo/WorkspaceEditDialog.tsx b/frontend/src/components/demo/WorkspaceEditDialog.tsx
new file mode 100644
index 0000000000000000000000000000000000000000..82367182f07cbf1ee74e0bb5814877cb73a58109
GIT binary patch
literal 6206
zcmc&&?`|Wv5$|U|#n>&Pl_9NsX+9J_$?nDH;EZ1GjI-0A2!acxrR0rwmy0Fkv#T?Z
zhv+l(!SW>i4awzlWw}NAp*4`4mYjc_8UAKClf%P3I;8V2Df#_ZM}Hd9ufP0*8d-@_
z(u68yq{hEStTf3iH>#l}{u89doy_QyYHoEc7BYnpmh+8jO3HFks1+saeByqNSP;D`
zWMf|2y&<VP*^s%BbpC=aUtOQQAz5X$$}7X7=zUc@(Cdp2*MFmBE{jZ)Xe4P_8A@u=
z7)Tkf>r5D4EJbDVg|?=Pxh1kpG-FZgYH=f~mCS}OX;Dbg<kiY*<}*&*3BkiEDw6kg
zkuP#X)w(EBy5@7UX2Fscsw~A=OBevf8Rb>IhCLbr3prM`#4?;?lq#$(l_Ie=3T$;D
zjU2X4ayoNJrc4@<7h}c8wKk7PD{Gy1Vk5OG?hpj_Rii!)wC7=ia-(Ijd};R(mXb=f
zu8gKz`2h1^PZuCk)j~SKRwdkKOd&PvvtH8j%CI~0DrfKJf0vD>gySZcOSv$5NOhAV
zE7%Yh6F$#mZ!a%v)fn2)x|Q)r%SM$nXrx$}K}TSe(1zMwt*S~k(e~FOTgkBf=LtJb
zRfxHsAeH948&j{1QPuP0cDtkBBXYSn3ct=&Wgxv<crEp!$!n9Vs$2X<;Sx06_Zum)
z`1?ACA4hxbskY5_aqR4%3!631k=vY#sITR{K>~_r&uKhC#Dh3o3bVNRzP+6z4o&!q
zMBONL3+HM%4mVjxf>zR;H;rn(m0F8lXo`G35p^DD%|k7b0YS&xT$>9c%Tc%Vox85x
z!I?+*q(H#?4}ZD3e06sI@!hL$&p*Docy|pE{YAjV_4$W);TY*87(eDaJwQq#u+0*<
z)TuuQdbtF+Y*ejjgEW{N9?~_^Iq=h5GCDBQ;Q}!fG7jKANpNuD`&+pvl$Ke_=}#T^
zai^_4KYg|mK&7fPW;&*1NV5~VQ+c-cxVN_iJaN7eKt)T&o1_t+j!7HFsbTx$haauM
zjWp|~B2YjgQL2GfCK-$dL-$>ZI!OT5Rt5e|A*D=)es`G{a7&+{kk!iE&<Pz4d*-ma
z7|av)Q}7Sp0*hY!!=DJLe{78lR6G38^0nvr4}KY1feTduF6v$?(~NVupWX?d&k_H?
z2>?u!<{f_g=%_8QCBeLuY3THn2G(sYBbU0riel~A!)H&3gYIo!csgR`AX-5^&Q3j3
zr$3|AA#gN6aDw$5tx{%eIxJ^1eE<UmuU%Akz~>AUG5!8*h6B+TU(o*U`;y{(CW%-g
z#q)JuWd4HAkfBio+tm{rlGe|z_U<5~%_EE@%9j-DA~tV#2wbe0#|d@e=^t+94tMIV
z<8gd&tF6;%17yALRK@{&s>!xD@N@_^ZENbbB1JA`Dj+Y%8iD98Z%74q=hzJ`osaqN
z219Gw&V<0`6DwIHr-N4)+2P^9`3>v9lQuwTd9p%Yp-Tg+gB-re?Y3jVRup-Aja=m3
z&wobAPtix{#E}kDj}+{nqs~ST0-l%9Ix<2<HukS?TCTavNGt)NVgxT-WXIh{8kRri
z8LF2r9gI5s=T5@ERCTppEKr_WT<x|9jTqXER6BxX@O@hjy}T^-u*re~oQ0FXovl9J
zw1p5_Bo{=BuTNFkz2v9woViPMb`q!h!)B|FdKr&Dxg25TL6ys7-iU&^ST|_N|825?
z!DoZ>?exU^YMSMD=zE3M3@ft(+pdov#$O$@6hzO_yohK(RfxGPVp-gKjCF}8L1o4O
zu6$h{(2XezRF2s}h~b#)sV&Vtc|>fHH<O((omdT>U0AZ+QbdFN%yx_J5dk6Ba<*Z(
zbRK2-{D*wCuYuVnkv59HR3v~}(I8N%WgYgp>H@vQjVjRi%?^}p5JfQ}nXXcrWAApn
zZ1PMz9Q2GR#r<1bUT2#ZM@M~$c!r2^?vUKs>0v!4o5ZfLhiS1NFh8xIgyJ4e3Wp>E
z(l=}L6?gLBWV2NcxTkuYPU@3R(-iNtyAuWv_xHca2L|2kgy6Mc#_o)0JK~6JcT!mS
zJ*qY<0^D~l;DhhXjX?VwGGy6EIp#><Nr?g?xWRiiHNHIB&J8^C^lbCT*tjSY{x8Ae
z8nhKFTl%Jelj=zx0_0^j#*0+08oWregOg<I?L3@Lm^04r|N8b@+jhm7H*6~qdvbS}
z`7oCBH$+*N>K4u7bN4?A87fu;;cX>F&&};xTMKtzy=5WqiT^kL)QFl^q8`7X+Kl_q
z%)C7ny@x<w1Jh33QycYtwt<%(>5lQJG61)=De?-i(Q_#CKm(ILS)Wd<ux*Cn7KHk=
zb&~l1(SC2QpWjpWW^<45PRI0X_xgWgH(z2*aBgQC-<dpiOhjh_Q;;GY0j66q9f_A4
zD2Om)61dCS?5I0D1dH!=(A{-kvA?ohi#b|?9^TXE&s$HNG}o7yN8$O`F?SXcaZ?uz
z+&;o|^3-8dn~L^*hxr!IYQn2C^hEsLbDT-kl#Q%A@oXi$@pjPKr?hJZC+)PbK-JOU
zi-%3Mcqo*}uv$F4#MsNVMzoZeQsAQ~G|f>((XrtTgRzo-&AI6qy3}}l(Y+#0du^h7
z(T!+-KQtcQyU<CxvNJD!Ik|6m+P9A!g|KE@J+6k_y$50TX~5%vVcTi(2b1+1FXu%c
zAPfavGt98_9v`9m;GL~DlGW|Ib)USU8GUrA(58XE6=!aM)vn@RJLPUf)h&k&JF7b3
zSa-d)Keui($I*EE`J#HE(Qqu5O!UD{i$`s(10-Sy+Moamp0i@ixjid((_KH6jTF)L
ztd}#P{cFwJX-2!beeg{`?x#Jlow6Tz#t~^Do}JD%A?Upf>#idk6=)*zK81Zd&b$pN
z1IcM>*Em%<dHUo&<{ffHc16WxNO*-rTRZwdwB)uAd|HR!;ui9RD;WRlFaP9TFvtt2
P+`6=vW9=f-GoJhlSJ7gY

literal 0
HcmV?d00001

diff --git a/frontend/src/components/demo/WorkspaceLineageStrip.test.tsx b/frontend/src/components/demo/WorkspaceLineageStrip.test.tsx
new file mode 100644
index 00000000..27e4e3da
--- /dev/null
+++ b/frontend/src/components/demo/WorkspaceLineageStrip.test.tsx
@@ -0,0 +1,89 @@
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { WorkspaceLineageStrip } from './WorkspaceLineageStrip'
+import type { WorkspaceLineage } from '@/hooks/use-workspaces'
+import type { WorkspaceDetail } from '@/types/api'
+
+afterEach(() => {
+  cleanup()
+  vi.clearAllMocks()
+})
+
+let mockLineage: { data: WorkspaceLineage | undefined } = { data: undefined }
+
+vi.mock('@/hooks/use-workspaces', () => ({
+  useWorkspaceLineage: () => mockLineage,
+}))
+
+const detailOf = (id: string, name: string | null): WorkspaceDetail =>
+  ({ workspace_id: id, name }) as WorkspaceDetail
+
+function renderStrip(onLoadAncestor = vi.fn()) {
+  render(<WorkspaceLineageStrip workspaceId={'a'.repeat(32)} onLoadAncestor={onLoadAncestor} />)
+  return onLoadAncestor
+}
+
+describe('WorkspaceLineageStrip', () => {
+  it('renders nothing when the workspace has no lineage', () => {
+    mockLineage = {
+      data: {
+        entries: [
+          { workspace_id: 'a'.repeat(32), name: 'solo', deleted: false, detail: detailOf('a'.repeat(32), 'solo') },
+        ],
+        truncated: false,
+      },
+    }
+    renderStrip()
+    expect(screen.queryByTestId('workspace-lineage')).toBeNull()
+  })
+
+  it('renders the chain newest → original with clickable ancestors', () => {
+    const parentDetail = detailOf('b'.repeat(32), 'parent')
+    mockLineage = {
+      data: {
+        entries: [
+          { workspace_id: 'a'.repeat(32), name: 'child', deleted: false, detail: detailOf('a'.repeat(32), 'child') },
+          { workspace_id: 'b'.repeat(32), name: 'parent', deleted: false, detail: parentDetail },
+          { workspace_id: 'c'.repeat(32), name: 'origin', deleted: false, detail: detailOf('c'.repeat(32), 'origin') },
+        ],
+        truncated: false,
+      },
+    }
+    const onLoadAncestor = renderStrip()
+    const strip = screen.getByTestId('workspace-lineage')
+    const text = strip.textContent ?? ''
+    // Order: current first, then parents.
+    expect(text.indexOf('child')).toBeLessThan(text.indexOf('parent'))
+    expect(text.indexOf('parent')).toBeLessThan(text.indexOf('origin'))
+    fireEvent.click(screen.getByText('parent'))
+    expect(onLoadAncestor).toHaveBeenCalledWith(parentDetail)
+  })
+
+  it('renders the deleted-ancestor sentinel without erroring', () => {
+    mockLineage = {
+      data: {
+        entries: [
+          { workspace_id: 'a'.repeat(32), name: 'child', deleted: false, detail: detailOf('a'.repeat(32), 'child') },
+          { workspace_id: 'b'.repeat(32), name: null, deleted: true, detail: null },
+        ],
+        truncated: false,
+      },
+    }
+    renderStrip()
+    expect(screen.getByTestId('workspace-lineage').textContent).toContain('(original deleted)')
+  })
+
+  it('renders a trailing ellipsis when the chain is depth-capped', () => {
+    mockLineage = {
+      data: {
+        entries: [
+          { workspace_id: 'a'.repeat(32), name: 'child', deleted: false, detail: detailOf('a'.repeat(32), 'child') },
+          { workspace_id: 'b'.repeat(32), name: 'parent', deleted: false, detail: detailOf('b'.repeat(32), 'parent') },
+        ],
+        truncated: true,
+      },
+    }
+    renderStrip()
+    expect(screen.getByTestId('workspace-lineage').textContent).toContain('…')
+  })
+})
diff --git a/frontend/src/components/demo/WorkspaceLineageStrip.tsx b/frontend/src/components/demo/WorkspaceLineageStrip.tsx
new file mode 100644
index 00000000..c405fdcc
--- /dev/null
+++ b/frontend/src/components/demo/WorkspaceLineageStrip.tsx
@@ -0,0 +1,64 @@
+/**
+ * E2 (#408) — replay lineage breadcrumb for the loaded workspace.
+ *
+ * Renders the replayed_from_workspace_id chain newest → original:
+ * `this ← parent ← grandparent …` (depth-capped). Ancestors are clickable
+ * (loads them); a deleted ancestor renders as "(original deleted)" — dangling
+ * soft references are by design, never an error. Renders nothing when the
+ * loaded workspace is not a replay.
+ */
+
+import { Fragment } from 'react'
+import { Button } from '@/components/ui/button'
+import { useWorkspaceLineage } from '@/hooks/use-workspaces'
+import type { WorkspaceDetail } from '@/types/api'
+
+interface WorkspaceLineageStripProps {
+  workspaceId: string
+  /** Load an ancestor into the page (full detail — the walk already has it). */
+  onLoadAncestor: (ws: WorkspaceDetail) => void
+}
+
+function labelOf(workspaceId: string, name: string | null): string {
+  return name ?? workspaceId.slice(0, 8)
+}
+
+export function WorkspaceLineageStrip({ workspaceId, onLoadAncestor }: WorkspaceLineageStripProps) {
+  const { data } = useWorkspaceLineage(workspaceId)
+  const entries = data?.entries ?? []
+
+  // No lineage to show: still walking, or the loaded row is not a replay.
+  if (entries.length < 2) return null
+
+  return (
+    <div
+      className="flex flex-wrap items-center gap-1 text-xs text-muted-foreground"
+      data-testid="workspace-lineage"
+    >
+      <span className="font-medium">Replay lineage:</span>
+      {entries.map((entry, index) => (
+        <Fragment key={`${entry.workspace_id}-${index}`}>
+          {index > 0 && <span aria-hidden>←</span>}
+          {entry.deleted ? (
+            <span className="italic">(original deleted)</span>
+          ) : index === 0 ? (
+            // The loaded workspace itself — not a link.
+            <span className="font-mono font-semibold text-foreground">
+              {labelOf(entry.workspace_id, entry.name)}
+            </span>
+          ) : (
+            <Button
+              variant="link"
+              size="sm"
+              className="h-auto p-0 font-mono text-xs"
+              onClick={() => entry.detail && onLoadAncestor(entry.detail)}
+            >
+              {labelOf(entry.workspace_id, entry.name)}
+            </Button>
+          )}
+        </Fragment>
+      ))}
+      {data?.truncated && <span aria-hidden>…</span>}
+    </div>
+  )
+}
diff --git a/frontend/src/components/demo/WorkspacePanel.test.tsx b/frontend/src/components/demo/WorkspacePanel.test.tsx
index 843415f0..75bf0f56 100644
--- a/frontend/src/components/demo/WorkspacePanel.test.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.test.tsx
@@ -1,13 +1,14 @@
 import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
-import { cleanup, fireEvent, render, screen } from '@testing-library/react'
-import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen, waitFor } from '@testing-library/react'
+import { MemoryRouter } from 'react-router-dom'
+import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from 'vitest'
 import { toast } from 'sonner'
 import { WorkspacePanel } from './WorkspacePanel'
 import { ApiError } from '@/lib/api'
-import type { WorkspaceListItem, WorkspaceListResponse } from '@/types/api'
+import type { WorkspaceListItem, WorkspaceListParams, WorkspaceListResponse } from '@/types/api'
 
 beforeAll(() => {
-  // Radix AlertDialog needs these in jsdom (pattern: cancel-run-dialog.test.tsx).
+  // Radix AlertDialog/DropdownMenu need these in jsdom.
   class ResizeObserverStub {
     observe() {}
     unobserve() {}
@@ -17,6 +18,9 @@ beforeAll(() => {
   if (!Element.prototype.hasPointerCapture) {
     Element.prototype.hasPointerCapture = () => false
   }
+  if (!Element.prototype.scrollIntoView) {
+    Element.prototype.scrollIntoView = () => {}
+  }
 })
 
 afterEach(() => {
@@ -34,6 +38,16 @@ const baseItem: WorkspaceListItem = {
   skip_seed: true,
   result_summary: { winner_model_type: 'seasonal_naive' },
   created_at: '2026-06-01T12:00:00Z',
+  archived: false,
+  pinned: false,
+  tags: [],
+  replayed_from_workspace_id: null,
+}
+
+const secondItem: WorkspaceListItem = {
+  ...baseItem,
+  workspace_id: 'b'.repeat(32),
+  name: 'second',
 }
 
 let mockResponse: { data: WorkspaceListResponse | undefined; isLoading: boolean } = {
@@ -41,35 +55,71 @@ let mockResponse: { data: WorkspaceListResponse | undefined; isLoading: boolean
   isLoading: false,
 }
 
-let mockDeleteResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } = {
+let lastListParams: WorkspaceListParams | undefined
+
+let mockDeleteResult: {
+  mutate: ReturnType<typeof vi.fn>
+  mutateAsync: ReturnType<typeof vi.fn>
+  isPending: boolean
+} = { mutate: vi.fn(), mutateAsync: vi.fn(), isPending: false }
+
+let mockPatchResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } = {
   mutate: vi.fn(),
   isPending: false,
 }
 
+const mockNavigate = vi.fn()
+
 vi.mock('@/hooks/use-workspaces', () => ({
-  useWorkspaces: () => mockResponse,
+  useWorkspaces: (params: WorkspaceListParams) => {
+    lastListParams = params
+    return mockResponse
+  },
+  // WorkspaceEditDialog dependencies (mounted closed by the panel).
+  useWorkspace: () => ({ data: undefined, isSuccess: false, isError: false }),
   useDeleteWorkspace: () => mockDeleteResult,
+  usePatchWorkspace: () => mockPatchResult,
 }))
 
+vi.mock('react-router-dom', async (importOriginal) => {
+  const actual = await importOriginal<typeof import('react-router-dom')>()
+  return { ...actual, useNavigate: () => mockNavigate }
+})
+
 vi.mock('sonner', () => ({
   toast: { success: vi.fn(), error: vi.fn() },
 }))
 
+beforeEach(() => {
+  lastListParams = undefined
+  mockDeleteResult = { mutate: vi.fn(), mutateAsync: vi.fn(), isPending: false }
+  mockPatchResult = { mutate: vi.fn(), isPending: false }
+})
+
 function renderPanel(props: Partial<Parameters<typeof WorkspacePanel>[0]> = {}) {
   const queryClient = new QueryClient({ defaultOptions: { queries: { retry: false } } })
   return render(
     <QueryClientProvider client={queryClient}>
-      <WorkspacePanel
-        onLoad={() => {}}
-        onReplay={() => {}}
-        isRunning={false}
-        lastWorkspaceId={null}
-        {...props}
-      />
+      <MemoryRouter>
+        <WorkspacePanel
+          onLoad={() => {}}
+          onRequestReplay={() => {}}
+          isRunning={false}
+          lastWorkspaceId={null}
+          {...props}
+        />
+      </MemoryRouter>
     </QueryClientProvider>,
   )
 }
 
+/** Open a Radix dropdown/select (pattern: model-family-tabs.test.tsx). */
+function radixOpen(target: HTMLElement) {
+  fireEvent.pointerDown(target, { button: 0, ctrlKey: false })
+  fireEvent.mouseDown(target, { button: 0 })
+  fireEvent.click(target)
+}
+
 describe('WorkspacePanel', () => {
   it('renders the discoverable empty state (panel never hidden)', () => {
     mockResponse = { data: { workspaces: [], total: 0 }, isLoading: false }
@@ -86,7 +136,6 @@ describe('WorkspacePanel', () => {
     expect(container.textContent).toContain('seed 7')
     expect(container.textContent).toContain('COMPLETED')
     expect(container.textContent).toContain('winner seasonal_naive')
-    // No destructive badge on a reset=false row.
     expect(container.textContent).not.toContain('DESTRUCTIVE')
   })
 
@@ -99,57 +148,192 @@ describe('WorkspacePanel', () => {
     expect(container.textContent).toContain('DESTRUCTIVE')
   })
 
-  it('falls back to the workspace_id slice when the row is unnamed', () => {
-    mockResponse = {
-      data: { workspaces: [{ ...baseItem, name: null }], total: 1 },
-      isLoading: false,
-    }
-    const { container } = renderPanel()
-    expect(container.textContent).toContain('aaaaaaaa')
-  })
-
-  it('invokes onLoad / onReplay with the list item', () => {
+  it('invokes onLoad / onRequestReplay with the list item — replay never starts here', () => {
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
     const onLoad = vi.fn()
-    const onReplay = vi.fn()
-    const { container } = renderPanel({ onLoad, onReplay })
+    const onRequestReplay = vi.fn()
+    const { container } = renderPanel({ onLoad, onRequestReplay })
     const buttons = Array.from(container.querySelectorAll('button'))
     fireEvent.click(buttons.find((b) => (b.textContent ?? '').includes('Load'))!)
     expect(onLoad).toHaveBeenCalledWith(baseItem)
     fireEvent.click(buttons.find((b) => (b.textContent ?? '').includes('Replay'))!)
-    expect(onReplay).toHaveBeenCalledWith(baseItem)
+    expect(onRequestReplay).toHaveBeenCalledWith(baseItem)
   })
 
-  it('disables both actions while a run is in flight', () => {
+  it('disables row actions while a run is in flight', () => {
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
-    const { container } = renderPanel({ isRunning: true })
-    const buttons = Array.from(container.querySelectorAll('button'))
-    expect(buttons.length).toBeGreaterThanOrEqual(2)
-    expect(buttons.every((b) => b.disabled)).toBe(true)
+    renderPanel({ isRunning: true })
+    const labels = ['Load', 'Replay']
+    for (const label of labels) {
+      const button = screen
+        .getAllByRole('button')
+        .find((b) => (b.textContent ?? '').includes(label))! as HTMLButtonElement
+      expect(button.disabled).toBe(true)
+    }
   })
 })
 
-describe('WorkspacePanel — delete', () => {
-  function openDeleteDialog() {
+describe('WorkspacePanel — E2 lifecycle badges + toolbar params', () => {
+  it('renders pinned / archived / replay badges', () => {
+    mockResponse = {
+      data: {
+        workspaces: [
+          {
+            ...baseItem,
+            pinned: true,
+            archived: true,
+            replayed_from_workspace_id: 'c'.repeat(32),
+          },
+        ],
+        total: 1,
+      },
+      isLoading: false,
+    }
+    const { container } = renderPanel()
+    expect(container.textContent).toContain('archived')
+    expect(container.textContent).toContain('replay')
+    expect(screen.getByLabelText('Unpin e4-panel')).toBeTruthy()
+  })
+
+  it('flows the debounced search into the q list param (min 2 chars)', async () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    renderPanel()
+    fireEvent.change(screen.getByLabelText('Search workspaces by name'), {
+      target: { value: 'demo' },
+    })
+    await waitFor(() => expect(lastListParams?.q).toBe('demo'))
+  })
+
+  it('flows the show-archived toggle into include_archived', () => {
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
-    mockDeleteResult = { mutate: vi.fn(), isPending: false }
+    const { container } = renderPanel()
+    expect(lastListParams?.include_archived).toBeUndefined()
+    const checkbox = Array.from(container.querySelectorAll('button[role="checkbox"]')).find(
+      (el) => el.parentElement?.textContent?.includes('Show archived'),
+    )!
+    fireEvent.click(checkbox)
+    expect(lastListParams?.include_archived).toBe(true)
+  })
+
+  it('flows the sort select into sort_by/sort_order', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    renderPanel()
+    radixOpen(screen.getByLabelText('Sort workspaces'))
+    fireEvent.click(screen.getByText('Name'))
+    expect(lastListParams?.sort_by).toBe('name')
+    expect(lastListParams?.sort_order).toBe('asc')
+  })
+
+  it('clicking a tag chip filters by that tag; the toolbar chip clears it', () => {
+    mockResponse = {
+      data: { workspaces: [{ ...baseItem, tags: ['smoke'] }], total: 1 },
+      isLoading: false,
+    }
+    renderPanel()
+    fireEvent.click(screen.getByLabelText('Filter by tag smoke'))
+    expect(lastListParams?.tags).toBe('smoke')
+    fireEvent.click(screen.getByLabelText('Clear tag filter smoke'))
+    expect(lastListParams?.tags).toBeUndefined()
+  })
+
+  it('pin toggle fires the PATCH mutation', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    renderPanel()
+    fireEvent.click(screen.getByLabelText('Pin e4-panel'))
+    expect(mockPatchResult.mutate).toHaveBeenCalledWith(
+      { workspaceId: baseItem.workspace_id, update: { pinned: true } },
+      expect.anything(),
+    )
+  })
+
+  it('archive action in the dropdown fires the PATCH mutation', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    renderPanel()
+    radixOpen(screen.getByLabelText('More actions for e4-panel'))
+    fireEvent.click(screen.getByText('Archive'))
+    expect(mockPatchResult.mutate).toHaveBeenCalledWith(
+      { workspaceId: baseItem.workspace_id, update: { archived: true } },
+      expect.anything(),
+    )
+  })
+})
+
+describe('WorkspacePanel — multi-select', () => {
+  function selectBoth() {
+    mockResponse = { data: { workspaces: [baseItem, secondItem], total: 2 }, isLoading: false }
     const result = renderPanel({ onDeleted: vi.fn() })
-    fireEvent.click(screen.getByLabelText('Delete workspace e4-panel'))
+    fireEvent.click(screen.getByLabelText('Select workspace e4-panel'))
+    fireEvent.click(screen.getByLabelText('Select workspace second'))
     return result
   }
 
-  it('renders a Delete action for each saved workspace row', () => {
-    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
-    const { container } = renderPanel()
-    const buttons = Array.from(container.querySelectorAll('button'))
-    expect(buttons.some((b) => (b.textContent ?? '').includes('Delete'))).toBe(true)
+  it('shows the selection footer with the count', () => {
+    const { container } = selectBoth()
+    expect(container.textContent).toContain('2 selected')
+  })
+
+  it('Compare is enabled only at exactly two selections', () => {
+    mockResponse = { data: { workspaces: [baseItem, secondItem], total: 2 }, isLoading: false }
+    renderPanel()
+    fireEvent.click(screen.getByLabelText('Select workspace e4-panel'))
+    const compare = () =>
+      screen
+        .getAllByRole('button')
+        .find((b) => (b.textContent ?? '') === 'Compare')! as HTMLButtonElement
+    expect(compare().disabled).toBe(true)
+    fireEvent.click(screen.getByLabelText('Select workspace second'))
+    expect(compare().disabled).toBe(false)
+    fireEvent.click(compare())
+    expect(mockNavigate).toHaveBeenCalledWith(
+      `/showcase/compare?a=${baseItem.workspace_id}&b=${secondItem.workspace_id}`,
+    )
+  })
+
+  it('delete-selected confirms once then issues N sequential single deletes', async () => {
+    mockDeleteResult.mutateAsync.mockResolvedValue(undefined)
+    selectBoth()
+    fireEvent.click(
+      screen.getAllByRole('button').find((b) => (b.textContent ?? '').includes('Delete selected'))!,
+    )
+    // Nothing deleted before the confirmation.
+    expect(mockDeleteResult.mutateAsync).not.toHaveBeenCalled()
+    expect(document.body.textContent).toContain('Delete 2 workspace records?')
+    fireEvent.click(screen.getByTestId('workspace-multi-delete-confirm'))
+    await waitFor(() => expect(mockDeleteResult.mutateAsync).toHaveBeenCalledTimes(2))
+    expect(mockDeleteResult.mutateAsync).toHaveBeenNthCalledWith(1, baseItem.workspace_id)
+    expect(mockDeleteResult.mutateAsync).toHaveBeenNthCalledWith(2, secondItem.workspace_id)
+    await waitFor(() =>
+      expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('2 workspace records')),
+    )
+  })
+
+  it('collects multi-delete failures into one error toast', async () => {
+    mockDeleteResult.mutateAsync
+      .mockResolvedValueOnce(undefined)
+      .mockRejectedValueOnce(new ApiError('Workspace not found', 404))
+    selectBoth()
+    fireEvent.click(
+      screen.getAllByRole('button').find((b) => (b.textContent ?? '').includes('Delete selected'))!,
+    )
+    fireEvent.click(screen.getByTestId('workspace-multi-delete-confirm'))
+    await waitFor(() =>
+      expect(toast.error).toHaveBeenCalledWith(expect.stringContaining('Some deletes failed')),
+    )
   })
+})
+
+describe('WorkspacePanel — single delete', () => {
+  function openDeleteDialog() {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const result = renderPanel({ onDeleted: vi.fn() })
+    radixOpen(screen.getByLabelText('More actions for e4-panel'))
+    fireEvent.click(screen.getByText('Delete…'))
+    return result
+  }
 
   it('shows a confirmation whose copy makes metadata-only deletion clear', () => {
     openDeleteDialog()
-    // The mutation must not fire before confirmation.
     expect(mockDeleteResult.mutate).not.toHaveBeenCalled()
-    // Radix renders the dialog in a portal — read the whole document.
     const copy = document.body.textContent ?? ''
     expect(copy).toContain('Delete workspace "e4-panel"?')
     expect(copy).toContain('only the saved workspace record')
@@ -159,9 +343,9 @@ describe('WorkspacePanel — delete', () => {
   it('confirming deletes the row and notifies the page on success', () => {
     const onDeleted = vi.fn()
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
-    mockDeleteResult = { mutate: vi.fn(), isPending: false }
     renderPanel({ onDeleted })
-    fireEvent.click(screen.getByLabelText('Delete workspace e4-panel'))
+    radixOpen(screen.getByLabelText('More actions for e4-panel'))
+    fireEvent.click(screen.getByText('Delete…'))
     fireEvent.click(screen.getByTestId('workspace-delete-confirm'))
 
     expect(mockDeleteResult.mutate).toHaveBeenCalledTimes(1)
@@ -170,9 +354,6 @@ describe('WorkspacePanel — delete', () => {
       { onSuccess: () => void; onError: (error: unknown) => void },
     ]
     expect(workspaceId).toBe(baseItem.workspace_id)
-
-    // Success path: the page hook is told so it can drop a loaded workspace;
-    // the list refetch itself lives in useDeleteWorkspace (hook test).
     options.onSuccess()
     expect(onDeleted).toHaveBeenCalledWith(baseItem.workspace_id)
     expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('were kept'))
diff --git a/frontend/src/components/demo/WorkspacePanel.tsx b/frontend/src/components/demo/WorkspacePanel.tsx
index 3231cf14..1fa62fe6 100644
--- a/frontend/src/components/demo/WorkspacePanel.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.tsx
@@ -1,22 +1,39 @@
 /**
- * E4 (#393) — server-backed saved-workspaces panel for the Showcase page.
+ * E4 (#393) / E2 (#408) — server-backed saved-workspaces panel for the
+ * Showcase page.
  *
- * Lists `showcase_workspace` rows (newest first) with three actions per row:
- * - Load   — re-attach: the page repopulates the run controls + renders the
- *            artifact deep-link cards. Read-only; no run starts.
- * - Replay — re-run: the page re-submits the recorded config verbatim through
- *            the existing WS run path with preservation="keep".
- * - Delete — remove the saved workspace METADATA row only (confirmed via
- *            dialog). The run's created objects — model runs, scenario plans,
- *            aliases, jobs, artifacts — are soft references and stay intact.
+ * Lists `showcase_workspace` rows with lifecycle management (E2 #408):
+ * - Toolbar: name search, show-archived toggle, allow-listed sort, active
+ *   tag-filter chip. The panel owns the list params; filtering/sorting is
+ *   server-side (pinned rows always order first).
+ * - Per-row: Load (restore config, read-only), Replay (routes through the
+ *   page's confirm dialog via onRequestReplay — NO replay starts here),
+ *   pin toggle, actions dropdown (pin / archive / edit details / delete),
+ *   pinned/archived/replay badges, clickable tag chips.
+ * - Multi-select: per-row checkboxes; Delete selected (N sequential single
+ *   DELETEs behind one confirmation — deliberately NO bulk endpoint) and
+ *   Compare (exactly 2 → /showcase/compare?a=&b=).
  *
- * The panel stays dumb: it hands the LIST item to the page callbacks; detail
- * fetching (created_objects) lives in the page via useWorkspace.
+ * Deletes remove the workspace METADATA row only — created objects are soft
+ * references and stay intact.
  */
 
-import { useEffect, useState } from 'react'
+import { useEffect, useMemo, useState } from 'react'
+import { useNavigate } from 'react-router-dom'
 import { useQueryClient } from '@tanstack/react-query'
-import { FolderOpen, Play, Trash2 } from 'lucide-react'
+import {
+  Archive,
+  ArchiveRestore,
+  FolderOpen,
+  MoreHorizontal,
+  Pencil,
+  Pin,
+  PinOff,
+  Play,
+  Search,
+  Trash2,
+  X,
+} from 'lucide-react'
 import { toast } from 'sonner'
 import {
   AlertDialog,
@@ -28,17 +45,39 @@ import {
   AlertDialogHeader,
   AlertDialogTitle,
 } from '@/components/ui/alert-dialog'
+import { Badge } from '@/components/ui/badge'
 import { Button } from '@/components/ui/button'
 import { Card, CardContent } from '@/components/ui/card'
-import { useDeleteWorkspace, useWorkspaces } from '@/hooks/use-workspaces'
+import { Checkbox } from '@/components/ui/checkbox'
+import {
+  DropdownMenu,
+  DropdownMenuContent,
+  DropdownMenuItem,
+  DropdownMenuTrigger,
+} from '@/components/ui/dropdown-menu'
+import { Input } from '@/components/ui/input'
+import {
+  Select,
+  SelectContent,
+  SelectItem,
+  SelectTrigger,
+  SelectValue,
+} from '@/components/ui/select'
+import { useDeleteWorkspace, usePatchWorkspace, useWorkspaces } from '@/hooks/use-workspaces'
 import { ApiError, getErrorMessage } from '@/lib/api'
-import type { WorkspaceListItem } from '@/types/api'
+import { ROUTES } from '@/lib/constants'
+import { cn } from '@/lib/utils'
+import type { WorkspaceListItem, WorkspaceListParams } from '@/types/api'
+import { WorkspaceEditDialog } from './WorkspaceEditDialog'
 
 interface WorkspacePanelProps {
   /** Called when the operator clicks Load — restore config + artifacts, no run. */
   onLoad: (ws: WorkspaceListItem) => void
-  /** Called when the operator clicks Replay — re-run the recorded config. */
-  onReplay: (ws: WorkspaceListItem) => void
+  /**
+   * E2 (#408) — called when the operator clicks Replay. The PAGE owns the
+   * confirmation dialog; the panel never starts a replay itself.
+   */
+  onRequestReplay: (ws: WorkspaceListItem) => void
   /** Called after a workspace row was deleted — lets the page drop a loaded one. */
   onDeleted?: (workspaceId: string) => void
   /** Disables all actions while a pipeline run is in flight. */
@@ -47,6 +86,15 @@ interface WorkspacePanelProps {
   lastWorkspaceId: string | null
 }
 
+type SortKey = 'newest' | 'oldest' | 'name' | 'status'
+
+const SORT_PARAMS: Record<SortKey, Pick<WorkspaceListParams, 'sort_by' | 'sort_order'>> = {
+  newest: {},
+  oldest: { sort_by: 'created_at', sort_order: 'asc' },
+  name: { sort_by: 'name', sort_order: 'asc' },
+  status: { sort_by: 'status', sort_order: 'asc' },
+}
+
 function statusClass(status: WorkspaceListItem['status']): string {
   switch (status) {
     case 'completed':
@@ -69,16 +117,45 @@ function labelOf(ws: WorkspaceListItem): string {
 
 export function WorkspacePanel({
   onLoad,
-  onReplay,
+  onRequestReplay,
   onDeleted,
   isRunning,
   lastWorkspaceId,
 }: WorkspacePanelProps) {
-  const { data, isLoading } = useWorkspaces()
+  // ── E2 (#408) — server-side list params ─────────────────────────────────
+  const [search, setSearch] = useState('')
+  const [appliedQ, setAppliedQ] = useState('')
+  const [showArchived, setShowArchived] = useState(false)
+  const [sortKey, setSortKey] = useState<SortKey>('newest')
+  const [tagFilter, setTagFilter] = useState<string | null>(null)
+
+  // Debounced search — the q param needs >= 2 chars (server min_length).
+  useEffect(() => {
+    const handle = window.setTimeout(() => setAppliedQ(search.trim()), 300)
+    return () => window.clearTimeout(handle)
+  }, [search])
+
+  const params = useMemo<WorkspaceListParams>(
+    () => ({
+      ...(appliedQ.length >= 2 ? { q: appliedQ } : {}),
+      ...(tagFilter ? { tags: tagFilter } : {}),
+      ...(showArchived ? { include_archived: true } : {}),
+      ...SORT_PARAMS[sortKey],
+    }),
+    [appliedQ, tagFilter, showArchived, sortKey]
+  )
+
+  const { data, isLoading } = useWorkspaces(params)
   const queryClient = useQueryClient()
   const deleteWorkspace = useDeleteWorkspace()
-  // The row awaiting confirmation — one shared dialog instead of one per row.
+  const patchWorkspace = usePatchWorkspace()
+
+  // ── dialogs + selection state ────────────────────────────────────────────
   const [pendingDelete, setPendingDelete] = useState<WorkspaceListItem | null>(null)
+  const [pendingEdit, setPendingEdit] = useState<WorkspaceListItem | null>(null)
+  const [confirmMultiDelete, setConfirmMultiDelete] = useState(false)
+  const [selected, setSelected] = useState<ReadonlySet<string>>(new Set())
+  const navigate = useNavigate()
 
   const handleConfirmDelete = () => {
     const ws = pendingDelete
@@ -101,6 +178,58 @@ export function WorkspacePanel({
     })
   }
 
+  // E2 (#408) — multi-select delete: N sequential SINGLE deletes (no bulk
+  // endpoint by design); failures collect into one summary toast.
+  const handleConfirmDeleteSelected = async () => {
+    const ids = Array.from(selected)
+    setConfirmMultiDelete(false)
+    const failures: string[] = []
+    for (const id of ids) {
+      try {
+        await deleteWorkspace.mutateAsync(id)
+        onDeleted?.(id)
+      } catch (error) {
+        failures.push(`${id.slice(0, 8)}: ${getErrorMessage(error)}`)
+      }
+    }
+    setSelected(new Set())
+    if (failures.length === 0) {
+      toast.success(
+        `Deleted ${ids.length} workspace record${ids.length === 1 ? '' : 's'} — created objects were kept.`
+      )
+    } else {
+      toast.error(`Some deletes failed: ${failures.join('; ')}`)
+    }
+  }
+
+  const handleTogglePin = (ws: WorkspaceListItem) => {
+    patchWorkspace.mutate(
+      { workspaceId: ws.workspace_id, update: { pinned: !ws.pinned } },
+      { onError: (error) => toast.error(`Update failed: ${getErrorMessage(error)}`) }
+    )
+  }
+
+  const handleToggleArchive = (ws: WorkspaceListItem) => {
+    patchWorkspace.mutate(
+      { workspaceId: ws.workspace_id, update: { archived: !ws.archived } },
+      {
+        onSuccess: () => {
+          toast.success(ws.archived ? 'Workspace unarchived.' : 'Workspace archived.')
+        },
+        onError: (error) => toast.error(`Update failed: ${getErrorMessage(error)}`),
+      }
+    )
+  }
+
+  const toggleSelected = (workspaceId: string) => {
+    setSelected((prev) => {
+      const next = new Set(prev)
+      if (next.has(workspaceId)) next.delete(workspaceId)
+      else next.add(workspaceId)
+      return next
+    })
+  }
+
   // Refetch the list once the latest kept run settles — syncing React state to
   // an external system (the server-backed list) is the sanctioned effect use.
   useEffect(() => {
@@ -110,6 +239,9 @@ export function WorkspacePanel({
   }, [lastWorkspaceId, queryClient])
 
   const items = data?.workspaces ?? []
+  const allSelected = items.length > 0 && items.every((ws) => selected.has(ws.workspace_id))
+  const selectedIds = Array.from(selected)
+  const hasActiveFilter = appliedQ.length >= 2 || tagFilter !== null || showArchived
 
   return (
     <Card>
@@ -122,68 +254,225 @@ export function WorkspacePanel({
             </span>
           )}
         </div>
+
+        {/* E2 (#408) — toolbar: search / show-archived / sort / tag chip. */}
+        <div className="flex flex-wrap items-center gap-3 text-xs">
+          <div className="relative">
+            <Search className="absolute left-2 top-1/2 h-3 w-3 -translate-y-1/2 text-muted-foreground" />
+            <Input
+              className="h-8 w-44 pl-7 text-xs"
+              placeholder="Search by name…"
+              value={search}
+              onChange={(e) => setSearch(e.target.value)}
+              aria-label="Search workspaces by name"
+            />
+          </div>
+          <label className="flex items-center gap-2">
+            <Checkbox
+              checked={showArchived}
+              onCheckedChange={(v) => setShowArchived(v === true)}
+            />
+            <span>Show archived</span>
+          </label>
+          <Select value={sortKey} onValueChange={(v) => setSortKey(v as SortKey)}>
+            <SelectTrigger className="h-8 w-32 text-xs" aria-label="Sort workspaces">
+              <SelectValue />
+            </SelectTrigger>
+            <SelectContent>
+              <SelectItem value="newest">Newest</SelectItem>
+              <SelectItem value="oldest">Oldest</SelectItem>
+              <SelectItem value="name">Name</SelectItem>
+              <SelectItem value="status">Status</SelectItem>
+            </SelectContent>
+          </Select>
+          {tagFilter && (
+            <Badge variant="secondary" className="gap-1">
+              tag: {tagFilter}
+              <button
+                type="button"
+                aria-label={`Clear tag filter ${tagFilter}`}
+                onClick={() => setTagFilter(null)}
+              >
+                <X className="h-3 w-3" />
+              </button>
+            </Badge>
+          )}
+        </div>
+
         {items.length === 0 ? (
           <p className="text-sm text-muted-foreground">
             {isLoading
               ? 'Loading workspaces…'
-              : 'No saved workspaces yet — tick "Save as workspace" before a run to keep it.'}
+              : hasActiveFilter
+                ? 'No workspaces match the active filters.'
+                : 'No saved workspaces yet — tick "Save as workspace" before a run to keep it.'}
           </p>
         ) : (
-          <ul className="space-y-2">
-            {items.map((ws) => (
-              <li
-                key={ws.workspace_id}
-                className="flex flex-wrap items-center justify-between gap-2 rounded-md border px-3 py-2 text-xs"
-              >
-                <div className="flex flex-wrap items-center gap-3 font-mono">
-                  <span className="font-semibold">{labelOf(ws)}</span>
-                  <span className="rounded bg-muted px-2 py-0.5">{ws.scenario}</span>
-                  <span>seed {ws.seed}</span>
-                  <span className={statusClass(ws.status)}>{ws.status.toUpperCase()}</span>
-                  {winnerOf(ws) && <span>winner {winnerOf(ws)}</span>}
-                  {ws.reset && (
-                    <span className="text-destructive">
-                      DESTRUCTIVE (replay wipes all data)
-                    </span>
+          <>
+            <label className="flex items-center gap-2 text-xs text-muted-foreground">
+              <Checkbox
+                checked={allSelected}
+                onCheckedChange={(v) =>
+                  setSelected(
+                    v === true ? new Set(items.map((ws) => ws.workspace_id)) : new Set()
+                  )
+                }
+                aria-label="Select all workspaces"
+              />
+              <span>Select all</span>
+            </label>
+            <ul className="space-y-2">
+              {items.map((ws) => (
+                <li
+                  key={ws.workspace_id}
+                  className={cn(
+                    'flex flex-wrap items-center justify-between gap-2 rounded-md border px-3 py-2 text-xs',
+                    ws.archived && 'opacity-60'
                   )}
-                  <span className="text-muted-foreground">
-                    {new Date(ws.created_at).toLocaleString()}
-                  </span>
-                </div>
-                <div className="flex items-center gap-2">
-                  <Button
-                    size="sm"
-                    variant="outline"
-                    disabled={isRunning}
-                    onClick={() => onLoad(ws)}
-                  >
-                    <FolderOpen className="mr-1 h-3 w-3" />
-                    Load
-                  </Button>
-                  <Button
-                    size="sm"
-                    variant="outline"
-                    disabled={isRunning}
-                    onClick={() => onReplay(ws)}
-                  >
-                    <Play className="mr-1 h-3 w-3" />
-                    Replay
-                  </Button>
-                  <Button
-                    size="sm"
-                    variant="ghost"
-                    className="text-destructive"
-                    disabled={isRunning || deleteWorkspace.isPending}
-                    onClick={() => setPendingDelete(ws)}
-                    aria-label={`Delete workspace ${labelOf(ws)}`}
-                  >
-                    <Trash2 className="mr-1 h-3 w-3" />
-                    Delete
-                  </Button>
-                </div>
-              </li>
-            ))}
-          </ul>
+                >
+                  <div className="flex flex-wrap items-center gap-3 font-mono">
+                    <Checkbox
+                      checked={selected.has(ws.workspace_id)}
+                      onCheckedChange={() => toggleSelected(ws.workspace_id)}
+                      aria-label={`Select workspace ${labelOf(ws)}`}
+                    />
+                    <Button
+                      size="sm"
+                      variant="ghost"
+                      className="h-6 w-6 p-0"
+                      disabled={isRunning || patchWorkspace.isPending}
+                      onClick={() => handleTogglePin(ws)}
+                      aria-label={ws.pinned ? `Unpin ${labelOf(ws)}` : `Pin ${labelOf(ws)}`}
+                    >
+                      {ws.pinned ? (
+                        <Pin className="h-3 w-3 fill-current" />
+                      ) : (
+                        <PinOff className="h-3 w-3 text-muted-foreground" />
+                      )}
+                    </Button>
+                    <span className="font-semibold">{labelOf(ws)}</span>
+                    {ws.archived && <Badge variant="outline">archived</Badge>}
+                    {ws.replayed_from_workspace_id && <Badge variant="outline">replay</Badge>}
+                    <span className="rounded bg-muted px-2 py-0.5">{ws.scenario}</span>
+                    <span>seed {ws.seed}</span>
+                    <span className={statusClass(ws.status)}>{ws.status.toUpperCase()}</span>
+                    {winnerOf(ws) && <span>winner {winnerOf(ws)}</span>}
+                    {ws.reset && (
+                      <span className="text-destructive">
+                        DESTRUCTIVE (replay wipes all data)
+                      </span>
+                    )}
+                    {ws.tags.map((tag) => (
+                      <button
+                        key={tag}
+                        type="button"
+                        onClick={() => setTagFilter(tag)}
+                        aria-label={`Filter by tag ${tag}`}
+                      >
+                        <Badge variant="secondary">{tag}</Badge>
+                      </button>
+                    ))}
+                    <span className="text-muted-foreground">
+                      {new Date(ws.created_at).toLocaleString()}
+                    </span>
+                  </div>
+                  <div className="flex items-center gap-2">
+                    <Button
+                      size="sm"
+                      variant="outline"
+                      disabled={isRunning}
+                      onClick={() => onLoad(ws)}
+                    >
+                      <FolderOpen className="mr-1 h-3 w-3" />
+                      Load
+                    </Button>
+                    <Button
+                      size="sm"
+                      variant="outline"
+                      disabled={isRunning}
+                      onClick={() => onRequestReplay(ws)}
+                    >
+                      <Play className="mr-1 h-3 w-3" />
+                      Replay
+                    </Button>
+                    <DropdownMenu>
+                      <DropdownMenuTrigger asChild>
+                        <Button
+                          size="sm"
+                          variant="ghost"
+                          disabled={isRunning}
+                          aria-label={`More actions for ${labelOf(ws)}`}
+                        >
+                          <MoreHorizontal className="h-3 w-3" />
+                        </Button>
+                      </DropdownMenuTrigger>
+                      <DropdownMenuContent align="end">
+                        <DropdownMenuItem onClick={() => handleTogglePin(ws)}>
+                          {ws.pinned ? (
+                            <PinOff className="mr-2 h-3 w-3" />
+                          ) : (
+                            <Pin className="mr-2 h-3 w-3" />
+                          )}
+                          {ws.pinned ? 'Unpin' : 'Pin'}
+                        </DropdownMenuItem>
+                        <DropdownMenuItem onClick={() => handleToggleArchive(ws)}>
+                          {ws.archived ? (
+                            <ArchiveRestore className="mr-2 h-3 w-3" />
+                          ) : (
+                            <Archive className="mr-2 h-3 w-3" />
+                          )}
+                          {ws.archived ? 'Unarchive' : 'Archive'}
+                        </DropdownMenuItem>
+                        <DropdownMenuItem onClick={() => setPendingEdit(ws)}>
+                          <Pencil className="mr-2 h-3 w-3" />
+                          Edit details…
+                        </DropdownMenuItem>
+                        <DropdownMenuItem
+                          className="text-destructive"
+                          onClick={() => setPendingDelete(ws)}
+                        >
+                          <Trash2 className="mr-2 h-3 w-3" />
+                          Delete…
+                        </DropdownMenuItem>
+                      </DropdownMenuContent>
+                    </DropdownMenu>
+                  </div>
+                </li>
+              ))}
+            </ul>
+
+            {/* E2 (#408) — selection footer. */}
+            {selectedIds.length > 0 && (
+              <div className="flex flex-wrap items-center gap-3 rounded-md border bg-muted/50 px-3 py-2 text-xs">
+                <span className="font-medium">{selectedIds.length} selected</span>
+                <Button
+                  size="sm"
+                  variant="outline"
+                  className="text-destructive"
+                  disabled={isRunning || deleteWorkspace.isPending}
+                  onClick={() => setConfirmMultiDelete(true)}
+                >
+                  <Trash2 className="mr-1 h-3 w-3" />
+                  Delete selected ({selectedIds.length})
+                </Button>
+                <Button
+                  size="sm"
+                  variant="outline"
+                  disabled={selectedIds.length !== 2}
+                  title={
+                    selectedIds.length !== 2 ? 'Select exactly two workspaces to compare' : undefined
+                  }
+                  onClick={() =>
+                    navigate(
+                      `${ROUTES.SHOWCASE_COMPARE}?a=${selectedIds[0]}&b=${selectedIds[1]}`
+                    )
+                  }
+                >
+                  Compare
+                </Button>
+              </div>
+            )}
+          </>
         )}
       </CardContent>
 
@@ -214,6 +503,37 @@ export function WorkspacePanel({
           </AlertDialogFooter>
         </AlertDialogContent>
       </AlertDialog>
+
+      {/* E2 (#408) — one confirmation for the whole selection. */}
+      <AlertDialog
+        open={confirmMultiDelete}
+        onOpenChange={(open) => {
+          if (!open) setConfirmMultiDelete(false)
+        }}
+      >
+        <AlertDialogContent>
+          <AlertDialogHeader>
+            <AlertDialogTitle>Delete {selectedIds.length} workspace records?</AlertDialogTitle>
+            <AlertDialogDescription>
+              Their created objects are NOT deleted — model runs, scenario
+              plans, aliases, jobs, and artifacts stay available elsewhere in
+              the app. This cannot be undone.
+            </AlertDialogDescription>
+          </AlertDialogHeader>
+          <AlertDialogFooter>
+            <AlertDialogCancel>Keep workspaces</AlertDialogCancel>
+            <AlertDialogAction
+              onClick={() => void handleConfirmDeleteSelected()}
+              data-testid="workspace-multi-delete-confirm"
+            >
+              Delete selected
+            </AlertDialogAction>
+          </AlertDialogFooter>
+        </AlertDialogContent>
+      </AlertDialog>
+
+      {/* E2 (#408) — rename / notes / tags editor. */}
+      <WorkspaceEditDialog workspace={pendingEdit} onClose={() => setPendingEdit(null)} />
     </Card>
   )
 }
diff --git a/frontend/src/components/demo/index.ts b/frontend/src/components/demo/index.ts
index ccfe7b71..88731868 100644
--- a/frontend/src/components/demo/index.ts
+++ b/frontend/src/components/demo/index.ts
@@ -2,3 +2,8 @@ export * from './demo-step-card'
 // E4 (#393) — showcase workspace restore/replay panels.
 export * from './WorkspacePanel'
 export * from './WorkspaceArtifactsPanel'
+// E2 (#408) — safe replay + lifecycle + lineage.
+export * from './ReplayConfirmDialog'
+export * from './WorkspaceEditDialog'
+export * from './WorkspaceLineageStrip'
+export * from './workspace-name'
diff --git a/frontend/src/components/demo/replay-request.test.ts b/frontend/src/components/demo/replay-request.test.ts
new file mode 100644
index 00000000..1e50759d
--- /dev/null
+++ b/frontend/src/components/demo/replay-request.test.ts
@@ -0,0 +1,39 @@
+import { describe, expect, it } from 'vitest'
+import { buildReplayRequest } from './replay-request'
+import type { WorkspaceListItem } from '@/types/api'
+
+const baseItem: WorkspaceListItem = {
+  workspace_id: 'a'.repeat(32),
+  name: 'replayable',
+  status: 'completed',
+  seed: 7,
+  scenario: 'showcase_rich',
+  reset: true,
+  skip_seed: false,
+  result_summary: null,
+  created_at: '2026-06-01T12:00:00Z',
+  archived: false,
+  pinned: false,
+  tags: [],
+  replayed_from_workspace_id: null,
+}
+
+describe('buildReplayRequest', () => {
+  it('re-submits the recorded config verbatim with keep + provenance', () => {
+    expect(buildReplayRequest(baseItem)).toEqual({
+      seed: 7,
+      scenario: 'showcase_rich',
+      reset: true,
+      skip_seed: false,
+      preservation: 'keep',
+      replayed_from_workspace_id: baseItem.workspace_id,
+      workspace_name: 'replayable',
+    })
+  })
+
+  it('omits workspace_name on an unnamed row (names stay optional)', () => {
+    const request = buildReplayRequest({ ...baseItem, name: null })
+    expect('workspace_name' in request).toBe(false)
+    expect(request.preservation).toBe('keep')
+  })
+})
diff --git a/frontend/src/components/demo/replay-request.ts b/frontend/src/components/demo/replay-request.ts
new file mode 100644
index 00000000..e2ecee3d
--- /dev/null
+++ b/frontend/src/components/demo/replay-request.ts
@@ -0,0 +1,19 @@
+import type { DemoRunRequest, WorkspaceListItem } from '@/types/api'
+
+/**
+ * E2 (#408) — the EXACT request a confirmed replay sends. Single source for
+ * the confirm dialog's "Will send" column AND the page's executeReplay, so
+ * the preview can never lie about what goes on the wire.
+ */
+export function buildReplayRequest(ws: WorkspaceListItem): DemoRunRequest {
+  return {
+    seed: ws.seed,
+    scenario: ws.scenario,
+    reset: ws.reset,
+    skip_seed: ws.skip_seed,
+    preservation: 'keep',
+    // E1 (#407) — record replay lineage on the NEW row (soft reference).
+    replayed_from_workspace_id: ws.workspace_id,
+    ...(ws.name ? { workspace_name: ws.name } : {}),
+  }
+}
diff --git a/frontend/src/components/demo/workspace-name.ts b/frontend/src/components/demo/workspace-name.ts
new file mode 100644
index 00000000..cd14aa34
--- /dev/null
+++ b/frontend/src/components/demo/workspace-name.ts
@@ -0,0 +1,8 @@
+// E2 (#408) — single source for the workspace-name client validation,
+// shared by the showcase run controls and the WorkspaceEditDialog. Mirrors
+// the backend DemoRunRequest.workspace_name pattern (app/features/demo/
+// schemas.py): lowercase letter/digit first, then -/_ too; the ≤100-char cap is NOT in this regex.
+export const WORKSPACE_NAME_PATTERN = /^[a-z0-9][a-z0-9\-_]*$/
+
+export const WORKSPACE_NAME_HINT =
+  'Use lowercase letters, digits, “-” or “_”; must start with a letter or digit.'
diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index 9643de1a..6a3497ce 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -3,14 +3,18 @@ import { Play, Loader2, Trophy, AlertTriangle, ArrowRight, Square } from 'lucide
 import { useState } from 'react'
 import { useDemoPipeline } from '@/hooks/use-demo-pipeline'
 import type { DemoStep } from '@/hooks/use-demo-pipeline'
-import { useWorkspace } from '@/hooks/use-workspaces'
+import { useWorkspace, useWorkspaceHealth } from '@/hooks/use-workspaces'
 import { DemoPhasePanel } from '@/components/demo/DemoPhasePanel'
 import { ScenarioPicker } from '@/components/demo/ScenarioPicker'
 import { ShowcaseKpiStrip } from '@/components/demo/ShowcaseKpiStrip'
 import { InspectArtifactsPanel } from '@/components/demo/InspectArtifactsPanel'
 import { RunHistoryStrip } from '@/components/demo/RunHistoryStrip'
+import { ReplayConfirmDialog } from '@/components/demo/ReplayConfirmDialog'
+import { WorkspaceLineageStrip } from '@/components/demo/WorkspaceLineageStrip'
 import { WorkspacePanel } from '@/components/demo/WorkspacePanel'
 import { WorkspaceArtifactsPanel } from '@/components/demo/WorkspaceArtifactsPanel'
+import { buildReplayRequest } from '@/components/demo/replay-request'
+import { WORKSPACE_NAME_PATTERN } from '@/components/demo/workspace-name'
 import { Button } from '@/components/ui/button'
 import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card'
 import { Checkbox } from '@/components/ui/checkbox'
@@ -21,10 +25,6 @@ import type { WorkspaceListItem } from '@/types/api'
 
 const TERMINAL_STATUSES = new Set(['pass', 'fail', 'skip', 'warn'])
 
-// E4 (#393) — mirrors the backend DemoRunRequest.workspace_name pattern
-// (schemas.py): lowercase letters/digits, then -/_ allowed; ≤100 chars.
-const WORKSPACE_NAME_PATTERN = /^[a-z0-9][a-z0-9\-_]*$/
-
 /**
  * PRP-38 / PRP-39 / PRP-40 — resolve the per-step Inspect deep link.
  *
@@ -122,6 +122,8 @@ export default function ShowcasePage() {
   const [keepWorkspace, setKeepWorkspace] = useState(false)
   const [workspaceName, setWorkspaceName] = useState('')
   const [selectedWorkspaceId, setSelectedWorkspaceId] = useState<string | null>(null)
+  // E2 (#408) — the workspace awaiting replay confirmation (null = no dialog).
+  const [pendingReplay, setPendingReplay] = useState<WorkspaceListItem | null>(null)
 
   // The page (not the panel) resolves the loaded workspace's detail — the
   // artifacts panel needs detail-only created_objects.
@@ -129,6 +131,11 @@ export default function ShowcasePage() {
     selectedWorkspaceId ?? '',
     !!selectedWorkspaceId
   )
+  // E2 (#408) — probe the LOADED workspace's soft references (never per row).
+  const { data: workspaceHealth } = useWorkspaceHealth(
+    selectedWorkspaceId ?? '',
+    !!selectedWorkspaceId
+  )
 
   const completed = steps.filter((s) => TERMINAL_STATUSES.has(s.status)).length
 
@@ -167,24 +174,24 @@ export default function ShowcasePage() {
     setSelectedWorkspaceId(ws.workspace_id)
   }
 
-  // E4 (#393) — Replay: Load, then re-submit the recorded config VERBATIM
-  // through the existing WS run path with preservation='keep' (a replay is
-  // itself a workspace run). setScenario runs first (picker-desync gotcha:
-  // start() does not sync the picker state).
+  // E2 (#408) — Replay request: every replay first opens the confirmation
+  // dialog (recorded-vs-sent preview; destructive variant on reset=true).
+  // NO code path starts a replay without it.
   const handleReplayWorkspace = (ws: WorkspaceListItem) => {
+    setPendingReplay(ws)
+  }
+
+  // E4 (#393) / E2 (#408) — the CONFIRMED replay: Load, then re-submit the
+  // recorded config VERBATIM through the existing WS run path with
+  // preservation='keep' (a replay is itself a workspace run). setScenario
+  // runs first via handleLoadWorkspace (picker-desync gotcha: start() does
+  // not sync the picker state).
+  const executeReplay = (ws: WorkspaceListItem) => {
     handleLoadWorkspace(ws)
     // The re-run's live cards take over; the original row stays untouched.
     setSelectedWorkspaceId(null)
-    start({
-      seed: ws.seed,
-      scenario: ws.scenario,
-      reset: ws.reset,
-      skip_seed: ws.skip_seed,
-      preservation: 'keep',
-      // E1 (#407) — record replay lineage on the NEW row (soft reference).
-      replayed_from_workspace_id: ws.workspace_id,
-      ...(ws.name ? { workspace_name: ws.name } : {}),
-    })
+    start(buildReplayRequest(ws))
+    setPendingReplay(null)
   }
 
   // For the Inspect link to surface store_id/product_id on the train/backtest
@@ -243,10 +250,11 @@ export default function ShowcasePage() {
         scenario={scenario}
       />
 
-      {/* E4 (#393) — server-backed saved workspaces (Load + Replay + Delete). */}
+      {/* E4 (#393) / E2 (#408) — server-backed saved workspaces (lifecycle
+          panel; Replay routes through the confirm dialog below). */}
       <WorkspacePanel
         onLoad={handleLoadWorkspace}
-        onReplay={handleReplayWorkspace}
+        onRequestReplay={handleReplayWorkspace}
         onDeleted={(workspaceId) => {
           // Deleting the currently loaded workspace detaches its artifacts
           // panel — the metadata row backing it is gone (created objects stay).
@@ -446,10 +454,28 @@ export default function ShowcasePage() {
       )}
 
       {/* E4 (#393) — re-attached artifacts of a LOADED workspace. Any started
-          run detaches it (selectedWorkspaceId cleared) so live cards take over. */}
+          run detaches it (selectedWorkspaceId cleared) so live cards take over.
+          E2 (#408) — lineage strip + link-health markers ride along. */}
       {phase !== 'running' && loadedWorkspace && (
-        <WorkspaceArtifactsPanel workspace={loadedWorkspace} />
+        <div className="space-y-2">
+          <WorkspaceLineageStrip
+            workspaceId={loadedWorkspace.workspace_id}
+            onLoadAncestor={(ancestor) => handleLoadWorkspace(ancestor)}
+          />
+          <WorkspaceArtifactsPanel
+            workspace={loadedWorkspace}
+            health={workspaceHealth ?? null}
+          />
+        </div>
       )}
+
+      {/* E2 (#408) — replay confirmation with the recorded-vs-sent preview. */}
+      <ReplayConfirmDialog
+        workspace={pendingReplay}
+        requestPreview={pendingReplay ? buildReplayRequest(pendingReplay) : null}
+        onConfirm={() => pendingReplay && executeReplay(pendingReplay)}
+        onCancel={() => setPendingReplay(null)}
+      />
     </div>
   )
 }

From c957de80bcf1c5d826adb716ed3ab9c4bf32b3e8 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:10 +0200
Subject: [PATCH 14/32] feat(ui): add two-workspace compare page (#408)

---
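The compare page diffs two already-served WorkspaceDetail payloads on the
client. The following is a minimal sketch of the two calculations the tests in
this patch exercise: the created-objects presence matrix over the key union,
and the WAPE delta. The names presenceMatrix, wapeDelta, PresenceRow and
CreatedObjects are hypothetical and chosen for illustration; the B-minus-A
direction of the delta is an assumption that happens to agree with the tested
value (0.25 - 0.15 renders as 0.1000), and the page's real helpers may differ.

```typescript
// Hypothetical helpers for illustration only; not the page's actual code.
type CreatedObjects = Record<string, unknown>

interface PresenceRow {
  key: string
  inA: boolean
  inB: boolean
}

/** One row per key in the union of both sides, in first-seen order. */
function presenceMatrix(a: CreatedObjects, b: CreatedObjects): PresenceRow[] {
  const keys = Array.from(new Set([...Object.keys(a), ...Object.keys(b)]))
  return keys.map((key) => ({ key, inA: key in a, inB: key in b }))
}

/** Assumed direction: B minus A. Null when either side has no number. */
function wapeDelta(a: number | null, b: number | null): number | null {
  return a === null || b === null ? null : b - a
}
```

With the fixtures from workspace-compare.test.tsx, the matrix would list
winning_run_id (both sides), alias (A only) and batch_id (B only). Keeping the
delta as a plain signed number leaves the neutral arrow rendering to the cell.
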
 frontend/src/App.tsx                          |   9 +
 frontend/src/lib/constants.ts                 |   2 +
 frontend/src/pages/workspace-compare.test.tsx | 157 +++++++
 frontend/src/pages/workspace-compare.tsx      | 394 ++++++++++++++++++
 4 files changed, 562 insertions(+)
 create mode 100644 frontend/src/pages/workspace-compare.test.tsx
 create mode 100644 frontend/src/pages/workspace-compare.tsx

diff --git a/frontend/src/App.tsx b/frontend/src/App.tsx
index 2dc4042f..56df44e9 100644
--- a/frontend/src/App.tsx
+++ b/frontend/src/App.tsx
@@ -10,6 +10,7 @@ import { ROUTES } from '@/lib/constants'
 // Lazy-loaded page components
 const DashboardPage = lazy(() => import('@/pages/dashboard'))
 const ShowcasePage = lazy(() => import('@/pages/showcase'))
+const WorkspaceComparePage = lazy(() => import('@/pages/workspace-compare'))
 const OpsPage = lazy(() => import('@/pages/ops'))
 const SalesExplorerPage = lazy(() => import('@/pages/explorer/sales'))
 const StoresExplorerPage = lazy(() => import('@/pages/explorer/stores'))
@@ -59,6 +60,14 @@ function App() {
                   </Suspense>
                 }
               />
+              <Route
+                path={ROUTES.SHOWCASE_COMPARE}
+                element={
+                  <Suspense fallback={<PageLoader />}>
+                    <WorkspaceComparePage />
+                  </Suspense>
+                }
+              />
               <Route
                 path={ROUTES.OPS}
                 element={
diff --git a/frontend/src/lib/constants.ts b/frontend/src/lib/constants.ts
index 95cb28b8..68de8031 100644
--- a/frontend/src/lib/constants.ts
+++ b/frontend/src/lib/constants.ts
@@ -2,6 +2,8 @@
 export const ROUTES = {
   DASHBOARD: '/',
   SHOWCASE: '/showcase',
+  // E2 (#408) — two-workspace compare; deep-linkable via ?a=&b=.
+  SHOWCASE_COMPARE: '/showcase/compare',
   OPS: '/ops',
   EXPLORER: {
     SALES: '/explorer/sales',
diff --git a/frontend/src/pages/workspace-compare.test.tsx b/frontend/src/pages/workspace-compare.test.tsx
new file mode 100644
index 00000000..d4dfd706
--- /dev/null
+++ b/frontend/src/pages/workspace-compare.test.tsx
@@ -0,0 +1,157 @@
+import { cleanup, render } from '@testing-library/react'
+import { MemoryRouter, Route, Routes } from 'react-router-dom'
+import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from 'vitest'
+import WorkspaceComparePage from './workspace-compare'
+import type { WorkspaceDetail } from '@/types/api'
+
+beforeAll(() => {
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+})
+
+afterEach(() => {
+  cleanup()
+  vi.clearAllMocks()
+})
+
+const idA = 'a'.repeat(32)
+const idB = 'b'.repeat(32)
+
+function makeDetail(overrides: Partial<WorkspaceDetail>): WorkspaceDetail {
+  return {
+    workspace_id: idA,
+    name: 'ws-a',
+    status: 'completed',
+    seed: 42,
+    scenario: 'demo_minimal',
+    reset: false,
+    skip_seed: true,
+    result_summary: {
+      winner_model_type: 'seasonal_naive',
+      winner_wape: 0.15,
+      wall_clock_s: 12,
+    },
+    created_at: '2026-06-01T12:00:00Z',
+    archived: false,
+    pinned: false,
+    tags: [],
+    replayed_from_workspace_id: null,
+    store_id: 3,
+    product_id: 7,
+    date_start: '2026-01-01',
+    date_end: '2026-03-31',
+    created_objects: { winning_run_id: 'run-1', alias: 'demo-production' },
+    notes: null,
+    config_schema_version: 1,
+    ...overrides,
+  }
+}
+
+let details: Record<string, WorkspaceDetail | undefined> = {}
+
+vi.mock('@/hooks/use-workspaces', () => ({
+  useWorkspaces: () => ({
+    data: {
+      workspaces: Object.values(details).filter(Boolean),
+      total: Object.values(details).filter(Boolean).length,
+    },
+    isLoading: false,
+  }),
+  useWorkspace: (workspaceId: string, enabled = true) => {
+    if (!enabled || !workspaceId) return { data: undefined, isLoading: false, error: null }
+    const detail = details[workspaceId]
+    return detail
+      ? { data: detail, isLoading: false, error: null }
+      : { data: undefined, isLoading: false, error: new Error('not found') }
+  },
+}))
+
+beforeEach(() => {
+  details = {
+    [idA]: makeDetail({}),
+    [idB]: makeDetail({
+      workspace_id: idB,
+      name: 'ws-b',
+      seed: 99,
+      status: 'failed',
+      replayed_from_workspace_id: idA,
+      result_summary: {
+        winner_model_type: 'naive',
+        winner_wape: 0.25,
+        wall_clock_s: 20,
+      },
+      created_objects: { winning_run_id: 'run-2', batch_id: 'batch-1' },
+    }),
+  }
+})
+
+function renderPage(query = `?a=${idA}&b=${idB}`) {
+  return render(
+    <MemoryRouter initialEntries={[`/showcase/compare${query}`]}>
+      <Routes>
+        <Route path="/showcase/compare" element={<WorkspaceComparePage />} />
+      </Routes>
+    </MemoryRouter>,
+  )
+}
+
+describe('WorkspaceComparePage', () => {
+  it('renders the config diff for two deep-linked workspaces', () => {
+    const { container } = renderPage()
+    const copy = container.textContent ?? ''
+    expect(copy).toContain('ws-a')
+    expect(copy).toContain('ws-b')
+    expect(copy).toContain('42')
+    expect(copy).toContain('99')
+    // Mismatching seed rows are emphasized.
+    const bolded = Array.from(container.querySelectorAll('td.font-semibold')).map(
+      (el) => el.textContent,
+    )
+    expect(bolded).toContain('42')
+    expect(bolded).toContain('99')
+  })
+
+  it('renders the result diff with the sign-only WAPE delta', () => {
+    const { container } = renderPage()
+    const copy = container.textContent ?? ''
+    expect(copy).toContain('seasonal_naive')
+    expect(copy).toContain('0.1500')
+    expect(copy).toContain('0.2500')
+    expect(copy).toContain('0.1000') // 0.25 - 0.15
+  })
+
+  it('renders the created-objects presence matrix over the key union', () => {
+    const { container } = renderPage()
+    const copy = container.textContent ?? ''
+    expect(copy).toContain('winning_run_id')
+    expect(copy).toContain('alias')
+    expect(copy).toContain('batch_id')
+  })
+
+  it('renders the lineage note when one side replays the other', () => {
+    const { container } = renderPage()
+    expect(container.textContent).toContain('Workspace B is a replay of workspace A.')
+  })
+
+  it('renders the partial-run badge on a failed side', () => {
+    const { container } = renderPage()
+    expect(container.textContent).toContain('partial run')
+  })
+
+  it('degrades to the picker when an id no longer resolves (no crash)', () => {
+    details[idB] = undefined
+    const { container } = renderPage()
+    expect(container.textContent).toContain('no longer exists')
+    // The diff sections never render half-ready.
+    expect(container.textContent).not.toContain('Created objects')
+  })
+
+  it('prompts for selection when ids are missing', () => {
+    const { container } = renderPage('')
+    expect(container.textContent).toContain('Select two workspaces')
+  })
+})
diff --git a/frontend/src/pages/workspace-compare.tsx b/frontend/src/pages/workspace-compare.tsx
new file mode 100644
index 00000000..f6e8b785
--- /dev/null
+++ b/frontend/src/pages/workspace-compare.tsx
@@ -0,0 +1,394 @@
+/**
+ * E2 (#408) — two-workspace compare page (/showcase/compare?a=&b=).
+ *
+ * Mirrors the run-compare two-picker pattern (pages/explorer/run-compare.tsx)
+ * but the diff is FRONTEND-ONLY: a workspace compare is a plain field diff
+ * over two already-served WorkspaceDetail payloads — no backend endpoint.
+ * Renders: config table (mismatches highlighted), result-summary diff
+ * (WAPE delta is sign-only), created-objects presence matrix, lineage note
+ * when one side replays the other, and partial-run badges. Invalid/missing
+ * ids degrade to the picker — never a crash.
+ */
+
+import { Link, useSearchParams } from 'react-router-dom'
+import { ArrowDown, ArrowLeft, ArrowUp } from 'lucide-react'
+import { useWorkspace, useWorkspaces } from '@/hooks/use-workspaces'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card'
+import {
+  Select,
+  SelectContent,
+  SelectItem,
+  SelectTrigger,
+  SelectValue,
+} from '@/components/ui/select'
+import {
+  Table,
+  TableBody,
+  TableCell,
+  TableHead,
+  TableHeader,
+  TableRow,
+} from '@/components/ui/table'
+import { formatNumber } from '@/lib/api'
+import { ROUTES } from '@/lib/constants'
+import { cn } from '@/lib/utils'
+import type { WorkspaceDetail, WorkspaceListItem } from '@/types/api'
+
+/** Neutral delta indicator — sign only, no better/worse colour-coding. */
+function DeltaCell({ diff }: { diff: number | null }) {
+  if (diff == null) {
+    return <span className="text-muted-foreground">—</span>
+  }
+  if (diff > 0) {
+    return (
+      <span className="inline-flex items-center gap-1">
+        <ArrowUp className="h-3 w-3" />
+        {formatNumber(diff, 4)}
+      </span>
+    )
+  }
+  if (diff < 0) {
+    return (
+      <span className="inline-flex items-center gap-1">
+        <ArrowDown className="h-3 w-3" />
+        {formatNumber(diff, 4)}
+      </span>
+    )
+  }
+  return <span>{formatNumber(diff, 4)}</span>
+}
+
+function labelOf(ws: WorkspaceListItem): string {
+  return ws.name ?? ws.workspace_id.slice(0, 8)
+}
+
+function WorkspacePicker({
+  label,
+  value,
+  workspaces,
+  onSelect,
+}: {
+  label: string
+  value: string
+  workspaces: WorkspaceListItem[]
+  onSelect: (workspaceId: string) => void
+}) {
+  return (
+    <div className="space-y-1">
+      <span className="text-xs text-muted-foreground">{label}</span>
+      <Select value={value || undefined} onValueChange={onSelect}>
+        <SelectTrigger className="w-full">
+          <SelectValue placeholder="Select a workspace…" />
+        </SelectTrigger>
+        <SelectContent>
+          {workspaces.map((ws) => (
+            <SelectItem key={ws.workspace_id} value={ws.workspace_id}>
+              {labelOf(ws)} · {ws.scenario} · {ws.status}
+            </SelectItem>
+          ))}
+        </SelectContent>
+      </Select>
+    </div>
+  )
+}
+
+function summaryNumber(ws: WorkspaceDetail, key: string): number | null {
+  const value = ws.result_summary?.[key]
+  return typeof value === 'number' ? value : null
+}
+
+function summaryString(ws: WorkspaceDetail, key: string): string | null {
+  const value = ws.result_summary?.[key]
+  return typeof value === 'string' ? value : null
+}
+
+interface ConfigRow {
+  field: string
+  a: string
+  b: string
+}
+
+function buildConfigRows(a: WorkspaceDetail, b: WorkspaceDetail): ConfigRow[] {
+  const fmt = (value: unknown): string =>
+    value === null || value === undefined || value === '' ? '—' : String(value)
+  return [
+    { field: 'seed', a: fmt(a.seed), b: fmt(b.seed) },
+    { field: 'scenario', a: fmt(a.scenario), b: fmt(b.scenario) },
+    { field: 'reset', a: fmt(a.reset), b: fmt(b.reset) },
+    { field: 'skip_seed', a: fmt(a.skip_seed), b: fmt(b.skip_seed) },
+    { field: 'name', a: fmt(a.name), b: fmt(b.name) },
+    { field: 'tags', a: fmt(a.tags.join(', ')), b: fmt(b.tags.join(', ')) },
+  ]
+}
+
+/** Union of soft-reference keys recorded on either side. */
+function objectKeys(a: WorkspaceDetail, b: WorkspaceDetail): string[] {
+  return Array.from(
+    new Set([...Object.keys(a.created_objects), ...Object.keys(b.created_objects)])
+  ).sort()
+}
+
+function lineageNote(a: WorkspaceDetail, b: WorkspaceDetail): string | null {
+  if (b.replayed_from_workspace_id === a.workspace_id) {
+    return 'Workspace B is a replay of workspace A.'
+  }
+  if (a.replayed_from_workspace_id === b.workspace_id) {
+    return 'Workspace A is a replay of workspace B.'
+  }
+  return null
+}
+
+function SideStatus({ ws }: { ws: WorkspaceDetail }) {
+  return (
+    <span className="inline-flex items-center gap-2">
+      {ws.status}
+      {ws.status !== 'completed' && (
+        <Badge variant="outline" className="text-destructive">
+          partial run
+        </Badge>
+      )}
+    </span>
+  )
+}
+
+export default function WorkspaceComparePage() {
+  const [params, setParams] = useSearchParams()
+  const a = params.get('a') ?? ''
+  const b = params.get('b') ?? ''
+
+  // Pickers include archived rows — comparing an archived run is legitimate.
+  const listQuery = useWorkspaces({ limit: 100, include_archived: true })
+  const detailA = useWorkspace(a, !!a)
+  const detailB = useWorkspace(b, !!b)
+
+  function selectWorkspace(slot: 'a' | 'b', workspaceId: string) {
+    setParams((prev) => {
+      const next = new URLSearchParams(prev)
+      next.set(slot, workspaceId)
+      return next
+    })
+  }
+
+  const workspaces = listQuery.data?.workspaces ?? []
+  const wsA = detailA.data
+  const wsB = detailB.data
+  // A 404 (deleted id in the URL) degrades to the picker — never a crash.
+  const bothReady = !!wsA && !!wsB
+
+  const wapeA = wsA ? summaryNumber(wsA, 'winner_wape') : null
+  const wapeB = wsB ? summaryNumber(wsB, 'winner_wape') : null
+  const note = bothReady ? lineageNote(wsA, wsB) : null
+
+  return (
+    <div className="space-y-6">
+      <div className="space-y-1">
+        <Button asChild variant="ghost" size="sm" className="-ml-2 h-7">
+          <Link to={ROUTES.SHOWCASE}>
+            <ArrowLeft className="mr-1 h-4 w-4" />
+            Back to Showcase
+          </Link>
+        </Button>
+        <h1 className="text-3xl font-bold">Compare workspaces</h1>
+        <p className="text-sm text-muted-foreground">
+          Pick two saved showcase workspaces to compare their replay config,
+          results, and recorded objects side by side.
+        </p>
+      </div>
+
+      <Card>
+        <CardHeader>
+          <CardTitle>Select workspaces</CardTitle>
+          <CardDescription>
+            The comparison is deep-linkable — the URL carries the two workspace ids.
+          </CardDescription>
+        </CardHeader>
+        <CardContent className="grid gap-4 sm:grid-cols-2">
+          <WorkspacePicker
+            label="Workspace A"
+            value={a}
+            workspaces={workspaces}
+            onSelect={(id) => selectWorkspace('a', id)}
+          />
+          <WorkspacePicker
+            label="Workspace B"
+            value={b}
+            workspaces={workspaces}
+            onSelect={(id) => selectWorkspace('b', id)}
+          />
+        </CardContent>
+      </Card>
+
+      {(!a || !b || detailA.error || detailB.error || !bothReady) && (
+        <Card>
+          <CardContent className="py-10 text-center text-sm text-muted-foreground">
+            {detailA.error || detailB.error
+              ? 'One of the selected workspaces no longer exists — select another above.'
+              : detailA.isLoading || detailB.isLoading
+                ? 'Loading workspaces…'
+                : 'Select two workspaces above to see the comparison.'}
+          </CardContent>
+        </Card>
+      )}
+
+      {bothReady && (
+        <>
+          {note && (
+            <Card>
+              <CardContent className="py-4 text-sm" data-testid="lineage-note">
+                {note}
+              </CardContent>
+            </Card>
+          )}
+
+          <Card>
+            <CardHeader>
+              <CardTitle>Config</CardTitle>
+              <CardDescription>
+                Recorded replay config — mismatching rows are highlighted.
+              </CardDescription>
+            </CardHeader>
+            <CardContent>
+              <Table>
+                <TableHeader>
+                  <TableRow>
+                    <TableHead>Field</TableHead>
+                    <TableHead>Workspace A</TableHead>
+                    <TableHead>Workspace B</TableHead>
+                  </TableRow>
+                </TableHeader>
+                <TableBody>
+                  <TableRow>
+                    <TableCell className="font-medium">Workspace ID</TableCell>
+                    <TableCell className="break-all font-mono text-xs">
+                      {wsA.workspace_id}
+                    </TableCell>
+                    <TableCell className="break-all font-mono text-xs">
+                      {wsB.workspace_id}
+                    </TableCell>
+                  </TableRow>
+                  <TableRow>
+                    <TableCell className="font-medium">Status</TableCell>
+                    <TableCell>
+                      <SideStatus ws={wsA} />
+                    </TableCell>
+                    <TableCell>
+                      <SideStatus ws={wsB} />
+                    </TableCell>
+                  </TableRow>
+                  {buildConfigRows(wsA, wsB).map((row) => {
+                    const mismatch = row.a !== row.b
+                    return (
+                      <TableRow key={row.field}>
+                        <TableCell className="font-medium">{row.field}</TableCell>
+                        <TableCell className={cn('text-xs', mismatch && 'font-semibold')}>
+                          {row.a}
+                        </TableCell>
+                        <TableCell className={cn('text-xs', mismatch && 'font-semibold')}>
+                          {row.b}
+                        </TableCell>
+                      </TableRow>
+                    )
+                  })}
+                </TableBody>
+              </Table>
+            </CardContent>
+          </Card>
+
+          <Card>
+            <CardHeader>
+              <CardTitle>Results</CardTitle>
+              <CardDescription>
+                Δ is Workspace B minus Workspace A — sign only, not a quality judgement.
+              </CardDescription>
+            </CardHeader>
+            <CardContent>
+              <Table>
+                <TableHeader>
+                  <TableRow>
+                    <TableHead>Metric</TableHead>
+                    <TableHead>Workspace A</TableHead>
+                    <TableHead>Workspace B</TableHead>
+                    <TableHead>Δ</TableHead>
+                  </TableRow>
+                </TableHeader>
+                <TableBody>
+                  <TableRow>
+                    <TableCell className="font-medium">Winner</TableCell>
+                    <TableCell>{summaryString(wsA, 'winner_model_type') ?? '—'}</TableCell>
+                    <TableCell>{summaryString(wsB, 'winner_model_type') ?? '—'}</TableCell>
+                    <TableCell>
+                      <span className="text-muted-foreground">—</span>
+                    </TableCell>
+                  </TableRow>
+                  <TableRow>
+                    <TableCell className="font-medium">Winner WAPE</TableCell>
+                    <TableCell>{wapeA != null ? formatNumber(wapeA, 4) : '—'}</TableCell>
+                    <TableCell>{wapeB != null ? formatNumber(wapeB, 4) : '—'}</TableCell>
+                    <TableCell>
+                      <DeltaCell
+                        diff={wapeA != null && wapeB != null ? wapeB - wapeA : null}
+                      />
+                    </TableCell>
+                  </TableRow>
+                  <TableRow>
+                    <TableCell className="font-medium">Wall-clock (s)</TableCell>
+                    <TableCell>
+                      {summaryNumber(wsA, 'wall_clock_s') != null
+                        ? formatNumber(summaryNumber(wsA, 'wall_clock_s')!, 1)
+                        : '—'}
+                    </TableCell>
+                    <TableCell>
+                      {summaryNumber(wsB, 'wall_clock_s') != null
+                        ? formatNumber(summaryNumber(wsB, 'wall_clock_s')!, 1)
+                        : '—'}
+                    </TableCell>
+                    <TableCell>
+                      <span className="text-muted-foreground">—</span>
+                    </TableCell>
+                  </TableRow>
+                </TableBody>
+              </Table>
+            </CardContent>
+          </Card>
+
+          <Card>
+            <CardHeader>
+              <CardTitle>Created objects</CardTitle>
+              <CardDescription>
+                Which soft references each run recorded (✓ recorded / — absent).
+              </CardDescription>
+            </CardHeader>
+            <CardContent>
+              {objectKeys(wsA, wsB).length === 0 ? (
+                <p className="text-sm text-muted-foreground">
+                  Neither workspace recorded any created objects.
+                </p>
+              ) : (
+                <Table>
+                  <TableHeader>
+                    <TableRow>
+                      <TableHead>Object</TableHead>
+                      <TableHead>Workspace A</TableHead>
+                      <TableHead>Workspace B</TableHead>
+                    </TableRow>
+                  </TableHeader>
+                  <TableBody>
+                    {objectKeys(wsA, wsB).map((key) => (
+                      <TableRow key={key}>
+                        <TableCell className="font-mono text-xs">{key}</TableCell>
+                        <TableCell>{key in wsA.created_objects ? '✓' : '—'}</TableCell>
+                        <TableCell>{key in wsB.created_objects ? '✓' : '—'}</TableCell>
+                      </TableRow>
+                    ))}
+                  </TableBody>
+                </Table>
+              )}
+            </CardContent>
+          </Card>
+        </>
+      )}
+    </div>
+  )
+}

From 0560e0eba6e56958f8605087d4dd099a088a3da2 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:24:10 +0200
Subject: [PATCH 15/32] docs(api): document workspace lifecycle and health
 contracts (#408)

---
 docs/_base/API_CONTRACTS.md | 5 +++--
 docs/_base/RUNBOOKS.md      | 8 ++++----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index 47bc7b6e..70e6f5ab 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -60,8 +60,9 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | seeder | POST | `/seeder/phase2-enrichment` | PRP-38 — run Phase 2 generators (lifecycle, replenishment, exogenous, returns) against the existing seeded data. `422 application/problem+json` on an empty database. |
 | demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). |
 | demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page |
-| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`; archived rows still list (default-filtering is E2 #408) |
+| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`. **E2 (#408)** — additive query params: `q` (name ILIKE search, min 2 chars), repeated `tags` (JSONB containment — all listed tags must match), `include_archived` (default `false` — archived rows are now HIDDEN by default), allow-listed `sort_by` (`created_at`/`name`/`seed`/`status`; unknown → default `created_at desc`, no 422) + `sort_order` (`asc`/`desc`); pinned rows always order first; `total` respects the active filters |
 | demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing. **E1 (#407)** — response additively carries the list-item lifecycle fields plus `notes`, `config_schema_version`, and the six story slots (`seed_overrides` / `user_scope` / `approval_events` / `rag_events` / `job_ids` / `phase_summaries` — all `null` until their writer epic lands; schemas in `docs/_base/DOMAIN_MODEL.md`) |
+| demo | GET | `/demo/workspaces/{workspace_id}/health` | **E2 (#408)** — probe the workspace's soft references in-process (model runs, scenario plans, alias, batch, agent session, `job_ids` slot) via `httpx.ASGITransport`; per-reference `status` ∈ `alive` (2xx) / `dead` (404 — deleted after the run) / `unknown` (anything else — never a 500), plus `alive`/`dead`/`unknown` counts and `partial_run` (true when the row's status ≠ `completed`); non-probeable keys (`v2_model_path`, `scenario_artifact_key`, `train_model_types`) are skipped; `404 application/problem+json` when the workspace is missing |
 | demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
 | demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
 | config | GET | `/config/ai` | Effective AI-model config (agent LLM + RAG embeddings); API keys masked, never raw |
@@ -98,7 +99,7 @@ Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified ag
 - PRP-38 — `scenario="showcase_rich"` extends the data phase with `phase2_enrichment` + `historical_backfill` steps and the modeling phase with `v2_train` (one V2 `prophet_like` run). Phase ids are `data` / `modeling` / `decision` / `verify` / `agent` / `cleanup` (6 phases).
 - PRP-40 — `scenario="showcase_rich"` ALSO adds two phases inserted BEFORE `verify`: `planning` (2 steps — `scenario_simulate_and_save`, `multi_plan_compare`) and `knowledge` (3 steps — `embedding_provider_probe`, `rag_index_subset`, `rag_retrieve_probe`). Total step count: 19 for `showcase_rich`, 11 for `demo_minimal` and `sparse`. Phase ids on `showcase_rich` are `data` / `modeling` / `decision` / `planning` / `knowledge` / `verify` / `agent` / `cleanup` (8 phases). The knowledge steps SKIP gracefully when the embedding provider is unreachable; the pipeline still goes green.
 - E3 (#392) — the planning-phase steps tag the plans they save: pipeline-saved plans now carry `source:showcase` (alongside the legacy `showcase` + `price`/`holiday` tags), and on `preservation="keep"` runs additionally `workspace:<workspace_name|workspace_id>` — retrievable via `GET /scenarios?tags=workspace:<label>` (JSONB containment, all listed tags must match). The `scenario_simulate_and_save` step's `data` additively echoes the `tags` list it sent.
-- E4 (#393) — the start frame's E1 preservation fields are now exercised by the Showcase UI ("Save as workspace" checkbox + name + seed inputs). **Replay** re-submits a recorded workspace's config verbatim (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"` (+ the recorded `workspace_name`), creating a NEW `showcase_workspace` row each time — the original row is never mutated; names are non-unique by design. Saved rows are read back over `GET /demo/workspaces` (+ `/{workspace_id}`). E1 (#407) — the Replay start frame now also sends `replayed_from_workspace_id: <source workspace_id>`, so replays carry lineage (the rendering of that lineage is E2 #408).
+- E4 (#393) — the start frame's E1 preservation fields are now exercised by the Showcase UI ("Save as workspace" checkbox + name + seed inputs). **Replay** re-submits a recorded workspace's config verbatim (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"` (+ the recorded `workspace_name`), creating a NEW `showcase_workspace` row each time — the original row is never mutated; names are non-unique by design. Saved rows are read back over `GET /demo/workspaces` (+ `/{workspace_id}`). E1 (#407) — the Replay start frame now also sends `replayed_from_workspace_id: <source workspace_id>`, so replays carry lineage. E2 (#408) — every panel Replay now requires an explicit confirmation dialog with a recorded-vs-sent config preview (destructive copy + destructive-styled confirm on `reset=true` workspaces); the saved-workspaces panel renders the lineage as a replay badge + clickable ancestor chain (deleted ancestors marked, never an error), and a two-workspace compare page lives at `/showcase/compare?a=&b=` (frontend-only diff — no new backend endpoint).
 
 ## Async Events / Queues
 
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index 1cda5125..22e06e49 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -149,10 +149,10 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Lifecycle modes:** an **ephemeral** run (the default) writes no workspace row — it lives only in the localStorage Run-history strip. A **keep** run (`preservation="keep"` / the "Save as workspace" checkbox) records a `showcase_workspace` row with the run's replay config and soft references to what it created. A **named** keep run additionally carries the operator-supplied `workspace_name` label (non-unique). Kept rows back the panel's **Load** (restore config + artifact links, read-only), **Replay** (re-run verbatim), and **Delete** (remove the saved record) actions.
 
-1. **Replay is verbatim — replaying a `reset=true` workspace WIPES the database.** Replay re-submits the recorded config exactly (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"`. A workspace saved from a Reset-database run therefore wipes + reseeds on every Replay; the panel styles such rows with a `DESTRUCTIVE` marker. This is designed E4 semantics (#393), not a bug — there is deliberately no confirm dialog (consistency with the Reset checkbox's severity styling).
-2. **Names are non-unique by design.** Every Replay creates a NEW `showcase_workspace` row; same-named rows accumulate (the replay regression test itself leaves two `replay-regression` rows). Disambiguate by `workspace_id` or `created_at` (panel lists newest first).
-3. **Rows accumulate unless deleted.** `DELETE /demo/workspaces/{workspace_id}` (and the panel's per-row **Delete** button, behind a confirmation dialog) removes a saved row; a missing id is an RFC 7807 404. Undeleted rows are harmless audit records.
-4. **Deleting a workspace deletes METADATA ONLY.** The delete removes just the `showcase_workspace` row — the model runs, scenario plans, aliases, jobs, agent sessions, and on-disk artifacts the run created are NOT touched (and the seeded data is not reverted). `created_objects` ids are SOFT references (deliberately no FKs), so deletion in either direction never cascades: an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists.
+1. **Replay is verbatim — replaying a `reset=true` workspace WIPES the database.** Replay re-submits the recorded config exactly (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"`. A workspace saved from a Reset-database run therefore wipes + reseeds on every Replay; the panel styles such rows with a `DESTRUCTIVE` marker. This is the intended E4 behavior (#393). **E2 (#408) supersedes E4's earlier no-dialog decision:** every panel Replay now opens a confirmation dialog with a recorded-vs-sent config preview, and a `reset=true` workspace escalates it — destructive warning copy + a destructive-styled "Replay & wipe database" button. No code path starts a replay without the dialog; the `DESTRUCTIVE` row marker stays.
+2. **Names are non-unique by design.** Every Replay creates a NEW `showcase_workspace` row; same-named rows accumulate (the replay regression test itself leaves two `replay-regression` rows). Disambiguate by `workspace_id` or `created_at` (panel lists newest first). E2 (#408) — replayed rows carry a `replay` badge and the loaded view renders the ancestor chain (a deleted ancestor shows "(original deleted)" — dangling lineage is designed, never an error).
+3. **Rows accumulate unless deleted.** `DELETE /demo/workspaces/{workspace_id}` (and the panel's per-row **Delete** button, behind a confirmation dialog) removes a saved row; a missing id is an RFC 7807 404. Undeleted rows are harmless audit records. E2 (#408) — the panel adds search / tag filter / sort / show-archived (archived rows are hidden from the list by default) and a multi-select **Delete selected** action — N sequential single DELETEs behind one confirmation; deliberately NO bulk endpoint.
+4. **Deleting a workspace deletes METADATA ONLY.** The delete removes just the `showcase_workspace` row — the model runs, scenario plans, aliases, jobs, agent sessions, and on-disk artifacts the run created are NOT touched (and the seeded data is not reverted). `created_objects` ids are SOFT references (deliberately no FKs), so deletion in either direction never cascades: an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists. E2 (#408) — that staleness now SURFACES instead of dangling silently: loading a workspace probes its references via `GET /demo/workspaces/{id}/health`, dead references get a warning marker on the artifact cards, and a summary chip shows alive/dead counts plus a partial-run warning for never-completed rows.
 5. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`.

From 890675bc3c9a3d48e234b7e44ae1cd6e73aaa213 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:47:34 +0200
Subject: [PATCH 16/32] feat(data): add allow-listed nested seed overrides to
 seeder contract (#409)

---
 app/features/seeder/schemas.py            |  12 +++
 app/features/seeder/service.py            |  47 +++++++++
 app/features/seeder/tests/test_routes.py  |  46 +++++++++
 app/features/seeder/tests/test_service.py | 116 ++++++++++++++++++++++
 app/shared/seeder/overrides.py            |  90 +++++++++++++++++
 app/shared/seeder/tests/test_overrides.py |  95 ++++++++++++++++++
 6 files changed, 406 insertions(+)
 create mode 100644 app/shared/seeder/overrides.py
 create mode 100644 app/shared/seeder/tests/test_overrides.py

diff --git a/app/features/seeder/schemas.py b/app/features/seeder/schemas.py
index 20a22dc3..42b697c1 100644
--- a/app/features/seeder/schemas.py
+++ b/app/features/seeder/schemas.py
@@ -7,6 +7,7 @@
 from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
 
 from app.shared.seeder.config import default_seed_end_date, default_seed_start_date
+from app.shared.seeder.overrides import SeederOverrides
 
 VALID_CHANNELS: frozenset[str] = frozenset({"in_store", "online", "click_collect", "wholesale"})
 """Allow-list for ``sales_daily.channel`` — mirrors the SQL CHECK."""
@@ -257,6 +258,17 @@ class GenerateParams(BaseModel):
         ),
     )
 
+    # E3 (#409) — curated nested overrides. Absent field = byte-identical
+    # legacy behavior (the same promise as the Phase 1/2 blocks above).
+    overrides: SeederOverrides | None = Field(
+        default=None,
+        description=(
+            "Curated nested overrides (E3 #409); applied LAST — wins over the "
+            "scalar stores/products/sparsity. Unknown knobs are rejected "
+            "(extra=forbid). Absent = byte-identical legacy behavior."
+        ),
+    )
+
     @model_validator(mode="after")
     def _validate_date_range(self) -> "GenerateParams":
         """Reject inverted date ranges with a clear message."""
diff --git a/app/features/seeder/service.py b/app/features/seeder/service.py
index 87b20709..341afdef 100644
--- a/app/features/seeder/service.py
+++ b/app/features/seeder/service.py
@@ -52,6 +52,7 @@
 from app.shared.seeder.generators.lifecycle import LifecycleGenerator
 from app.shared.seeder.generators.replenishment import ReplenishmentGenerator
 from app.shared.seeder.generators.returns import ReturnsGenerator
+from app.shared.seeder.overrides import SeederOverrides
 
 logger = get_logger(__name__)
 
@@ -199,6 +200,49 @@ def _apply_phase2_overrides(config: SeederConfig, params: schemas.GenerateParams
         )
 
 
+def _apply_seed_overrides(config: SeederConfig, overrides: SeederOverrides | None) -> None:
+    """Apply the curated nested overrides LAST -- wins over scalar params (E3, #409).
+
+    Mutates ``config`` in place (the ``_apply_phaseN_overrides`` pattern).
+    ``dataclasses.replace`` is field-precise: preset-customized sibling fields
+    (region/category lists, ``random_gaps_*``) survive every knob. ``None``
+    (or an all-``None`` object) is a no-op so legacy bodies stay
+    byte-identical.
+    """
+    if overrides is None:
+        return
+    if overrides.stores is not None or overrides.products is not None:
+        config.dimensions = replace(
+            config.dimensions,
+            stores=overrides.stores if overrides.stores is not None else config.dimensions.stores,
+            products=(
+                overrides.products if overrides.products is not None else config.dimensions.products
+            ),
+        )
+    if overrides.window_days is not None:
+        # Derive start_date by stepping window_days back from the
+        # (scalar-or-default) end_date; end_date itself is untouched.
+        config.start_date = config.end_date - timedelta(days=overrides.window_days)
+    if overrides.sparsity is not None:
+        config.sparsity = replace(config.sparsity, missing_combinations_pct=overrides.sparsity)
+    if overrides.promotion_intensity is not None or overrides.stockout_intensity is not None:
+        config.retail = replace(
+            config.retail,
+            promotion_probability=(
+                overrides.promotion_intensity
+                if overrides.promotion_intensity is not None
+                else config.retail.promotion_probability
+            ),
+            stockout_probability=(
+                overrides.stockout_intensity
+                if overrides.stockout_intensity is not None
+                else config.retail.stockout_probability
+            ),
+        )
+    if overrides.noise_sigma is not None:
+        config.time_series = replace(config.time_series, noise_sigma=overrides.noise_sigma)
+
+
 def _build_config_from_params(params: schemas.GenerateParams) -> SeederConfig:
     """Build SeederConfig from API parameters.
 
@@ -239,6 +283,9 @@ def _build_config_from_params(params: schemas.GenerateParams) -> SeederConfig:
 
     _apply_phase1_overrides(config, params)
     _apply_phase2_overrides(config, params)
+    # E3 (#409) — the curated nested overrides apply LAST so they win over
+    # the scalar stores/products/sparsity params above.
+    _apply_seed_overrides(config, params.overrides)
 
     settings = get_settings()
     config.batch_size = settings.seeder_batch_size
diff --git a/app/features/seeder/tests/test_routes.py b/app/features/seeder/tests/test_routes.py
index f1733142..7da2e947 100644
--- a/app/features/seeder/tests/test_routes.py
+++ b/app/features/seeder/tests/test_routes.py
@@ -163,6 +163,52 @@ def test_generate_validation_error(self, client, mock_settings):
 
         assert response.status_code == status.HTTP_422_UNPROCESSABLE_ENTITY
 
+    def test_generate_with_overrides(self, client, mock_settings, mock_db):
+        """E3 (#409) — the nested overrides object is accepted (201)."""
+        mock_result = schemas.GenerateResult(
+            success=True,
+            records_created={"stores": 8, "products": 20, "sales": 5000},
+            duration_seconds=12.0,
+            message="Success",
+            seed=42,
+        )
+
+        with patch(
+            "app.features.seeder.routes.service.generate_data", return_value=mock_result
+        ) as mock_generate:
+            response = client.post(
+                "/seeder/generate",
+                json={
+                    "scenario": "demo_minimal",
+                    "overrides": {"stores": 8, "promotion_intensity": 0.3},
+                },
+            )
+
+        assert response.status_code == status.HTTP_201_CREATED
+        # The validated params object carries the parsed nested model.
+        params = mock_generate.call_args.args[1]
+        assert params.overrides is not None
+        assert params.overrides.stores == 8
+        assert params.overrides.promotion_intensity == 0.3
+
+    def test_generate_overrides_out_of_bounds_rejected(self, client, mock_settings):
+        """E3 (#409) — an out-of-bounds knob is a 422."""
+        response = client.post(
+            "/seeder/generate",
+            json={"overrides": {"stores": 0}},
+        )
+
+        assert response.status_code == status.HTTP_422_UNPROCESSABLE_ENTITY
+
+    def test_generate_overrides_unknown_knob_rejected(self, client, mock_settings):
+        """E3 (#409) — extra='forbid' rejects knobs outside the allow-list."""
+        response = client.post(
+            "/seeder/generate",
+            json={"overrides": {"bogus_knob": 1}},
+        )
+
+        assert response.status_code == status.HTTP_422_UNPROCESSABLE_ENTITY
+
     def test_generate_blocked_in_production(self, client, mock_db):
         """Test generate is blocked in production."""
         with patch("app.features.seeder.routes.get_settings") as mock_settings:
diff --git a/app/features/seeder/tests/test_service.py b/app/features/seeder/tests/test_service.py
index f21aa28b..7a29058d 100644
--- a/app/features/seeder/tests/test_service.py
+++ b/app/features/seeder/tests/test_service.py
@@ -7,6 +7,7 @@
 
 from app.features.seeder import schemas, service
 from app.shared.seeder.config import DEMO_MINIMAL_SPAN_DAYS, default_seed_end_date
+from app.shared.seeder.overrides import SeederOverrides
 
 
 class TestListScenarios:
@@ -198,6 +199,121 @@ def test_custom_scenario_preserves_holiday_list(self):
         assert config.time_series.monthly_seasonality == {10: 1.0, 11: 1.3, 12: 1.8}
 
 
+class TestApplySeedOverrides:
+    """Tests for the E3 (#409) curated nested overrides layer."""
+
+    def test_each_knob_maps_to_its_config_field(self):
+        """Every knob but window_days (tested below) maps to its SeederConfig target."""
+        params = schemas.GenerateParams(
+            scenario="demo_minimal",
+            overrides=SeederOverrides(
+                stores=8,
+                products=20,
+                sparsity=0.3,
+                promotion_intensity=0.3,
+                stockout_intensity=0.1,
+                noise_sigma=0.25,
+            ),
+        )
+        config = service._build_config_from_params(params)
+
+        assert config.dimensions.stores == 8
+        assert config.dimensions.products == 20
+        assert config.sparsity.missing_combinations_pct == 0.3
+        assert config.retail.promotion_probability == 0.3
+        assert config.retail.stockout_probability == 0.1
+        assert config.time_series.noise_sigma == 0.25
+
+    def test_overrides_win_over_scalar_params(self):
+        """Nested overrides apply LAST and beat the legacy scalar params."""
+        params = schemas.GenerateParams(
+            scenario="demo_minimal",
+            stores=3,
+            products=10,
+            sparsity=0.5,
+            overrides=SeederOverrides(stores=8, products=20, sparsity=0.2),
+        )
+        config = service._build_config_from_params(params)
+
+        assert config.dimensions.stores == 8
+        assert config.dimensions.products == 20
+        assert config.sparsity.missing_combinations_pct == 0.2
+
+    def test_window_days_recomputes_start_from_end(self):
+        """window_days derives start_date from the request's end_date."""
+        params = schemas.GenerateParams(
+            scenario="demo_minimal",
+            start_date=date(2025, 1, 1),
+            end_date=date(2025, 6, 30),
+            overrides=SeederOverrides(window_days=120),
+        )
+        config = service._build_config_from_params(params)
+
+        assert config.end_date == date(2025, 6, 30)
+        assert config.start_date == date(2025, 6, 30) - timedelta(days=120)
+
+    def test_sparse_preset_gap_character_survives_sparsity_override(self):
+        """dataclasses.replace preserves the preset's random_gaps_* siblings."""
+        baseline = service._build_config_from_params(schemas.GenerateParams(scenario="sparse"))
+        assert baseline.sparsity.random_gaps_per_series > 0  # preset character
+
+        params = schemas.GenerateParams(
+            scenario="sparse",
+            overrides=SeederOverrides(sparsity=0.2),
+        )
+        config = service._build_config_from_params(params)
+
+        assert config.sparsity.missing_combinations_pct == 0.2
+        assert config.sparsity.random_gaps_per_series == baseline.sparsity.random_gaps_per_series
+        assert config.sparsity.gap_min_days == baseline.sparsity.gap_min_days
+        assert config.sparsity.gap_max_days == baseline.sparsity.gap_max_days
+
+    def test_partial_overrides_leave_other_fields_untouched(self):
+        """Setting one retail knob preserves the preset's other retail fields."""
+        baseline = service._build_config_from_params(
+            schemas.GenerateParams(scenario="stockout_heavy")
+        )
+        params = schemas.GenerateParams(
+            scenario="stockout_heavy",
+            overrides=SeederOverrides(promotion_intensity=0.4),
+        )
+        config = service._build_config_from_params(params)
+
+        assert config.retail.promotion_probability == 0.4
+        assert config.retail.stockout_probability == baseline.retail.stockout_probability
+        assert config.retail.promotion_lift == baseline.retail.promotion_lift
+
+    def test_no_overrides_is_byte_identical_regression(self):
+        """A body without overrides produces the exact config it does today."""
+
+        def _params(**extra: object) -> schemas.GenerateParams:
+            body: dict[str, object] = {
+                "scenario": "demo_minimal",
+                "seed": 42,
+                "stores": 3,
+                "products": 10,
+                "start_date": "2025-01-01",
+                "end_date": "2025-03-31",
+                "sparsity": 0.0,
+            }
+            body.update(extra)
+            return schemas.GenerateParams.model_validate(body)
+
+        legacy = service._build_config_from_params(_params())
+        with_none = service._build_config_from_params(_params(overrides=None))
+
+        assert legacy == with_none
+
+    def test_empty_overrides_object_is_noop(self):
+        """An all-None overrides object changes nothing."""
+        base = service._build_config_from_params(schemas.GenerateParams(scenario="demo_minimal"))
+        with_empty = service._build_config_from_params(
+            schemas.GenerateParams(scenario="demo_minimal", overrides=SeederOverrides())
+        )
+
+        assert base == with_empty
+
+
 class TestGetStatus:
     """Tests for get_status function."""
 
diff --git a/app/shared/seeder/overrides.py b/app/shared/seeder/overrides.py
new file mode 100644
index 00000000..11d8ed9f
--- /dev/null
+++ b/app/shared/seeder/overrides.py
@@ -0,0 +1,90 @@
+"""Curated, allow-listed seed-override schema (E3, issue #409).
+
+Shared between the seeder slice (``GenerateParams.overrides``) and the demo
+slice (``DemoRunRequest.seed_overrides``) -- ``app/shared`` is the sanctioned
+cross-slice home (vertical-slice rule; precedent: ``ScenarioPreset`` is
+imported by both slices from ``app.shared.seeder.config``).
+
+``extra="forbid"`` IS the allow-list: any knob not listed here is a 422 at
+the HTTP boundary (umbrella #406 risk mitigation -- the seeder's full 25+
+knob surface stays preset-driven; only these 7 curated knobs are exposed).
+"""
+
+from __future__ import annotations
+
+from pydantic import BaseModel, ConfigDict, Field
+
+
+class SeederOverrides(BaseModel):
+    """The 7 curated seed knobs, applied LAST in ``_build_config_from_params``.
+
+    Precedence: preset -> scalar ``stores``/``products``/``sparsity`` params ->
+    phase 1/2 overrides -> THIS object (wins). Each knob maps onto one
+    ``SeederConfig`` sub-dataclass field via ``dataclasses.replace`` so
+    preset-customized sibling fields survive.
+    """
+
+    # strict=True catches JSON-native coercion bugs ("5" -> 5); every field is
+    # int/float so no Field(strict=False) override is needed (see
+    # docs/_base/SECURITY.md -> "Pydantic v2 strict mode").
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    stores: int | None = Field(
+        default=None,
+        ge=1,
+        le=100,
+        description=("Store count -> DimensionConfig.stores; wins over the scalar `stores` param."),
+    )
+    products: int | None = Field(
+        default=None,
+        ge=1,
+        le=500,
+        description=(
+            "Product count -> DimensionConfig.products; wins over the scalar `products` param."
+        ),
+    )
+    window_days: int | None = Field(
+        default=None,
+        ge=75,
+        le=365,
+        description=(
+            "Seeded window length; start_date = end_date - window_days. >=75 keeps "
+            "the showcase historical_backfill gate clear. Rejected on the "
+            "calendar-pinned holiday_rush preset (demo surface)."
+        ),
+    )
+    sparsity: float | None = Field(
+        default=None,
+        ge=0.0,
+        le=0.9,
+        description=(
+            "Missing (store,product) grain fraction -> "
+            "SparsityConfig.missing_combinations_pct; preserves the preset's gap "
+            "config. 1.0 disallowed (would seed zero series)."
+        ),
+    )
+    promotion_intensity: float | None = Field(
+        default=None,
+        ge=0.0,
+        le=0.5,
+        description="-> RetailPatternConfig.promotion_probability (preset max 0.25).",
+    )
+    stockout_intensity: float | None = Field(
+        default=None,
+        ge=0.0,
+        le=0.5,
+        description=(
+            "-> RetailPatternConfig.stockout_probability. High values can "
+            "legitimately NaN-WAPE-fail the backtest (documented expected outcome)."
+        ),
+    )
+    noise_sigma: float | None = Field(
+        default=None,
+        ge=0.0,
+        le=0.5,
+        description="-> TimeSeriesConfig.noise_sigma (preset max 0.4).",
+    )
+
+    def is_empty(self) -> bool:
+        """True when no knob is set (``{}`` on the wire) -- treated as None everywhere."""
+        return not self.model_dump(exclude_none=True)
diff --git a/app/shared/seeder/tests/test_overrides.py b/app/shared/seeder/tests/test_overrides.py
new file mode 100644
index 00000000..06ad6034
--- /dev/null
+++ b/app/shared/seeder/tests/test_overrides.py
@@ -0,0 +1,95 @@
+"""Unit tests for the curated SeederOverrides allow-list model (E3, #409)."""
+
+from __future__ import annotations
+
+import pytest
+from pydantic import ValidationError
+
+from app.shared.seeder.overrides import SeederOverrides
+
+
+class TestBounds:
+    """Each knob rejects out-of-bounds values at both edges."""
+
+    @pytest.mark.parametrize(
+        ("knob", "low", "high"),
+        [
+            ("stores", 0, 101),
+            ("products", 0, 501),
+            ("window_days", 74, 366),
+        ],
+    )
+    def test_int_knob_bounds(self, knob: str, low: int, high: int) -> None:
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({knob: low})
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({knob: high})
+
+    @pytest.mark.parametrize(
+        ("knob", "low", "high"),
+        [
+            ("sparsity", -0.1, 0.91),
+            ("promotion_intensity", -0.1, 0.51),
+            ("stockout_intensity", -0.1, 0.51),
+            ("noise_sigma", -0.1, 0.51),
+        ],
+    )
+    def test_float_knob_bounds(self, knob: str, low: float, high: float) -> None:
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({knob: low})
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({knob: high})
+
+    def test_boundary_values_accepted(self) -> None:
+        ov = SeederOverrides.model_validate(
+            {
+                "stores": 100,
+                "products": 500,
+                "window_days": 75,
+                "sparsity": 0.9,
+                "promotion_intensity": 0.5,
+                "stockout_intensity": 0.0,
+                "noise_sigma": 0.5,
+            }
+        )
+        assert ov.stores == 100
+        assert ov.window_days == 75
+
+
+class TestAllowList:
+    """extra='forbid' is the machine-enforced allow-list."""
+
+    def test_unknown_knob_rejected(self) -> None:
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({"stores": 5, "bogus_knob": 1})
+
+    def test_strict_rejects_string_int(self) -> None:
+        # strict=True: a JSON string is not coerced (validate_python path).
+        with pytest.raises(ValidationError):
+            SeederOverrides.model_validate({"stores": "5"})
+
+
+class TestJsonPath:
+    """JSON-dict validation (FastAPI's validate_python path) happy paths."""
+
+    def test_partial_object_validates(self) -> None:
+        ov = SeederOverrides.model_validate({"stores": 8, "promotion_intensity": 0.3})
+        assert ov.stores == 8
+        assert ov.promotion_intensity == 0.3
+        assert ov.products is None
+
+    def test_model_dump_exclude_none_is_sparse(self) -> None:
+        ov = SeederOverrides.model_validate({"stores": 8, "noise_sigma": 0.25})
+        assert ov.model_dump(exclude_none=True) == {"stores": 8, "noise_sigma": 0.25}
+
+
+class TestIsEmpty:
+    """is_empty() truth table -- {} on the wire collapses to None everywhere."""
+
+    def test_empty_object_is_empty(self) -> None:
+        assert SeederOverrides().is_empty() is True
+        assert SeederOverrides.model_validate({}).is_empty() is True
+
+    def test_any_knob_makes_non_empty(self) -> None:
+        assert SeederOverrides(stores=1).is_empty() is False
+        assert SeederOverrides(noise_sigma=0.0).is_empty() is False

From 859b24b63909444262ea727cd5bb5a5acdf20dab Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:47:34 +0200
Subject: [PATCH 17/32] feat(api): thread seed overrides and user scope through
 demo pipeline (#409)

---
 app/features/demo/pipeline.py             | 106 +++++++++++-
 app/features/demo/schemas.py              |  86 +++++++++-
 app/features/demo/tests/test_pipeline.py  | 190 +++++++++++++++++++++-
 app/features/demo/tests/test_schemas.py   | 112 +++++++++++++
 app/features/demo/tests/test_workspace.py |  31 ++++
 app/features/demo/workspace.py            |  15 ++
 6 files changed, 524 insertions(+), 16 deletions(-)
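
Reviewer note: a minimal, stdlib-only sketch of how step_seed resolves the
effective dims and window and forwards the sparse overrides object. It is an
illustration, not the patch's code. `Overrides`, `sparse()` and
`build_seed_body` are hypothetical stand-ins (the real model is the pydantic
`SeederOverrides`), only three of the seven knobs are modelled, and the
preset-pinned `profile.window` branch is left out.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Overrides:
    """Hypothetical stand-in for SeederOverrides (three of the seven knobs)."""

    stores: int | None = None
    products: int | None = None
    window_days: int | None = None

    def sparse(self) -> dict[str, int]:
        # Keep only the knobs that were set, like model_dump(exclude_none=True).
        return {k: v for k, v in asdict(self).items() if v is not None}


def build_seed_body(
    overrides: Overrides | None,
    *,
    profile_stores: int,
    profile_products: int,
    profile_span_days: int,
    today: date,
) -> dict[str, object]:
    """Resolve override-or-profile values and build the seed POST body."""
    stores = profile_stores
    products = profile_products
    span_days = profile_span_days
    if overrides is not None:
        if overrides.stores is not None:
            stores = overrides.stores
        if overrides.products is not None:
            products = overrides.products
        if overrides.window_days is not None:
            span_days = overrides.window_days

    body: dict[str, object] = {
        "stores": stores,
        "products": products,
        "start_date": (today - timedelta(days=span_days)).isoformat(),
        "end_date": today.isoformat(),
        # Scalar sparsity stays 0.0; only the nested object may change it.
        "sparsity": 0.0,
    }
    # Legacy callers (no overrides) get a body without the "overrides" key.
    if overrides is not None:
        body["overrides"] = overrides.sparse()
    return body


body = build_seed_body(
    Overrides(stores=8, window_days=120),
    profile_stores=3,
    profile_products=10,
    profile_span_days=90,
    today=date(2025, 6, 30),
)
legacy = build_seed_body(
    None,
    profile_stores=3,
    profile_products=10,
    profile_span_days=90,
    today=date(2025, 6, 30),
)
```

The scalars carry the effective values so the step card reports what was
actually requested, while the nested object is still forwarded so the seeder
can apply it last.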

diff --git a/app/features/demo/pipeline.py b/app/features/demo/pipeline.py
index a8ae7c3c..6052a4be 100644
--- a/app/features/demo/pipeline.py
+++ b/app/features/demo/pipeline.py
@@ -41,8 +41,9 @@
 from app.core.logging import get_logger
 from app.core.problem_details import EMBEDDING_AUTH_CODE, ERROR_TYPES
 from app.features.demo import workspace
-from app.features.demo.schemas import DemoRunRequest, StepEvent, StepStatus
+from app.features.demo.schemas import DemoRunRequest, StepEvent, StepStatus, UserScope
 from app.shared.seeder.config import ScenarioPreset
+from app.shared.seeder.overrides import SeederOverrides
 
 logger = get_logger(__name__)
 
@@ -261,6 +262,12 @@ class DemoContext:
     # E3 (#392) -- workspace label for plan tagging. Set alongside
     # workspace_id in run_pipeline's keep-branch; None on ephemeral runs.
     workspace_name: str | None = None
+    # E3 (#409) -- additive Optional start-frame config. seed_overrides is
+    # forwarded verbatim to /seeder/generate by step_seed (None on legacy
+    # frames); user_scope is the operator-selected focus pair step_status
+    # validates and adopts (warn + fallback to discovery when dangling).
+    seed_overrides: SeederOverrides | None = None
+    user_scope: UserScope | None = None
 
 
 # =============================================================================
@@ -546,12 +553,33 @@ async def step_seed(ctx: DemoContext, client: _Client) -> StepResult:
         ctx.scenario,
         _SeedProfile(DEMO_SEED_STORES, DEMO_SEED_PRODUCTS, DEMO_SEED_SPAN_DAYS),
     )
-    stores, products = profile.stores, profile.products
-    if profile.window is not None:
+    # E3 (#409) -- effective dims = override-or-profile, used for BOTH the POST
+    # scalars and the detail line so the step card tells the truth. The nested
+    # object is ALSO forwarded verbatim; the seeder applies it last (wins).
+    overrides = ctx.seed_overrides
+    stores = (
+        overrides.stores
+        if overrides is not None and overrides.stores is not None
+        else profile.stores
+    )
+    products = (
+        overrides.products
+        if overrides is not None and overrides.products is not None
+        else profile.products
+    )
+    if overrides is not None and overrides.window_days is not None:
+        # The DemoRunRequest validator guarantees window_days is never set on
+        # the calendar-pinned holiday_rush preset, so anchoring on today is safe.
+        seed_end = datetime.now(UTC).date()
+        seed_start = seed_end - timedelta(days=overrides.window_days)
+    elif profile.window is not None:
         seed_start, seed_end = profile.window
     else:
         seed_end = datetime.now(UTC).date()
         seed_start = seed_end - timedelta(days=profile.span_days)
+    # Scalar sparsity stays 0.0 (preserves preset character per the
+    # `if params.sparsity > 0` guard); overrides.sparsity is the only way the
+    # demo overrides sparsity.
     body = await client.request(
         "seed",
         "POST",
@@ -565,6 +593,11 @@ async def step_seed(ctx: DemoContext, client: _Client) -> StepResult:
             "end_date": seed_end.isoformat(),
             "sparsity": 0.0,
             "dry_run": False,
+            **(
+                {"overrides": overrides.model_dump(exclude_none=True)}
+                if overrides is not None
+                else {}
+            ),
         },
     )
     raw_records: dict[str, Any] = body.get("records_created", {})
@@ -572,10 +605,21 @@ async def step_seed(ctx: DemoContext, client: _Client) -> StepResult:
     ctx.seed_records = records
     # GenerateResult.records_created uses "sales" (singular), not "sales_daily".
     sales = records.get("sales", records.get("sales_daily", 0))
+    overrides_applied = (
+        sorted(overrides.model_dump(exclude_none=True)) if overrides is not None else []
+    )
+    detail = f"{ctx.scenario.value}: {stores} stores x {products} products, {sales} sales rows"
+    if overrides_applied:
+        detail += f" (overrides: {', '.join(overrides_applied)})"
     return (
         "pass",
-        f"{ctx.scenario.value}: {stores} stores x {products} products, {sales} sales rows",
-        {"records_created": records, "scenario": ctx.scenario.value},
+        detail,
+        {
+            "records_created": records,
+            "scenario": ctx.scenario.value,
+            # E3 (#409) -- additive echo of the applied override knobs.
+            "overrides_applied": overrides_applied,
+        },
     )
 
 
@@ -593,6 +637,46 @@ async def step_status(ctx: DemoContext, client: _Client) -> StepResult:
         return ("fail", "no date_range in /seeder/status -- seed the database first", {})
     ctx.date_start = date.fromisoformat(raw_start)
     ctx.date_end = date.fromisoformat(raw_end)
+    sales = body.get("sales", 0)
+
+    # E3 (#409) -- operator-selected focus pair: validate both ids against the
+    # dimensions endpoints and adopt them. A dangling pair (e.g. after a
+    # reset+reseed re-issued ids -- sequences never reset) WARNS and falls back
+    # to discovery so a replayed reset=true workspace can never hard-fail here.
+    scope_warning = ""
+    if ctx.user_scope is not None:
+        try:
+            await client.request(
+                "status[scope-store]",
+                "GET",
+                f"/dimensions/stores/{ctx.user_scope.store_id}",
+            )
+            await client.request(
+                "status[scope-product]",
+                "GET",
+                f"/dimensions/products/{ctx.user_scope.product_id}",
+            )
+        except _StepError:
+            scope_warning = (
+                f"user_scope (store={ctx.user_scope.store_id}, "
+                f"product={ctx.user_scope.product_id}) not found -- "
+                "fell back to discovered pair; "
+            )
+        else:
+            ctx.store_id = ctx.user_scope.store_id
+            ctx.product_id = ctx.user_scope.product_id
+            return (
+                "pass",
+                f"date_range={raw_start}..{raw_end} sales={sales} "
+                f"store_id={ctx.store_id} product_id={ctx.product_id} (user-selected)",
+                {
+                    "store_id": ctx.store_id,
+                    "product_id": ctx.product_id,
+                    "date_range_start": raw_start,
+                    "date_range_end": raw_end,
+                    "user_scope_applied": True,
+                },
+            )
 
     stores_body = await client.request(
         "status[stores]", "GET", "/dimensions/stores?page=1&page_size=1"
@@ -617,16 +701,19 @@ async def step_status(ctx: DemoContext, client: _Client) -> StepResult:
     ctx.store_id = store_id_raw
     ctx.product_id = product_id_raw
 
-    sales = body.get("sales", 0)
     return (
-        "pass",
-        f"date_range={raw_start}..{raw_end} sales={sales} "
+        # E3 (#409) -- "warn" (never "fail") when a requested scope dangled:
+        # only "fail" stops the run, so the pipeline proceeds on the
+        # discovered pair with the divergence visible on the step card.
+        "warn" if scope_warning else "pass",
+        f"{scope_warning}date_range={raw_start}..{raw_end} sales={sales} "
         f"store_id={ctx.store_id} product_id={ctx.product_id}",
         {
             "store_id": ctx.store_id,
             "product_id": ctx.product_id,
             "date_range_start": raw_start,
             "date_range_end": raw_end,
+            "user_scope_applied": False,
         },
     )
 
@@ -2648,6 +2735,9 @@ async def run_pipeline(app: FastAPI, req: DemoRunRequest) -> AsyncIterator[StepE
         skip_seed=req.skip_seed,
         reset=req.reset,
         scenario=req.scenario,
+        # E3 (#409) -- thread the validated start-frame config verbatim.
+        seed_overrides=req.seed_overrides,
+        user_scope=req.user_scope,
     )
     # E1 (#390) -- create the workspace row BEFORE the first step executes so
     # even an early failure records the run config. create_workspace is
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index 58daf891..d5aa78ea 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -14,6 +14,7 @@
 from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
 
 from app.shared.seeder.config import ScenarioPreset
+from app.shared.seeder.overrides import SeederOverrides
 
 # One pipeline step's outcome.
 StepStatus = Literal["running", "pass", "fail", "skip", "warn"]
@@ -26,6 +27,22 @@ def _utc_now() -> datetime:
     return datetime.now(UTC)
 
 
+class UserScope(BaseModel):
+    """Operator-selected (store, product) focus pair (E3, issue #409).
+
+    Ids are REAL discovered ids (Postgres sequences never reset -- ids are not
+    1-based); ``step_status`` validates them against ``/dimensions/*/{id}``
+    and falls back to discovery with a warning when the pair dangles (e.g.
+    after a reset+reseed re-issues ids). ``extra="forbid"`` keeps the slot schema
+    closed; additive keys need a documented schema change.
+    """
+
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    store_id: int = Field(..., ge=1, description="Real store id from /dimensions/stores.")
+    product_id: int = Field(..., ge=1, description="Real product id from /dimensions/products.")
+
+
 class DemoRunRequest(BaseModel):
     """Request body for ``POST /demo/run`` and the ``WS /demo/stream`` start frame.
 
@@ -34,7 +51,10 @@ class DemoRunRequest(BaseModel):
     override -- there is no ``date`` / ``datetime`` / ``UUID`` / ``Decimal``
     field (see ``.claude/rules/security-patterns.md`` and
     ``test_strict_mode_policy.py``). The sole exception is ``scenario``, whose
-    enum-on-the-wire form carries its own override (PRP-38).
+    enum-on-the-wire form carries its own override (PRP-38). The nested
+    ``seed_overrides`` / ``user_scope`` models are themselves all-JSON-native
+    and validate from the JSON-parsed dict under the parent's strict mode
+    (runtime-verified on pydantic 2.12.5 -- E3 #409).
     """
 
     model_config = ConfigDict(strict=True)
@@ -85,6 +105,25 @@ class DemoRunRequest(BaseModel):
         pattern=r"^[0-9a-f]{32}$",  # uuid4().hex shape of workspace_id
         description="workspace_id this run replays; requires preservation='keep'.",
     )
+    # E3 (#409): curated seed overrides + operator-selected focus pair. Both
+    # additive Optional with None defaults so legacy frames stay byte-identical.
+    # The nested models carry their own ConfigDict(strict=True, extra="forbid").
+    seed_overrides: SeederOverrides | None = Field(
+        default=None,
+        description=(
+            "Curated seeder overrides (allow-listed knobs); requires "
+            "skip_seed=false (Re-seed first). Forwarded verbatim to "
+            "POST /seeder/generate and recorded on a kept workspace row."
+        ),
+    )
+    user_scope: UserScope | None = Field(
+        default=None,
+        description=(
+            "Operator-selected (store, product) focus pair the pipeline models "
+            "instead of the auto-discovered first pair; validated by the status "
+            "step (warn + fallback to discovery on a dangling pair)."
+        ),
+    )
 
     @model_validator(mode="after")
     def _workspace_name_requires_keep(self) -> DemoRunRequest:
@@ -100,6 +139,34 @@ def _replayed_from_requires_keep(self) -> DemoRunRequest:
             raise ValueError("replayed_from_workspace_id requires preservation='keep'")
         return self
 
+    @model_validator(mode="after")
+    def _seed_overrides_require_reseed(self) -> DemoRunRequest:
+        """Reject overrides on a run that skips the seed step (silent no-op trap).
+
+        An empty overrides object (``{}`` on the wire) normalizes to ``None``
+        so downstream code has a single "no overrides" representation.
+        """
+        if self.seed_overrides is not None and self.seed_overrides.is_empty():
+            self.seed_overrides = None
+        if self.seed_overrides is not None and self.skip_seed:
+            raise ValueError("seed_overrides requires skip_seed=false (Re-seed first)")
+        return self
+
+    @model_validator(mode="after")
+    def _window_days_forbidden_on_holiday_rush(self) -> DemoRunRequest:
+        """Reject window_days on the calendar-pinned holiday_rush preset.
+
+        The preset's HolidayConfig spikes are fixed 2024 dates -- a shifted
+        window would silently drop every holiday spike, so this fails loudly.
+        """
+        if (
+            self.seed_overrides is not None
+            and self.seed_overrides.window_days is not None
+            and self.scenario is ScenarioPreset.HOLIDAY_RUSH
+        ):
+            raise ValueError("window_days cannot override the calendar-pinned holiday_rush window")
+        return self
+
 
 class WorkspaceUpdateRequest(BaseModel):
     """Partial lifecycle update for ``PATCH /demo/workspaces/{workspace_id}``.
@@ -266,6 +333,15 @@ class WorkspaceListItem(BaseModel):
         default=None,
         description="workspace_id this run replayed (soft reference; may dangle).",
     )
+    # E3 (#409) -- the two replay-relevant story slots live on the LIST item
+    # (not detail-only): the frontend Replay reads list rows, and the
+    # replay-verbatim contract includes both slots.
+    seed_overrides: dict[str, Any] | None = Field(
+        default=None, description="Story slot (E3 #409): seeder-override payload."
+    )
+    user_scope: dict[str, Any] | None = Field(
+        default=None, description="Story slot (E3 #409): operator-selected focus."
+    )
 
 
 class WorkspaceDetailResponse(WorkspaceListItem):
@@ -285,12 +361,8 @@ class WorkspaceDetailResponse(WorkspaceListItem):
     config_schema_version: int = Field(
         default=1, description="Version of the config + story-slot schema."
     )
-    seed_overrides: dict[str, Any] | None = Field(
-        default=None, description="Story slot (E3 #409 writes): seeder-override payload."
-    )
-    user_scope: dict[str, Any] | None = Field(
-        default=None, description="Story slot (E3 #409 writes): operator-selected focus."
-    )
+    # E3 (#409) -- seed_overrides / user_scope moved UP to WorkspaceListItem
+    # (replay reads list rows); the four remaining story slots stay detail-only.
     approval_events: list[dict[str, Any]] | None = Field(
         default=None, description="Story slot (E5 #411 writes): HITL approval audit."
     )
diff --git a/app/features/demo/tests/test_pipeline.py b/app/features/demo/tests/test_pipeline.py
index 197c2842..1fc4c1b4 100644
--- a/app/features/demo/tests/test_pipeline.py
+++ b/app/features/demo/tests/test_pipeline.py
@@ -16,8 +16,9 @@
 from fastapi import FastAPI
 
 from app.features.demo import pipeline
-from app.features.demo.schemas import DemoRunRequest
+from app.features.demo.schemas import DemoRunRequest, UserScope
 from app.shared.seeder.config import ScenarioPreset
+from app.shared.seeder.overrides import SeederOverrides
 
 # A bare app instance -- the fake clients ignore it; it only satisfies the
 # run_pipeline(app: FastAPI, ...) signature.
@@ -2454,3 +2455,190 @@ async def test_step_seed_retail_standard_posts_demo_scaled_profile():
     # sparsity stays 0.0 — the seeder override fires only when > 0, which is
     # what preserves the sparse preset's 50%-missing character.
     assert body["sparsity"] == 0.0
+
+
+# =============================================================================
+# E3 (#409) — seed overrides + user scope
+# =============================================================================
+
+
+async def test_step_seed_forwards_seed_overrides():
+    """E3 (#409) — the nested overrides ride the /seeder/generate body verbatim,
+    the POST scalars echo the effective dims, and scalar sparsity stays 0.0."""
+    ctx = pipeline.DemoContext(
+        seed=42,
+        skip_seed=False,
+        reset=False,
+        scenario=ScenarioPreset.DEMO_MINIMAL,
+        seed_overrides=SeederOverrides(stores=8, products=20, promotion_intensity=0.3),
+    )
+    client = _RecordingClient(
+        None,
+        responses={("POST", "/seeder/generate"): {"records_created": {"sales": 1}}},
+    )
+    status, detail, data = await pipeline.step_seed(ctx, _as_client(client))
+    assert status == "pass"
+    body = client.calls[0][2]
+    assert body is not None
+    assert body["overrides"] == {"stores": 8, "products": 20, "promotion_intensity": 0.3}
+    # Effective dims on the scalars + the detail line (the card tells the truth).
+    assert body["stores"] == 8
+    assert body["products"] == 20
+    assert body["sparsity"] == 0.0  # preset-character guard; nested wins anyway
+    assert "8 stores x 20 products" in detail
+    assert "overrides: products, promotion_intensity, stores" in detail
+    assert data["overrides_applied"] == ["products", "promotion_intensity", "stores"]
+
+
+async def test_step_seed_without_overrides_is_legacy_identical():
+    """E3 (#409) — a legacy ctx posts NO overrides key (byte-identical body)."""
+    ctx = pipeline.DemoContext(
+        seed=42, skip_seed=False, reset=False, scenario=ScenarioPreset.DEMO_MINIMAL
+    )
+    client = _RecordingClient(
+        None,
+        responses={("POST", "/seeder/generate"): {"records_created": {"sales": 1}}},
+    )
+    status, _detail, data = await pipeline.step_seed(ctx, _as_client(client))
+    assert status == "pass"
+    body = client.calls[0][2]
+    assert body is not None
+    assert "overrides" not in body
+    assert body["stores"] == 3  # demo_minimal profile
+    assert data["overrides_applied"] == []
+
+
+async def test_step_seed_window_days_overrides_profile_window():
+    """E3 (#409) — window_days drives a today-anchored window of that length."""
+    ctx = pipeline.DemoContext(
+        seed=42,
+        skip_seed=False,
+        reset=False,
+        scenario=ScenarioPreset.DEMO_MINIMAL,
+        seed_overrides=SeederOverrides(window_days=120),
+    )
+    client = _RecordingClient(
+        None,
+        responses={("POST", "/seeder/generate"): {"records_created": {"sales": 1}}},
+    )
+    status, _detail, _data = await pipeline.step_seed(ctx, _as_client(client))
+    assert status == "pass"
+    body = client.calls[0][2]
+    assert body is not None
+    start = date.fromisoformat(body["start_date"])
+    end = date.fromisoformat(body["end_date"])
+    assert end - start == timedelta(days=120)
+    assert body["overrides"] == {"window_days": 120}
+
+
+def _status_discovery_responses() -> dict[tuple[str, str], Any]:
+    """Canned responses for the legacy first-pair discovery path."""
+    return {
+        ("GET", "/seeder/status"): {
+            "date_range_start": "2026-01-01",
+            "date_range_end": "2026-03-31",
+            "sales": 900,
+        },
+        ("GET", "/dimensions/stores?page=1&page_size=1"): {"stores": [{"id": 4}]},
+        ("GET", "/dimensions/products?page=1&page_size=1"): {"products": [{"id": 9}]},
+    }
+
+
+async def test_step_status_honors_user_scope():
+    """E3 (#409) — a valid pair is validated via GET-by-id and adopted."""
+    ctx = pipeline.DemoContext(
+        seed=42,
+        skip_seed=True,
+        reset=False,
+        scenario=ScenarioPreset.DEMO_MINIMAL,
+        user_scope=UserScope(store_id=12, product_id=47),
+    )
+    client = _RecordingClient(
+        None,
+        responses={
+            ("GET", "/seeder/status"): {
+                "date_range_start": "2026-01-01",
+                "date_range_end": "2026-03-31",
+                "sales": 900,
+            },
+            ("GET", "/dimensions/stores/12"): {"id": 12, "code": "S012"},
+            ("GET", "/dimensions/products/47"): {"id": 47, "sku": "P047"},
+        },
+    )
+    status, detail, data = await pipeline.step_status(ctx, _as_client(client))
+    assert status == "pass"
+    assert ctx.store_id == 12
+    assert ctx.product_id == 47
+    assert "(user-selected)" in detail
+    assert data["user_scope_applied"] is True
+    # Both GET-by-id validations were issued; no discovery call happened.
+    paths = [path for _method, path, _body in client.calls]
+    assert "/dimensions/stores/12" in paths
+    assert "/dimensions/products/47" in paths
+    assert "/dimensions/stores?page=1&page_size=1" not in paths
+
+
+async def test_step_status_dangling_scope_warns_and_falls_back():
+    """E3 (#409) — a 404 pair WARNS (never fails) and discovery takes over."""
+    responses = _status_discovery_responses()
+    responses[("GET", "/dimensions/products/47")] = {"id": 47}
+    ctx = pipeline.DemoContext(
+        seed=42,
+        skip_seed=True,
+        reset=False,
+        scenario=ScenarioPreset.DEMO_MINIMAL,
+        user_scope=UserScope(store_id=12, product_id=47),
+    )
+    client = _RecordingClient(
+        None,
+        responses=responses,
+        errors={
+            ("GET", "/dimensions/stores/12"): pipeline._StepError(
+                "status[scope-store]", 404, {"title": "Not Found"}
+            ),
+        },
+    )
+    status, detail, data = await pipeline.step_status(ctx, _as_client(client))
+    assert status == "warn"
+    assert ctx.store_id == 4  # discovered pair
+    assert ctx.product_id == 9
+    assert "user_scope (store=12, product=47) not found" in detail
+    assert data["user_scope_applied"] is False
+
+
+async def test_step_status_without_scope_unchanged():
+    """E3 (#409) — the legacy discovery path is byte-identical (pass, no warn)."""
+    ctx = pipeline.DemoContext(
+        seed=42, skip_seed=True, reset=False, scenario=ScenarioPreset.DEMO_MINIMAL
+    )
+    client = _RecordingClient(None, responses=_status_discovery_responses())
+    status, detail, data = await pipeline.step_status(ctx, _as_client(client))
+    assert status == "pass"
+    assert ctx.store_id == 4
+    assert ctx.product_id == 9
+    assert "user_scope" not in detail
+    assert data["user_scope_applied"] is False
+
+
+async def test_run_pipeline_threads_e3_fields(monkeypatch):
+    """E3 (#409) — run_pipeline threads seed_overrides/user_scope into ctx."""
+    captured: dict[str, Any] = {}
+
+    async def _capturing_precheck(ctx: Any, _client: Any) -> Any:
+        captured["seed_overrides"] = ctx.seed_overrides
+        captured["user_scope"] = ctx.user_scope
+        return ("fail", "stop after capture", {})
+
+    monkeypatch.setattr(pipeline, "step_precheck", _capturing_precheck)
+    monkeypatch.setattr(pipeline, "_Client", _build_fake_client("unused", {}))
+
+    req = DemoRunRequest.model_validate(
+        {
+            "skip_seed": False,
+            "seed_overrides": {"stores": 8, "noise_sigma": 0.25},
+            "user_scope": {"store_id": 12, "product_id": 47},
+        }
+    )
+    _events = [e async for e in pipeline.run_pipeline(app=_FAKE_APP, req=req)]
+    assert captured["seed_overrides"] == SeederOverrides(stores=8, noise_sigma=0.25)
+    assert captured["user_scope"] == UserScope(store_id=12, product_id=47)
diff --git a/app/features/demo/tests/test_schemas.py b/app/features/demo/tests/test_schemas.py
index 866f708c..8019d219 100644
--- a/app/features/demo/tests/test_schemas.py
+++ b/app/features/demo/tests/test_schemas.py
@@ -10,6 +10,7 @@
     DemoRunRequest,
     DemoRunResult,
     StepEvent,
+    UserScope,
     WorkspaceDetailResponse,
     WorkspaceListItem,
     WorkspaceListResponse,
@@ -143,6 +144,95 @@ def test_demo_run_request_replayed_from_pattern_rejected():
             )
 
 
+# =============================================================================
+# E3 (#409) -- seed_overrides + user_scope (advanced seed config + focus pair)
+# =============================================================================
+
+
+def test_demo_run_request_e3_field_defaults():
+    """E3 (#409) -- defaults None; a legacy 4-field frame stays byte-identical."""
+    req = DemoRunRequest.model_validate(
+        {"seed": 7, "reset": False, "skip_seed": True, "scenario": "demo_minimal"}
+    )
+    assert req.seed_overrides is None
+    assert req.user_scope is None
+
+
+def test_demo_run_request_seed_overrides_json_path():
+    """E3 (#409) -- the JSON wire form (validate_python on a parsed dict, the
+    path FastAPI uses) accepts a nested overrides object on a re-seed run."""
+    req = DemoRunRequest.model_validate(
+        {"skip_seed": False, "seed_overrides": {"stores": 8, "promotion_intensity": 0.3}}
+    )
+    assert req.seed_overrides is not None
+    assert req.seed_overrides.stores == 8
+    assert req.seed_overrides.promotion_intensity == 0.3
+
+
+def test_demo_run_request_seed_overrides_require_reseed():
+    """E3 (#409) -- overrides on a skip_seed run would be a silent no-op."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"skip_seed": True, "seed_overrides": {"stores": 8}})
+    # skip_seed defaults to True -- omitting it must also reject.
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"seed_overrides": {"stores": 8}})
+
+
+def test_demo_run_request_empty_seed_overrides_normalizes_to_none():
+    """E3 (#409) -- {} on the wire collapses to None (single no-overrides form),
+    and is therefore legal even on a skip_seed run."""
+    req = DemoRunRequest.model_validate({"skip_seed": True, "seed_overrides": {}})
+    assert req.seed_overrides is None
+
+
+def test_demo_run_request_window_days_rejected_on_holiday_rush():
+    """E3 (#409) -- holiday_rush is calendar-pinned; window_days fails loudly."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate(
+            {
+                "skip_seed": False,
+                "scenario": "holiday_rush",
+                "seed_overrides": {"window_days": 120},
+            }
+        )
+    # The same knob is fine on a today-anchored preset.
+    req = DemoRunRequest.model_validate(
+        {
+            "skip_seed": False,
+            "scenario": "retail_standard",
+            "seed_overrides": {"window_days": 120},
+        }
+    )
+    assert req.seed_overrides is not None
+    assert req.seed_overrides.window_days == 120
+
+
+def test_demo_run_request_seed_overrides_unknown_knob_rejected():
+    """E3 (#409) -- the nested extra='forbid' allow-list holds on the demo path."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"skip_seed": False, "seed_overrides": {"bogus_knob": 1}})
+
+
+def test_demo_run_request_user_scope_json_path():
+    """E3 (#409) -- user_scope accepts a real id pair; works with skip_seed."""
+    req = DemoRunRequest.model_validate({"user_scope": {"store_id": 12, "product_id": 47}})
+    assert req.user_scope is not None
+    assert req.user_scope.store_id == 12
+    assert req.user_scope.product_id == 47
+
+
+def test_user_scope_rejects_extra_keys_and_bad_ids():
+    """E3 (#409) -- closed schema; ids are ge=1; strict rejects string ints."""
+    with pytest.raises(ValidationError):
+        UserScope.model_validate({"store_id": 1, "product_id": 1, "extra": True})
+    with pytest.raises(ValidationError):
+        UserScope.model_validate({"store_id": 0, "product_id": 1})
+    with pytest.raises(ValidationError):
+        UserScope.model_validate({"store_id": 1})  # product_id required
+    with pytest.raises(ValidationError):
+        UserScope.model_validate({"store_id": "1", "product_id": 1})
+
+
 # =============================================================================
 # E1 (#407) -- WorkspaceUpdateRequest (PATCH body)
 # =============================================================================
@@ -393,6 +483,28 @@ def test_workspace_detail_passes_e1_fields_through():
     assert detail.job_ids == ["job-1", "job-2"]
 
 
+def test_workspace_list_item_exposes_e3_slots():
+    """E3 (#409) -- seed_overrides/user_scope live on the LIST item (replay
+    reads list rows), defaulting to None on rows without them."""
+    bare = WorkspaceListItem.model_validate(_orm_like_workspace_row())
+    assert bare.seed_overrides is None
+    assert bare.user_scope is None
+
+    slotted = WorkspaceListItem.model_validate(
+        _orm_like_workspace_row(
+            seed_overrides={"stores": 8, "noise_sigma": 0.25},
+            user_scope={"store_id": 12, "product_id": 47},
+        )
+    )
+    assert slotted.seed_overrides == {"stores": 8, "noise_sigma": 0.25}
+    assert slotted.user_scope == {"store_id": 12, "product_id": 47}
+    # Detail inherits the same exposure.
+    detail = WorkspaceDetailResponse.model_validate(
+        _orm_like_workspace_row(seed_overrides={"sparsity": 0.3})
+    )
+    assert detail.seed_overrides == {"sparsity": 0.3}
+
+
 def test_workspace_list_response_shape():
     """E4 (#393) -- page shape mirrors the scenarios list (items + total)."""
     item = WorkspaceListItem.model_validate(_orm_like_workspace_row())
diff --git a/app/features/demo/tests/test_workspace.py b/app/features/demo/tests/test_workspace.py
index cb28dea2..fcef7115 100644
--- a/app/features/demo/tests/test_workspace.py
+++ b/app/features/demo/tests/test_workspace.py
@@ -76,6 +76,37 @@ async def test_create_workspace_persists_config(db_session: AsyncSession) -> Non
     assert row.result_summary is None
 
 
+async def test_create_workspace_persists_e3_slots(db_session: AsyncSession) -> None:
+    """E3 (#409) -- seed_overrides/user_scope land in the story slots, sparse."""
+    workspace_id = await workspace.create_workspace(
+        _keep_request(
+            skip_seed=False,
+            seed_overrides={"stores": 8, "promotion_intensity": 0.3},
+            user_scope={"store_id": 12, "product_id": 47},
+        )
+    )
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    # Sparse JSON: only the operator-set knobs appear.
+    assert row.seed_overrides == {"stores": 8, "promotion_intensity": 0.3}
+    assert row.user_scope == {"store_id": 12, "product_id": 47}
+
+
+async def test_create_workspace_without_e3_fields_persists_nulls(
+    db_session: AsyncSession,
+) -> None:
+    """E3 (#409) -- a legacy keep-run stores NULL slots (never {})."""
+    workspace_id = await workspace.create_workspace(_keep_request())
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.seed_overrides is None
+    assert row.user_scope is None
+
+
 async def test_finalize_workspace_completed(db_session: AsyncSession) -> None:
     """finalize(failed=False) settles to completed with collected ids."""
     workspace_id = await workspace.create_workspace(_keep_request())
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index 364b64fd..ca3002df 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -112,6 +112,21 @@ async def create_workspace(req: DemoRunRequest) -> str | None:
                     # E1 (#407): replay provenance, recorded verbatim (soft
                     # reference -- no existence check; dangles are designed).
                     replayed_from_workspace_id=req.replayed_from_workspace_id,
+                    # E3 (#409): the two replay-relevant story slots, recorded
+                    # at create time (the REQUESTED config -- the effective
+                    # grain lands separately on store_id/product_id at
+                    # finalize, so a fallen-back scope stays visible). Sparse
+                    # JSON: only operator-set knobs appear; never {}.
+                    seed_overrides=(
+                        req.seed_overrides.model_dump(mode="json", exclude_none=True)
+                        if req.seed_overrides is not None
+                        else None
+                    ),
+                    user_scope=(
+                        req.user_scope.model_dump(mode="json")
+                        if req.user_scope is not None
+                        else None
+                    ),
                 )
             )
             await db.commit()

From e0dc2d87b884c232d63a11d3208b4c11d5ee9688 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:47:34 +0200
Subject: [PATCH 18/32] test(api): cover replay-verbatim seed overrides and
 scope slots (#409)

---
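Reviewer note: the slot semantics this test relies on ("sparse JSON: only
operator-set knobs appear; never {}") can be illustrated with a minimal,
stdlib-only sketch. `sparse_slot` is a hypothetical helper written for this
note, not a function in the codebase. The real path uses the Pydantic dump in
`create_workspace`, plus request validation that collapses `{}` to `None`; the
sketch merges both steps into one function for brevity.

```python
from typing import Any, Optional


def sparse_slot(overrides: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Hypothetical helper: drop unset (None) knobs, collapse empty to None."""
    if overrides is None:
        return None
    # Keep only knobs the operator actually set; 0.0 is a real value and stays.
    sparse = {key: value for key, value in overrides.items() if value is not None}
    # An empty dict is never stored: the single "no overrides" form is None.
    return sparse or None


# Two runs of the same request body yield identical slot JSON.
request_body = {"stores": 3, "products": 10, "window_days": None}
first_slot = sparse_slot(request_body)
replay_slot = sparse_slot(dict(request_body))
print(first_slot == replay_slot)
```

Because the slot records the requested config rather than the effective one,
equality of the two slots depends only on the request body, not on what the
status step later discovers.
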
 tests/test_e2e_demo.py | 56 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/tests/test_e2e_demo.py b/tests/test_e2e_demo.py
index 5ef406ff..3323c524 100644
--- a/tests/test_e2e_demo.py
+++ b/tests/test_e2e_demo.py
@@ -614,6 +614,62 @@ def test_demo_replay_same_config_twice(
             assert row["status"] == "completed"
 
 
+@pytest.mark.integration
+def test_demo_replay_preserves_seed_overrides_and_scope(
+    uvicorn_subprocess: subprocess.Popen[bytes],
+) -> None:
+    """E3 (#409) — replayed runs carry identical seed_overrides/user_scope slots.
+
+    The replay-verbatim contract: the slots record the REQUESTED config, so two
+    runs of the same body must produce two workspace rows with identical slot
+    JSON. ``user_scope`` is deliberately a (1, 1) pair that almost certainly
+    dangles (sequences never reset) — the status step must WARN + fall back,
+    the run must still pass, and the slot must still record the request
+    verbatim (requested-vs-effective divergence stays visible: the row's
+    store_id/product_id columns carry the discovered grain).
+    """
+    import json
+
+    body_dict: dict[str, object] = {
+        "seed": 42,
+        "reset": True,
+        "skip_seed": False,
+        "scenario": "demo_minimal",
+        "preservation": "keep",
+        "workspace_name": "e3-replay-slots",
+        # Smallest overrides to keep wall-clock sane (matches demo_minimal dims).
+        "seed_overrides": {"stores": 3, "products": 10},
+        "user_scope": {"store_id": 1, "product_id": 1},
+    }
+
+    first = _post_demo_run(body_dict, REPLAY_RUN_TIMEOUT_S)
+    assert first["overall_status"] == "pass", (
+        f"first run did not pass: "
+        f"steps={[(s['step_name'], s['status'], s['detail']) for s in first['steps']]}"  # type: ignore[index]
+    )
+    second = _post_demo_run(body_dict, REPLAY_RUN_TIMEOUT_S)
+    assert second["overall_status"] == "pass", (
+        f"replay did not pass: "
+        f"steps={[(s['step_name'], s['status'], s['detail']) for s in second['steps']]}"  # type: ignore[index]
+    )
+
+    with urllib.request.urlopen(  # noqa: S310 — http://127.0.0.1 internal URL
+        f"{DEMO_API_URL}/demo/workspaces?limit=100", timeout=10.0
+    ) as resp:
+        assert resp.status == 200
+        page = json.loads(resp.read())
+    rows_by_id = {w["workspace_id"]: w for w in page["workspaces"]}
+    first_row = rows_by_id[first["workspace_id"]]
+    second_row = rows_by_id[second["workspace_id"]]
+
+    # The slots are exposed on the LIST item (replay reads list rows) and are
+    # identical across the original and the replay.
+    assert first_row["seed_overrides"] == {"stores": 3, "products": 10}
+    assert first_row["user_scope"] == {"store_id": 1, "product_id": 1}
+    assert second_row["seed_overrides"] == first_row["seed_overrides"]
+    assert second_row["user_scope"] == first_row["user_scope"]
+
+
 @pytest.mark.integration
 def test_run_demo_precondition_failure_exits_2() -> None:
     """A bogus API URL surfaces as a precondition failure with exit 2.

From ee59cbd392ff15affd7142633159fbcb50aab37d Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:56:54 +0200
Subject: [PATCH 19/32] feat(ui): add advanced seed config panel and scope
 selector to showcase (#409)

---
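Reviewer note: the selector leans on the server-side behaviour covered by the
pipeline tests, where a valid pair is adopted, a dangling pair warns and falls
back to discovery, and no scope means plain discovery. A minimal sketch of that
decision, assuming hypothetical `exists` and `discover` callables in place of
the real GET-by-id and first-pair discovery calls:

```python
from typing import Callable, Optional

Pair = tuple[int, int]  # (store_id, product_id)


def resolve_scope(
    requested: Optional[Pair],
    exists: Callable[[int, int], bool],
    discover: Callable[[], Pair],
) -> tuple[str, Pair, bool]:
    """Hypothetical sketch: return (status, effective_pair, user_scope_applied)."""
    if requested is None:
        # Legacy path: no user scope, discovery picks the pair.
        return ("pass", discover(), False)
    if exists(*requested):
        # Both ids validated: adopt the operator's pair.
        return ("pass", requested, True)
    # Dangling pair: warn (never fail) and fall back to discovery.
    return ("warn", discover(), False)


status, pair, applied = resolve_scope((12, 47), lambda s, p: False, lambda: (4, 9))
print(status, pair, applied)
```

The UI therefore never needs to block a run on a stale selection; the
requested pair is still recorded in the `user_scope` slot while the effective
pair lands on the row's `store_id`/`product_id` columns.
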
 .../demo/ReplayConfirmDialog.test.tsx         |   2 +
 .../components/demo/ScopeSelector.test.tsx    | 109 +++++++++
 .../src/components/demo/ScopeSelector.tsx     | 143 ++++++++++++
 .../components/demo/SeedConfigPanel.test.tsx  |  95 ++++++++
 .../src/components/demo/SeedConfigPanel.tsx   | 213 ++++++++++++++++++
 .../demo/WorkspaceArtifactsPanel.test.tsx     |   2 +
 .../demo/WorkspaceEditDialog.test.tsx         |   2 +
 .../components/demo/WorkspacePanel.test.tsx   |   2 +
 frontend/src/components/demo/index.ts         |   3 +
 .../components/demo/replay-request.test.ts    |  26 +++
 .../src/components/demo/replay-request.ts     |   4 +
 frontend/src/pages/showcase.tsx               |  52 ++++-
 frontend/src/pages/workspace-compare.test.tsx |   2 +
 frontend/src/types/api.ts                     |  26 +++
 14 files changed, 678 insertions(+), 3 deletions(-)
 create mode 100644 frontend/src/components/demo/ScopeSelector.test.tsx
 create mode 100644 frontend/src/components/demo/ScopeSelector.tsx
 create mode 100644 frontend/src/components/demo/SeedConfigPanel.test.tsx
 create mode 100644 frontend/src/components/demo/SeedConfigPanel.tsx

diff --git a/frontend/src/components/demo/ReplayConfirmDialog.test.tsx b/frontend/src/components/demo/ReplayConfirmDialog.test.tsx
index 8c3d705d..cee3981b 100644
--- a/frontend/src/components/demo/ReplayConfirmDialog.test.tsx
+++ b/frontend/src/components/demo/ReplayConfirmDialog.test.tsx
@@ -32,6 +32,8 @@ const baseItem: WorkspaceListItem = {
   pinned: false,
   tags: [],
   replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
 }
 
 function renderDialog(workspace: WorkspaceListItem | null, handlers = {}) {
diff --git a/frontend/src/components/demo/ScopeSelector.test.tsx b/frontend/src/components/demo/ScopeSelector.test.tsx
new file mode 100644
index 00000000..643dc057
--- /dev/null
+++ b/frontend/src/components/demo/ScopeSelector.test.tsx
@@ -0,0 +1,109 @@
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { ScopeSelector } from './ScopeSelector'
+
+afterEach(cleanup)
+
+vi.mock('@/hooks/use-stores', () => ({
+  useStores: () => ({
+    data: {
+      stores: [
+        {
+          id: 12,
+          code: 'S012',
+          name: 'Riverside',
+          region: 'North',
+          city: null,
+          store_type: 'supermarket',
+          created_at: '',
+          updated_at: '',
+        },
+        {
+          id: 13,
+          code: 'S013',
+          name: 'Hilltop',
+          region: 'South',
+          city: null,
+          store_type: 'express',
+          created_at: '',
+          updated_at: '',
+        },
+      ],
+    },
+    isLoading: false,
+  }),
+}))
+
+vi.mock('@/hooks/use-products', () => ({
+  useProducts: () => ({
+    data: {
+      products: [
+        {
+          id: 47,
+          sku: 'SKU-047',
+          name: 'Oat Milk',
+          category: 'Dairy',
+          brand: 'BrandA',
+          base_price: null,
+          base_cost: null,
+          created_at: '',
+          updated_at: '',
+        },
+      ],
+    },
+    isLoading: false,
+  }),
+}))
+
+vi.mock('@/hooks/use-seeder', () => ({
+  useSeederStatus: () => ({
+    data: {
+      date_range_start: '2026-01-01',
+      date_range_end: '2026-03-31',
+    },
+  }),
+}))
+
+describe('ScopeSelector', () => {
+  it('renders the two dropdowns with auto-discover placeholders', () => {
+    render(<ScopeSelector value={null} onChange={() => undefined} />)
+    expect(screen.getByText('Auto-discover first store')).toBeTruthy()
+    expect(screen.getByText('Auto-discover first product')).toBeTruthy()
+    // No preview card while nothing is selected.
+    expect(screen.queryByText('Focus pair')).toBeNull()
+  })
+
+  it('previews the selected pair with names, traits, and the seeded window', () => {
+    render(
+      <ScopeSelector value={{ store_id: 12, product_id: 47 }} onChange={() => undefined} />
+    )
+    expect(screen.getByText('Focus pair')).toBeTruthy()
+    expect(screen.getByText('S012 · Riverside (North, supermarket)')).toBeTruthy()
+    expect(screen.getByText('SKU-047 · Oat Milk (Dairy, BrandA)')).toBeTruthy()
+    expect(screen.getByText(/2026-01-01 → 2026-03-31/)).toBeTruthy()
+  })
+
+  it('falls back to raw ids when the pair is not in the loaded page', () => {
+    render(
+      <ScopeSelector value={{ store_id: 999, product_id: 888 }} onChange={() => undefined} />
+    )
+    expect(screen.getByText('store #999')).toBeTruthy()
+    expect(screen.getByText('product #888')).toBeTruthy()
+  })
+
+  it('clears the selection via the Clear focus button', () => {
+    const onChange = vi.fn()
+    render(<ScopeSelector value={{ store_id: 12, product_id: 47 }} onChange={onChange} />)
+    fireEvent.click(screen.getByText('Clear focus'))
+    expect(onChange).toHaveBeenCalledWith(null)
+  })
+
+  it('disables the Clear button and the triggers when disabled', () => {
+    render(
+      <ScopeSelector value={{ store_id: 12, product_id: 47 }} onChange={() => undefined} disabled />
+    )
+    const storeTrigger = screen.getByLabelText('Focus store') as HTMLButtonElement
+    expect(storeTrigger.disabled).toBe(true)
+    expect((screen.getByText('Clear focus') as HTMLButtonElement).disabled).toBe(true)
+  })
+})
diff --git a/frontend/src/components/demo/ScopeSelector.tsx b/frontend/src/components/demo/ScopeSelector.tsx
new file mode 100644
index 00000000..1ed1b1e8
--- /dev/null
+++ b/frontend/src/components/demo/ScopeSelector.tsx
@@ -0,0 +1,143 @@
+import { Crosshair } from 'lucide-react'
+import { Button } from '@/components/ui/button'
+import { Card, CardContent } from '@/components/ui/card'
+import {
+  Select,
+  SelectContent,
+  SelectItem,
+  SelectTrigger,
+  SelectValue,
+} from '@/components/ui/select'
+import { useProducts } from '@/hooks/use-products'
+import { useSeederStatus } from '@/hooks/use-seeder'
+import { useStores } from '@/hooks/use-stores'
+import type { UserScope } from '@/types/api'
+
+/**
+ * E3 (#409) — store/product focus-pair selector with a pre-run preview.
+ *
+ * Fed from live /dimensions data (NEVER synthesized ids — Postgres sequences
+ * don't reset, so ids are not 1-based). Works without re-seeding: scope
+ * selection on the existing dataset is the primary use. The status step
+ * validates the pair server-side and warn-falls-back to discovery when it
+ * dangles (e.g. after a reset re-issued ids).
+ */
+interface ScopeSelectorProps {
+  value: UserScope | null
+  onChange: (value: UserScope | null) => void
+  disabled?: boolean
+}
+
+// page_size hard cap on /dimensions endpoints is 100.
+const PAGE_SIZE = 100
+
+/** "S001 · Main St (North, supermarket)" — label + non-null traits. */
+function describeEntity(label: string, traits: Array<string | null>): string {
+  const present = traits.filter((t): t is string => t !== null && t !== '')
+  return present.length > 0 ? `${label} (${present.join(', ')})` : label
+}
+
+export function ScopeSelector({ value, onChange, disabled = false }: ScopeSelectorProps) {
+  const storesQuery = useStores({ page: 1, pageSize: PAGE_SIZE })
+  const productsQuery = useProducts({ page: 1, pageSize: PAGE_SIZE })
+  const { data: seederStatus } = useSeederStatus()
+
+  const stores = storesQuery.data?.stores ?? []
+  const products = productsQuery.data?.products ?? []
+  const selectedStore = stores.find((s) => s.id === value?.store_id) ?? null
+  const selectedProduct = products.find((p) => p.id === value?.product_id) ?? null
+
+  return (
+    <div className="flex flex-col gap-2">
+      <div className="flex flex-wrap items-end gap-4">
+        <label className="flex flex-col gap-1 text-sm">
+          <span className="text-xs text-muted-foreground">Focus store</span>
+          <Select
+            value={value?.store_id !== undefined ? String(value.store_id) : ''}
+            onValueChange={(v) =>
+              onChange({
+                store_id: Number(v),
+                product_id: value?.product_id ?? products[0]?.id ?? 0,
+              })
+            }
+            disabled={disabled || stores.length === 0}
+          >
+            <SelectTrigger className="w-56" aria-label="Focus store">
+              <SelectValue placeholder="Auto-discover first store" />
+            </SelectTrigger>
+            <SelectContent>
+              {stores.map((store) => (
+                <SelectItem key={store.id} value={String(store.id)}>
+                  {store.code} · {store.name}
+                </SelectItem>
+              ))}
+            </SelectContent>
+          </Select>
+        </label>
+
+        <label className="flex flex-col gap-1 text-sm">
+          <span className="text-xs text-muted-foreground">Focus product</span>
+          <Select
+            value={value?.product_id !== undefined ? String(value.product_id) : ''}
+            onValueChange={(v) =>
+              onChange({
+                store_id: value?.store_id ?? stores[0]?.id ?? 0,
+                product_id: Number(v),
+              })
+            }
+            disabled={disabled || products.length === 0}
+          >
+            <SelectTrigger className="w-56" aria-label="Focus product">
+              <SelectValue placeholder="Auto-discover first product" />
+            </SelectTrigger>
+            <SelectContent>
+              {products.map((product) => (
+                <SelectItem key={product.id} value={String(product.id)}>
+                  {product.sku} · {product.name}
+                </SelectItem>
+              ))}
+            </SelectContent>
+          </Select>
+        </label>
+
+        {value !== null && (
+          <Button variant="outline" size="sm" disabled={disabled} onClick={() => onChange(null)}>
+            Clear focus
+          </Button>
+        )}
+      </div>
+
+      {value !== null && (
+        <Card>
+          <CardContent className="flex flex-wrap items-center gap-x-6 gap-y-1 py-3 text-sm">
+            <span className="flex items-center gap-1 text-muted-foreground">
+              <Crosshair data-icon="inline-start" />
+              Focus pair
+            </span>
+            <span>
+              {selectedStore
+                ? describeEntity(`${selectedStore.code} · ${selectedStore.name}`, [
+                    selectedStore.region,
+                    selectedStore.store_type,
+                  ])
+                : `store #${value.store_id}`}
+            </span>
+            <span>
+              {selectedProduct
+                ? describeEntity(`${selectedProduct.sku} · ${selectedProduct.name}`, [
+                    selectedProduct.category,
+                    selectedProduct.brand,
+                  ])
+                : `product #${value.product_id}`}
+            </span>
+            {seederStatus?.date_range_start && seederStatus.date_range_end && (
+              <span className="text-muted-foreground">
+                seeded window {seederStatus.date_range_start} → {seederStatus.date_range_end}
+              </span>
+            )}
+          </CardContent>
+        </Card>
+      )}
+    </div>
+  )
+}
diff --git a/frontend/src/components/demo/SeedConfigPanel.test.tsx b/frontend/src/components/demo/SeedConfigPanel.test.tsx
new file mode 100644
index 00000000..c81d8bed
--- /dev/null
+++ b/frontend/src/components/demo/SeedConfigPanel.test.tsx
@@ -0,0 +1,95 @@
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { SeedConfigPanel } from './SeedConfigPanel'
+import type { SeedOverrides } from '@/types/api'
+
+// jsdom lacks ResizeObserver; the radix Slider requires it (no vitest setup
+// file exists in this project — the stub stays local to this suite).
+class ResizeObserverStub {
+  observe() {}
+  unobserve() {}
+  disconnect() {}
+}
+globalThis.ResizeObserver = globalThis.ResizeObserver ?? (ResizeObserverStub as never)
+
+afterEach(cleanup)
+
+function openPanel(value: SeedOverrides | null = null, props = {}) {
+  const onChange = vi.fn()
+  render(<SeedConfigPanel value={value} onChange={onChange} {...props} />)
+  fireEvent.click(screen.getByText('Advanced seed config'))
+  return onChange
+}
+
+describe('SeedConfigPanel', () => {
+  it('renders all 7 knob controls when expanded', () => {
+    openPanel()
+    // 3 int inputs
+    for (const label of ['Stores', 'Products', 'Window (days)']) {
+      expect(screen.getByLabelText(label)).toBeTruthy()
+    }
+    // 4 float sliders
+    for (const label of [
+      'Sparsity',
+      'Promotion intensity',
+      'Stockout intensity',
+      'Noise sigma',
+    ]) {
+      expect(screen.getAllByLabelText(label).length).toBeGreaterThan(0)
+    }
+  })
+
+  it('emits a sparse object containing only the touched knob', () => {
+    const onChange = openPanel()
+    fireEvent.change(screen.getByLabelText('Stores'), { target: { value: '8' } })
+    expect(onChange).toHaveBeenCalledWith({ stores: 8 })
+  })
+
+  it('merges a new knob into the existing sparse object', () => {
+    const onChange = openPanel({ stores: 8 })
+    fireEvent.change(screen.getByLabelText('Products'), { target: { value: '20' } })
+    expect(onChange).toHaveBeenCalledWith({ stores: 8, products: 20 })
+  })
+
+  it('emits null when the last knob is cleared', () => {
+    const onChange = openPanel({ stores: 8 })
+    fireEvent.change(screen.getByLabelText('Stores'), { target: { value: '' } })
+    expect(onChange).toHaveBeenCalledWith(null)
+  })
+
+  it('emits null via the Clear overrides button', () => {
+    const onChange = openPanel({ stores: 8, noise_sigma: 0.25 })
+    fireEvent.click(screen.getByText('Clear overrides'))
+    expect(onChange).toHaveBeenCalledWith(null)
+  })
+
+  it('disables every control when disabled', () => {
+    openPanel({ stores: 8 }, { disabled: true })
+    expect((screen.getByLabelText('Stores') as HTMLInputElement).disabled).toBe(true)
+    expect((screen.getByLabelText('Products') as HTMLInputElement).disabled).toBe(true)
+  })
+
+  it('locks only the window control when windowLocked (holiday_rush)', () => {
+    openPanel(null, { windowLocked: true })
+    expect((screen.getByLabelText('Window (days)') as HTMLInputElement).disabled).toBe(true)
+    expect((screen.getByLabelText('Stores') as HTMLInputElement).disabled).toBe(false)
+    expect(screen.getByText('pinned window (holiday_rush)')).toBeTruthy()
+  })
+
+  it('shows the NaN-WAPE caveat at high stockout intensity', () => {
+    openPanel({ stockout_intensity: 0.4 })
+    expect(
+      screen.getByText(/can legitimately fail the backtest/i)
+    ).toBeTruthy()
+  })
+
+  it('hides the caveat at tame values', () => {
+    openPanel({ stockout_intensity: 0.1, sparsity: 0.2 })
+    expect(screen.queryByText(/can legitimately fail the backtest/i)).toBeNull()
+  })
+
+  it('echoes the live summary of set knobs', () => {
+    openPanel({ stores: 8, products: 20, promotion_intensity: 0.3 })
+    expect(screen.getByText('8 stores · 20 products · promo 0.30')).toBeTruthy()
+  })
+})
diff --git a/frontend/src/components/demo/SeedConfigPanel.tsx b/frontend/src/components/demo/SeedConfigPanel.tsx
new file mode 100644
index 00000000..6eed23e2
--- /dev/null
+++ b/frontend/src/components/demo/SeedConfigPanel.tsx
@@ -0,0 +1,213 @@
+import { useState } from 'react'
+import { ChevronsUpDown, AlertTriangle } from 'lucide-react'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import {
+  Collapsible,
+  CollapsibleContent,
+  CollapsibleTrigger,
+} from '@/components/ui/collapsible'
+import { Input } from '@/components/ui/input'
+import { Slider } from '@/components/ui/slider'
+import type { SeedOverrides } from '@/types/api'
+
+/**
+ * E3 (#409) — advanced seed config panel: the 7 curated, allow-listed knobs.
+ *
+ * Emits a SPARSE object (only operator-touched knobs) and null when nothing
+ * is set, so legacy start frames stay byte-identical. The UI int ranges are
+ * deliberately TIGHTER than the API bounds (laptop-scale demo data); the API
+ * bounds are the law and the backend rejects anything outside them.
+ */
+interface SeedConfigPanelProps {
+  value: SeedOverrides | null
+  onChange: (value: SeedOverrides | null) => void
+  /** Disable every control (run in flight / Re-seed unticked). */
+  disabled?: boolean
+  /** holiday_rush is calendar-pinned — the window control locks. */
+  windowLocked?: boolean
+}
+
+// UI input ranges (int knobs). API bounds: stores 1..100, products 1..500,
+// window_days 75..365. min/max are demo-scale hints; onChange does not clamp.
+const INT_KNOBS = [
+  { key: 'stores', label: 'Stores', min: 1, max: 20, placeholder: 'preset' },
+  { key: 'products', label: 'Products', min: 1, max: 50, placeholder: 'preset' },
+  { key: 'window_days', label: 'Window (days)', min: 75, max: 365, placeholder: 'preset' },
+] as const
+
+// Float knobs rendered as sliders. API bounds are the slider ranges.
+const FLOAT_KNOBS = [
+  { key: 'sparsity', label: 'Sparsity', max: 0.9 },
+  { key: 'promotion_intensity', label: 'Promotion intensity', max: 0.5 },
+  { key: 'stockout_intensity', label: 'Stockout intensity', max: 0.5 },
+  { key: 'noise_sigma', label: 'Noise sigma', max: 0.5 },
+] as const
+
+/** Thresholds above which the NaN-WAPE caveat shows (mirrors the sparse
+ *  preset's documented expected-fail semantics, RUNBOOKS incident 28). */
+const RISKY_SPARSITY = 0.4
+const RISKY_STOCKOUT = 0.25
+
+function setKnob(
+  value: SeedOverrides | null,
+  key: keyof SeedOverrides,
+  knobValue: number | undefined
+): SeedOverrides | null {
+  const next: SeedOverrides = { ...(value ?? {}) }
+  if (knobValue === undefined) {
+    delete next[key]
+  } else {
+    next[key] = knobValue
+  }
+  return Object.keys(next).length > 0 ? next : null
+}
+
+export function SeedConfigPanel({
+  value,
+  onChange,
+  disabled = false,
+  windowLocked = false,
+}: SeedConfigPanelProps) {
+  const [open, setOpen] = useState(false)
+
+  const touched = value !== null && Object.keys(value).length > 0
+  const risky =
+    (value?.sparsity ?? 0) > RISKY_SPARSITY || (value?.stockout_intensity ?? 0) > RISKY_STOCKOUT
+
+  const summaryParts: string[] = []
+  if (value?.stores !== undefined) summaryParts.push(`${value.stores} stores`)
+  if (value?.products !== undefined) summaryParts.push(`${value.products} products`)
+  if (value?.window_days !== undefined) summaryParts.push(`${value.window_days} days`)
+  if (value?.sparsity !== undefined) summaryParts.push(`sparsity ${value.sparsity.toFixed(2)}`)
+  if (value?.promotion_intensity !== undefined)
+    summaryParts.push(`promo ${value.promotion_intensity.toFixed(2)}`)
+  if (value?.stockout_intensity !== undefined)
+    summaryParts.push(`stockout ${value.stockout_intensity.toFixed(2)}`)
+  if (value?.noise_sigma !== undefined)
+    summaryParts.push(`noise ${value.noise_sigma.toFixed(2)}`)
+
+  return (
+    <Collapsible open={open} onOpenChange={setOpen} className="w-full">
+      <CollapsibleTrigger asChild>
+        {/* The trigger stays clickable while disabled so the operator can
+            still INSPECT the config mid-run; only the controls lock. */}
+        <Button variant="ghost" size="sm" className="gap-2 px-2">
+          <ChevronsUpDown data-icon="inline-start" />
+          Advanced seed config
+          {touched && (
+            <Badge variant="secondary">
+              {Object.keys(value ?? {}).length} knob{Object.keys(value ?? {}).length > 1 && 's'}
+            </Badge>
+          )}
+        </Button>
+      </CollapsibleTrigger>
+      <CollapsibleContent>
+        <div className="mt-2 flex flex-col gap-4 rounded-md border p-4">
+          <div className="grid grid-cols-1 gap-4 sm:grid-cols-3">
+            {INT_KNOBS.map((knob) => {
+              const locked = knob.key === 'window_days' && windowLocked
+              return (
+                <label key={knob.key} className="flex flex-col gap-1 text-sm">
+                  <span className="text-xs text-muted-foreground">
+                    {knob.label}{' '}
+                    <span className="text-muted-foreground/70">
+                      ({knob.min}–{knob.max})
+                    </span>
+                  </span>
+                  <Input
+                    type="number"
+                    aria-label={knob.label}
+                    min={knob.min}
+                    max={knob.max}
+                    placeholder={knob.placeholder}
+                    className="h-9"
+                    value={value?.[knob.key] ?? ''}
+                    disabled={disabled || locked}
+                    title={
+                      locked
+                        ? 'holiday_rush seeds a calendar-pinned 2024 window — the window length cannot be overridden'
+                        : undefined
+                    }
+                    onChange={(e) => {
+                      const raw = e.target.value
+                      if (raw === '') {
+                        onChange(setKnob(value, knob.key, undefined))
+                        return
+                      }
+                      const parsed = Number.parseInt(raw, 10)
+                      if (Number.isNaN(parsed)) return
+                      onChange(setKnob(value, knob.key, parsed))
+                    }}
+                  />
+                  {locked && (
+                    <span className="text-xs text-muted-foreground">
+                      pinned window (holiday_rush)
+                    </span>
+                  )}
+                </label>
+              )
+            })}
+          </div>
+
+          <div className="grid grid-cols-1 gap-4 sm:grid-cols-2">
+            {FLOAT_KNOBS.map((knob) => {
+              const knobValue = value?.[knob.key]
+              return (
+                <div key={knob.key} className="flex flex-col gap-1 text-sm">
+                  <span className="text-xs text-muted-foreground">
+                    {knob.label}:{' '}
+                    <span className="font-mono">
+                      {knobValue !== undefined ? knobValue.toFixed(2) : 'preset'}
+                    </span>
+                  </span>
+                  <Slider
+                    aria-label={knob.label}
+                    min={0}
+                    max={knob.max}
+                    step={0.05}
+                    value={[knobValue ?? 0]}
+                    disabled={disabled}
+                    onValueChange={(vals) => {
+                      const v = vals[0]
+                      // 0 from an untouched slider means "preset"; only an
+                      // explicit non-zero (or a previously set knob) registers.
+                      if (v === 0 && knobValue === undefined) return
+                      onChange(setKnob(value, knob.key, v === 0 ? undefined : v))
+                    }}
+                  />
+                </div>
+              )
+            })}
+          </div>
+
+          <div className="flex flex-wrap items-center gap-3">
+            {touched ? (
+              <>
+                <p className="text-sm text-muted-foreground">{summaryParts.join(' · ')}</p>
+                <Button
+                  variant="outline"
+                  size="sm"
+                  disabled={disabled}
+                  onClick={() => onChange(null)}
+                >
+                  Clear overrides
+                </Button>
+              </>
+            ) : (
+              <p className="text-sm text-muted-foreground">
+                No overrides — the scenario preset drives every knob.
+              </p>
+            )}
+            {risky && (
+              <Badge variant="outline" className="gap-1 text-destructive">
+                <AlertTriangle data-icon="inline-start" />
+                high sparsity/stockout can legitimately fail the backtest (NaN WAPE)
+              </Badge>
+            )}
+          </div>
+        </div>
+      </CollapsibleContent>
+    </Collapsible>
+  )
+}
diff --git a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
index 8a6d0549..6fe96923 100644
--- a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
+++ b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
@@ -32,6 +32,8 @@ const fullWorkspace: WorkspaceDetail = {
   pinned: false,
   tags: [],
   replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
   notes: null,
   config_schema_version: 1,
 }
diff --git a/frontend/src/components/demo/WorkspaceEditDialog.test.tsx b/frontend/src/components/demo/WorkspaceEditDialog.test.tsx
index ca73e36b..476d120a 100644
--- a/frontend/src/components/demo/WorkspaceEditDialog.test.tsx
+++ b/frontend/src/components/demo/WorkspaceEditDialog.test.tsx
@@ -32,6 +32,8 @@ const baseItem: WorkspaceListItem = {
   pinned: false,
   tags: ['smoke'],
   replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
 }
 
 let mockDetail: {
diff --git a/frontend/src/components/demo/WorkspacePanel.test.tsx b/frontend/src/components/demo/WorkspacePanel.test.tsx
index 75bf0f56..9f0ae9be 100644
--- a/frontend/src/components/demo/WorkspacePanel.test.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.test.tsx
@@ -42,6 +42,8 @@ const baseItem: WorkspaceListItem = {
   pinned: false,
   tags: [],
   replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
 }
 
 const secondItem: WorkspaceListItem = {
diff --git a/frontend/src/components/demo/index.ts b/frontend/src/components/demo/index.ts
index 88731868..39245c1b 100644
--- a/frontend/src/components/demo/index.ts
+++ b/frontend/src/components/demo/index.ts
@@ -7,3 +7,6 @@ export * from './ReplayConfirmDialog'
 export * from './WorkspaceEditDialog'
 export * from './WorkspaceLineageStrip'
 export * from './workspace-name'
+// E3 (#409) — advanced seed config + focus-pair selection.
+export * from './SeedConfigPanel'
+export * from './ScopeSelector'
diff --git a/frontend/src/components/demo/replay-request.test.ts b/frontend/src/components/demo/replay-request.test.ts
index 1e50759d..5b65a677 100644
--- a/frontend/src/components/demo/replay-request.test.ts
+++ b/frontend/src/components/demo/replay-request.test.ts
@@ -16,6 +16,8 @@ const baseItem: WorkspaceListItem = {
   pinned: false,
   tags: [],
   replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
 }
 
 describe('buildReplayRequest', () => {
@@ -36,4 +38,28 @@ describe('buildReplayRequest', () => {
     expect('workspace_name' in request).toBe(false)
     expect(request.preservation).toBe('keep')
   })
+
+  // E3 (#409) — replay-verbatim covers the recorded story slots.
+  it('omits the E3 keys on a legacy row (null slots) — byte-identical frame', () => {
+    const request = buildReplayRequest(baseItem)
+    expect('seed_overrides' in request).toBe(false)
+    expect('user_scope' in request).toBe(false)
+  })
+
+  it('re-submits recorded seed_overrides and user_scope verbatim', () => {
+    const slotted: WorkspaceListItem = {
+      ...baseItem,
+      seed_overrides: { stores: 8, products: 20, promotion_intensity: 0.3 },
+      user_scope: { store_id: 12, product_id: 47 },
+    }
+    const request = buildReplayRequest(slotted)
+    expect(request.seed_overrides).toEqual({
+      stores: 8,
+      products: 20,
+      promotion_intensity: 0.3,
+    })
+    expect(request.user_scope).toEqual({ store_id: 12, product_id: 47 })
+    // Lineage stays intact when the slots ride along (E1 frozen criterion).
+    expect(request.replayed_from_workspace_id).toBe(baseItem.workspace_id)
+  })
 })
diff --git a/frontend/src/components/demo/replay-request.ts b/frontend/src/components/demo/replay-request.ts
index e2ecee3d..51590be5 100644
--- a/frontend/src/components/demo/replay-request.ts
+++ b/frontend/src/components/demo/replay-request.ts
@@ -15,5 +15,9 @@ export function buildReplayRequest(ws: WorkspaceListItem): DemoRunRequest {
     // E1 (#407) — record replay lineage on the NEW row (soft reference).
     replayed_from_workspace_id: ws.workspace_id,
     ...(ws.name ? { workspace_name: ws.name } : {}),
+    // E3 (#409) — replay-verbatim covers the recorded slots; omitted on
+    // legacy rows (null) so their replay frame stays byte-identical.
+    ...(ws.seed_overrides ? { seed_overrides: ws.seed_overrides } : {}),
+    ...(ws.user_scope ? { user_scope: ws.user_scope } : {}),
   }
 }
diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index 6a3497ce..8ff2f8eb 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -13,6 +13,8 @@ import { ReplayConfirmDialog } from '@/components/demo/ReplayConfirmDialog'
 import { WorkspaceLineageStrip } from '@/components/demo/WorkspaceLineageStrip'
 import { WorkspacePanel } from '@/components/demo/WorkspacePanel'
 import { WorkspaceArtifactsPanel } from '@/components/demo/WorkspaceArtifactsPanel'
+import { SeedConfigPanel } from '@/components/demo/SeedConfigPanel'
+import { ScopeSelector } from '@/components/demo/ScopeSelector'
 import { buildReplayRequest } from '@/components/demo/replay-request'
 import { WORKSPACE_NAME_PATTERN } from '@/components/demo/workspace-name'
 import { Button } from '@/components/ui/button'
@@ -21,7 +23,7 @@ import { Checkbox } from '@/components/ui/checkbox'
 import { Input } from '@/components/ui/input'
 import { ROUTES } from '@/lib/constants'
 import { cn } from '@/lib/utils'
-import type { WorkspaceListItem } from '@/types/api'
+import type { SeedOverrides, UserScope, WorkspaceListItem } from '@/types/api'
 
 const TERMINAL_STATUSES = new Set(['pass', 'fail', 'skip', 'warn'])
 
@@ -124,6 +126,10 @@ export default function ShowcasePage() {
   const [selectedWorkspaceId, setSelectedWorkspaceId] = useState<string | null>(null)
   // E2 (#408) — the workspace awaiting replay confirmation (null = no dialog).
   const [pendingReplay, setPendingReplay] = useState<WorkspaceListItem | null>(null)
+  // E3 (#409) — advanced seed config (sparse; null = preset-driven) and the
+  // operator-selected focus pair (null = auto-discover first pair).
+  const [seedOverrides, setSeedOverrides] = useState<SeedOverrides | null>(null)
+  const [userScope, setUserScope] = useState<UserScope | null>(null)
 
   // The page (not the panel) resolves the loaded workspace's detail — the
   // artifacts panel needs detail-only created_objects.
@@ -159,6 +165,10 @@ export default function ShowcasePage() {
             ...(trimmedName ? { workspace_name: trimmedName } : {}),
           }
         : {}),
+      // E3 (#409) — overrides only ride a re-seed run (the backend rejects
+      // them on skip_seed=true); omit both keys for legacy byte-compat.
+      ...(reseed && seedOverrides ? { seed_overrides: seedOverrides } : {}),
+      ...(userScope ? { user_scope: userScope } : {}),
     })
   }
 
@@ -171,6 +181,9 @@ export default function ShowcasePage() {
     setResetDb(ws.reset)
     setKeepWorkspace(true)
     setWorkspaceName(ws.name ?? '')
+    // E3 (#409) — repopulate the seed-config panel + scope selector.
+    setSeedOverrides(ws.seed_overrides ?? null)
+    setUserScope(ws.user_scope ?? null)
     setSelectedWorkspaceId(ws.workspace_id)
   }
 
@@ -299,7 +312,13 @@ export default function ShowcasePage() {
             <label className="flex items-center gap-2 text-sm">
               <Checkbox
                 checked={reseed}
-                onCheckedChange={(v) => setReseed(v === true)}
+                onCheckedChange={(v) => {
+                  const next = v === true
+                  setReseed(next)
+                  // E3 (#409) — overrides are meaningless without a re-seed
+                  // (validator parity: the backend rejects the combination).
+                  if (!next) setSeedOverrides(null)
+                }}
                 disabled={isRunning}
               />
               <span>
@@ -311,7 +330,13 @@ export default function ShowcasePage() {
             <label className="flex items-center gap-2 text-sm">
               <Checkbox
                 checked={resetDb}
-                onCheckedChange={(v) => setResetDb(v === true)}
+                onCheckedChange={(v) => {
+                  const next = v === true
+                  setResetDb(next)
+                  // E3 (#409) — a wipe re-issues entity ids (sequences never
+                  // reset), so a pre-picked focus pair would dangle.
+                  if (next) setUserScope(null)
+                }}
                 disabled={isRunning}
               />
               <span>
@@ -372,6 +397,27 @@ export default function ShowcasePage() {
             )}
           </div>
 
+          {/* E3 (#409) — advanced seed config, only meaningful on a re-seed run. */}
+          {reseed && (
+            <SeedConfigPanel
+              value={seedOverrides}
+              onChange={setSeedOverrides}
+              disabled={isRunning}
+              windowLocked={scenario === 'holiday_rush'}
+            />
+          )}
+
+          {/* E3 (#409) — focus-pair selection works on the EXISTING dataset
+              (no re-seed needed); a Reset run clears it (ids re-issued). */}
+          <div className="flex flex-col gap-1">
+            <ScopeSelector value={userScope} onChange={setUserScope} disabled={isRunning} />
+            {resetDb && (
+              <p className="text-xs text-destructive">
+                Reset database re-issues entity ids — re-pick the focus pair after the run.
+              </p>
+            )}
+          </div>
+
           {phase === 'running' && (
             <p className="text-sm text-muted-foreground">
               Step {completed} of {steps.length} complete…
diff --git a/frontend/src/pages/workspace-compare.test.tsx b/frontend/src/pages/workspace-compare.test.tsx
index d4dfd706..ea8a352c 100644
--- a/frontend/src/pages/workspace-compare.test.tsx
+++ b/frontend/src/pages/workspace-compare.test.tsx
@@ -40,6 +40,8 @@ function makeDetail(overrides: Partial<WorkspaceDetail>): WorkspaceDetail {
     pinned: false,
     tags: [],
     replayed_from_workspace_id: null,
+    seed_overrides: null,
+    user_scope: null,
     store_id: 3,
     product_id: 7,
     date_start: '2026-01-01',
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index f64ae24a..4c9e07ef 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -774,6 +774,25 @@ export interface StepEvent {
   phase_total?: number | null
 }
 
+// E3 (#409) — curated, allow-listed seed overrides (7 knobs; unknown keys 422
+// server-side via extra='forbid'). Requires skip_seed=false on the start frame.
+export interface SeedOverrides {
+  stores?: number
+  products?: number
+  window_days?: number
+  sparsity?: number
+  promotion_intensity?: number
+  stockout_intensity?: number
+  noise_sigma?: number
+}
+
+// E3 (#409) — operator-selected focus pair (REAL ids from /dimensions —
+// sequences never reset, so ids are not 1-based).
+export interface UserScope {
+  store_id: number
+  product_id: number
+}
+
 // Start frame for WS /demo/stream and request body for POST /demo/run.
 export interface DemoRunRequest {
   seed?: number
@@ -787,6 +806,9 @@ export interface DemoRunRequest {
   workspace_name?: string
   // E1 (#407) — replay provenance: the source workspace_id a Replay re-runs.
   replayed_from_workspace_id?: string
+  // E3 (#409) — advanced seed config + focus pair; omit both for legacy runs.
+  seed_overrides?: SeedOverrides
+  user_scope?: UserScope
 }
 
 // Aggregate result returned by the synchronous POST /demo/run.
@@ -820,6 +842,10 @@ export interface WorkspaceListItem {
   pinned: boolean
   tags: string[]
   replayed_from_workspace_id: string | null
+  // E3 (#409) — replay-relevant story slots (on the LIST item: replay reads
+  // list rows); null on runs without them.
+  seed_overrides: SeedOverrides | null
+  user_scope: UserScope | null
 }
 
 // Full row from GET /demo/workspaces/{workspace_id}.

From bf0ccbfa6d2e8e3eca9532fc783abc4c28d0eb7a Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 01:59:29 +0200
Subject: [PATCH 20/32] docs(docs): document seed override contract and
 workspace slots (#409)

---
 docs/_base/API_CONTRACTS.md | 10 +++++-----
 docs/_base/DOMAIN_MODEL.md  |  8 ++++----
 docs/_base/RUNBOOKS.md      |  9 +++++++--
 3 files changed, 16 insertions(+), 11 deletions(-)
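
Reviewer note: the `seed_overrides` rules documented in this patch (allow-listed knobs with bounds, `{}` normalizing to `null`, the `skip_seed=false` requirement, and `window_days` being rejected on `holiday_rush`) can be sketched as a client-side pre-check. This is only an illustration under stated assumptions: `checkSeedOverrides`, `StartFrame`, and `BOUNDS` are hypothetical names that do not exist in the repository, the bounds are copied from the `API_CONTRACTS.md` row below, and treating an absent `skip_seed` as acceptable is a guess. The server-side `SeederOverrides` model remains the source of truth.

```typescript
// Hypothetical client-side mirror of the documented seed_overrides contract.
// Nothing here is part of the repository; names are illustrative only.
type Knob =
  | 'stores'
  | 'products'
  | 'window_days'
  | 'sparsity'
  | 'promotion_intensity'
  | 'stockout_intensity'
  | 'noise_sigma'

interface Bound {
  min: number
  max: number
  integer: boolean
}

// Bounds copied from the API_CONTRACTS.md row for POST /seeder/generate.
const BOUNDS: Record<Knob, Bound> = {
  stores: { min: 1, max: 100, integer: true },
  products: { min: 1, max: 500, integer: true },
  window_days: { min: 75, max: 365, integer: true },
  sparsity: { min: 0, max: 0.9, integer: false },
  promotion_intensity: { min: 0, max: 0.5, integer: false },
  stockout_intensity: { min: 0, max: 0.5, integer: false },
  noise_sigma: { min: 0, max: 0.5, integer: false },
}

// Assumed minimal shape of a start frame; only the fields the check needs.
interface StartFrame {
  scenario?: string
  skip_seed?: boolean
  seed_overrides?: Record<string, number> | null
}

// Returns a list of problems; an empty list means the frame passes this sketch.
function checkSeedOverrides(frame: StartFrame): string[] {
  const overrides = frame.seed_overrides
  // `{}` normalizes to null per the contract, so both mean "no overrides".
  if (!overrides || Object.keys(overrides).length === 0) return []

  const problems: string[] = []
  // Assumption: only an explicit skip_seed=true conflicts with overrides.
  if (frame.skip_seed === true) {
    problems.push('seed_overrides requires skip_seed=false')
  }
  for (const [key, value] of Object.entries(overrides)) {
    if (!Object.prototype.hasOwnProperty.call(BOUNDS, key)) {
      problems.push(`unknown knob: ${key}`)
      continue
    }
    const bound = BOUNDS[key as Knob]
    if (bound.integer && !Number.isInteger(value)) {
      problems.push(`${key} must be an integer`)
    }
    // Written as a negated conjunction so NaN is also reported.
    if (!(value >= bound.min && value <= bound.max)) {
      problems.push(`${key} out of range ${bound.min}..${bound.max}`)
    }
  }
  if (frame.scenario === 'holiday_rush' && 'window_days' in overrides) {
    problems.push('window_days cannot be overridden on holiday_rush')
  }
  return problems
}
```

A caller could run this before sending the start frame and surface the returned messages inline; the backend would still answer `422` for anything the sketch misses.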

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index 70e6f5ab..e9c2ff7a 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -56,12 +56,12 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | agents | POST | `/agents/sessions/{session_id}/approve` | Approve/reject a pending tool call (HITL gate) |
 | agents | DELETE | `/agents/sessions/{session_id}` | Close session |
 | agents | WS | `/agents/stream` | Token-by-token streaming + tool-call events |
-| seeder | (see `app/features/seeder/routes.py`) | `/seeder/*` | Trigger scenarios, status, customization |
+| seeder | (see `app/features/seeder/routes.py`) | `/seeder/*` | Trigger scenarios, status, customization. **E3 (#409)** — `POST /seeder/generate` accepts an additive Optional `overrides` object (`SeederOverrides`, `app/shared/seeder/overrides.py`) with 7 allow-listed knobs: `stores` (1-100), `products` (1-500), `window_days` (75-365; recomputes `start_date` from `end_date`), `sparsity` (0-0.9), `promotion_intensity` (0-0.5), `stockout_intensity` (0-0.5), `noise_sigma` (0-0.5). `extra=forbid` → an unknown knob is a `422`; applied LAST in `_build_config_from_params` so it wins over the scalar `stores`/`products`/`sparsity` params; absent = byte-identical legacy behavior |
 | seeder | POST | `/seeder/phase2-enrichment` | PRP-38 — run Phase 2 generators (lifecycle, replenishment, exogenous, returns) against the existing seeded data. `422 application/problem+json` on an empty database. |
-| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). |
+| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). **E3 (#409)** — body accepts additive Optional `seed_overrides` (the same `SeederOverrides` object as `POST /seeder/generate`; requires `skip_seed=false` else `422`; `window_days` rejected on the calendar-pinned `holiday_rush` preset; `{}` normalizes to `null`) and `user_scope` (`{store_id: int>=1, product_id: int>=1}`, `extra=forbid` — the focus pair the pipeline models instead of the auto-discovered first pair; validated by the status step, WARN + fallback to discovery on a dangling pair). Both persist into the kept workspace row's story slots and replay verbatim. |
 | demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page |
-| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`. **E2 (#408)** — additive query params: `q` (name ILIKE search, min 2 chars), repeated `tags` (JSONB containment — all listed tags must match), `include_archived` (default `false` — archived rows are now HIDDEN by default), allow-listed `sort_by` (`created_at`/`name`/`seed`/`status`; unknown → default `created_at desc`, no 422) + `sort_order` (`asc`/`desc`); pinned rows always order first; `total` respects the active filters |
-| demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing. **E1 (#407)** — response additively carries the list-item lifecycle fields plus `notes`, `config_schema_version`, and the six story slots (`seed_overrides` / `user_scope` / `approval_events` / `rag_events` / `job_ids` / `phase_summaries` — all `null` until their writer epic lands; schemas in `docs/_base/DOMAIN_MODEL.md`) |
+| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`. **E2 (#408)** — additive query params: `q` (name ILIKE search, min 2 chars), repeated `tags` (JSONB containment — all listed tags must match), `include_archived` (default `false` — archived rows are now HIDDEN by default), allow-listed `sort_by` (`created_at`/`name`/`seed`/`status`; unknown → default `created_at desc`, no 422) + `sort_order` (`asc`/`desc`); pinned rows always order first; `total` respects the active filters. **E3 (#409)** — list items additively carry the `seed_overrides` / `user_scope` story slots (`null` on runs without them) — deliberately on the LIST item, because the frontend Replay builds its verbatim start frame from list rows |
+| demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing. **E1 (#407)** — response additively carries the list-item lifecycle fields plus `notes`, `config_schema_version`, and the six story slots (`seed_overrides` / `user_scope` / `approval_events` / `rag_events` / `job_ids` / `phase_summaries` — `null` until their writer epic lands; schemas in `docs/_base/DOMAIN_MODEL.md`). **E3 (#409)** — `seed_overrides` and `user_scope` are now WRITTEN (recorded at create time from the start frame) and surfaced on the LIST item as well (Detail inherits) |
 | demo | GET | `/demo/workspaces/{workspace_id}/health` | **E2 (#408)** — probe the workspace's soft references in-process (model runs, scenario plans, alias, batch, agent session, `job_ids` slot) via `httpx.ASGITransport`; per-reference `status` ∈ `alive` (2xx) / `dead` (404 — deleted after the run) / `unknown` (anything else — never a 500), plus `alive`/`dead`/`unknown` counts and `partial_run` (true when the row's status ≠ `completed`); non-probeable keys (`v2_model_path`, `scenario_artifact_key`, `train_model_types`) are skipped; `404 application/problem+json` when the workspace is missing |
 | demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
 | demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
@@ -88,7 +88,7 @@ Verified against `app/features/agents/websocket.py` and `app/features/agents/sch
 
 Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified against `app/features/demo/routes.py` and `app/features/demo/schemas.py` (`StepEvent`).
 
-- **Client → server (one start frame):** `{"seed": int, "reset": bool, "skip_seed": bool, "scenario"?: "demo_minimal" | "showcase_rich" | "sparse", "preservation"?: "ephemeral" | "keep", "workspace_name"?: str}` — all fields optional (`DemoRunRequest` supplies defaults `seed=42`, `reset=false`, `skip_seed=true`, `scenario="demo_minimal"`, `preservation="ephemeral"`, `workspace_name=null`). E1 (#390) — `workspace_name` requires `preservation="keep"` (else one `error` event from validation); unknown start-frame keys remain ignored (forward/backward compat). E2 (#391) — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. E1 (#407) — the start frame additively accepts `replayed_from_workspace_id?: str` (`^[0-9a-f]{32}$`, requires `preservation="keep"` else one `error` event from validation); the Showcase Replay button sends the source row's `workspace_id`, recorded verbatim on the NEW row as a soft reference. The pipeline runs once, then the server closes.
+- **Client → server (one start frame):** `{"seed": int, "reset": bool, "skip_seed": bool, "scenario"?: "demo_minimal" | "showcase_rich" | "sparse", "preservation"?: "ephemeral" | "keep", "workspace_name"?: str}` — all fields optional (`DemoRunRequest` supplies defaults `seed=42`, `reset=false`, `skip_seed=true`, `scenario="demo_minimal"`, `preservation="ephemeral"`, `workspace_name=null`). E1 (#390) — `workspace_name` requires `preservation="keep"` (else one `error` event from validation); unknown start-frame keys remain ignored (forward/backward compat). E2 (#391) — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. E1 (#407) — the start frame additively accepts `replayed_from_workspace_id?: str` (`^[0-9a-f]{32}$`, requires `preservation="keep"` else one `error` event from validation); the Showcase Replay button sends the source row's `workspace_id`, recorded verbatim on the NEW row as a soft reference. E3 (#409) — the start frame additively accepts `seed_overrides?: {stores?, products?, window_days?, sparsity?, promotion_intensity?, stockout_intensity?, noise_sigma?}` (allow-listed — an unknown knob is one `error` event; requires `skip_seed=false`; `window_days` rejected on `holiday_rush`) and `user_scope?: {store_id, product_id}`; the seed step forwards `seed_overrides` verbatim to `POST /seeder/generate` (its `data` echoes `overrides_applied`), the status step adopts a valid `user_scope` (detail says "(user-selected)", `data.user_scope_applied=true`) or WARNS and falls back to discovery on a dangling pair; both persist to the kept workspace row and replay verbatim. The pipeline runs once, then the server closes.
 - **Server → client (every frame):** Pydantic-serialized `StepEvent` — `{"event_type", "step_name", "step_index", "total_steps", "status", "detail", "duration_ms", "data", "timestamp", "phase_name"?, "phase_index"?, "phase_total"?}`. PRP-38 — the three `phase_*` fields are Optional + Nullable so legacy clients that don't render phases keep working.
 - **`event_type` values (Literal in `StepEvent`):**
   - `step_start` — a step began; `status` is `null`.
diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index 24137fc2..a7493219 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -61,22 +61,22 @@
 - **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants).
 - **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
 - **JSONB story slots (E1 #407 — authoritative per-slot schema):** six dedicated nullable JSONB columns; `NULL` = "slot never written" (distinct from empty). E1 ships the columns only — each slot has an assigned writer epic:
-  - `seed_overrides` (E3 #409 writes) — dict: the curated seeder-override payload from the start frame, stored verbatim (`model_dump(mode="json")`); replay echoes it.
-  - `user_scope` (E3 #409 writes) — dict: operator-selected focus, `{"store_id": int, "product_id": int}` (additive keys allowed later).
+  - `seed_overrides` (**WRITTEN since E3 #409**) — SPARSE dict: only operator-set knobs appear, `{}` is never stored (`None` instead). Allow-listed keys (the `SeederOverrides` schema, `app/shared/seeder/overrides.py`): `stores` int 1-100, `products` int 1-500, `window_days` int 75-365, `sparsity` float 0-0.9, `promotion_intensity` float 0-0.5, `stockout_intensity` float 0-0.5, `noise_sigma` float 0-0.5. Persisted via `model_dump(mode="json", exclude_none=True)` at create time; replay re-submits it verbatim. Records the REQUESTED config — the data the run actually seeded follows from it deterministically.
+  - `user_scope` (**WRITTEN since E3 #409**) — dict: operator-selected focus, `{"store_id": int>=1, "product_id": int>=1}` (`UserScope` schema, `extra=forbid`; additive keys need a documented schema change). Records the REQUESTED pair; the row's `store_id`/`product_id` columns record the EFFECTIVE grain the run modeled — the two legitimately diverge when the requested pair dangled and the status step warn-fell-back to discovery (divergence is visible by design). Both slots are exposed on the workspace LIST item (not detail-only) because the frontend Replay builds its start frame from list rows.
   - `approval_events` (E5 #411 writes) — list[dict], append-only: `{"action_id": str, "tool_name": str, "decision": "approved"|"rejected", "decided_at": iso8601-str, "session_id": str}`.
   - `rag_events` (E5 #411 writes) — list[dict], append-only: `{"event": "index"|"retrieve"|"skip", "detail": str, "count": int, "occurred_at": iso8601-str}`.
   - `job_ids` (later parallel epic — E2 #408 / E4 #410 agree on the writer) — list[str]: job / batch sub-job ids the run submitted (soft references).
   - `phase_summaries` (later parallel epic) — list[dict], one per phase: `{"phase_name": str, "status": "pass"|"fail"|"warn"|"skip", "steps": int, "duration_ms": float}`.
 - **Relationship to demo pipeline runs:** one workspace row per kept pipeline run — `create_workspace` inserts it as `running` before the first step; `finalize_workspace` settles it with the run's collected ids. NOT a seeder `scenario`: a preset is a reusable data-generation recipe; a workspace is the record of ONE concrete run (which preset it used, with what seed, and what it produced).
 - **Invariants:**
-  - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) are sufficient for a verbatim Replay through the normal run path — replay never mutates the original row; it creates a NEW row.
+  - The config columns (`seed`, `scenario`, `reset`, `skip_seed`), plus (since E3 #409) the `seed_overrides`/`user_scope` story slots, are sufficient for a verbatim Replay through the normal run path. Replay never mutates the original row; it creates a NEW row.
   - `name` is deliberately NON-unique; `workspace_id` (UUID hex) is the unique handle.
   - `created_objects` carries SOFT references only — **no ForeignKeys by design**. The workspace row is an audit record, not an ownership root: the referenced runs/plans/aliases are independently operator-deletable, and a workspace must never block (or cascade) their deletion.
   - Deletion is METADATA-ONLY, symmetric with the no-FK design: `DELETE /demo/workspaces/{id}` removes the `showcase_workspace` row and nothing else — the soft-referenced model runs, scenario plans, aliases, jobs, agent sessions, and artifacts survive, and a workspace whose references already dangle still deletes cleanly.
   - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`). The HTTP-backed helpers (`update_workspace` for PATCH, like get/list/delete) take a caller-owned session and raise normally — warn-and-continue is pipeline-only.
   - E1 (#407): `replayed_from_workspace_id` is a SOFT reference — **no ForeignKey, not even self-referential**: ancestor workspace rows must stay independently deletable (metadata-only delete) without cascading to or blocking descendants. The value is recorded verbatim from the request (no existence check); dangling lineage pointers after an ancestor delete are expected and harmless, like every `created_objects` id.
   - E1 (#407): `status` is NOT patchable — `PATCH /demo/workspaces/{id}` covers `name`/`notes`/`tags`/`archived`/`pinned` only; `archived` is an orthogonal curation flag and the `ck_showcase_workspace_status` CHECK is untouched.
-- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture (columns exist as E1 story slots; the writers are E5 #411), advanced seed config (slot exists; writer is E3 #409), and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace.
+- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture (columns exist as E1 story slots; the writers are E5 #411), and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace. (Advanced seed config + scope selection shipped in E3 #409 — the `seed_overrides`/`user_scope` slots above are now written.)
 
 ## Key Invariants — NEVER violate
 
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index 22e06e49..f7aa35a5 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -139,6 +139,11 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
     - `holiday_rush` — seeds a **pinned Oct–Dec 2024 window** (the preset's `HolidayConfig` spikes are fixed 2024 dates; a today-anchored window would never contain them). Re-seeding ADDS rows without wiping prior data, so after a holiday_rush re-seed `/seeder/status` reports the union range (e.g. `2024-10-01..today`); tick **Reset database** together with **Re-seed first** for a clean pinned window, and again when switching back to a today-anchored preset. Expected green on the 11-step flow.
     - `retail_standard` / `high_variance` / `stockout_heavy` — demo-scaled 5×15×180d, today-anchored; `new_launches` — 5×25×180d. All expected **green** on the legacy 11-step flow (only `showcase_rich` runs the 24-step table).
     Fix: none for the documented outcomes above. If a normally-green preset fails, make sure **Re-seed first** was ticked (without it the run reuses the currently seeded dataset, whatever preset produced it), then re-run.
+29. **Seed-overrides / focus-pair failures (E3 #409)** — the Advanced seed config panel and the store/product focus-pair selector add four documented outcomes:
+    - **`POST /demo/run` (or the WS start frame) 422s with `seed_overrides requires skip_seed=false`** — overrides on a run that skips the seed step would be a silent no-op, so the backend rejects the combination. Fix: tick **Re-seed first** (the panel is only rendered while it's ticked; direct API callers must send `skip_seed: false`).
+    - **422 `window_days cannot override the calendar-pinned holiday_rush window`** — expected; the preset's holiday spikes are fixed 2024 dates and a shifted window would silently drop all of them (the UI disables the window control on `holiday_rush`). Fix: pick a today-anchored preset or drop `window_days`.
+    - **`status` step shows ⚠️ `user_scope (store=X, product=Y) not found — fell back to discovered pair`** — expected after a reset/reseed re-issued entity ids (Postgres sequences never reset). The run continues on the discovered pair; the workspace row's `user_scope` slot keeps the REQUESTED pair while the `store_id`/`product_id` columns record the EFFECTIVE grain (divergence is visible by design). Fix: re-pick the pair from the live dropdowns after the run.
+    - **`backtest` step ❌ NaN WAPE after high `stockout_intensity` / `sparsity` overrides** — documented expected outcome, same semantics as the `sparse` preset (incident 28); the panel shows a caveat badge at risky values. Not graceful-skipped by design — a skip would mask real regressions on healthy configs. Fix: lower the knob or accept the documented fail.
 
 > ⚠️ **RAG embedding-dim mismatch can orphan chunks (R4).** PRP-40 indexes a curated 5-file subset; if the operator switches the embedding provider mid-showcase, indexed chunks orphan (pgvector assumes one fixed dimension per column). PRP-40 does NOT ship a `clear_rag` UI toggle — that's a future PRP. Stick to one provider for the showcase run.
 
@@ -155,9 +160,9 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 4. **Deleting a workspace deletes METADATA ONLY.** The delete removes just the `showcase_workspace` row — the model runs, scenario plans, aliases, jobs, agent sessions, and on-disk artifacts the run created are NOT touched (and the seeded data is not reverted). `created_objects` ids are SOFT references (deliberately no FKs), so deletion in either direction never cascades: an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists. E2 (#408) — that staleness now SURFACES instead of dangling silently: loading a workspace probes its references via `GET /demo/workspaces/{id}/health`, dead references get a warning marker on the artifact cards, and a summary chip shows alive/dead counts plus a partial-run warning for never-completed rows.
 5. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
 
-**Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`.
+**Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`. E3 (#409) — a kept run additionally records its `seed_overrides` and `user_scope` story slots at create time; Replay re-submits both verbatim (the slot records the REQUESTED config; the row's `store_id`/`product_id` columns record the EFFECTIVE grain, so a fallen-back scope stays visible).
 
-**Explicitly out of scope (not implemented; future epics, do not assume they exist):** advanced seed configuration on `/showcase` (beyond seed/scenario/reset/skip_seed); export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); full phase-level interactive configuration. (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay.)
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); full phase-level interactive configuration. (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim; phase-level config remains out of scope.)
 
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.

From c5e68e52738cd58581b4cbb3e28ea1878d83b808 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 05:24:20 +0200
Subject: [PATCH 21/32] feat(api,db): showcase run-config start-frame contract
 + workspace column (#410)

---
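
Reviewer note (illustrative only, not part of the change): the sketch below
shows how a Replay client could rebuild a start frame from a workspace LIST
row once `run_config` is exposed there. It is a minimal sketch under stated
assumptions: `build_replay_frame` is a hypothetical helper name, and it
assumes the list row carries the replay-input columns (`seed`, `scenario`,
`reset`, `skip_seed`) alongside `seed_overrides`, `user_scope`, and
`run_config`. The real frontend code may be organized differently.

```python
from typing import Any

# Replay-input columns that are always copied onto the new start frame.
_CONFIG_KEYS = ("seed", "scenario", "reset", "skip_seed")
# Nullable slots that are forwarded only when the source run recorded them.
_OPTIONAL_KEYS = ("seed_overrides", "user_scope")


def build_replay_frame(row: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: rebuild a start frame from a workspace list row."""
    frame: dict[str, Any] = {key: row[key] for key in _CONFIG_KEYS}
    # replayed_from_workspace_id is only valid with preservation="keep".
    frame["preservation"] = "keep"
    frame["replayed_from_workspace_id"] = row["workspace_id"]
    for key in _OPTIONAL_KEYS:
        if row.get(key) is not None:
            frame[key] = row[key]
    # run_config is NULL on default-config and pre-E4 rows; when present its
    # documented shape is {"train_model_types": [...], "backtest": {...}}.
    run_config = row.get("run_config")
    if run_config is not None:
        for key in ("train_model_types", "backtest"):
            if run_config.get(key) is not None:
                frame[key] = run_config[key]
    return frame
```

Omitting the NULL slots, rather than sending explicit nulls, keeps a replay of
a default-config row identical to a legacy start frame.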
 ...3f204_add_showcase_workspace_run_config.py |  41 +++++++
 app/features/demo/models.py                   |   9 ++
 app/features/demo/schemas.py                  |  74 ++++++++++++
 app/features/demo/tests/test_models.py        |  38 ++++++
 app/features/demo/tests/test_schemas.py       | 108 ++++++++++++++++++
 app/features/demo/tests/test_workspace.py     |  35 ++++++
 app/features/demo/workspace.py                |  19 +++
 app/shared/model_taxonomy.py                  |   9 ++
 app/shared/tests/test_model_taxonomy.py       |  22 +++-
 9 files changed, 354 insertions(+), 1 deletion(-)
 create mode 100644 alembic/versions/b7c1d9e3f204_add_showcase_workspace_run_config.py

diff --git a/alembic/versions/b7c1d9e3f204_add_showcase_workspace_run_config.py b/alembic/versions/b7c1d9e3f204_add_showcase_workspace_run_config.py
new file mode 100644
index 00000000..60927c3e
--- /dev/null
+++ b/alembic/versions/b7c1d9e3f204_add_showcase_workspace_run_config.py
@@ -0,0 +1,41 @@
+"""add showcase_workspace run_config column
+
+Revision ID: b7c1d9e3f204
+Revises: d45cf40dfe47
+Create Date: 2026-06-13 12:00:00.000000
+
+E4 of the showcase-completion initiative (umbrella #406, epic #410). Adds a
+single nullable JSONB ``run_config`` column to ``showcase_workspace`` -- a
+REPLAY-INPUT column in the same class as ``seed`` / ``scenario`` / ``reset`` /
+``skip_seed`` (NOT an E1 story slot; see docs/_base/DOMAIN_MODEL.md D1). It
+records the start-frame model set + backtest config a ``preservation="keep"``
+run was launched with, so Load/Replay can reproduce it verbatim. NULL when the
+run used default config. No index (the read path is by ``workspace_id``; the
+column is a display/replay payload). Forward-only.
+"""
+
+from collections.abc import Sequence
+
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision: str = "b7c1d9e3f204"
+down_revision: str | None = "d45cf40dfe47"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def upgrade() -> None:
+    """Add the nullable ``run_config`` JSONB column."""
+    op.add_column(
+        "showcase_workspace",
+        sa.Column("run_config", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+
+
+def downgrade() -> None:
+    """Drop the ``run_config`` column."""
+    op.drop_column("showcase_workspace", "run_config")
diff --git a/app/features/demo/models.py b/app/features/demo/models.py
index 4a50eb4a..4897f621 100644
--- a/app/features/demo/models.py
+++ b/app/features/demo/models.py
@@ -63,6 +63,7 @@ class ShowcaseWorkspace(TimestampMixin, Base):
         date_end: Seeded data window end; NULL when unknown.
         created_objects: Soft-reference ids of everything the run created (JSONB).
         result_summary: Winner / WAPE / wall-clock display payload (JSONB).
+        run_config: Replay-input run config -- model set + backtest knobs (E4 #410); NULL on defaults.
         archived: Operator curation flag -- archived rows still list in E1.
         pinned: Operator curation flag -- no behavioral semantics in E1.
         notes: Free-text operator annotation (capped at the Pydantic boundary).
@@ -102,6 +103,14 @@ class ShowcaseWorkspace(TimestampMixin, Base):
     )
     # winner_model_type / winner_wape / wall_clock_s -- display payload.
     result_summary: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
+    # E4 (#410) -- replay-input run config (NOT an E1 story slot; see
+    # DOMAIN_MODEL.md D1). Shape: {"train_model_types": [...], "backtest": {...}}
+    # via model_dump(mode="json"); NULL when the run used default config.
+    # Written by create_workspace at insert time (a replay input known before
+    # step 1, like seed/scenario); consumed by Load/Replay. config_schema_version
+    # is deliberately NOT bumped -- it versions the STORY-SLOT schema; run_config
+    # presence is NULL-detectable and carries its own documented shape.
+    run_config: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
 
     # ── E1 (#407) — lifecycle metadata ────────────────────────────────────
     # Orthogonal to ``status`` (which the pipeline owns): archive/pin are
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index d5aa78ea..352770aa 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -13,6 +13,7 @@
 
 from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
 
+from app.shared.model_taxonomy import KNOWN_MODEL_TYPES
 from app.shared.seeder.config import ScenarioPreset
 from app.shared.seeder.overrides import SeederOverrides
 
@@ -43,6 +44,40 @@ class UserScope(BaseModel):
     product_id: int = Field(..., ge=1, description="Real product id from /dimensions/products.")
 
 
+class DemoBacktestConfig(BaseModel):
+    """Backtest knobs for the showcase pipeline (E4, issue #410).
+
+    Bounds MIRROR ``app/features/backtesting/schemas.py:SplitConfig`` exactly --
+    the pipeline forwards them verbatim into ``POST /backtesting/run``. The only
+    intentional divergence is ``n_splits``'s default (3, the demo default, vs
+    SplitConfig's 5) and the addition of ``metric``, the winner-ranking choice
+    (D5: WAPE / MAE / RMSE, all lower-is-better; smape/bias deliberately
+    excluded -- issue #410 names exactly these three). Every field is
+    JSON-native so the parent's ``strict=True`` needs no per-field override.
+    """
+
+    model_config = ConfigDict(strict=True)
+
+    horizon: int = Field(default=14, ge=1, le=90, description="Forecast horizon per fold.")
+    strategy: Literal["expanding", "sliding"] = Field(
+        default="expanding",
+        description="Expanding grows the training window; sliding keeps it fixed.",
+    )
+    n_splits: int = Field(default=3, ge=2, le=20, description="Number of CV folds.")
+    min_train_size: int = Field(default=30, ge=7, description="Minimum training samples.")
+    gap: int = Field(default=0, ge=0, le=30, description="Gap days between train end and test.")
+    metric: Literal["wape", "mae", "rmse"] = Field(
+        default="wape", description="Winner-ranking metric (lower is better)."
+    )
+
+    @model_validator(mode="after")
+    def _gap_lt_horizon(self) -> DemoBacktestConfig:
+        """Mirror SplitConfig's horizon > gap invariant (avoids a 422 deeper in)."""
+        if self.gap >= self.horizon:
+            raise ValueError(f"horizon ({self.horizon}) must be greater than gap ({self.gap})")
+        return self
+
+
 class DemoRunRequest(BaseModel):
     """Request body for ``POST /demo/run`` and the ``WS /demo/stream`` start frame.
 
@@ -124,6 +159,37 @@ class DemoRunRequest(BaseModel):
             "step (warn + fallback to discovery on a dangling pair)."
         ),
     )
+    # E4 (#410): additive run-config. None -> the legacy DEMO_MODEL_TYPES trio +
+    # legacy split constants, byte-identical behaviour. The model allow-list
+    # comes from app.shared.model_taxonomy (vertical-slice rule: the demo slice
+    # never imports model_selection / forecasting). Flag enforcement is NOT
+    # here -- a disabled opt-in model fails fast in step_train (D6), which avoids
+    # the documented ".env-bleed" bug class caused by reading settings in a schema.
+    train_model_types: list[str] | None = Field(
+        default=None,
+        min_length=1,
+        max_length=10,
+        description="Models the pipeline trains/backtests; None = the legacy baseline trio.",
+    )
+    backtest: DemoBacktestConfig | None = Field(
+        default=None,
+        description="Backtest split + ranking-metric config; None = the legacy demo split.",
+    )
+
+    @field_validator("train_model_types")
+    @classmethod
+    def _known_unique_models(cls, v: list[str] | None) -> list[str] | None:
+        """Allow-list the selection against KNOWN_MODEL_TYPES and reject duplicates."""
+        if v is None:
+            return v
+        unknown = [m for m in v if m not in KNOWN_MODEL_TYPES]
+        if unknown:
+            raise ValueError(
+                f"Unknown model type(s): {unknown!r}. Valid: {sorted(KNOWN_MODEL_TYPES)}"
+            )
+        if len(set(v)) != len(v):
+            raise ValueError("train_model_types contains duplicates")
+        return v
 
     @model_validator(mode="after")
     def _workspace_name_requires_keep(self) -> DemoRunRequest:
@@ -342,6 +408,14 @@ class WorkspaceListItem(BaseModel):
     user_scope: dict[str, Any] | None = Field(
         default=None, description="Story slot (E3 #409): operator-selected focus."
     )
+    # E4 (#410) -- replay-input echo (NOT a story slot; a dedicated nullable
+    # JSONB column, see DOMAIN_MODEL.md D1). None on default-config / pre-E4
+    # rows. On the LIST item because the frontend Replay reads list rows and
+    # rebuilds the start frame's train_model_types + backtest from it.
+    run_config: dict[str, Any] | None = Field(
+        default=None,
+        description="Replay-input run config (model set + backtest); None on defaults.",
+    )
 
 
 class WorkspaceDetailResponse(WorkspaceListItem):
diff --git a/app/features/demo/tests/test_models.py b/app/features/demo/tests/test_models.py
index 28791caa..ee048764 100644
--- a/app/features/demo/tests/test_models.py
+++ b/app/features/demo/tests/test_models.py
@@ -191,3 +191,41 @@ async def test_showcase_workspace_replayed_from_recorded(db_session: AsyncSessio
     loaded = await get_workspace(db_session, row.workspace_id)
     assert loaded is not None
     assert loaded.replayed_from_workspace_id == dangling_source
+
+
+# =============================================================================
+# E4 (#410) -- run_config replay-input column
+# =============================================================================
+
+
+async def test_showcase_workspace_run_config_roundtrip(db_session: AsyncSession) -> None:
+    """run_config round-trips through JSONB intact."""
+    run_config = {
+        "train_model_types": ["naive", "regression"],
+        "backtest": {
+            "horizon": 21,
+            "strategy": "expanding",
+            "n_splits": 4,
+            "min_train_size": 30,
+            "gap": 0,
+            "metric": "rmse",
+        },
+    }
+    row = _make_row(run_config=run_config)
+    db_session.add(row)
+    await db_session.commit()
+
+    loaded = await get_workspace(db_session, row.workspace_id)
+    assert loaded is not None
+    assert loaded.run_config == run_config
+
+
+async def test_showcase_workspace_run_config_null_default(db_session: AsyncSession) -> None:
+    """run_config stays NULL on a default-config insert."""
+    row = _make_row()
+    db_session.add(row)
+    await db_session.commit()
+
+    loaded = await get_workspace(db_session, row.workspace_id)
+    assert loaded is not None
+    assert loaded.run_config is None
diff --git a/app/features/demo/tests/test_schemas.py b/app/features/demo/tests/test_schemas.py
index 8019d219..d7ba3573 100644
--- a/app/features/demo/tests/test_schemas.py
+++ b/app/features/demo/tests/test_schemas.py
@@ -7,6 +7,7 @@
 from pydantic import ValidationError
 
 from app.features.demo.schemas import (
+    DemoBacktestConfig,
     DemoRunRequest,
     DemoRunResult,
     StepEvent,
@@ -80,6 +81,9 @@ def test_demo_run_request_legacy_frame_still_validates():
     assert req.seed == 7
     assert req.preservation == "ephemeral"
     assert req.workspace_name is None
+    # E4 (#410) -- the run-config fields default None on a legacy frame.
+    assert req.train_model_types is None
+    assert req.backtest is None
 
 
 def test_demo_run_request_workspace_name_requires_keep():
@@ -233,6 +237,110 @@ def test_user_scope_rejects_extra_keys_and_bad_ids():
         UserScope.model_validate({"store_id": "1", "product_id": 1})
 
 
+# =============================================================================
+# E4 (#410) -- train_model_types + backtest (run-config phase controls)
+# =============================================================================
+
+
+def test_demo_run_request_run_config_defaults_none():
+    """E4 (#410) -- both run-config fields default None (legacy behaviour)."""
+    req = DemoRunRequest()
+    assert req.train_model_types is None
+    assert req.backtest is None
+
+
+def test_demo_run_request_accepts_model_selection_json_path():
+    """E4 (#410) -- the JSON wire form accepts a selection + nested backtest
+    dict (Python-mode ``model_validate`` on a parsed dict, the path FastAPI uses)."""
+    req = DemoRunRequest.model_validate(
+        {
+            "train_model_types": ["naive", "seasonal_average"],
+            "backtest": {"horizon": 21, "n_splits": 4, "metric": "rmse"},
+        }
+    )
+    assert req.train_model_types == ["naive", "seasonal_average"]
+    assert req.backtest is not None
+    assert req.backtest.horizon == 21
+    assert req.backtest.n_splits == 4
+    assert req.backtest.metric == "rmse"
+    # Unset nested knobs fall back to their defaults.
+    assert req.backtest.strategy == "expanding"
+    assert req.backtest.min_train_size == 30
+    assert req.backtest.gap == 0
+
+
+def test_demo_run_request_rejects_unknown_model_type():
+    """E4 (#410) -- a model_type outside KNOWN_MODEL_TYPES is rejected."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"train_model_types": ["naive", "bogus_model"]})
+
+
+def test_demo_run_request_rejects_duplicate_model_types():
+    """E4 (#410) -- duplicate model types are rejected."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"train_model_types": ["naive", "naive"]})
+
+
+def test_demo_run_request_rejects_empty_and_oversized_selection():
+    """E4 (#410) -- selection size is bounded 1..10."""
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"train_model_types": []})
+    # 11 distinct known models -> over the cap of 10.
+    eleven = [
+        "naive",
+        "seasonal_naive",
+        "moving_average",
+        "weighted_moving_average",
+        "seasonal_average",
+        "trend_regression_baseline",
+        "regression",
+        "prophet_like",
+        "lightgbm",
+        "xgboost",
+        "random_forest",
+    ]
+    with pytest.raises(ValidationError):
+        DemoRunRequest.model_validate({"train_model_types": eleven})
+
+
+def test_demo_backtest_config_defaults_and_bounds():
+    """E4 (#410) -- DemoBacktestConfig defaults + bound/invariant enforcement."""
+    cfg = DemoBacktestConfig()
+    assert cfg.horizon == 14
+    assert cfg.strategy == "expanding"
+    assert cfg.n_splits == 3  # demo default, NOT SplitConfig's 5
+    assert cfg.min_train_size == 30
+    assert cfg.gap == 0
+    assert cfg.metric == "wape"
+    # n_splits floor is 2.
+    with pytest.raises(ValidationError):
+        DemoBacktestConfig.model_validate({"n_splits": 1})
+    # gap >= horizon is rejected (mirrors SplitConfig).
+    with pytest.raises(ValidationError):
+        DemoBacktestConfig.model_validate({"horizon": 5, "gap": 5})
+    # Unknown metric rejected (closed Literal).
+    with pytest.raises(ValidationError):
+        DemoBacktestConfig.model_validate({"metric": "smape"})
+
+
+def test_workspace_list_item_run_config_round_trip():
+    """E4 (#410) -- run_config rides on the list item, default None."""
+    bare = WorkspaceListItem.model_validate(_orm_like_workspace_row())
+    assert bare.run_config is None
+    slotted = WorkspaceListItem.model_validate(
+        _orm_like_workspace_row(
+            run_config={
+                "train_model_types": ["naive", "regression"],
+                "backtest": {"horizon": 21, "metric": "rmse"},
+            }
+        )
+    )
+    assert slotted.run_config == {
+        "train_model_types": ["naive", "regression"],
+        "backtest": {"horizon": 21, "metric": "rmse"},
+    }
+
+
 # =============================================================================
 # E1 (#407) -- WorkspaceUpdateRequest (PATCH body)
 # =============================================================================
diff --git a/app/features/demo/tests/test_workspace.py b/app/features/demo/tests/test_workspace.py
index fcef7115..b0597981 100644
--- a/app/features/demo/tests/test_workspace.py
+++ b/app/features/demo/tests/test_workspace.py
@@ -107,6 +107,41 @@ async def test_create_workspace_without_e3_fields_persists_nulls(
     assert row.user_scope is None
 
 
+async def test_create_workspace_records_run_config(db_session: AsyncSession) -> None:
+    """E4 (#410) -- a custom run-config keep-run persists run_config verbatim."""
+    workspace_id = await workspace.create_workspace(
+        _keep_request(
+            train_model_types=["naive", "seasonal_average"],
+            backtest={"horizon": 21, "n_splits": 4, "metric": "rmse"},
+        )
+    )
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.run_config == {
+        "train_model_types": ["naive", "seasonal_average"],
+        "backtest": {
+            "horizon": 21,
+            "strategy": "expanding",
+            "n_splits": 4,
+            "min_train_size": 30,
+            "gap": 0,
+            "metric": "rmse",
+        },
+    }
+
+
+async def test_create_workspace_run_config_null_on_defaults(db_session: AsyncSession) -> None:
+    """E4 (#410) -- a default-config keep-run leaves run_config NULL."""
+    workspace_id = await workspace.create_workspace(_keep_request())
+    assert workspace_id is not None
+
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.run_config is None
+
+
 async def test_finalize_workspace_completed(db_session: AsyncSession) -> None:
     """finalize(failed=False) settles to completed with collected ids."""
     workspace_id = await workspace.create_workspace(_keep_request())
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index ca3002df..1b3ba4aa 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -87,6 +87,22 @@ def _apply_filters[SelectT: Select[Any]](
     return stmt
 
 
+def _run_config_payload(req: DemoRunRequest) -> dict[str, Any] | None:
+    """Build the ``run_config`` JSONB payload for a kept run (E4, #410).
+
+    Returns ``None`` when the run used default config (BOTH fields absent) so
+    the column stays NULL and Load/Replay can NULL-detect "defaults". Otherwise
+    a two-key dict (an unset half is ``None``; a set ``backtest`` is dumped in
+    full via ``model_dump(mode="json")``) so a verbatim Replay re-submits it.
+    """
+    if req.train_model_types is None and req.backtest is None:
+        return None
+    return {
+        "train_model_types": req.train_model_types,
+        "backtest": req.backtest.model_dump(mode="json") if req.backtest is not None else None,
+    }
+
+
 async def create_workspace(req: DemoRunRequest) -> str | None:
     """Insert a ``running`` workspace row for a ``preservation="keep"`` run.
 
@@ -127,6 +143,9 @@ async def create_workspace(req: DemoRunRequest) -> str | None:
                         if req.user_scope is not None
                         else None
                     ),
+                    # E4 (#409 sibling, #410): replay-input run config -- model
+                    # set + backtest knobs, recorded verbatim (NULL on defaults).
+                    run_config=_run_config_payload(req),
                 )
             )
             await db.commit()
diff --git a/app/shared/model_taxonomy.py b/app/shared/model_taxonomy.py
index a42f10e1..8d3f21b7 100644
--- a/app/shared/model_taxonomy.py
+++ b/app/shared/model_taxonomy.py
@@ -56,6 +56,15 @@ class ModelFamily(str, Enum):
 }
 
 
+# E4 (#410) — public cross-slice request-validation allow-list. The demo
+# slice (and any other slice that must validate a model_type without importing
+# a sibling feature slice) checks membership against this frozenset instead of
+# reaching into forecasting/model_selection. Derived from the canonical map
+# above so it can never drift (drift-locked by
+# ``app/shared/tests/test_model_taxonomy.py``).
+KNOWN_MODEL_TYPES: frozenset[str] = frozenset(_MODEL_FAMILY_MAP)
+
+
 def model_family_for(model_type: str) -> ModelFamily:
     """Return the :class:`ModelFamily` for a given ``model_type`` string.
 
diff --git a/app/shared/tests/test_model_taxonomy.py b/app/shared/tests/test_model_taxonomy.py
index bf241d0f..76c2b5e3 100644
--- a/app/shared/tests/test_model_taxonomy.py
+++ b/app/shared/tests/test_model_taxonomy.py
@@ -16,7 +16,12 @@
 
 import pytest
 
-from app.shared.model_taxonomy import ModelFamily, model_family_for
+from app.shared.model_taxonomy import (
+    _MODEL_FAMILY_MAP,
+    KNOWN_MODEL_TYPES,
+    ModelFamily,
+    model_family_for,
+)
 
 # ---------------------------------------------------------------------------
 # model_family_for — canonical mapping (mirrors the legacy suite in
@@ -50,6 +55,21 @@ def test_model_family_for_unknown_returns_baseline() -> None:
     assert model_family_for("future_arima_v9") == ModelFamily.BASELINE
 
 
+# ---------------------------------------------------------------------------
+# KNOWN_MODEL_TYPES — cross-slice request-validation allow-list (E4 #410).
+# ---------------------------------------------------------------------------
+
+
+def test_known_model_types_matches_family_map() -> None:
+    """Drift-lock: the public allow-list IS the canonical map's key set."""
+    assert KNOWN_MODEL_TYPES == frozenset(_MODEL_FAMILY_MAP)
+
+
+def test_known_model_types_contains_demo_trio() -> None:
+    """The legacy demo trio must always validate (byte-compat criterion)."""
+    assert {"naive", "seasonal_naive", "moving_average"} <= KNOWN_MODEL_TYPES
+
+
 # ---------------------------------------------------------------------------
 # Back-compat re-exports — OBJECT IDENTITY across the legacy paths (#268).
 # Enum members are str-valued, so == would pass even across distinct class

From a750d7d1acfbf6cca9e6abe4a9d86a4c9c0f92d6 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 05:24:30 +0200
Subject: [PATCH 22/32] feat(api): honor run config in demo pipeline + catalog
 enabled overlay (#410)

---
 app/features/demo/pipeline.py                 | 317 +++++++++++++-----
 app/features/demo/tests/test_pipeline.py      | 254 +++++++++++++-
 app/features/model_selection/schemas.py       |   5 +
 app/features/model_selection/service.py       |  24 +-
 .../tests/test_capabilities.py                |  10 +
 .../model_selection/tests/test_service.py     |  45 +++
 6 files changed, 570 insertions(+), 85 deletions(-)

diff --git a/app/features/demo/pipeline.py b/app/features/demo/pipeline.py
index 6052a4be..a91c1dd9 100644
--- a/app/features/demo/pipeline.py
+++ b/app/features/demo/pipeline.py
@@ -42,6 +42,7 @@
 from app.core.problem_details import EMBEDDING_AUTH_CODE, ERROR_TYPES
 from app.features.demo import workspace
 from app.features.demo.schemas import DemoRunRequest, StepEvent, StepStatus, UserScope
+from app.shared.model_taxonomy import KNOWN_MODEL_TYPES
 from app.shared.seeder.config import ScenarioPreset
 from app.shared.seeder.overrides import SeederOverrides
 
@@ -210,6 +211,55 @@ async def request(
 # =============================================================================
 
 
+@dataclass(frozen=True)
+class ResolvedRunConfig:
+    """The request's run-config with legacy defaults filled in (E4, #410).
+
+    ``customized`` is True when the request carried EITHER ``train_model_types``
+    OR ``backtest`` -- it gates the byte-identical legacy path in step_backtest
+    (D4) and the ``run_config`` echo in pipeline_complete. When False every
+    field equals the legacy constant, so resolving an untouched frame is a
+    no-op.
+    """
+
+    model_types: tuple[str, ...] = DEMO_MODEL_TYPES
+    horizon: int = DEMO_HORIZON
+    strategy: str = "expanding"
+    n_splits: int = DEMO_BACKTEST_SPLITS
+    min_train_size: int = DEMO_MIN_TRAIN_SIZE
+    gap: int = 0
+    metric: str = "wape"
+    customized: bool = False
+
+
+def _resolve_run_config(req: DemoRunRequest) -> ResolvedRunConfig:
+    """Fold ``req.train_model_types`` / ``req.backtest`` over the legacy defaults.
+
+    None on both -> the all-default ResolvedRunConfig (customized=False),
+    byte-identical to today. A partial config (only a selection, or only a
+    backtest block) fills the unspecified half from the legacy constants.
+    """
+    customized = req.train_model_types is not None or req.backtest is not None
+    if not customized:
+        return ResolvedRunConfig()
+    model_types = (
+        tuple(req.train_model_types) if req.train_model_types is not None else DEMO_MODEL_TYPES
+    )
+    backtest = req.backtest
+    if backtest is None:
+        return ResolvedRunConfig(model_types=model_types, customized=True)
+    return ResolvedRunConfig(
+        model_types=model_types,
+        horizon=backtest.horizon,
+        strategy=backtest.strategy,
+        n_splits=backtest.n_splits,
+        min_train_size=backtest.min_train_size,
+        gap=backtest.gap,
+        metric=backtest.metric,
+        customized=True,
+    )
+
+
 @dataclass
 class DemoContext:
     """Accumulator threaded through every step.
@@ -268,6 +318,10 @@ class DemoContext:
     # validates and adopts (warn + fallback to discovery when dangling).
     seed_overrides: SeederOverrides | None = None
     user_scope: UserScope | None = None
+    # E4 (#410) -- resolved run config (selection + backtest split + ranking
+    # metric). Defaults to the all-legacy ResolvedRunConfig so a frame without
+    # the new fields behaves byte-identically.
+    run_config: ResolvedRunConfig = field(default_factory=ResolvedRunConfig)
 
 
 # =============================================================================
@@ -290,6 +344,13 @@ def _model_config_payload(model_type: str) -> dict[str, Any]:
         return {"model_type": "moving_average", "window_size": 7}
     if model_type == "prophet_like":
         return {"model_type": "prophet_like"}
+    # E4 (#410) -- any other KNOWN model type validates from a minimal
+    # {"model_type": X} body (runtime-verified across all 11 union members,
+    # PRP Gotcha 1). The explicit branches above stay because their non-default
+    # params (season_length / window_size) are load-bearing for config_hash
+    # stability of existing registry rows.
+    if model_type in KNOWN_MODEL_TYPES:
+        return {"model_type": model_type}
     raise ValueError(f"Unsupported demo model_type: {model_type}")
 
 
@@ -452,18 +513,21 @@ def _is_embedding_auth_error(exc: _StepError) -> bool:
 
 def _select_winner(
     backtest_results: dict[str, dict[str, float]],
+    metric: str = "wape",
 ) -> tuple[str, float] | None:
-    """Pick the ``(model_type, WAPE)`` with the lowest aggregated WAPE.
+    """Pick the ``(model_type, metric_value)`` with the lowest configured metric.
 
-    Skips models whose WAPE is missing or NaN (port of run_demo.py:338-356).
+    ``metric`` is one of wape / mae / rmse (E4 #410, D5 -- all lower-is-better);
+    defaults to "wape" so every existing call site is unchanged. Skips models
+    whose metric value is missing or NaN (port of run_demo.py:338-356).
     """
     best: tuple[str, float] | None = None
     for model_type, metrics in backtest_results.items():
-        wape = metrics.get("wape")
-        if wape is None or math.isnan(wape):
+        value = metrics.get(metric)
+        if value is None or math.isnan(value):
             continue
-        if best is None or wape < best[1]:
-            best = (model_type, wape)
+        if best is None or value < best[1]:
+            best = (model_type, value)
     return best
 
 
@@ -754,13 +818,38 @@ async def step_features(ctx: DemoContext, client: _Client) -> StepResult:
 
 
 async def step_train(ctx: DemoContext, client: _Client) -> StepResult:
-    """Train naive / seasonal_naive / moving_average in parallel."""
+    """Train the configured model set in parallel (legacy trio by default).
+
+    E4 (#410) -- the selection comes from ``ctx.run_config.model_types`` and the
+    horizon-tail reservation from ``ctx.run_config.horizon`` (both legacy
+    constants on an untouched frame). A disabled opt-in model (lightgbm /
+    xgboost / random_forest behind a False ``forecast_enable_*`` flag) fails the
+    step FAST with a detail naming the disabled model(s) (D6) -- the settings
+    read lives here, never in the Pydantic schema (the documented ".env-bleed" class).
+    """
     if ctx.date_start is None or ctx.date_end is None:
         return ("fail", "no date range on ctx", {})
 
+    # D6 -- fail fast on a disabled opt-in model so the operator gets a clear,
+    # actionable message instead of a deeper 400 (route gate) or factory error.
+    settings = get_settings()
+    flag_by_model = {
+        "lightgbm": settings.forecast_enable_lightgbm,
+        "xgboost": settings.forecast_enable_xgboost,
+        "random_forest": settings.forecast_enable_random_forest,
+    }
+    disabled = [m for m in ctx.run_config.model_types if flag_by_model.get(m) is False]
+    if disabled:
+        return (
+            "fail",
+            f"model(s) {disabled} requested but the matching forecast_enable_* flag "
+            "is off — enable the flag (and install the extra) or deselect the model",
+            {"requested_models": list(ctx.run_config.model_types), "disabled_models": disabled},
+        )
+
     # Leave a horizon-sized tail unused by training so the backtest has room.
     train_start = ctx.date_start
-    train_end = ctx.date_end - timedelta(days=DEMO_HORIZON)
+    train_end = ctx.date_end - timedelta(days=ctx.run_config.horizon)
 
     async def _train(model_type: str) -> tuple[str, dict[str, Any]]:
         train_body = await client.request(
@@ -778,7 +867,7 @@ async def _train(model_type: str) -> tuple[str, dict[str, Any]]:
         return model_type, train_body
 
     results: list[tuple[str, dict[str, Any]]] = list(
-        await asyncio.gather(*(_train(m) for m in DEMO_MODEL_TYPES))
+        await asyncio.gather(*(_train(m) for m in ctx.run_config.model_types))
     )
     for model_type, train_body in results:
         ctx.train_results[model_type] = train_body
@@ -786,7 +875,10 @@ async def _train(model_type: str) -> tuple[str, dict[str, Any]]:
     return (
         "pass",
         f"trained {len(ctx.train_results)} models in parallel: {trained}",
-        {"trained": list(ctx.train_results.keys())},
+        {
+            "trained": list(ctx.train_results.keys()),
+            "requested_models": list(ctx.run_config.model_types),
+        },
     )
 
 
@@ -815,110 +907,154 @@ def _coerce_bucketed_metrics(
     return out or None
 
 
+def _backtest_body(
+    ctx: DemoContext,
+    model_type: str,
+    start_date: date,
+    end_date: date,
+    *,
+    include_baselines: bool,
+) -> dict[str, Any]:
+    """Build a ``POST /backtesting/run`` body from ``ctx.run_config`` (E4 #410).
+
+    The split knobs come from the resolved run config -- all legacy constants on
+    an untouched frame, so the body is byte-identical to today on the
+    not-customized path.
+    """
+    run_config = ctx.run_config
+    return {
+        "store_id": ctx.store_id,
+        "product_id": ctx.product_id,
+        "start_date": start_date.isoformat(),
+        "end_date": end_date.isoformat(),
+        "config": {
+            "split_config": {
+                "strategy": run_config.strategy,
+                "n_splits": run_config.n_splits,
+                "min_train_size": run_config.min_train_size,
+                "gap": run_config.gap,
+                "horizon": run_config.horizon,
+            },
+            "model_config_main": _model_config_payload(model_type),
+            "include_baselines": include_baselines,
+            "store_fold_details": False,
+        },
+    }
+
+
 async def step_backtest(ctx: DemoContext, client: _Client) -> StepResult:
-    """Run scenario-aware backtest; pick the lowest-WAPE winner.
+    """Run scenario-aware backtest; pick the winner by the configured metric.
 
     PRP-38 — on SHOWCASE_RICH the main model is feature-aware
     (``prophet_like``); baselines come back in ``baseline_results`` (one call,
     ``include_baselines=true``) and the response carries per-horizon-bucket
     metrics in ``main_model_results.bucketed_aggregated_metrics``. On
     DEMO_MINIMAL the original 3-baseline-loop behaviour is preserved.
+
+    E4 (#410, D4) — when the operator supplied a custom run config
+    (``ctx.run_config.customized``), BOTH legacy branches give way to ONE
+    unified per-model loop over the selection (each ``include_baselines=False``);
+    on SHOWCASE_RICH ``prophet_like`` is appended if absent so the V2 story
+    (``step_v2_train`` registers it unconditionally) keeps a backtest entry, and
+    its call supplies the bucketed metrics. The winner is the best
+    ``ctx.run_config.metric`` (wape / mae / rmse).
     """
     if ctx.date_start is None or ctx.date_end is None:
         return ("fail", "no date range on ctx", {})
+    start_date = ctx.date_start
+    end_date = ctx.date_end
+    run_config = ctx.run_config
 
-    if ctx.scenario is ScenarioPreset.SHOWCASE_RICH:
-        body = await client.request(
-            f"backtest[{SHOWCASE_V2_MODEL_TYPE}]",
-            "POST",
-            "/backtesting/run",
-            json_body={
-                "store_id": ctx.store_id,
-                "product_id": ctx.product_id,
-                "start_date": ctx.date_start.isoformat(),
-                "end_date": ctx.date_end.isoformat(),
-                "config": {
-                    "split_config": {
-                        "strategy": "expanding",
-                        "n_splits": DEMO_BACKTEST_SPLITS,
-                        "min_train_size": DEMO_MIN_TRAIN_SIZE,
-                        "gap": 0,
-                        "horizon": DEMO_HORIZON,
-                    },
-                    "model_config_main": _model_config_payload(SHOWCASE_V2_MODEL_TYPE),
-                    "include_baselines": True,
-                    "store_fold_details": False,
-                },
-            },
-        )
-        main_results = body.get("main_model_results", {})
-        baseline_results = body.get("baseline_results") or []
-        main_metrics = _coerce_metric_dict(
-            main_results.get("aggregated_metrics") if isinstance(main_results, dict) else None
-        )
-        ctx.backtest_results[SHOWCASE_V2_MODEL_TYPE] = main_metrics
-        # baseline_results is list[ModelBacktestResult].
-        if isinstance(baseline_results, list):
-            for entry in baseline_results:
-                if not isinstance(entry, dict):
-                    continue
-                entry_type = entry.get("model_type")
-                if not isinstance(entry_type, str):
-                    continue
-                ctx.backtest_results[entry_type] = _coerce_metric_dict(
-                    entry.get("aggregated_metrics")
+    if not run_config.customized:
+        if ctx.scenario is ScenarioPreset.SHOWCASE_RICH:
+            body = await client.request(
+                f"backtest[{SHOWCASE_V2_MODEL_TYPE}]",
+                "POST",
+                "/backtesting/run",
+                json_body=_backtest_body(
+                    ctx, SHOWCASE_V2_MODEL_TYPE, start_date, end_date, include_baselines=True
+                ),
+            )
+            main_results = body.get("main_model_results", {})
+            baseline_results = body.get("baseline_results") or []
+            main_metrics = _coerce_metric_dict(
+                main_results.get("aggregated_metrics") if isinstance(main_results, dict) else None
+            )
+            ctx.backtest_results[SHOWCASE_V2_MODEL_TYPE] = main_metrics
+            # baseline_results is list[ModelBacktestResult].
+            if isinstance(baseline_results, list):
+                for entry in baseline_results:
+                    if not isinstance(entry, dict):
+                        continue
+                    entry_type = entry.get("model_type")
+                    if not isinstance(entry_type, str):
+                        continue
+                    ctx.backtest_results[entry_type] = _coerce_metric_dict(
+                        entry.get("aggregated_metrics")
+                    )
+            ctx.bucketed_aggregated_metrics = _coerce_bucketed_metrics(
+                main_results.get("bucketed_aggregated_metrics")
+                if isinstance(main_results, dict)
+                else None
+            )
+        else:
+            # DEMO_MINIMAL / SPARSE / others: loop over baselines (legacy path).
+            for model_type in DEMO_MODEL_TYPES:
+                body = await client.request(
+                    f"backtest[{model_type}]",
+                    "POST",
+                    "/backtesting/run",
+                    json_body=_backtest_body(
+                        ctx, model_type, start_date, end_date, include_baselines=False
+                    ),
+                )
+                main_results = body.get("main_model_results", {})
+                ctx.backtest_results[model_type] = _coerce_metric_dict(
+                    main_results.get("aggregated_metrics")
+                    if isinstance(main_results, dict)
+                    else None
                 )
-        ctx.bucketed_aggregated_metrics = _coerce_bucketed_metrics(
-            main_results.get("bucketed_aggregated_metrics")
-            if isinstance(main_results, dict)
-            else None
-        )
     else:
-        # DEMO_MINIMAL / SPARSE / others: loop over baselines (legacy path).
-        for model_type in DEMO_MODEL_TYPES:
+        # E4 (#410, D4) — unified per-model loop over the operator's selection.
+        models = list(run_config.model_types)
+        if ctx.scenario is ScenarioPreset.SHOWCASE_RICH and SHOWCASE_V2_MODEL_TYPE not in models:
+            models.append(SHOWCASE_V2_MODEL_TYPE)
+        for model_type in models:
             body = await client.request(
                 f"backtest[{model_type}]",
                 "POST",
                 "/backtesting/run",
-                json_body={
-                    "store_id": ctx.store_id,
-                    "product_id": ctx.product_id,
-                    "start_date": ctx.date_start.isoformat(),
-                    "end_date": ctx.date_end.isoformat(),
-                    "config": {
-                        "split_config": {
-                            "strategy": "expanding",
-                            "n_splits": DEMO_BACKTEST_SPLITS,
-                            "min_train_size": DEMO_MIN_TRAIN_SIZE,
-                            "gap": 0,
-                            "horizon": DEMO_HORIZON,
-                        },
-                        "model_config_main": _model_config_payload(model_type),
-                        "include_baselines": False,
-                        "store_fold_details": False,
-                    },
-                },
+                json_body=_backtest_body(
+                    ctx, model_type, start_date, end_date, include_baselines=False
+                ),
             )
             main_results = body.get("main_model_results", {})
             ctx.backtest_results[model_type] = _coerce_metric_dict(
                 main_results.get("aggregated_metrics") if isinstance(main_results, dict) else None
             )
+            if model_type == SHOWCASE_V2_MODEL_TYPE:
+                ctx.bucketed_aggregated_metrics = _coerce_bucketed_metrics(
+                    main_results.get("bucketed_aggregated_metrics")
+                    if isinstance(main_results, dict)
+                    else None
+                )
 
-    winner = _select_winner(ctx.backtest_results)
+    winner = _select_winner(ctx.backtest_results, run_config.metric)
     if winner is None:
-        return ("fail", "no model produced a usable WAPE (all NaN?)", {})
+        return ("fail", f"no model produced a usable {run_config.metric} (all NaN?)", {})
     ctx.winner_model_type, ctx.winner_wape = winner
     payload: dict[str, Any] = {
         "per_model": dict(ctx.backtest_results),
         "winner": ctx.winner_model_type,
         "winner_wape": ctx.winner_wape,
+        "metric": run_config.metric,
     }
     if ctx.bucketed_aggregated_metrics is not None:
         payload["bucketed_aggregated_metrics"] = ctx.bucketed_aggregated_metrics
     return (
         "pass",
         f"{len(ctx.backtest_results)} models, winner={ctx.winner_model_type} "
-        f"wape={ctx.winner_wape:.4f}",
+        f"{run_config.metric}={ctx.winner_wape:.4f}",
         payload,
     )
 
@@ -1105,7 +1241,8 @@ async def step_v2_train(ctx: DemoContext, client: _Client) -> StepResult:
     if ctx.date_start is None or ctx.date_end is None:
         return ("fail", "no date range on ctx", {})
     train_start = ctx.date_start
-    train_end = ctx.date_end - timedelta(days=DEMO_HORIZON)
+    # E4 (#410, D8) -- the configured horizon drives the modeling steps' tail.
+    train_end = ctx.date_end - timedelta(days=ctx.run_config.horizon)
 
     train_body = await client.request(
         "v2_train[train]",
@@ -2738,6 +2875,9 @@ async def run_pipeline(app: FastAPI, req: DemoRunRequest) -> AsyncIterator[StepE
         # E3 (#409) -- thread the validated start-frame config verbatim.
         seed_overrides=req.seed_overrides,
         user_scope=req.user_scope,
+        # E4 (#410) -- resolve the run config (selection + backtest) once;
+        # legacy defaults fill in the unspecified half.
+        run_config=_resolve_run_config(req),
     )
     # E1 (#390) -- create the workspace row BEFORE the first step executes so
     # even an early failure records the run config. create_workspace is
@@ -2857,5 +2997,22 @@ async def run_pipeline(app: FastAPI, req: DemoRunRequest) -> AsyncIterator[StepE
             # E1 (#390) -- additive; a string on preservation='keep' runs,
             # None otherwise (legacy clients ignore unknown keys).
             "workspace_id": ctx.workspace_id,
+            # E4 (#410) -- echo the resolved run config on customized runs so the
+            # FE can confirm what ran; None on legacy (default-config) runs.
+            "run_config": (
+                {
+                    "train_model_types": list(ctx.run_config.model_types),
+                    "backtest": {
+                        "horizon": ctx.run_config.horizon,
+                        "strategy": ctx.run_config.strategy,
+                        "n_splits": ctx.run_config.n_splits,
+                        "min_train_size": ctx.run_config.min_train_size,
+                        "gap": ctx.run_config.gap,
+                        "metric": ctx.run_config.metric,
+                    },
+                }
+                if ctx.run_config.customized
+                else None
+            ),
         },
     )
diff --git a/app/features/demo/tests/test_pipeline.py b/app/features/demo/tests/test_pipeline.py
index 1fc4c1b4..7862d5a2 100644
--- a/app/features/demo/tests/test_pipeline.py
+++ b/app/features/demo/tests/test_pipeline.py
@@ -16,7 +16,7 @@
 from fastapi import FastAPI
 
 from app.features.demo import pipeline
-from app.features.demo.schemas import DemoRunRequest, UserScope
+from app.features.demo.schemas import DemoBacktestConfig, DemoRunRequest, UserScope
 from app.shared.seeder.config import ScenarioPreset
 from app.shared.seeder.overrides import SeederOverrides
 
@@ -378,6 +378,9 @@ def _fake_settings(
     *,
     rag_embedding_provider: str = "openai",
     openai_api_key: str = "sk-test",
+    forecast_enable_lightgbm: bool = False,
+    forecast_enable_xgboost: bool = False,
+    forecast_enable_random_forest: bool = False,
 ) -> SimpleNamespace:
     """Fake settings: usable registry root, no agent LLM key (agent skips).
 
@@ -385,6 +388,11 @@ def _fake_settings(
     PRP-40 knowledge phase runs to completion in test fixtures; the
     knowledge-skip tests override via ``rag_embedding_provider="openai"`` +
     ``openai_api_key=""`` (or "ollama" with an unreachable canned probe).
+
+    E4 (#410) -- the ``forecast_enable_*`` flags default False (matching
+    app/core/config.py), so the legacy demo trio (all always-on) still trains;
+    step_train's disabled-model fail-fast path is exercised by selecting an
+    opt-in model while its flag is left (or explicitly set) False.
     """
     return SimpleNamespace(
         registry_artifact_root=registry_root,
@@ -393,6 +401,9 @@ def _fake_settings(
         openai_api_key=openai_api_key,
         google_api_key="",
         rag_embedding_provider=rag_embedding_provider,
+        forecast_enable_lightgbm=forecast_enable_lightgbm,
+        forecast_enable_xgboost=forecast_enable_xgboost,
+        forecast_enable_random_forest=forecast_enable_random_forest,
     )
 
 
@@ -416,6 +427,247 @@ def test_select_winner_none_when_no_usable_wape():
     assert pipeline._select_winner({"naive": {"wape": float("nan")}}) is None
 
 
+# =============================================================================
+# E4 (#410) -- run-config resolution, selection, split, metric, echo
+# =============================================================================
+
+
+def _ctx_for_step(
+    scenario: ScenarioPreset = ScenarioPreset.DEMO_MINIMAL,
+    run_config: pipeline.ResolvedRunConfig | None = None,
+) -> pipeline.DemoContext:
+    """A DemoContext positioned at the modeling phase (grain + window set)."""
+    ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False, scenario=scenario)
+    ctx.store_id = 7
+    ctx.product_id = 3
+    ctx.date_start = date(2024, 1, 1)
+    ctx.date_end = date(2024, 12, 31)
+    if run_config is not None:
+        ctx.run_config = run_config
+    return ctx
+
+
+def test_resolve_run_config_defaults_and_custom():
+    """E4 (#410) -- None/None -> legacy; partial configs fill the other half."""
+    legacy = pipeline._resolve_run_config(DemoRunRequest())
+    assert legacy.customized is False
+    assert legacy.model_types == pipeline.DEMO_MODEL_TYPES
+    assert legacy.horizon == pipeline.DEMO_HORIZON
+    assert legacy.n_splits == pipeline.DEMO_BACKTEST_SPLITS
+    assert legacy.min_train_size == pipeline.DEMO_MIN_TRAIN_SIZE
+    assert legacy.gap == 0
+    assert legacy.metric == "wape"
+
+    sel_only = pipeline._resolve_run_config(
+        DemoRunRequest(train_model_types=["naive", "seasonal_average"])
+    )
+    assert sel_only.customized is True
+    assert sel_only.model_types == ("naive", "seasonal_average")
+    assert sel_only.horizon == pipeline.DEMO_HORIZON  # backtest defaults stay legacy
+    assert sel_only.metric == "wape"
+
+    bt_only = pipeline._resolve_run_config(
+        DemoRunRequest(backtest=DemoBacktestConfig(horizon=21, n_splits=4, metric="rmse"))
+    )
+    assert bt_only.customized is True
+    assert bt_only.model_types == pipeline.DEMO_MODEL_TYPES  # selection stays legacy
+    assert bt_only.horizon == 21
+    assert bt_only.n_splits == 4
+    assert bt_only.metric == "rmse"
+
+
+def test_model_config_payload_minimal_fallback_for_all_known_types():
+    """E4 (#410) -- every KNOWN type resolves; explicit branches keep params."""
+    from app.shared.model_taxonomy import KNOWN_MODEL_TYPES
+
+    for mt in KNOWN_MODEL_TYPES:
+        assert pipeline._model_config_payload(mt)["model_type"] == mt
+    assert pipeline._model_config_payload("seasonal_naive") == {
+        "model_type": "seasonal_naive",
+        "season_length": 7,
+    }
+    assert pipeline._model_config_payload("moving_average") == {
+        "model_type": "moving_average",
+        "window_size": 7,
+    }
+    with pytest.raises(ValueError, match="Unsupported demo model_type"):
+        pipeline._model_config_payload("not_a_model")
+
+
+def test_select_winner_honors_metric():
+    """E4 (#410, D5) -- the metric param drives selection; NaN/missing skip."""
+    results = {
+        "naive": {"wape": 0.30, "mae": 5.0, "rmse": 9.0},
+        "seasonal_naive": {"wape": 0.12, "mae": 6.0, "rmse": 7.0},
+    }
+    assert pipeline._select_winner(results, "wape") == ("seasonal_naive", 0.12)
+    assert pipeline._select_winner(results, "mae") == ("naive", 5.0)
+    assert pipeline._select_winner(results, "rmse") == ("seasonal_naive", 7.0)
+    sparse = {"a": {"wape": 0.2}, "b": {"mae": 4.0}}
+    assert pipeline._select_winner(sparse, "mae") == ("b", 4.0)
+    nan = {"a": {"rmse": float("nan")}, "b": {"rmse": 3.0}}
+    assert pipeline._select_winner(nan, "rmse") == ("b", 3.0)
+
+
+async def test_step_train_trains_selected_models(monkeypatch, tmp_path):
+    """E4 (#410) -- step_train trains exactly the configured selection."""
+    monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
+    rc = pipeline._resolve_run_config(
+        DemoRunRequest(train_model_types=["naive", "seasonal_average"])
+    )
+    ctx = _ctx_for_step(run_config=rc)
+    rec = _RecordingClient(
+        None,
+        responses={("POST", "/forecasting/train"): {"model_path": "demo/x-model_abc.joblib"}},
+    )
+    status, _detail, data = await pipeline.step_train(ctx, _as_client(rec))
+    assert status == "pass"
+    assert set(ctx.train_results) == {"naive", "seasonal_average"}
+    assert data["requested_models"] == ["naive", "seasonal_average"]
+    posted = [
+        b["config"]["model_type"]
+        for (_m, p, b) in rec.calls
+        if p == "/forecasting/train" and b is not None
+    ]
+    assert sorted(posted) == ["naive", "seasonal_average"]
+
+
+async def test_step_train_fails_fast_on_disabled_flag(monkeypatch, tmp_path):
+    """E4 (#410, D6) -- a disabled opt-in model fails before any train POST."""
+    monkeypatch.setattr(
+        pipeline,
+        "get_settings",
+        lambda: _fake_settings(str(tmp_path / "reg"), forecast_enable_lightgbm=False),
+    )
+    rc = pipeline._resolve_run_config(DemoRunRequest(train_model_types=["naive", "lightgbm"]))
+    ctx = _ctx_for_step(run_config=rc)
+    rec = _RecordingClient(None, responses={("POST", "/forecasting/train"): {"model_path": "x"}})
+    status, detail, data = await pipeline.step_train(ctx, _as_client(rec))
+    assert status == "fail"
+    assert "forecast_enable" in detail
+    assert "lightgbm" in detail
+    assert data["disabled_models"] == ["lightgbm"]
+    assert rec.calls == []  # fail-fast: no train requests issued
+
+
+async def test_step_backtest_sends_configured_split_config():
+    """E4 (#410) -- the configured split + metric ride into POST /backtesting/run."""
+    rc = pipeline._resolve_run_config(
+        DemoRunRequest(
+            train_model_types=["naive", "seasonal_average"],
+            backtest=DemoBacktestConfig(
+                horizon=21, strategy="sliding", n_splits=4, min_train_size=40, gap=2, metric="rmse"
+            ),
+        )
+    )
+    ctx = _ctx_for_step(run_config=rc)
+    rec = _RecordingClient(
+        None,
+        responses={
+            ("POST", "/backtesting/run"): {
+                "main_model_results": {"aggregated_metrics": {"wape": 0.3, "mae": 5.0, "rmse": 9.0}}
+            }
+        },
+    )
+    status, detail, data = await pipeline.step_backtest(ctx, _as_client(rec))
+    assert status == "pass"
+    assert data["metric"] == "rmse"
+    bodies = [b for (_m, p, b) in rec.calls if p == "/backtesting/run" and b is not None]
+    assert len(bodies) == 2  # exactly the selected models, no separate baselines call
+    for body in bodies:
+        assert body["config"]["split_config"] == {
+            "strategy": "sliding",
+            "n_splits": 4,
+            "min_train_size": 40,
+            "gap": 2,
+            "horizon": 21,
+        }
+        assert body["config"]["include_baselines"] is False
+    assert detail.startswith("2 models") and "rmse=" in detail
+
+
+async def test_step_backtest_custom_selection_appends_prophet_like_on_showcase_rich():
+    """E4 (#410, D4) -- prophet_like is appended on showcase_rich custom runs."""
+    rc = pipeline._resolve_run_config(
+        DemoRunRequest(train_model_types=["naive", "seasonal_average"])
+    )
+    ctx = _ctx_for_step(scenario=ScenarioPreset.SHOWCASE_RICH, run_config=rc)
+    rec = _RecordingClient(
+        None,
+        responses={
+            ("POST", "/backtesting/run"): {
+                "main_model_results": {
+                    "aggregated_metrics": {"wape": 0.3},
+                    "bucketed_aggregated_metrics": {"h_1_7": {"wape": 0.25}},
+                }
+            }
+        },
+    )
+    status, _detail, _data = await pipeline.step_backtest(ctx, _as_client(rec))
+    assert status == "pass"
+    posted = [
+        b["config"]["model_config_main"]["model_type"]
+        for (_m, p, b) in rec.calls
+        if p == "/backtesting/run" and b is not None
+    ]
+    assert posted == ["naive", "seasonal_average", "prophet_like"]
+    # bucketed metrics captured from the prophet_like (V2) call.
+    assert ctx.bucketed_aggregated_metrics == {"h_1_7": {"wape": 0.25}}
+
+
+async def test_step_backtest_legacy_path_unchanged_when_not_customized():
+    """E4 (#410, D4) -- a non-customized run keeps the legacy 3-baseline loop."""
+    ctx = _ctx_for_step()  # demo_minimal, default (not customized) run_config
+    rec = _RecordingClient(
+        None,
+        responses={
+            ("POST", "/backtesting/run"): {
+                "main_model_results": {"aggregated_metrics": {"wape": 0.3}}
+            }
+        },
+    )
+    status, detail, data = await pipeline.step_backtest(ctx, _as_client(rec))
+    assert status == "pass"
+    bodies = [b for (_m, p, b) in rec.calls if p == "/backtesting/run" and b is not None]
+    posted = [b["config"]["model_config_main"]["model_type"] for b in bodies]
+    assert posted == list(pipeline.DEMO_MODEL_TYPES)
+    for body in bodies:
+        assert body["config"]["split_config"] == {
+            "strategy": "expanding",
+            "n_splits": pipeline.DEMO_BACKTEST_SPLITS,
+            "min_train_size": pipeline.DEMO_MIN_TRAIN_SIZE,
+            "gap": 0,
+            "horizon": pipeline.DEMO_HORIZON,
+        }
+        assert body["config"]["include_baselines"] is False
+    assert data["metric"] == "wape"
+    assert "wape=" in detail
+
+
+async def test_pipeline_complete_echoes_run_config(monkeypatch, tmp_path):
+    """E4 (#410) -- pipeline_complete echoes run_config on custom runs, None on legacy."""
+    artifact = tmp_path / "naive-model.joblib"
+    artifact.write_bytes(b"fake joblib artifact bytes")
+    registry_root = tmp_path / "registry"
+    monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(registry_root)))
+    wapes = {"naive": 0.30, "seasonal_average": 0.15}
+    monkeypatch.setattr(pipeline, "_Client", _build_fake_client(str(artifact), wapes))
+
+    req = DemoRunRequest(
+        train_model_types=["naive", "seasonal_average"],
+        backtest=DemoBacktestConfig(horizon=14, n_splits=3, metric="rmse"),
+    )
+    events = [e async for e in pipeline.run_pipeline(app=_FAKE_APP, req=req)]
+    final = events[-1]
+    assert final.event_type == "pipeline_complete"
+    assert final.data["run_config"] is not None
+    assert final.data["run_config"]["train_model_types"] == ["naive", "seasonal_average"]
+    assert final.data["run_config"]["backtest"]["metric"] == "rmse"
+
+    legacy = [e async for e in pipeline.run_pipeline(app=_FAKE_APP, req=DemoRunRequest())]
+    assert legacy[-1].data["run_config"] is None
+
+
 # =============================================================================
 # run_pipeline -- full green run
 # =============================================================================
diff --git a/app/features/model_selection/schemas.py b/app/features/model_selection/schemas.py
index f494882d..2ebb3482 100644
--- a/app/features/model_selection/schemas.py
+++ b/app/features/model_selection/schemas.py
@@ -426,6 +426,11 @@ class CandidateModelInfo(BaseModel):
     default_params: dict[str, Any]
     supports_auto_predict: bool  # False for feature-aware models (predict() rejects them)
     description: str
+    # E4 (#410) — runtime forecast_enable_* overlay; SERVICE-set (the pure
+    # capabilities.build_model_catalog leaves the default True). False exactly
+    # when the matching forecast_enable_{lightgbm,xgboost,random_forest} flag is
+    # off; the showcase model picker hides disabled opt-ins.
+    enabled: bool = True
 
 
 class ModelCatalogResponse(BaseModel):
diff --git a/app/features/model_selection/service.py b/app/features/model_selection/service.py
index 10220540..3f02eb3f 100644
--- a/app/features/model_selection/service.py
+++ b/app/features/model_selection/service.py
@@ -111,12 +111,28 @@ class ModelSelectionService:
     # -------------------------------------------------------------------------
 
     def get_model_catalog(self) -> ModelCatalogResponse:
-        """Return the backend-owned candidate-model catalog (static, no I/O).
+        """Return the backend-owned candidate-model catalog with the enabled overlay.
 
-        Thin pass-through to the pure :func:`capabilities.build_model_catalog`;
-        kept on the service for symmetry with ``get_availability`` / ``run``.
+        Thin orchestration over the pure :func:`capabilities.build_model_catalog`
+        (which stays I/O-free): the service overlays the runtime
+        ``forecast_enable_*`` flags onto each item's ``enabled`` field (E4 #410,
+        D3) so the showcase model picker can hide disabled opt-in models. Every
+        always-on model stays ``enabled=True``.
         """
-        return build_model_catalog()
+        base = build_model_catalog()
+        settings = get_settings()
+        flag_by_model = {
+            "lightgbm": settings.forecast_enable_lightgbm,
+            "xgboost": settings.forecast_enable_xgboost,
+            "random_forest": settings.forecast_enable_random_forest,
+        }
+        return ModelCatalogResponse(
+            models=[
+                model.model_copy(update={"enabled": flag_by_model.get(model.model_type, True)})
+                for model in base.models
+            ],
+            default_candidate_model_types=base.default_candidate_model_types,
+        )
 
     # -------------------------------------------------------------------------
     # Availability
diff --git a/app/features/model_selection/tests/test_capabilities.py b/app/features/model_selection/tests/test_capabilities.py
index 3ff73804..667beb84 100644
--- a/app/features/model_selection/tests/test_capabilities.py
+++ b/app/features/model_selection/tests/test_capabilities.py
@@ -34,6 +34,16 @@ def test_catalog_families_are_valid_literals() -> None:
         assert model.family in {"baseline", "tree", "additive"}
 
 
+def test_capabilities_stays_pure_default_enabled_true() -> None:
+    """E4 (#410, D3) -- the pure catalog leaves enabled=True (no settings read).
+
+    The forecast_enable_* overlay is the SERVICE's job; build_model_catalog
+    stays I/O-free, so every item carries the schema default.
+    """
+    for model in build_model_catalog().models:
+        assert model.enabled is True
+
+
 def test_requires_extra_flags_lightgbm_xgboost_only() -> None:
     """Only the opt-in extras (lightgbm/xgboost) carry requires_extra=True."""
     catalog = build_model_catalog()
diff --git a/app/features/model_selection/tests/test_service.py b/app/features/model_selection/tests/test_service.py
index 67f60a60..ab13d6a3 100644
--- a/app/features/model_selection/tests/test_service.py
+++ b/app/features/model_selection/tests/test_service.py
@@ -57,6 +57,51 @@ def _patch_availability(monkeypatch: pytest.MonkeyPatch, status: str) -> None:
     )
 
 
+# -----------------------------------------------------------------------------
+# E4 (#410) -- catalog enabled overlay
+# -----------------------------------------------------------------------------
+
+_OPT_IN_MODELS = {"lightgbm", "xgboost", "random_forest"}
+
+
+def _patch_catalog_settings(
+    monkeypatch: pytest.MonkeyPatch,
+    *,
+    lightgbm: bool = False,
+    xgboost: bool = False,
+    random_forest: bool = False,
+) -> None:
+    """Patch the service's get_settings with the three forecast_enable_* flags."""
+    settings = SimpleNamespace(
+        forecast_enable_lightgbm=lightgbm,
+        forecast_enable_xgboost=xgboost,
+        forecast_enable_random_forest=random_forest,
+    )
+    monkeypatch.setattr("app.features.model_selection.service.get_settings", lambda: settings)
+
+
+def test_catalog_enabled_false_when_flags_off(monkeypatch: pytest.MonkeyPatch) -> None:
+    """E4 (#410, D3) -- with all flags off the three opt-ins are disabled,
+    every always-on model stays enabled."""
+    _patch_catalog_settings(monkeypatch)  # all default False
+    catalog = ModelSelectionService().get_model_catalog()
+    by_type = {m.model_type: m.enabled for m in catalog.models}
+    for opt_in in _OPT_IN_MODELS:
+        assert by_type[opt_in] is False
+    for model_type, enabled in by_type.items():
+        if model_type not in _OPT_IN_MODELS:
+            assert enabled is True
+
+
+def test_catalog_enabled_true_when_flag_on(monkeypatch: pytest.MonkeyPatch) -> None:
+    """E4 (#410, D3) -- enabling a flag flips exactly that model to enabled."""
+    _patch_catalog_settings(monkeypatch, lightgbm=True)
+    by_type = {m.model_type: m.enabled for m in ModelSelectionService().get_model_catalog().models}
+    assert by_type["lightgbm"] is True
+    assert by_type["xgboost"] is False
+    assert by_type["random_forest"] is False
+
+
 # -----------------------------------------------------------------------------
 # Flattening
 # -----------------------------------------------------------------------------

From 061b85e06fc5330e44d92e0fae208a33abde2012 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 05:24:30 +0200
Subject: [PATCH 23/32] feat(ui): showcase run-config panel, preview, and
 replay wiring (#410)

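Reviewer note: a minimal sketch of the two picker/preview rules that the
RunConfigPanel tests in this patch pin down. The picker hides opt-in models
whose catalog entry has enabled=false, and the preview appends prophet_like
(V2) only on showcase_rich. The function names, the narrowed ScenarioPreset
union, and the de-duplication step are assumptions made for illustration;
they are not the helpers this patch adds in run-config-utils.ts.

```typescript
// Hypothetical sketch: names are illustrative, not the repo's actual API.
// The real ScenarioPreset type may carry more members than these two.
type ScenarioPreset = 'demo_minimal' | 'showcase_rich'

// Models a custom run would evaluate: the picked models in order, plus
// prophet_like (V2) on showcase_rich. Skipping the append when prophet_like
// is already selected is an assumption of this sketch.
function previewModels(selection: readonly string[], scenario: ScenarioPreset): string[] {
  const models = [...selection]
  if (scenario === 'showcase_rich' && !models.includes('prophet_like')) {
    models.push('prophet_like')
  }
  return models
}

// Keep only catalog entries the backend reports as enabled, i.e. drop opt-in
// models whose forecast_enable_* flag is off.
function visibleModels<T extends { enabled: boolean }>(catalog: readonly T[]): T[] {
  return catalog.filter((model) => model.enabled)
}
```

With selection ['naive'], the sketch yields naive plus prophet_like on
showcase_rich and naive alone on demo_minimal, matching the preview-chip
assertions in RunConfigPanel.test.tsx.
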
---
 .../candidate-model-picker.test.tsx           |   1 +
 .../demo/DemoBacktestSettingsForm.test.tsx    |  75 +++++++
 .../demo/DemoBacktestSettingsForm.tsx         | 199 ++++++++++++++++++
 .../components/demo/RunConfigPanel.test.tsx   | 102 +++++++++
 .../src/components/demo/RunConfigPanel.tsx    | 158 ++++++++++++++
 .../components/demo/WorkspacePanel.test.tsx   |  32 +++
 .../src/components/demo/WorkspacePanel.tsx    |  15 ++
 .../components/demo/replay-request.test.ts    |  26 +++
 .../src/components/demo/replay-request.ts     |   7 +
 .../components/demo/run-config-utils.test.ts  | 115 ++++++++++
 .../src/components/demo/run-config-utils.ts   | 153 ++++++++++++++
 .../src/hooks/use-model-selection.test.ts     |   1 +
 frontend/src/pages/showcase.tsx               |  46 +++-
 .../src/pages/visualize/champion.test.tsx     |   2 +
 frontend/src/types/api.ts                     |  28 +++
 15 files changed, 958 insertions(+), 2 deletions(-)
 create mode 100644 frontend/src/components/demo/DemoBacktestSettingsForm.test.tsx
 create mode 100644 frontend/src/components/demo/DemoBacktestSettingsForm.tsx
 create mode 100644 frontend/src/components/demo/RunConfigPanel.test.tsx
 create mode 100644 frontend/src/components/demo/RunConfigPanel.tsx
 create mode 100644 frontend/src/components/demo/run-config-utils.test.ts
 create mode 100644 frontend/src/components/demo/run-config-utils.ts

diff --git a/frontend/src/components/champion-selector/candidate-model-picker.test.tsx b/frontend/src/components/champion-selector/candidate-model-picker.test.tsx
index 8c7d171d..0b790117 100644
--- a/frontend/src/components/champion-selector/candidate-model-picker.test.tsx
+++ b/frontend/src/components/champion-selector/candidate-model-picker.test.tsx
@@ -18,6 +18,7 @@ function model(
     default_params: {},
     supports_auto_predict: true,
     description: `desc ${model_type}`,
+    enabled: true,
     ...overrides,
   }
 }
diff --git a/frontend/src/components/demo/DemoBacktestSettingsForm.test.tsx b/frontend/src/components/demo/DemoBacktestSettingsForm.test.tsx
new file mode 100644
index 00000000..59115ee3
--- /dev/null
+++ b/frontend/src/components/demo/DemoBacktestSettingsForm.test.tsx
@@ -0,0 +1,75 @@
+import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import { DemoBacktestSettingsForm } from './DemoBacktestSettingsForm'
+import { DEFAULT_BACKTEST } from './run-config-utils'
+import type { DemoBacktestConfig } from '@/types/api'
+
+// Radix primitives need a few layout APIs jsdom lacks.
+beforeAll(() => {
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+  if (!Element.prototype.hasPointerCapture) {
+    Element.prototype.hasPointerCapture = () => false
+  }
+  if (!Element.prototype.scrollIntoView) {
+    Element.prototype.scrollIntoView = () => {}
+  }
+})
+
+afterEach(cleanup)
+
+describe('DemoBacktestSettingsForm', () => {
+  it('edits the horizon and calls onChange', () => {
+    const onChange = vi.fn()
+    render(
+      <DemoBacktestSettingsForm
+        value={{ ...DEFAULT_BACKTEST }}
+        scenario="demo_minimal"
+        onChange={onChange}
+      />,
+    )
+    fireEvent.change(screen.getByTestId('demo-settings-horizon'), {
+      target: { value: '21' },
+    })
+    expect(onChange).toHaveBeenCalledWith(
+      expect.objectContaining({ horizon: 21 } satisfies Partial<DemoBacktestConfig>),
+    )
+  })
+
+  it('shows the split-fit warning when the split exceeds the window', () => {
+    render(
+      <DemoBacktestSettingsForm
+        value={{ ...DEFAULT_BACKTEST, horizon: 28, n_splits: 5, min_train_size: 60 }}
+        scenario="demo_minimal"
+        onChange={() => {}}
+      />,
+    )
+    expect(screen.getByTestId('demo-split-fit-warning')).toBeTruthy()
+  })
+
+  it('hides the warning for a fitting split', () => {
+    render(
+      <DemoBacktestSettingsForm
+        value={{ ...DEFAULT_BACKTEST }}
+        scenario="demo_minimal"
+        onChange={() => {}}
+      />,
+    )
+    expect(screen.queryByTestId('demo-split-fit-warning')).toBeNull()
+  })
+
+  it('surfaces a split validation error (gap >= horizon)', () => {
+    render(
+      <DemoBacktestSettingsForm
+        value={{ ...DEFAULT_BACKTEST, horizon: 5, gap: 5 }}
+        scenario="showcase_rich"
+        onChange={() => {}}
+      />,
+    )
+    expect(screen.getByTestId('demo-settings-errors')).toBeTruthy()
+  })
+})
diff --git a/frontend/src/components/demo/DemoBacktestSettingsForm.tsx b/frontend/src/components/demo/DemoBacktestSettingsForm.tsx
new file mode 100644
index 00000000..2ff8a255
--- /dev/null
+++ b/frontend/src/components/demo/DemoBacktestSettingsForm.tsx
@@ -0,0 +1,199 @@
+import { useState } from 'react'
+import { ChevronDown, Settings2 } from 'lucide-react'
+import { Button } from '@/components/ui/button'
+import {
+  Collapsible,
+  CollapsibleContent,
+  CollapsibleTrigger,
+} from '@/components/ui/collapsible'
+import { Input } from '@/components/ui/input'
+import {
+  Select,
+  SelectContent,
+  SelectItem,
+  SelectTrigger,
+  SelectValue,
+} from '@/components/ui/select'
+import { splitConfigErrors } from '@/components/champion-selector/split-config'
+import { cn } from '@/lib/utils'
+import type {
+  DemoBacktestConfig,
+  DemoRankingMetric,
+  ScenarioPreset,
+  SplitStrategy,
+} from '@/types/api'
+import { splitFitWarning } from './run-config-utils'
+
+interface DemoBacktestSettingsFormProps {
+  value: DemoBacktestConfig
+  scenario: ScenarioPreset
+  onChange: (next: DemoBacktestConfig) => void
+  disabled?: boolean
+}
+
+const RANKING_METRICS: { value: DemoRankingMetric; label: string }[] = [
+  { value: 'wape', label: 'WAPE (default)' },
+  { value: 'mae', label: 'MAE' },
+  { value: 'rmse', label: 'RMSE' },
+]
+
+function Field({
+  label,
+  children,
+  hint,
+}: {
+  label: string
+  children: React.ReactNode
+  hint?: string
+}) {
+  return (
+    <div className="space-y-1">
+      <span className="text-xs text-muted-foreground">{label}</span>
+      {children}
+      {hint && <p className="text-[11px] text-muted-foreground">{hint}</p>}
+    </div>
+  )
+}
+
+/**
+ * E4 (#410) — showcase backtest-settings form. Mirrors the champion selector's
+ * BacktestSettingsForm, with two intentional differences: the horizon is
+ * EDITABLE here (the showcase drives its own horizon), and the metric list is
+ * WAPE/MAE/RMSE (issue #410). A non-blocking split-fit warning surfaces when
+ * the chosen split cannot fit the scenario's seeded window.
+ */
+export function DemoBacktestSettingsForm({
+  value,
+  scenario,
+  onChange,
+  disabled = false,
+}: DemoBacktestSettingsFormProps) {
+  const [advancedOpen, setAdvancedOpen] = useState(false)
+  const errors = splitConfigErrors(value)
+  const fitWarning = splitFitWarning(value, scenario)
+
+  function patch(partial: Partial<DemoBacktestConfig>) {
+    onChange({ ...value, ...partial })
+  }
+
+  return (
+    <div className="space-y-4" data-testid="demo-backtest-settings-form">
+      <div className="grid grid-cols-1 gap-4 sm:grid-cols-2">
+        <Field label="Ranking metric" hint="Lower is better. Picks the winning model.">
+          <Select
+            value={value.metric}
+            onValueChange={(metric) => patch({ metric: metric as DemoRankingMetric })}
+            disabled={disabled}
+          >
+            <SelectTrigger data-testid="demo-ranking-metric-select">
+              <SelectValue />
+            </SelectTrigger>
+            <SelectContent>
+              {RANKING_METRICS.map((metric) => (
+                <SelectItem key={metric.value} value={metric.value}>
+                  {metric.label}
+                </SelectItem>
+              ))}
+            </SelectContent>
+          </Select>
+        </Field>
+        <Field label="Horizon (1–90 days)" hint="Forecast length each fold evaluates.">
+          <Input
+            type="number"
+            min={1}
+            max={90}
+            value={String(value.horizon)}
+            data-testid="demo-settings-horizon"
+            disabled={disabled}
+            onChange={(event) => patch({ horizon: Number(event.target.value) || 0 })}
+          />
+        </Field>
+      </div>
+
+      <Collapsible open={advancedOpen} onOpenChange={setAdvancedOpen}>
+        <CollapsibleTrigger asChild>
+          <Button
+            type="button"
+            variant="ghost"
+            size="sm"
+            data-testid="demo-advanced-toggle"
+            disabled={disabled}
+          >
+            <Settings2 className="mr-2 h-4 w-4" />
+            Advanced split settings
+            <ChevronDown
+              className={cn('ml-2 h-4 w-4 transition-transform', advancedOpen && 'rotate-180')}
+            />
+          </Button>
+        </CollapsibleTrigger>
+        <CollapsibleContent className="pt-3">
+          <div className="grid grid-cols-1 gap-4 sm:grid-cols-2 lg:grid-cols-4">
+            <Field label="Strategy">
+              <Select
+                value={value.strategy}
+                onValueChange={(strategy) => patch({ strategy: strategy as SplitStrategy })}
+                disabled={disabled}
+              >
+                <SelectTrigger data-testid="demo-settings-strategy">
+                  <SelectValue />
+                </SelectTrigger>
+                <SelectContent>
+                  <SelectItem value="expanding">Expanding</SelectItem>
+                  <SelectItem value="sliding">Sliding</SelectItem>
+                </SelectContent>
+              </Select>
+            </Field>
+            <Field label="Splits (2–20)">
+              <Input
+                type="number"
+                min={2}
+                max={20}
+                value={String(value.n_splits)}
+                data-testid="demo-settings-n-splits"
+                disabled={disabled}
+                onChange={(event) => patch({ n_splits: Number(event.target.value) || 0 })}
+              />
+            </Field>
+            <Field label="Min train (≥7d)">
+              <Input
+                type="number"
+                min={7}
+                value={String(value.min_train_size)}
+                data-testid="demo-settings-min-train"
+                disabled={disabled}
+                onChange={(event) => patch({ min_train_size: Number(event.target.value) || 0 })}
+              />
+            </Field>
+            <Field label="Gap (0–30d)">
+              <Input
+                type="number"
+                min={0}
+                max={30}
+                value={String(value.gap)}
+                data-testid="demo-settings-gap"
+                disabled={disabled}
+                onChange={(event) => patch({ gap: Number(event.target.value) || 0 })}
+              />
+            </Field>
+          </div>
+        </CollapsibleContent>
+      </Collapsible>
+
+      {errors.length > 0 && (
+        <ul className="space-y-0.5" data-testid="demo-settings-errors">
+          {errors.map((error) => (
+            <li key={error} className="text-xs text-destructive">
+              {error}
+            </li>
+          ))}
+        </ul>
+      )}
+
+      {fitWarning && (
+        <p className="text-xs text-amber-600 dark:text-amber-500" data-testid="demo-split-fit-warning">
+          {fitWarning}
+        </p>
+      )}
+    </div>
+  )
+}
diff --git a/frontend/src/components/demo/RunConfigPanel.test.tsx b/frontend/src/components/demo/RunConfigPanel.test.tsx
new file mode 100644
index 00000000..b58bc934
--- /dev/null
+++ b/frontend/src/components/demo/RunConfigPanel.test.tsx
@@ -0,0 +1,102 @@
+import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen } from '@testing-library/react'
+import type { ModelCatalogResponse } from '@/types/api'
+
+// Radix primitives need a few layout APIs jsdom lacks.
+beforeAll(() => {
+  class ResizeObserverStub {
+    observe() {}
+    unobserve() {}
+    disconnect() {}
+  }
+  vi.stubGlobal('ResizeObserver', ResizeObserverStub)
+  if (!Element.prototype.hasPointerCapture) {
+    Element.prototype.hasPointerCapture = () => false
+  }
+  if (!Element.prototype.scrollIntoView) {
+    Element.prototype.scrollIntoView = () => {}
+  }
+})
+
+// A catalog with one DISABLED opt-in model (lightgbm) — the picker must hide it.
+const CATALOG: ModelCatalogResponse = {
+  models: [
+    {
+      model_type: 'naive',
+      label: 'Naive',
+      family: 'baseline',
+      feature_aware: false,
+      requires_extra: false,
+      default_params: {},
+      supports_auto_predict: true,
+      description: 'baseline',
+      enabled: true,
+    },
+    {
+      model_type: 'lightgbm',
+      label: 'LightGBM',
+      family: 'tree',
+      feature_aware: true,
+      requires_extra: true,
+      default_params: {},
+      supports_auto_predict: false,
+      description: 'opt-in',
+      enabled: false,
+    },
+  ],
+  default_candidate_model_types: ['naive'],
+}
+
+vi.mock('@/hooks/use-model-selection', () => ({
+  useModelCatalog: () => ({ data: CATALOG, isLoading: false, isError: false, error: null }),
+}))
+
+import { RunConfigPanel } from './RunConfigPanel'
+import { DEFAULT_BACKTEST, DEFAULT_TRAIN_MODELS } from './run-config-utils'
+
+afterEach(cleanup)
+
+function renderPanel(overrides: Partial<React.ComponentProps<typeof RunConfigPanel>> = {}) {
+  const onSelectionChange = vi.fn()
+  const onBacktestChange = vi.fn()
+  render(
+    <RunConfigPanel
+      scenario="demo_minimal"
+      selection={['naive']}
+      onSelectionChange={onSelectionChange}
+      backtest={{ ...DEFAULT_BACKTEST }}
+      onBacktestChange={onBacktestChange}
+      {...overrides}
+    />,
+  )
+  // The panel is collapsed by default; open it to render the inner controls.
+  fireEvent.click(screen.getByTestId('run-config-toggle'))
+  return { onSelectionChange, onBacktestChange }
+}
+
+describe('RunConfigPanel', () => {
+  it('hides opt-in models whose flag is off (enabled=false)', () => {
+    renderPanel()
+    expect(screen.getByTestId('candidate-model-naive')).toBeTruthy()
+    expect(screen.queryByTestId('candidate-model-lightgbm')).toBeNull()
+  })
+
+  it('appends prophet_like (V2) to the preview only on showcase_rich', () => {
+    renderPanel({ scenario: 'showcase_rich', selection: ['naive'] })
+    expect(screen.getByTestId('preview-chip-naive')).toBeTruthy()
+    expect(screen.getByTestId('preview-chip-prophet_like')).toBeTruthy()
+  })
+
+  it('does not append prophet_like on demo_minimal', () => {
+    renderPanel({ scenario: 'demo_minimal', selection: ['naive'] })
+    expect(screen.getByTestId('preview-chip-naive')).toBeTruthy()
+    expect(screen.queryByTestId('preview-chip-prophet_like')).toBeNull()
+  })
+
+  it('reset restores the default selection + backtest', () => {
+    const { onSelectionChange, onBacktestChange } = renderPanel({ selection: ['naive'] })
+    fireEvent.click(screen.getByTestId('run-config-reset'))
+    expect(onSelectionChange).toHaveBeenCalledWith(DEFAULT_TRAIN_MODELS)
+    expect(onBacktestChange).toHaveBeenCalledWith(DEFAULT_BACKTEST)
+  })
+})
diff --git a/frontend/src/components/demo/RunConfigPanel.tsx b/frontend/src/components/demo/RunConfigPanel.tsx
new file mode 100644
index 00000000..c36a18b6
--- /dev/null
+++ b/frontend/src/components/demo/RunConfigPanel.tsx
@@ -0,0 +1,158 @@
+import { useMemo, useState } from 'react'
+import { ChevronDown, RotateCcw, SlidersHorizontal } from 'lucide-react'
+import { CandidateModelPicker } from '@/components/champion-selector/candidate-model-picker'
+import { useModelCatalog } from '@/hooks/use-model-selection'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import {
+  Collapsible,
+  CollapsibleContent,
+  CollapsibleTrigger,
+} from '@/components/ui/collapsible'
+import { cn } from '@/lib/utils'
+import type {
+  DemoBacktestConfig,
+  ModelCatalogResponse,
+  ModelFamily,
+  ScenarioPreset,
+} from '@/types/api'
+import { DemoBacktestSettingsForm } from './DemoBacktestSettingsForm'
+import {
+  DEFAULT_BACKTEST,
+  DEFAULT_TRAIN_MODELS,
+  buildTrainPlan,
+  isDefaultBacktest,
+  isDefaultSelection,
+} from './run-config-utils'
+
+interface RunConfigPanelProps {
+  scenario: ScenarioPreset
+  disabled?: boolean
+  selection: string[]
+  onSelectionChange: (models: string[]) => void
+  backtest: DemoBacktestConfig
+  onBacktestChange: (cfg: DemoBacktestConfig) => void
+}
+
+/**
+ * E4 (#410) — collapsible "Run configuration (advanced)" section on /showcase.
+ *
+ * Composes the reused CandidateModelPicker (fed an enabled-filtered catalog so
+ * disabled opt-in models are hidden), the DemoBacktestSettingsForm, a Reset
+ * button, and a train-candidate preview chip list. Collapsed by default. An
+ * untouched run still sends a byte-identical legacy frame; that guarantee comes
+ * from the dirty-only rule in showcase.tsx, not from the collapsed state.
+ */
+export function RunConfigPanel({
+  scenario,
+  disabled = false,
+  selection,
+  onSelectionChange,
+  backtest,
+  onBacktestChange,
+}: RunConfigPanelProps) {
+  const [open, setOpen] = useState(false)
+  const { data: catalog, isLoading } = useModelCatalog()
+
+  // Hide opt-in models whose forecast_enable_* flag is off (model.enabled is false).
+  const enabledCatalog: ModelCatalogResponse | undefined = useMemo(() => {
+    if (!catalog) return undefined
+    return { ...catalog, models: catalog.models.filter((m) => m.enabled) }
+  }, [catalog])
+
+  const familyByType = useMemo(() => {
+    const map: Record<string, ModelFamily> = {}
+    for (const m of catalog?.models ?? []) map[m.model_type] = m.family
+    return map
+  }, [catalog])
+
+  const plan = useMemo(
+    () => buildTrainPlan(selection, scenario, familyByType),
+    [selection, scenario, familyByType],
+  )
+
+  const isCustomized = !isDefaultSelection(selection) || !isDefaultBacktest(backtest)
+
+  function reset() {
+    onSelectionChange([...DEFAULT_TRAIN_MODELS])
+    onBacktestChange({ ...DEFAULT_BACKTEST })
+  }
+
+  return (
+    <Collapsible open={open} onOpenChange={setOpen} data-testid="run-config-panel">
+      <CollapsibleTrigger asChild>
+        <Button
+          type="button"
+          variant="ghost"
+          size="sm"
+          data-testid="run-config-toggle"
+          disabled={disabled}
+        >
+          <SlidersHorizontal className="mr-2 h-4 w-4" />
+          Run configuration (advanced)
+          {isCustomized && (
+            <Badge variant="secondary" className="ml-2" data-testid="run-config-custom-badge">
+              customized
+            </Badge>
+          )}
+          <ChevronDown
+            className={cn('ml-2 h-4 w-4 transition-transform', open && 'rotate-180')}
+          />
+        </Button>
+      </CollapsibleTrigger>
+      <CollapsibleContent className="space-y-6 pt-4">
+        <div className="space-y-2">
+          <div className="flex items-center justify-between">
+            <p className="text-sm font-medium">Models to train</p>
+            <Button
+              type="button"
+              variant="ghost"
+              size="sm"
+              data-testid="run-config-reset"
+              onClick={reset}
+              disabled={disabled}
+            >
+              <RotateCcw className="mr-2 h-4 w-4" />
+              Reset to defaults
+            </Button>
+          </div>
+          <CandidateModelPicker
+            catalog={enabledCatalog}
+            selected={selection}
+            onChange={onSelectionChange}
+            isLoading={isLoading}
+          />
+        </div>
+
+        <div className="space-y-2">
+          <p className="text-sm font-medium">Backtest settings</p>
+          <DemoBacktestSettingsForm
+            value={backtest}
+            scenario={scenario}
+            onChange={onBacktestChange}
+            disabled={disabled}
+          />
+        </div>
+
+        <div className="space-y-2" data-testid="train-candidate-preview">
+          <p className="text-sm font-medium">
+            Will train {plan.length} model{plan.length === 1 ? '' : 's'}
+          </p>
+          <div className="flex flex-wrap gap-1.5">
+            {plan.map((entry) => (
+              <Badge
+                key={entry.model_type}
+                variant={entry.v2 ? 'default' : 'outline'}
+                data-testid={`preview-chip-${entry.model_type}`}
+              >
+                {entry.model_type}
+                {entry.v2 ? ' (V2)' : ''}
+                {entry.family ? ` · ${entry.family}` : ''}
+              </Badge>
+            ))}
+          </div>
+        </div>
+      </CollapsibleContent>
+    </Collapsible>
+  )
+}
diff --git a/frontend/src/components/demo/WorkspacePanel.test.tsx b/frontend/src/components/demo/WorkspacePanel.test.tsx
index 9f0ae9be..f1fa2d25 100644
--- a/frontend/src/components/demo/WorkspacePanel.test.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.test.tsx
@@ -44,6 +44,7 @@ const baseItem: WorkspaceListItem = {
   replayed_from_workspace_id: null,
   seed_overrides: null,
   user_scope: null,
+  run_config: null,
 }
 
 const secondItem: WorkspaceListItem = {
@@ -150,6 +151,37 @@ describe('WorkspacePanel', () => {
     expect(container.textContent).toContain('DESTRUCTIVE')
   })
 
+  it('renders the custom-config badge only when run_config is set (E4 #410)', () => {
+    // Default-config row: no badge.
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const plain = renderPanel()
+    expect(plain.container.querySelector('[data-testid="run-config-summary-badge"]')).toBeNull()
+    cleanup()
+
+    // Custom-config row: badge with the compact summary.
+    mockResponse = {
+      data: {
+        workspaces: [
+          {
+            ...baseItem,
+            run_config: {
+              train_model_types: ['naive', 'regression', 'prophet_like', 'seasonal_average'],
+              backtest: { horizon: 21, n_splits: 4, metric: 'rmse' },
+            },
+          },
+        ],
+        total: 1,
+      },
+      isLoading: false,
+    }
+    const custom = renderPanel()
+    const badge = custom.container.querySelector('[data-testid="run-config-summary-badge"]')
+    expect(badge).not.toBeNull()
+    expect(badge!.textContent).toContain('4 models')
+    expect(badge!.textContent).toContain('rmse')
+    expect(badge!.textContent).toContain('4×h21')
+  })
+
   it('invokes onLoad / onRequestReplay with the list item — replay never starts here', () => {
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
     const onLoad = vi.fn()
diff --git a/frontend/src/components/demo/WorkspacePanel.tsx b/frontend/src/components/demo/WorkspacePanel.tsx
index 1fa62fe6..fe931421 100644
--- a/frontend/src/components/demo/WorkspacePanel.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.tsx
@@ -69,6 +69,7 @@ import { ROUTES } from '@/lib/constants'
 import { cn } from '@/lib/utils'
 import type { WorkspaceListItem, WorkspaceListParams } from '@/types/api'
 import { WorkspaceEditDialog } from './WorkspaceEditDialog'
+import { parseRunConfig } from './run-config-utils'
 
 interface WorkspacePanelProps {
   /** Called when the operator clicks Load — restore config + artifacts, no run. */
@@ -115,6 +116,15 @@ function labelOf(ws: WorkspaceListItem): string {
   return ws.name ?? ws.workspace_id.slice(0, 8)
 }
 
+// E4 (#410) — compact run-config summary ("4 models · rmse · 4×h21"); null on default config.
+function runConfigSummary(ws: WorkspaceListItem): string | null {
+  const parsed = parseRunConfig(ws.run_config)
+  if (!parsed) return null
+  const { trainModels, backtest } = parsed
+  const plural = trainModels.length === 1 ? '' : 's'
+  return `${trainModels.length} model${plural} · ${backtest.metric} · ${backtest.n_splits}×h${backtest.horizon}`
+}
+
 export function WorkspacePanel({
   onLoad,
   onRequestReplay,
@@ -353,6 +363,11 @@ export function WorkspacePanel({
                     <span className="font-semibold">{labelOf(ws)}</span>
                     {ws.archived && <Badge variant="outline">archived</Badge>}
                     {ws.replayed_from_workspace_id && <Badge variant="outline">replay</Badge>}
+                    {runConfigSummary(ws) && (
+                      <Badge variant="secondary" data-testid="run-config-summary-badge">
+                        custom: {runConfigSummary(ws)}
+                      </Badge>
+                    )}
                     <span className="rounded bg-muted px-2 py-0.5">{ws.scenario}</span>
                     <span>seed {ws.seed}</span>
                     <span className={statusClass(ws.status)}>{ws.status.toUpperCase()}</span>
diff --git a/frontend/src/components/demo/replay-request.test.ts b/frontend/src/components/demo/replay-request.test.ts
index 5b65a677..c375ece4 100644
--- a/frontend/src/components/demo/replay-request.test.ts
+++ b/frontend/src/components/demo/replay-request.test.ts
@@ -18,6 +18,7 @@ const baseItem: WorkspaceListItem = {
   replayed_from_workspace_id: null,
   seed_overrides: null,
   user_scope: null,
+  run_config: null,
 }
 
 describe('buildReplayRequest', () => {
@@ -46,6 +47,31 @@ describe('buildReplayRequest', () => {
     expect('user_scope' in request).toBe(false)
   })
 
+  // E4 (#410) — replay-verbatim covers the recorded run config.
+  it('omits run-config keys on a default-config row (null run_config)', () => {
+    const request = buildReplayRequest(baseItem)
+    expect('train_model_types' in request).toBe(false)
+    expect('backtest' in request).toBe(false)
+  })
+
+  it('re-submits recorded run_config (model set + backtest) verbatim', () => {
+    const configured: WorkspaceListItem = {
+      ...baseItem,
+      run_config: {
+        train_model_types: ['naive', 'seasonal_average'],
+        backtest: { horizon: 21, n_splits: 4, metric: 'rmse' },
+      },
+    }
+    const request = buildReplayRequest(configured)
+    expect(request.train_model_types).toEqual(['naive', 'seasonal_average'])
+    expect(request.backtest?.horizon).toBe(21)
+    expect(request.backtest?.n_splits).toBe(4)
+    expect(request.backtest?.metric).toBe('rmse')
+    // Missing knobs are filled from the defaults (verbatim-complete frame).
+    expect(request.backtest?.strategy).toBe('expanding')
+    expect(request.backtest?.min_train_size).toBe(30)
+  })
+
   it('re-submits recorded seed_overrides and user_scope verbatim', () => {
     const slotted: WorkspaceListItem = {
       ...baseItem,
diff --git a/frontend/src/components/demo/replay-request.ts b/frontend/src/components/demo/replay-request.ts
index 51590be5..aab00732 100644
--- a/frontend/src/components/demo/replay-request.ts
+++ b/frontend/src/components/demo/replay-request.ts
@@ -1,4 +1,5 @@
 import type { DemoRunRequest, WorkspaceListItem } from '@/types/api'
+import { parseRunConfig } from './run-config-utils'
 
 /**
  * E2 (#408) — the EXACT request a confirmed replay sends. Single source for
@@ -6,6 +7,9 @@ import type { DemoRunRequest, WorkspaceListItem } from '@/types/api'
  * the preview can never lie about what goes on the wire.
  */
 export function buildReplayRequest(ws: WorkspaceListItem): DemoRunRequest {
+  // E4 (#410) — replay-verbatim covers the recorded run config; null on
+  // default-config rows, so their replay frame stays byte-identical.
+  const runConfig = parseRunConfig(ws.run_config)
   return {
     seed: ws.seed,
     scenario: ws.scenario,
@@ -19,5 +23,8 @@ export function buildReplayRequest(ws: WorkspaceListItem): DemoRunRequest {
     // legacy rows (null) so their replay frame stays byte-identical.
     ...(ws.seed_overrides ? { seed_overrides: ws.seed_overrides } : {}),
     ...(ws.user_scope ? { user_scope: ws.user_scope } : {}),
+    ...(runConfig
+      ? { train_model_types: runConfig.trainModels, backtest: runConfig.backtest }
+      : {}),
   }
 }
diff --git a/frontend/src/components/demo/run-config-utils.test.ts b/frontend/src/components/demo/run-config-utils.test.ts
new file mode 100644
index 00000000..9bee6f14
--- /dev/null
+++ b/frontend/src/components/demo/run-config-utils.test.ts
@@ -0,0 +1,115 @@
+import { describe, expect, it } from 'vitest'
+import {
+  DEFAULT_BACKTEST,
+  DEFAULT_TRAIN_MODELS,
+  buildTrainPlan,
+  isDefaultBacktest,
+  isDefaultSelection,
+  parseRunConfig,
+  splitFitWarning,
+  windowDaysFor,
+} from './run-config-utils'
+import type { DemoBacktestConfig } from '@/types/api'
+
+describe('isDefaultSelection', () => {
+  it('is true for the default trio regardless of order', () => {
+    expect(isDefaultSelection([...DEFAULT_TRAIN_MODELS])).toBe(true)
+    expect(isDefaultSelection(['moving_average', 'naive', 'seasonal_naive'])).toBe(true)
+  })
+
+  it('is false for any other selection', () => {
+    expect(isDefaultSelection(['naive'])).toBe(false)
+    expect(isDefaultSelection(['naive', 'seasonal_naive', 'regression'])).toBe(false)
+  })
+})
+
+describe('isDefaultBacktest', () => {
+  it('is true for the default config', () => {
+    expect(isDefaultBacktest({ ...DEFAULT_BACKTEST })).toBe(true)
+  })
+
+  it('is false when any knob differs', () => {
+    expect(isDefaultBacktest({ ...DEFAULT_BACKTEST, metric: 'rmse' })).toBe(false)
+    expect(isDefaultBacktest({ ...DEFAULT_BACKTEST, horizon: 21 })).toBe(false)
+  })
+})
+
+describe('buildTrainPlan', () => {
+  it('returns the selection verbatim on non-showcase scenarios', () => {
+    const plan = buildTrainPlan(['naive', 'seasonal_average'], 'demo_minimal')
+    expect(plan.map((p) => p.model_type)).toEqual(['naive', 'seasonal_average'])
+    expect(plan.some((p) => p.v2)).toBe(false)
+  })
+
+  it('appends prophet_like (V2) on showcase_rich when absent', () => {
+    const plan = buildTrainPlan(['naive'], 'showcase_rich')
+    expect(plan.map((p) => p.model_type)).toEqual(['naive', 'prophet_like'])
+    expect(plan[1].v2).toBe(true)
+  })
+
+  it('does not double-append prophet_like when already selected', () => {
+    const plan = buildTrainPlan(['prophet_like', 'naive'], 'showcase_rich')
+    expect(plan.map((p) => p.model_type)).toEqual(['prophet_like', 'naive'])
+  })
+
+  it('tags each chip with its family from the catalog map', () => {
+    const plan = buildTrainPlan(['naive'], 'demo_minimal', { naive: 'baseline' })
+    expect(plan[0].family).toBe('baseline')
+  })
+})
+
+describe('windowDaysFor', () => {
+  it('returns 92 for the short-window presets', () => {
+    expect(windowDaysFor('demo_minimal')).toBe(92)
+    expect(windowDaysFor('sparse')).toBe(92)
+    expect(windowDaysFor('holiday_rush')).toBe(92)
+  })
+
+  it('returns 180 for the rich-window presets', () => {
+    expect(windowDaysFor('showcase_rich')).toBe(180)
+    expect(windowDaysFor('retail_standard')).toBe(180)
+  })
+})
+
+describe('splitFitWarning', () => {
+  it('returns null when the split fits the window', () => {
+    expect(splitFitWarning({ ...DEFAULT_BACKTEST }, 'demo_minimal')).toBeNull()
+  })
+
+  it('warns when the split exceeds the seeded window', () => {
+    const aggressive: DemoBacktestConfig = {
+      ...DEFAULT_BACKTEST,
+      horizon: 28,
+      n_splits: 5,
+      min_train_size: 60,
+    }
+    const warning = splitFitWarning(aggressive, 'demo_minimal')
+    expect(warning).toContain('demo_minimal')
+  })
+})
+
+describe('parseRunConfig', () => {
+  it('returns null for a null/undefined config', () => {
+    expect(parseRunConfig(null)).toBeNull()
+    expect(parseRunConfig(undefined)).toBeNull()
+  })
+
+  it('parses train_model_types + backtest, defaulting missing knobs', () => {
+    const parsed = parseRunConfig({
+      train_model_types: ['naive', 'regression'],
+      backtest: { horizon: 21, metric: 'rmse' },
+    })
+    expect(parsed).not.toBeNull()
+    expect(parsed!.trainModels).toEqual(['naive', 'regression'])
+    expect(parsed!.backtest.horizon).toBe(21)
+    expect(parsed!.backtest.metric).toBe('rmse')
+    // Missing knobs fall back to the defaults.
+    expect(parsed!.backtest.n_splits).toBe(DEFAULT_BACKTEST.n_splits)
+    expect(parsed!.backtest.strategy).toBe(DEFAULT_BACKTEST.strategy)
+  })
+
+  it('falls back to the default trio when models are malformed', () => {
+    const parsed = parseRunConfig({ train_model_types: 'oops', backtest: {} })
+    expect(parsed!.trainModels).toEqual(DEFAULT_TRAIN_MODELS)
+  })
+})
diff --git a/frontend/src/components/demo/run-config-utils.ts b/frontend/src/components/demo/run-config-utils.ts
new file mode 100644
index 00000000..9264b12d
--- /dev/null
+++ b/frontend/src/components/demo/run-config-utils.ts
@@ -0,0 +1,153 @@
+import type {
+  DemoBacktestConfig,
+  DemoRankingMetric,
+  ModelFamily,
+  ScenarioPreset,
+} from '@/types/api'
+
+/**
+ * E4 (#410) — pure helpers for the showcase run-config panel. Kept in a `.ts`
+ * module (not a `.tsx`) so the `react-refresh/only-export-components` lint rule
+ * stays happy and the logic is unit-testable without rendering.
+ */
+
+// The legacy demo trio — the default selection (and the byte-compat baseline).
+export const DEFAULT_TRAIN_MODELS = ['naive', 'seasonal_naive', 'moving_average']
+
+// The legacy demo split + ranking metric. Mirrors the backend defaults
+// (DEMO_HORIZON=14, DEMO_BACKTEST_SPLITS=3, DEMO_MIN_TRAIN_SIZE=30, gap=0,
+// strategy 'expanding', metric 'wape').
+export const DEFAULT_BACKTEST: DemoBacktestConfig = {
+  horizon: 14,
+  strategy: 'expanding',
+  n_splits: 3,
+  min_train_size: 30,
+  gap: 0,
+  metric: 'wape',
+}
+
+// The V2 feature-aware model appended to a custom selection on showcase_rich
+// (the v2_train step trains/registers it unconditionally; see pipeline.py).
+export const SHOWCASE_V2_MODEL = 'prophet_like'
+
+/** True when `models` equals the default trio (order-insensitive). */
+export function isDefaultSelection(models: string[]): boolean {
+  if (models.length !== DEFAULT_TRAIN_MODELS.length) return false
+  const a = [...models].sort()
+  const b = [...DEFAULT_TRAIN_MODELS].sort()
+  return a.every((m, i) => m === b[i])
+}
+
+/** True when every backtest knob equals its default. */
+export function isDefaultBacktest(cfg: DemoBacktestConfig): boolean {
+  return (
+    cfg.horizon === DEFAULT_BACKTEST.horizon &&
+    cfg.strategy === DEFAULT_BACKTEST.strategy &&
+    cfg.n_splits === DEFAULT_BACKTEST.n_splits &&
+    cfg.min_train_size === DEFAULT_BACKTEST.min_train_size &&
+    cfg.gap === DEFAULT_BACKTEST.gap &&
+    cfg.metric === DEFAULT_BACKTEST.metric
+  )
+}
+
+export interface TrainPlanEntry {
+  model_type: string
+  family?: ModelFamily
+  /** Appended V2 entry (prophet_like on showcase_rich) — not operator-picked. */
+  v2?: boolean
+}
+
+/**
+ * The exact models the pipeline will train, in display order. On showcase_rich
+ * `prophet_like (V2)` is appended (unless already selected) because the
+ * v2_train step registers it unconditionally — it stays in the competition.
+ * The `families` map (model_type → family, from the catalog) tags each chip.
+ */
+export function buildTrainPlan(
+  models: string[],
+  scenario: ScenarioPreset,
+  families: Record<string, ModelFamily> = {},
+): TrainPlanEntry[] {
+  const plan: TrainPlanEntry[] = models.map((m) => ({
+    model_type: m,
+    family: families[m],
+  }))
+  if (scenario === 'showcase_rich' && !models.includes(SHOWCASE_V2_MODEL)) {
+    plan.push({ model_type: SHOWCASE_V2_MODEL, family: families[SHOWCASE_V2_MODEL], v2: true })
+  }
+  return plan
+}
+
+/**
+ * The seeded window (days) for a scenario. SOURCE OF TRUTH:
+ * pipeline.py `_SCENARIO_SEED_PROFILE` (demo_minimal / sparse / holiday_rush =
+ * 92-day window, every other preset = 180). Keep in sync.
+ */
+export function windowDaysFor(scenario: ScenarioPreset): number {
+  if (scenario === 'demo_minimal' || scenario === 'sparse' || scenario === 'holiday_rush') {
+    return 92
+  }
+  return 180
+}
+
+/**
+ * A soft (non-blocking) warning when the split cannot fit the seeded window:
+ * `min_train_size + n_splits * (horizon + gap) > windowDays`. The backend does
+ * NOT clamp — an over-aggressive split fails honestly at backtest (sparse-preset
+ * precedent), so the UI warns ahead of time. Returns null when the split fits.
+ */
+export function splitFitWarning(
+  cfg: DemoBacktestConfig,
+  scenario: ScenarioPreset,
+): string | null {
+  const windowDays = windowDaysFor(scenario)
+  const required = cfg.min_train_size + cfg.n_splits * (cfg.horizon + cfg.gap)
+  if (required > windowDays) {
+    return (
+      `This split needs ~${required} days but ${scenario} seeds ~${windowDays}. ` +
+      'The backtest may produce NaN / too-few-folds and fail — reduce horizon, splits, or min train.'
+    )
+  }
+  return null
+}
+
+/**
+ * Parse a stored `run_config` (Record<string, unknown> from a workspace row)
+ * into the typed pieces Load/Replay repopulate. Returns null when absent or not
+ * an object. Missing or malformed fields fall back to the defaults, so a partial
+ * stored config still yields a complete model list and backtest object.
+ */
+export function parseRunConfig(
+  raw: Record<string, unknown> | null | undefined,
+): { trainModels: string[]; backtest: DemoBacktestConfig } | null {
+  if (!raw || typeof raw !== 'object') return null
+  const rawModels = (raw as { train_model_types?: unknown }).train_model_types
+  const trainModels =
+    Array.isArray(rawModels) && rawModels.every((m) => typeof m === 'string')
+      ? (rawModels as string[])
+      : [...DEFAULT_TRAIN_MODELS]
+  const rawBacktest = (raw as { backtest?: unknown }).backtest
+  const backtest = parseBacktest(rawBacktest)
+  return { trainModels, backtest }
+}
+
+function parseBacktest(raw: unknown): DemoBacktestConfig {
+  if (!raw || typeof raw !== 'object') return { ...DEFAULT_BACKTEST }
+  const obj = raw as Record<string, unknown>
+  const num = (key: keyof DemoBacktestConfig, fallback: number): number =>
+    typeof obj[key] === 'number' ? (obj[key] as number) : fallback
+  const strategy = obj.strategy === 'sliding' ? 'sliding' : DEFAULT_BACKTEST.strategy
+  const metricRaw = obj.metric
+  const metric: DemoRankingMetric =
+    metricRaw === 'mae' || metricRaw === 'rmse' || metricRaw === 'wape'
+      ? metricRaw
+      : DEFAULT_BACKTEST.metric
+  return {
+    horizon: num('horizon', DEFAULT_BACKTEST.horizon),
+    strategy,
+    n_splits: num('n_splits', DEFAULT_BACKTEST.n_splits),
+    min_train_size: num('min_train_size', DEFAULT_BACKTEST.min_train_size),
+    gap: num('gap', DEFAULT_BACKTEST.gap),
+    metric,
+  }
+}
diff --git a/frontend/src/hooks/use-model-selection.test.ts b/frontend/src/hooks/use-model-selection.test.ts
index 5074351b..df08674a 100644
--- a/frontend/src/hooks/use-model-selection.test.ts
+++ b/frontend/src/hooks/use-model-selection.test.ts
@@ -48,6 +48,7 @@ const CATALOG: ModelCatalogResponse = {
       default_params: {},
       supports_auto_predict: true,
       description: 'Repeats the last observed value.',
+      enabled: true,
     },
   ],
   default_candidate_model_types: ['naive', 'seasonal_naive', 'moving_average'],
diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index 8ff2f8eb..97545ce8 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -15,6 +15,14 @@ import { WorkspacePanel } from '@/components/demo/WorkspacePanel'
 import { WorkspaceArtifactsPanel } from '@/components/demo/WorkspaceArtifactsPanel'
 import { SeedConfigPanel } from '@/components/demo/SeedConfigPanel'
 import { ScopeSelector } from '@/components/demo/ScopeSelector'
+import { RunConfigPanel } from '@/components/demo/RunConfigPanel'
+import {
+  DEFAULT_BACKTEST,
+  DEFAULT_TRAIN_MODELS,
+  isDefaultBacktest,
+  isDefaultSelection,
+  parseRunConfig,
+} from '@/components/demo/run-config-utils'
 import { buildReplayRequest } from '@/components/demo/replay-request'
 import { WORKSPACE_NAME_PATTERN } from '@/components/demo/workspace-name'
 import { Button } from '@/components/ui/button'
@@ -23,7 +31,12 @@ import { Checkbox } from '@/components/ui/checkbox'
 import { Input } from '@/components/ui/input'
 import { ROUTES } from '@/lib/constants'
 import { cn } from '@/lib/utils'
-import type { SeedOverrides, UserScope, WorkspaceListItem } from '@/types/api'
+import type {
+  DemoBacktestConfig,
+  SeedOverrides,
+  UserScope,
+  WorkspaceListItem,
+} from '@/types/api'
 
 const TERMINAL_STATUSES = new Set(['pass', 'fail', 'skip', 'warn'])
 
@@ -130,6 +143,10 @@ export default function ShowcasePage() {
   // operator-selected focus pair (null = auto-discover first pair).
   const [seedOverrides, setSeedOverrides] = useState<SeedOverrides | null>(null)
   const [userScope, setUserScope] = useState<UserScope | null>(null)
+  // E4 (#410) — run-config phase controls. Default = the legacy trio + split;
+  // the dirty-only rule (below) omits both keys from the frame when untouched.
+  const [trainModels, setTrainModels] = useState<string[]>([...DEFAULT_TRAIN_MODELS])
+  const [backtestCfg, setBacktestCfg] = useState<DemoBacktestConfig>({ ...DEFAULT_BACKTEST })
 
   // The page (not the panel) resolves the loaded workspace's detail — the
   // artifacts panel needs detail-only created_objects.
@@ -169,6 +186,11 @@ export default function ShowcasePage() {
       // them on skip_seed=true); omit both keys for legacy byte-compat.
       ...(reseed && seedOverrides ? { seed_overrides: seedOverrides } : {}),
       ...(userScope ? { user_scope: userScope } : {}),
+      // E4 (#410) — dirty-only inclusion: omit train_model_types / backtest
+      // when they equal the defaults, so untouched controls send a
+      // byte-identical legacy frame (umbrella criterion).
+      ...(isDefaultSelection(trainModels) ? {} : { train_model_types: trainModels }),
+      ...(isDefaultBacktest(backtestCfg) ? {} : { backtest: backtestCfg }),
     })
   }
 
@@ -184,6 +206,11 @@ export default function ShowcasePage() {
     // E3 (#409) — repopulate the seed-config panel + scope selector.
     setSeedOverrides(ws.seed_overrides ?? null)
     setUserScope(ws.user_scope ?? null)
+    // E4 (#410) — repopulate the run-config panel; reset to defaults when the
+    // row carried no custom config (null run_config).
+    const runConfig = parseRunConfig(ws.run_config)
+    setTrainModels(runConfig ? runConfig.trainModels : [...DEFAULT_TRAIN_MODELS])
+    setBacktestCfg(runConfig ? runConfig.backtest : { ...DEFAULT_BACKTEST })
     setSelectedWorkspaceId(ws.workspace_id)
   }
 
@@ -292,7 +319,11 @@ export default function ShowcasePage() {
         <CardContent className="space-y-4">
           <div className="flex flex-wrap items-end gap-6">
             <ScenarioPicker value={scenario} onChange={setScenario} disabled={isRunning} />
-            <Button onClick={handleRun} disabled={isRunning || nameInvalid} size="lg">
+            <Button
+              onClick={handleRun}
+              disabled={isRunning || nameInvalid || trainModels.length === 0}
+              size="lg"
+            >
               {isRunning ? (
                 <Loader2 className="mr-2 h-4 w-4 animate-spin" />
               ) : (
@@ -418,6 +449,17 @@ export default function ShowcasePage() {
             )}
           </div>
 
+          {/* E4 (#410) — run-config phase controls (model set + backtest +
+              preview). Collapsed by default; untouched sends a legacy frame. */}
+          <RunConfigPanel
+            scenario={scenario}
+            disabled={isRunning}
+            selection={trainModels}
+            onSelectionChange={setTrainModels}
+            backtest={backtestCfg}
+            onBacktestChange={setBacktestCfg}
+          />
+
           {phase === 'running' && (
             <p className="text-sm text-muted-foreground">
               Step {completed} of {steps.length} complete…
diff --git a/frontend/src/pages/visualize/champion.test.tsx b/frontend/src/pages/visualize/champion.test.tsx
index 2ae297ca..691a8b19 100644
--- a/frontend/src/pages/visualize/champion.test.tsx
+++ b/frontend/src/pages/visualize/champion.test.tsx
@@ -29,6 +29,7 @@ const CATALOG: ModelCatalogResponse = {
       default_params: {},
       supports_auto_predict: true,
       description: 'Repeats the last observed value.',
+      enabled: true,
     },
     {
       model_type: 'regression',
@@ -39,6 +40,7 @@ const CATALOG: ModelCatalogResponse = {
       default_params: {},
       supports_auto_predict: false,
       description: 'Histogram gradient boosting.',
+      enabled: true,
     },
   ],
   default_candidate_model_types: ['naive', 'regression'],
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index 4c9e07ef..637f7a62 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -793,6 +793,23 @@ export interface UserScope {
   product_id: number
 }
 
+// E4 (#410) — winner-ranking metric for the showcase backtest. Overlaps with,
+// but is not a subset of, RankingMetric (the champion selector's
+// wape/smape/mae/bias): issue #410 names exactly WAPE/MAE/RMSE, all lower-is-better.
+export type DemoRankingMetric = 'wape' | 'mae' | 'rmse'
+
+// E4 (#410) — showcase backtest config. Mirrors the backend
+// app/features/demo/schemas.py:DemoBacktestConfig (which itself mirrors
+// SplitConfig bounds; the demo n_splits default is 3, not SplitConfig's 5).
+export interface DemoBacktestConfig {
+  horizon: number // 1..90, def 14; must be > gap
+  strategy: SplitStrategy // def 'expanding'
+  n_splits: number // 2..20, def 3
+  min_train_size: number // >= 7, def 30
+  gap: number // 0..30, def 0
+  metric: DemoRankingMetric // def 'wape'
+}
+
 // Start frame for WS /demo/stream and request body for POST /demo/run.
 export interface DemoRunRequest {
   seed?: number
@@ -809,6 +826,11 @@ export interface DemoRunRequest {
   // E3 (#409) — advanced seed config + focus pair; omit both for legacy runs.
   seed_overrides?: SeedOverrides
   user_scope?: UserScope
+  // E4 (#410) — run-config phase controls. Omit both (dirty-only rule) to keep
+  // the legacy frame byte-identical; None server-side → the legacy baseline
+  // trio + default split.
+  train_model_types?: string[]
+  backtest?: DemoBacktestConfig
 }
 
 // Aggregate result returned by the synchronous POST /demo/run.
@@ -846,6 +868,9 @@ export interface WorkspaceListItem {
   // list rows); null on runs without them.
   seed_overrides: SeedOverrides | null
   user_scope: UserScope | null
+  // E4 (#410) — replay-input run config (model set + backtest); null on
+  // default-config / pre-E4 rows. Replay rebuilds the start frame from it.
+  run_config: Record<string, unknown> | null
 }
 
 // Full row from GET /demo/workspaces/{workspace_id}.
@@ -1368,6 +1393,9 @@ export interface CandidateModelInfo {
   /** false for feature-aware models (the predict path rejects them). */
   supports_auto_predict: boolean
   description: string
+  // E4 (#410) — runtime forecast_enable_* overlay (service-set). False exactly
+  // when the matching opt-in flag is off; the showcase picker hides those.
+  enabled: boolean
 }
 
 export interface ModelCatalogResponse {

From 3ece4535972968d5f87f02d486f595eb54a34fce Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 05:24:30 +0200
Subject: [PATCH 24/32] docs(docs): document showcase run-config contract
 (#410)

---
 docs/_base/API_CONTRACTS.md | 6 +++---
 docs/_base/DOMAIN_MODEL.md  | 4 ++--
 docs/_base/RUNBOOKS.md      | 4 +++-
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index e9c2ff7a..2922c077 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -58,9 +58,9 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | agents | WS | `/agents/stream` | Token-by-token streaming + tool-call events |
 | seeder | (see `app/features/seeder/routes.py`) | `/seeder/*` | Trigger scenarios, status, customization. **E3 (#409)** — `POST /seeder/generate` accepts an additive Optional `overrides` object (`SeederOverrides`, `app/shared/seeder/overrides.py`) with 7 allow-listed knobs: `stores` (1-100), `products` (1-500), `window_days` (75-365; recomputes `start_date` from `end_date`), `sparsity` (0-0.9), `promotion_intensity` (0-0.5), `stockout_intensity` (0-0.5), `noise_sigma` (0-0.5). `extra=forbid` → an unknown knob is a `422`; applied LAST in `_build_config_from_params` so it wins over the scalar `stores`/`products`/`sparsity` params; absent = byte-identical legacy behavior |
 | seeder | POST | `/seeder/phase2-enrichment` | PRP-38 — run Phase 2 generators (lifecycle, replenishment, exogenous, returns) against the existing seeded data. `422 application/problem+json` on an empty database. |
-| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). **E3 (#409)** — body accepts additive Optional `seed_overrides` (the same `SeederOverrides` object as `POST /seeder/generate`; requires `skip_seed=false` else `422`; `window_days` rejected on the calendar-pinned `holiday_rush` preset; `{}` normalizes to `null`) and `user_scope` (`{store_id: int>=1, product_id: int>=1}`, `extra=forbid` — the focus pair the pipeline models instead of the auto-discovered first pair; validated by the status step, WARN + fallback to discovery on a dangling pair). Both persist into the kept workspace row's story slots and replay verbatim. |
-| demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page |
-| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`. **E2 (#408)** — additive query params: `q` (name ILIKE search, min 2 chars), repeated `tags` (JSONB containment — all listed tags must match), `include_archived` (default `false` — archived rows are now HIDDEN by default), allow-listed `sort_by` (`created_at`/`name`/`seed`/`status`; unknown → default `created_at desc`, no 422) + `sort_order` (`asc`/`desc`); pinned rows always order first; `total` respects the active filters. **E3 (#409)** — list items additively carry the `seed_overrides` / `user_scope` story slots (`null` on runs without them) — deliberately on the LIST item, because the frontend Replay builds its verbatim start frame from list rows |
+| demo | POST | `/demo/run` | Run the end-to-end demo pipeline in-process; returns a `DemoRunResult`. `409 application/problem+json` if a run is already active. **PRP-38** — body accepts an Optional `scenario: 'demo_minimal' \| 'showcase_rich' \| 'sparse'` field; default `'demo_minimal'` (back-compat). **E1 (#390)** — body accepts additive Optional `preservation: 'ephemeral' \| 'keep'` (default `'ephemeral'`, today's no-row behavior) and `workspace_name: str \| null` (pattern `^[a-z0-9][a-z0-9\-_]*$`, ≤100 chars); `workspace_name` without `preservation='keep'` → `422 application/problem+json`. `preservation='keep'` records the run as a `showcase_workspace` row; `DemoRunResult` gains an additive Optional `workspace_id: str \| null`. **E2 (#391)** — `scenario` accepts all 8 `ScenarioPreset` values (`retail_standard` / `holiday_rush` / `high_variance` / `stockout_heavy` / `new_launches` / `sparse` / `demo_minimal` / `showcase_rich`); only `showcase_rich` changes the step table (24 rows), every other preset runs the legacy 11-row flow. **E1 (#407)** — body accepts additive Optional `replayed_from_workspace_id: str \| null` (`^[0-9a-f]{32}$`); requires `preservation='keep'` (else `422 application/problem+json`); recorded verbatim on the new `showcase_workspace` row as a SOFT reference (no existence check — dangles are designed). **E3 (#409)** — body accepts additive Optional `seed_overrides` (the same `SeederOverrides` object as `POST /seeder/generate`; requires `skip_seed=false` else `422`; `window_days` rejected on the calendar-pinned `holiday_rush` preset; `{}` normalizes to `null`) and `user_scope` (`{store_id: int>=1, product_id: int>=1}`, `extra=forbid` — the focus pair the pipeline models instead of the auto-discovered first pair; validated by the status step, WARN + fallback to discovery on a dangling pair). Both persist into the kept workspace row's story slots and replay verbatim. **E4 (#410)** — body accepts additive Optional `train_model_types: list[str] \| null` (1-10 items, allow-listed against the 11 `KNOWN_MODEL_TYPES` in `app/shared/model_taxonomy.py`; unknown/duplicate → `422`) and `backtest: DemoBacktestConfig \| null` (`{horizon 1-90 def 14, strategy expanding\|sliding, n_splits 2-20 def 3, min_train_size ≥7 def 30, gap 0-30 def 0, metric wape\|mae\|rmse}`; `gap ≥ horizon` → `422`). Both `None` → byte-identical legacy behaviour (the baseline trio + default split). A selected opt-in model whose `forecast_enable_*` flag is off fails the `train` step (NOT validation — D6) with a detail naming the flag. On `preservation='keep'` runs the config is recorded verbatim in the `showcase_workspace.run_config` column and replayed verbatim; `pipeline_complete.data.run_config` echoes it (`null` on default-config runs). |
+| demo | WS | `/demo/stream` | Stream one `StepEvent` per pipeline step for the live Showcase page. **E4 (#410)** — the start frame additively accepts `train_model_types` + `backtest` (same shapes/validation as `POST /demo/run`); a bad selection (unknown/duplicate model, `gap ≥ horizon`) yields one `error` event, then the socket closes. The frontend sends both keys only when the operator changed them (dirty-only rule), so an untouched run streams a byte-identical legacy frame. |
+| demo | GET | `/demo/workspaces` | **E4 (#393)** — list saved showcase workspaces, newest first (`limit` 1-100 default 20 / `offset`); `200` + empty list on an empty table. **E1 (#407)** — list items additively carry `archived`, `pinned`, `tags`, `replayed_from_workspace_id`. **E2 (#408)** — additive query params: `q` (name ILIKE search, min 2 chars), repeated `tags` (JSONB containment — all listed tags must match), `include_archived` (default `false` — archived rows are now HIDDEN by default), allow-listed `sort_by` (`created_at`/`name`/`seed`/`status`; unknown → default `created_at desc`, no 422) + `sort_order` (`asc`/`desc`); pinned rows always order first; `total` respects the active filters. **E3 (#409)** — list items additively carry the `seed_overrides` / `user_scope` story slots (`null` on runs without them) — deliberately on the LIST item, because the frontend Replay builds its verbatim start frame from list rows. **E4 (#410)** — list items additively carry `run_config` (`{train_model_types, backtest}` or `null` on default-config rows) — also on the LIST item so Replay rebuilds the run config from list rows |
 | demo | GET | `/demo/workspaces/{workspace_id}` | **E4 (#393)** — full workspace row incl. `created_objects` soft references + grain/window columns; `404 application/problem+json` when missing. **E1 (#407)** — response additively carries the list-item lifecycle fields plus `notes`, `config_schema_version`, and the six story slots (`seed_overrides` / `user_scope` / `approval_events` / `rag_events` / `job_ids` / `phase_summaries` — `null` until their writer epic lands; schemas in `docs/_base/DOMAIN_MODEL.md`). **E3 (#409)** — `seed_overrides` and `user_scope` are now WRITTEN (recorded at create time from the start frame) and surfaced on the LIST item as well (Detail inherits) |
 | demo | GET | `/demo/workspaces/{workspace_id}/health` | **E2 (#408)** — probe the workspace's soft references in-process (model runs, scenario plans, alias, batch, agent session, `job_ids` slot) via `httpx.ASGITransport`; per-reference `status` ∈ `alive` (2xx) / `dead` (404 — deleted after the run) / `unknown` (anything else — never a 500), plus `alive`/`dead`/`unknown` counts and `partial_run` (true when the row's status ≠ `completed`); non-probeable keys (`v2_model_path`, `scenario_artifact_key`, `train_model_types`) are skipped; `404 application/problem+json` when the workspace is missing |
 | demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index a7493219..150a3a41 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -58,7 +58,7 @@
 ### `showcase_workspace` (Demo)
 - **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run. Ephemeral runs (the default) write no row; a `workspace_name` merely labels a keep-run row (names are non-unique).
 - **Status state machine:** `running` → `completed` | `failed` (CHECK-constrained; the finalize hook settles the row even on mid-run failure).
-- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants).
+- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants). E4 (#410) adds the replay-input column `run_config` (nullable JSONB, `{"train_model_types": [...], "backtest": {...}}` or NULL on default-config runs) — a REPLAY INPUT in the same class as `seed`/`scenario`/`reset`/`skip_seed`, **NOT a story slot** (D1): it records the start-frame model set + backtest config a kept run was launched with, written by `create_workspace` at insert time and consumed by Load/Replay. `config_schema_version` is deliberately NOT bumped by E4 — it versions the STORY-SLOT schema; `run_config` presence is NULL-detectable and carries its own documented shape.
 - **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
 - **JSONB story slots (E1 #407 — authoritative per-slot schema):** six dedicated nullable JSONB columns; `NULL` = "slot never written" (distinct from empty). E1 ships the columns only — each slot has an assigned writer epic:
   - `seed_overrides` (**WRITTEN since E3 #409**) — SPARSE dict: only operator-set knobs appear, `{}` is never stored (`None` instead). Allow-listed keys (the `SeederOverrides` schema, `app/shared/seeder/overrides.py`): `stores` int 1-100, `products` int 1-500, `window_days` int 75-365, `sparsity` float 0-0.9, `promotion_intensity` float 0-0.5, `stockout_intensity` float 0-0.5, `noise_sigma` float 0-0.5. Persisted via `model_dump(mode="json", exclude_none=True)` at create time; replay re-submits it verbatim. Records the REQUESTED config — the data the run actually seeded follows from it deterministically.
@@ -69,7 +69,7 @@
   - `phase_summaries` (later parallel epic) — list[dict], one per phase: `{"phase_name": str, "status": "pass"|"fail"|"warn"|"skip", "steps": int, "duration_ms": float}`.
 - **Relationship to demo pipeline runs:** one workspace row per kept pipeline run — `create_workspace` inserts it as `running` before the first step; `finalize_workspace` settles it with the run's collected ids. NOT a seeder `scenario`: a preset is a reusable data-generation recipe; a workspace is the record of ONE concrete run (which preset it used, with what seed, and what it produced).
 - **Invariants:**
-  - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) — plus, since E3 #409, the `seed_overrides`/`user_scope` story slots — are sufficient for a verbatim Replay through the normal run path; replay never mutates the original row; it creates a NEW row.
+  - The config columns (`seed`, `scenario`, `reset`, `skip_seed`) — plus, since E3 #409, the `seed_overrides`/`user_scope` story slots, and since E4 #410 the `run_config` replay-input column — are sufficient for a verbatim Replay through the normal run path; replay never mutates the original row; it creates a NEW row.
   - `name` is deliberately NON-unique; `workspace_id` (UUID hex) is the unique handle.
   - `created_objects` carries SOFT references only — **no ForeignKeys by design**. The workspace row is an audit record, not an ownership root: the referenced runs/plans/aliases are independently operator-deletable, and a workspace must never block (or cascade) their deletion.
   - Deletion is METADATA-ONLY, symmetric with the no-FK design: `DELETE /demo/workspaces/{id}` removes the `showcase_workspace` row and nothing else — the soft-referenced model runs, scenario plans, aliases, jobs, agent sessions, and artifacts survive, and a workspace whose references already dangle still deletes cleanly.
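Reviewer note on the Replay invariant above: a rough Python sketch of rebuilding a start frame from a workspace list row. `replay_start_frame` is a hypothetical helper, and the assumption that the list row exposes `seed` / `scenario` / `reset` / `skip_seed` under those keys is mine; the real Replay is implemented in the frontend.

```python
import copy
from typing import Any


def replay_start_frame(row: dict[str, Any]) -> dict[str, Any]:
    """Build a new start frame from a kept workspace row without mutating it."""
    frame: dict[str, Any] = {
        "seed": row["seed"],
        "scenario": row["scenario"],
        "reset": row["reset"],
        "skip_seed": row["skip_seed"],
        # replayed_from_workspace_id requires preservation='keep' (E1 #407).
        "preservation": "keep",
        "replayed_from_workspace_id": row["workspace_id"],
    }
    # Story slots written since E3 (#409); None means the slot was never written.
    for slot in ("seed_overrides", "user_scope"):
        if row.get(slot) is not None:
            frame[slot] = copy.deepcopy(row[slot])
    # E4 (#410) replay-input column; None on default-config and pre-E4 rows.
    run_config = row.get("run_config")
    if run_config is not None:
        for key in ("train_model_types", "backtest"):
            if run_config.get(key) is not None:
                frame[key] = copy.deepcopy(run_config[key])
    return frame
```

The deep copies keep the sketch in line with the invariant that a replay never mutates the original row and always produces a new one.
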
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index f7aa35a5..495e476d 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -144,6 +144,8 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
     - **422 `window_days cannot override the calendar-pinned holiday_rush window`** — expected; the preset's holiday spikes are fixed 2024 dates and a shifted window would silently drop all of them (the UI disables the window control on `holiday_rush`). Fix: pick a today-anchored preset or drop `window_days`.
     - **`status` step shows ⚠️ `user_scope (store=X, product=Y) not found — fell back to discovered pair`** — expected after a reset/reseed re-issued entity ids (Postgres sequences never reset). The run continues on the discovered pair; the workspace row's `user_scope` slot keeps the REQUESTED pair while the `store_id`/`product_id` columns record the EFFECTIVE grain (divergence is visible by design). Fix: re-pick the pair from the live dropdowns after the run.
     - **`backtest` step ❌ NaN WAPE after high `stockout_intensity` / `sparsity` overrides** — documented expected outcome, same semantics as the `sparse` preset (incident 28); the panel shows a caveat badge at risky values. Not graceful-skipped by design — a skip would mask real regressions on healthy configs. Fix: lower the knob or accept the documented fail.
+30. **`train` step ❌ "forecast_enable_* flag is off" after selecting an opt-in model (E4 #410)** — the "Run configuration (advanced)" model picker only surfaces opt-in models (`lightgbm` / `xgboost` / `random_forest`) when the matching `forecast_enable_*` flag is on, so this normally cannot happen from the UI. A direct `POST /demo/run` / WS start frame naming a disabled opt-in passes Pydantic validation (the schema checks only the static `KNOWN_MODEL_TYPES` allow-list, NOT settings — D6, to avoid the `.env`-bleed class) and fails fast at the `train` step with a detail naming the flag. Cause: the flag defaults False (`app/core/config.py:118-120`), or the extra is not installed even with the flag on. Fix: set `forecast_enable_<model>=true` in `.env` (and `uv sync --all-extras` for lightgbm/xgboost), or deselect the model. The catalog's `requires_extra` badge hints at the install need.
+31. **`backtest` step ❌ NaN / too-few-folds after an aggressive custom split (E4 #410)** — the "Advanced split settings" form lets the operator push `horizon` / `n_splits` / `min_train_size` / `gap` past what the seeded window can fit; `min_train_size + n_splits×(horizon+gap)` greater than the scenario window (92d for demo_minimal/sparse/holiday_rush, 180d otherwise) cannot produce valid folds. The backend does NOT clamp — it fails honestly (same policy as the `sparse` preset, incident 28; a silent clamp would mask real regressions on healthy configs). The form shows a non-blocking amber split-fit warning ahead of time. Fix: reduce horizon / splits / min train (or pick a 180-day preset), then re-run.
 
 > ⚠️ **RAG embedding-dim mismatch can orphan chunks (R4).** PRP-40 indexes a curated 5-file subset; if the operator switches the embedding provider mid-showcase, indexed chunks orphan (pgvector assumes one fixed dimension per column). PRP-40 does NOT ship a `clear_rag` UI toggle — that's a future PRP. Stick to one provider for the showcase run.
 
@@ -162,7 +164,7 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`. E3 (#409) — a kept run additionally records its `seed_overrides` and `user_scope` story slots at create time; Replay re-submits both verbatim (the slot records the REQUESTED config; the row's `store_id`/`product_id` columns record the EFFECTIVE grain, so a fallen-back scope stays visible).
 
-**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); full phase-level interactive configuration. (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim; phase-level config remains out of scope.)
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); mid-run / per-phase re-entry (the linear single-`asyncio.Lock` pipeline is preserved — all configuration is start-frame-time only). (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim. Run configuration shipped in E4 #410 — the start-frame model set + backtest config in the `run_config` replay-input column, replay-verbatim.)
 
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.

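Reviewer note on runbook incident 31 above: the split-fit rule can be checked ahead of a run with a few lines of Python. The formula and the 92-day / 180-day windows are taken from the runbook text; the function names are hypothetical and the backend itself does not clamp or pre-check.

```python
# Presets the runbook lists with a 92-day window; every other preset is 180 days.
SHORT_WINDOW_PRESETS = frozenset({"demo_minimal", "sparse", "holiday_rush"})


def scenario_window_days(scenario: str) -> int:
    """Seeded window length per the runbook entry."""
    return 92 if scenario in SHORT_WINDOW_PRESETS else 180


def required_days(horizon: int, n_splits: int, min_train_size: int, gap: int) -> int:
    """Days the split needs: min_train_size + n_splits * (horizon + gap)."""
    return min_train_size + n_splits * (horizon + gap)


def split_fits(
    scenario: str,
    horizon: int = 14,
    n_splits: int = 3,
    min_train_size: int = 30,
    gap: int = 0,
) -> bool:
    """True when the requested split does not exceed the scenario window."""
    needed = required_days(horizon, n_splits, min_train_size, gap)
    return needed <= scenario_window_days(scenario)
```

With the documented defaults the split needs 30 + 3 × (14 + 0) = 72 days, which fits the 92-day presets. Raising `horizon` to 30 needs 120 days, which only fits a 180-day preset.
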
From de16625a5b8a0b84b7ff3b565402c73fa5dfd0aa Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 08:49:29 +0200
Subject: [PATCH 25/32] feat(api): add hitl decision relay and story capture to
 demo pipeline (#411)

---
 app/features/demo/hitl.py                 |  95 +++++
 app/features/demo/models.py               |  27 +-
 app/features/demo/pipeline.py             | 452 +++++++++++++++++-----
 app/features/demo/routes.py               |  63 ++-
 app/features/demo/schemas.py              |  61 +++
 app/features/demo/tests/test_hitl.py      |  80 ++++
 app/features/demo/tests/test_models.py    |   9 +-
 app/features/demo/tests/test_pipeline.py  | 351 ++++++++++++++++-
 app/features/demo/tests/test_routes.py    | 124 ++++++
 app/features/demo/tests/test_schemas.py   |  59 +++
 app/features/demo/tests/test_workspace.py | 143 ++++++-
 app/features/demo/workspace.py            |  82 +++-
 12 files changed, 1416 insertions(+), 130 deletions(-)
 create mode 100644 app/features/demo/hitl.py
 create mode 100644 app/features/demo/tests/test_hitl.py

diff --git a/app/features/demo/hitl.py b/app/features/demo/hitl.py
new file mode 100644
index 00000000..32a19edb
--- /dev/null
+++ b/app/features/demo/hitl.py
@@ -0,0 +1,95 @@
+"""HITL decision relay for the showcase pipeline (E5, issue #411).
+
+A single-slot, in-memory store that lets the Showcase HITL step card relay an
+operator's Approve/Reject decision back to the in-flight pipeline. The browser
+POSTs ``/demo/hitl-decision`` (demo slice); :func:`resolve` records the decision
+and wakes the waiting step, which then forwards the real decision to the agents
+HITL gate (``POST /agents/sessions/{id}/approve`` with ``approved=true|false``).
+
+This is module-level mutable state. It is SAFE because
+``app.features.demo.service._pipeline_lock`` enforces exactly one pipeline per
+process, and ``step_agent_hitl_flow`` registers at most one pending action per
+run (precedent for module-level demo state: the lock itself, ``service.py:19``).
+Defensive anyway: :func:`register` overwrites any stale slot from a crashed run
+so the next run can never wedge, and the step clears the slot in a ``finally``.
+"""
+
+from __future__ import annotations
+
+import asyncio
+from dataclasses import dataclass, field
+from typing import Literal
+
+from app.core.logging import get_logger
+
+logger = get_logger(__name__)
+
+Decision = Literal["approved", "rejected"]
+ResolveOutcome = Literal["applied", "already_decided", "not_found"]
+
+
+@dataclass
+class _PendingDecision:
+    """The one open decision window (``_slot`` is ``None`` when no step is awaiting)."""
+
+    action_id: str
+    event: asyncio.Event = field(default_factory=asyncio.Event)
+    decision: Decision | None = None
+    reason: str | None = None
+
+
+_slot: _PendingDecision | None = None  # module-level; one pipeline at a time
+
+
+def register(action_id: str) -> None:
+    """Open the decision window for ``action_id``.
+
+    Overwrites any stale slot left by a crashed run so a wedged slot can never
+    block the next run. Called by ``step_agent_hitl_flow`` immediately before it
+    yields the intermediate ``awaiting_approval`` event.
+    """
+    global _slot
+    _slot = _PendingDecision(action_id=action_id)
+
+
+def resolve(action_id: str, decision: Decision, reason: str | None = None) -> ResolveOutcome:
+    """Record the operator's decision; called by ``POST /demo/hitl-decision``.
+
+    Returns ``"not_found"`` when no window is open for ``action_id`` (nothing
+    pending under that id), ``"already_decided"`` when a decision already
+    landed, and ``"applied"`` on success.
+    """
+    if _slot is None or _slot.action_id != action_id:
+        return "not_found"
+    if _slot.decision is not None:
+        return "already_decided"
+    _slot.decision = decision
+    _slot.reason = reason
+    _slot.event.set()
+    logger.info("demo.hitl_decision_resolved", action_id=action_id, decision=decision)
+    return "applied"
+
+
+async def wait_for_decision(action_id: str, timeout: float) -> tuple[Decision, str | None] | None:
+    """Block up to ``timeout`` seconds for an operator decision.
+
+    Returns the ``(decision, reason)`` pair when the operator decided in time,
+    or ``None`` when the window lapsed (the caller then auto-approves) or when
+    no slot is open for ``action_id`` (defensive -- the step always registers
+    first).
+    """
+    if _slot is None or _slot.action_id != action_id:
+        return None
+    try:
+        await asyncio.wait_for(_slot.event.wait(), timeout=timeout)
+    except TimeoutError:
+        return None
+    if _slot.decision is None:  # defensive: event set without a decision
+        return None
+    return (_slot.decision, _slot.reason)
+
+
+def clear() -> None:
+    """Close the decision window (called from the step's ``finally``)."""
+    global _slot
+    _slot = None
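Reviewer note on the relay in `hitl.py` above: the register / wait / resolve / clear lifecycle can be exercised with a standalone Python sketch. It re-implements the same single-slot idea as an instance (`RelaySketch`, a hypothetical name) rather than module-level state so it runs on its own; it is not the module from this patch and omits the `reason` field and logging.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class Pending:
    action_id: str
    event: asyncio.Event = field(default_factory=asyncio.Event)
    decision: str | None = None


class RelaySketch:
    """Single-slot decision relay, mirroring the module's four functions."""

    def __init__(self) -> None:
        self.slot: Pending | None = None

    def register(self, action_id: str) -> None:
        self.slot = Pending(action_id)  # overwrites any stale slot

    def resolve(self, action_id: str, decision: str) -> str:
        slot = self.slot
        if slot is None or slot.action_id != action_id:
            return "not_found"
        if slot.decision is not None:
            return "already_decided"
        slot.decision = decision
        slot.event.set()  # wake the waiting step
        return "applied"

    async def wait(self, action_id: str, timeout: float) -> str | None:
        slot = self.slot
        if slot is None or slot.action_id != action_id:
            return None
        try:
            await asyncio.wait_for(slot.event.wait(), timeout=timeout)
        except asyncio.TimeoutError:
            return None  # window lapsed; the caller decides what to do
        return slot.decision

    def clear(self) -> None:
        self.slot = None


async def demo() -> tuple:
    relay = RelaySketch()
    relay.register("act-1")

    async def operator() -> str:
        await asyncio.sleep(0.01)  # stands in for the browser POST
        return relay.resolve("act-1", "rejected")

    try:
        task = asyncio.create_task(operator())
        decision = await relay.wait("act-1", timeout=1.0)
        first = await task
        second = relay.resolve("act-1", "approved")  # a decision already landed
    finally:
        relay.clear()  # the step always closes the window
    late = relay.resolve("act-1", "approved")  # window closed
    relay.register("act-2")
    timed_out = await relay.wait("act-2", timeout=0.01)  # nobody decides
    relay.clear()
    return decision, first, second, late, timed_out
```

Running `asyncio.run(demo())` walks through all three `ResolveOutcome` values plus the timeout path. The sketch captures the slot in a local before awaiting, so a concurrent `clear()` cannot turn the post-wait read into an attribute access on `None`.
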
diff --git a/app/features/demo/models.py b/app/features/demo/models.py
index 4897f621..207aed33 100644
--- a/app/features/demo/models.py
+++ b/app/features/demo/models.py
@@ -130,8 +130,12 @@ class ShowcaseWorkspace(TimestampMixin, Base):
     )
     # Version of the workspace config + story-slot schema (umbrella #406
     # junk-drawer mitigation). Bump the ORM default when a slot shape changes.
+    # E5 (#411) bumped 1 -> 2: it widened the approval_events.decision enum
+    # (+timed_out), added "probe" to rag_events.event, and added the additive
+    # entry keys documented below. server_default stays text("1") -- no
+    # migration; old rows legitimately read 1.
     config_schema_version: Mapped[int] = mapped_column(
-        Integer, nullable=False, default=1, server_default=text("1")
+        Integer, nullable=False, default=2, server_default=text("1")
     )
 
     # ── E1 (#407) — replay provenance ─────────────────────────────────────
@@ -152,13 +156,22 @@ class ShowcaseWorkspace(TimestampMixin, Base):
     #   user_scope       (E3 #409 writes) — dict: operator-selected focus,
     #                    {"store_id": int, "product_id": int} (additive keys
     #                    allowed later).
-    #   approval_events  (E5 #411 writes) — list[dict], append-only:
+    #   approval_events  (E5 #411 writes — schema v2) — list[dict], append-only:
     #                    {"action_id": str, "tool_name": str,
-    #                     "decision": "approved"|"rejected",
-    #                     "decided_at": iso8601-str, "session_id": str}.
-    #   rag_events       (E5 #411 writes) — list[dict], append-only:
-    #                    {"event": "index"|"retrieve"|"skip", "detail": str,
-    #                     "count": int, "occurred_at": iso8601-str}.
+    #                     "decision": "approved"|"rejected"|"timed_out",
+    #                     "decided_at": iso8601-str, "session_id": str,
+    #                     # v2 additive (config_schema_version >= 2):
+    #                     "auto_approved": bool, "reason": str|None,
+    #                     "execution_status": str|None,
+    #                     "tool_call_summary": {"description": str,
+    #                         "arguments_keys": list[str]},
+    #                     "transcript_summary": str, "tokens_used": int,
+    #                     "tool_calls_count": int}.
+    #   rag_events       (E5 #411 writes — schema v2) — list[dict], append-only:
+    #                    {"event": "probe"|"index"|"retrieve"|"skip",
+    #                     "status": "pass"|"warn"|"skip", "detail": str,
+    #                     "count": int, "occurred_at": iso8601-str,
+    #                     "provider": str|None, "reachable": bool|None}.
     #   job_ids          (later parallel epic) — list[str]: job / batch
     #                    sub-job ids the run submitted (soft references).
     #   phase_summaries  (later parallel epic) — list[dict], one per phase:
diff --git a/app/features/demo/pipeline.py b/app/features/demo/pipeline.py
index a91c1dd9..4602e8c8 100644
--- a/app/features/demo/pipeline.py
+++ b/app/features/demo/pipeline.py
@@ -40,7 +40,7 @@
 from app.core.config import get_settings
 from app.core.logging import get_logger
 from app.core.problem_details import EMBEDDING_AUTH_CODE, ERROR_TYPES
-from app.features.demo import workspace
+from app.features.demo import hitl, workspace
 from app.features.demo.schemas import DemoRunRequest, StepEvent, StepStatus, UserScope
 from app.shared.model_taxonomy import KNOWN_MODEL_TYPES
 from app.shared.seeder.config import ScenarioPreset
@@ -305,7 +305,7 @@ class DemoContext:
     # PRP-41 — additive HITL approval state, populated only by
     # step_agent_hitl_flow on SHOWCASE_RICH. Remain None on every other path.
     approval_action_id: str | None = None
-    agent_approval_decision: str | None = None  # "executed"|"rejected"|"expired"|"timed_out"
+    agent_approval_decision: str | None = None  # executed|rejected|external_4xx|timed_out
     # E1 (#390) -- workspace persistence. Set only on preservation="keep" runs
     # (and only when the row insert succeeded); None on ephemeral runs.
     workspace_id: str | None = None
@@ -322,6 +322,13 @@ class DemoContext:
     # metric). Defaults to the all-legacy ResolvedRunConfig so a frame without
     # the new fields behaves byte-identically.
     run_config: ResolvedRunConfig = field(default_factory=ResolvedRunConfig)
+    # E5 (#411) -- story-capture accumulators. Appended by step_agent_hitl_flow
+    # and the three knowledge steps on SHOWCASE_RICH; finalize_workspace
+    # persists them to the workspace slots (empty list -> slot stays NULL).
+    # Always append in-memory (cheap, cannot fail); only the DB write is
+    # fallible (warn-and-continue).
+    approval_events: list[dict[str, Any]] = field(default_factory=list)
+    rag_events: list[dict[str, Any]] = field(default_factory=list)
 
 
 # =============================================================================
@@ -379,10 +386,13 @@ def _llm_key_present() -> bool:
     return False
 
 
-# PRP-41 — HITL approval flow constants. Display delay gives the visitor a
-# window to click Approve on the FE before the backend auto-fires; the hard
+# PRP-41 / E5 (#411) — HITL approval flow constants. The decision window is the
+# span in which the FE renders Approve + Reject while the step waits on the in-memory relay
+# (D3: 10 s -- 3 s was too short for a human to click; 10 s stays well under the
+# 90 s hard timeout and the 180 s soft budget). It is emitted to the FE as
+# ``data.decision_window_s`` so the countdown never hardcodes it. The hard
 # timeout is the load-bearing fallback so a hung agent never stops the demo.
-_APPROVAL_DISPLAY_DELAY_S = 3.0
+_APPROVAL_DECISION_WINDOW_S = 10.0
 _APPROVAL_HARD_TIMEOUT_S = 90.0
 _HITL_PROMPT = (
     "Save a 10% price-cut scenario plan for the demo-production model "
@@ -390,6 +400,53 @@ def _llm_key_present() -> bool:
 )
 
 
+def _record_approval_event(
+    ctx: DemoContext,
+    *,
+    action_id: str,
+    tool_name: str,
+    decision: str,
+    session_id: str,
+    auto_approved: bool,
+    reason: str | None,
+    execution_status: str | None,
+    pending_action: dict[str, Any],
+    transcript_summary: str,
+    tokens_used: int,
+    tool_calls_count: int,
+) -> None:
+    """Append one approval-event entry to ``ctx.approval_events`` (E5, #411).
+
+    ``tool_call_summary`` carries the action description + argument KEYS only --
+    never values (security-patterns.md: never echo full payloads; values may
+    embed user-supplied text). Schema v2 -- see DOMAIN_MODEL § showcase_workspace.
+    """
+    raw_args = pending_action.get("arguments")
+    arguments_keys = sorted(raw_args) if isinstance(raw_args, dict) else []
+    description_raw = pending_action.get("description")
+    description = description_raw if isinstance(description_raw, str) else ""
+    ctx.approval_events.append(
+        {
+            "action_id": action_id,
+            "tool_name": tool_name,
+            "decision": decision,
+            "decided_at": datetime.now(UTC).isoformat(),
+            "session_id": session_id,
+            # -- E5 (#411) additive (config_schema_version >= 2) --
+            "auto_approved": auto_approved,
+            "reason": reason,
+            "execution_status": execution_status,
+            "tool_call_summary": {
+                "description": description,
+                "arguments_keys": arguments_keys,
+            },
+            "transcript_summary": transcript_summary,
+            "tokens_used": tokens_used,
+            "tool_calls_count": tool_calls_count,
+        }
+    )
+
+
 # PRP-40 — artifact-key parser for /scenarios/* run_id resolution. Two ID
 # spaces: model_run.run_id (32-char UUID-hex) vs scenarios.run_id (12-char
 # artifact key parsed from `model_{KEY}.joblib`). Memory anchor:
@@ -1670,6 +1727,35 @@ async def step_multi_plan_compare(ctx: DemoContext, client: _Client) -> StepResu
     )
 
 
+def _record_rag_event(
+    ctx: DemoContext,
+    *,
+    event: str,
+    status: str,
+    detail: str,
+    count: int = 0,
+    provider: str | None = None,
+    reachable: bool | None = None,
+) -> None:
+    """Append one RAG-event entry to ``ctx.rag_events`` (E5, #411).
+
+    Called once on EVERY return path of the three knowledge steps so the
+    workspace story records the knowledge outcome (probe / index / retrieve /
+    skip) with the provider state. Schema v2 -- see DOMAIN_MODEL.
+    """
+    ctx.rag_events.append(
+        {
+            "event": event,
+            "status": status,
+            "detail": detail,
+            "count": count,
+            "occurred_at": datetime.now(UTC).isoformat(),
+            "provider": provider,
+            "reachable": reachable,
+        }
+    )
+
+
 async def step_embedding_provider_probe(ctx: DemoContext, client: _Client) -> StepResult:
     """PRP-40 — probe the configured embedding provider. Always PASS.
 
@@ -1689,6 +1775,9 @@ async def step_embedding_provider_probe(ctx: DemoContext, client: _Client) -> St
         if reachable
         else f"provider={provider} unreachable — knowledge phase will skip"
     )
+    _record_rag_event(
+        ctx, event="probe", status="pass", detail=detail, provider=provider, reachable=reachable
+    )
     return ("pass", detail, {"provider": provider, "reachable": reachable})
 
 
@@ -1699,7 +1788,15 @@ async def step_rag_index_subset(ctx: DemoContext, client: _Client) -> StepResult
     Uses the additive ``path_prefix`` field on IndexProjectDocsRequest so the
     blast radius stays scoped to the user-guide subset.
     """
+    provider = get_settings().rag_embedding_provider
     if ctx.embedding_unreachable:
+        _record_rag_event(
+            ctx,
+            event="skip",
+            status="skip",
+            detail="embedding provider unreachable",
+            provider=provider,
+        )
         return ("skip", "embedding provider unreachable", {})
 
     try:
@@ -1721,6 +1818,13 @@ async def step_rag_index_subset(ctx: DemoContext, client: _Client) -> StepResult
         # context so the retrieve probe skips too, without a second 401 round-trip.
         if _is_embedding_auth_error(exc):
             ctx.embedding_unreachable = True
+            _record_rag_event(
+                ctx,
+                event="skip",
+                status="skip",
+                detail="embedding provider rejected credentials",
+                provider=provider,
+            )
             return ("skip", "embedding provider rejected credentials", {})
         raise
     results = body.get("results") or []
@@ -1734,9 +1838,18 @@ async def step_rag_index_subset(ctx: DemoContext, client: _Client) -> StepResult
         for r in results
         if isinstance(r, dict) and r.get("source_path") in _USER_GUIDE_CURATED_FILES
     )
+    detail = f"files_indexed={curated_hits}/5 chunks={total_chunks} failed={failed}"
+    _record_rag_event(
+        ctx,
+        event="index",
+        status="pass",
+        detail=detail,
+        count=total_chunks,
+        provider=provider,
+    )
     return (
         "pass",
-        f"files_indexed={curated_hits}/5 chunks={total_chunks} failed={failed}",
+        detail,
         {
             "total_files": int(body.get("total_files", 0)),
             "indexed": indexed,
@@ -1755,7 +1868,15 @@ async def step_rag_retrieve_probe(ctx: DemoContext, client: _Client) -> StepResu
     SKIPs when ``ctx.embedding_unreachable``. WARN (not FAIL) on zero hits so
     a green-but-empty corpus still lets the pipeline go green.
     """
+    provider = get_settings().rag_embedding_provider
     if ctx.embedding_unreachable:
+        _record_rag_event(
+            ctx,
+            event="skip",
+            status="skip",
+            detail="embedding provider unreachable",
+            provider=provider,
+        )
         return ("skip", "embedding provider unreachable", {})
 
     try:
@@ -1770,13 +1891,24 @@ async def step_rag_retrieve_probe(ctx: DemoContext, client: _Client) -> StepResu
         # in case retrieve is reached with a freshly-rejecting key.
         if _is_embedding_auth_error(exc):
             ctx.embedding_unreachable = True
+            _record_rag_event(
+                ctx,
+                event="skip",
+                status="skip",
+                detail="embedding provider rejected credentials",
+                provider=provider,
+            )
             return ("skip", "embedding provider rejected credentials", {})
         raise
     results = body.get("results") or []
     if not results:
+        detail = "no hits — corpus indexed but query did not match"
+        _record_rag_event(
+            ctx, event="retrieve", status="warn", detail=detail, count=0, provider=provider
+        )
         return (
             "warn",
-            "no hits — corpus indexed but query did not match",
+            detail,
             {
                 "results_count": 0,
                 "total_chunks_searched": body.get("total_chunks_searched", 0),
@@ -1789,9 +1921,18 @@ async def step_rag_retrieve_probe(ctx: DemoContext, client: _Client) -> StepResu
         score = float(score_raw)
     except (TypeError, ValueError):
         score = 0.0
+    detail = f"top hit: {title} (score={score:.3f})"
+    _record_rag_event(
+        ctx,
+        event="retrieve",
+        status="pass",
+        detail=detail,
+        count=len(results),
+        provider=provider,
+    )
     return (
         "pass",
-        f"top hit: {title} (score={score:.3f})",
+        detail,
         {
             "results_count": len(results),
             "top_source_path": title,
@@ -2414,7 +2555,7 @@ async def step_cleanup(ctx: DemoContext, client: _Client) -> StepResult:
 
 
 async def step_agent_hitl_flow(ctx: DemoContext, client: _Client) -> StepResult:
-    """PRP-41 — HITL approval round-trip on the experiment agent.
+    """PRP-41 / E5 (#411) — HITL approval round-trip on the experiment agent.
 
     Flow:
       1. ``_llm_key_present()`` -> skip when no key.
@@ -2424,22 +2565,25 @@ async def step_agent_hitl_flow(ctx: DemoContext, client: _Client) -> StepResult:
          on the ``save_scenario`` entry in ``agent_require_approval``. The
          chat response carries ``pending_approval=true`` +
          ``pending_action: PendingAction``.
-      4. ``client.yield_event(...)`` an intermediate step_complete with
-         ``status='running'`` + ``awaiting_approval=true`` so the FE can
-         render the Approve button.
-      5. Sleep ``_APPROVAL_DISPLAY_DELAY_S`` -- a one-click FE Approve may
-         pre-empt the auto-approve in this window.
-      6. ``POST /agents/sessions/{id}/approve`` with ``{action_id,
-         approved: true}``. Absorb 4xx (the FE pre-empted; the action was
-         already consumed).
-      7. Terminal: ``pass`` with the approval decision in step.data.
+      4. ``hitl.register(action_id)`` then ``client.yield_event(...)`` an
+         intermediate step_complete with ``status='running'`` +
+         ``awaiting_approval=true`` + ``decision_url`` + ``decision_window_s``
+         so the FE renders Approve + Reject (E5).
+      5. ``hitl.wait_for_decision(...)`` up to ``_APPROVAL_DECISION_WINDOW_S``
+         -- the operator's Approve/Reject relayed via POST /demo/hitl-decision;
+         ``None`` on window lapse -> auto-approve.
+      6. ``POST /agents/sessions/{id}/approve`` with the REAL decision
+         (``approved: true|false`` + optional reason). Absorb 4xx (an operator
+         pre-empted the agents endpoint directly; the action was consumed).
+      7. Append one ``approval_events`` entry, then terminal ``pass`` -- a
+         reject is GREEN by design (D5): the gated tool never executed.
 
     Skip-gracefully on every error path (session-create / chat / approve
     failure, or the agent never triggers ``save_scenario``). Never raises.
 
     Hard timeout: if the elapsed time exceeds ``_APPROVAL_HARD_TIMEOUT_S``
     before step (6) completes, returns ``skip`` with
-    ``approval_decision='timed_out'``.
+    ``approval_decision='timed_out'`` (and records a ``timed_out`` entry).
     """
     key_present = _llm_key_present()
     logger.info("demo.agent_hitl_flow.key_present", present=key_present)
@@ -2489,6 +2633,12 @@ async def step_agent_hitl_flow(ctx: DemoContext, client: _Client) -> StepResult:
     tokens_used = int(chat_body.get("tokens_used", 0))
     raw_tool_calls = chat_body.get("tool_calls", [])
     tool_count = len(raw_tool_calls) if isinstance(raw_tool_calls, list) else 0
+    # E5 (#411) -- transcript summary for the approval-events entry. Capped at
+    # 200 chars (precedent: the #335 failure-detail 300-char cap); never the
+    # full transcript (security-patterns.md).
+    transcript_summary = str(chat_body.get("message", ""))[:200]
+    raw_action_type = pending_action.get("action_type")
+    tool_name = raw_action_type if isinstance(raw_action_type, str) else "save_scenario"
 
     if not pending_approval or not pending_action:
         # The agent didn't trigger save_scenario (e.g. answered directly or
@@ -2516,102 +2666,173 @@ async def step_agent_hitl_flow(ctx: DemoContext, client: _Client) -> StepResult:
     action_id: str = action_id_raw
     ctx.approval_action_id = action_id
 
-    # (4) -- intermediate event so the FE renders Approve. step_index /
-    # total_steps / phase_index / phase_total are stamped by the orchestrator
-    # when it drains the sink (see run_pipeline).
-    elapsed_ms = (time.monotonic() - started_at) * 1000.0
-    client.yield_event(
-        StepEvent(
-            event_type="step_complete",
-            step_name="agent_hitl_flow",
-            step_index=0,
-            total_steps=0,
-            status="running",
-            detail="awaiting approval (auto-approve in 3 s)",
-            duration_ms=elapsed_ms,
-            data={
-                "awaiting_approval": True,
-                "approval_url": f"/agents/sessions/{session_id}/approve",
-                "action_id": action_id,
-                "session_id": session_id,
-                "tokens_used": tokens_used,
-                "tool_calls_count": tool_count,
-            },
-            phase_name=PHASE_AGENTS,
-        )
-    )
-
-    # (5) -- display delay.
-    elapsed_after_intermediate = time.monotonic() - started_at
-    delay = max(0.0, _APPROVAL_DISPLAY_DELAY_S - elapsed_after_intermediate)
-    if delay > 0:
-        await asyncio.sleep(delay)
-
-    # (5b) -- hard-timeout check BEFORE the approve POST.
-    elapsed_before_approve = time.monotonic() - started_at
-    if elapsed_before_approve > _APPROVAL_HARD_TIMEOUT_S:
-        ctx.agent_approval_decision = "timed_out"
-        return (
-            "skip",
-            "approval timed out -- pipeline continued",
-            {
-                "session_id": session_id,
-                "action_id": action_id,
-                "approval_decision": "timed_out",
-                "tokens_used": tokens_used,
-                "tool_calls_count": tool_count,
-                "timed_out": True,
-            },
-        )
-
-    # (6) -- POST /approve. Absorb 4xx (FE pre-empted) per Task 1 §5 #2:
-    # AgentService.approve_action returns 400 ("No pending action") when the
-    # action was already consumed by the FE's optimistic Approve click.
-    approval_decision = "executed"
+    # E5 (#411) -- open the decision window on the in-memory relay BEFORE the
+    # FE can see the action, then clear it on every exit (finally). The relay
+    # is the single intent channel: the FE Approve/Reject buttons POST
+    # /demo/hitl-decision (demo slice), this step waits on the relay, then
+    # forwards the REAL decision to the agents HITL gate.
+    hitl.register(action_id)
     try:
-        approve_body = await client.request(
-            "agent_hitl_flow[approve]",
-            "POST",
-            f"/agents/sessions/{session_id}/approve",
-            json_body={"action_id": action_id, "approved": True},
+        # (4) -- intermediate event so the FE renders Approve + Reject.
+        # step_index / total_steps / phase_index / phase_total are stamped by
+        # the orchestrator when it drains the sink (see run_pipeline). D2 makes
+        # this event reach the browser DURING the window, not after it closes.
+        window_s = _APPROVAL_DECISION_WINDOW_S
+        elapsed_ms = (time.monotonic() - started_at) * 1000.0
+        client.yield_event(
+            StepEvent(
+                event_type="step_complete",
+                step_name="agent_hitl_flow",
+                step_index=0,
+                total_steps=0,
+                status="running",
+                detail=f"awaiting approval (auto-approve in {int(window_s)} s)",
+                duration_ms=elapsed_ms,
+                data={
+                    "awaiting_approval": True,
+                    # E5 -- the relay is the new intent channel; approval_url is
+                    # kept for back-compat (an operator curl-ing it directly is
+                    # still absorbed below as execution_status="external_4xx").
+                    "decision_url": "/demo/hitl-decision",
+                    "decision_window_s": window_s,
+                    "approval_url": f"/agents/sessions/{session_id}/approve",
+                    "action_id": action_id,
+                    "session_id": session_id,
+                    "tokens_used": tokens_used,
+                    "tool_calls_count": tool_count,
+                },
+                phase_name=PHASE_AGENTS,
+            )
         )
-        raw_status = approve_body.get("status", "executed")
-        if isinstance(raw_status, str):
-            approval_decision = raw_status
-    except _StepError as exc:
-        if 400 <= exc.status_code < 500:
-            # FE pre-empted -- the approval already landed. Optimistic default.
-            logger.info(
-                "demo.agent_hitl_flow.approve_pre_empted",
-                session_id=session_id,
+
+        # (5) -- wait for the operator; the window is counted from started_at, not from here.
+        remaining = max(0.0, window_s - (time.monotonic() - started_at))
+        operator = await hitl.wait_for_decision(action_id, timeout=remaining)
+
+        # (5b) -- hard-timeout check BEFORE the approve POST (a hung agent /
+        # blocked window never stops the demo). timed_out -> skip + entry.
+        elapsed_before_approve = time.monotonic() - started_at
+        if elapsed_before_approve > _APPROVAL_HARD_TIMEOUT_S:
+            ctx.agent_approval_decision = "timed_out"
+            _record_approval_event(
+                ctx,
                 action_id=action_id,
-                status_code=exc.status_code,
+                tool_name=tool_name,
+                decision="timed_out",
+                session_id=session_id,
+                auto_approved=False,
+                reason=None,
+                execution_status=None,
+                pending_action=pending_action,
+                transcript_summary=transcript_summary,
+                tokens_used=tokens_used,
+                tool_calls_count=tool_count,
             )
-            approval_decision = "executed"
-        else:
             return (
                 "skip",
-                f"approve failed: {exc}",
+                "approval timed out -- pipeline continued",
                 {
                     "session_id": session_id,
                     "action_id": action_id,
+                    "approval_decision": "timed_out",
                     "tokens_used": tokens_used,
                     "tool_calls_count": tool_count,
+                    "timed_out": True,
                 },
             )
 
-    ctx.agent_approval_decision = approval_decision
+        # Resolve the operator's intent (None == window lapsed -> auto-approve).
+        auto_approved = operator is None
+        approved = operator is None or operator[0] == "approved"
+        reason = operator[1] if operator is not None else None
+
+        # (6) -- forward the REAL decision to the agents HITL gate. Absorb 4xx
+        # (an operator pre-empted by curl-ing /agents/.../approve directly):
+        # AgentService.approve_action returns 400 once the action is consumed.
+        approve_json: dict[str, Any] = {"action_id": action_id, "approved": approved}
+        if reason:
+            approve_json["reason"] = reason
+        execution_status = "executed" if approved else "rejected"
+        try:
+            approve_body = await client.request(
+                "agent_hitl_flow[approve]",
+                "POST",
+                f"/agents/sessions/{session_id}/approve",
+                json_body=approve_json,
+            )
+            raw_status = approve_body.get("status", execution_status)
+            if isinstance(raw_status, str):
+                execution_status = raw_status
+        except _StepError as exc:
+            if 400 <= exc.status_code < 500:
+                # Pre-empted -- the decision already landed via the agents API.
+                logger.info(
+                    "demo.agent_hitl_flow.approve_pre_empted",
+                    session_id=session_id,
+                    action_id=action_id,
+                    status_code=exc.status_code,
+                )
+                execution_status = "external_4xx"
+            else:
+                return (
+                    "skip",
+                    f"approve failed: {exc}",
+                    {
+                        "session_id": session_id,
+                        "action_id": action_id,
+                        "tokens_used": tokens_used,
+                        "tool_calls_count": tool_count,
+                    },
+                )
 
+        decision = "approved" if approved else "rejected"
+        # ctx mirror keeps the agents-API execution status (executed/rejected/
+        # external_4xx); the slot entry below records the operator decision.
+        ctx.agent_approval_decision = execution_status
+        _record_approval_event(
+            ctx,
+            action_id=action_id,
+            tool_name=tool_name,
+            decision=decision,
+            session_id=session_id,
+            auto_approved=auto_approved,
+            reason=reason,
+            execution_status=execution_status,
+            pending_action=pending_action,
+            transcript_summary=transcript_summary,
+            tokens_used=tokens_used,
+            tool_calls_count=tool_count,
+        )
+    finally:
+        hitl.clear()
+
+    # (7) -- terminal. D5: a human rejection is a SUCCESSFUL demonstration of
+    # the HITL gate, not an error -- the run stays GREEN and the gated
+    # save_scenario never executed (no scenario_plan row written by the agent).
+    if not approved:
+        return (
+            "pass",
+            "rejected by operator",
+            {
+                "session_id": session_id,
+                "action_id": action_id,
+                "approval_decision": "rejected",
+                "auto_approved": False,
+                "tokens_used": tokens_used,
+                "tool_calls_count": tool_count,
+            },
+        )
     return (
         "pass",
         (
             f"session={session_id[:8]}... tokens={tokens_used} "
-            f"tool_calls={tool_count} approved={approval_decision}"
+            f"tool_calls={tool_count} approved={execution_status}"
         ),
         {
             "session_id": session_id,
             "action_id": action_id,
-            "approval_decision": approval_decision,
+            "approval_decision": execution_status,
+            "auto_approved": auto_approved,
             "tokens_used": tokens_used,
             "tool_calls_count": tool_count,
         },
@@ -2908,8 +3129,33 @@ async def run_pipeline(app: FastAPI, req: DemoRunRequest) -> AsyncIterator[StepE
             status: StepStatus
             detail: str
             data: dict[str, Any]
+            # E5 (#411) — D2: run the step as a task and drain the intermediate
+            # event sink CONCURRENTLY with the in-flight step. PRP-41 drained
+            # the sink only AFTER the step returned, so the HITL step's
+            # ``awaiting_approval`` event reached the browser only once the
+            # decision window had already closed (the auto-approve had fired).
+            # Steps still execute strictly one at a time under the single lock;
+            # only event flushing overlaps the running step.
+            task = asyncio.ensure_future(fn(ctx, client))
             try:
-                status, detail, data = await fn(ctx, client)
+                while True:
+                    done, _pending = await asyncio.wait({task}, timeout=0.25)
+                    # Drain + stamp the row's index/phase fields so the FE state
+                    # machine processes buffered events as if the orchestrator
+                    # emitted them. Order matters: an intermediate must land
+                    # before the terminal so "awaiting_approval" precedes
+                    # "approved" in the WS stream.
+                    for ev in intermediate_events:
+                        ev.step_index = index
+                        ev.total_steps = total
+                        ev.phase_index = phase_index
+                        ev.phase_total = phase_total
+                        ev.phase_name = phase_name
+                        yield ev
+                    intermediate_events.clear()
+                    if done:
+                        break
+                status, detail, data = task.result()
             except _StepError as exc:
                 status, detail, data = "fail", str(exc), {}
             except (httpx.HTTPError, OSError) as exc:
@@ -2927,19 +3173,25 @@ async def run_pipeline(app: FastAPI, req: DemoRunRequest) -> AsyncIterator[StepE
                     f"unexpected error: {type(exc).__name__}: {exc}",
                     {},
                 )
+            finally:
+                # LOAD-BEARING (PRP Gotcha / quality Finding 3): the Stop button
+                # closes the WebSocket -> the async generator is closed, throwing
+                # GeneratorExit (a BaseException no except clause above catches)
+                # into the mid-step ``yield ev`` suspension point. This finally
+                # is the only hook that runs on EVERY exit path; without it the
+                # in-flight step task is orphaned ("Task was destroyed but it is
+                # pending") while the _Client closes underneath it.
+                if not task.done():
+                    task.cancel()
             duration_ms = (time.monotonic() - t0) * 1000
-            # PRP-41 — drain any intermediate events the step buffered BEFORE
-            # the terminal step_complete. Stamp the row's index/phase fields
-            # so the FE state machine processes them as if they were emitted
-            # by the orchestrator. Order matters: intermediate events must
-            # land before the terminal so "awaiting_approval" precedes
-            # "approved" in the WS stream.
+            # Final flush: safety net. The loop above drains once more after the
+            # task finishes, so this is normally empty. It keeps the
+            # intermediate-before-terminal ordering identical to pre-D2.
             for ev in intermediate_events:
                 ev.step_index = index
                 ev.total_steps = total
                 ev.phase_index = phase_index
                 ev.phase_total = phase_total
-                # phase_name is set by the step fn already, but mirror in case.
                 ev.phase_name = phase_name
                 yield ev
             intermediate_events.clear()
diff --git a/app/features/demo/routes.py b/app/features/demo/routes.py
index deaa8ac0..dc9d6b89 100644
--- a/app/features/demo/routes.py
+++ b/app/features/demo/routes.py
@@ -40,10 +40,13 @@
 from app.core.database import get_db
 from app.core.exceptions import ConflictError, NotFoundError
 from app.core.logging import get_logger
-from app.features.demo import link_health, service, workspace
+from app.features.demo import hitl, link_health, service, workspace
 from app.features.demo.schemas import (
+    ApprovalEventItem,
+    ApprovalEventsResponse,
     DemoRunRequest,
     DemoRunResult,
+    HitlDecisionRequest,
     StepEvent,
     WorkspaceDetailResponse,
     WorkspaceHealthResponse,
@@ -86,6 +89,64 @@ async def run_demo_pipeline(request: Request, params: DemoRunRequest) -> DemoRun
         raise ConflictError(str(exc)) from exc
 
 
+@router.post(
+    "/hitl-decision",
+    status_code=status.HTTP_204_NO_CONTENT,
+    summary="Relay an operator decision to the in-flight HITL step",
+    description=(
+        "Relay the Showcase HITL step card's Approve / Reject to the running "
+        "pipeline (E5, #411). The pipeline forwards the real decision to the "
+        "agents HITL gate. 404 when no matching action is pending; 409 when the "
+        "action was already decided; 422 on a malformed body."
+    ),
+)
+async def submit_hitl_decision(body: HitlDecisionRequest) -> None:
+    """Relay an operator Approve/Reject to the in-flight HITL step (E5, #411).
+
+    Args:
+        body: The operator decision (action_id, approved/rejected, optional reason).
+
+    Raises:
+        NotFoundError: When no HITL action is pending under ``action_id`` (404).
+        ConflictError: When the action was already decided (409).
+    """
+    outcome = hitl.resolve(body.action_id, body.decision, body.reason)
+    if outcome == "not_found":
+        raise NotFoundError(message=f"No pending HITL action: {body.action_id}")
+    if outcome == "already_decided":
+        raise ConflictError(f"Action already decided: {body.action_id}")
+
+
+@router.get(
+    "/approval-events",
+    response_model=ApprovalEventsResponse,
+    summary="Recent HITL approval events across saved workspaces",
+    description=(
+        "Flatten ``approval_events`` across the newest saved workspaces that "
+        "carry the slot, newest-workspace-first (E5, #411). An audit-glance "
+        "surface -- no pagination. Returns 200 + an empty list when none."
+    ),
+)
+async def list_hitl_approval_events(
+    db: AsyncSession = Depends(get_db),
+    limit: int = Query(default=50, ge=1, le=200, description="Maximum flattened entries."),
+) -> ApprovalEventsResponse:
+    """List recent HITL approval events flattened across workspaces (E5, #411).
+
+    Args:
+        db: Async database session from dependency.
+        limit: Maximum flattened entries to return (1-200).
+
+    Returns:
+        The flattened approval events plus the returned count.
+    """
+    events = await workspace.list_approval_events(db, limit=limit)
+    return ApprovalEventsResponse(
+        events=[ApprovalEventItem.model_validate(event) for event in events],
+        total=len(events),
+    )
+
+
 @router.get(
     "/workspaces",
     response_model=WorkspaceListResponse,
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index 352770aa..70b1e8c3 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -506,3 +506,64 @@ class WorkspaceHealthResponse(BaseModel):
     dead: int = Field(..., ge=0, description="Count of references that probed dead (404).")
     unknown: int = Field(..., ge=0, description="Count of references whose probe was inconclusive.")
     checked_at: datetime = Field(default_factory=_utc_now, description="When the probes ran (UTC).")
+
+
+class HitlDecisionRequest(BaseModel):
+    """Operator decision relay for the showcase HITL step (E5, issue #411).
+
+    POSTed by the Showcase step card's Approve / Reject buttons to
+    ``POST /demo/hitl-decision``; the in-flight pipeline waits on the in-memory
+    relay and forwards the real decision to the agents HITL gate. HTTP-only
+    body -- every field is JSON-native (``str`` / ``Literal``), so the
+    model-level ``strict=True`` needs no ``Field(strict=False)`` override (the
+    AST policy walker fires only on date/datetime/time/UUID/Decimal).
+    ``extra="forbid"`` so a typo'd field 422s instead of silently no-opping.
+    """
+
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    action_id: str = Field(..., min_length=1, description="Pending action to decide.")
+    decision: Literal["approved", "rejected"] = Field(..., description="Operator decision.")
+    reason: str | None = Field(
+        default=None,
+        max_length=500,
+        description="Optional reason (mirrors agents ApprovalRequest.reason).",
+    )
+
+
+class ApprovalEventItem(BaseModel):
+    """One flattened approval event for ``GET /demo/approval-events`` (E5, #411).
+
+    Built from JSONB story-slot dicts (NOT ORM rows) -- tolerant typing with
+    defaults so a v1 entry (pre-E5 base keys only) still validates. Response
+    model: plain ``BaseModel``, NOT strict (strict mode is request-body policy).
+    """
+
+    workspace_id: str = Field(..., description="The workspace whose run recorded this event.")
+    workspace_name: str | None = Field(default=None, description="The workspace's optional label.")
+    action_id: str | None = Field(default=None, description="The decided action's id.")
+    tool_name: str | None = Field(default=None, description="The gated tool (e.g. save_scenario).")
+    decision: str | None = Field(default=None, description="approved / rejected / timed_out.")
+    decided_at: str | None = Field(default=None, description="ISO8601 UTC decision timestamp.")
+    session_id: str | None = Field(
+        default=None, description="Agent session the action belonged to."
+    )
+    auto_approved: bool | None = Field(
+        default=None, description="True when the decision window lapsed."
+    )
+    reason: str | None = Field(default=None, description="Operator-supplied reason (reject).")
+    execution_status: str | None = Field(
+        default=None, description="Agents-API status: executed / rejected / external_4xx."
+    )
+    transcript_summary: str | None = Field(
+        default=None, description="Agent chat message (<=200 chars)."
+    )
+
+
+class ApprovalEventsResponse(BaseModel):
+    """Recent HITL approval events flattened across workspaces (E5, #411)."""
+
+    events: list[ApprovalEventItem] = Field(
+        ..., description="Flattened approval events, newest workspace first; empty when none."
+    )
+    total: int = Field(..., ge=0, description="Number of flattened entries returned (capped).")
diff --git a/app/features/demo/tests/test_hitl.py b/app/features/demo/tests/test_hitl.py
new file mode 100644
index 00000000..34f38e97
--- /dev/null
+++ b/app/features/demo/tests/test_hitl.py
@@ -0,0 +1,80 @@
+"""Unit tests for the HITL decision relay (E5, issue #411).
+
+pytest-asyncio runs in auto mode (``pyproject.toml``), so ``async def`` tests
+need no marker. Each test clears the module slot first so the global state never
+leaks between cases.
+"""
+
+from __future__ import annotations
+
+import asyncio
+
+import pytest
+
+from app.features.demo import hitl
+
+
+@pytest.fixture(autouse=True)
+def _clear_slot() -> None:
+    """Reset the module-level slot before every test (global-state hygiene)."""
+    hitl.clear()
+
+
+def test_resolve_without_register_is_not_found() -> None:
+    assert hitl.resolve("action-1", "approved") == "not_found"
+
+
+def test_resolve_wrong_action_is_not_found() -> None:
+    hitl.register("action-1")
+    assert hitl.resolve("other", "approved") == "not_found"
+
+
+def test_double_resolve_is_already_decided() -> None:
+    hitl.register("action-1")
+    assert hitl.resolve("action-1", "approved") == "applied"
+    assert hitl.resolve("action-1", "rejected") == "already_decided"
+
+
+def test_register_overwrites_stale_slot() -> None:
+    hitl.register("stale")
+    assert hitl.resolve("stale", "approved") == "applied"
+    # A new run registers a fresh slot; the stale decision must not bleed in.
+    hitl.register("fresh")
+    assert hitl.resolve("fresh", "rejected") == "applied"
+
+
+def test_clear_closes_the_window() -> None:
+    hitl.register("action-1")
+    hitl.clear()
+    assert hitl.resolve("action-1", "approved") == "not_found"
+
+
+async def test_resolve_before_wait_returns_decision() -> None:
+    hitl.register("action-1")
+    assert hitl.resolve("action-1", "rejected", reason="too risky") == "applied"
+    result = await hitl.wait_for_decision("action-1", timeout=1.0)
+    assert result == ("rejected", "too risky")
+
+
+async def test_wait_then_resolve_concurrently() -> None:
+    hitl.register("action-1")
+
+    async def _decide() -> None:
+        await asyncio.sleep(0.02)
+        hitl.resolve("action-1", "approved")
+
+    decider = asyncio.create_task(_decide())
+    result = await hitl.wait_for_decision("action-1", timeout=1.0)
+    await decider
+    assert result == ("approved", None)
+
+
+async def test_wait_times_out_to_none() -> None:
+    hitl.register("action-1")
+    result = await hitl.wait_for_decision("action-1", timeout=0.02)
+    assert result is None
+
+
+async def test_wait_unknown_action_returns_none() -> None:
+    result = await hitl.wait_for_decision("never-registered", timeout=0.02)
+    assert result is None
diff --git a/app/features/demo/tests/test_models.py b/app/features/demo/tests/test_models.py
index ee048764..22b2eedd 100644
--- a/app/features/demo/tests/test_models.py
+++ b/app/features/demo/tests/test_models.py
@@ -118,7 +118,12 @@ async def test_showcase_workspace_status_check_violation(db_session: AsyncSessio
 
 
 async def test_showcase_workspace_e1_defaults_applied(db_session: AsyncSession) -> None:
-    """A minimal insert gets the E1 defaults (ORM + server defaults agree)."""
+    """A minimal insert gets the E1 defaults.
+
+    E5 (#411) D4 -- an ORM insert now applies the bumped ORM default
+    (config_schema_version=2); the server_default stays 1 so pre-E5 rows
+    inserted outside the ORM legitimately read 1.
+    """
     row = _make_row()
     db_session.add(row)
     await db_session.commit()
@@ -129,7 +134,7 @@ async def test_showcase_workspace_e1_defaults_applied(db_session: AsyncSession)
     assert loaded.pinned is False
     assert loaded.notes is None
     assert loaded.tags == []
-    assert loaded.config_schema_version == 1
+    assert loaded.config_schema_version == 2
     assert loaded.replayed_from_workspace_id is None
     # All six story slots stay NULL until their writer epic lands.
     assert loaded.seed_overrides is None
diff --git a/app/features/demo/tests/test_pipeline.py b/app/features/demo/tests/test_pipeline.py
index 7862d5a2..27b8d406 100644
--- a/app/features/demo/tests/test_pipeline.py
+++ b/app/features/demo/tests/test_pipeline.py
@@ -8,6 +8,7 @@
 
 from __future__ import annotations
 
+import asyncio
 from datetime import date, timedelta
 from types import SimpleNamespace
 from typing import Any, cast
@@ -15,8 +16,13 @@
 import pytest
 from fastapi import FastAPI
 
-from app.features.demo import pipeline
-from app.features.demo.schemas import DemoBacktestConfig, DemoRunRequest, UserScope
+from app.features.demo import hitl, pipeline
+from app.features.demo.schemas import (
+    DemoBacktestConfig,
+    DemoRunRequest,
+    StepEvent,
+    UserScope,
+)
 from app.shared.seeder.config import ScenarioPreset
 from app.shared.seeder.overrides import SeederOverrides
 
@@ -850,10 +856,121 @@ async def request(self, *_a: object, **_k: object) -> dict[str, Any]:
 
 
 # =============================================================================
-# PRP-38 — phase grouping + new scenarios
+# E5 (#411) — D2 concurrent intermediate-event drain
 # =============================================================================
 
 
+def _single_step_table(step_fn: Any) -> Any:
+    """Return a one-row phase table wrapping ``step_fn`` (drain-test helper)."""
+
+    def _table(_scenario: Any) -> list[Any]:
+        return [("data", "blocking", step_fn)]
+
+    return _table
+
+
+async def test_run_pipeline_drains_intermediate_event_mid_step(monkeypatch):
+    """D2 — an intermediate event is YIELDED while the step is still pending.
+
+    The stub step buffers an intermediate event, signals it has started, then
+    blocks on an asyncio.Event. The test consumes events until it sees the
+    intermediate frame, asserts the step has NOT yet returned its terminal,
+    then releases the step. Proves the drain overlaps the in-flight step in
+    wall time (not just stream order).
+    """
+    started = asyncio.Event()
+    release = asyncio.Event()
+
+    async def _blocking_step(_ctx: Any, client: Any) -> Any:
+        client.yield_event(
+            StepEvent(
+                event_type="step_complete",
+                step_name="blocking",
+                step_index=0,
+                total_steps=0,
+                status="running",
+                detail="mid-step",
+                data={"awaiting": True},
+            )
+        )
+        started.set()
+        await release.wait()
+        return ("pass", "done", {})
+
+    monkeypatch.setattr(pipeline, "_phase_table", _single_step_table(_blocking_step))
+
+    agen = pipeline.run_pipeline(app=_FAKE_APP, req=DemoRunRequest())
+    seen: list[StepEvent] = []
+    intermediate: StepEvent | None = None
+    async for ev in agen:
+        seen.append(ev)
+        if ev.event_type == "step_complete" and ev.data.get("awaiting"):
+            intermediate = ev
+            # The step is still blocked: its terminal step_complete (status
+            # 'pass') cannot have been emitted yet.
+            assert started.is_set()
+            assert not any(e.event_type == "step_complete" and e.status == "pass" for e in seen)
+            release.set()
+            break
+
+    assert intermediate is not None
+    # The orchestrator stamped the row's index/phase fields on the drained event.
+    assert intermediate.step_index == 1
+    assert intermediate.phase_name == "data"
+    rest = [e async for e in agen]
+    terminal = [e for e in rest if e.event_type == "step_complete" and e.status == "pass"]
+    assert terminal and terminal[0].step_name == "blocking"
+    assert rest[-1].event_type == "pipeline_complete"
+
+
+async def test_run_pipeline_cancels_in_flight_step_on_generator_close(monkeypatch):
+    """D2 — closing the generator mid-step cancels the step task (Stop button).
+
+    Drives one intermediate event, then ``aclose()`` while the stub step is
+    still blocked. The finally clause must cancel the in-flight task so it ends
+    cancelled (no "Task was destroyed but it is pending" warning).
+    """
+    started = asyncio.Event()
+    release = asyncio.Event()
+    cancelled = False
+
+    async def _blocking_step(_ctx: Any, client: Any) -> Any:
+        nonlocal cancelled
+        client.yield_event(
+            StepEvent(
+                event_type="step_complete",
+                step_name="blocking",
+                step_index=0,
+                total_steps=0,
+                status="running",
+                detail="mid-step",
+                data={"awaiting": True},
+            )
+        )
+        started.set()
+        try:
+            await release.wait()
+        except asyncio.CancelledError:
+            cancelled = True
+            raise
+        return ("pass", "done", {})  # pragma: no cover -- never reached
+
+    monkeypatch.setattr(pipeline, "_phase_table", _single_step_table(_blocking_step))
+
+    # Typed Any so .aclose() is reachable (run_pipeline is annotated as the
+    # AsyncIterator supertype, which has no aclose).
+    agen: Any = pipeline.run_pipeline(app=_FAKE_APP, req=DemoRunRequest())
+    async for ev in agen:
+        if ev.event_type == "step_complete" and ev.data.get("awaiting"):
+            break
+    # Close the generator (mirrors the WebSocketDisconnect -> aclose path).
+    await agen.aclose()
+    # Yield once so the cancellation propagates into the otherwise-orphaned task.
+    await asyncio.sleep(0)
+    assert started.is_set()
+    assert cancelled is True
+
+
 def test_phase_table_demo_minimal_matches_legacy_11_steps_under_agents_phase():
     """PRP-38 / PRP-41 — DEMO_MINIMAL keeps the legacy 11-step flow.
 
@@ -2006,6 +2123,116 @@ async def test_rag_retrieve_probe_skips_on_embedding_auth_502():
     assert ctx.embedding_unreachable is True
 
 
+# =============================================================================
+# E5 (#411) — RAG-event capture (one ctx.rag_events entry per return path)
+# =============================================================================
+
+
+async def test_rag_event_capture_probe_records_provider_state(monkeypatch, tmp_path):
+    monkeypatch.setattr(
+        pipeline,
+        "get_settings",
+        lambda: _fake_settings(
+            str(tmp_path / "reg"), rag_embedding_provider="openai", openai_api_key="sk-test"
+        ),
+    )
+    ctx = _make_showcase_ctx()
+    await pipeline.step_embedding_provider_probe(ctx, _as_client(_RecordingClient(None)))
+    assert len(ctx.rag_events) == 1
+    ev = ctx.rag_events[0]
+    assert ev["event"] == "probe"
+    assert ev["status"] == "pass"
+    assert ev["provider"] == "openai"
+    assert ev["reachable"] is True
+
+
+async def test_rag_event_capture_index_records_chunk_count(monkeypatch, tmp_path):
+    monkeypatch.setattr(
+        pipeline,
+        "get_settings",
+        lambda: _fake_settings(str(tmp_path / "reg"), rag_embedding_provider="openai"),
+    )
+    ctx = _make_showcase_ctx()
+    results = [
+        {"source_path": p, "status": "indexed", "chunks_created": 4, "error": None}
+        for p in sorted(pipeline._USER_GUIDE_CURATED_FILES)
+    ]
+    client = _RecordingClient(
+        None,
+        responses={
+            ("POST", "/rag/index/project-docs"): {
+                "results": results,
+                "total_files": 5,
+                "indexed": 5,
+                "updated": 0,
+                "unchanged": 0,
+                "failed": 0,
+                "total_chunks": 20,
+            },
+        },
+    )
+    await pipeline.step_rag_index_subset(ctx, _as_client(client))
+    assert len(ctx.rag_events) == 1
+    ev = ctx.rag_events[0]
+    assert ev["event"] == "index"
+    assert ev["status"] == "pass"
+    assert ev["count"] == 20
+    assert ev["provider"] == "openai"
+
+
+async def test_rag_event_capture_index_skip_when_unreachable(monkeypatch, tmp_path):
+    monkeypatch.setattr(
+        pipeline,
+        "get_settings",
+        lambda: _fake_settings(str(tmp_path / "reg"), rag_embedding_provider="ollama"),
+    )
+    ctx = _make_showcase_ctx()
+    ctx.embedding_unreachable = True
+    status, _detail, _ = await pipeline.step_rag_index_subset(
+        ctx, _as_client(_RecordingClient(None))
+    )
+    assert status == "skip"
+    assert len(ctx.rag_events) == 1
+    assert ctx.rag_events[0]["event"] == "skip"
+    assert ctx.rag_events[0]["status"] == "skip"
+
+
+async def test_rag_event_capture_retrieve_warn_on_zero_hits(monkeypatch, tmp_path):
+    monkeypatch.setattr(
+        pipeline,
+        "get_settings",
+        lambda: _fake_settings(str(tmp_path / "reg"), rag_embedding_provider="openai"),
+    )
+    ctx = _make_showcase_ctx()
+    client = _RecordingClient(
+        None,
+        responses={("POST", "/rag/retrieve"): {"results": [], "total_chunks_searched": 12}},
+    )
+    status, _detail, _ = await pipeline.step_rag_retrieve_probe(ctx, _as_client(client))
+    assert status == "warn"
+    assert len(ctx.rag_events) == 1
+    ev = ctx.rag_events[0]
+    assert ev["event"] == "retrieve"
+    assert ev["status"] == "warn"
+    assert ev["count"] == 0
+
+
+async def test_rag_event_capture_demo_minimal_leaves_events_empty(monkeypatch, tmp_path):
+    """A legacy demo_minimal run never reaches the knowledge phase -> no events."""
+    artifact = tmp_path / "naive-model.joblib"
+    artifact.write_bytes(b"fake joblib artifact bytes")
+    monkeypatch.setattr(
+        pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "registry"))
+    )
+    wapes = {"naive": 0.30, "seasonal_naive": 0.15, "moving_average": 0.25}
+    monkeypatch.setattr(pipeline, "_Client", _build_fake_client(str(artifact), wapes))
+    events = [e async for e in pipeline.run_pipeline(app=_FAKE_APP, req=DemoRunRequest())]
+    # demo_minimal has no knowledge steps, so no RAG-capturing step ever runs.
+    assert events[-1].event_type == "pipeline_complete"
+    knowledge_steps = {"embedding_provider_probe", "rag_index_subset", "rag_retrieve_probe"}
+    assert not any(e.step_name in knowledge_steps for e in events)
+
+
 async def test_run_pipeline_showcase_rich_runs_planning_and_knowledge(monkeypatch, tmp_path):
     """PRP-40 — end-to-end SHOWCASE_RICH reaches the 5 new steps + greens."""
     artifact = tmp_path / "artifacts" / "models" / "model_abc123def456.joblib"
@@ -2112,6 +2339,9 @@ def __init__(
             event_sink: list[Any] | None = None,
         ) -> None:
             self.calls: list[tuple[str, str]] = []
+            # E5 (#411) -- capture the approve POST body so tests can assert the
+            # relayed decision (approved=true|false + optional reason).
+            self.approve_body_sent: dict[str, Any] | None = None
             self._event_sink = event_sink if event_sink is not None else intermediate
 
         async def __aenter__(self) -> _HitlClient:
@@ -2158,16 +2388,21 @@ async def request(
                     "tokens_used": 80,
                 }
             if path.endswith("/approve"):
+                self.approve_body_sent = json_body
                 if approve_status >= 400:
                     raise pipeline._StepError(
                         step,
                         approve_status,
                         {"title": "Bad Request", "detail": "No pending action"},
                     )
-                return approve_body or {
+                if approve_body is not None:
+                    return approve_body
+                # Mirror the agents API: approved=false -> status "rejected".
+                approved = bool(json_body.get("approved", True)) if json_body else True
+                return {
                     "action_id": chat_action_id,
-                    "approved": True,
-                    "status": "executed",
+                    "approved": approved,
+                    "status": "executed" if approved else "rejected",
                 }
             raise AssertionError(f"unexpected request: {method} {path}")
 
@@ -2209,8 +2444,8 @@ def test_llm_key_present_cloud_still_requires_key(monkeypatch):
     assert pipeline._llm_key_present() is False
 
 
-async def test_agent_hitl_flow_happy_path(monkeypatch, tmp_path):
-    """PRP-41 — full HITL round-trip: chat -> intermediate -> approve -> pass."""
+async def test_agent_hitl_flow_window_lapse_auto_approves(monkeypatch, tmp_path):
+    """PRP-41 / E5 — no operator decision -> window lapses -> auto-approve pass."""
     monkeypatch.setattr(
         pipeline,
         "get_settings",
@@ -2222,8 +2457,8 @@ async def test_agent_hitl_flow_happy_path(monkeypatch, tmp_path):
         "_llm_key_present",
         lambda: True,
     )
-    # Short-circuit the 3s display delay so the test stays fast.
-    monkeypatch.setattr(pipeline, "_APPROVAL_DISPLAY_DELAY_S", 0.0)
+    # Zero window -> wait_for_decision lapses immediately (no operator click).
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 0.0)
 
     client, intermediate = _make_hitl_client()
     ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False)
@@ -2232,22 +2467,92 @@ async def test_agent_hitl_flow_happy_path(monkeypatch, tmp_path):
     assert status == "pass"
     assert "approved=executed" in detail
     assert data["approval_decision"] == "executed"
+    assert data["auto_approved"] is True
     assert data["action_id"] == "action-abc-123"
     assert data["session_id"] == "sess-test-0001"
     assert data["tokens_used"] == 240
+    # The window lapse relayed approved=true to the agents HITL gate.
+    assert client.approve_body_sent == {"action_id": "action-abc-123", "approved": True}
     # The HITL step buffered exactly one intermediate event for the FE.
     assert len(intermediate) == 1
     inter = intermediate[0]
     assert inter.status == "running"
     assert inter.data["awaiting_approval"] is True
     assert inter.data["action_id"] == "action-abc-123"
+    assert inter.data["decision_url"] == "/demo/hitl-decision"
+    assert inter.data["decision_window_s"] == 0.0
     assert inter.phase_name == pipeline.PHASE_AGENTS
+    # One approval_events entry captured (auto-approved).
+    assert len(ctx.approval_events) == 1
+    entry = ctx.approval_events[0]
+    assert entry["decision"] == "approved"
+    assert entry["auto_approved"] is True
+    assert entry["tool_name"] == "save_scenario"
+    assert entry["execution_status"] == "executed"
+    assert entry["transcript_summary"] == "I'll save that scenario."
+    assert "arguments_keys" in entry["tool_call_summary"]
     # Ctx threaded for downstream cleanup + KPI consumers.
     assert ctx.approval_action_id == "action-abc-123"
     assert ctx.agent_approval_decision == "executed"
     assert ctx.session_id == "sess-test-0001"
 
 
+async def test_agent_hitl_flow_operator_approve(monkeypatch, tmp_path):
+    """E5 — operator approves within the window -> approve POST, entry approved."""
+    monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
+    monkeypatch.setattr(pipeline, "_llm_key_present", lambda: True)
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 5.0)
+
+    async def _fake_wait(_action_id: str, timeout: float) -> tuple[str, str | None]:
+        return ("approved", None)
+
+    monkeypatch.setattr(hitl, "wait_for_decision", _fake_wait)
+
+    client, _intermediate = _make_hitl_client()
+    ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False)
+    status, _detail, data = await pipeline.step_agent_hitl_flow(ctx, client)
+
+    assert status == "pass"
+    assert data["approval_decision"] == "executed"
+    assert data["auto_approved"] is False
+    assert client.approve_body_sent == {"action_id": "action-abc-123", "approved": True}
+    assert len(ctx.approval_events) == 1
+    assert ctx.approval_events[0]["decision"] == "approved"
+    assert ctx.approval_events[0]["auto_approved"] is False
+
+
+async def test_agent_hitl_flow_operator_reject(monkeypatch, tmp_path):
+    """E5 (D5) — operator rejects -> approve POST approved=false; pass + green."""
+    monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
+    monkeypatch.setattr(pipeline, "_llm_key_present", lambda: True)
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 5.0)
+
+    async def _fake_wait(_action_id: str, timeout: float) -> tuple[str, str | None]:
+        return ("rejected", "too risky for the demo")
+
+    monkeypatch.setattr(hitl, "wait_for_decision", _fake_wait)
+
+    client, _intermediate = _make_hitl_client()
+    ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False)
+    status, detail, data = await pipeline.step_agent_hitl_flow(ctx, client)
+
+    # D5 -- a reject is a SUCCESSFUL HITL demonstration: the run stays GREEN.
+    assert status == "pass"
+    assert detail == "rejected by operator"
+    assert data["approval_decision"] == "rejected"
+    # The reject + reason were relayed to the agents HITL gate (approved=false).
+    assert client.approve_body_sent == {
+        "action_id": "action-abc-123",
+        "approved": False,
+        "reason": "too risky for the demo",
+    }
+    assert len(ctx.approval_events) == 1
+    entry = ctx.approval_events[0]
+    assert entry["decision"] == "rejected"
+    assert entry["reason"] == "too risky for the demo"
+    assert entry["auto_approved"] is False
+
+
 async def test_agent_hitl_flow_skips_without_key(monkeypatch, tmp_path):
     """PRP-41 — no LLM key -> skip-gracefully; no session created."""
     monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
@@ -2297,7 +2602,7 @@ async def test_agent_hitl_flow_skips_when_agent_did_not_trigger_tool(monkeypatch
     """PRP-41 — agent answered directly (no pending_action) -> skip with detail."""
     monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
     monkeypatch.setattr(pipeline, "_llm_key_present", lambda: True)
-    monkeypatch.setattr(pipeline, "_APPROVAL_DISPLAY_DELAY_S", 0.0)
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 0.0)
 
     client, intermediate = _make_hitl_client(chat_pending=False)
     ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False)
@@ -2309,20 +2614,26 @@ async def test_agent_hitl_flow_skips_when_agent_did_not_trigger_tool(monkeypatch
     assert intermediate == []
 
 
-async def test_agent_hitl_flow_absorbs_double_approve_400(monkeypatch, tmp_path):
-    """PRP-41 — FE pre-empted Approve -> backend approve returns 400; absorb."""
+async def test_agent_hitl_flow_absorbs_external_approve_400(monkeypatch, tmp_path):
+    """E5 (D1) — an operator pre-empted /agents/.../approve directly -> 400.
+
+    The step absorbs the 4xx and records ``execution_status="external_4xx"`` --
+    honest about the residual ambiguity (the decision landed outside the relay).
+    """
     monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
     monkeypatch.setattr(pipeline, "_llm_key_present", lambda: True)
-    monkeypatch.setattr(pipeline, "_APPROVAL_DISPLAY_DELAY_S", 0.0)
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 0.0)
 
     client, intermediate = _make_hitl_client(approve_status=400)
     ctx = pipeline.DemoContext(seed=42, skip_seed=True, reset=False)
     status, detail, data = await pipeline.step_agent_hitl_flow(ctx, client)
 
-    # 4xx absorbed: step still passes with optimistic "executed" decision.
+    # 4xx absorbed: step still passes; the decision is the honest external edge.
     assert status == "pass"
-    assert data["approval_decision"] == "executed"
-    assert "approved=executed" in detail
+    assert data["approval_decision"] == "external_4xx"
+    assert "approved=external_4xx" in detail
+    assert ctx.approval_events[0]["execution_status"] == "external_4xx"
+    assert ctx.approval_events[0]["decision"] == "approved"
     # The intermediate event was still buffered before the absorb branch.
     assert len(intermediate) == 1
 
@@ -2331,7 +2642,7 @@ async def test_agent_hitl_flow_skips_on_hard_timeout(monkeypatch, tmp_path):
     """PRP-41 — elapsed > _APPROVAL_HARD_TIMEOUT_S -> skip with timed_out."""
     monkeypatch.setattr(pipeline, "get_settings", lambda: _fake_settings(str(tmp_path / "reg")))
     monkeypatch.setattr(pipeline, "_llm_key_present", lambda: True)
-    monkeypatch.setattr(pipeline, "_APPROVAL_DISPLAY_DELAY_S", 0.0)
+    monkeypatch.setattr(pipeline, "_APPROVAL_DECISION_WINDOW_S", 0.0)
     # Force the elapsed-time check to fire: set the hard cap below the
     # display delay so any positive elapsed exceeds it.
     monkeypatch.setattr(pipeline, "_APPROVAL_HARD_TIMEOUT_S", -1.0)
@@ -2345,6 +2656,10 @@ async def test_agent_hitl_flow_skips_on_hard_timeout(monkeypatch, tmp_path):
     assert data["timed_out"] is True
     assert data["approval_decision"] == "timed_out"
     assert ctx.agent_approval_decision == "timed_out"
+    # E5 -- a timed_out entry is still recorded for the workspace story.
+    assert len(ctx.approval_events) == 1
+    assert ctx.approval_events[0]["decision"] == "timed_out"
+    assert ctx.approval_events[0]["execution_status"] is None
     # Intermediate event was emitted; approve POST never fired.
     assert len(intermediate) == 1
     assert all(call[1] != f"/agents/sessions/{data['session_id']}/approve" for call in client.calls)
diff --git a/app/features/demo/tests/test_routes.py b/app/features/demo/tests/test_routes.py
index f5813d79..7b8858ba 100644
--- a/app/features/demo/tests/test_routes.py
+++ b/app/features/demo/tests/test_routes.py
@@ -822,3 +822,127 @@ async def test_workspace_health_integration_alive_and_dead(client, db_session: A
         assert body["partial_run"] is True
     finally:
         await client.delete(f"/agents/sessions/{agent_session_id}")
+
+
+# =============================================================================
+# E5 (#411) — POST /demo/hitl-decision + GET /demo/approval-events
+# =============================================================================
+
+
+@pytest.fixture(autouse=True)
+def _clear_hitl_slot():
+    """Reset the module-level HITL relay slot around every test in this file."""
+    from app.features.demo import hitl
+
+    hitl.clear()
+    yield
+    hitl.clear()
+
+
+async def test_hitl_decision_204_on_pending(client):
+    """A decision for the registered pending action returns 204."""
+    from app.features.demo import hitl
+
+    hitl.register("act-204")
+    resp = await client.post(
+        "/demo/hitl-decision",
+        json={"action_id": "act-204", "decision": "rejected", "reason": "too risky"},
+    )
+    assert resp.status_code == 204
+    # The relay recorded the operator's decision for the waiting step (use a
+    # positive timeout: wait_for(timeout=0) raises before stepping the
+    # freshly-scheduled task, even when the event is already set).
+    assert await hitl.wait_for_decision("act-204", timeout=1.0) == ("rejected", "too risky")
+
+
+async def test_hitl_decision_404_when_nothing_pending(client):
+    """No registered action -> 404 problem+json."""
+    resp = await client.post(
+        "/demo/hitl-decision",
+        json={"action_id": "ghost", "decision": "approved"},
+    )
+    assert resp.status_code == 404
+    assert resp.headers["content-type"].startswith("application/problem+json")
+    assert "No pending HITL action" in resp.json()["detail"]
+
+
+async def test_hitl_decision_409_when_already_decided(client):
+    """A second decision for the same action -> 409 problem+json."""
+    from app.features.demo import hitl
+
+    hitl.register("act-409")
+    first = await client.post(
+        "/demo/hitl-decision", json={"action_id": "act-409", "decision": "approved"}
+    )
+    assert first.status_code == 204
+    second = await client.post(
+        "/demo/hitl-decision", json={"action_id": "act-409", "decision": "rejected"}
+    )
+    assert second.status_code == 409
+    assert second.headers["content-type"].startswith("application/problem+json")
+    assert "already decided" in second.json()["detail"]
+
+
+async def test_hitl_decision_422_bad_body(client):
+    """A bad decision literal / extra key -> 422 problem+json."""
+    bad_literal = await client.post(
+        "/demo/hitl-decision", json={"action_id": "a", "decision": "maybe"}
+    )
+    assert bad_literal.status_code == 422
+    assert bad_literal.headers["content-type"].startswith("application/problem+json")
+    extra_key = await client.post(
+        "/demo/hitl-decision",
+        json={"action_id": "a", "decision": "approved", "bogus": 1},
+    )
+    assert extra_key.status_code == 422
+
+
+async def test_approval_events_empty(client, monkeypatch):
+    """200 + empty list when no workspace carries approval events."""
+
+    async def fake_list(_db, *, limit: int = 50) -> list[dict[str, object]]:
+        return []
+
+    monkeypatch.setattr(workspace, "list_approval_events", fake_list)
+    resp = await client.get("/demo/approval-events")
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body == {"events": [], "total": 0}
+
+
+async def test_approval_events_populated(client, monkeypatch):
+    """Flattened entries carry workspace_id / workspace_name + decision."""
+
+    async def fake_list(_db, *, limit: int = 50) -> list[dict[str, object]]:
+        return [
+            {
+                "workspace_id": "a" * 32,
+                "workspace_name": "demo-1",
+                "action_id": "act-1",
+                "tool_name": "save_scenario",
+                "decision": "rejected",
+                "decided_at": "2026-06-13T00:00:00+00:00",
+                "session_id": "sess-1",
+                "auto_approved": False,
+                "reason": "too risky",
+                "execution_status": "rejected",
+                "transcript_summary": "I'll save that scenario.",
+            }
+        ]
+
+    monkeypatch.setattr(workspace, "list_approval_events", fake_list)
+    resp = await client.get("/demo/approval-events", params={"limit": 5})
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["total"] == 1
+    assert body["events"][0]["workspace_id"] == "a" * 32
+    assert body["events"][0]["workspace_name"] == "demo-1"
+    assert body["events"][0]["decision"] == "rejected"
+
+
+async def test_approval_events_rejects_bad_limit(client):
+    """limit is bounded to 1-200; out-of-range values -> 422 problem+json."""
+    resp = await client.get("/demo/approval-events", params={"limit": 0})
+    assert resp.status_code == 422
+    resp = await client.get("/demo/approval-events", params={"limit": 999})
+    assert resp.status_code == 422
diff --git a/app/features/demo/tests/test_schemas.py b/app/features/demo/tests/test_schemas.py
index d7ba3573..ac976544 100644
--- a/app/features/demo/tests/test_schemas.py
+++ b/app/features/demo/tests/test_schemas.py
@@ -7,9 +7,11 @@
 from pydantic import ValidationError
 
 from app.features.demo.schemas import (
+    ApprovalEventItem,
     DemoBacktestConfig,
     DemoRunRequest,
     DemoRunResult,
+    HitlDecisionRequest,
     StepEvent,
     UserScope,
     WorkspaceDetailResponse,
@@ -622,3 +624,60 @@ def test_workspace_list_response_shape():
     assert dumped["workspaces"][0]["workspace_id"] == "a" * 32
     # ISO serialization on the wire.
     assert isinstance(dumped["workspaces"][0]["created_at"], str)
+
+
+# =============================================================================
+# E5 (#411) — HitlDecisionRequest + ApprovalEventItem
+# =============================================================================
+
+
+def test_hitl_decision_request_json_path():
+    """The JSON-dict path (FastAPI's validate_python) accepts a valid body."""
+    body = HitlDecisionRequest.model_validate(
+        {"action_id": "act-1", "decision": "rejected", "reason": "too risky"}
+    )
+    assert body.action_id == "act-1"
+    assert body.decision == "rejected"
+    assert body.reason == "too risky"
+
+
+def test_hitl_decision_request_reason_optional():
+    body = HitlDecisionRequest.model_validate({"action_id": "act-1", "decision": "approved"})
+    assert body.reason is None
+
+
+def test_hitl_decision_request_rejects_unknown_decision():
+    with pytest.raises(ValidationError):
+        HitlDecisionRequest.model_validate({"action_id": "a", "decision": "maybe"})
+
+
+def test_hitl_decision_request_forbids_extra_key():
+    with pytest.raises(ValidationError):
+        HitlDecisionRequest.model_validate({"action_id": "a", "decision": "approved", "bogus": 1})
+
+
+def test_hitl_decision_request_rejects_empty_action_and_long_reason():
+    with pytest.raises(ValidationError):
+        HitlDecisionRequest.model_validate({"action_id": "", "decision": "approved"})
+    with pytest.raises(ValidationError):
+        HitlDecisionRequest.model_validate(
+            {"action_id": "a", "decision": "approved", "reason": "x" * 501}
+        )
+
+
+def test_approval_event_item_tolerates_v1_entry():
+    """A pre-E5 base-key entry (no additive keys) still validates with defaults."""
+    item = ApprovalEventItem.model_validate(
+        {
+            "workspace_id": "a" * 32,
+            "workspace_name": "demo-1",
+            "action_id": "act-1",
+            "tool_name": "save_scenario",
+            "decision": "approved",
+            "decided_at": "2026-06-13T00:00:00+00:00",
+            "session_id": "sess-1",
+        }
+    )
+    assert item.auto_approved is None
+    assert item.execution_status is None
+    assert item.decision == "approved"
diff --git a/app/features/demo/tests/test_workspace.py b/app/features/demo/tests/test_workspace.py
index b0597981..737bfccd 100644
--- a/app/features/demo/tests/test_workspace.py
+++ b/app/features/demo/tests/test_workspace.py
@@ -279,7 +279,8 @@ async def test_create_workspace_without_replayed_from_is_none(db_session: AsyncS
     assert row.archived is False
     assert row.pinned is False
     assert row.tags == []
-    assert row.config_schema_version == 1
+    # E5 (#411) D4 -- new rows carry the bumped story-slot schema version.
+    assert row.config_schema_version == 2
 
 
 # =============================================================================
@@ -379,3 +380,143 @@ async def test_update_workspace_empty_request_noop(db_session: AsyncSession) ->
     assert row is not None
     assert row.name == "it-noop"
     assert row.status == WORKSPACE_STATUS_RUNNING
+
+
+# =============================================================================
+# E5 (#411) — story-slot capture + reproduction marker + approval-events list
+# =============================================================================
+
+
+def _ctx_with_story() -> DemoContext:
+    """A finished ctx carrying one approval event + one index rag event."""
+    ctx = _finished_ctx()
+    ctx.approval_events = [
+        {
+            "action_id": "act-1",
+            "tool_name": "save_scenario",
+            "decision": "rejected",
+            "decided_at": "2026-06-13T00:00:00+00:00",
+            "session_id": "sess-0123abcd",
+            "auto_approved": False,
+            "reason": "too risky",
+            "execution_status": "rejected",
+            "tool_call_summary": {"description": "save plan", "arguments_keys": ["name"]},
+            "transcript_summary": "I'll save that scenario.",
+            "tokens_used": 240,
+            "tool_calls_count": 1,
+        }
+    ]
+    ctx.rag_events = [
+        {
+            "event": "index",
+            "status": "pass",
+            "detail": "files_indexed=5/5 chunks=20",
+            "count": 20,
+            "occurred_at": "2026-06-13T00:00:00+00:00",
+            "provider": "openai",
+            "reachable": None,
+        }
+    ]
+    return ctx
+
+
+async def test_finalize_writes_story_slots(db_session: AsyncSession) -> None:
+    """finalize persists approval_events + rag_events when the run captured them."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-story"))
+    assert workspace_id is not None
+    await workspace.finalize_workspace(
+        workspace_id, _ctx_with_story(), failed=False, wall_clock_s=5.0
+    )
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.approval_events is not None
+    assert len(row.approval_events) == 1
+    assert row.approval_events[0]["decision"] == "rejected"
+    assert row.rag_events is not None
+    assert row.rag_events[0]["event"] == "index"
+
+
+async def test_finalize_leaves_story_slots_null_when_empty(db_session: AsyncSession) -> None:
+    """Empty accumulators -> slots stay NULL (never []), per E1's slot contract."""
+    workspace_id = await workspace.create_workspace(_keep_request(workspace_name="it-empty"))
+    assert workspace_id is not None
+    await workspace.finalize_workspace(
+        workspace_id, _finished_ctx(), failed=False, wall_clock_s=5.0
+    )
+    row = await workspace.get_workspace(db_session, workspace_id)
+    assert row is not None
+    assert row.approval_events is None
+    assert row.rag_events is None
+    assert "story_reproduction" not in (row.result_summary or {})
+
+
+async def test_finalize_replay_records_reproduced(db_session: AsyncSession) -> None:
+    """Replaying a story-bearing source in a run that also captures a story -> reproduced."""
+    source_id = await workspace.create_workspace(_keep_request(workspace_name="it-src"))
+    assert source_id is not None
+    await workspace.finalize_workspace(source_id, _ctx_with_story(), failed=False, wall_clock_s=5.0)
+    replay_id = await workspace.create_workspace(
+        _keep_request(workspace_name="it-replay", replayed_from_workspace_id=source_id)
+    )
+    assert replay_id is not None
+    await workspace.finalize_workspace(replay_id, _ctx_with_story(), failed=False, wall_clock_s=5.0)
+    row = await workspace.get_workspace(db_session, replay_id)
+    assert row is not None
+    repro = (row.result_summary or {})["story_reproduction"]
+    assert repro["agent"] == "reproduced"
+    assert repro["knowledge"] == "reproduced"
+    assert repro["source_workspace_id"] == source_id
+
+
+async def test_finalize_replay_records_not_applicable(db_session: AsyncSession) -> None:
+    """Source row had NO story -> not_applicable regardless of this run's capture."""
+    source_id = await workspace.create_workspace(_keep_request(workspace_name="it-src2"))
+    assert source_id is not None
+    await workspace.finalize_workspace(source_id, _finished_ctx(), failed=False, wall_clock_s=5.0)
+    replay_id = await workspace.create_workspace(
+        _keep_request(workspace_name="it-replay2", replayed_from_workspace_id=source_id)
+    )
+    assert replay_id is not None
+    await workspace.finalize_workspace(replay_id, _ctx_with_story(), failed=False, wall_clock_s=5.0)
+    row = await workspace.get_workspace(db_session, replay_id)
+    assert row is not None
+    repro = (row.result_summary or {})["story_reproduction"]
+    assert repro["agent"] == "not_applicable"
+    assert repro["knowledge"] == "not_applicable"
+
+
+async def test_finalize_replay_dangling_source_is_unknown(db_session: AsyncSession) -> None:
+    """A dangling replay source (deleted / never existed) -> unknown."""
+    replay_id = await workspace.create_workspace(
+        _keep_request(workspace_name="it-dangle", replayed_from_workspace_id="0" * 32)
+    )
+    assert replay_id is not None
+    await workspace.finalize_workspace(replay_id, _ctx_with_story(), failed=False, wall_clock_s=5.0)
+    row = await workspace.get_workspace(db_session, replay_id)
+    assert row is not None
+    repro = (row.result_summary or {})["story_reproduction"]
+    assert repro["agent"] == "unknown"
+    assert repro["knowledge"] == "unknown"
+    assert repro["source_workspace_id"] is None
+
+
+async def test_list_approval_events_flattens_newest_first(db_session: AsyncSession) -> None:
+    """list_approval_events flattens entries newest-row-first and respects limit."""
+    for index in range(2):
+        wid = await workspace.create_workspace(_keep_request(workspace_name=f"it-ae-{index}"))
+        assert wid is not None
+        await workspace.finalize_workspace(wid, _ctx_with_story(), failed=False, wall_clock_s=1.0)
+    # A workspace with no approval_events must be excluded from the flatten.
+    plain = await workspace.create_workspace(_keep_request(workspace_name="it-ae-plain"))
+    assert plain is not None
+    await workspace.finalize_workspace(plain, _finished_ctx(), failed=False, wall_clock_s=1.0)
+
+    events = await workspace.list_approval_events(db_session, limit=50)
+    assert len(events) == 2
+    assert all(e["workspace_name"].startswith("it-ae-") for e in events)
+    assert all("workspace_id" in e and e["decision"] == "rejected" for e in events)
+    # Newest workspace first.
+    assert events[0]["workspace_name"] == "it-ae-1"
+
+    capped = await workspace.list_approval_events(db_session, limit=1)
+    assert len(capped) == 1
diff --git a/app/features/demo/workspace.py b/app/features/demo/workspace.py
index 1b3ba4aa..7b5ce5c7 100644
--- a/app/features/demo/workspace.py
+++ b/app/features/demo/workspace.py
@@ -219,11 +219,29 @@ async def finalize_workspace(
             row.date_start = ctx.date_start
             row.date_end = ctx.date_end
             row.created_objects = _collect_created_objects(ctx)
-            row.result_summary = {
+            # E5 (#411) -- story slots. Empty list -> None (E1: NULL = "slot
+            # never written"). Whole-value assignment so SQLAlchemy change
+            # detection fires (never mutate a loaded row's JSONB in place).
+            row.approval_events = ctx.approval_events or None
+            row.rag_events = ctx.rag_events or None
+            summary: dict[str, Any] = {
                 "winner_model_type": ctx.winner_model_type,
                 "winner_wape": ctx.winner_wape,
                 "wall_clock_s": wall_clock_s,
             }
+            # E5 (#411) D7 -- on a replay keep-run, compare the SOURCE row's
+            # story slots against this run's capture and record the verdict.
+            # One extra get-by-id select in this same session (inside the
+            # warn-and-continue try). A dangling source -> "unknown".
+            if row.replayed_from_workspace_id:
+                src_result = await db.execute(
+                    select(ShowcaseWorkspace).where(
+                        ShowcaseWorkspace.workspace_id == row.replayed_from_workspace_id
+                    )
+                )
+                src = src_result.scalar_one_or_none()
+                summary["story_reproduction"] = _story_reproduction(src, ctx)
+            row.result_summary = summary
             await db.commit()
     except Exception as exc:  # workspace must never break the demo
         logger.warning(
@@ -236,6 +254,68 @@ async def finalize_workspace(
     logger.info("demo.workspace_finalized", workspace_id=workspace_id, failed=failed)
 
 
+def _story_reproduction(src: ShowcaseWorkspace | None, ctx: DemoContext) -> dict[str, Any]:
+    """Compare the source row's story slots against this run's capture (E5 D7).
+
+    ``agent``: the source row had >=1 approval event -> compare with this run
+    (``reproduced`` / ``not_reproduced``); the source had none -> ``not_applicable``;
+    the source row is missing (soft reference dangles) -> ``unknown``.
+    ``knowledge``: same logic over ``rag_events`` entries whose ``event`` is
+    ``index`` / ``retrieve`` with ``status != "skip"`` (a real knowledge hit,
+    not a graceful skip).
+    """
+    if src is None:
+        return {"agent": "unknown", "knowledge": "unknown", "source_workspace_id": None}
+
+    def _verdict(source_had: bool, new_has: bool) -> str:
+        if not source_had:
+            return "not_applicable"
+        return "reproduced" if new_has else "not_reproduced"
+
+    def _has_knowledge(events: list[dict[str, Any]] | None) -> bool:
+        return any(
+            e.get("event") in ("index", "retrieve") and e.get("status") != "skip"
+            for e in (events or [])
+        )
+
+    return {
+        "agent": _verdict(bool(src.approval_events), bool(ctx.approval_events)),
+        "knowledge": _verdict(_has_knowledge(src.rag_events), _has_knowledge(ctx.rag_events)),
+        "source_workspace_id": src.workspace_id,
+    }
+
+
+async def list_approval_events(db: AsyncSession, *, limit: int = 50) -> list[dict[str, Any]]:
+    """Flatten ``approval_events`` across the newest rows that carry the slot.
+
+    Scans the newest workspace rows with a non-NULL ``approval_events`` slot
+    (an audit-glance surface, not a browse API) and flattens their entries
+    newest-row-first, each tagged with its ``workspace_id`` / ``workspace_name``,
+    capped at ``limit``. Python-side flatten over a low-cardinality table -- no
+    ``jsonb_array_elements`` SQL (D6).
+
+    Args:
+        db: An open async session (caller-owned).
+        limit: Maximum flattened entries to return (route caps 1-200).
+
+    Returns:
+        The flattened approval events, newest workspace first.
+    """
+    result = await db.execute(
+        select(ShowcaseWorkspace)
+        .where(ShowcaseWorkspace.approval_events.isnot(None))
+        .order_by(ShowcaseWorkspace.created_at.desc(), ShowcaseWorkspace.id.desc())
+        .limit(limit)
+    )
+    events: list[dict[str, Any]] = []
+    for row in result.scalars():
+        for entry in row.approval_events or []:
+            events.append({"workspace_id": row.workspace_id, "workspace_name": row.name, **entry})
+            if len(events) >= limit:
+                return events
+    return events
+
+
 async def get_workspace(db: AsyncSession, workspace_id: str) -> ShowcaseWorkspace | None:
     """Load a workspace row by its external id.
 

From 8277e45b1c1486b687c28fad6506efadb92af32a Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 08:49:37 +0200
Subject: [PATCH 26/32] feat(ui): add reject button, run story panel and ops
 approval history (#411)

---
 .../demo/WorkspaceArtifactsPanel.test.tsx     |   3 +
 .../demo/WorkspaceStoryPanel.test.tsx         | 121 +++++++++++
 .../components/demo/WorkspaceStoryPanel.tsx   | 193 ++++++++++++++++++
 .../components/demo/demo-step-card.test.tsx   |  90 +++++++-
 .../src/components/demo/demo-step-card.tsx    |  94 +++++----
 frontend/src/components/demo/index.ts         |   2 +
 frontend/src/hooks/index.ts                   |   1 +
 .../src/hooks/use-approval-events.test.ts     |  86 ++++++++
 frontend/src/hooks/use-approval-events.ts     |  19 ++
 frontend/src/pages/ops.tsx                    |  74 +++++++
 frontend/src/pages/showcase.tsx               |   3 +
 frontend/src/types/api.ts                     |  54 +++++
 12 files changed, 698 insertions(+), 42 deletions(-)
 create mode 100644 frontend/src/components/demo/WorkspaceStoryPanel.test.tsx
 create mode 100644 frontend/src/components/demo/WorkspaceStoryPanel.tsx
 create mode 100644 frontend/src/hooks/use-approval-events.test.ts
 create mode 100644 frontend/src/hooks/use-approval-events.ts
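
Reviewer note: the reproduction chips rendered by WorkspaceStoryPanel consume the verdicts that `_story_reproduction` writes into `result_summary.story_reproduction`. The verdict table can be sketched in isolation as below. This is a minimal sketch under stated assumptions, not the project's implementation: plain dicts stand in for the `ShowcaseWorkspace` row and the `DemoContext`, and the names `story_reproduction`, `verdict` and `has_knowledge` are illustrative only.

```python
from typing import Any, Optional


def has_knowledge(events: Optional[list[dict[str, Any]]]) -> bool:
    """True when at least one index/retrieve event was not a graceful skip."""
    return any(
        e.get("event") in ("index", "retrieve") and e.get("status") != "skip"
        for e in (events or [])
    )


def verdict(source_had: bool, new_has: bool) -> str:
    """Only a source that had a story can be reproduced or not reproduced."""
    if not source_had:
        return "not_applicable"
    return "reproduced" if new_has else "not_reproduced"


def story_reproduction(
    source: Optional[dict[str, Any]], run: dict[str, Any]
) -> dict[str, Any]:
    """Compare a source row (assumed dict-shaped here) with the replay's capture."""
    if source is None:
        # The soft reference dangles: the source row was deleted or never existed.
        return {"agent": "unknown", "knowledge": "unknown", "source_workspace_id": None}
    return {
        "agent": verdict(
            bool(source.get("approval_events")), bool(run.get("approval_events"))
        ),
        "knowledge": verdict(
            has_knowledge(source.get("rag_events")), has_knowledge(run.get("rag_events"))
        ),
        "source_workspace_id": source.get("workspace_id"),
    }
```

The panel renders every string-valued key except `source_workspace_id` as a chip, so the four verdict strings above are the only labels a reviewer should expect to see (with underscores shown as spaces).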

diff --git a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
index 6fe96923..fa8b0ab7 100644
--- a/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
+++ b/frontend/src/components/demo/WorkspaceArtifactsPanel.test.tsx
@@ -34,8 +34,11 @@ const fullWorkspace: WorkspaceDetail = {
   replayed_from_workspace_id: null,
   seed_overrides: null,
   user_scope: null,
+  run_config: null,
   notes: null,
   config_schema_version: 1,
+  approval_events: null,
+  rag_events: null,
 }
 
 function renderPanel(workspace: WorkspaceDetail, health: WorkspaceHealth | null = null) {
diff --git a/frontend/src/components/demo/WorkspaceStoryPanel.test.tsx b/frontend/src/components/demo/WorkspaceStoryPanel.test.tsx
new file mode 100644
index 00000000..9de64bef
--- /dev/null
+++ b/frontend/src/components/demo/WorkspaceStoryPanel.test.tsx
@@ -0,0 +1,121 @@
+/**
+ * E5 (#411) — render tests for WorkspaceStoryPanel: approval history,
+ * knowledge events, reproduction markers, and the legacy self-hide path.
+ */
+
+import { cleanup, render, screen } from '@testing-library/react'
+import { afterEach, describe, expect, it } from 'vitest'
+import { WorkspaceStoryPanel } from './WorkspaceStoryPanel'
+import type { WorkspaceDetail } from '@/types/api'
+
+afterEach(() => cleanup())
+
+const baseWorkspace: WorkspaceDetail = {
+  workspace_id: 'a'.repeat(32),
+  name: 'e5-story',
+  status: 'completed',
+  seed: 42,
+  scenario: 'showcase_rich',
+  reset: false,
+  skip_seed: true,
+  result_summary: null,
+  created_at: '2026-06-01T12:00:00Z',
+  store_id: 3,
+  product_id: 7,
+  date_start: '2026-01-01',
+  date_end: '2026-03-31',
+  created_objects: {},
+  archived: false,
+  pinned: false,
+  tags: [],
+  replayed_from_workspace_id: null,
+  seed_overrides: null,
+  user_scope: null,
+  run_config: null,
+  notes: null,
+  config_schema_version: 2,
+  approval_events: null,
+  rag_events: null,
+}
+
+function renderPanel(workspace: WorkspaceDetail) {
+  return render(<WorkspaceStoryPanel workspace={workspace} />)
+}
+
+describe('WorkspaceStoryPanel', () => {
+  it('renders nothing for a legacy row with no slots and no reproduction', () => {
+    const { container } = renderPanel(baseWorkspace)
+    expect(container.firstChild).toBeNull()
+    expect(screen.queryByTestId('workspace-story-panel')).toBeNull()
+  })
+
+  it('renders approval history with a decision badge, tool, and transcript snippet', () => {
+    const workspace: WorkspaceDetail = {
+      ...baseWorkspace,
+      approval_events: [
+        {
+          action_id: 'act-1',
+          tool_name: 'save_scenario',
+          decision: 'rejected',
+          decided_at: '2026-06-01T12:05:00Z',
+          session_id: 'sess-1',
+          auto_approved: false,
+          reason: 'not now',
+          execution_status: 'rejected',
+          transcript_summary: 'I would like to save this scenario plan.',
+          tokens_used: 240,
+          tool_calls_count: 1,
+        },
+      ],
+    }
+    const { container } = renderPanel(workspace)
+    expect(screen.getByTestId('workspace-story-panel')).toBeTruthy()
+    expect(container.textContent).toContain('rejected')
+    expect(container.textContent).toContain('save_scenario')
+    expect(container.textContent).toContain('I would like to save this scenario plan.')
+    expect(container.textContent).toContain('reason: not now')
+  })
+
+  it('renders knowledge events with event/status/provider/count', () => {
+    const workspace: WorkspaceDetail = {
+      ...baseWorkspace,
+      rag_events: [
+        {
+          event: 'index',
+          status: 'pass',
+          detail: 'indexed 5 files',
+          count: 42,
+          occurred_at: '2026-06-01T12:03:00Z',
+          provider: 'openai',
+          reachable: null,
+        },
+      ],
+    }
+    const { container } = renderPanel(workspace)
+    expect(container.textContent).toContain('index')
+    expect(container.textContent).toContain('pass')
+    expect(container.textContent).toContain('openai')
+    expect(container.textContent).toContain('count: 42')
+  })
+
+  it('renders reproduction markers only when story_reproduction is present', () => {
+    const workspace: WorkspaceDetail = {
+      ...baseWorkspace,
+      result_summary: {
+        story_reproduction: {
+          agent: 'reproduced',
+          knowledge: 'not_reproduced',
+          source_workspace_id: 'b'.repeat(32),
+        },
+      },
+    }
+    renderPanel(workspace)
+    const marker = screen.getByTestId('story-reproduction')
+    expect(marker.textContent).toContain('agent')
+    expect(marker.textContent).toContain('reproduced')
+    expect(marker.textContent).toContain('knowledge')
+    expect(marker.textContent).toContain('not reproduced')
+    // source_workspace_id is not rendered as a verdict chip.
+    expect(marker.textContent).not.toContain('source_workspace_id')
+  })
+})
diff --git a/frontend/src/components/demo/WorkspaceStoryPanel.tsx b/frontend/src/components/demo/WorkspaceStoryPanel.tsx
new file mode 100644
index 00000000..cc98b420
--- /dev/null
+++ b/frontend/src/components/demo/WorkspaceStoryPanel.tsx
@@ -0,0 +1,193 @@
+/**
+ * E5 (#411) — render the agent/HITL + RAG story captured on a LOADED
+ * workspace row. Three sections:
+ *   - Approval history: each approval_events entry (decision badge + tool +
+ *     transcript snippet + when).
+ *   - Knowledge events: each rag_events entry (event/status/provider/count).
+ *   - Reproduction markers: result_summary.story_reproduction chips (replay
+ *     rows only — rendered only when present).
+ *
+ * Renders NOTHING for legacy rows that carry neither slot nor a reproduction
+ * marker. Reads the row only — the run is long gone, the row is the memory
+ * (same contract as WorkspaceArtifactsPanel).
+ */
+
+import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card'
+import { StatusBadge } from '@/components/common/status-badge'
+import type {
+  ApprovalEventDetail,
+  RagEventDetail,
+  WorkspaceDetail,
+} from '@/types/api'
+
+interface WorkspaceStoryPanelProps {
+  workspace: WorkspaceDetail
+}
+
+/** Format an ISO timestamp for display; '—' when null. */
+function formatWhen(value: string | null | undefined): string {
+  if (!value) return '—'
+  const parsed = new Date(value)
+  return Number.isNaN(parsed.getTime()) ? value : parsed.toLocaleString()
+}
+
+/** Decision → StatusBadge variant. */
+function decisionVariant(
+  decision: string | null,
+): 'success' | 'error' | 'warning' | 'default' {
+  if (decision === 'approved') return 'success'
+  if (decision === 'rejected') return 'error'
+  if (decision === 'timed_out') return 'warning'
+  return 'default'
+}
+
+/** rag_events status → StatusBadge variant. */
+function ragStatusVariant(status: string): 'success' | 'warning' | 'pending' | 'default' {
+  if (status === 'pass') return 'success'
+  if (status === 'warn') return 'warning'
+  if (status === 'skip') return 'pending'
+  return 'default'
+}
+
+/** story_reproduction verdict → StatusBadge variant. */
+function verdictVariant(verdict: string): 'success' | 'error' | 'pending' | 'default' {
+  if (verdict === 'reproduced') return 'success'
+  if (verdict === 'not_reproduced') return 'error'
+  if (verdict === 'not_applicable' || verdict === 'unknown') return 'pending'
+  return 'default'
+}
+
+/** Read result_summary.story_reproduction as a tolerant map of string verdicts. */
+function readReproduction(
+  summary: Record<string, unknown> | null,
+): Record<string, string> | null {
+  if (!summary || typeof summary !== 'object') return null
+  const raw = (summary as Record<string, unknown>).story_reproduction
+  if (!raw || typeof raw !== 'object') return null
+  const out: Record<string, string> = {}
+  for (const [key, value] of Object.entries(raw as Record<string, unknown>)) {
+    if (typeof value === 'string') out[key] = value
+  }
+  return Object.keys(out).length > 0 ? out : null
+}
+
+export function WorkspaceStoryPanel({ workspace }: WorkspaceStoryPanelProps) {
+  const approvalEvents: ApprovalEventDetail[] = workspace.approval_events ?? []
+  const ragEvents: RagEventDetail[] = workspace.rag_events ?? []
+  const reproduction = readReproduction(workspace.result_summary)
+
+  // Legacy rows: nothing captured -> render nothing.
+  if (approvalEvents.length === 0 && ragEvents.length === 0 && reproduction === null) {
+    return null
+  }
+
+  return (
+    <Card data-testid="workspace-story-panel">
+      <CardHeader>
+        <CardTitle>Run story</CardTitle>
+        <CardDescription>
+          The agent/HITL approval and knowledge moments this run captured —
+          replayed from the workspace row.
+        </CardDescription>
+      </CardHeader>
+      <CardContent className="space-y-5">
+        {/* Reproduction markers — replay rows only. */}
+        {reproduction && (
+          <div className="space-y-2" data-testid="story-reproduction">
+            <h3 className="text-sm font-semibold">Replay reproduction</h3>
+            <div className="flex flex-wrap items-center gap-2">
+              {Object.entries(reproduction)
+                .filter(([key]) => key !== 'source_workspace_id')
+                .map(([key, verdict]) => (
+                  <span key={key} className="flex items-center gap-1.5 text-xs">
+                    <span className="text-muted-foreground">{key}</span>
+                    <StatusBadge variant={verdictVariant(verdict)}>
+                      {verdict.replace(/_/g, ' ')}
+                    </StatusBadge>
+                  </span>
+                ))}
+            </div>
+          </div>
+        )}
+
+        {/* Approval history. */}
+        <div className="space-y-2">
+          <h3 className="text-sm font-semibold">Approval history</h3>
+          {approvalEvents.length === 0 ? (
+            <p className="text-sm text-muted-foreground">No approval events recorded.</p>
+          ) : (
+            <ul className="space-y-2">
+              {approvalEvents.map((event, index) => (
+                <li
+                  key={`${event.action_id ?? 'action'}-${index}`}
+                  className="rounded-md border p-3"
+                >
+                  <div className="flex flex-wrap items-center gap-2">
+                    <StatusBadge variant={decisionVariant(event.decision)}>
+                      {event.decision ?? 'unknown'}
+                    </StatusBadge>
+                    {event.tool_name && (
+                      <span className="font-mono text-xs text-muted-foreground">
+                        {event.tool_name}
+                      </span>
+                    )}
+                    {event.auto_approved === true && (
+                      <span className="text-xs text-muted-foreground">(auto)</span>
+                    )}
+                    <span className="ml-auto text-xs text-muted-foreground">
+                      {formatWhen(event.decided_at)}
+                    </span>
+                  </div>
+                  {event.transcript_summary && (
+                    <p className="mt-1 break-words text-sm text-muted-foreground">
+                      {event.transcript_summary}
+                    </p>
+                  )}
+                  {event.reason && (
+                    <p className="mt-1 text-xs text-muted-foreground">
+                      reason: {event.reason}
+                    </p>
+                  )}
+                </li>
+              ))}
+            </ul>
+          )}
+        </div>
+
+        {/* Knowledge events. */}
+        <div className="space-y-2">
+          <h3 className="text-sm font-semibold">Knowledge events</h3>
+          {ragEvents.length === 0 ? (
+            <p className="text-sm text-muted-foreground">No knowledge events recorded.</p>
+          ) : (
+            <ul className="space-y-2">
+              {ragEvents.map((event, index) => (
+                <li
+                  key={`${event.event}-${index}`}
+                  className="flex flex-wrap items-center gap-2 rounded-md border p-3 text-xs"
+                >
+                  <span className="font-mono font-semibold">{event.event}</span>
+                  <StatusBadge variant={ragStatusVariant(event.status)}>
+                    {event.status}
+                  </StatusBadge>
+                  {event.provider && (
+                    <span className="rounded-md bg-muted px-2 py-0.5 font-mono">
+                      {event.provider}
+                    </span>
+                  )}
+                  <span className="text-muted-foreground">count: {event.count}</span>
+                  {event.detail && (
+                    <span className="break-words text-muted-foreground">{event.detail}</span>
+                  )}
+                  <span className="ml-auto text-muted-foreground">
+                    {formatWhen(event.occurred_at)}
+                  </span>
+                </li>
+              ))}
+            </ul>
+          )}
+        </div>
+      </CardContent>
+    </Card>
+  )
+}
diff --git a/frontend/src/components/demo/demo-step-card.test.tsx b/frontend/src/components/demo/demo-step-card.test.tsx
index eac4112d..9e0b5d6d 100644
--- a/frontend/src/components/demo/demo-step-card.test.tsx
+++ b/frontend/src/components/demo/demo-step-card.test.tsx
@@ -3,13 +3,16 @@
  * and the Inspect deep-link hrefs they expose.
  */
 
-import { afterEach, describe, expect, it } from 'vitest'
-import { cleanup, render, screen } from '@testing-library/react'
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { cleanup, fireEvent, render, screen, waitFor } from '@testing-library/react'
 import { MemoryRouter } from 'react-router-dom'
 import type { DemoStep } from '@/hooks/use-demo-pipeline'
 import { DemoStepCard } from './demo-step-card'
 
-afterEach(cleanup)
+afterEach(() => {
+  cleanup()
+  vi.unstubAllGlobals()
+})
 
 function makeStep(
   name: string,
@@ -144,28 +147,101 @@ describe('DemoStepCard PRP-39 mini-summaries', () => {
     expect(text).toContain('approval=executed')
   })
 
-  it('agent_hitl_flow — running + awaiting_approval=true surfaces the Approve button', () => {
+  it('agent_hitl_flow — running + awaiting_approval=true surfaces Approve and Reject', () => {
     const step = makeStep('agent_hitl_flow', 'running', {
       session_id: 'sess-x',
       awaiting_approval: true,
       action_id: 'act-y',
-      approval_url: '/agents/sessions/sess-x/approve',
+      decision_window_s: 10,
     })
     const { container } = renderCard(step, null)
     const buttons = Array.from(container.querySelectorAll('button')).map((b) => b.textContent)
     expect(buttons).toContain('Approve')
+    expect(buttons).toContain('Reject')
   })
 
-  it('agent_hitl_flow — terminal status hides the Approve button', () => {
+  it('agent_hitl_flow — terminal status hides the decision buttons', () => {
     const step = makeStep('agent_hitl_flow', 'pass', {
       session_id: 'sess-x',
       awaiting_approval: true, // stale flag from intermediate event
       action_id: 'act-y',
-      approval_url: '/agents/sessions/sess-x/approve',
+      decision_window_s: 10,
     })
     const { container } = renderCard(step, null)
     const buttons = Array.from(container.querySelectorAll('button')).map((b) => b.textContent)
     expect(buttons).not.toContain('Approve')
+    expect(buttons).not.toContain('Reject')
+  })
+
+  it('agent_hitl_flow — countdown reads data.decision_window_s', () => {
+    const step = makeStep('agent_hitl_flow', 'running', {
+      session_id: 'sess-x',
+      awaiting_approval: true,
+      action_id: 'act-y',
+      decision_window_s: 7,
+    })
+    const { container } = renderCard(step, null)
+    expect(container.textContent).toContain('auto-approve in 7s')
+  })
+
+  it('agent_hitl_flow — Approve POSTs the demo relay with the approved decision', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(new Response(null, { status: 204 }))
+    vi.stubGlobal('fetch', fetchMock)
+    const step = makeStep('agent_hitl_flow', 'running', {
+      session_id: 'sess-x',
+      awaiting_approval: true,
+      action_id: 'act-y',
+      decision_window_s: 10,
+    })
+    renderCard(step, null)
+    fireEvent.click(screen.getByRole('button', { name: 'Approve' }))
+    await waitFor(() => expect(fetchMock).toHaveBeenCalledTimes(1))
+    const call = fetchMock.mock.calls[0]!
+    expect(String(call[0])).toContain('/demo/hitl-decision')
+    const init = call[1] as RequestInit
+    expect(init.method).toBe('POST')
+    expect(JSON.parse(String(init.body))).toEqual({ action_id: 'act-y', decision: 'approved' })
+    // Both buttons disable after a click.
+    expect(screen.getByRole('button', { name: 'Approving…' })).toBeTruthy()
+    expect((screen.getByRole('button', { name: 'Reject' }) as HTMLButtonElement).disabled).toBe(true)
+  })
+
+  it('agent_hitl_flow — Reject POSTs the demo relay with the rejected decision', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(new Response(null, { status: 204 }))
+    vi.stubGlobal('fetch', fetchMock)
+    const step = makeStep('agent_hitl_flow', 'running', {
+      session_id: 'sess-x',
+      awaiting_approval: true,
+      action_id: 'act-z',
+      decision_window_s: 10,
+    })
+    renderCard(step, null)
+    fireEvent.click(screen.getByRole('button', { name: 'Reject' }))
+    await waitFor(() => expect(fetchMock).toHaveBeenCalledTimes(1))
+    const init = fetchMock.mock.calls[0]![1] as RequestInit
+    expect(JSON.parse(String(init.body))).toEqual({ action_id: 'act-z', decision: 'rejected' })
+    expect(screen.getByRole('button', { name: 'Rejecting…' })).toBeTruthy()
+  })
+
+  it('agent_hitl_flow — absorbs a 404 (auto-approve raced) without surfacing an error', async () => {
+    const problem = JSON.stringify({ status: 404, detail: 'No pending HITL action' })
+    const fetchMock = vi.fn().mockResolvedValue(
+      new Response(problem, {
+        status: 404,
+        headers: { 'content-type': 'application/problem+json' },
+      }),
+    )
+    vi.stubGlobal('fetch', fetchMock)
+    const step = makeStep('agent_hitl_flow', 'running', {
+      session_id: 'sess-x',
+      awaiting_approval: true,
+      action_id: 'act-y',
+      decision_window_s: 10,
+    })
+    const { container } = renderCard(step, null)
+    fireEvent.click(screen.getByRole('button', { name: 'Approve' }))
+    await waitFor(() => expect(fetchMock).toHaveBeenCalledTimes(1))
+    expect(container.textContent).not.toMatch(/decision failed/)
   })
 
   it('ops_snapshot — renders the 5-tile mini grid with values', () => {
diff --git a/frontend/src/components/demo/demo-step-card.tsx b/frontend/src/components/demo/demo-step-card.tsx
index b2b1379b..9fef1529 100644
--- a/frontend/src/components/demo/demo-step-card.tsx
+++ b/frontend/src/components/demo/demo-step-card.tsx
@@ -4,6 +4,7 @@ import { Link } from 'react-router-dom'
 import type { DemoStep, DemoStepUiStatus } from '@/hooks/use-demo-pipeline'
 import { Button } from '@/components/ui/button'
 import { Card } from '@/components/ui/card'
+import { api, ApiError } from '@/lib/api'
 import { cn } from '@/lib/utils'
 import { HorizonBucketsMini } from './HorizonBucketsMini'
 
@@ -361,58 +362,77 @@ function OpsSnapshotMiniGrid({ data }: { data: Record<string, unknown> }) {
 }
 
 /**
- * PRP-41 — one-click Approve button rendered on the HITL step card when
- * the backend has emitted `awaiting_approval=true` + `status='running'`.
+ * E5 (#411) — Approve / Reject buttons rendered on the HITL step card while
+ * the backend awaits a decision (`awaiting_approval=true` + `status='running'`).
  *
- * Clicking POSTs `{action_id, approved: true}` to the captured approval_url.
- * Optimistic disable on click; the backend's auto-approve absorbs a 400
- * "No pending action" if the auto-approve fires first (Task 1 contract probe).
+ * Either click relays the operator's intent to the DEMO slice via
+ * `POST /demo/hitl-decision` (through `lib/api.ts` `api()` — API_BASE_URL
+ * prefixed, never bare `fetch`, so it works off-origin). The pipeline is the
+ * sole caller of the agents approve endpoint. Both buttons disable after
+ * either click. A live "auto-approve in Ns" countdown reads the backend's
+ * `decision_window_s` (fallback 10) — never hardcoded, never derived from the
+ * 90 s hard timeout. 4xx responses such as 404/409 are absorbed silently (the
+ * auto-approve raced); only 5xx or a non-API (network) failure shows an error.
  */
-function ApproveButton({
-  approvalUrl,
+function HitlDecisionButtons({
   actionId,
+  decisionWindowS,
 }: {
-  approvalUrl: string
   actionId: string
+  decisionWindowS: number
 }) {
-  const [clicked, setClicked] = useState(false)
+  const [pending, setPending] = useState<'approved' | 'rejected' | null>(null)
   const [error, setError] = useState<string | null>(null)
-  const [waitingMs, setWaitingMs] = useState(0)
+  const [remaining, setRemaining] = useState(Math.max(0, Math.ceil(decisionWindowS)))
 
   useEffect(() => {
-    if (clicked) return
-    const startedAt = Date.now()
-    const id = setInterval(() => setWaitingMs(Date.now() - startedAt), 1000)
+    if (pending) return
+    const id = setInterval(() => {
+      setRemaining((prev) => (prev > 0 ? prev - 1 : 0))
+    }, 1000)
     return () => clearInterval(id)
-  }, [clicked])
+  }, [pending])
 
-  const onClick = async () => {
-    if (clicked || !approvalUrl || !actionId) return
-    setClicked(true)
+  const decide = async (decision: 'approved' | 'rejected') => {
+    if (pending || !actionId) return
+    setPending(decision)
     try {
-      const res = await fetch(approvalUrl, {
+      await api<void>('/demo/hitl-decision', {
         method: 'POST',
-        headers: { 'content-type': 'application/json' },
-        body: JSON.stringify({ action_id: actionId, approved: true }),
+        body: { action_id: actionId, decision },
       })
-      // Absorb 4xx absorptions silently — the auto-approve already landed
-      // and the next StepEvent will surface the terminal status.
-      if (!res.ok && res.status >= 500) {
-        setError(`approve failed (${res.status})`)
-      }
     } catch (err) {
-      setError(err instanceof Error ? err.message : 'approve failed')
+      // Absorb 404 (no pending action) / 409 (already decided): the auto-approve
+      // or a prior click raced. Surface only 5xx and non-API (network) errors.
+      if (err instanceof ApiError && err.status >= 500) {
+        setError(`decision failed (${err.status})`)
+      } else if (!(err instanceof ApiError)) {
+        setError(err instanceof Error ? err.message : 'decision failed')
+      }
     }
   }
 
   return (
-    <div className="mt-3 flex items-center gap-3">
-      <Button onClick={onClick} disabled={clicked} size="sm" variant="default">
-        {clicked ? 'Approving…' : 'Approve'}
+    <div className="mt-3 flex flex-wrap items-center gap-3">
+      <Button
+        onClick={() => void decide('approved')}
+        disabled={pending !== null}
+        size="sm"
+        variant="default"
+      >
+        {pending === 'approved' ? 'Approving…' : 'Approve'}
+      </Button>
+      <Button
+        onClick={() => void decide('rejected')}
+        disabled={pending !== null}
+        size="sm"
+        variant="destructive"
+      >
+        {pending === 'rejected' ? 'Rejecting…' : 'Reject'}
       </Button>
-      {!clicked && waitingMs > 30_000 && (
+      {!pending && (
         <span className="text-xs text-muted-foreground">
-          Still waiting — auto-approve in {Math.max(0, Math.ceil((90_000 - waitingMs) / 1000))}s
+          auto-approve in {remaining}s
         </span>
       )}
       {error && <span className="text-xs text-destructive">{error}</span>}
@@ -493,14 +513,18 @@ export function DemoStepCard({ step, index, inspectHref }: DemoStepCardProps) {
           {/* PRP-41 — agents (HITL) + ops snapshot mini-summaries. */}
           {step.name === 'agent_hitl_flow' && <HitlFlowSummary data={step.data} />}
           {step.name === 'ops_snapshot' && <OpsSnapshotMiniGrid data={step.data} />}
-          {/* PRP-41 — one-click Approve only while awaiting (status==running). */}
+          {/* E5 (#411) — Approve / Reject only while awaiting (status==running);
+              countdown reads data.decision_window_s (fallback 10). */}
           {step.data.awaiting_approval === true &&
             step.status === 'running' &&
-            typeof step.data.approval_url === 'string' &&
             typeof step.data.action_id === 'string' && (
-              <ApproveButton
-                approvalUrl={step.data.approval_url}
+              <HitlDecisionButtons
                 actionId={step.data.action_id}
+                decisionWindowS={
+                  typeof step.data.decision_window_s === 'number'
+                    ? step.data.decision_window_s
+                    : 10
+                }
               />
             )}
           {showInspect && (
diff --git a/frontend/src/components/demo/index.ts b/frontend/src/components/demo/index.ts
index 39245c1b..0b1e7cfa 100644
--- a/frontend/src/components/demo/index.ts
+++ b/frontend/src/components/demo/index.ts
@@ -10,3 +10,5 @@ export * from './workspace-name'
 // E3 (#409) — advanced seed config + focus-pair selection.
 export * from './SeedConfigPanel'
 export * from './ScopeSelector'
+// E5 (#411) — agent/HITL + RAG story capture panel.
+export * from './WorkspaceStoryPanel'
diff --git a/frontend/src/hooks/index.ts b/frontend/src/hooks/index.ts
index fb3e6aa7..9b31a266 100644
--- a/frontend/src/hooks/index.ts
+++ b/frontend/src/hooks/index.ts
@@ -15,3 +15,4 @@ export * from './use-websocket'
 export * from './use-seeder'
 export * from './use-demo-pipeline'
 export * from './use-workspaces'
+export * from './use-approval-events'
diff --git a/frontend/src/hooks/use-approval-events.test.ts b/frontend/src/hooks/use-approval-events.test.ts
new file mode 100644
index 00000000..0aeb0cc7
--- /dev/null
+++ b/frontend/src/hooks/use-approval-events.test.ts
@@ -0,0 +1,86 @@
+/**
+ * E5 (#411) — unit tests for useApprovalEvents. Stubs fetch to assert the
+ * hook calls GET /demo/approval-events with the limit param and surfaces the
+ * flattened response (pattern: use-workspaces.test.ts).
+ */
+import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
+import { renderHook, waitFor } from '@testing-library/react'
+import { afterEach, describe, expect, it, vi } from 'vitest'
+import { createElement, type ReactNode } from 'react'
+
+import { useApprovalEvents } from './use-approval-events'
+
+function makeWrapper(client: QueryClient) {
+  return function Wrapper({ children }: { children: ReactNode }) {
+    return createElement(QueryClientProvider, { client }, children)
+  }
+}
+
+function jsonResponse(body: unknown, status = 200): Response {
+  return new Response(JSON.stringify(body), {
+    status,
+    headers: { 'content-type': 'application/json' },
+  })
+}
+
+afterEach(() => {
+  vi.unstubAllGlobals()
+})
+
+describe('useApprovalEvents', () => {
+  it('GETs /demo/approval-events with the limit param and returns the events', async () => {
+    const body = {
+      events: [
+        {
+          workspace_id: 'a'.repeat(32),
+          workspace_name: 'e5-story',
+          action_id: 'act-1',
+          tool_name: 'save_scenario',
+          decision: 'approved',
+          decided_at: '2026-06-01T12:05:00Z',
+          session_id: 'sess-1',
+          auto_approved: false,
+          reason: null,
+          execution_status: 'executed',
+          transcript_summary: 'save it',
+        },
+      ],
+      total: 1,
+    }
+    const fetchMock = vi.fn().mockResolvedValue(jsonResponse(body))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useApprovalEvents(25), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    const url = String(fetchMock.mock.calls[0]![0])
+    expect(url).toContain('/demo/approval-events')
+    expect(url).toContain('limit=25')
+    expect(result.current.data?.total).toBe(1)
+    expect(result.current.data?.events[0]?.tool_name).toBe('save_scenario')
+  })
+
+  it('defaults the limit to 50', async () => {
+    const fetchMock = vi.fn().mockResolvedValue(jsonResponse({ events: [], total: 0 }))
+    vi.stubGlobal('fetch', fetchMock)
+
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    const { result } = renderHook(() => useApprovalEvents(), {
+      wrapper: makeWrapper(client),
+    })
+    await waitFor(() => expect(result.current.isSuccess).toBe(true))
+
+    expect(String(fetchMock.mock.calls[0]![0])).toContain('limit=50')
+  })
+
+  it('stays disabled when enabled=false', () => {
+    const fetchMock = vi.fn()
+    vi.stubGlobal('fetch', fetchMock)
+    const client = new QueryClient({ defaultOptions: { queries: { retry: false } } })
+    renderHook(() => useApprovalEvents(50, false), { wrapper: makeWrapper(client) })
+    expect(fetchMock).not.toHaveBeenCalled()
+  })
+})
diff --git a/frontend/src/hooks/use-approval-events.ts b/frontend/src/hooks/use-approval-events.ts
new file mode 100644
index 00000000..2b4cd9ea
--- /dev/null
+++ b/frontend/src/hooks/use-approval-events.ts
@@ -0,0 +1,19 @@
+import { useQuery } from '@tanstack/react-query'
+import { api } from '@/lib/api'
+import type { ApprovalEventsResponse } from '@/types/api'
+
+/**
+ * E5 (#411) — recent HITL approval events flattened across saved showcase
+ * workspaces, newest-first. Deliberately NOT polled: the table only changes
+ * when a showcase run finishes capturing a decision, so refetch-on-mount is
+ * sufficient (mirrors useRetrainingCandidates). queryKey carries `limit` so
+ * distinct caps cache independently.
+ */
+export function useApprovalEvents(limit = 50, enabled = true) {
+  return useQuery({
+    queryKey: ['demo', 'approval-events', limit],
+    queryFn: () =>
+      api<ApprovalEventsResponse>('/demo/approval-events', { params: { limit } }),
+    enabled,
+  })
+}
diff --git a/frontend/src/pages/ops.tsx b/frontend/src/pages/ops.tsx
index 233c8ef5..86455136 100644
--- a/frontend/src/pages/ops.tsx
+++ b/frontend/src/pages/ops.tsx
@@ -3,6 +3,7 @@ import { useNavigate, Link } from 'react-router-dom'
 import { Activity, AlertTriangle, CheckCircle2, Clock, Download, RefreshCw } from 'lucide-react'
 import { toast } from 'sonner'
 import { useModelHealth, useOpsSummary, useRetrainingCandidates } from '@/hooks/use-ops'
+import { useApprovalEvents } from '@/hooks/use-approval-events'
 import { useProviderHealth } from '@/hooks/use-config'
 import { useCreateJob } from '@/hooks/use-jobs'
 import { useCreateAlias, useRun, useAliases } from '@/hooks/use-runs'
@@ -97,6 +98,8 @@ export default function OpsPage() {
   const summaryQuery = useOpsSummary()
   const candidatesQuery = useRetrainingCandidates()
   const modelHealthQuery = useModelHealth()
+  // E5 (#411) — recent HITL approval events flattened across saved workspaces.
+  const approvalEventsQuery = useApprovalEvents()
   const providerQuery = useProviderHealth()
   const aliasesQuery = useAliases()
   const createJob = useCreateJob()
@@ -238,6 +241,18 @@ export default function OpsPage() {
     setPromoteTarget(null)
   }
 
+  const approvalEvents = approvalEventsQuery.data?.events ?? []
+
+  /** E5 (#411) — approval decision → StatusBadge variant. */
+  function decisionBadgeVariant(
+    decision: string | null,
+  ): 'success' | 'error' | 'warning' | 'default' {
+    if (decision === 'approved') return 'success'
+    if (decision === 'rejected') return 'error'
+    if (decision === 'timed_out') return 'warning'
+    return 'default'
+  }
+
   /** PRP-36 enum → human-readable reason chip label. */
   function staleReasonLabel(reason: string | null): string {
     if (reason === null) return '—'
@@ -445,6 +460,65 @@ export default function OpsPage() {
             </CardContent>
           </Card>
 
+          {/* E5 (#411) — Approval History. Recent HITL approval decisions
+              captured on saved showcase workspaces (demo slice endpoint;
+              frontend-only surface). */}
+          <Card>
+            <CardHeader>
+              <CardTitle>Approval History</CardTitle>
+              <CardDescription>
+                Recent human-in-the-loop approval decisions captured on saved
+                showcase workspaces — approve, reject, or window-lapse auto-approve.
+              </CardDescription>
+            </CardHeader>
+            <CardContent>
+              {approvalEventsQuery.isLoading ? (
+                <LoadingState message="Loading approval history..." />
+              ) : approvalEvents.length === 0 ? (
+                <p className="py-6 text-center text-sm text-muted-foreground">
+                  No approval events yet — run a showcase pipeline with the HITL
+                  step to capture one.
+                </p>
+              ) : (
+                <Table>
+                  <TableHeader>
+                    <TableRow>
+                      <TableHead>Decision</TableHead>
+                      <TableHead>Tool</TableHead>
+                      <TableHead>Workspace</TableHead>
+                      <TableHead>Transcript</TableHead>
+                      <TableHead>When</TableHead>
+                    </TableRow>
+                  </TableHeader>
+                  <TableBody>
+                    {approvalEvents.map((event, index) => (
+                      <TableRow key={`${event.workspace_id}-${event.action_id ?? index}`}>
+                        <TableCell>
+                          <StatusBadge variant={decisionBadgeVariant(event.decision)}>
+                            {event.decision ?? 'unknown'}
+                            {event.auto_approved === true ? ' (auto)' : ''}
+                          </StatusBadge>
+                        </TableCell>
+                        <TableCell className="font-mono text-xs">
+                          {event.tool_name ?? '—'}
+                        </TableCell>
+                        <TableCell className="text-sm">
+                          {event.workspace_name ?? event.workspace_id.slice(0, 8)}
+                        </TableCell>
+                        <TableCell className="max-w-md truncate text-sm text-muted-foreground">
+                          {event.transcript_summary ?? '—'}
+                        </TableCell>
+                        <TableCell className="text-sm text-muted-foreground">
+                          {formatWhen(event.decided_at)}
+                        </TableCell>
+                      </TableRow>
+                    ))}
+                  </TableBody>
+                </Table>
+              )}
+            </CardContent>
+          </Card>
+
           {/* PRP-37 — Stale aliases. Surfaces the new
               feature_frame_version_mismatch reason chip (PRP-36) alongside
               the existing newer-run / artifact-not-verified / run-not-success
diff --git a/frontend/src/pages/showcase.tsx b/frontend/src/pages/showcase.tsx
index 97545ce8..9330bf17 100644
--- a/frontend/src/pages/showcase.tsx
+++ b/frontend/src/pages/showcase.tsx
@@ -13,6 +13,7 @@ import { ReplayConfirmDialog } from '@/components/demo/ReplayConfirmDialog'
 import { WorkspaceLineageStrip } from '@/components/demo/WorkspaceLineageStrip'
 import { WorkspacePanel } from '@/components/demo/WorkspacePanel'
 import { WorkspaceArtifactsPanel } from '@/components/demo/WorkspaceArtifactsPanel'
+import { WorkspaceStoryPanel } from '@/components/demo/WorkspaceStoryPanel'
 import { SeedConfigPanel } from '@/components/demo/SeedConfigPanel'
 import { ScopeSelector } from '@/components/demo/ScopeSelector'
 import { RunConfigPanel } from '@/components/demo/RunConfigPanel'
@@ -554,6 +555,8 @@ export default function ShowcasePage() {
             workspace={loadedWorkspace}
             health={workspaceHealth ?? null}
           />
+          {/* E5 (#411) — captured agent/HITL + RAG story; self-hides on legacy rows. */}
+          <WorkspaceStoryPanel workspace={loadedWorkspace} />
         </div>
       )}
 
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index 637f7a62..74a691cb 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -883,6 +883,10 @@ export interface WorkspaceDetail extends WorkspaceListItem {
   // E1 (#407) — operator annotation + schema version.
   notes: string | null
   config_schema_version: number
+  // E5 (#411) -- story slots: agent/HITL approval + RAG knowledge events.
+  // null until E5 writes them; legacy rows stay null.
+  approval_events: ApprovalEventDetail[] | null
+  rag_events: RagEventDetail[] | null
 }
 
 // Page shape of GET /demo/workspaces.
@@ -891,6 +895,56 @@ export interface WorkspaceListResponse {
   total: number
 }
 
+// === Showcase story capture (E5, #411) ===
+
+// One approval_events entry on WorkspaceDetail (built from JSONB; tolerant).
+export interface ApprovalEventDetail {
+  action_id: string | null
+  tool_name: string | null
+  decision: 'approved' | 'rejected' | 'timed_out' | string | null
+  decided_at: string | null
+  session_id: string | null
+  auto_approved?: boolean | null
+  reason?: string | null
+  execution_status?: string | null
+  tool_call_summary?: { description?: string; arguments_keys?: string[] } | null
+  transcript_summary?: string | null
+  tokens_used?: number | null
+  tool_calls_count?: number | null
+}
+
+// One rag_events entry on WorkspaceDetail (built from JSONB; tolerant).
+export interface RagEventDetail {
+  event: 'probe' | 'index' | 'retrieve' | 'skip' | string
+  status: 'pass' | 'warn' | 'skip' | string
+  detail: string
+  count: number
+  occurred_at: string
+  provider?: string | null
+  reachable?: boolean | null
+}
+
+// One flattened row from GET /demo/approval-events (workspace-tagged).
+export interface ApprovalEventItem {
+  workspace_id: string
+  workspace_name: string | null
+  action_id: string | null
+  tool_name: string | null
+  decision: string | null
+  decided_at: string | null
+  session_id: string | null
+  auto_approved: boolean | null
+  reason: string | null
+  execution_status: string | null
+  transcript_summary: string | null
+}
+
+// Page shape of GET /demo/approval-events.
+export interface ApprovalEventsResponse {
+  events: ApprovalEventItem[]
+  total: number
+}
+
 // E2 (#408) — partial-update body for PATCH /demo/workspaces/{workspace_id}
 // (E1 endpoint). Absent field = unchanged; explicit null clears name/notes.
 export interface WorkspaceUpdate {

From d41f80b803e5a640a4568f6131220f75ed9cc429 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 08:49:37 +0200
Subject: [PATCH 27/32] docs(docs): document approval and rag story capture
 contracts (#411)

---
 docs/_base/API_CONTRACTS.md |  5 ++++-
 docs/_base/DOMAIN_MODEL.md  | 10 +++++-----
 docs/_base/RUNBOOKS.md      |  4 ++--
 3 files changed, 11 insertions(+), 8 deletions(-)
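
Reviewer note (not part of the commit): the `POST /demo/hitl-decision` row
added to API_CONTRACTS.md below documents a body of `{action_id, decision,
reason?}` and the statuses 204 / 404 / 409 / 422. A minimal sketch of how a
client might build that body and classify the response follows. The helper
names `buildHitlDecisionBody` and `classifyHitlDecisionStatus` are
hypothetical and do not exist in the repository; treating 422 as an error is
this sketch's own choice, whereas the step card in this series surfaces only
5xx and network failures.

```typescript
// Hypothetical helpers for illustration only; not part of this patch series.
type HitlDecision = 'approved' | 'rejected'

interface HitlDecisionBody {
  action_id: string
  decision: HitlDecision
  reason?: string
}

type DecisionOutcome = 'relayed' | 'absorbed' | 'error'

// Build the request body; the contract caps `reason` at 500 characters.
function buildHitlDecisionBody(
  actionId: string,
  decision: HitlDecision,
  reason?: string,
): HitlDecisionBody {
  if (reason !== undefined && reason.length > 500) {
    throw new Error('reason must be at most 500 characters')
  }
  const body: HitlDecisionBody = { action_id: actionId, decision }
  // Omit the key entirely when no reason is given.
  if (reason !== undefined) body.reason = reason
  return body
}

// Map a response status onto what the caller should do with it.
function classifyHitlDecisionStatus(status: number): DecisionOutcome {
  if (status === 204) return 'relayed' // decision accepted by the relay
  if (status === 404 || status === 409) return 'absorbed' // a race was lost
  return 'error' // 422, 5xx, or anything unexpected (sketch assumption)
}
```

A caller would typically ignore `absorbed`, since the next StepEvent carries
the terminal status, and show an inline message only for `error`.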

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index 2922c077..b2fb5c64 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -65,6 +65,8 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | demo | GET | `/demo/workspaces/{workspace_id}/health` | **E2 (#408)** — probe the workspace's soft references in-process (model runs, scenario plans, alias, batch, agent session, `job_ids` slot) via `httpx.ASGITransport`; per-reference `status` ∈ `alive` (2xx) / `dead` (404 — deleted after the run) / `unknown` (anything else — never a 500), plus `alive`/`dead`/`unknown` counts and `partial_run` (true when the row's status ≠ `completed`); non-probeable keys (`v2_model_path`, `scenario_artifact_key`, `train_model_types`) are skipped; `404 application/problem+json` when the workspace is missing |
 | demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
 | demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
+| demo | POST | `/demo/hitl-decision` | **E5 (#411)** — relay the Showcase HITL step card's Approve/Reject to the in-flight pipeline. Body `{action_id: str, decision: 'approved' \| 'rejected', reason?: str ≤500}` (`ConfigDict(strict=True, extra='forbid')`). `204` on success; `404 application/problem+json` when no matching action is pending; `409` when the action was already decided; `422` on a malformed body. The in-memory single-slot relay is safe because the pipeline runs one-at-a-time under the module `_pipeline_lock`; the pipeline forwards the real decision to `/agents/sessions/{id}/approve` (`approved=true\|false` + reason) — `agent_require_approval` is untouched. A reject keeps the pipeline GREEN (D5); the gated `save_scenario` never executes |
+| demo | GET | `/demo/approval-events` | **E5 (#411)** — recent HITL approval events flattened across the newest saved workspaces carrying the `approval_events` slot, newest-workspace-first (`limit` 1-200 default 50); `200` + empty list when none. Each item carries `workspace_id` / `workspace_name` plus the entry's base + additive keys (`decision`, `tool_name`, `auto_approved`, `reason`, `execution_status`, `transcript_summary`, …). Audit-glance surface — no pagination/offset (D6). Backs the `/ops` page's Approval History table (frontend-only — the ops slice does not import demo code) |
 | config | GET | `/config/ai` | Effective AI-model config (agent LLM + RAG embeddings); API keys masked, never raw |
 | config | PATCH | `/config/ai` | Persist + apply AI-model changes live (no restart). `409` if an embedding-dimension change would orphan indexed RAG chunks (resend with `force=true`) |
 | config | GET | `/config/providers/health` | Per-provider connectivity — Ollama probed live, cloud providers by API-key presence |
@@ -92,7 +94,7 @@ Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified ag
 - **Server → client (every frame):** Pydantic-serialized `StepEvent` — `{"event_type", "step_name", "step_index", "total_steps", "status", "detail", "duration_ms", "data", "timestamp", "phase_name"?, "phase_index"?, "phase_total"?}`. PRP-38 — the three `phase_*` fields are Optional + Nullable so legacy clients that don't render phases keep working.
 - **`event_type` values (Literal in `StepEvent`):**
   - `step_start` — a step began; `status` is `null`.
-  - `step_complete` — a step finished; `status ∈ {pass, fail, skip, warn}`, `data` carries structured payload (backtest `per_model` WAPE + `winner` + `bucketed_aggregated_metrics` on PRP-36/38 feature-aware runs; register `run_id` + `alias`; PRP-38 `v2_train` → `v2_run_id` + `feature_frame_version` + `feature_columns_count` + `feature_groups` + `artifact_uri_full`).
+  - `step_complete` — a step finished; `status ∈ {pass, fail, skip, warn}`, `data` carries structured payload (backtest `per_model` WAPE + `winner` + `bucketed_aggregated_metrics` on PRP-36/38 feature-aware runs; register `run_id` + `alias`; PRP-38 `v2_train` → `v2_run_id` + `feature_frame_version` + `feature_columns_count` + `feature_groups` + `artifact_uri_full`). **E5 (#411)** — `agent_hitl_flow` ALSO emits an INTERMEDIATE `step_complete` with `status="running"` + `data.awaiting_approval=true` while it is still awaiting; that frame now reaches the browser DURING the decision window (the orchestrator drains the intermediate-event sink concurrently with the in-flight step — D2; pre-E5 it flushed only after the step returned, so the button could never render in time). Its `data` gains `decision_url: "/demo/hitl-decision"` + `decision_window_s: float` (the FE countdown reads this, never hardcodes) alongside the existing `action_id` / `session_id` / `approval_url`.
   - `pipeline_complete` — final event; `data` carries `winner_model_type`, `winner_wape`, `winning_run_id`, `alias`, `wall_clock_s`, `v2_run_id` (PRP-38; null when no V2 run was registered), and `workspace_id` (E1 #390; additive — a string on `preservation="keep"` runs, null otherwise).
   - `error` — bad start frame or a concurrent run already in progress; one event, then the server closes.
 - Concurrency: a module-level `asyncio.Lock` allows one pipeline at a time. A second `POST /demo/run` returns `409`; a second `WS /demo/stream` receives one `error` event.
@@ -100,6 +102,7 @@ Drives the end-to-end demo pipeline for the dashboard Showcase page. Verified ag
 - PRP-40 — `scenario="showcase_rich"` ALSO adds two phases inserted BEFORE `verify`: `planning` (2 steps — `scenario_simulate_and_save`, `multi_plan_compare`) and `knowledge` (3 steps — `embedding_provider_probe`, `rag_index_subset`, `rag_retrieve_probe`). Total step count: 19 for `showcase_rich`, 11 for `demo_minimal` and `sparse`. Phase ids on `showcase_rich` are `data` / `modeling` / `decision` / `planning` / `knowledge` / `verify` / `agent` / `cleanup` (8 phases). The knowledge steps SKIP gracefully when the embedding provider is unreachable; the pipeline still goes green.
 - E3 (#392) — the planning-phase steps tag the plans they save: pipeline-saved plans now carry `source:showcase` (alongside the legacy `showcase` + `price`/`holiday` tags), and on `preservation="keep"` runs additionally `workspace:<workspace_name|workspace_id>` — retrievable via `GET /scenarios?tags=workspace:<label>` (JSONB containment, all listed tags must match). The `scenario_simulate_and_save` step's `data` additively echoes the `tags` list it sent.
 - E4 (#393) — the start frame's E1 preservation fields are now exercised by the Showcase UI ("Save as workspace" checkbox + name + seed inputs). **Replay** re-submits a recorded workspace's config verbatim (`seed`/`scenario`/`reset`/`skip_seed`) with `preservation="keep"` (+ the recorded `workspace_name`), creating a NEW `showcase_workspace` row each time — the original row is never mutated; names are non-unique by design. Saved rows are read back over `GET /demo/workspaces` (+ `/{workspace_id}`). E1 (#407) — the Replay start frame now also sends `replayed_from_workspace_id: <source workspace_id>`, so replays carry lineage. E2 (#408) — every panel Replay now requires an explicit confirmation dialog with a recorded-vs-sent config preview (destructive copy + destructive-styled confirm on `reset=true` workspaces); the saved-workspaces panel renders the lineage as a replay badge + clickable ancestor chain (deleted ancestors marked, never an error), and a two-workspace compare page lives at `/showcase/compare?a=&b=` (frontend-only diff — no new backend endpoint).
+- E5 (#411) — `scenario="showcase_rich"` agents phase: the HITL decision window grew 3 s → **10 s** (a human can now actually click) and the `agent_hitl_flow` step card renders **Approve + Reject** buttons that POST `/demo/hitl-decision` (the relay; NOT `/agents/.../approve` directly). An operator **Reject** keeps the run GREEN (D5) — terminal `("pass", "rejected by operator", …)`, and the gated `save_scenario` never runs (no `scenario_plan` row). No decision in the window → auto-approve (`auto_approved=true`). On `preservation="keep"` runs the resolved approvals land in the workspace `approval_events` slot and the three knowledge steps land in `rag_events` (schema v2 — see `docs/_base/DOMAIN_MODEL.md`); a replay keep-run additionally records `result_summary.story_reproduction`. Capture is warn-and-continue — it never fails a green pipeline. No `DemoRunRequest` change; legacy / `demo_minimal` / `sparse` frames stream byte-identically (no relay events, no slots written).
 
 ## Async Events / Queues
 
diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index 150a3a41..fb9c7878 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -58,13 +58,13 @@
 ### `showcase_workspace` (Demo)
 - **Root:** `ShowcaseWorkspace(workspace_id: str, status: str)` — one row = one preserved (`preservation="keep"`) showcase run. Ephemeral runs (the default) write no row; a `workspace_name` merely labels a keep-run row (names are non-unique).
 - **Status state machine:** `running` → `completed` | `failed` (CHECK-constrained; the finalize hook settles the row even on mid-run failure).
-- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants). E4 (#410) adds the replay-input column `run_config` (nullable JSONB, `{"train_model_types": [...], "backtest": {...}}` or NULL on default-config runs) — a REPLAY INPUT in the same class as `seed`/`scenario`/`reset`/`skip_seed`, **NOT a story slot** (D1): it records the start-frame model set + backtest config a kept run was launched with, written by `create_workspace` at insert time and consumed by Load/Replay. `config_schema_version` is deliberately NOT bumped by E4 — it versions the STORY-SLOT schema; `run_config` presence is NULL-detectable and carries its own documented shape.
-- **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload).
+- **Stored metadata:** replay config (`seed`, `scenario`, `reset`, `skip_seed`), showcase grain + window (`store_id`, `product_id`, `date_start`, `date_end` — NULL on early failure), lifecycle (`status`, `created_at`/`updated_at`), and the JSONB payloads below. E1 (#407) adds operator-curation columns `archived` / `pinned` (booleans, default false, PATCH-mutable, orthogonal to `status` — the pipeline owns the run lifecycle), `notes` (free text, 2000-char cap at the Pydantic boundary), `tags` (a queryable JSONB string array — its own GIN-indexed column, exact `scenario_plan.tags` pattern, ≤20 items at the PATCH boundary), `config_schema_version` (int, default 1 — versions the workspace config + story-slot schema as a whole; any epic that changes a documented slot shape bumps the ORM default and documents the delta here), and the provenance column `replayed_from_workspace_id` (String(32), btree-indexed SOFT reference — see Invariants). E4 (#410) adds the replay-input column `run_config` (nullable JSONB, `{"train_model_types": [...], "backtest": {...}}` or NULL on default-config runs) — a REPLAY INPUT in the same class as `seed`/`scenario`/`reset`/`skip_seed`, **NOT a story slot** (D1): it records the start-frame model set + backtest config a kept run was launched with, written by `create_workspace` at insert time and consumed by Load/Replay. `config_schema_version` is deliberately NOT bumped by E4 — it versions the STORY-SLOT schema; `run_config` presence is NULL-detectable and carries its own documented shape. E5 (#411) DOES bump the ORM default `1 → 2` (it widened the `approval_events.decision` enum + added `"probe"` to `rag_events.event` + the additive entry keys below); `server_default` stays `text("1")` (no migration — pre-E5 rows inserted outside the ORM legitimately read 1).
+- **JSONB fields:** `created_objects` (sparse soft-reference keys — `winning_run_id`, `v2_run_id`, `v2_model_path`, `alias`, `agent_session_id`, `batch_id`, `scenario_plan_ids`, `scenario_artifact_key`, `train_model_types`, `stale_alias_run_id`) and `result_summary` (winner / WAPE / wall-clock display payload). E5 (#411) additively writes `result_summary.story_reproduction` on a REPLAY keep-run (`replayed_from_workspace_id` set): `{"agent": V, "knowledge": V, "source_workspace_id": str|null}` with `V ∈ "reproduced" | "not_reproduced" | "not_applicable" | "unknown"` — `agent` compares the source row's `approval_events` vs this run's; `knowledge` compares `rag_events` entries whose `event` is `index`/`retrieve` with `status != "skip"`; a dangling source row → `unknown`. Computed inside `finalize_workspace`'s warn-and-continue try (one extra get-by-id select). It is NOT a story slot and NOT a new column (D7).
 - **JSONB story slots (E1 #407 — authoritative per-slot schema):** six dedicated nullable JSONB columns; `NULL` = "slot never written" (distinct from empty). E1 ships the columns only — each slot has an assigned writer epic:
   - `seed_overrides` (**WRITTEN since E3 #409**) — SPARSE dict: only operator-set knobs appear, `{}` is never stored (`None` instead). Allow-listed keys (the `SeederOverrides` schema, `app/shared/seeder/overrides.py`): `stores` int 1-100, `products` int 1-500, `window_days` int 75-365, `sparsity` float 0-0.9, `promotion_intensity` float 0-0.5, `stockout_intensity` float 0-0.5, `noise_sigma` float 0-0.5. Persisted via `model_dump(mode="json", exclude_none=True)` at create time; replay re-submits it verbatim. Records the REQUESTED config — the data the run actually seeded follows from it deterministically.
   - `user_scope` (**WRITTEN since E3 #409**) — dict: operator-selected focus, `{"store_id": int>=1, "product_id": int>=1}` (`UserScope` schema, `extra=forbid`; additive keys need a documented schema change). Records the REQUESTED pair; the row's `store_id`/`product_id` columns record the EFFECTIVE grain the run modeled — the two legitimately diverge when the requested pair dangled and the status step warn-fell-back to discovery (divergence is visible by design). Both slots are exposed on the workspace LIST item (not detail-only) because the frontend Replay builds its start frame from list rows.
-  - `approval_events` (E5 #411 writes) — list[dict], append-only: `{"action_id": str, "tool_name": str, "decision": "approved"|"rejected", "decided_at": iso8601-str, "session_id": str}`.
-  - `rag_events` (E5 #411 writes) — list[dict], append-only: `{"event": "index"|"retrieve"|"skip", "detail": str, "count": int, "occurred_at": iso8601-str}`.
+  - `approval_events` (**WRITTEN since E5 #411** — schema v2) — list[dict], append-only, one entry per resolved HITL approval. E1-frozen base keys: `{"action_id": str, "tool_name": str, "decision": "approved"|"rejected"|"timed_out", "decided_at": iso8601-str, "session_id": str}` (the `decision` enum is WIDENED with `"timed_out"` in v2). E5 additive keys (`config_schema_version >= 2`): `"auto_approved": bool` (true when the 10 s decision window lapsed → auto-approve), `"reason": str|None` (operator Reject reason, ≤500), `"execution_status": str|None` (agents-API `ApprovalResponse.status`: `executed`|`rejected`|`expired`; `"external_4xx"` on the absorbed direct-pre-empt edge; `None` on `timed_out`), `"tool_call_summary": {"description": str, "arguments_keys": list[str]}` (argument KEYS only — never values, per security-patterns.md), `"transcript_summary": str` (agent chat message, ≤200 chars), `"tokens_used": int`, `"tool_calls_count": int`. Written by `step_agent_hitl_flow` (in-memory) → `finalize_workspace` (DB, warn-and-continue); empty → slot stays NULL.
+  - `rag_events` (**WRITTEN since E5 #411** — schema v2) — list[dict], append-only, one entry per knowledge-step return path. `{"event": "probe"|"index"|"retrieve"|"skip", "status": "pass"|"warn"|"skip", "detail": str, "count": int, "occurred_at": iso8601-str, "provider": str|None, "reachable": bool|None}` (the `"probe"` event value + `status`/`provider`/`reachable` keys are v2 additive; `reachable` is populated on the probe entry only). Written by the three knowledge steps (`embedding_provider_probe`/`rag_index_subset`/`rag_retrieve_probe`) on `showcase_rich` → `finalize_workspace`; empty → slot stays NULL.
   - `job_ids` (later parallel epic — E2 #408 / E4 #410 agree on the writer) — list[str]: job / batch sub-job ids the run submitted (soft references).
   - `phase_summaries` (later parallel epic) — list[dict], one per phase: `{"phase_name": str, "status": "pass"|"fail"|"warn"|"skip", "steps": int, "duration_ms": float}`.
 - **Relationship to demo pipeline runs:** one workspace row per kept pipeline run — `create_workspace` inserts it as `running` before the first step; `finalize_workspace` settles it with the run's collected ids. NOT a seeder `scenario`: a preset is a reusable data-generation recipe; a workspace is the record of ONE concrete run (which preset it used, with what seed, and what it produced).
@@ -76,7 +76,7 @@
   - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`). The HTTP-backed helpers (`update_workspace` for PATCH, like get/list/delete) take a caller-owned session and raise normally — warn-and-continue is pipeline-only.
   - E1 (#407): `replayed_from_workspace_id` is a SOFT reference — **no ForeignKey, not even self-referential**: ancestor workspace rows must stay independently deletable (metadata-only delete) without cascading to or blocking descendants. The value is recorded verbatim from the request (no existence check); dangling lineage pointers after an ancestor delete are expected and harmless, like every `created_objects` id.
   - E1 (#407): `status` is NOT patchable — `PATCH /demo/workspaces/{id}` covers `name`/`notes`/`tags`/`archived`/`pinned` only; `archived` is an orthogonal curation flag and the `ck_showcase_workspace_status` CHECK is untouched.
-- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/`, RAG-event / approval-decision capture (columns exist as E1 story slots; the writers are E5 #411), and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace. (Advanced seed config + scope selection shipped in E3 #409 — the `seed_overrides`/`user_scope` slots above are now written.)
+- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/` (E6 #412) and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace. (Advanced seed config + scope selection shipped in E3 #409 — the `seed_overrides`/`user_scope` slots are written. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events`/`rag_events` slots above are now written, plus `result_summary.story_reproduction` on replay keep-runs. The `job_ids`/`phase_summaries` slots remain unwritten.)
 
 ## Key Invariants — NEVER violate
 
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index 495e476d..6cdb9fd7 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -130,7 +130,7 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 21. **`rag_index_subset` step fails with `path_prefix escapes the project root` (PRP-40, `showcase_rich` only)** — the demo step hard-codes `path_prefix="docs/user-guide"`, so a real-world hit means `RAGService._base_dir` no longer points at the repo root (e.g. a misconfigured container start). Fix: confirm the backend was started from the repo root (or that `RAGService(base_dir=...)` was constructed with the right path); rerun the showcase. The path-traversal guard is load-bearing security — never relax it.
 22. **`rag_retrieve_probe` step shows ⚠️ with `no hits — corpus indexed but query did not match` (PRP-40, `showcase_rich` only)** — the 5-file corpus was indexed (the prior step PASSed) but the canned query `"How do I run the demo pipeline?"` returned zero hits. Common cause: the embedding-provider was switched mid-showcase and indexed chunks are now orphaned (memory anchor: `[[rag-runtime-config-and-corpus-state]]`); the pgvector column has one fixed dimension per provider. Fix: stick to one provider, or clear the RAG corpus (`DELETE /rag/sources/{id}` per source) and re-run.
 23. **`agent_hitl_flow` step shows ⏭️ with `no API key matching agent_default_model provider` (PRP-41, `showcase_rich` only)** — expected when no LLM key is set for the configured `agent_default_model` provider. Pipeline still goes green. Fix only if you want the HITL phase to run: set `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` / `GOOGLE_API_KEY` to match the provider prefix in `agent_default_model` (e.g. `anthropic:claude-...` → `ANTHROPIC_API_KEY`).
-24. **`agent_hitl_flow` step shows ⏭️ with `approval timed out -- pipeline continued` (PRP-41, `showcase_rich` only)** — the 90 s hard timeout fired before `POST /agents/sessions/{id}/approve` completed. Causes: agent retry / provider 5xx / network hang. Pipeline still greens; `cleanup` still closes the session via `DELETE /agents/sessions/{id}`. Fix: check uvicorn logs for the `session_id` echoed in the step's `data.session_id`.
+24. **`agent_hitl_flow` step shows ⏭️ with `approval timed out -- pipeline continued` (PRP-41, `showcase_rich` only)** — the 90 s hard timeout fired before `POST /agents/sessions/{id}/approve` completed. Causes: agent retry / provider 5xx / network hang. Pipeline still greens; `cleanup` still closes the session via `DELETE /agents/sessions/{id}`. Fix: check uvicorn logs for the `session_id` echoed in the step's `data.session_id`. **E5 (#411)** — distinguish this 90 s HARD timeout (entry `decision="timed_out"`, step ⏭️ skip) from the normal 10 s DECISION WINDOW lapse (no operator click → auto-approve, entry `decision="approved"`, `auto_approved=true`, step ✅ pass). The card now shows **Approve + Reject** buttons (POST `/demo/hitl-decision`, the relay — NOT `/agents/.../approve` directly) with a live "auto-approve in Ns" countdown reading `data.decision_window_s`; a human **Reject** keeps the run GREEN (terminal `pass` + detail `rejected by operator`) and the gated `save_scenario` never writes a `scenario_plan` row. On `preservation="keep"` runs the resolved approval lands in the workspace `approval_events` slot (+ `rag_events` for the knowledge steps; `result_summary.story_reproduction` on replays). The intermediate `awaiting_approval` frame now reaches the browser DURING the window (D2 concurrent drain) — pre-E5 the Approve button could never render in time.
 25. **`agent_hitl_flow` step shows ⏭️ with `agent did not trigger save_scenario` (PRP-41, `showcase_rich` only)** — the agent answered the prompt directly (no `tool_save_scenario` call) so `pending_approval=false` came back on the chat response. Cause: model picked a different tool / answered in chat. Pipeline still greens. Fix: re-run; the model's response is non-deterministic. If the model ALWAYS skips the tool, raise the temperature in `agent_default_model` or re-prompt.
 26. **`ops_snapshot` step shows ⚠️ with `/ops/* all 4xx/5xx -- ops snapshot unavailable` (PRP-41, `showcase_rich` only)** — all three of `GET /ops/summary`, `/ops/retraining-candidates`, `/ops/model-health` returned non-2xx. Cause: DB unreachable, alembic migration drift, OpsService change broke the schema. Pipeline still warn (NEVER fail). Fix: `docker compose ps`; `uv run alembic upgrade head`; re-run.
 27. **Stop button used mid-run** — the Stop button on `/showcase` closes the WebSocket; the backend's `WebSocketDisconnect` handler at `app/features/demo/routes.py:74` releases `_pipeline_lock`. Page returns to `idle` within ~5 s with banner "Pipeline cancelled by user.". To resume, click Run again. Half-finished registry rows / scenario plans persist (the backend doesn't roll them back — they're operator-visible artefacts of a partial run).
@@ -164,7 +164,7 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`. E3 (#409) — a kept run additionally records its `seed_overrides` and `user_scope` story slots at create time; Replay re-submits both verbatim (the slot records the REQUESTED config; the row's `store_id`/`product_id` columns record the EFFECTIVE grain, so a fallen-back scope stays visible).
 
-**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/`; RAG-event and approval-decision capture on the workspace row (the E1 #407 story-slot columns exist but stay NULL until E5 #411 writes them); mid-run / per-phase re-entry (the linear single-`asyncio.Lock` pipeline is preserved — all configuration is start-frame-time only). (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim. Run configuration shipped in E4 #410 — the start-frame model set + backtest config in the `run_config` replay-input column, replay-verbatim.)
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/` (E6 #412); mid-run / per-phase re-entry (the linear single-`asyncio.Lock` pipeline is preserved — all configuration is start-frame-time only); the `job_ids` / `phase_summaries` story slots (columns exist, still unwritten). (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim. Run configuration shipped in E4 #410 — the start-frame model set + backtest config in the `run_config` replay-input column, replay-verbatim. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events` / `rag_events` slots are now written, plus the **Reject** button + 10 s decision window + `/ops` Approval History + the loaded-workspace Run-story panel.)
 
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.

From 06712644704280f84eaac1d3e6a25ad60e3c7f1e Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 09:44:26 +0200
Subject: [PATCH 28/32] feat(api): add showcase workspace export bundle
 endpoint (#412)

POST /demo/workspaces/{id}/export writes a checksum-validated bundle
(manifest.json + scenario-plan snapshots + sha256sum-compatible
checksums.sha256) under artifacts/showcase/<workspace_id>/, validating
every checksum before returning. Soft references resolve over in-process
HTTP; model artifacts are referenced (uri + registry hash + live verify),
never copied. Dangling refs are reported, not fatal; 404 missing, 409
while running, deterministic overwrite on re-export. Stateless: no
migration, no DB writes. Traversal guard + chunked sha256 mirror
registry/storage.py (pattern, not import).
---
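Reviewer note (not part of the commit): a minimal sketch of how a consumer
might re-validate an exported bundle, assuming each `checksums.sha256` line is
`<hex digest>`, two spaces, then a path relative to the bundle directory. The
helper names `sha256_of` and `verify_bundle` are hypothetical and do not exist
in this patch; only the two-space separator, the chunked hashing, and the fact
that the checksum file excludes itself come from the commit.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 8192) -> str:
    """Chunked SHA-256 of a file, so large files are never read in one go."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path) -> list[str]:
    """Return the relative paths whose digest does not match the checksum file.

    Assumption: paths in checksums.sha256 are relative to bundle_dir.
    """
    mismatched: list[str] = []
    text = (bundle_dir / "checksums.sha256").read_text(encoding="utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        # sha256sum text mode: "<digest>  <path>" (two-space separator).
        expected, _, rel_path = line.partition("  ")
        target = bundle_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatched.append(rel_path)
    return mismatched
```

An empty list means every listed file matched. If the assumed layout holds,
running `sha256sum -c checksums.sha256` from inside the bundle directory should
give the same answer.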
 .env.example                           |   4 +
 app/core/config.py                     |   4 +
 app/features/demo/export.py            | 387 +++++++++++++++++++++++++
 app/features/demo/routes.py            |  55 +++-
 app/features/demo/schemas.py           |  57 ++++
 app/features/demo/tests/test_export.py | 362 +++++++++++++++++++++++
 app/features/demo/tests/test_routes.py |  78 ++++-
 7 files changed, 944 insertions(+), 3 deletions(-)
 create mode 100644 app/features/demo/export.py
 create mode 100644 app/features/demo/tests/test_export.py

diff --git a/.env.example b/.env.example
index 38ef75b4..62da51b9 100644
--- a/.env.example
+++ b/.env.example
@@ -29,6 +29,10 @@ FORECAST_ENABLE_LIGHTGBM=false
 # FORECAST_ENABLE_XGBOOST defaults to false (opt-in; install ml-xgboost extra)
 # FORECAST_ENABLE_RANDOM_FOREST=false  # PRP-36 optional model — pure sklearn, no extra needed
 
+# Demo / Showcase settings
+# E6 (#412) — root for saved-workspace export bundles (manifest + checksums).
+SHOWCASE_EXPORT_ROOT=./artifacts/showcase
+
 # RAG Configuration
 # Embedding Provider: "openai" or "ollama"
 RAG_EMBEDDING_PROVIDER=openai
diff --git a/app/core/config.py b/app/core/config.py
index 033c77d9..6cedfd37 100644
--- a/app/core/config.py
+++ b/app/core/config.py
@@ -129,6 +129,10 @@ class Settings(BaseSettings):
     registry_artifact_root: str = "./artifacts/registry"
     registry_duplicate_policy: Literal["allow", "deny", "detect"] = "detect"
 
+    # Demo / Showcase
+    # E6 (#412) — root for workspace export bundles (manifest + checksums).
+    showcase_export_root: str = "./artifacts/showcase"
+
     # Analytics
     analytics_max_rows: int = 10000
     analytics_max_date_range_days: int = 730
diff --git a/app/features/demo/export.py b/app/features/demo/export.py
new file mode 100644
index 00000000..fc5090f9
--- /dev/null
+++ b/app/features/demo/export.py
@@ -0,0 +1,387 @@
+"""Workspace export-bundle writer (E6, issue #412).
+
+Write a self-describing, checksum-validated bundle for a saved showcase
+workspace under ``<showcase_export_root>/<workspace_id>/``::
+
+    manifest.json              versioned snapshot + references
+    scenario_plans/<id>.json   one per resolvable scenario plan
+    checksums.sha256           sha256sum-compatible; covers every other file
+
+Frozen decisions (see ``PRPs/PRP-showcase-completion-E6-export-bundle.md``):
+
+1. One directory per ``workspace_id`` (unique uuid4 hex), keyed off the DB row.
+2. Re-export is a deterministic overwrite -- the existing guarded bundle
+   directory is removed and rewritten; ``exported_at`` records the moment.
+3. Soft references resolve over the public HTTP surface IN-PROCESS
+   (``httpx.ASGITransport``) -- the demo slice may not import the registry /
+   scenarios slices (vertical-slice rule). Any non-2xx -> an
+   ``unresolved_references`` entry (or ``artifact_verified=None``), never a
+   failed export.
+4. Model artifacts are REFERENCED (uri + registry hash + live verify result),
+   never copied.
+5. Stateless -- export writes NOTHING to the database (no row, no story slot).
+6. ``failed`` workspaces are exportable; ``running`` ones are a 409.
+7. ``checksums.sha256`` excludes itself (a self-referencing checksum file is a
+   bootstrap hole) and uses the two-space ``sha256sum`` separator.
+
+The traversal guard (:func:`_resolve_bundle_dir`) and chunked SHA-256
+(:func:`_compute_sha256`) MIRROR ``app/features/registry/storage.py``
+(``LocalFSProvider._resolve_path`` / ``AbstractStorageProvider.compute_hash``)
+-- the vertical-slice rule forbids importing that module, so the ~10-line
+pattern is reimplemented here. Reference resolution uses the same in-process
+``httpx`` client ``app/features/demo/link_health.py`` uses.
+"""
+
+from __future__ import annotations
+
+import hashlib
+import json
+import shutil
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import TYPE_CHECKING, Any
+
+import httpx
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from app.core.config import get_settings
+from app.core.exceptions import ConflictError, ForecastLabError, NotFoundError
+from app.core.logging import get_logger
+from app.features.demo import workspace
+from app.features.demo.models import WORKSPACE_STATUS_RUNNING
+from app.features.demo.schemas import (
+    BUNDLE_FORMAT_VERSION,
+    ExportFileEntry,
+    UnresolvedReference,
+    WorkspaceDetailResponse,
+    WorkspaceExportResult,
+)
+
+if TYPE_CHECKING:
+    from fastapi import FastAPI
+
+logger = get_logger(__name__)
+
+_MANIFEST = "manifest.json"
+_CHECKSUMS = "checksums.sha256"
+_PLANS_DIR = "scenario_plans"
+# created_objects run-id keys whose registry runs the manifest references.
+_RUN_KEYS = ("winning_run_id", "v2_run_id", "stale_alias_run_id")
+# Generous in-process budget (no real network); a failing driven endpoint
+# surfaces as a 5xx response under raise_app_exceptions=False, not an exception.
+_EXPORT_TIMEOUT = httpx.Timeout(30.0, connect=5.0)
+
+
+def _compute_sha256(path: Path) -> str:
+    """Chunked SHA-256 of a file (mirror ``registry/storage.py:compute_hash``)."""
+    digest = hashlib.sha256()
+    with path.open("rb") as handle:
+        for chunk in iter(lambda: handle.read(8192), b""):
+            digest.update(chunk)
+    return digest.hexdigest()
+
+
+def _resolve_bundle_dir(root: Path, workspace_id: str) -> Path:
+    """Resolve ``<root>/<workspace_id>``, guarding against path traversal.
+
+    Mirrors ``registry/storage.py:LocalFSProvider._resolve_path`` -- ``resolve()``
+    then ``relative_to(root)``. A ``workspace_id`` that escapes the root raises
+    ``ValueError`` BEFORE any disk I/O. ``root`` must already be resolved. The
+    id always comes from the DB row (uuid4 hex), never raw from the URL path, so
+    this is defense in depth.
+    """
+    bundle_dir = (root / workspace_id).resolve()
+    try:
+        bundle_dir.relative_to(root)
+    except ValueError:
+        logger.warning(
+            "demo.export_path_traversal_attempt",
+            workspace_id=workspace_id,
+            root=str(root),
+        )
+        raise
+    return bundle_dir
+
+
+def _write_json(path: Path, payload: dict[str, Any]) -> int:
+    """Write deterministic JSON (sorted keys, 2-space indent, trailing newline).
+
+    ``sort_keys`` makes the bytes order-independent so unchanged state
+    re-exports to identical bytes (stable checksums). Returns the byte size.
+    """
+    data = (json.dumps(payload, indent=2, sort_keys=True) + "\n").encode("utf-8")
+    path.write_bytes(data)
+    return len(data)
+
+
+def _root_relative(root: Path) -> str:
+    """Cwd-relative POSIX string for display; the path as given if outside the cwd."""
+    try:
+        return root.relative_to(Path.cwd()).as_posix()
+    except ValueError:
+        return root.as_posix()
+
+
+def _open_client(app: FastAPI) -> httpx.AsyncClient:
+    """In-process client over ``ASGITransport`` (pattern: ``link_health.py``).
+
+    ``raise_app_exceptions=False`` is load-bearing: a driven endpoint's failure
+    becomes a 5xx *response* (-> ``unresolved_references`` / ``artifact_verified
+    =None``), never a re-raised exception inside the export. ``base_url`` is
+    cosmetic but required by httpx.
+    """
+    return httpx.AsyncClient(
+        transport=httpx.ASGITransport(app=app, raise_app_exceptions=False),
+        base_url="http://demo.internal",
+        timeout=_EXPORT_TIMEOUT,
+    )
+
+
+async def _resolve_model_runs(
+    client: httpx.AsyncClient,
+    created: dict[str, Any],
+) -> tuple[list[dict[str, Any]], list[UnresolvedReference]]:
+    """Resolve the run-id soft references to manifest model-run references.
+
+    A run that resolves (2xx) is referenced (uri + registry hash + a live
+    ``artifact_verified`` from the verify endpoint when both uri and hash are
+    present). A non-2xx run is an ``unresolved_references`` entry. A failed
+    artifact *verify* on a resolved run is NOT unresolved -- the run resolved;
+    only its artifact check did not (``artifact_verified=None``).
+    """
+    model_runs: list[dict[str, Any]] = []
+    unresolved: list[UnresolvedReference] = []
+    for key in _RUN_KEYS:
+        run_id = created.get(key)
+        if not isinstance(run_id, str) or not run_id:
+            continue
+        resp = await client.get(f"/registry/runs/{run_id}")
+        if resp.status_code != 200:
+            reason = f"HTTP {resp.status_code}"
+            unresolved.append(UnresolvedReference(key=key, ref_id=run_id, reason=reason))
+            logger.warning(
+                "demo.export_unresolved_reference", key=key, ref_id=run_id, reason=reason
+            )
+            continue
+        body = resp.json()
+        artifact_uri = body.get("artifact_uri")
+        artifact_hash = body.get("artifact_hash")
+        verified: bool | None = None
+        if artifact_uri and artifact_hash:
+            vresp = await client.get(f"/registry/runs/{run_id}/verify")
+            if vresp.status_code == 200:
+                raw = vresp.json().get("verified")
+                verified = raw if isinstance(raw, bool) else None
+        model_runs.append(
+            {
+                "key": key,
+                "run_id": run_id,
+                "model_type": body.get("model_type"),
+                "status": body.get("status"),
+                "artifact_uri": artifact_uri,
+                "artifact_hash": artifact_hash,
+                "artifact_verified": verified,
+                "metrics": body.get("metrics"),
+            }
+        )
+    return model_runs, unresolved
+
+
+async def _resolve_scenario_plans(
+    client: httpx.AsyncClient,
+    created: dict[str, Any],
+    plans_dir: Path,
+) -> tuple[list[dict[str, Any]], list[tuple[str, int]], list[UnresolvedReference]]:
+    """Write a JSON snapshot per resolvable scenario plan; report dangles.
+
+    Returns ``(manifest plan entries, written (relpath, size) pairs,
+    unresolved)``. The plan body is stored verbatim -- its ``run_id`` is the
+    forecast ARTIFACT key, not a registry ``model_run.run_id`` (different id
+    spaces; memory anchor ``scenario-run-id-vs-registry-run-id``), so it is
+    never joined against the registry.
+    """
+    plan_entries: list[dict[str, Any]] = []
+    file_entries: list[tuple[str, int]] = []
+    unresolved: list[UnresolvedReference] = []
+    # JSONB types this list[str], but nothing enforces it at runtime -- treat
+    # entries as untrusted (mirrors link_health's created_objects guards).
+    raw_plan_ids = created.get("scenario_plan_ids")
+    plan_ids: list[Any] = raw_plan_ids if isinstance(raw_plan_ids, list) else []
+    for scenario_id in plan_ids:
+        if not isinstance(scenario_id, str) or not scenario_id:
+            continue
+        resp = await client.get(f"/scenarios/{scenario_id}")
+        if resp.status_code != 200:
+            reason = f"HTTP {resp.status_code}"
+            unresolved.append(
+                UnresolvedReference(key="scenario_plan_ids", ref_id=scenario_id, reason=reason)
+            )
+            logger.warning(
+                "demo.export_unresolved_reference",
+                key="scenario_plan_ids",
+                ref_id=scenario_id,
+                reason=reason,
+            )
+            continue
+        body = resp.json()
+        rel = f"{_PLANS_DIR}/{scenario_id}.json"
+        size = _write_json(plans_dir / f"{scenario_id}.json", body)
+        plan_entries.append(
+            {
+                "scenario_id": scenario_id,
+                "file": rel,
+                "name": body.get("name") if isinstance(body, dict) else None,
+            }
+        )
+        file_entries.append((rel, size))
+    return plan_entries, file_entries, unresolved
+
+
+def _validate_checksums(bundle_dir: Path) -> bool:
+    """Re-read ``checksums.sha256``, recompute every listed hash, compare.
+
+    Returns ``False`` (the caller logs it) rather than raising on any mismatch
+    or parse issue -- a failed validation is reported honestly in the response.
+    """
+    checksums_path = bundle_dir / _CHECKSUMS
+    try:
+        content = checksums_path.read_text(encoding="utf-8")
+    except (OSError, UnicodeDecodeError):
+        return False
+    for line in content.splitlines():
+        if not line.strip():
+            continue
+        # sha256sum format: "<hex>  <relpath>" (two-space separator).
+        expected, _, rel = line.partition("  ")
+        if not rel:
+            return False
+        target = bundle_dir / rel
+        try:
+            actual = _compute_sha256(target)
+        except OSError:
+            return False
+        if actual != expected:
+            return False
+    return True
+
+
+async def export_workspace(
+    db: AsyncSession,
+    app: FastAPI,
+    workspace_id: str,
+    *,
+    export_root: str | Path | None = None,
+) -> WorkspaceExportResult:
+    """Export a saved workspace to a checksum-validated bundle on disk.
+
+    Re-queries the row via :func:`workspace.get_workspace` so the function is
+    independently callable/testable; the route's 404/409 pre-guard fires before
+    any export work begins.
+
+    Args:
+        db: Caller-owned async session (used only to load the row).
+        app: The live FastAPI app for in-process soft-reference resolution.
+        workspace_id: External id of the workspace to export.
+        export_root: Override the configured ``showcase_export_root`` (tests).
+
+    Returns:
+        The export result (bundle path, file inventory, counts, unresolved
+        references, checksum-validation flag).
+
+    Raises:
+        NotFoundError: When no workspace matches ``workspace_id`` (404).
+        ConflictError: When the workspace run is still ``running`` (409).
+        ForecastLabError: When the bundle cannot be written to disk (500).
+    """
+    row = await workspace.get_workspace(db, workspace_id)
+    if row is None:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+    if row.status == WORKSPACE_STATUS_RUNNING:
+        raise ConflictError(
+            "Cannot export while the run is still in progress; retry after the run settles."
+        )
+
+    snapshot = WorkspaceDetailResponse.model_validate(row).model_dump(mode="json")
+    created = row.created_objects or {}
+
+    root = Path(export_root or get_settings().showcase_export_root).resolve()
+    root.mkdir(parents=True, exist_ok=True)
+    # GUARD before any bundle-level rmtree / mkdir / write -- the rmtree target
+    # is the guarded resolution only, never a raw request value.
+    bundle_dir = _resolve_bundle_dir(root, row.workspace_id)
+
+    exported_at = datetime.now(UTC)
+    try:
+        if bundle_dir.exists():
+            shutil.rmtree(bundle_dir)  # Decision 2 -- deterministic overwrite.
+        plans_dir = bundle_dir / _PLANS_DIR
+        plans_dir.mkdir(parents=True)
+
+        async with _open_client(app) as client:
+            model_runs, run_unresolved = await _resolve_model_runs(client, created)
+            plan_entries, plan_files, plan_unresolved = await _resolve_scenario_plans(
+                client, created, plans_dir
+            )
+        unresolved = [*run_unresolved, *plan_unresolved]
+
+        manifest = {
+            "bundle_format_version": BUNDLE_FORMAT_VERSION,
+            "exported_at": exported_at.isoformat(),
+            "workspace": snapshot,
+            "model_runs": model_runs,
+            "scenario_plans": plan_entries,
+            "unresolved_references": [ref.model_dump() for ref in unresolved],
+            # Paths + sizes so a consumer can sanity-check without parsing the
+            # hash file; hashes live ONLY in checksums.sha256 (Decision 7).
+            "files": [{"path": rel, "size_bytes": size} for rel, size in plan_files],
+        }
+        _write_json(bundle_dir / _MANIFEST, manifest)
+
+        # checksums.sha256 -- every bundle file except itself, sorted, two-space
+        # sha256sum format, bundle-relative POSIX paths.
+        checksum_lines = [
+            f"{_compute_sha256(path)}  {path.relative_to(bundle_dir).as_posix()}"
+            for path in sorted(bundle_dir.rglob("*"))
+            if path.is_file() and path.name != _CHECKSUMS
+        ]
+        (bundle_dir / _CHECKSUMS).write_text("\n".join(checksum_lines) + "\n", encoding="utf-8")
+    except OSError as exc:
+        logger.warning(
+            "demo.workspace_export_failed",
+            workspace_id=row.workspace_id,
+            error=str(exc),
+            error_type=type(exc).__name__,
+        )
+        raise ForecastLabError(
+            message=f"Export bundle write failed: {exc}", status_code=500
+        ) from exc
+
+    validated = _validate_checksums(bundle_dir)
+    files = [
+        ExportFileEntry(
+            path=path.relative_to(bundle_dir).as_posix(),
+            sha256=_compute_sha256(path),
+            size_bytes=path.stat().st_size,
+        )
+        for path in sorted(bundle_dir.rglob("*"))
+        if path.is_file()
+    ]
+
+    logger.info(
+        "demo.workspace_exported",
+        workspace_id=row.workspace_id,
+        files=len(files),
+        unresolved=len(unresolved),
+        validated=validated,
+    )
+    return WorkspaceExportResult(
+        workspace_id=row.workspace_id,
+        bundle_path=f"{_root_relative(root)}/{row.workspace_id}",
+        bundle_format_version=BUNDLE_FORMAT_VERSION,
+        exported_at=exported_at,
+        files=files,
+        scenario_plans_exported=len(plan_entries),
+        model_runs_referenced=len(model_runs),
+        unresolved_references=unresolved,
+        validated=validated,
+    )
diff --git a/app/features/demo/routes.py b/app/features/demo/routes.py
index dc9d6b89..0f27e458 100644
--- a/app/features/demo/routes.py
+++ b/app/features/demo/routes.py
@@ -14,6 +14,11 @@
   update (rename / notes / tags / archive / pin); ``status`` is not patchable.
 - ``DELETE /demo/workspaces/{workspace_id}``  -- delete the workspace METADATA
   row only; the run's created objects are soft references and stay untouched.
+- ``POST   /demo/workspaces/{workspace_id}/export`` -- E6 (#412): write a
+  checksum-validated bundle (manifest + scenario-plan snapshots + checksums)
+  under ``artifacts/showcase/<workspace_id>/``; soft references resolve
+  in-process, model artifacts are referenced (never copied), dangling refs are
+  reported, not fatal.
 
 The run/stream handlers obtain the live FastAPI app from ``request.app`` /
 ``websocket.app`` and pass it into the pipeline -- the slice never imports
@@ -40,7 +45,8 @@
 from app.core.database import get_db
 from app.core.exceptions import ConflictError, NotFoundError
 from app.core.logging import get_logger
-from app.features.demo import hitl, link_health, service, workspace
+from app.features.demo import export, hitl, link_health, service, workspace
+from app.features.demo.models import WORKSPACE_STATUS_RUNNING
 from app.features.demo.schemas import (
     ApprovalEventItem,
     ApprovalEventsResponse,
@@ -49,6 +55,7 @@
     HitlDecisionRequest,
     StepEvent,
     WorkspaceDetailResponse,
+    WorkspaceExportResult,
     WorkspaceHealthResponse,
     WorkspaceListItem,
     WorkspaceListResponse,
@@ -348,6 +355,52 @@ async def delete_showcase_workspace(
         raise NotFoundError(message=f"Workspace not found: {workspace_id}")
 
 
+@router.post(
+    "/workspaces/{workspace_id}/export",
+    response_model=WorkspaceExportResult,
+    summary="Export a saved showcase workspace as a checksum-validated bundle",
+    description=(
+        "Write artifacts/showcase/<workspace_id>/ -- a versioned manifest.json, "
+        "one JSON per resolvable scenario plan, and a sha256sum-compatible "
+        "checksums.sha256 -- then re-verify every checksum before returning. "
+        "Model artifacts are referenced (uri + registry hash + live verify), "
+        "never copied. Dangling soft references are reported in "
+        "`unresolved_references` (the export still returns 200). 404 when the "
+        "workspace is missing; 409 while its run is still in progress; "
+        "re-export overwrites the bundle."
+    ),
+)
+async def export_showcase_workspace(
+    workspace_id: str,
+    request: Request,
+    db: AsyncSession = Depends(get_db),
+) -> WorkspaceExportResult:
+    """Export a saved showcase workspace to a checksum-validated bundle (E6, #412).
+
+    Args:
+        workspace_id: External identifier of the workspace.
+        request: The incoming request (used to obtain the live FastAPI app for
+            the in-process soft-reference resolution GETs).
+        db: Async database session from dependency.
+
+    Returns:
+        The export result -- bundle path, file inventory with hashes, counts,
+        unresolved references, and the checksum-validation flag.
+
+    Raises:
+        NotFoundError: When no workspace matches ``workspace_id`` (404).
+        ConflictError: When the workspace run is still in progress (409).
+    """
+    row = await workspace.get_workspace(db, workspace_id)
+    if row is None:
+        raise NotFoundError(message=f"Workspace not found: {workspace_id}")
+    if row.status == WORKSPACE_STATUS_RUNNING:
+        raise ConflictError(
+            "Cannot export while the run is still in progress; retry after the run settles."
+        )
+    return await export.export_workspace(db, request.app, workspace_id)
+
+
 @router.websocket("/stream")
 async def stream_demo_pipeline(websocket: WebSocket) -> None:
     """Stream one StepEvent per pipeline step over a WebSocket.
diff --git a/app/features/demo/schemas.py b/app/features/demo/schemas.py
index 70b1e8c3..353a7eac 100644
--- a/app/features/demo/schemas.py
+++ b/app/features/demo/schemas.py
@@ -567,3 +567,60 @@ class ApprovalEventsResponse(BaseModel):
         ..., description="Flattened approval events, newest workspace first; empty when none."
     )
     total: int = Field(..., ge=0, description="Number of flattened entries returned (capped).")
+
+
+# =============================================================================
+# E6 (#412) -- workspace export bundle (POST /demo/workspaces/{id}/export)
+# =============================================================================
+
+# Bumped on any manifest-shape change so bundle consumers can branch on it.
+BUNDLE_FORMAT_VERSION = 1
+
+
+class ExportFileEntry(BaseModel):
+    """One file inside an exported workspace bundle (E6, issue #412).
+
+    Response model -- plain ``BaseModel``, NOT ``ConfigDict(strict=True)``:
+    strict mode is a request-body policy and this endpoint has no body.
+    """
+
+    path: str = Field(..., description="Bundle-relative POSIX path.")
+    sha256: str = Field(..., description="Hex SHA-256 of the file contents.")
+    size_bytes: int = Field(..., ge=0, description="File size in bytes.")
+
+
+class UnresolvedReference(BaseModel):
+    """A soft reference that could not be resolved during export (E6, #412)."""
+
+    key: str = Field(..., description="created_objects key (e.g. 'scenario_plan_ids').")
+    ref_id: str = Field(..., description="The id that failed to resolve.")
+    reason: str = Field(..., description="Short cause, e.g. 'HTTP 404'.")
+
+
+class WorkspaceExportResult(BaseModel):
+    """Result of ``POST /demo/workspaces/{workspace_id}/export`` (E6, #412)."""
+
+    workspace_id: str = Field(..., description="The exported workspace's id.")
+    bundle_path: str = Field(
+        ..., description="Repo-root-relative bundle dir, e.g. 'artifacts/showcase/<id>'."
+    )
+    bundle_format_version: int = Field(..., description="Manifest schema version.")
+    exported_at: datetime = Field(..., description="When the export ran (UTC).")
+    # The COMPLETE on-disk inventory, INCLUDING checksums.sha256 itself (with
+    # its own computed hash) -- it just never lists itself inside the checksum
+    # file; the response is where that hash lives.
+    files: list[ExportFileEntry] = Field(
+        ..., description="Every file in the bundle with its hash and size."
+    )
+    scenario_plans_exported: int = Field(
+        ..., ge=0, description="Scenario plans written to scenario_plans/."
+    )
+    model_runs_referenced: int = Field(
+        ..., ge=0, description="Model runs referenced in the manifest (not copied)."
+    )
+    unresolved_references: list[UnresolvedReference] = Field(
+        ..., description="Soft references that could not be resolved (export still succeeded)."
+    )
+    validated: bool = Field(
+        ..., description="True when every hash in checksums.sha256 was recomputed and matched."
+    )
diff --git a/app/features/demo/tests/test_export.py b/app/features/demo/tests/test_export.py
new file mode 100644
index 00000000..2b1c950f
--- /dev/null
+++ b/app/features/demo/tests/test_export.py
@@ -0,0 +1,362 @@
+"""Tests for the workspace export-bundle writer (E6, issue #412).
+
+Unit tests (no DB, no app) cover the disk primitives -- chunked sha256, the
+traversal guard (must raise BEFORE any I/O), deterministic JSON -- and the
+manifest assembly via a mocked in-process client. Integration tests run the
+real endpoint against docker-compose Postgres with a ``tmp_path`` export root.
+"""
+
+from __future__ import annotations
+
+import datetime as _dt
+import hashlib
+import json
+from pathlib import Path
+from types import SimpleNamespace
+from typing import TYPE_CHECKING, Any, cast
+
+import httpx
+import pytest
+
+from app.features.demo import export, workspace
+from app.features.demo.models import ShowcaseWorkspace
+from app.features.demo.schemas import WorkspaceExportResult
+
+if TYPE_CHECKING:
+    from fastapi import FastAPI
+    from sqlalchemy.ext.asyncio import AsyncSession
+
+# The direct-call unit tests monkeypatch get_workspace + _open_client, so the
+# real session / app are never touched -- typed None sentinels keep the strict
+# signature satisfied without a DB or app instance.
+_NO_DB = cast("AsyncSession", None)
+_NO_APP = cast("FastAPI", None)
+
+# =============================================================================
+# Unit -- disk primitives (no DB, no app)
+# =============================================================================
+
+
+def test_compute_sha256_matches_whole_file(tmp_path: Path) -> None:
+    """The chunked digest equals a whole-file hashlib hash."""
+    target = tmp_path / "blob"
+    target.write_bytes(b"y" * 25_000)  # > one 8192-byte chunk
+    assert export._compute_sha256(target) == hashlib.sha256(target.read_bytes()).hexdigest()
+
+
+@pytest.mark.parametrize("evil", ["../escape", "../../etc/passwd", "/etc/passwd"])
+def test_resolve_bundle_dir_rejects_traversal_before_io(tmp_path: Path, evil: str) -> None:
+    """A traversal-shaped id raises ValueError and writes nothing."""
+    root = tmp_path.resolve()
+    with pytest.raises(ValueError):
+        export._resolve_bundle_dir(root, evil)
+    # The guard does pure path math -- no directory is created.
+    assert list(root.iterdir()) == []
+
+
+def test_resolve_bundle_dir_accepts_uuid_hex(tmp_path: Path) -> None:
+    """A normal uuid-hex id resolves directly under the root."""
+    root = tmp_path.resolve()
+    workspace_id = "a" * 32
+    resolved = export._resolve_bundle_dir(root, workspace_id)
+    assert resolved == root / workspace_id
+
+
+def test_write_json_is_deterministic(tmp_path: Path) -> None:
+    """Two dumps of key-shuffled payloads produce identical bytes."""
+    a = tmp_path / "a.json"
+    b = tmp_path / "b.json"
+    size_a = export._write_json(a, {"z": 1, "a": 2, "m": {"y": 1, "x": 2}})
+    size_b = export._write_json(b, {"a": 2, "m": {"x": 2, "y": 1}, "z": 1})
+    assert a.read_bytes() == b.read_bytes()
+    assert size_a == size_b == len(a.read_bytes())
+    assert a.read_text().endswith("\n")
+
+
+def test_validate_checksums_round_trip(tmp_path: Path) -> None:
+    """A hand-built bundle validates; a tampered file flips validated False."""
+    bundle = tmp_path / "wsid"
+    bundle.mkdir()
+    payload = bundle / "manifest.json"
+    payload.write_text("hello\n", encoding="utf-8")
+    digest = export._compute_sha256(payload)
+    (bundle / "checksums.sha256").write_text(f"{digest}  manifest.json\n", encoding="utf-8")
+    assert export._validate_checksums(bundle) is True
+
+    payload.write_text("tampered\n", encoding="utf-8")
+    assert export._validate_checksums(bundle) is False
+
+
+# =============================================================================
+# Unit -- manifest assembly via a mocked in-process client
+# =============================================================================
+
+
+def _row(**overrides: object) -> SimpleNamespace:
+    """An ORM-shaped ShowcaseWorkspace stand-in (mirrors test_routes._orm_like_row)."""
+    base: dict[str, object] = {
+        "workspace_id": "a" * 32,
+        "name": "e6-export",
+        "status": "completed",
+        "seed": 42,
+        "scenario": "showcase_rich",
+        "reset": False,
+        "skip_seed": True,
+        "store_id": 3,
+        "product_id": 7,
+        "date_start": _dt.date(2026, 1, 1),
+        "date_end": _dt.date(2026, 3, 31),
+        "created_objects": {},
+        "result_summary": {"winner_model_type": "naive"},
+        "created_at": _dt.datetime(2026, 6, 1, 12, 0, tzinfo=_dt.UTC),
+    }
+    base.update(overrides)
+    return SimpleNamespace(**base)
+
+
+def _mock_client() -> httpx.AsyncClient:
+    """In-process client returning canned registry / scenario bodies + one 404."""
+
+    def handler(request: httpx.Request) -> httpx.Response:
+        path = request.url.path
+        if path == "/registry/runs/run-win":
+            return httpx.Response(
+                200,
+                json={
+                    "run_id": "run-win",
+                    "model_type": "naive",
+                    "status": "success",
+                    "artifact_uri": "demo/naive-model_abc.joblib",
+                    "artifact_hash": "deadbeef",
+                    "metrics": {"wape": 0.12},
+                },
+            )
+        if path == "/registry/runs/run-win/verify":
+            return httpx.Response(200, json={"verified": True})
+        if path == "/registry/runs/run-gone":
+            return httpx.Response(404, json={"detail": "run not found"})
+        if path == "/scenarios/plan-1":
+            return httpx.Response(
+                200,
+                json={
+                    "scenario_id": "plan-1",
+                    "name": "Price cut 15%",
+                    "run_id": "model_xyz",
+                    "assumptions": {"price_change_pct": -0.15},
+                    "comparison": {},
+                    "tags": ["showcase"],
+                },
+            )
+        if path == "/scenarios/dangling":
+            return httpx.Response(404, json={"detail": "scenario not found"})
+        return httpx.Response(404, json={"detail": f"unmatched {path}"})
+
+    return httpx.AsyncClient(
+        transport=httpx.MockTransport(handler), base_url="http://demo.internal"
+    )
+
+
+async def test_export_assembles_manifest_and_reports_dangles(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """A mixed run resolves one run + one plan and reports two dangles."""
+    row = _row(
+        created_objects={
+            "winning_run_id": "run-win",
+            "v2_run_id": "run-gone",  # 404 -> unresolved
+            "scenario_plan_ids": ["plan-1", "dangling"],
+        }
+    )
+
+    async def fake_get(_db: object, _workspace_id: str) -> SimpleNamespace:
+        return row
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    monkeypatch.setattr(export, "_open_client", lambda _app: _mock_client())
+
+    result: WorkspaceExportResult = await export.export_workspace(
+        db=_NO_DB, app=_NO_APP, workspace_id="a" * 32, export_root=tmp_path
+    )
+
+    assert result.validated is True
+    assert result.model_runs_referenced == 1
+    assert result.scenario_plans_exported == 1
+    # Two dangles: the v2 run (404) and the dangling scenario plan (404).
+    keys = sorted((ref.key, ref.ref_id) for ref in result.unresolved_references)
+    assert keys == [("scenario_plan_ids", "dangling"), ("v2_run_id", "run-gone")]
+
+    bundle = tmp_path / ("a" * 32)
+    manifest = json.loads((bundle / "manifest.json").read_text())
+    assert manifest["bundle_format_version"] == 1
+    assert manifest["workspace"]["workspace_id"] == "a" * 32
+    assert manifest["model_runs"][0]["run_id"] == "run-win"
+    assert manifest["model_runs"][0]["artifact_verified"] is True
+    assert manifest["scenario_plans"][0]["scenario_id"] == "plan-1"
+    # The plan body is stored verbatim under scenario_plans/.
+    plan = json.loads((bundle / "scenario_plans" / "plan-1.json").read_text())
+    assert plan["name"] == "Price cut 15%"
+    # checksums.sha256 covers every file except itself, two-space separator.
+    lines = (bundle / "checksums.sha256").read_text().splitlines()
+    covered = {line.split("  ", 1)[1] for line in lines}
+    assert "manifest.json" in covered
+    assert "scenario_plans/plan-1.json" in covered
+    assert "checksums.sha256" not in covered
+    # The response inventory DOES include the checksum file itself.
+    assert any(entry.path == "checksums.sha256" for entry in result.files)
+
+
+async def test_export_overwrites_stale_bundle(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """A pre-existing stale file in the bundle dir is gone after re-export."""
+    row = _row(created_objects={"winning_run_id": "run-win"})
+
+    async def fake_get(_db: object, _workspace_id: str) -> SimpleNamespace:
+        return row
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    monkeypatch.setattr(export, "_open_client", lambda _app: _mock_client())
+
+    bundle = tmp_path / ("a" * 32)
+    (bundle / "scenario_plans").mkdir(parents=True)
+    stale = bundle / "scenario_plans" / "stale.json"
+    stale.write_text("{}", encoding="utf-8")
+
+    await export.export_workspace(
+        db=_NO_DB, app=_NO_APP, workspace_id="a" * 32, export_root=tmp_path
+    )
+
+    assert not stale.exists()
+    assert (bundle / "manifest.json").exists()
+
+
+async def test_export_empty_created_objects_minimal_bundle(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """An empty-references run still exports a valid manifest + checksums."""
+    row = _row(created_objects={})
+
+    async def fake_get(_db: object, _workspace_id: str) -> SimpleNamespace:
+        return row
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    monkeypatch.setattr(export, "_open_client", lambda _app: _mock_client())
+
+    result = await export.export_workspace(
+        db=_NO_DB, app=_NO_APP, workspace_id="a" * 32, export_root=tmp_path
+    )
+    assert result.validated is True
+    assert result.model_runs_referenced == 0
+    assert result.scenario_plans_exported == 0
+    assert result.unresolved_references == []
+    paths = {entry.path for entry in result.files}
+    assert paths == {"manifest.json", "checksums.sha256"}
+
+
+async def test_export_404_on_missing_workspace(monkeypatch: pytest.MonkeyPatch) -> None:
+    """A missing row raises NotFoundError before any disk work."""
+    from app.core.exceptions import NotFoundError
+
+    async def fake_get(_db: object, _workspace_id: str) -> None:
+        return None
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    with pytest.raises(NotFoundError):
+        await export.export_workspace(db=_NO_DB, app=_NO_APP, workspace_id="z" * 32)
+
+
+async def test_export_409_on_running_workspace(monkeypatch: pytest.MonkeyPatch) -> None:
+    """A running row raises ConflictError (references not yet settled)."""
+    from app.core.exceptions import ConflictError
+
+    async def fake_get(_db: object, _workspace_id: str) -> SimpleNamespace:
+        return _row(status="running")
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    with pytest.raises(ConflictError):
+        await export.export_workspace(db=_NO_DB, app=_NO_APP, workspace_id="a" * 32)
+
+
+# =============================================================================
+# Integration -- real endpoint, real Postgres, tmp_path export root
+# =============================================================================
+
+
+@pytest.mark.integration
+async def test_export_endpoint_round_trip(
+    client: httpx.AsyncClient,
+    db_session: Any,
+    tmp_path: Path,
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    """A completed row exports; checksums verify; a dangling plan is reported."""
+    from app.core.config import get_settings
+
+    workspace_id = "e6" + "0" * 30
+    db_session.add(
+        ShowcaseWorkspace(
+            workspace_id=workspace_id,
+            name="e6-integration",
+            seed=42,
+            scenario="showcase_rich",
+            reset=False,
+            skip_seed=True,
+            status="completed",
+            created_objects={"scenario_plan_ids": ["dangling-plan-1"]},
+        )
+    )
+    await db_session.commit()
+
+    # Point the export root at tmp_path without disturbing the cached settings.
+    patched = get_settings().model_copy(update={"showcase_export_root": str(tmp_path)})
+    monkeypatch.setattr(export, "get_settings", lambda: patched)
+
+    resp = await client.post(f"/demo/workspaces/{workspace_id}/export")
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["validated"] is True
+    assert body["bundle_path"].endswith(workspace_id)
+    # The dangling scenario plan is reported, not fatal.
+    assert any(ref["ref_id"] == "dangling-plan-1" for ref in body["unresolved_references"])
+
+    bundle = tmp_path / workspace_id
+    assert (bundle / "manifest.json").exists()
+    # Independently re-verify every checksum line (don't trust validated alone).
+    for line in (bundle / "checksums.sha256").read_text().splitlines():
+        if not line.strip():
+            continue
+        expected, _, rel = line.partition("  ")
+        actual = hashlib.sha256((bundle / rel).read_bytes()).hexdigest()
+        assert actual == expected, rel
+
+    # Re-export overwrites: plant a stale file, re-export, assert it's gone.
+    stale = bundle / "scenario_plans" / "stale.json"
+    stale.write_text("{}", encoding="utf-8")
+    resp2 = await client.post(f"/demo/workspaces/{workspace_id}/export")
+    assert resp2.status_code == 200
+    assert not stale.exists()
+
+
+@pytest.mark.integration
+async def test_export_endpoint_409_on_running(
+    client: httpx.AsyncClient,
+    db_session: Any,
+) -> None:
+    """The endpoint rejects a still-running workspace with 409 problem+json."""
+    workspace_id = "e6run" + "0" * 27
+    db_session.add(
+        ShowcaseWorkspace(
+            workspace_id=workspace_id,
+            name="e6-running",
+            seed=1,
+            scenario="demo_minimal",
+            reset=False,
+            skip_seed=True,
+            status="running",
+        )
+    )
+    await db_session.commit()
+
+    resp = await client.post(f"/demo/workspaces/{workspace_id}/export")
+    assert resp.status_code == 409
+    assert resp.headers["content-type"].startswith("application/problem+json")
diff --git a/app/features/demo/tests/test_routes.py b/app/features/demo/tests/test_routes.py
index 7b8858ba..1b00e6a9 100644
--- a/app/features/demo/tests/test_routes.py
+++ b/app/features/demo/tests/test_routes.py
@@ -14,8 +14,14 @@
 from fastapi.testclient import TestClient
 from sqlalchemy.ext.asyncio import AsyncSession
 
-from app.features.demo import service, workspace
-from app.features.demo.schemas import DemoRunRequest, DemoRunResult, StepEvent
+from app.features.demo import export, service, workspace
+from app.features.demo.schemas import (
+    DemoRunRequest,
+    DemoRunResult,
+    ExportFileEntry,
+    StepEvent,
+    WorkspaceExportResult,
+)
 from app.main import app
 
 
@@ -481,6 +487,74 @@ async def fake_delete(_db, _workspace_id: str) -> bool:
     assert "Workspace not found" in resp.json()["detail"]
 
 
+# =============================================================================
+# E6 (#412) -- POST /demo/workspaces/{workspace_id}/export (unit)
+# =============================================================================
+
+
+async def test_export_workspace_404(client, monkeypatch):
+    """An unknown workspace_id is a 404 problem+json (export never runs)."""
+
+    async def fake_get(_db, _workspace_id: str) -> None:
+        return None
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+
+    resp = await client.post("/demo/workspaces/" + "0" * 32 + "/export")
+    assert resp.status_code == 404
+    assert resp.headers["content-type"].startswith("application/problem+json")
+    assert "Workspace not found" in resp.json()["detail"]
+
+
+async def test_export_workspace_409_when_running(client, monkeypatch):
+    """A still-running workspace is a 409 problem+json (refs not settled)."""
+
+    async def fake_get(_db, workspace_id: str) -> SimpleNamespace:
+        return _orm_like_row(workspace_id=workspace_id, status="running")
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+
+    resp = await client.post("/demo/workspaces/" + "a" * 32 + "/export")
+    assert resp.status_code == 409
+    assert resp.headers["content-type"].startswith("application/problem+json")
+
+
+async def test_export_workspace_200_happy_path(client, monkeypatch):
+    """A completed workspace returns the export result the writer produced."""
+
+    async def fake_get(_db, workspace_id: str) -> SimpleNamespace:
+        return _orm_like_row(workspace_id=workspace_id, status="completed")
+
+    canned = WorkspaceExportResult(
+        workspace_id="a" * 32,
+        bundle_path="artifacts/showcase/" + "a" * 32,
+        bundle_format_version=1,
+        exported_at=_dt.datetime(2026, 6, 12, 14, 0, tzinfo=_dt.UTC),
+        files=[
+            ExportFileEntry(path="manifest.json", sha256="0" * 64, size_bytes=128),
+            ExportFileEntry(path="checksums.sha256", sha256="1" * 64, size_bytes=80),
+        ],
+        scenario_plans_exported=0,
+        model_runs_referenced=1,
+        unresolved_references=[],
+        validated=True,
+    )
+
+    async def fake_export(_db, _app, workspace_id: str) -> WorkspaceExportResult:
+        return canned
+
+    monkeypatch.setattr(workspace, "get_workspace", fake_get)
+    monkeypatch.setattr(export, "export_workspace", fake_export)
+
+    resp = await client.post("/demo/workspaces/" + "a" * 32 + "/export")
+    assert resp.status_code == 200
+    body = resp.json()
+    assert body["validated"] is True
+    assert body["bundle_format_version"] == 1
+    assert body["model_runs_referenced"] == 1
+    assert len(body["files"]) == 2
+
+
 # =============================================================================
 # E1 (#407) -- PATCH /demo/workspaces/{workspace_id} (unit)
 # =============================================================================

From 93282edcc59d9e78ccad248d12f19740ac19fd26 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 09:44:37 +0200
Subject: [PATCH 29/32] feat(ui): add export button to saved-workspaces panel
 (#412)

Per-row Export button (between Replay and the actions menu) calls
POST /demo/workspaces/{id}/export via a new useExportWorkspace mutation.
Non-destructive, so no confirmation dialog; success toast shows the
bundle path, file count, checksum state, and any unresolved-reference
count; failure surfaces the problem-details message. Self-contained
block to survive an E2 row restyle. Adds WorkspaceExportResult /
ExportFileEntry / UnresolvedReference types.
---
 .../components/demo/WorkspacePanel.test.tsx   | 87 ++++++++++++++++++-
 .../src/components/demo/WorkspacePanel.tsx    | 38 +++++++-
 frontend/src/hooks/use-workspaces.ts          | 13 +++
 frontend/src/types/api.ts                     | 29 +++++++
 4 files changed, 165 insertions(+), 2 deletions(-)

diff --git a/frontend/src/components/demo/WorkspacePanel.test.tsx b/frontend/src/components/demo/WorkspacePanel.test.tsx
index f1fa2d25..dffb3ce0 100644
--- a/frontend/src/components/demo/WorkspacePanel.test.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.test.tsx
@@ -71,6 +71,11 @@ let mockPatchResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } =
   isPending: false,
 }
 
+let mockExportResult: { mutate: ReturnType<typeof vi.fn>; isPending: boolean } = {
+  mutate: vi.fn(),
+  isPending: false,
+}
+
 const mockNavigate = vi.fn()
 
 vi.mock('@/hooks/use-workspaces', () => ({
@@ -82,6 +87,7 @@ vi.mock('@/hooks/use-workspaces', () => ({
   useWorkspace: () => ({ data: undefined, isSuccess: false, isError: false }),
   useDeleteWorkspace: () => mockDeleteResult,
   usePatchWorkspace: () => mockPatchResult,
+  useExportWorkspace: () => mockExportResult,
 }))
 
 vi.mock('react-router-dom', async (importOriginal) => {
@@ -97,6 +103,7 @@ beforeEach(() => {
   lastListParams = undefined
   mockDeleteResult = { mutate: vi.fn(), mutateAsync: vi.fn(), isPending: false }
   mockPatchResult = { mutate: vi.fn(), isPending: false }
+  mockExportResult = { mutate: vi.fn(), isPending: false }
 })
 
 function renderPanel(props: Partial<Parameters<typeof WorkspacePanel>[0]> = {}) {
@@ -197,7 +204,7 @@ describe('WorkspacePanel', () => {
   it('disables row actions while a run is in flight', () => {
     mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
     renderPanel({ isRunning: true })
-    const labels = ['Load', 'Replay']
+    const labels = ['Load', 'Replay', 'Export']
     for (const label of labels) {
       const button = screen
         .getAllByRole('button')
@@ -207,6 +214,84 @@ describe('WorkspacePanel', () => {
   })
 })
 
+describe('WorkspacePanel — E6 export', () => {
+  function findExportButton(container: HTMLElement) {
+    return Array.from(container.querySelectorAll('button')).find((b) =>
+      (b.textContent ?? '').includes('Export')
+    )!
+  }
+
+  it('renders an Export button per row', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const { container } = renderPanel()
+    expect(findExportButton(container)).toBeTruthy()
+  })
+
+  it('fires the export mutation with the row id and toasts the bundle path', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const { container } = renderPanel()
+    fireEvent.click(findExportButton(container))
+
+    expect(mockExportResult.mutate).toHaveBeenCalledTimes(1)
+    const [workspaceId, options] = mockExportResult.mutate.mock.calls[0] as [
+      string,
+      { onSuccess: (r: unknown) => void; onError: (error: unknown) => void },
+    ]
+    expect(workspaceId).toBe(baseItem.workspace_id)
+
+    options.onSuccess({
+      workspace_id: baseItem.workspace_id,
+      bundle_path: `artifacts/showcase/${baseItem.workspace_id}`,
+      bundle_format_version: 1,
+      exported_at: '2026-06-12T14:00:00Z',
+      files: [
+        { path: 'manifest.json', sha256: 'a', size_bytes: 1 },
+        { path: 'checksums.sha256', sha256: 'b', size_bytes: 1 },
+      ],
+      scenario_plans_exported: 0,
+      model_runs_referenced: 0,
+      unresolved_references: [],
+      validated: true,
+    })
+    expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('Bundle written to'))
+    expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('checksums verified'))
+  })
+
+  it('notes dangling references in the success toast', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const { container } = renderPanel()
+    fireEvent.click(findExportButton(container))
+    const [, options] = mockExportResult.mutate.mock.calls[0] as [
+      string,
+      { onSuccess: (r: unknown) => void; onError: (error: unknown) => void },
+    ]
+    options.onSuccess({
+      workspace_id: baseItem.workspace_id,
+      bundle_path: `artifacts/showcase/${baseItem.workspace_id}`,
+      bundle_format_version: 1,
+      exported_at: '2026-06-12T14:00:00Z',
+      files: [{ path: 'manifest.json', sha256: 'a', size_bytes: 1 }],
+      scenario_plans_exported: 0,
+      model_runs_referenced: 0,
+      unresolved_references: [{ key: 'scenario_plan_ids', ref_id: 'gone', reason: 'HTTP 404' }],
+      validated: true,
+    })
+    expect(toast.success).toHaveBeenCalledWith(expect.stringContaining('1 unresolved reference'))
+  })
+
+  it('surfaces an export failure via the error toast', () => {
+    mockResponse = { data: { workspaces: [baseItem], total: 1 }, isLoading: false }
+    const { container } = renderPanel()
+    fireEvent.click(findExportButton(container))
+    const [, options] = mockExportResult.mutate.mock.calls[0] as [
+      string,
+      { onSuccess: (r: unknown) => void; onError: (error: unknown) => void },
+    ]
+    options.onError(new ApiError('Export bundle write failed: disk full', 500))
+    expect(toast.error).toHaveBeenCalledWith(expect.stringContaining('Export failed'))
+  })
+})
+
 describe('WorkspacePanel — E2 lifecycle badges + toolbar params', () => {
   it('renders pinned / archived / replay badges', () => {
     mockResponse = {
diff --git a/frontend/src/components/demo/WorkspacePanel.tsx b/frontend/src/components/demo/WorkspacePanel.tsx
index fe931421..b3cdcd89 100644
--- a/frontend/src/components/demo/WorkspacePanel.tsx
+++ b/frontend/src/components/demo/WorkspacePanel.tsx
@@ -24,6 +24,7 @@ import { useQueryClient } from '@tanstack/react-query'
 import {
   Archive,
   ArchiveRestore,
+  FileDown,
   FolderOpen,
   MoreHorizontal,
   Pencil,
@@ -63,7 +64,12 @@ import {
   SelectTrigger,
   SelectValue,
 } from '@/components/ui/select'
-import { useDeleteWorkspace, usePatchWorkspace, useWorkspaces } from '@/hooks/use-workspaces'
+import {
+  useDeleteWorkspace,
+  useExportWorkspace,
+  usePatchWorkspace,
+  useWorkspaces,
+} from '@/hooks/use-workspaces'
 import { ApiError, getErrorMessage } from '@/lib/api'
 import { ROUTES } from '@/lib/constants'
 import { cn } from '@/lib/utils'
@@ -159,6 +165,7 @@ export function WorkspacePanel({
   const queryClient = useQueryClient()
   const deleteWorkspace = useDeleteWorkspace()
   const patchWorkspace = usePatchWorkspace()
+  const exportWorkspace = useExportWorkspace()
 
   // ── dialogs + selection state ────────────────────────────────────────────
   const [pendingDelete, setPendingDelete] = useState<WorkspaceListItem | null>(null)
@@ -231,6 +238,24 @@ export function WorkspacePanel({
     )
   }
 
+  // E6 (#412) — non-destructive export; no confirmation dialog. Success toast
+  // surfaces the bundle path + file count + checksum state + any dangling refs.
+  const handleExport = (ws: WorkspaceListItem) => {
+    exportWorkspace.mutate(ws.workspace_id, {
+      onSuccess: (result) => {
+        const fileCount = `${result.files.length} file${result.files.length === 1 ? '' : 's'}`
+        const checksums = result.validated ? 'verified' : 'FAILED'
+        const unresolved = result.unresolved_references.length
+          ? ` ${result.unresolved_references.length} unresolved reference(s).`
+          : ''
+        toast.success(
+          `Bundle written to ${result.bundle_path} — ${fileCount}, checksums ${checksums}.${unresolved}`
+        )
+      },
+      onError: (error) => toast.error(`Export failed: ${getErrorMessage(error)}`),
+    })
+  }
+
   const toggleSelected = (workspaceId: string) => {
     setSelected((prev) => {
       const next = new Set(prev)
@@ -410,6 +435,17 @@ export function WorkspacePanel({
                       <Play className="mr-1 h-3 w-3" />
                       Replay
                     </Button>
+                    {/* E6 (#412) — export a checksum-validated bundle. Self-
+                        contained block (survives an E2 row restyle / rebase). */}
+                    <Button
+                      size="sm"
+                      variant="outline"
+                      disabled={isRunning || exportWorkspace.isPending}
+                      onClick={() => handleExport(ws)}
+                    >
+                      <FileDown className="mr-1 h-3 w-3" />
+                      Export
+                    </Button>
                     <DropdownMenu>
                       <DropdownMenuTrigger asChild>
                         <Button
diff --git a/frontend/src/hooks/use-workspaces.ts b/frontend/src/hooks/use-workspaces.ts
index 610cefb8..7e7c6a7d 100644
--- a/frontend/src/hooks/use-workspaces.ts
+++ b/frontend/src/hooks/use-workspaces.ts
@@ -2,6 +2,7 @@ import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
 import { api, ApiError } from '@/lib/api'
 import type {
   WorkspaceDetail,
+  WorkspaceExportResult,
   WorkspaceHealth,
   WorkspaceListParams,
   WorkspaceListResponse,
@@ -94,6 +95,18 @@ export function useWorkspaceHealth(workspaceId: string, enabled = true) {
   })
 }
 
+/**
+ * E6 (#412) — export a saved workspace to a checksum-validated bundle on disk
+ * (artifacts/showcase/<id>/). Export is stateless and re-runnable: it writes no
+ * server-side row, so it does NOT invalidate the workspaces list.
+ */
+export function useExportWorkspace() {
+  return useMutation({
+    mutationFn: (workspaceId: string) =>
+      api<WorkspaceExportResult>(`/demo/workspaces/${workspaceId}/export`, { method: 'POST' }),
+  })
+}
+
 /** One ancestor entry in a workspace's replay lineage chain (newest first). */
 export interface LineageEntry {
   workspace_id: string
diff --git a/frontend/src/types/api.ts b/frontend/src/types/api.ts
index 74a691cb..a817bcc4 100644
--- a/frontend/src/types/api.ts
+++ b/frontend/src/types/api.ts
@@ -990,6 +990,35 @@ export interface WorkspaceHealth {
   checked_at: string
 }
 
+// === Showcase Workspace Export (E6, #412) ===
+
+// One file inside an exported workspace bundle.
+export interface ExportFileEntry {
+  path: string
+  sha256: string
+  size_bytes: number
+}
+
+// A soft reference that could not be resolved during export.
+export interface UnresolvedReference {
+  key: string
+  ref_id: string
+  reason: string
+}
+
+// Result of POST /demo/workspaces/{workspace_id}/export.
+export interface WorkspaceExportResult {
+  workspace_id: string
+  bundle_path: string
+  bundle_format_version: number
+  exported_at: string
+  files: ExportFileEntry[]
+  scenario_plans_exported: number
+  model_runs_referenced: number
+  unresolved_references: UnresolvedReference[]
+  validated: boolean
+}
+
 // === AI Model Configuration (/config) ===
 
 // Presence + masked preview of one provider API key (never the raw value).

From 9a4f12dd55eb9806d312556be02dfa9250c9a6c2 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 09:44:37 +0200
Subject: [PATCH 30/32] docs(docs): document workspace export bundle contract
 (#412)

API_CONTRACTS: add the POST /demo/workspaces/{id}/export row. RUNBOOKS:
add the export-semantics bullet to the showcase-workspace section
(overwrite-on-re-export, dangling-ref warnings, gitignored artifacts/,
sha256sum -c verification) and move export bundles off the out-of-scope
list (import/restore remains out).
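
For readers without `sha256sum`, the by-hand verification can be
approximated in Python. This is a minimal sketch, assuming each line of
checksums.sha256 has the form `<hex digest>  <relative path>` (two spaces,
as the integration test parses it); `verify_bundle` is a hypothetical
helper name, not part of the codebase.

```python
import hashlib
from pathlib import Path


def verify_bundle(bundle: Path) -> list[str]:
    """Return relative paths whose SHA-256 differs from checksums.sha256."""
    mismatched: list[str] = []
    listing = bundle / "checksums.sha256"
    for line in listing.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        # Assumed line layout: "<hex digest>  <relative path>"
        expected, _, rel = line.partition("  ")
        actual = hashlib.sha256((bundle / rel).read_bytes()).hexdigest()
        if actual != expected:
            mismatched.append(rel)
    return mismatched
```

An empty return value means every listed file matched its recorded digest.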
---
 docs/_base/API_CONTRACTS.md | 1 +
 docs/_base/RUNBOOKS.md      | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/_base/API_CONTRACTS.md b/docs/_base/API_CONTRACTS.md
index b2fb5c64..8ca1c2df 100644
--- a/docs/_base/API_CONTRACTS.md
+++ b/docs/_base/API_CONTRACTS.md
@@ -65,6 +65,7 @@ All endpoints serve JSON; error responses use `application/problem+json` (RFC 78
 | demo | GET | `/demo/workspaces/{workspace_id}/health` | **E2 (#408)** — probe the workspace's soft references in-process (model runs, scenario plans, alias, batch, agent session, `job_ids` slot) via `httpx.ASGITransport`; per-reference `status` ∈ `alive` (2xx) / `dead` (404 — deleted after the run) / `unknown` (anything else — never a 500), plus `alive`/`dead`/`unknown` counts and `partial_run` (true when the row's status ≠ `completed`); non-probeable keys (`v2_model_path`, `scenario_artifact_key`, `train_model_types`) are skipped; `404 application/problem+json` when the workspace is missing |
 | demo | PATCH | `/demo/workspaces/{workspace_id}` | **E1 (#407)** — partial lifecycle update (`name` / `notes` / `tags` / `archived` / `pinned`; `exclude_unset` semantics — only provided fields change; explicit `null` clears `name`/`notes`; explicit `null` on `archived`/`pinned`/`tags` → `422` (send `[]` to clear tags); `status` NOT patchable — the pipeline owns it); returns the updated `WorkspaceDetailResponse`; empty body = `200` no-op; `404 application/problem+json` when missing; `422` on unknown keys / bad name pattern / >20 tags |
 | demo | DELETE | `/demo/workspaces/{workspace_id}` | Delete one saved workspace METADATA row; `204` on success, `404 application/problem+json` when missing. The run's created objects (model runs, scenario plans, aliases, jobs, artifacts) are soft references and are NOT deleted |
+| demo | POST | `/demo/workspaces/{workspace_id}/export` | **E6 (#412)** — write a checksum-validated bundle under `artifacts/showcase/<workspace_id>/`: a versioned `manifest.json` (full `WorkspaceDetailResponse` snapshot + `bundle_format_version: 1` + `exported_at` + model-run references), one `scenario_plans/<scenario_id>.json` per resolvable plan, and a `sha256sum`-compatible `checksums.sha256` covering every other file. Re-reads + recomputes every checksum before returning (`validated: bool`). Soft references resolve over the in-process HTTP surface (`GET /registry/runs/{id}` + `/verify`, `GET /scenarios/{id}`); **model artifacts are REFERENCED (uri + registry hash + live `artifact_verified`), never copied**. Dangling soft references (deleted run / plan) become `unresolved_references` entries and the export still returns `200`. Returns `WorkspaceExportResult` (`bundle_path`, full `files` inventory with hashes/sizes, counts, `unresolved_references`, `validated`). `404 application/problem+json` when missing; `409` while `status="running"` (references not yet settled); `500` on a disk write failure. Re-export overwrites the bundle deterministically (`exported_at` records the moment). No migration, no DB writes — stateless and re-runnable; `artifacts/` is gitignored so bundles never enter version control. `failed` and archived workspaces export normally |
 | demo | POST | `/demo/hitl-decision` | **E5 (#411)** — relay the Showcase HITL step card's Approve/Reject to the in-flight pipeline. Body `{action_id: str, decision: 'approved' \| 'rejected', reason?: str ≤500}` (`ConfigDict(strict=True, extra='forbid')`). `204` on success; `404 application/problem+json` when no matching action is pending; `409` when the action was already decided; `422` on a malformed body. The in-memory single-slot relay is safe because the pipeline runs one-at-a-time under the module `_pipeline_lock`; the pipeline forwards the real decision to `/agents/sessions/{id}/approve` (`approved=true\|false` + reason) — `agent_require_approval` is untouched. A reject keeps the pipeline GREEN (D5); the gated `save_scenario` never executes |
 | demo | GET | `/demo/approval-events` | **E5 (#411)** — recent HITL approval events flattened across the newest saved workspaces carrying the `approval_events` slot, newest-workspace-first (`limit` 1-200 default 50); `200` + empty list when none. Each item carries `workspace_id` / `workspace_name` plus the entry's base + additive keys (`decision`, `tool_name`, `auto_approved`, `reason`, `execution_status`, `transcript_summary`, …). Audit-glance surface — no pagination/offset (D6). Backs the `/ops` page's Approval History table (frontend-only — the ops slice does not import demo code) |
 | config | GET | `/config/ai` | Effective AI-model config (agent LLM + RAG embeddings); API keys masked, never raw |
diff --git a/docs/_base/RUNBOOKS.md b/docs/_base/RUNBOOKS.md
index 6cdb9fd7..2dd0d6dc 100644
--- a/docs/_base/RUNBOOKS.md
+++ b/docs/_base/RUNBOOKS.md
@@ -161,10 +161,11 @@ uv run python scripts/run_demo.py --seed 42 --quiet 2>&1 | tee demo.log
 3. **Rows accumulate unless deleted.** `DELETE /demo/workspaces/{workspace_id}` (and the panel's per-row **Delete** button, behind a confirmation dialog) removes a saved row; a missing id is an RFC 7807 404. Undeleted rows are harmless audit records. E2 (#408) — the panel adds search / tag filter / sort / show-archived (archived rows are hidden from the list by default) and a multi-select **Delete selected** action — N sequential single DELETEs behind one confirmation; deliberately NO bulk endpoint.
 4. **Deleting a workspace deletes METADATA ONLY.** The delete removes just the `showcase_workspace` row — the model runs, scenario plans, aliases, jobs, agent sessions, and on-disk artifacts the run created are NOT touched (and the seeded data is not reverted). `created_objects` ids are SOFT references (deliberately no FKs), so deletion in either direction never cascades: an operator-issued `DELETE /registry/runs/{id}` or scenario-plan delete leaves dangling deep links on a loaded workspace's artifact cards — expected; the workspace row records what WAS created, not what still exists. E2 (#408) — that staleness now SURFACES instead of dangling silently: loading a workspace probes its references via `GET /demo/workspaces/{id}/health`, dead references get a warning marker on the artifact cards, and a summary chip shows alive/dead counts plus a partial-run warning for never-completed rows.
 5. **`holiday_rush` workspaces replay the pinned 2024 window.** The preset seeds a fixed Oct–Dec 2024 window (incident 28 above); a Replay with `reset=false` ADDS those rows to a today-anchored dataset, so `/seeder/status` reports the union range afterwards. For a clean pinned window, save the workspace from a run with **Reset database** ticked — its (destructive) Replay then reproduces the pinned window exactly.
+6. **Export writes a checksum-validated bundle, overwriting on re-export (E6 #412).** The panel's per-row **Export** button (and `POST /demo/workspaces/{workspace_id}/export`) writes `artifacts/showcase/<workspace_id>/` — `manifest.json` (full workspace snapshot + model-run references + `bundle_format_version`/`exported_at`), one `scenario_plans/<id>.json` per resolvable plan, and a `sha256sum`-compatible `checksums.sha256`. The endpoint re-reads and recomputes every checksum before returning (`validated: true`); verify by hand with `cd artifacts/showcase/<id> && sha256sum -c checksums.sha256`. Export is **non-destructive and stateless** — no DB write, no story slot, and `artifacts/` is gitignored so bundles never reach version control. **Dangling soft references are expected, not errors:** a scenario plan or model run deleted since the run becomes an `unresolved_references` entry and the export still succeeds (the success toast names the count). Model artifacts are REFERENCED (uri + registry hash + a live `artifact_verified`), never copied — the registry already owns and hash-verifies them. **Re-export overwrites:** the previous bundle directory is removed wholesale and rewritten (the `shutil.rmtree` target is always the traversal-guarded `<root>/<workspace_id>`, never a raw request value), so a stale `scenario_plans/<id>.json` from a prior export disappears on the next one. A `running` workspace returns `409` (its references are not yet settled); a `failed` or archived workspace exports normally. The export root is configurable via `SHOWCASE_EXPORT_ROOT` (default `./artifacts/showcase`, resolved against the backend CWD — repo root for local uvicorn, `/app` in the container).
 
 **Notes:** keep-runs are recorded by warn-and-continue hooks — a DB hiccup during `create_workspace` yields a green pipeline with `workspace_id: null` and no row (check uvicorn logs for `demo.workspace_create_failed`). Ephemeral runs write no workspace rows and stay in the localStorage Run-history strip; kept runs appear ONLY in the server-backed panel. On `showcase_rich` keep-runs, the planning-phase scenario plans carry the `workspace:<name|id>` tag (E3 #392) — retrieve them via `GET /scenarios?tags=workspace:<label>`. E3 (#409) — a kept run additionally records its `seed_overrides` and `user_scope` story slots at create time; Replay re-submits both verbatim (the slot records the REQUESTED config; the row's `store_id`/`product_id` columns record the EFFECTIVE grain, so a fallen-back scope stays visible).
 
-**Explicitly out of scope (not implemented; future epics, do not assume they exist):** export bundles under `artifacts/showcase/<workspace>/` (E6 #412); mid-run / per-phase re-entry (the linear single-`asyncio.Lock` pipeline is preserved — all configuration is start-frame-time only); the `job_ids` / `phase_summaries` story slots (columns exist, still unwritten). (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim. Run configuration shipped in E4 #410 — the start-frame model set + backtest config in the `run_config` replay-input column, replay-verbatim. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events` / `rag_events` slots are now written, plus the **Reject** button + 10 s decision window + `/ops` Approval History + the loaded-workspace Run-story panel.)
+**Explicitly out of scope (not implemented; future epics, do not assume they exist):** bundle **import / restore** (E6 ships export only — import is the highest-risk surface with an unsettled layout); mid-run / per-phase re-entry (the linear single-`asyncio.Lock` pipeline is preserved — all configuration is start-frame-time only); the `job_ids` / `phase_summaries` story slots (columns exist, still unwritten). (Replay provenance shipped in E1 #407 — `replayed_from_workspace_id` is recorded on every Replay. Advanced seed configuration shipped in E3 #409 — the 7-knob `seed_overrides` panel + `user_scope` focus pair, both replay-verbatim. Run configuration shipped in E4 #410 — the start-frame model set + backtest config in the `run_config` replay-input column, replay-verbatim. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events` / `rag_events` slots are now written, plus the **Reject** button + 10 s decision window + `/ops` Approval History + the loaded-workspace Run-story panel. Export bundles shipped in E6 #412 — `POST /demo/workspaces/{id}/export` writes a checksum-validated `artifacts/showcase/<workspace_id>/` bundle; **import/restore remains out of scope**.)
 
 ### release-please skipped the bump after a dev → main merge
 **Symptoms:** `dev → main` PR is merged, `CD Release` workflow on `main` completes in ~10s, **no Release PR** is opened. release-please log shows `No user facing commits found since <sha> - skipping`.

From 4d1b9ae1fb0f12897e2a3830c9f7b9a75edccf08 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 10:34:03 +0200
Subject: [PATCH 31/32] docs(docs): reconcile domain model showcase export
 out-of-scope note (#420)

---
 docs/_base/DOMAIN_MODEL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/_base/DOMAIN_MODEL.md b/docs/_base/DOMAIN_MODEL.md
index fb9c7878..225ab73d 100644
--- a/docs/_base/DOMAIN_MODEL.md
+++ b/docs/_base/DOMAIN_MODEL.md
@@ -76,7 +76,7 @@
   - Persistence is warn-and-continue: a workspace write failure must never break the demo pipeline (the run completes with `workspace_id: null`). The HTTP-backed helpers (`update_workspace` for PATCH, like get/list/delete) take a caller-owned session and raise normally — warn-and-continue is pipeline-only.
   - E1 (#407): `replayed_from_workspace_id` is a SOFT reference — **no ForeignKey, not even self-referential**: ancestor workspace rows must stay independently deletable (metadata-only delete) without cascading to or blocking descendants. The value is recorded verbatim from the request (no existence check); dangling lineage pointers after an ancestor delete are expected and harmless, like every `created_objects` id.
   - E1 (#407): `status` is NOT patchable — `PATCH /demo/workspaces/{id}` covers `name`/`notes`/`tags`/`archived`/`pinned` only; `archived` is an orthogonal curation flag and the `ck_showcase_workspace_status` CHECK is untouched.
-- **Out of scope (deliberately not modeled yet):** export bundles under `artifacts/showcase/<workspace>/` (E6 #412) and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace. (Advanced seed config + scope selection shipped in E3 #409 — the `seed_overrides`/`user_scope` slots are written. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events`/`rag_events` slots above are now written, plus `result_summary.story_reproduction` on replay keep-runs. The `job_ids`/`phase_summaries` slots remain unwritten.)
+- **Out of scope (deliberately not modeled yet):** bundle **import / restore** (E6 #412 ships export only — import is the highest-risk surface with an unsettled layout) and per-phase interactive configuration — see `docs/_base/RUNBOOKS.md` § Showcase workspace. (Advanced seed config + scope selection shipped in E3 #409 — the `seed_overrides`/`user_scope` slots are written. Agent/HITL + RAG story capture shipped in E5 #411 — the `approval_events`/`rag_events` slots above are now written, plus `result_summary.story_reproduction` on replay keep-runs. Export bundles shipped in E6 #412 — `POST /demo/workspaces/{id}/export` writes a checksum-validated `artifacts/showcase/<workspace_id>/` bundle; import/restore remains out of scope. The `job_ids`/`phase_summaries` slots remain unwritten.)
 
 ## Key Invariants — NEVER violate
 

From 3c49373f53d68d78b404312cad51e1cdb116d657 Mon Sep 17 00:00:00 2001
From: Gabor Szabo <shellsnake@icloud.com>
Date: Sat, 13 Jun 2026 10:34:09 +0200
Subject: [PATCH 32/32] docs(repo): track showcase completion e7 release-gate
 prp (#420)

---
 ...PRP-showcase-completion-E7-release-gate.md | 751 ++++++++++++++++++
 1 file changed, 751 insertions(+)
 create mode 100644 PRPs/PRP-showcase-completion-E7-release-gate.md

diff --git a/PRPs/PRP-showcase-completion-E7-release-gate.md b/PRPs/PRP-showcase-completion-E7-release-gate.md
new file mode 100644
index 00000000..53165471
--- /dev/null
+++ b/PRPs/PRP-showcase-completion-E7-release-gate.md
@@ -0,0 +1,751 @@
+name: "PRP showcase-completion E7 — release gate: docs reconciliation + E1–E6 dogfood evidence matrix + regression/CI audit + release path + umbrella close-out"
+description: |
+  Issue #420 (epic E7 of umbrella #406, milestone showcase-workspace-completion).
+  Release-gate epic: NO production code. Deliverables are (a) a docs
+  reconciliation — verify docs/_base/{API_CONTRACTS,RUNBOOKS,DOMAIN_MODEL}.md
+  reflect E1–E6 and fix the ONE stale line (DOMAIN_MODEL.md:79 still lists E6
+  export as "not modeled yet"); (b) an executed dogfood evidence matrix across
+  the E1–E6 *frontend* control surface on a fresh-DB stack, each action mapped
+  to an umbrella success criterion; (c) a regression + CI-gate audit (the
+  showcase_rich e2e tests + the #146/#324 replay guard + the legacy-frame
+  byte-identical contract, all CI-covered — cite + targeted re-run); (d) the
+  release path confirmed and the target version recorded (the dev→main cut is
+  stop-and-ask, NOT autonomous); (e) evidence on #420, umbrella #406 success
+  criteria ticked, #406 + #420 closed. If any dogfood check fails OUTSIDE the
+  documented expected-outcome matrix, the gate STOPS and files a fix issue — it
+  never fixes forward inside this epic.
+
+---
+
+## Goal
+
+Close umbrella #406 (showcase workspace completion — the forecastlab control
+story) on **proof, not per-epic merges**. E1 #407, E2 #408, E3 #409, E4 #410,
+E5 #411, E6 #412 are all CLOSED and merged to `dev` (E6 in #419, merge commit
+`0381edb`), but **none of the six is in a release** — `main` is still at
+**v0.2.22**, which carries only the *prior* umbrella #389 (#390–#393). Nothing
+has yet verified the six epics' *combined* live behavior across the new control
+surface (lifecycle PATCH, safe replay, advanced seed config + scope, run-config,
+HITL reject + story capture, export), the umbrella's eight success-criteria
+checkboxes are 1/8 ticked (only E6's, ticked during E6 close-out), and one
+`docs/_base/` line is stale.
+
+1. **Docs reconciliation** — `API_CONTRACTS.md` + `RUNBOOKS.md` already document
+   E1–E6 (audited 2026-06-13: complete). Fix the single stale line in
+   `DOMAIN_MODEL.md:79` (E6 export bundles are SHIPPED, not "not modeled yet").
+2. **Dogfood evidence matrix** — on a fresh-DB stack, exercise each E1–E6
+   frontend control once and record the outcome, each mapped to an umbrella
+   success criterion. The backend pipeline is CI-covered (see Task 1); the
+   live gate's unique value is the *frontend* surface + the cross-epic combined
+   behavior on a real browser.
+3. **Regression + CI-gate audit** — cite the latest green `dev` CI run (all five
+   gates incl. the showcase_rich e2e + replay regression run there) and re-run
+   the load-bearing proofs locally: the legacy-frame contract test, the
+   `test_demo_replay_same_config_twice` guard, the demo-slice units, and the
+   frontend demo-component vitest.
+4. **Release path** — confirm E1–E6 are release-ready on `dev`; record the
+   target version (release-please will bump v0.2.22 → next on the dev→main
+   merge). The actual cut is a separate **stop-and-ask** decision (release-please
+   owns tagging) — this gate does NOT merge dev→main.
+5. **Close-out** — evidence comment on #420; tick the 7 remaining #406
+   success-criteria boxes (E6's is already ticked) with evidence; close #406;
+   close #420 last.
+
+**End state**: #406 and #420 CLOSED with linked evidence; `DOMAIN_MODEL.md:79`
+corrected; this PRP file committed; E1–E6 confirmed release-ready with the
+target version recorded for the maintainer's release decision.
+
+## Why
+
+- Every umbrella #406 success criterion is implemented but **only 1/8 is ticked
+  with evidence**, and five of eight are only fully provable by live combined
+  behavior (lineage + destructive-replay confirm; rename/archive/pin/search/
+  multi-delete; seed-override + scope replay-verbatim; run-config echo; HITL
+  reject + story capture on Showcase and /ops).
+- E6 (#412) just merged; the six epics ship together as one user story but have
+  never been exercised back-to-back on one stack. A gate proves the *seams*
+  (e.g. a keep-run that sets seed overrides AND a custom run-config AND triggers
+  an approval, then is replayed and exported).
+- The umbrella deferred the close-out evidence + the final doc reconciliation to
+  this gate (per its decomposition); without it #406 closes on faith.
+- `main` lags behind `dev`: operators reading the GitHub release list see
+  only v0.2.22 (#389). E7 surfaces the unreleased-on-`dev` state and records
+  the release path so the maintainer can cut it deliberately.
+
+## What
+
+A verification campaign plus a one-line docs fix. No `app/`, `frontend/`, or
+`alembic/` change is in scope. Tracked changes: this PRP file +
+`docs/_base/DOMAIN_MODEL.md` (one line), one branch
+(`docs/showcase-completion-e7-gate`), one PR into `dev`.
+
+### Success Criteria (mirror of #420 sub-tasks + the #406 criteria they close)
+
+- [ ] Fresh-DB stack via **DROP/CREATE DATABASE** (NOT `down -v` — Known
+      Gotchas) + `alembic upgrade b7c1d9e3f204` clean (the COMMITTED head — a
+      bare `upgrade head` errors on the local two-head state; see Known
+      Gotchas); a targeted downgrade→upgrade round-trip of the head-adjacent
+      showcase migration (E4 run_config) proves "applies + downgrades cleanly"
+      (#406 criterion 1).
+- [ ] Latest `dev` CI run cited GREEN (all five gates; the showcase_rich e2e +
+      replay regression run there); local re-run of the legacy-frame contract
+      test + `test_demo_replay_same_config_twice` both pass (#406 criteria 8 + 2).
+- [ ] **E1 dogfood**: a workspace renamed / annotated (notes+tags) via the Edit
+      dialog, pinned, archived — each reflected in `GET /demo/workspaces/{id}`
+      (#406 criterion 3).
+- [ ] **E2 dogfood**: a `reset=true` workspace Replay opens the destructive-
+      escalated confirm dialog (recorded-vs-sent diff) and only runs on confirm;
+      a replayed row shows the lineage chain + replay badge and
+      `replayed_from_workspace_id` set; search / tag-filter / sort / multi-select
+      delete all work from the panel; the two-workspace compare page renders;
+      a loaded workspace shows link-health markers (#406 criteria 2 + 3).
+- [ ] **E3 dogfood**: a keep-run started with ≥2 seed-override knobs + an explicit
+      focus pair persists both story slots (`GET /demo/workspaces/{id}` →
+      `seed_overrides` + `user_scope`); Replay re-submits them verbatim and the
+      new row carries identical slots (#406 criterion 4; backed by
+      `test_demo_replay_preserves_seed_overrides_and_scope`).
+- [ ] **E4 dogfood**: a keep-run started with a custom model set + backtest config
+      records `run_config` on the row (the panel renders the "custom: …" badge);
+      a default-config run leaves `run_config` null (#406 criterion 5).
+- [ ] **E5 dogfood**: an approval is captured on a `showcase_rich` keep-run
+      (auto-approve OR a live Reject if a cloud agent model is configured),
+      surfaced in the loaded workspace's Run-story panel AND the `/ops` Approval
+      History table; a Reject keeps the run GREEN and writes no `scenario_plan`
+      (#406 criterion 6; backed by `test_run_demo_showcase_rich_full_epic`).
+- [ ] **E6 dogfood**: Export on a saved row → `sha256sum -c checksums.sha256`
+      passes (re-confirm; #406 criterion 7 already ticked).
+- [ ] 8-preset run matrix executed/recorded against RUNBOOKS entry 28; outcomes
+      conformant (no undocumented ❌).
+- [ ] `DOMAIN_MODEL.md:79` corrected (export off the "not modeled yet" list;
+      import/restore stays out); `git diff --stat` shows only intended lines.
+- [ ] Five validation gates green on the docs branch + targeted frontend
+      demo-component vitest green.
+- [ ] Release path recorded (target version; dev→main is stop-and-ask).
+- [ ] Evidence on #420; #406's 7 remaining success boxes ticked; #406 closed;
+      #420 closed; docs PR open into `dev`.
+
+## All Needed Context
+
+### Documentation & References
+
+```yaml
+# ── The gate's contract ──────────────────────────────────────────────────────
+- issue: "#420 — gh issue view 420"
+  why: The epic's six sub-tasks this PRP encodes (docs sweep, dogfood matrix,
+       regression audit, CI-gate audit, release path, umbrella close-out).
+
+- issue: "#406 — gh issue view 406 --json body"
+  why: "Umbrella. STATE (verified 2026-06-13): Decomposition E1–E6 ticked +
+       E7 line wired to #420; Success-criteria 1/8 ticked (only the E6 'Export
+       produces…' box). Tick the OTHER 7 with evidence; close with a close-out
+       comment. E1–E6 = #407 #408 #409 #410 #411 #412, all CLOSED, merged to
+       dev, UNRELEASED (main = v0.2.22 = prior umbrella #389)."
+
+- file: PRPs/PRP-showcase-workspace-E5-release-gate.md
+  why: "THE release-gate precedent this PRP mirrors directly (STOP rule, fresh-
+       stack procedure, per-preset matrix, evidence format, close-out order).
+       That gate closed the PRIOR umbrella #389; this one closes #406. Reuse
+       its structure verbatim where it still applies; the deltas are: docs are
+       already swept (verify + 1-line fix, not from-scratch), the dogfood centres
+       on the E1–E6 frontend surface (backend is CI-covered), and there is a
+       real release-path sub-task."
+
+- file: PRPs/PRP-reliability-E6-release-gate.md
+  why: "Second release-gate precedent (STOP rule + evidence + close-out order).
+       Its 'docker compose down -v' fresh-stack step is SUPERSEDED — use the
+       DROP/CREATE procedure in Known Gotchas."
+
+# ── What E1–E6 shipped (the six PRPs — read the one a failing check touches) ──
+- file: PRPs/PRP-showcase-completion-E1-metadata-provenance-backbone.md
+  why: "E1 contract: showcase_workspace lifecycle/provenance columns + 6 JSONB
+       story slots + PATCH /demo/workspaces/{id}. config_schema_version starts 1."
+- file: PRPs/PRP-showcase-completion-E2-safe-replay-lifecycle.md
+  why: "E2: destructive-replay confirm + diff, lineage, rename/archive/pin/tags,
+       list search/filter/sort, multi-delete, compare, link-health probe."
+- file: PRPs/PRP-showcase-completion-E3-seed-config-scope.md
+  why: "E3: 7-knob SeederOverrides + user_scope; replay-verbatim slot contract."
+- file: PRPs/PRP-showcase-completion-E4-run-config-phase-controls.md
+  why: "E4: train_model_types + backtest (DemoBacktestConfig) → run_config column."
+- file: PRPs/PRP-showcase-completion-E5-agent-rag-story-capture.md
+  why: "E5: hitl-decision relay, approval_events/rag_events capture, Reject +
+       10s window, /ops approval history, config_schema_version 1→2."
+- file: PRPs/PRP-showcase-completion-E6-export-bundle.md
+  why: "E6: POST /demo/workspaces/{id}/export + Export button + bundle layout."
+
+# ── Frontend dogfood surface (the E1–E6 controls — file:line, verified 2026-06-13)
+- file: frontend/src/pages/showcase.tsx
+  why: "Core run controls: scenario card grid (ScenarioPicker), 'Re-seed first'
+       checkbox (:345 → skip_seed=false), 'Reset database' (:363 → reset=true),
+       Seed input (:383), 'Save as workspace' + name input (:398/:409), Run
+       (:323) / Stop (:337). Dirty-only rule (:193-194): train_model_types /
+       backtest omitted from the frame when equal to defaults (the byte-identical
+       guard). SeedConfigPanel rendered only when reseed ticked (:432); ScopeSelector
+       warns on reset (:446); WorkspaceArtifactsPanel link-health (:554)."
+- file: frontend/src/components/demo/WorkspacePanel.tsx
+  why: "Saved-workspaces panel. E1: pin (:379), Archive dropdown (:469), Edit
+       details… → WorkspaceEditDialog. E2: search (:297), Show-archived (:306),
+       Sort select (:312), tag chip filter (:324/:406), multi-select + Delete
+       selected (:496), Compare → /showcase/compare (:517), replay/archived/
+       pinned badges (:389-390). E6: Export button (:440), success toast (:251)."
+- file: frontend/src/components/demo/ReplayConfirmDialog.tsx
+  why: "E2 destructive-replay confirm. Recorded-vs-sent diff table (:56); the
+       confirm button + warning escalate destructively when reset=true (:99-102).
+       NO replay starts without this dialog."
+- file: frontend/src/components/demo/WorkspaceLineageStrip.tsx
+  why: "E2 lineage breadcrumb (:26); renders nothing when <2 entries (:31);
+       '(original deleted)' for dangling ancestors."
+- file: frontend/src/components/demo/WorkspaceEditDialog.tsx
+  why: "E1 rename/notes/tags editor (:46); opened from the row 'Edit details…'."
+- file: frontend/src/components/demo/SeedConfigPanel.tsx
+  why: "E3 7-knob panel (:66/:91); window_days locks on holiday_rush (:109);
+       risk warning at high sparsity/stockout. Only when Re-seed ticked."
+- file: frontend/src/components/demo/ScopeSelector.tsx
+  why: "E3 focus-pair selects (:40/:65/:90); 'Auto-discover' placeholders."
+- file: frontend/src/components/demo/RunConfigPanel.tsx
+  why: "E4 'Run configuration (advanced)' (:46/:82); CandidateModelPicker (:119,
+       opt-in models only when forecast_enable_* on, filtered at :58-60);
+       DemoBacktestSettingsForm; train-candidate preview (:137). 'custom: …'
+       badge on the row via WorkspacePanel runConfigSummary."
+- file: frontend/src/components/demo/demo-step-card.tsx
+  why: "E5 HITL card: Approve (:417) + Reject (:425) + auto-approve countdown
+       (:434, reads data.decision_window_s); rendered only when
+       awaiting_approval && status='running' && action_id is str (:518-529).
+       On this host the 2B agent often skips the tool → buttons may never appear
+       (see Known Gotchas — acceptable; e2e test covers the path)."
+- file: frontend/src/components/demo/WorkspaceStoryPanel.tsx
+  why: "E5 Run-story panel on a loaded workspace (:74): approval history (:114),
+       replay-reproduction markers (:95), knowledge/rag events (:152). Self-hides
+       when no approval_events/rag_events/story_reproduction."
+- file: frontend/src/pages/ops.tsx
+  why: "E5 Approval History table (:101, useApprovalEvents) — flattened approvals
+       across saved workspaces."
+
+# ── Regression tests (CI-covered — cite + targeted re-run; verified 2026-06-13)
+- file: tests/test_e2e_demo.py
+  why: "@pytest.mark.integration, subprocess uvicorn :8124, real Postgres.
+       test_demo_replay_same_config_twice (:561) — #146/#324 replay guard
+       (reset=true; RESETS the shared DB — run ONLY after the dogfood, never
+       concurrently). test_demo_replay_preserves_seed_overrides_and_scope (:618)
+       — E3 replay-slot contract. test_run_demo_showcase_rich_full_epic (:410)
+       — PRP-41 agent_hitl_flow + ops_snapshot (CI proof for E5 backend).
+       test_run_demo_showcase_rich_e2e (:204) + _decision_portfolio (:295) —
+       24-step pipeline. These make the backend pipeline CI-covered; the live
+       dogfood need not re-prove it step-by-step."
+- file: app/features/demo/tests/test_routes.py
+  why: "test_demo_stream_websocket_legacy_frame_ignores_unknown_keys (:190) —
+       THE legacy-frame byte-identical contract (#406 criterion 8). PATCH/DELETE/
+       health/list integration tests (:696-834). Export route 404/409/200
+       (:256-? via test_export.py)."
+- file: app/features/demo/tests/test_workspace.py
+  why: "Module-level @pytest.mark.integration. create/finalize + E3 slot
+       persistence (:79/:97), E4 run_config (:110/:135), E5 story slots
+       (:423/:439). The ORM-level proof behind the dogfood's curl assertions."
+- file: app/features/demo/tests/test_export.py
+  why: "E6 unit + integration (sha256/traversal/manifest + endpoint round-trip)."
+
+# ── Doc-sweep targets ────────────────────────────────────────────────────────
+- file: docs/_base/DOMAIN_MODEL.md
+  why: "THE only stale line. Line 79: '**Out of scope (deliberately not modeled
+       yet):** export bundles under `artifacts/showcase/<workspace>/` (E6 #412)
+       and per-phase interactive configuration …'. E6 SHIPPED export — drop it
+       from the out-of-scope clause, keep import/restore + per-phase config out.
+       Mirror the RUNBOOKS.md:168 phrasing E6 already landed ('bundle import/
+       restore … E6 ships export only')."
+- file: docs/_base/API_CONTRACTS.md
+  why: "READ-ONLY — audited complete for E1–E6 (the /demo + /seeder rows carry
+       every epic's additive fields incl. the E6 export row at line 68). Cross-
+       check; do NOT edit."
+- file: docs/_base/RUNBOOKS.md
+  why: "READ-ONLY — audited complete (Showcase-workspace section lines 154-168
+       covers E1–E6; out-of-scope line 168 already correct after E6). Cross-check."
+
+# ── Close-out mechanics ──────────────────────────────────────────────────────
+- file: .claude/rules/umbrella-issue.md
+  why: "Write discipline for gh mutations: echo each command → idempotent check
+       → confirm. Applies to the #406 body edit + closes. Fetch the LIVE body;
+       never retype it."
+- file: .claude/rules/output-formatting.md
+  why: "Evidence-comment format: emoji status indicators, box separators, ≤40 lines."
+- doc: "Release flow — docs/_base/PIPELINE_CONTRACT.md + .claude/rules/versioning.md"
+  why: "dev→main → release-please opens a Release PR → merge tags vX.Y.Z. Pre-1.0
+       feat: → PATCH bump (v0.2.22 → v0.2.23). The merge-commit-subject trap
+       (RUNBOOKS 'release-please skipped the bump'): use the GitHub web UI or a
+       non-conventional --subject. This gate RECORDS the path; it does NOT cut."
+```
+
+### Current Codebase tree (verification-relevant subset)
+
+```bash
+app/features/demo/                         # the slice E1–E6 extended
+  ├── models.py        # showcase_workspace ORM (E1 columns + story slots)
+  ├── routes.py        # /demo/workspaces[,/{id}[,/health,/export]], PATCH, hitl-decision, approval-events
+  ├── workspace.py     # create/finalize/list/get/update/delete (slot writers)
+  ├── export.py        # E6 bundle writer
+  └── tests/           # test_routes, test_workspace, test_export, test_link_health, test_pipeline, test_hitl, test_schemas
+tests/test_e2e_demo.py                     # showcase_rich e2e + replay guard (CI)
+frontend/src/pages/showcase.tsx            # dogfood entry point
+frontend/src/pages/ops.tsx                 # E5 approval history
+frontend/src/components/demo/              # E1–E6 controls + *.test.tsx (17 component tests)
+docs/_base/DOMAIN_MODEL.md                 # sweep target (1 stale line :79)
+docs/_base/API_CONTRACTS.md               # audited complete (read-only)
+docs/_base/RUNBOOKS.md                     # audited complete (read-only)
+docker-compose.yml                         # base Postgres+pgvector
+docker-compose.gpu.yml                     # GPU overlay for ollama (REQUIRED for the rag legs)
+docker-compose.lan.yml                     # untracked local overlay — do NOT use here
+```
+
+### Desired Codebase tree (files added/modified)
+
+```bash
+PRPs/PRP-showcase-completion-E7-release-gate.md   # ADD — this file
+docs/_base/DOMAIN_MODEL.md                        # MOD — fix the one stale line (:79)
+# No app/, frontend/, or alembic/ change is in scope.
+```
+
+### Known Gotchas & Environment Quirks
+
+```python
+# ── STOP RULE (governs the whole epic) ───────────────────────────────────────
+# If ANY dogfood check deviates from the expected-outcome matrix below: capture
+# evidence (response body / screenshot / step table), open a NEW fix issue
+# referencing #406 + #420, comment the failure on #420, and STOP the close-out.
+# The docs fix (Task 4) + the PR still land — they document already-shipped
+# semantics and are independent of dogfood outcomes. A DOCUMENTED expected-skip
+# (agent_hitl_flow ⏭️, rag legs ⏭️ when the provider is down, sparse fail) is NOT
+# a deviation.
+
+# ── Fresh stack — DROP/CREATE, NEVER `down -v` (memory: fresh-stack-gate-procedure)
+# `down -v` removes ALL named volumes incl. forecastlab_ollama_models (pulled
+# gemma4/qwen3 models, expensive to rebuild). Fresh-DB equivalent:
+#   docker compose --profile gpu down --remove-orphans
+#   docker compose -f docker-compose.yml -f docker-compose.gpu.yml --profile gpu up -d
+#   docker compose exec -T postgres psql -U forecastlab -d postgres \
+#     -c "DROP DATABASE IF EXISTS forecastlab WITH (FORCE);" \
+#     -c "CREATE DATABASE forecastlab OWNER forecastlab;"
+#   uv run alembic upgrade b7c1d9e3f204   # cold-boot proof to the COMMITTED head
+# MULTI-HEAD BLOCKER (verified 2026-06-13): `alembic heads` returns TWO heads —
+# b7c1d9e3f204 (committed E4 run_config) AND the untracked, LOCAL-ONLY
+# a2b3c4d5e6f7_rag_embedding_dim_2560_qwen3 (branches off old rev c1d2e3f40512).
+# So a BARE `alembic upgrade head` ERRORS "Multiple head revisions are present".
+# Before upgrading, EITHER move the untracked qwen3 file out of alembic/versions/
+# (then `alembic upgrade head` is unambiguous), OR target the committed head
+# explicitly: `uv run alembic upgrade b7c1d9e3f204`. (CI never hits this — the
+# qwen3 file is untracked, so CI's migration-check sees one head.)
+# DOWNGRADE proof (#406 criterion 1 — "applies + downgrades cleanly"): after the
+# clean upgrade, round-trip the head-adjacent showcase migration once:
+#   uv run alembic downgrade -1 && uv run alembic upgrade b7c1d9e3f204   # both exit 0
+# (The head-adjacent COMMITTED showcase migration is E4 b7c1d9e3f204 = run_config,
+#  down_revision d45cf40dfe47 = the E1 #407 metadata/slots migration. `downgrade
+#  -1` reverses E4 run_config — a clean round-trip that satisfies criterion 1;
+#  to also exercise the E1 column/slot down-path, downgrade a second step.)
+# GOTCHA (memory: rag-runtime-config-and-corpus-state): that untracked qwen3
+# migration is LOCAL-ONLY — the committed tree's head yields the 1536-dim
+# (OpenAI) embedding column. If .env has RAG_EMBEDDING_PROVIDER=ollama +
+# qwen3-embedding:4b (2560-d), the showcase_rich knowledge legs dim-mismatch
+# unless that local migration is applied too (keep it in alembic/versions/ and
+# run `alembic upgrade heads`, plural, to apply BOTH heads), OR accept the rag legs
+# ⏭️-skipping (documented-acceptable RUNBOOKS entries 20-22) — both conformant.
+# GOTCHA: WITHOUT the gpu overlay ollama runs CPU-only and rag_index_subset can
+# 502 on the cold embedder load. Verify `docker exec forecastlab-ollama nvidia-smi`,
+# then WARM the embedder before any showcase_rich run:
+#   curl -s localhost:11434/api/embed -d '{"model":"<configured-embed-model>","input":"warmup"}'
+# GOTCHA: fresh DB wipes app_config overrides — agent model reverts to .env
+# (AGENT_DEFAULT_MODEL on this host is an ollama model). Re-check GET /config/ai.
+# GOTCHA (memory: dogfood-stale-uvicorn-port-8123): a stale uvicorn from a prior
+# session can hold :8123 → curl hits OLD code. `lsof -iTCP:8123 -sTCP:LISTEN`,
+# kill stale PIDs first. Run the backend as LOCAL uvicorn from the REPO ROOT
+# (host-filesystem artifacts + docs/ for the rag legs; the compose backend image
+# lacks docs/, which is why docker-compose.lan.yml exists — do NOT use it here).
+# pnpm 11 depsStatusCheck can stall `pnpm dev` → start Vite directly:
+#   cd frontend && ./node_modules/.bin/vite --host 0.0.0.0
+# GOTCHA (memory: seeder-does-not-reset-id-sequences): the seeder does NOT reset
+# store/product id sequences. After a reset+reseed the focus-pair ids change —
+# discover live ids via GET /dimensions/stores + /dimensions/products before
+# entering an E3 focus pair; never assume id=1.
+
+# ── E5 live-Reject caveat (memory: gemma4-agent-local-deployment) ─────────────
+# AGENT_DEFAULT_MODEL on this host is a 2B ollama model that RELIABLY skips the
+# save_scenario tool → the HITL Approve/Reject buttons may never render (step
+# ⏭️ 'agent did not trigger save_scenario' — RUNBOOKS entry 25, acceptable).
+# To live-dogfood the Reject BUTTON: PATCH /config/ai to a cloud agent model
+# (e.g. anthropic:claude-… or openai:gpt-…; keys present in .env per
+# CLAUDE.local.md) — no restart needed — so the agent reliably calls
+# save_scenario and the 10s window opens; then click Reject and assert the run
+# stays GREEN with NO new scenario_plan row. If you keep the ollama model, the
+# E5 criterion rests on (a) test_run_demo_showcase_rich_full_epic (CI), (b) the
+# unit tests test_step_agent_hitl_manual_approval / _auto_approved, and (c) the
+# /ops Approval History + Run-story panel rendering an AUTO-APPROVED capture from
+# a prior keep-run. Record which path you took. Revert the model override after.
+
+# ── Per-preset expected-outcome matrix (RUNBOOKS entry 28 — the dogfood spec) ─
+# Every run: 'Re-seed first' TICKED. seed=42.
+#   demo_minimal / retail_standard / high_variance / stockout_heavy /
+#   new_launches  → 11 steps GREEN.
+#   sparse        → 11 steps GREEN **or documented FAIL** at features/backtest
+#                   (50% missing grains / all-NaN WAPE gate) — the card carries
+#                   the expected-skip badge; either outcome is matrix-conformant.
+#   holiday_rush  → tick **Reset database** TOO (pinned 2024-10-01..12-31 window;
+#                   re-seed without reset ADDS rows → union range). 11 steps GREEN.
+#   showcase_rich → 24 steps / 10 phases; tick **Reset database** TOO. Acceptable
+#                   non-green (RUNBOOKS 9-26, 23-26): agent_hitl_flow ⏭️ (2B agent),
+#                   rag_index_subset / rag_retrieve_probe ⏭️ (provider down/dim-
+#                   mismatch), verify ⏭️ (V2 prophet_like winner), batch_preset ⚠️
+#                   (90s poll), ops_snapshot ⚠️. ANY other ❌/⏭️ = deviation → STOP.
+# Only ONE pipeline at a time (module asyncio.Lock; 2nd start → 409 / one error
+# event; Stop releases in ~5s). The 8-preset run matrix can be LIGHTER than the
+# #401 gate's (the pipeline is CI-covered) — one representative GREEN per preset
+# class + the two keep-runs (demo_minimal, showcase_rich) is sufficient evidence.
+
+# ── Tests / gates (memory: integration-suite-shared-state-pollution) ─────────
+# NEVER run the full `-m integration` suite as a gate — known shared-state
+# pollution. Run TARGETED only. test_demo_replay_same_config_twice RESETS the
+# shared DB (reset=true on :8124) — run it ONLY after the dogfood matrix, never
+# concurrently with a :8123 run.
+# memory: frontend-tsc-noemit-gate-vacuous — `pnpm tsc --noEmit` checks 0 files;
+# `tsc -b` has PRE-EXISTING dev failures (baseline 30 at gate time). Frontend
+# evidence = `pnpm lint` (0 errors) + targeted vitest, NOT a clean tsc.
+# memory: playwright-dogfood-snap-chromium — Playwright MCP + `playwright install`
+# FAIL on this host; use native Python Playwright with
+# executable_path="/snap/bin/chromium" (symlink verified) or the agent-browser
+# skill. localhost:5173 is fine.
+
+# ── Docs fix (memory: repo-line-endings-crlf) ────────────────────────────────
+# DOMAIN_MODEL.md is CRLF-dominant/mixed. Editing it can flip an unrelated LF
+# block to CRLF → a whole-file noise diff. After the one-line edit, run
+# `git diff --stat docs/_base/DOMAIN_MODEL.md`; if it dwarfs ~2 changed lines,
+# restore from HEAD and re-apply byte-precisely (`git show HEAD:<file>` + Python
+# binary-mode replace preserving the line's CRLF terminator). The single target
+# line (:79) is CRLF — keep it CRLF.
+
+# ── Third-party API claims ───────────────────────────────────────────────────
+# None. This PRP cites no new library attributes; every verification command is
+# first-party (curl / pytest / grep / gh / sha256sum). (Policy per #258.)
+
+# ── GitHub close-out (write discipline: .claude/rules/umbrella-issue.md) ──────
+# #406 body edit: fetch with `gh issue view 406 --json body`, tick the 7
+# remaining Success-criteria boxes (leave the already-ticked E6 'Export
+# produces…' box). Decomposition E1–E7 already correct (done during E6 + E7
+# scaffold). Preserve everything else byte-identical; edit the FETCHED markdown,
+# never retype it. Close order: PR opened → evidence on #420 → tick #406 →
+# close #406 (comment links the #420 evidence) → close #420 last. The PR needs
+# 1 review + CI — opening it suffices to proceed (reliability-E6 precedent).
+# The dev→main RELEASE cut is a SEPARATE stop-and-ask — do NOT `gh pr merge`
+# dev→main inside this gate.
+```
+
+## Implementation Blueprint
+
+### Data models and structure
+
+None. Zero schemas, zero migrations, zero source changes. The only authored
+content is one corrected sentence in `DOMAIN_MODEL.md` (Task 4) and this PRP file.
+
+### List of tasks in execution order
+
+```yaml
+Task 0 — Preflight:
+  VERIFY branch: git switch dev && git pull --ff-only → clean, up to date.
+  VERIFY no stale server: lsof -iTCP:8123 -sTCP:LISTEN → kill stale PIDs.
+  VERIFY chromium: ls -la /snap/bin/chromium (else plan to use the agent-browser skill).
+  VERIFY epics CLOSED: for n in 407 408 409 410 411 412; do gh issue view $n --json state --jq .state; done → all CLOSED (gh issue view takes ONE issue per call).
+  RECORD: git rev-parse HEAD → the SHA all evidence refers to (expect the #419
+    merge 0381edb or later).
+
+Task 1 — Regression + CI-gate audit (committed proofs; do this BEFORE the live
+         stack so a red gate stops the gate early):
+  CITE CI: gh run list --workflow ci.yml --branch dev --limit 1 --json
+    databaseId,conclusion,headSha → MUST be success on the current dev HEAD
+    (27460856895 / 0381edb2 at authoring time; re-cite the latest). This run
+    is the green proof of all five gates INCLUDING the showcase_rich e2e
+    (test_run_demo_showcase_rich_*) + the replay guard (they are -m integration
+    and run in CI's test job).
+  RUN the five gates locally on dev (fast, no live stack):
+    uv run ruff check . && uv run ruff format --check .
+    uv run mypy app/ && uv run pyright app/
+    uv run pytest -v -m "not integration"          # ~2112 pass
+  RUN the load-bearing targeted proofs:
+    uv run pytest "app/features/demo/tests/test_routes.py::test_demo_stream_websocket_legacy_frame_ignores_unknown_keys" -v   # legacy-frame contract
+    (cd frontend && ./node_modules/.bin/eslint src && ./node_modules/.bin/vitest run src/components/demo/)   # frontend demo components; the subshell keeps the cwd at the repo root even on failure
+  DEFER to Task 6 (resets the DB): test_demo_replay_same_config_twice.
+  ON any red gate on an untouched surface → STOP RULE (regression).
+
+Task 2 — Fresh-DB stack (memory-corrected; NEVER down -v):
+  RUN: docker compose --profile gpu down --remove-orphans
+  RUN: docker compose -f docker-compose.yml -f docker-compose.gpu.yml --profile gpu up -d
+  VERIFY: docker exec forecastlab-ollama nvidia-smi → GPU visible (else rag legs ⏭️)
+  RUN: docker compose exec -T postgres psql -U forecastlab -d postgres \
+         -c "DROP DATABASE IF EXISTS forecastlab WITH (FORCE);" \
+         -c "CREATE DATABASE forecastlab OWNER forecastlab;"
+  RUN: uv run alembic upgrade b7c1d9e3f204   # MUST exit 0 (COMMITTED head; a bare
+       `upgrade head` errors "Multiple head revisions" while the untracked qwen3
+       migration is present — see the MULTI-HEAD BLOCKER in Known Gotchas)
+  RUN: uv run alembic downgrade -1 && uv run alembic upgrade b7c1d9e3f204   # round-trip
+       (the "applies + downgrades cleanly" proof for #406 criterion 1 — reverses
+       the head-adjacent E4 run_config migration and re-applies it)
+  (Optional rag path) To run the showcase_rich knowledge legs against ollama
+       qwen3 embeddings, keep the local qwen3 migration in place and apply
+       BOTH heads (`alembic upgrade heads`, plural; a bare `upgrade head` still
+       errors while two heads exist); else accept rag-leg ⏭️.
+  WARM embedder: curl -s localhost:11434/api/embed \
+       -d '{"model":"<configured-embed-model>","input":"warmup"}'
+  START backend: uv run uvicorn app.main:app --port 8123  (background, repo root,
+       log to file); VERIFY curl -s localhost:8123/health → {"status":"ok"}.
+  VERIFY config: curl -s localhost:8123/config/ai → agent model = the .env value.
+  START frontend: cd frontend && ./node_modules/.bin/vite --host 0.0.0.0
+       (background); VERIFY curl -sI localhost:5173 → 200.
+
+Task 3 — Dogfood evidence matrix (the live gate; browser at :5173/showcase). For
+         each, drive the UI then ASSERT over curl; capture a screenshot.
+  3a SEED + LEGACY-FRAME probe:
+     DRIVE one run, defaults (demo_minimal, Re-seed ✓, Save-as-workspace UNticked)
+     → green 11 steps. ASSERT GET '/demo/workspaces?limit=100' → zero rows
+     (ephemeral created none — the live byte-compat echo of test :190).
+  3b E3 keep-run (= demo_minimal matrix row + E3 + E4 proof in one):
+     UI: demo_minimal, Re-seed ✓, open 'Advanced seed config' → set ≥2 knobs
+       (e.g. stores=6, noise_sigma=0.2); open ScopeSelector → pick a live
+       store+product pair (discover ids via /dimensions first); open 'Run
+       configuration (advanced)' → change the model set and/or a backtest knob
+       (e.g. metric=rmse); Save as workspace ✓, name=e7-gate-cfg → Run → green.
+     ASSERT: GET /demo/workspaces/{id} → seed_overrides has the 2 knobs,
+       user_scope = the picked pair, run_config = {train_model_types, backtest};
+       the panel row shows the 'custom: …' badge.
+     UI: Load the row → config repopulates (seed panel + scope + run-config);
+       WorkspaceArtifactsPanel renders link-health markers.
+  3c E1 lifecycle on that row:
+     UI: 'Edit details…' → set notes + 2 tags → save; pin; archive (then show-
+       archived to see it). ASSERT GET /demo/workspaces/{id} → notes/tags/
+       pinned/archived reflect; the panel search + tag-chip filter + sort
+       toolbar all narrow the list; select 2 rows → Compare → /showcase/compare
+       renders a diff.
+  3d E2 safe replay + lineage:
+     UI: Replay the e7-gate-cfg row → the ReplayConfirmDialog opens with the
+       recorded-vs-sent diff; confirm → green re-run → a NEW row appears with
+       a 'replay' badge. ASSERT GET /demo/workspaces/{new_id} →
+       replayed_from_workspace_id == the source id, AND seed_overrides /
+       user_scope / run_config identical to the source (E3+E4 replay-verbatim).
+       Load the new row → the lineage strip shows the chain (≥2 entries).
+     UI (destructive escalation): make/keep a reset=true workspace, click Replay
+       → the dialog shows destructive copy + 'Replay & wipe database' (do NOT
+       confirm unless you intend the wipe — the dialog-open is the evidence).
+     UI: multi-select 2 rows → Delete selected → confirm once → both gone;
+       created objects survive (ASSERT a referenced run still GET-able).
+  3e E5 keep-run on showcase_rich (the approval + story proof):
+     (If exercising the live Reject: first PATCH /config/ai to a cloud agent model.)
+     UI: showcase_rich, Re-seed ✓, Reset database ✓, Save as workspace ✓,
+       name=e7-gate-rich → Run → 24 steps, zero undocumented ❌. If the HITL
+       Approve/Reject buttons render in the 10s window: click Reject → ASSERT
+       the run stays GREEN (terminal pass 'rejected by operator') and GET
+       /scenarios shows NO new plan from this run. Else record the auto-approve
+       (⏭️/auto path) per the gemma4 caveat.
+     ASSERT: GET /demo/workspaces/{id} → approval_events populated (the decision),
+       rag_events populated (if the rag legs ran), created_objects has
+       winning_run_id/v2_run_id/alias/scenario_plan_ids/batch_id.
+     UI: Load the row → Run-story panel shows the approval history (+ reproduction
+       markers if a replay); open /ops → Approval History table lists the entry.
+  3f E6 export (re-confirm criterion 7):
+     UI: Export the e7-gate-rich row → success toast (bundle path + checksums
+       verified). SHELL: cd artifacts/showcase/<id> && sha256sum -c checksums.sha256
+       → all OK.
+  3g 8-preset run matrix (light — pipeline is CI-covered):
+     FOR preset IN [retail_standard, high_variance, stockout_heavy, new_launches]:
+       Re-seed ✓ → Run → record GREEN.
+     sparse: Re-seed ✓ → Run → record GREEN or the documented features/backtest
+       FAIL (either is conformant).
+     holiday_rush: Re-seed ✓ AND Reset ✓ → Run → GREEN; RECORD /seeder/status
+       range == 2024-10-01..2024-12-31 (pinned).
+     ON any non-conformant outcome → STOP RULE.
+
+Task 4 — Docs fix (lands regardless of dogfood outcome):
+  BRANCH: git switch -c docs/showcase-completion-e7-gate  (off dev)
+  MODIFY docs/_base/DOMAIN_MODEL.md line ~79 — drop 'export bundles under
+    artifacts/showcase/<workspace>/ (E6 #412)' from the "Out of scope
+    (deliberately not modeled yet)" clause; keep 'per-phase interactive
+    configuration' listed as out of scope, and ADD the parenthetical that export shipped in E6
+    #412 (import/restore remains out) — mirror RUNBOOKS.md:168 phrasing.
+  CHECK: git diff --stat docs/_base/DOMAIN_MODEL.md → ~1-3 lines only (CRLF guard).
+  COMMIT 1: docs(docs): reconcile domain model showcase export out-of-scope note (#420)
+  COMMIT 2: docs(repo): track showcase completion e7 release-gate prp (#420)  # this file
+  PUSH; OPEN PR into dev (needs 1 review + CI; opening suffices to proceed).
+
+Task 5 — Five validation gates (on the docs branch):
+  RUN: uv run ruff check . && uv run ruff format --check .
+  RUN: uv run mypy app/ && uv run pyright app/
+  RUN: uv run pytest -v -m "not integration"
+  PLUS: cd frontend && ./node_modules/.bin/eslint src && ./node_modules/.bin/vitest run src/components/demo/
+  ALL must pass. A failure on an untouched surface = regression → STOP RULE.
+
+Task 6 — Replay regression (verify-only; AFTER the dogfood — it RESETS the DB):
+  RUN: uv run pytest "tests/test_e2e_demo.py::test_demo_replay_same_config_twice" -v -m integration
+  (Optional) uv run pytest "tests/test_e2e_demo.py::test_demo_replay_preserves_seed_overrides_and_scope" -v -m integration
+  EXPECT: pass in ≤ ~8 min each (240s-budget runs on :8124).
+
+Task 7 — Release path (record only; the cut is stop-and-ask):
+  CONFIRM: gh pr list --base main → no open Release PR; main = v0.2.22.
+  RECORD: the E1–E6 commits on dev are all feat:/fix:/docs: — release-please
+    will bump v0.2.22 → v0.2.23 (pre-1.0 feat→PATCH) on the dev→main merge.
+  WRITE in the #420 evidence: "Release-ready: dev is green; cutting v0.2.23
+    requires a dev→main PR (web-UI merge or non-conventional --subject to avoid
+    the merge-subject trap, RUNBOOKS 'release-please skipped the bump'). This
+    gate does NOT cut — the release is the maintainer's stop-and-ask."
+
+Task 8 — Evidence + close-out (gh write discipline: echo each command first;
+         ONLY if Tasks 1-3 + 5-6 were fully matrix-conformant):
+  COMMENT on #420: evidence block per output-formatting.md — HEAD SHA, CI
+    citation, fresh-DB + downgrade proof, the dogfood matrix (per-epic
+    action → outcome → curl assertion), 8-preset matrix, export sha256sum -c,
+    gate results, replay-test result, screenshot paths, release path, PR link.
+  EDIT #406 body: tick the 7 remaining Success-criteria boxes (leave E6's
+    'Export produces…' already-ticked box). Byte-preserve the rest (fetch live).
+  CLOSE #406: gh issue close 406 --comment "<close-out linking the #420
+    evidence + epics #407-#412 + the docs PR + the recorded release path>"
+  CLOSE #420: gh issue close 420 --comment "<gate complete — evidence above;
+    docs PR <link> lands through normal review; release v0.2.23 pending the
+    maintainer's dev→main cut>"
+
+Task 9 — Teardown:
+  STOP the background uvicorn + vite; REVERT any /config/ai agent-model override.
+  LEAVE the seeded DB + workspace rows + export bundles in place (operator
+    artifacts). LEAVE the compose stack (postgres + GPU ollama) up.
+```
+
+### Integration Points
+
+```yaml
+GITHUB:
+  - issue #420: evidence comment + close
+  - issue #406: 7 success-criteria checkbox ticks + close-out comment + close
+  - PR: docs branch (DOMAIN_MODEL one-line fix + this PRP) into dev
+  - (stop-and-ask, NOT in this gate): dev→main Release PR → release-please → v0.2.23
+
+RUNTIME (consumers only — no code integration):
+  - compose Postgres :5433 + GPU ollama :11434 (gpu overlay, warmed embedder)
+  - local uvicorn :8123 (repo root), Vite :5173
+  - test-owned uvicorn :8124 (Task 6 only)
+```
+
+## Validation Loop
+
+### Level 1 — environment sanity (before the live stack)
+
+```bash
+git status --short && git rev-parse --abbrev-ref HEAD      # dev, clean
+lsof -iTCP:8123 -sTCP:LISTEN                                # must be empty
+gh run list --workflow ci.yml --branch dev --limit 1 --json conclusion  # success
+docker compose ps                                           # postgres healthy
+docker exec forecastlab-ollama nvidia-smi | head -3         # GPU overlay (after Task 2)
+curl -s http://localhost:8123/health                        # {"status":"ok"} after Task 2
+```
+
+### Level 2 — targeted committed proofs
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+uv run pytest -v -m "not integration"                       # full unit gate (~2112 tests)
+uv run pytest "app/features/demo/tests/test_routes.py::test_demo_stream_websocket_legacy_frame_ignores_unknown_keys" -v
+cd frontend && ./node_modules/.bin/eslint src && ./node_modules/.bin/vitest run src/components/demo/ && cd ..
+# Task 6 ONLY (resets the shared DB):
+uv run pytest "tests/test_e2e_demo.py::test_demo_replay_same_config_twice" -v -m integration
+```
+
+### Level 3 — live system (the dogfood matrix + curl probes)
+
+```bash
+# Browser: http://localhost:5173/showcase per Task 3.
+curl -s 'http://localhost:8123/demo/workspaces?limit=100' | python3 -m json.tool | head -60
+curl -s "http://localhost:8123/demo/workspaces/<id>" | python3 -m json.tool   # slots + run_config
+curl -s 'http://localhost:8123/demo/approval-events?limit=20' | python3 -m json.tool
+curl -s "http://localhost:8123/demo/workspaces/<id>/health" | python3 -m json.tool
+curl -s -X POST "http://localhost:8123/demo/workspaces/<id>/export" | python3 -m json.tool
+( cd artifacts/showcase/<id> && sha256sum -c checksums.sha256 )
+curl -s 'http://localhost:8123/dimensions/stores?limit=5' | python3 -m json.tool   # focus-pair ids
+```
+
+### Level 4 — repo gates (docs branch)
+
+```bash
+uv run ruff check . && uv run ruff format --check .
+uv run mypy app/ && uv run pyright app/
+uv run pytest -v -m "not integration"
+git diff --stat docs/_base/DOMAIN_MODEL.md                  # ~1-3 lines (CRLF guard)
+```
+
+## Final Validation Checklist
+
+- [ ] dev CI cited green on the current HEAD (five gates + showcase_rich e2e +
+      replay run there); local five gates + legacy-frame test + demo vitest green
+- [ ] Fresh DB via DROP/CREATE (NOT down -v); `alembic upgrade b7c1d9e3f204`
+      clean (committed head; multi-head note in Gotchas) +
+      `downgrade -1 && upgrade b7c1d9e3f204` round-trip clean; GPU ollama up + warmed
+- [ ] Legacy-frame run green; zero workspace rows created by it
+- [ ] E1: rename/notes/tags/pin/archive reflected in GET detail + panel
+- [ ] E2: destructive-replay confirm dialog gates the run; replay row has
+      replayed_from_workspace_id + lineage chain; search/filter/sort/multi-delete/
+      compare/link-health all exercised
+- [ ] E3: keep-run with ≥2 seed knobs + focus pair → slots persist; replay carries
+      identical slots
+- [ ] E4: custom run-config → run_config recorded + 'custom: …' badge; default
+      run → run_config null
+- [ ] E5: approval captured on a showcase_rich keep-run (Reject live OR auto +
+      CI/unit backing); Run-story panel + /ops Approval History render it; a
+      Reject keeps the run green with no scenario_plan
+- [ ] E6: Export → sha256sum -c passes
+- [ ] 8-preset matrix conformant (sparse + holiday_rush per RUNBOOKS 28)
+- [ ] DOMAIN_MODEL.md:79 corrected; git diff --stat shows only intended lines
+- [ ] Five gates green on the docs branch + demo-component vitest green
+- [ ] Replay regression test green (Task 6)
+- [ ] Release path recorded (target v0.2.23; dev→main is stop-and-ask)
+- [ ] Evidence on #420; #406's 7 boxes ticked; #406 closed; #420 closed; docs
+      PR open into dev
+- [ ] Background servers stopped; /config/ai override reverted; compose stack +
+      seeded DB left in place
+
+---
+
+## Anti-Patterns to Avoid
+
+- ❌ Don't `docker compose down -v` — it destroys the Ollama models volume; use
+     the DROP/CREATE DATABASE procedure (memory: fresh-stack-gate-procedure)
+- ❌ Don't fix forward inside the gate — a non-conformant outcome files a NEW
+     issue and STOPS the close-out (the docs fix + PR still land)
+- ❌ Don't treat a documented expected-skip (agent_hitl_flow ⏭️, rag legs ⏭️,
+     sparse fail) as a deviation — but don't hand-wave an undocumented ❌
+- ❌ Don't run `test_demo_replay_same_config_twice` (or the full integration
+     suite) mid-dogfood — both mutate the shared DB
+- ❌ Don't skip Reset on holiday_rush / showcase_rich — the union-window trap
+- ❌ Don't gate on a clean `tsc` — it's vacuous/pre-failing; use lint + vitest
+     (memory: frontend-tsc-noemit-gate-vacuous)
+- ❌ Don't `gh pr merge` dev→main here — the release cut is a separate
+     stop-and-ask (release-please owns tagging)
+- ❌ Don't retype #406's body — fetch, tick the 7 boxes, push back byte-preserved
+- ❌ Don't edit API_CONTRACTS.md / RUNBOOKS.md — audited complete; only
+     DOMAIN_MODEL.md:79 is stale
+- ❌ Don't assume focus-pair id=1 — the seeder doesn't reset sequences; discover
+     live ids first (memory: seeder-does-not-reset-id-sequences)
+
+## Confidence Score: 9/10
+
+> Updated 8.5 → 9 after the prp-quality-agent pass (2026-06-13): the one
+> high-severity gap it found — the local alembic two-head state that makes a
+> bare `alembic upgrade head` error, plus a downgrade-target mislabel (the
+> head-adjacent committed showcase migration is E4 `b7c1d9e3f204` run_config,
+> not E1) — is now folded into Task 2 + Known Gotchas (target the committed head
+> explicitly). All other load-bearing claims (CI run id/SHA, #406 criteria
+> mapping, the DOMAIN_MODEL:79 stale line, every test marker, the
+> expected-skip carve-outs) verified accurate against the live repo.
+
+One-pass success likelihood is high: this is the second showcase release gate
+(the #401 PRP is a proven, near-identical template), the docs audit already
+reduced the doc work to one verified-stale line, the backend pipeline is
+CI-covered (so the live dogfood is additive evidence, not the sole proof), every
+dogfood action is pinned to a file:line UI control + a curl assertion + a backing
+committed test, and all the hard-won environment corrections (Ollama volume, GPU
+overlay, embedder warm-up, stale-uvicorn, seeder id sequences, snap chromium,
+integration-suite pollution, CRLF) are folded in from memory. Residual risk
+(−1): the E5 live-Reject depends on swapping to a cloud agent model (the 2B
+ollama agent reliably skips the tool), the showcase_rich rag legs may ⏭️ on the
+committed 1536-dim schema vs a local qwen3 setup, and browser automation on snap
+chromium remains the most fragile dependency — none blocks the gate (each has a
+documented-acceptable fallback that still closes the criterion via CI + unit
+coverage).
+```