diff --git a/.opencode/agents/product-owner.md b/.opencode/agents/product-owner.md
index b9d3460..a7974ff 100644
--- a/.opencode/agents/product-owner.md
+++ b/.opencode/agents/product-owner.md
@@ -19,18 +19,18 @@ You interview the human stakeholder to discover what to build, write Gherkin spe
 
 ## Session Start
 
-Load `skill run-session` first — it reads TODO.md, orients you to the current step and feature, and tells you what to do next.
+Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
 
 ## Step Routing
 
 | Step | Action |
 |---|---|
-| **Step 1 — SCOPE** | Load `skill define-scope` — contains Stage 1 (Discovery sessions) and Stage 2 (Stories + Criteria). At the end of Stage 2 Step B (criteria), write the `## Self-Declaration` block into `TODO.md` before committing — every DISAGREE is a hard blocker. |
+| **Step 1 — SCOPE** | Load `skill define-scope` — contains Stage 1 (Discovery sessions) and Stage 2 (Stories + Criteria). At the end of Stage 2 Step B (criteria), write the `## Self-Declaration` block into `FLOW.md` before committing — every DISAGREE is a hard blocker. |
 | **Step 5 — ACCEPT** | See acceptance protocol below |
 
 ## Ownership Rules
 
-- You are the **sole owner** of `.feature` files, `docs/discovery_journal.md`, and `docs/discovery.md`
+- You are the **sole owner** of `.feature` files, `docs/scope_journal.md`, `docs/discovery.md`, and `docs/glossary.md`
 - No other agent may edit these files
 - **You are the sole owner of all `.feature` file moves**: backlog → in-progress (before Step 2) and in-progress → completed (after Step 5 acceptance). No other agent moves `.feature` files.
 - Software-engineer escalates spec gaps to you; you decide whether to extend criteria
@@ -38,16 +38,16 @@ Load `skill run-session` first — it reads TODO.md, orients you to the current
 
 ## Step 5 — Accept
 
-After the reviewer approves (Step 4):
+After the system-architect approves (Step 4):
 
 1. Run or observe the feature yourself. If user interaction is involved, interact with it. A feature that passes all tests but doesn't work for a real user is rejected.
 2. Review the working feature against the original user stories (`Rule:` blocks in the `.feature` file).
-3. **If accepted**: move `docs/features/in-progress/<name>.feature` → `docs/features/completed/<name>.feature`; update TODO.md; notify stakeholder. The stakeholder decides when to trigger PR and release — the software-engineer creates PR/tag only when stakeholder requests.
-4. **If rejected**: write specific feedback in TODO.md, send back to the relevant step.
+3. **If accepted**: move `docs/features/in-progress/<name>.feature` → `docs/features/completed/<name>.feature`; update FLOW.md; notify stakeholder. The stakeholder decides when to trigger PR and release. The system-architect creates the PR; the stakeholder (or their delegate) creates the release when requested.
+4. **If rejected**: write specific feedback in FLOW.md, send back to the relevant step.
 
 ## Handling Gaps
 
-When a gap is reported (by software-engineer or reviewer):
+When a gap is reported (by software-engineer or system-architect):
 
 | Situation | Action |
 |---|---|
@@ -61,11 +61,11 @@ When a gap is reported (by software-engineer or reviewer):
 When a defect is reported against any feature:
 
 1. Add a `@bug` Example to the relevant `Rule:` block in the `.feature` file using the standard `Given/When/Then` format describing the correct behavior.
-2. Update TODO.md to note the new bug Example for the SE to implement.
+2. Update FLOW.md to note the new bug Example for the SE to implement.
 3. SE implements the test in `tests/features/` **and** a `@given` Hypothesis property test in `tests/unit/`. Both are required.
 
 ## Available Skills
 
 - `run-session` — session start/end protocol
-- `select-feature` — when TODO.md is idle: score and select next backlog feature using WSJF
+- `select-feature` — when FLOW.md Status is [IDLE]: score and select next backlog feature using WSJF
 - `define-scope` — Step 1: Stage 1 (Discovery sessions with stakeholder) and Stage 2 (Stories + Criteria, PO alone)
diff --git a/.opencode/agents/reviewer.md b/.opencode/agents/reviewer.md
deleted file mode 100644
index e38cdec..0000000
--- a/.opencode/agents/reviewer.md
+++ /dev/null
@@ -1,57 +0,0 @@
----
-description: Reviewer responsible for Step 4 verification — runs all commands and checks code quality
-mode: subagent
-temperature: 0.3
-tools:
-  write: false
-  edit: false
-  bash: true
-  read: true
-  grep: true
-  glob: true
-  task: true
-  skill: true
-permissions:
-  bash:
-    - command: "task *"
-      allow: true
-    - command: "git diff *"
-      allow: true
-    - command: "git log *"
-      allow: true
-    - command: "git status"
-      allow: true
-    - command: "*"
-      allow: ask
----
-
-# Reviewer
-
-You verify that work is done correctly by running commands and reading code. You do not write or edit files.
-
-## Session Start
-
-Load `skill run-session` first. Then load `skill verify` for Step 4 verification.
-
-## Zero-Tolerance Rules
-
-- **Never approve without running commands**.
-- **Never skip a check.** If a command fails, report it.
-- **Never suggest `noqa`, `type: ignore`, or `pytest.skip` as a fix.** These are bypasses, not solutions.
-- **Report specific locations.** "`physics/engine.py:47`: unreachable return" not "there is dead code."
-- **Every PASS/FAIL cell must have evidence.** Empty evidence = UNCHECKED = REJECTED.
-- **Never move `.feature` files.** The PO is the sole owner of all feature file moves. After producing an APPROVED report, update TODO.md and stop — the PO accepts and moves the file.
-
-## Gap Reporting
-
-If you discover an observable behavior with no acceptance criterion:
-
-| Situation | Action |
-|---|---|
-| Edge case within current user stories | Report to PO with suggested Example text. PO decides. |
-| New behavior beyond current stories | Note in report as future backlog item. Do not add criteria. |
-| Behavior contradicts an existing Example | REJECTED — report contradiction to software-engineer and PO. |
-
-You never edit `.feature` files or add Examples yourself.
-
-
diff --git a/.opencode/agents/software-engineer.md b/.opencode/agents/software-engineer.md
index e6615dc..b271ccc 100644
--- a/.opencode/agents/software-engineer.md
+++ b/.opencode/agents/software-engineer.md
@@ -1,5 +1,5 @@
 ---
-description: Software Engineer responsible for Steps 2-3 — architecture, TDD loop, git, and releases
+description: Software Engineer responsible for Step 3 — TDD loop, implementation, and releases
 mode: subagent
 temperature: 0.3
 tools:
@@ -27,45 +27,46 @@ permissions:
 
 # Software Engineer
 
-You build everything: architecture, tests, code, and releases. You own technical decisions entirely. The product owner defines what to build; you decide how.
+You implement everything the system-architect designed. You own the code: tests, implementation, and releases. The system-architect decides the structure; you make it work.
 
 ## Session Start
 
-Load `skill run-session` first — it reads TODO.md, orients you to the current step and feature, and tells you what to do next.
+Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
 
 ## Step Routing
 
 | Step | Action |
 |---|---|
-| **Step 2 — ARCH** | Load `skill implement` — contains Step 2 architecture protocol |
+| **Step 2 — BRANCH** | Load `skill version-control` — create `feat/<stem>` from latest `main` before SA begins architecture |
 | **Step 3 — TDD LOOP** | Load `skill implement` — contains Step 3 TDD Loop; load `skill refactor` when entering REFACTOR phase or doing preparatory refactoring |
-| **Step 5 — after PO accepts** | Load `skill create-pr` and `skill git-release` as needed |
+| **Step 5 — after PO accepts** | Load `skill version-control` — merge feature branch to `main` with `--no-ff`; stop. The stakeholder decides when to trigger release. |
 
 ## Ownership Rules
 
-- You own all technical decisions: module structure, patterns, internal APIs, test tooling, linting config
+- You own all implementation code: test bodies, production logic, fixtures, tooling config
+- You own git commits and releases
+- **System-architect approves**: any change to stubs, Protocols, or ADR decisions
 - **PO approves**: new runtime dependencies, changed entry points, scope changes
-- **You never move `.feature` files.** The PO is the sole owner of all feature file moves (backlog → in-progress → completed). If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Write the gap in TODO.md and escalate to PO.
+- **You never move `.feature` files.** The PO is the sole owner of all feature file moves (backlog → in-progress → completed). If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Write the gap in FLOW.md and escalate to PO.
 
 ## No In-Progress Feature
 
 If `docs/features/in-progress/` contains only `.gitkeep` (no `.feature` file):
 1. Do not pick a feature from backlog yourself.
-2. Update TODO.md: `Next: Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.`
-3. Stop. The PO must move the chosen feature into `in-progress/` before you can begin Step 2.
+2. Update FLOW.md: `Next: Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.`
+3. Stop. The PO must move the chosen feature into `in-progress/` before you can begin Step 3.
 
 ## Spec Gaps
 
 If during implementation you discover behavior not covered by existing acceptance criteria:
 - Do not extend criteria yourself — escalate to the PO
-- Note the gap in TODO.md under `## Next`
+- Note the gap in FLOW.md under `## Next`
 
 ## Available Skills
 
 - `run-session` — session start/end protocol
-- `implement` — Steps 2-3: architecture + TDD loop
+- `version-control` — Git branching, commit hygiene, merging to main
+- `implement` — Step 3: TDD loop
 - `refactor` — REFACTOR phase and preparatory refactoring (load on-demand)
-- `apply-patterns` — on-demand when smell detected during architecture or refactor
-- `create-pr` — Step 5: PRs with conventional commits
-- `git-release` — Step 5: calver versioning and themed release naming
+- `apply-patterns` — on-demand when smell detected during refactor
 - `create-skill` — meta: create new skills when needed
diff --git a/.opencode/agents/system-architect.md b/.opencode/agents/system-architect.md
new file mode 100644
index 0000000..bc78c9a
--- /dev/null
+++ b/.opencode/agents/system-architect.md
@@ -0,0 +1,79 @@
+---
+description: System Architect responsible for Step 2 (architecture design) and Step 4 (technical verification) — designs the system, hands off to SE, reviews the build
+mode: subagent
+temperature: 0.3
+tools:
+  write: true
+  edit: true
+  bash: true
+  read: true
+  grep: true
+  glob: true
+  task: true
+  skill: true
+permissions:
+  bash:
+    - command: "git *"
+      allow: true
+    - command: "gh *"
+      allow: true
+    - command: "task *"
+      allow: true
+    - command: "uv *"
+      allow: true
+    - command: "*"
+      allow: ask
+---
+
+# System Architect
+
+You design the system's structure and verify that the implementation respects that design. You bridge the gap between the PO's requirements and the SE's code. The same mind that designs the architecture reviews it — no context loss.
+
+## Session Start
+
+Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
+
+## Step Routing
+
+| Step | Action |
+|---|---|
+| **Step 2 — ARCH** | Load `skill architect` — verify on `feat/<stem>` branch, design domain model, write stubs, create ADRs, generate test stubs |
+| **Step 4 — VERIFY** | Load `skill verify` — adversarial technical review of the SE's implementation |
+| **Step 5 — after PO accepts** | Load `skill create-pr` — create and merge the feature pull request |
+
+## Ownership Rules
+
+- You own all architectural decisions: module structure, domain model, interfaces, Protocols, patterns
+- You own `docs/domain-model.md`, `docs/system.md`, and `docs/adr/ADR-*.md` — create and update these at Step 2
+- You review implementation at Step 4 to ensure architectural decisions were respected
+- **PO approves**: new runtime dependencies, changed entry points, scope changes
+- **You never move `.feature` files.** The PO is the sole owner of all feature file moves. If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Write the gap in FLOW.md and escalate to PO.
+
+## Step 2 → Step 3 Handoff
+
+After architecture is complete and test stubs are generated:
+1. Commit all changes on `feat/<stem>`
+2. Update FLOW.md: `Next: Run @software-engineer — Step 3 TDD Loop`
+3. Stop. The SE takes over for implementation.
+
+## Step 4 Review Stance
+
+Your default hypothesis is that the code is broken despite passing automated checks. You designed the architecture; you know what should have been preserved. Verify that:
+- Stubs were not violated (signatures, boundaries, Protocols)
+- ADR decisions were respected
+- No architectural smells were introduced
+
+## Spec Gaps
+
+If during Step 2 or Step 4 you discover behavior not covered by existing acceptance criteria:
+- Do not extend criteria yourself — escalate to the PO
+- Note the gap in FLOW.md under `## Next`
+
+## Available Skills
+
+- `run-session` — session start/end protocol
+- `architect` — Step 2: architecture and domain design
+- `verify` — Step 4: adversarial technical review
+- `create-pr` — Step 5: create and merge PR after PO acceptance
+- `apply-patterns` — on-demand when smell detected during architecture or review
+- `create-skill` — meta: create new skills when needed
diff --git a/.opencode/skills/apply-patterns/SKILL.md b/.opencode/skills/apply-patterns/SKILL.md
index def03de..6645ff8 100644
--- a/.opencode/skills/apply-patterns/SKILL.md
+++ b/.opencode/skills/apply-patterns/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: apply-patterns
-description: GoF design pattern catalogue — smell triggers and Python before/after examples
-version: "2.1"
+description: GoF design pattern catalogue — smell triggers and before/after structural descriptions
+version: "3.0"
 author: software-engineer
 audience: software-engineer
 workflow: feature-lifecycle
@@ -9,9 +9,9 @@ workflow: feature-lifecycle
 
 # Design Patterns Reference
 
-Load this skill when the refactor skill's smell table points to a GoF pattern and you need the Python before/after example.
+Load this skill when the refactor skill's smell table points to a GoF pattern and you need structural guidance on how to apply it.
 
-Sources: Gamma, Helm, Johnson, Vlissides. *Design Patterns: Elements of Reusable Object-Oriented Software*. Addison-Wesley, 1995. See `docs/scientific-research/oop-design.md` entry 34.
+Sources: Gamma, Helm, Johnson, Vlissides. *Design Patterns: Elements of Reusable Object-Oriented Software*. Addison-Wesley, 1995; Shvets, A. *Refactoring.Guru* (2014–present) https://refactoring.guru/design-patterns. See `docs/research/oop-design.md` entries 34 and 36.
 
 ---
 
@@ -67,70 +67,35 @@ Load this skill when the `refactor` skill's smell table points to a GoF pattern,
 
 ---
 
-## Smell-Triggered Patterns — Python Examples
+## Smell-Triggered Patterns
 
 ### Creational Smells
 
 ---
 
 #### Smell: Scattered Object Construction
-**Signal**: The same object is constructed in 3+ places with slightly different arguments, or construction logic is duplicated across callers.
+**Signal**: The same object is constructed in 3+ places with slightly different arguments, or construction logic is duplicated across callers. Changes to construction (e.g. adding a required field) require updating every call site.
 
 **Pattern**: Factory Method or Factory Function
 
-```python
-# BEFORE — scattered construction
-# in order_service.py
-order = Order(id=uuid4(), status="pending", created_at=datetime.now())
-
-# in test_order.py
-order = Order(id=UUID("abc..."), status="pending", created_at=datetime(2026, 1, 1))
-
-# in import_service.py
-order = Order(id=uuid4(), status="pending", created_at=datetime.now())
-```
-
-```python
-# AFTER — factory function owns construction
-def make_order(
-    *,
-    order_id: OrderId | None = None,
-    clock: Callable[[], datetime] = datetime.now,
-) -> Order:
-    return Order(
-        id=order_id or OrderId(uuid4()),
-        status=OrderStatus.PENDING,
-        created_at=clock(),
-    )
-```
+**Before**: Construction is repeated inline at every call site with raw arguments. Tests, services, and importers each hardcode the construction details.
+
+**After**: A dedicated factory function or factory method owns construction. All callers go through it. The factory can inject defaults, substitute a clock or ID generator, and be swapped in tests.
+
+**Key structural change**: Creation knowledge moves from N call sites to one place.
 
 ---
 
 #### Smell: Multi-Step Construction with Optional Parts
-**Signal**: An object requires several setup calls before it is valid. Callers must remember the correct sequence.
+**Signal**: An object requires several setup calls before it is valid. Callers must remember the correct sequence. Forgetting a step leaves the object in an invalid or partially initialized state.
 
 **Pattern**: Builder
 
-```python
-# BEFORE — callers must know the correct build sequence
-report = Report()
-report.set_title("Q4 Sales")
-report.add_section(summary)
-report.add_section(detail)
-report.set_footer("Confidential")
-# easy to forget a step or get the order wrong
-```
-
-```python
-# AFTER — builder enforces sequence and provides defaults
-report = (
-    ReportBuilder("Q4 Sales")
-    .with_section(summary)
-    .with_section(detail)
-    .with_footer("Confidential")
-    .build()
-)
-```
+**Before**: Object constructed with a series of setter calls. Order matters but is not enforced. Optional sections may be skipped by accident.
+
+**After**: A builder object accepts each optional part via named methods and produces the final object only when `build()` is called. The builder validates completeness and enforces sequence.
+
+**Key structural change**: Invalid intermediate states are impossible; callers read as a named sequence of intent.
 
 ---
 
@@ -138,116 +103,44 @@ report = (
 
 ---
 
-#### Smell: Type-Switching (if/elif on type or status)
-**Signal**: A function or method contains `if isinstance(x, A): ... elif isinstance(x, B): ...` or `if x.type == "a": ... elif x.type == "b": ...`. Adding a new type requires editing this function.
-
-**Pattern**: Strategy (behavior varies) or Visitor (operation varies over a fixed structure)
-
-```python
-# BEFORE — type switch must be updated for every new discount type
-def apply_discount(order: Order, discount_type: str) -> Money:
-    if discount_type == "percentage":
-        return order.total * (1 - order.rate)
-    elif discount_type == "fixed":
-        return order.total - order.amount
-    elif discount_type == "bogo":
-        return order.total - (order.total / 2)
-    else:
-        raise ValueError(discount_type)
-```
-
-```python
-# AFTER — Strategy: each discount is a callable, closed to modification
-class DiscountStrategy(Protocol):
-    def apply(self, order: Order) -> Money: ...
-
-@dataclass
-class PercentageDiscount:
-    rate: Decimal
-    def apply(self, order: Order) -> Money:
-        return order.total * (1 - self.rate)
-
-@dataclass
-class FixedDiscount:
-    amount: Money
-    def apply(self, order: Order) -> Money:
-        return order.total - self.amount
-
-def apply_discount(order: Order, strategy: DiscountStrategy) -> Money:
-    return strategy.apply(order)
-```
+#### Smell: Type-Switching (branching on a type or status field)
+**Signal**: A function or method branches on a type flag, kind field, or status string. Adding a new variant requires editing this function — it is open to modification but closed to extension.
+
+**Pattern**: Strategy (behavior varies per call) or Visitor (operation varies over a fixed structure)
+
+**Before**: A single function contains a multi-branch conditional on the variant. Every new variant requires modifying the function and all its tests.
+
+**After (Strategy)**: Each variant is encapsulated in its own class implementing a shared interface. The caller receives the strategy as a dependency. Adding a new variant means adding a new class — the caller and existing variants are untouched.
+
+**After (Visitor)**: When the object structure is stable but operations vary, a visitor separates each operation into its own class. Each element accepts a visitor and dispatches to the right method.
+
+**Key structural change**: Open/Closed principle restored — new variants extend without modifying existing code.
 
 ---
 
 #### Smell: Feature Envy
-**Signal**: A method in class A uses data from class B more than its own data. The method "envies" class B.
-
-**Pattern**: Move Method to the envied class (Fowler refactoring that often precedes Strategy or Command)
-
-```python
-# BEFORE — OrderPrinter knows too much about Order internals
-class OrderPrinter:
-    def format_total(self, order: Order) -> str:
-        subtotal = sum(item.price * item.quantity for item in order.items)
-        tax = subtotal * order.tax_rate
-        return f"{subtotal + tax:.2f}"
-```
-
-```python
-# AFTER — total belongs on Order
-@dataclass
-class Order:
-    items: list[LineItem]
-    tax_rate: Decimal
-
-    def total(self) -> Money:
-        subtotal = sum(item.subtotal() for item in self.items)
-        return subtotal * (1 + self.tax_rate)
-
-class OrderPrinter:
-    def format_total(self, order: Order) -> str:
-        return f"{order.total():.2f}"
-```
+**Signal**: A method in class A uses data or methods from class B more than its own. The method "envies" class B and is likely in the wrong place.
+
+**Pattern**: Move Function (Fowler) — often a precursor to Strategy or Command
+
+**Before**: A method on one class navigates into another class's fields to perform a computation. The computation is separated from the data it operates on.
+
+**After**: The computation moves to the class whose data it uses. The original class delegates to it. The envied class gains behavior; the original class becomes a coordinator.
+
+**Key structural change**: Behavior lives next to the data it depends on.
 
 ---
 
 #### Smell: Parallel Inheritance Hierarchies
-**Signal**: Every time you add a subclass to hierarchy A, you must also add a corresponding subclass to hierarchy B. The two trees grow in lockstep.
+**Signal**: Every time a subclass is added to hierarchy A, a corresponding subclass must also be added to hierarchy B. The two trees grow in lockstep — a sign that the two axes of variation are entangled.
 
 **Pattern**: Bridge
 
-```python
-# BEFORE — adding a new Shape requires a new renderer subclass too
-class Shape: ...
-class Circle(Shape): ...
-class Square(Shape): ...
-
-class SVGCircle(Circle): ...
-class SVGSquare(Square): ...
-class PNGCircle(Circle): ...
-class PNGSquare(Square): ...
-```
-
-```python
-# AFTER — Bridge separates shape from renderer
-class Renderer(Protocol):
-    def render_circle(self, radius: float) -> None: ...
-    def render_square(self, side: float) -> None: ...
-
-@dataclass
-class Circle:
-    radius: float
-    renderer: Renderer
-    def draw(self) -> None:
-        self.renderer.render_circle(self.radius)
-
-@dataclass
-class Square:
-    side: float
-    renderer: Renderer
-    def draw(self) -> None:
-        self.renderer.render_square(self.side)
-```
+**Before**: Two hierarchies are coupled. A `Shape` hierarchy and a `Renderer` hierarchy grow together. Each shape–renderer combination requires its own subclass.
+
+**After**: The Bridge pattern separates the two hierarchies. The abstraction (shape) holds a reference to the implementation (renderer) as a dependency. Each axis can vary independently. Combinatorial subclass explosion is eliminated.
+
+**Key structural change**: Two axes of variation become two independent hierarchies composed at runtime.
 
 ---
 
@@ -256,131 +149,41 @@ class Square:
 ---
 
 #### Smell: Large State Machine in One Class
-**Signal**: A class has a `status` or `state` field, and many methods begin with `if self.state == X: ... elif self.state == Y: ...`. Adding a new state requires editing all these methods.
+**Signal**: A class has a status or state field, and many methods begin by branching on that field. Adding a new state requires editing all of those methods. The class grows in proportion to the number of states.
 
 **Pattern**: State
 
-```python
-# BEFORE — Order methods all branch on status
-class Order:
-    def confirm(self) -> None:
-        if self.status == "pending":
-            self.status = "confirmed"
-        else:
-            raise InvalidTransition(self.status, "confirm")
-
-    def ship(self) -> None:
-        if self.status == "confirmed":
-            self.status = "shipped"
-        else:
-            raise InvalidTransition(self.status, "ship")
-```
-
-```python
-# AFTER — each state owns its own transitions
-class OrderState(Protocol):
-    def confirm(self, order: Order) -> None: ...
-    def ship(self, order: Order) -> None: ...
-
-class PendingState:
-    def confirm(self, order: Order) -> None:
-        order.state = ConfirmedState()
-    def ship(self, order: Order) -> None:
-        raise InvalidTransition("pending", "ship")
-
-class ConfirmedState:
-    def confirm(self, order: Order) -> None:
-        raise InvalidTransition("confirmed", "confirm")
-    def ship(self, order: Order) -> None:
-        order.state = ShippedState()
-
-@dataclass
-class Order:
-    state: OrderState = field(default_factory=PendingState)
-    def confirm(self) -> None: self.state.confirm(self)
-    def ship(self) -> None: self.state.ship(self)
-```
+**Before**: The class contains multi-branch conditionals in every method that involves state. Each state's transitions and guards are scattered across the class body.
+
+**After**: Each state is its own class implementing a shared interface. Each state object owns its transitions — it knows which transitions are valid and what the next state is. The context object (the original class) delegates to the current state. Adding a new state means adding a new class.
+
+**Key structural change**: State-specific behavior is co-located in the state class; the context becomes a thin delegator.
 
 ---
 
 #### Smell: Scattered Notification / Event Fan-Out
-**Signal**: When something happens in class A, it directly calls methods on classes B, C, and D. Adding a new listener requires modifying class A.
+**Signal**: When something happens in class A, it directly calls methods on classes B, C, and D. Adding a new listener requires modifying class A. Class A knows about all downstream consumers.
 
 **Pattern**: Observer
 
-```python
-# BEFORE — Order directly notifies every downstream system
-class Order:
-    def confirm(self) -> None:
-        self.status = "confirmed"
-        EmailService().send_confirmation(self)      # direct coupling
-        InventoryService().reserve(self)             # direct coupling
-        AnalyticsService().record_conversion(self)   # direct coupling
-```
-
-```python
-# AFTER — Order emits an event; listeners register independently
-class OrderConfirmedListener(Protocol):
-    def on_order_confirmed(self, order: Order) -> None: ...
-
-@dataclass
-class Order:
-    _listeners: list[OrderConfirmedListener] = field(default_factory=list)
-
-    def add_listener(self, listener: OrderConfirmedListener) -> None:
-        self._listeners.append(listener)
-
-    def confirm(self) -> None:
-        self.status = OrderStatus.CONFIRMED
-        for listener in self._listeners:
-            listener.on_order_confirmed(self)
-```
+**Before**: The event source directly invokes each downstream system. The source and all consumers are tightly coupled. Adding a consumer modifies the source.
+
+**After**: The source defines a listener interface and maintains a list of registered listeners. Each listener registers itself. When the event occurs, the source notifies all listeners without knowing their concrete types. New listeners are added without touching the source.
+
+**Key structural change**: Coupling direction reversed — listeners depend on the source, not the other way around.
 
 ---
 
 #### Smell: Repeated Algorithm Skeleton
-**Signal**: Two or more functions share the same high-level structure (setup → process → teardown) but differ only in one or two steps. The structure is copied rather than shared.
+**Signal**: Two or more functions share the same high-level structure (setup → process → teardown, or read → parse → validate → save) but differ only in one or two steps. The structure is copied rather than shared.
 
 **Pattern**: Template Method
 
-```python
-# BEFORE — CSV and JSON importers duplicate the pipeline structure
-def import_csv(path: Path) -> list[Record]:
-    raw = path.read_text()
-    rows = parse_csv(raw)        # varies
-    records = [validate(r) for r in rows]
-    save_all(records)
-    return records
-
-def import_json(path: Path) -> list[Record]:
-    raw = path.read_text()
-    rows = parse_json(raw)       # varies
-    records = [validate(r) for r in rows]
-    save_all(records)
-    return records
-```
-
-```python
-# AFTER — Template Method: skeleton in base, varying step overridden
-class Importer(ABC):
-    def run(self, path: Path) -> list[Record]:
-        raw = path.read_text()
-        rows = self.parse(raw)          # hook
-        records = [validate(r) for r in rows]
-        save_all(records)
-        return records
-
-    @abstractmethod
-    def parse(self, raw: str) -> list[dict]: ...
-
-class CsvImporter(Importer):
-    def parse(self, raw: str) -> list[dict]:
-        return parse_csv(raw)
-
-class JsonImporter(Importer):
-    def parse(self, raw: str) -> list[dict]:
-        return parse_json(raw)
-```
+**Before**: Two functions duplicate the pipeline structure. When the shared steps change (e.g. validation logic), both must be updated in sync. The differing step is buried inside the duplication.
+
+**After**: A base class defines the algorithm skeleton as a method that calls abstract hook methods for the varying steps. Each subclass implements only the hook(s) that differ. The shared steps exist in one place.
+
+**Key structural change**: Invariant structure lives in one place; variants are isolated in named hooks.
 
 ---
 
@@ -390,12 +193,12 @@ class JsonImporter(Importer):
 |---|---|
 | Same object constructed in 3+ places | Factory Method / Factory Function |
 | Multi-step setup before object is valid | Builder |
-| `if type == X: ... elif type == Y:` | Strategy |
-| Method uses another class's data more than its own | Move Method (Fowler) |
+| Branching on a type, kind, or status field | Strategy / Visitor |
+| Method uses another class's data more than its own | Move Function (Fowler) |
 | Two class hierarchies that grow in lockstep | Bridge |
-| `if self.state == X:` in multiple methods | State |
-| Class directly calls B, C, D on state change | Observer |
-| Two functions share the same skeleton, differ in one step | Template Method |
+| Many methods branch on the same state field | State |
+| Object directly calls multiple downstream systems on change | Observer |
+| Two functions share the same algorithm skeleton, differ in one step | Template Method |
 | Subsystem is complex and callers need a simple entry point | Facade |
 
 ---
diff --git a/.opencode/skills/architect/SKILL.md b/.opencode/skills/architect/SKILL.md
new file mode 100644
index 0000000..b4a2ffd
--- /dev/null
+++ b/.opencode/skills/architect/SKILL.md
@@ -0,0 +1,192 @@
+---
+name: architect
+description: Step 2 — Architecture and domain design, one feature at a time
+version: "1.0"
+author: system-architect
+audience: system-architect
+workflow: feature-lifecycle
+---
+
+# Architect
+
+Step 2: design the domain model, write architecture stubs, record decisions, and generate test stubs. The system-architect owns this step entirely.
+
+## When to Use
+
+Load this skill when starting Step 2 (Architecture) after the PO has moved a BASELINED feature to `in-progress/`.
+
+## System-Architect Quality Gate Priority Order
+
+During architecture, correctness priorities are (in order):
+
+1. **Design correctness** — YAGNI > KISS > DRY > SOLID > Object Calisthenics > appropriate design patterns > complex code > complicated code > failing code > no code
+2. **One test green** — `uv run task test-fast` passes after stub generation
+3. **Commit** — when stubs and ADRs are complete
+
+Design correctness is far more important than lint/pyright/coverage compliance. Never run lint or static-check during architecture — those are handoff-only checks.
+
+---
+
+## Step 2 — Architecture
+
+### Prerequisites (stop if any fail — escalate to PO)
+
+1. `docs/features/in-progress/` contains exactly one `.feature` file (not just `.gitkeep`). If none exists, **STOP** — update FLOW.md `Next:` to `Run @product-owner — move the chosen feature to in-progress/` and stop. Never self-select or move a feature yourself.
+2. The feature file's discovery section has `Status: BASELINED`. If not, escalate to PO — Step 1 is incomplete.
+3. The feature file contains `Rule:` blocks with `Example:` blocks and `@id` tags. If not, escalate to PO — criteria have not been written.
+4. Package name confirmed: read `pyproject.toml` → locate `[tool.setuptools]` → confirm directory exists on disk.
+5. **Branch verification**: `git branch --show-current` must output `feat/<stem>` or `fix/<stem>`. If it outputs `main` or any other branch, stop — the SE must create the correct branch via `skill version-control` before architecture begins.
+
+### Package Verification (mandatory — before writing any code)
+
+1. Read `pyproject.toml` → locate `[tool.setuptools]` → record `packages = ["<name>"]`
+2. Confirm directory exists: `ls <name>/`
+3. All new source files go under `<name>/`
+
+**Note on feature file moves**: The PO moves `.feature` files between folders. The system-architect never moves, creates, or edits `.feature` files. Update FLOW.md `Feature:` and `Source:` to reflect `in-progress/` once the PO has moved the file.
+
+### Read Phase (targeted reads only — before writing anything)
+
+1. Read `docs/system.md` — understand current system structure and constraints
+2. Read `docs/glossary.md` if it exists — use existing domain terms when naming classes, methods, and modules; do not invent synonyms
+3. Read in-progress `.feature` file (full: Rules + Examples + @id)
+4. Run `tree <package>/` — understand package structure without reading every file
+5. Read **specific `.py` files** whose names match nouns from the feature — understand what already exists before adding anything. Do not read the entire package.
+
+### Domain Analysis
+
+From `docs/glossary.md` + Rules (Business) in the `.feature` file:
+- **Nouns** → candidate classes, value objects, aggregates
+- **Verbs** → method names with typed signatures
+- **Datasets** → named types (not bare dict/list)
+- **Bounded Context check**: same word, different meaning across features? → module boundary
+- **Cross-feature entities** → candidate shared domain layer
+
+### Create / Update Domain Model
+
+**If `docs/domain-model.md` does not exist**: create it from the domain analysis using the template in `domain-model.md.template` in the `implement` skill's directory.
+
+**If `docs/domain-model.md` exists**: append new entities, verbs, and relationships discovered in this feature. Deprecate old entries if they are superseded. Never edit existing live entries — code depends on them.
+
+This file is system-architect-owned. The PO reads it but never writes to it.
+
+### Silent Pre-mortem (before writing anything)
+
+> "In 6 months this design is a mess. What mistakes did we make?"
+
+For each candidate class:
+- >2 ivars? → split
+- >1 reason to change? → isolate
+
+For each external dep:
+- Is it behind a Protocol? → if not, add
+
+For each noun:
+- Serving double duty across modules? → isolate
+
+If pattern smell detected, load `skill apply-patterns`.
+
+### Write Stubs into Package
+
+From the domain analysis, write or extend `.py` files in `<package>/`. For each entity:
+
+- **If the file already exists**: add the new class or method signature — do not remove or alter existing code.
+- **If the file does not exist**: create it with the new signatures only.
+
+**Stub rules (strictly enforced):**
+- Method bodies must be `...` — no logic, no conditionals, no imports beyond `typing` and domain types
+- No docstrings — signatures will change; add docstrings after GREEN (lint enforces this at quality gate)
+- No inline comments, no TODO comments, no speculative code
+
+**Example — correct stub style:**
+
+```python
+from dataclasses import dataclass
+from typing import Protocol
+
+
+@dataclass(frozen=True, slots=True)
+class EmailAddress:
+    value: str
+
+    def validate(self) -> None: ...
+
+
+class UserRepository(Protocol):
+    def save(self, user: "User") -> None: ...
+    def find_by_email(self, email: EmailAddress) -> "User | None": ...
+```
+
+**File placement (common patterns, not required names):**
+- `<package>/domain/<noun>.py` — entities, value objects
+- `<package>/domain/service.py` — cross-entity operations
+
+Place stubs where responsibility dictates — do not pre-create `ports/` or `adapters/` folders unless a concrete external dependency was identified in scope. Structure follows domain analysis, not a template.
+
+### Record Architectural Decisions
+
+For each significant decision, create a new file:
+
+```text
+docs/adr/ADR-YYYY-MM-DD-<slug>.md
+```
+
+Use the template in `adr.md.template` in the `implement` skill's directory. Fill in Decision, Reason, Alternatives Considered, and Consequences.
+
+Only create an ADR for non-obvious decisions with meaningful trade-offs. Routine YAGNI choices do not need a record.
+
+Reference relevant ADRs from `docs/system.md` so other agents know which decisions affect the current system state.
+
+### Architecture Smell Check (hard gate)
+
+Apply to the stub files just written:
+
+- [ ] No class with >1 responsibility (SOLID-S)
+- [ ] No behavioural class with >2 instance variables (OC-8; dataclasses, Pydantic models, value objects, and TypedDicts are exempt)
+- [ ] All external deps assigned a Protocol (SOLID-D + Hexagonal) — N/A if no external dependencies identified in scope
+- [ ] No noun with different meaning across modules (DDD Bounded Context)
+- [ ] No missing Creational pattern: repeated construction without Factory/Builder
+- [ ] No missing Behavioral pattern: type-switching without Strategy/Visitor
+- [ ] No missing Behavioral pattern: state machine or scattered notification without State/Observer
+- [ ] Each ADR consistent with each @id AC — no contradictions
+
+If any check fails: fix the stub files before committing.
+
+### Generate Test Stubs
+
+Run `uv run task test-fast` once. It reads the in-progress `.feature` file, assigns `@id` tags to any untagged `Example:` blocks (writing them back to the `.feature` file), and generates `tests/features/<feature_slug>/<rule_slug>_test.py` — one file per `Rule:` block, one skipped function per `@id`. Verify the files were created, then stage all changes (including any `@id` write-backs to the `.feature` file).
+
+Commit: `feat(<feature-stem>): add architecture and test stubs`
+
+### Hand off to Step 3 (TDD Loop)
+
+1. Update FLOW.md: `Next: Run @software-engineer — Step 3 TDD Loop`
+2. Provide the SE with:
+   - Feature file path
+   - Summary of stubs created
+   - Any ADRs that constrain implementation
+   - Any domain-model changes
+3. Stop. The SE takes over.
+
+---
+
+## Handling Spec Gaps
+
+If during architecture you discover behavior not covered by existing acceptance criteria:
+- **Do not extend criteria yourself** — escalate to PO
+- Note the gap in FLOW.md under `## Next`
+- The PO will decide whether to add a new Example to the `.feature` file
+
+---
+
+## Templates
+
+Templates for files written by this skill live in the `implement` skill's directory:
+
+- `domain-model.md.template` — `docs/domain-model.md` structure
+- `system.md.template` — `docs/system.md` structure
+- `adr.md.template` — individual ADR file structure
+
+Base directory for this skill: file:///home/user/Documents/projects/python-project-template/.opencode/skills/architect
+Relative paths in this skill (e.g., scripts/, reference/) are relative to this base directory.
+Note: file list is sampled.
diff --git a/.opencode/skills/check-quality/SKILL.md b/.opencode/skills/check-quality/SKILL.md
index 673f202..6de9f77 100644
--- a/.opencode/skills/check-quality/SKILL.md
+++ b/.opencode/skills/check-quality/SKILL.md
@@ -3,19 +3,19 @@ name: check-quality
 description: Enforce code quality using ruff, pytest coverage, and static type checking
 version: "2.1"
 author: software-engineer
-audience: software-engineer, reviewer
+audience: software-engineer, system-architect
 workflow: feature-lifecycle
 ---
 
 # Check Quality
 
-Quick reference for the software-engineer quality gate before handing off to the reviewer (Step 4).
+Quick reference for the software-engineer quality gate before handing off to the system-architect (Step 4).
 
-**For the full verification protocol used by the reviewer, load `skill verify`.**
+**For the full verification protocol used by the system-architect, load `skill verify`.**
 
 ## When to Use
 
-Load this skill when completing Step 3 and preparing to hand off to the reviewer. Run all four commands; all must pass before signalling handoff.
+Load this skill when completing Step 3 and preparing to hand off to the system-architect. Run all four commands; all must pass before signalling handoff.
 
 ## Step-by-Step
 
diff --git a/.opencode/skills/create-agent/SKILL.md b/.opencode/skills/create-agent/SKILL.md
index b1cc5e9..5e0da1f 100644
--- a/.opencode/skills/create-agent/SKILL.md
+++ b/.opencode/skills/create-agent/SKILL.md
@@ -49,7 +49,7 @@ Create `.opencode/agents/<agent-name>.md`:
 ---
 name: <agent-name>
 description: <1-sentence description of what this agent does>
-role: <product-owner | software-engineer | reviewer | setup-project | human-user>
+role: <product-owner | system-architect | software-engineer | setup-project | human-user>
 steps: <step numbers this agent owns, e.g., "2, 3">
 ---
 
@@ -138,7 +138,7 @@ Register the agent in the workflow section of `AGENTS.md`:
 ---
 name: <agent-name>
 description: <what this agent does, 1 sentence>
-role: <product-owner | software-engineer | reviewer | setup-project | human-user>
+role: <product-owner | system-architect | software-engineer | setup-project | human-user>
 steps: <owned steps, e.g., "2-3">
 ---
 
@@ -184,8 +184,8 @@ When to escalate to human: <conditions>
 | Agent | Role | Steps | Purpose |
 |---|---|---|---|
 | `product-owner` | product-owner | 1, 5 | Scope discovery, acceptance |
-| `software-engineer` | software-engineer | 2, 3, 5 | Architecture, TDD, releases |
-| `reviewer` | reviewer | 4 | Adversarial verification |
+| `system-architect` | system-architect | 2, 4 | Architecture, adversarial verification |
+| `software-engineer` | software-engineer | 3, 5 | TDD, releases |
 | `setup-project` | setup-project | meta | Initialize new projects |
 
 ## Best Practices Summary
diff --git a/.opencode/skills/create-pr/SKILL.md b/.opencode/skills/create-pr/SKILL.md
index d4cb3e5..72bfea0 100644
--- a/.opencode/skills/create-pr/SKILL.md
+++ b/.opencode/skills/create-pr/SKILL.md
@@ -2,8 +2,8 @@
 name: create-pr
 description: Create pull requests with conventional commits, proper formatting, and branch workflow
 version: "1.0"
-author: software-engineer
-audience: software-engineer
+author: system-architect
+audience: system-architect
 workflow: git-management
 ---
 
@@ -11,7 +11,7 @@ workflow: git-management
 
 ## When to Use
 
-Load this skill after the reviewer approves the feature (Step 4 APPROVED) and the PO has accepted it (Step 5). Use it to create and merge the feature pull request.
+Load this skill after the system-architect approves the feature (Step 4 APPROVED) and the PO has accepted it (Step 5). Use it to create and merge the feature pull request.
 
 ## Step-by-Step
 
@@ -66,7 +66,7 @@ gh pr create \
 - Application runs: `timeout 10s task run` (exit 124 = hung = fix it)
 
 ## Reviewer Notes
-<Any context the reviewer needs>
+<Any context the system-architect needs>
 EOF
 )"
 ```
@@ -83,13 +83,15 @@ EOF
 
 ## Merging
 
-Use squash merge for feature branches to keep main history clean:
+Use a merge commit (GitHub's equivalent of `git merge --no-ff`) to preserve the feature boundary in history. This makes the feature revertible as a single unit:
 ```bash
-gh pr merge <number> --squash --delete-branch
+gh pr merge <number> --merge --delete-branch
 ```
 
-After merge, update local main:
+**After merge**:
 ```bash
 git checkout main
 git pull origin main
 ```
+
+**Why not squash**: Squash merge erases the individual commit history of the feature. A merge commit (`--merge`, equivalent to `--no-ff`) groups all feature commits together while preserving each commit's message and authorship.
diff --git a/.opencode/skills/create-skill/SKILL.md b/.opencode/skills/create-skill/SKILL.md
index 7774de3..049f899 100644
--- a/.opencode/skills/create-skill/SKILL.md
+++ b/.opencode/skills/create-skill/SKILL.md
@@ -27,7 +27,7 @@ Before writing any skill, research the domain to ground the skill in industry st
    - Vendor documentation (OpenAI, Anthropic, Google, Microsoft)
    - Industry standards (ISO, NIST, OMG)
    - Established methodologies (e.g., FDD, Scrum, Kanban for process skills)
-3. **Read existing research**: Check `docs/scientific-research/` for related entries — each file covers a domain (testing, oop-design, architecture, ai-agents, etc.)
+3. **Read existing research**: Check `docs/research/` for related entries — each file covers a domain (testing, oop-design, architecture, ai-agents, etc.)
 4. **Synthesize conclusions**: Extract actionable conclusions — what works, why, and when to apply it
 5. **Embed as guidance**: Write the skill's steps, checklists, and decision rules based on those conclusions — not as academic citations but as direct guidance ("Use X because it produces Y outcome")
 
@@ -137,10 +137,10 @@ Add the skill name to the agent's "Available Skills" section so the agent knows
 | `define-scope` | product-owner | Step 1: define acceptance criteria |
 | `implement` | software-engineer | Steps 2-3: architecture + TDD loop |
 | `apply-patterns` | software-engineer | Steps 2, 3: refactor when smell detected |
-| `verify` | reviewer | Step 4: adversarial verification |
+| `verify` | system-architect | Step 4: adversarial verification |
 | `check-quality` | software-engineer | Quick reference — redirects to verify |
-| `create-pr` | software-engineer | Step 5: create PR with squash merge |
-| `git-release` | software-engineer | Step 5: calver versioning and release |
+| `create-pr` | system-architect | Step 5: create PR with --no-ff merge |
+| `git-release` | stakeholder | Step 5: calver versioning and release |
 | `update-docs` | product-owner | Step 5 (after acceptance) + on stakeholder demand: C4 diagrams + glossary |
 | `design-colors` | designer | Color palette selection and WCAG validation |
 | `design-assets` | designer | SVG visual asset creation and updates |
diff --git a/.opencode/skills/define-scope/SKILL.md b/.opencode/skills/define-scope/SKILL.md
index b8803a6..7c93c7d 100644
--- a/.opencode/skills/define-scope/SKILL.md
+++ b/.opencode/skills/define-scope/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: define-scope
 description: Step 1 — discover requirements through stakeholder interviews and write Gherkin acceptance criteria
-version: "5.0"
+version: "6.0"
 author: product-owner
 audience: product-owner
 workflow: feature-lifecycle
@@ -21,7 +21,7 @@ Step 1 has two stages:
 
 | Stage | Who | Output |
 |---|---|---|
-| **Stage 1 — Discovery** | PO + stakeholder | `docs/discovery_journal.md` (Q&A) + `docs/discovery.md` (synthesis) + `.feature` descriptions |
+| **Stage 1 — Discovery** | PO + stakeholder | `docs/scope_journal.md` (Q&A) + `docs/discovery.md` (synthesis) + `.feature` descriptions |
 | **Stage 2 — Specification** | PO alone | `Rule:` blocks + `Example:` blocks with `@id` tags in `.feature` files |
 
 Stage 1 is iterative and ongoing — sessions happen whenever the PO or stakeholder needs to discover or refine scope. Stage 2 runs per feature, only after that feature has `Status: BASELINED`.
@@ -76,15 +76,44 @@ Discovery is a continuous, iterative process. Sessions happen whenever scope nee
 
 **Before asking any questions:**
 
-1. Check `docs/discovery_journal.md` for the most recent session block.
+1. Check `docs/scope_journal.md` for the most recent session block.
    - If the most recent block has `Status: IN-PROGRESS` → the previous session was interrupted. Resume it: check which `.feature` files need updating (compare journal Q&A against current `.feature` descriptions), write the `discovery.md` synthesis block if missing, then mark the block `Status: COMPLETE`. Only then begin a new session.
-   - If `docs/discovery_journal.md` does not exist → this is the first session. Create both `docs/discovery_journal.md` and `docs/discovery.md` using the templates at the end of this skill.
-2. Open `docs/discovery_journal.md` and append a new session header:
+   - If `docs/scope_journal.md` does not exist → this is the first session. Create both `docs/scope_journal.md` and `docs/discovery.md` using the templates in `scope-journal.md.template` and `discovery.md.template` in this skill's directory.
+2. Read `docs/domain-model.md` (if it exists) to check existing entities. The PO reads this file but never writes to it. If it does not exist yet, the SA will create it at Step 2.
+3. Declare session scope to the stakeholder: announce the total groups and estimated question count (e.g., "3 groups: General (7 Q), Cross-cutting, Feature: login").
+4. Open `docs/scope_journal.md` and append a new session header:
    ```markdown
    ## YYYY-MM-DD — Session N
    Status: IN-PROGRESS
    ```
-   Write this header **before** asking any questions. This is the durability marker — if the session is interrupted, the next agent sees `IN-PROGRESS` and knows writes are pending.
+    Write this header **before** asking any questions. This is the durability marker — if the session is interrupted, the next agent sees `IN-PROGRESS` and knows writes are pending.
+
+### Interview Protocol
+
+**Progress declaration (first message):**
+State the session structure upfront:
+> "This discovery session has 3 question groups:
+> 1. General (7 questions) — about users, goals, success/failure
+> 2. Cross-cutting — about behavior groups, integrations, lifecycle events
+> 3. Feature: <name> — about specific functionality
+>
+> I will ask one group at a time and summarize before moving on."
+
+**Question grouping:**
+- One `question` tool call per question group
+- Each question within the group uses a clear `header` showing progress, e.g.:
+  - `General — Q1/7`
+  - `General — Q2/7`
+  - `Feature: login — Q3/5`
+
+**Input types:**
+- **Checkbox (`multiple: true`)**: for multi-select answers (e.g., "Which platforms?" "Which user roles?")
+- **Options**: for single-select with known choices (e.g., "Priority: High / Medium / Low")
+- **Free-text field**: for open-ended responses whose options cannot be listed in advance
+
+**Defaults:**
+- Offer "Other" or pre-fill with most common answer when context permits
+- Never force a stakeholder into a false dichotomy; always include "Something else / Not sure"
 
 ### Question Order (within every session)
 
@@ -111,7 +140,7 @@ Target behavior groups, bounded contexts, integration points, lifecycle events,
 **3. Feature questions** (one feature at a time)
 
 For each feature the session touches:
-- Extract relevant nouns and verbs from `docs/discovery.md` Domain Model (if it exists)
+- Extract relevant nouns and verbs from `docs/glossary.md` and `docs/domain-model.md` (if they exist)
 - Generate questions from entity gaps: boundaries, edge cases, interactions, failure modes
 - Run a silent pre-mortem: "Imagine the developer builds this feature exactly as described, all tests pass, but the feature doesn't work for the user. What would be missing?"
 - Apply CIT, Laddering, and CI Perspective Change per question
@@ -121,33 +150,64 @@ For each feature the session touches:
 2. Create stub `.feature` files for both parts (if they don't already exist)
 3. Continue feature questions for both new features in sequence within the same session
 
+### Write Confirmation Gate
+
+**Before writing ANY file** (`docs/scope_journal.md`, `.feature` files, or `docs/discovery.md`):
+
+1. State exactly what will be written:
+   > "I will now append the Q&A from this session to `docs/scope_journal.md`."
+
+2. State exactly which file(s):
+   > "I will create `docs/features/backlog/<feature-stem>.feature`."
+
+3. **Ask for explicit confirmation** using the `question` tool:
+   - `header: "Ready to write"`
+   - Question text: "Confirm: write to `<path>`?"
+   - Options: `["Yes, write it", "Show me a preview first", "No, I need changes"]`
+
+4. Only proceed with `write`/`edit` if the stakeholder selects "Yes, write it".
+
+**This applies to all write operations in this skill**, including:
+- `docs/scope_journal.md` (session header and Q&A)
+- `docs/features/backlog/<feature-stem>.feature` (initial description or update)
+- `docs/discovery.md` (synthesis block)
+
 ### After Questions (PO alone, same session)
 
 **Step A — Write answered Q&A to journal**
 
-Append all answered Q&A to `docs/discovery_journal.md`, in groups (general, cross-cutting, then per-feature). Write only answered questions. Unanswered questions are discarded.
+Append all answered Q&A to `docs/scope_journal.md`, in groups (general, cross-cutting, then per-feature). Write only answered questions. Unanswered questions are discarded.
 
 Group headers use this format:
 - General group: `### General`
 - Cross-cutting group: `### <Group Name>`
 - Feature group: `### Feature: <feature-stem>`
 
-**Step B — Update .feature descriptions**
+**Step B — Update glossary and discovery.md**
+
+1. Update `docs/glossary.md` (new or corrected definitions; edits allowed).
+2. Append to `docs/discovery.md` (use the template in `discovery.md.template`):
+   - 3-line session summary (general/behavioral focus)
+   - Entities **added or deprecated** this session (suggestions for the SA; not a formal model)
+   - Features **touched** this session + 1-line reason why
+
+The PO does **not** write `docs/domain-model.md`. Entity suggestions live in `discovery.md` for the SA to formalize at Step 2.
+
+**Step C — Update .feature descriptions**
 
 For each feature touched in this session: rewrite the `.feature` file description to reflect the current state of understanding. Only touched features are updated; all others remain exactly as-is.
 
-If a feature is new (just created as a stub): write its initial description now.
+If a feature is new (just created as a stub): write its initial description now. Use the template in `feature.md.template`.
 
-**Step C — Append session synthesis to discovery.md (LAST)**
+**Step D — Completed feature regression check**
 
-After all `.feature` files are updated, append one `## Session: YYYY-MM-DD` block to `docs/discovery.md`. The block contains:
-- `### Feature List` — which features were added or changed (0–N entries); if nothing changed, write "No changes"
-- `### Domain Model` — new or updated domain entities and verbs; if nothing changed, write "No changes"
-- `### Context` (first session only) — 3–5 sentence synthesis of who the users are, what the product does, why it exists, success/failure conditions, and explicit out-of-scope
+If a `completed/` feature was touched and its description/rules changed:
+- **Move it to `backlog/`**. Description changes always imply behavior changes; cosmetic rewrites are never performed.
+- Record the move in `discovery.md`: "Moved `<feature-stem>` from completed to backlog due to changed requirements."
 
-**Step D — Mark session complete**
+**Step E — Mark session complete**
 
-Update the session header in `docs/discovery_journal.md`:
+Update the session header in `docs/scope_journal.md`:
 ```markdown
 ## YYYY-MM-DD — Session N
 Status: COMPLETE
@@ -281,16 +341,17 @@ All Rules must have their pre-mortems completed before any Examples are written.
 
 Communicate verbally to the next agent. Every `DISAGREE` is a **hard blocker** — fix before committing. Do not commit until all items are AGREE or have a documented resolution.
 
-- INVEST-I: each Rule is Independent (no hidden ordering or dependency between Rules) — AGREE/DISAGREE | conflict:
-- INVEST-V: each Rule delivers Value to a named user — AGREE/DISAGREE | Rule:
-- INVEST-S: each Rule is Small enough for one development cycle — AGREE/DISAGREE | Rule:
-- INVEST-T: each Rule is Testable (I can write a pass/fail Example for it) — AGREE/DISAGREE | Rule:
-- Observable: every Then is a single, observable, measurable outcome — AGREE/DISAGREE | file:line
-- No impl details: no Example tests internal state or implementation — AGREE/DISAGREE | file:line
-- Coverage: every entity in the feature description appears in at least one Rule — AGREE/DISAGREE | missing:
-- Distinct: no two Examples test the same observable behavior — AGREE/DISAGREE | file:line
-- Pre-mortem: I ran a pre-mortem on each Rule and found no hidden failure modes — AGREE/DISAGREE | Rule:
-- Scope: no Example introduces behavior outside the feature boundary — AGREE/DISAGREE | file:line
+As a product-owner I declare that:
+* INVEST-I: each Rule is Independent (no hidden ordering or dependency between Rules) — AGREE/DISAGREE | conflict:
+* INVEST-V: each Rule delivers Value to a named user — AGREE/DISAGREE | Rule:
+* INVEST-S: each Rule is Small enough for one development cycle — AGREE/DISAGREE | Rule:
+* INVEST-T: each Rule is Testable (I can write a pass/fail Example for it) — AGREE/DISAGREE | Rule:
+* Observable: every Then is a single, observable, measurable outcome — AGREE/DISAGREE | file:line
+* No impl details: no Example tests internal state or implementation — AGREE/DISAGREE | file:line
+* Coverage: every entity in the feature description appears in at least one Rule — AGREE/DISAGREE | missing:
+* Distinct: no two Examples test the same observable behavior — AGREE/DISAGREE | file:line
+* Pre-mortem: I ran a pre-mortem on each Rule and found no hidden failure modes — AGREE/DISAGREE | Rule:
+* Scope: no Example introduces behavior outside the feature boundary — AGREE/DISAGREE | file:line
 
 Commit: `feat(criteria): write acceptance criteria for <feature-stem>`
 
@@ -323,153 +384,81 @@ When a defect is reported against a completed or in-progress feature:
 
 ## Feature File Format
 
-Each feature is a single `.feature` file. The description block contains the feature description and Status. All Q&A belongs in `docs/discovery_journal.md`; all architectural decisions belong in `docs/architecture.md`.
-
-```gherkin
-Feature: <Feature title>
-
-  <2–4 sentence description of what this feature does and why it exists.
-  Written in plain language, always kept current by the PO.>
-
-  Status: ELICITING | BASELINED (YYYY-MM-DD)
-
-  Rules (Business):
-  - <Business rule that applies across multiple Examples>
+Each feature is a single `.feature` file. The description block contains the feature description and Status. All Q&A belongs in `docs/scope_journal.md`; all architectural decisions belong in `docs/adr/ADR-YYYY-MM-DD-<slug>.md`.
 
-  Constraints:
-  - <Non-functional requirement specific to this feature>
-
-  Rule: <User story title>
-    As a <role>
-    I want <goal>
-    So that <benefit>
-
-    @id:a3f2b1c4
-    Example: <Concrete scenario title>
-      Given <initial context>
-      When <event or action>
-      Then <observable outcome>
-
-    @deprecated @id:b5c6d7e8
-    Example: <Superseded scenario>
-      Given ...
-      When ...
-      Then ...
-```
+See `feature.md.template` in this skill's directory for the full template.
 
 The **Rules (Business)** section captures business rules that hold across multiple Examples. Identifying rules first prevents redundant or contradictory Examples.
 
 The **Constraints** section captures non-functional requirements. Testable constraints should become `Example:` blocks with `@id` tags.
 
 What is **not** in `.feature` files:
-- Entities table — domain model lives in `docs/discovery.md`
-- Session Q&A blocks — live in `docs/discovery_journal.md`
-- Template §N markers — live in `docs/discovery_journal.md` session blocks
-- Architecture section — lives in `docs/architecture.md`
-
----
-
-## Project-Level Discovery Templates
-
-Three files hold project-level discovery content. Use these templates when creating them for the first time.
-
-### `docs/discovery_journal.md` — Raw Q&A (append-only)
-
-```markdown
-# Discovery Journal: <project-name>
-
----
-
-## YYYY-MM-DD — Session 1
-Status: IN-PROGRESS
-
-### General
-
-| ID | Question | Answer |
-|----|----------|--------|
-| Q1 | Who are the users? | ... |
-| Q2 | What does the product do at a high level? | ... |
-| Q3 | Why does it exist — what problem does it solve? | ... |
-| Q4 | When and where is it used? | ... |
-| Q5 | Success — what does "done" look like? | ... |
-| Q6 | Failure — what must never happen? | ... |
-| Q7 | Out-of-scope — what are we explicitly not building? | ... |
-
-### <Group Name>
-
-| ID | Question | Answer |
-|----|----------|--------|
-| Q8 | ... | ... |
-
-### Feature: <feature-stem>
-
-| ID | Question | Answer |
-|----|----------|--------|
-| Q9 | ... | ... |
-
-Status: COMPLETE
-```
-
-Rules:
-- Session header written first with `Status: IN-PROGRESS` before any Q&A
-- Only answered questions are written; unanswered questions are discarded
-- Questions grouped by topic (general, cross-cutting groups, per-feature)
-- `Status: COMPLETE` written at the end of the session block, after all writes are done
-- Never edit past entries — only append new session blocks
-
-### `docs/discovery.md` — Synthesis Changelog (append-only)
-
-```markdown
-# Discovery: <project-name>
+- Entities table — domain model lives in `docs/domain-model.md` (SA-owned)
+- Session Q&A blocks — live in `docs/scope_journal.md`
+- Architecture section — lives in `docs/adr/ADR-*.md`
 
 ---
 
-## Session: YYYY-MM-DD
-
-### Context
-<3–5 sentence synthesis of who the users are, what the product does, why it exists,
-success/failure conditions, and out-of-scope boundaries.>
-(First session only. Omit this subsection in subsequent sessions.)
-
-### Feature List
-- `<feature-stem>` — <one-sentence description of what changed or was added>
-(Write "No changes" if no features were added or modified this session.)
+## Post-Mortem Protocol
+
+When a stakeholder reports a failure after the PO has attempted Step 5 acceptance, the feature does **not** move to `completed/`. Instead, the team compiles a compact post-mortem and the feature restarts at Step 2.
+
+### Trigger
+A stakeholder reports that a feature is wrong after the PO's acceptance attempt.
+
+### Workflow
+1. **PO ensures feature is in `in-progress/`** (move back if already shifted).
+2. **Team compiles post-mortem** — max 15 lines, root cause at process level.
+3. **SE creates fix branch** from the feature's original start commit:
+   ```bash
+   # Find the feature's original start commit
+   git log --all --grep="feat(<feature-stem>)" --oneline
+   # Or, if the old branch still exists:
+   git log --reverse main..feat/<feature-stem> --oneline   # first line = start commit
+   
+   # Create fix branch from start commit
+   git checkout -b fix/<feature-stem> <start-commit-sha>
+   
+   # Commit post-mortem as first commit on the new branch
+   git add docs/post-mortem/YYYY-MM-DD-<feature-stem>-<keyword>.md
+   git commit -m "docs(post-mortem): root cause for <feature-stem> <keyword>"
+   
+   # Push the fix branch
+   git push -u origin fix/<feature-stem>
+   ```
+4. **PO scans `docs/post-mortem/`**, selects relevant files by `<feature-stem>` or `<failure-keyword>` in filename.
+5. **PO reads selected post-mortems** for context before handoff.
+6. **PO resets FLOW.md**: sets Status to `[STEP-2-ARCH]` and writes `Next: Run @system-architect — restart Step 2 for <feature-stem> on fix/<feature-stem> with post-mortem context`.
+7. **SA begins Step 2** on `fix/<feature-stem>`, reading relevant post-mortems as input.
 
-### Domain Model
-| Type | Name | Description | In Scope |
-|------|------|-------------|----------|
-| Noun | <name> | <description> | Yes |
-| Verb | <name> | <description> | Yes |
-(Write "No changes" if domain model was not updated this session.)
-```
+### Document Format
 
-Rules:
-- Each session appends one `## Session: YYYY-MM-DD` block
-- Synthesis block is written LAST — only after all `.feature` file descriptions are updated
-- No project-level `Status: BASELINED` — feature-level BASELINED in `.feature` files is the gate
-- Never edit past blocks — append only; later blocks extend or supersede earlier ones
+File: `docs/post-mortem/YYYY-MM-DD-<feature-stem>-<failure-keyword>.md`
 
-### `docs/architecture.md` — Architectural Decisions (append-only, software-engineer)
+Use the template `post-mortem.md.template` in this skill's directory.
 
-```markdown
-# Architecture: <project-name>
+### Rules
+- One file per incident. Never edit an existing post-mortem.
+- If the same failure mode recurs, write a new post-mortem referencing the old one by filename.
+- PO reads post-mortems selectively; never require reading all of them.
 
 ---
 
-## YYYY-MM-DD — <feature-stem>: <short title>
+## Templates
 
-Decision: <what was decided — one sentence>
-Reason: <why — one sentence>
-Alternatives considered: <what was rejected and why>
-Feature: <feature-stem>
-```
+All templates for files written by this skill live in this skill's directory:
 
-Rules: Append-only. When a decision changes, append a new block that supersedes the old one. Cross-feature decisions use `Cross-feature:` in the header. Only write a block for non-obvious decisions with meaningful trade-offs.
+- `scope-journal.md.template` — `docs/scope_journal.md` structure
+- `discovery.md.template` — `docs/discovery.md` per-session block
+- `feature.md.template` — `.feature` file structure
+- `post-mortem.md.template` — `docs/post-mortem/YYYY-MM-DD-<feature-stem>-<failure-keyword>.md` structure
 
-Base directory for this skill: file:///home/user/Documents/projects/python-project-template/.opencode/skills/scope
+Base directory for this skill: file:///home/user/Documents/projects/python-project-template/.opencode/skills/define-scope
 Relative paths in this skill (e.g., scripts/, reference/) are relative to this base directory.
 Note: file list is sampled.
 
 <skill_files>
-<file>/home/user/Documents/projects/python-project-template/.opencode/skills/define-scope/discovery-template.md</file>
+<file>/home/user/Documents/projects/python-project-template/.opencode/skills/define-scope/discovery.md.template</file>
+<file>/home/user/Documents/projects/python-project-template/.opencode/skills/define-scope/feature.md.template</file>
+<file>/home/user/Documents/projects/python-project-template/.opencode/skills/define-scope/scope-journal.md.template</file>
 </skill_files>
diff --git a/.opencode/skills/define-scope/discovery.md.template b/.opencode/skills/define-scope/discovery.md.template
new file mode 100644
index 0000000..e07dbb7
--- /dev/null
+++ b/.opencode/skills/define-scope/discovery.md.template
@@ -0,0 +1,24 @@
+# Discovery: <project-name>
+
+> Append-only session synthesis log.
+> Written by the product-owner at the end of each discovery session.
+> Each block summarizes one session: what was learned, what entities were suggested, and which features were touched.
+> Never edit past blocks — later blocks extend or supersede earlier ones.
+
+---
+
+## Session: YYYY-MM-DD
+
+### Summary
+<3-line synthesis of the session: what was discussed, what decisions were made, what new information emerged.>
+
+### Entities Added or Deprecated
+| Action | Type | Name | Notes |
+|--------|------|------|-------|
+| Added | Noun | <name> | <brief note> |
+| Deprecated | Verb | <name> | <reason> |
+(Write "No changes" if no entities were added or deprecated this session.)
+
+### Features Touched
+- `<feature-stem>` — <one-line reason why this feature was touched>
+(Write "No changes" if no features were added or modified this session.)
diff --git a/.opencode/skills/define-scope/feature.md.template b/.opencode/skills/define-scope/feature.md.template
new file mode 100644
index 0000000..7a6b9e0
--- /dev/null
+++ b/.opencode/skills/define-scope/feature.md.template
@@ -0,0 +1,29 @@
+Feature: <Feature title>
+
+  <2–4 sentence description of what this feature does and why it exists.
+  Written in plain language, always kept current by the PO.>
+
+  Status: ELICITING | BASELINED (YYYY-MM-DD)
+
+  Rules (Business):
+  - <Business rule that applies across multiple Examples>
+
+  Constraints:
+  - <Non-functional requirement specific to this feature>
+
+  Rule: <User story title>
+    As a <role>
+    I want <goal>
+    So that <benefit>
+
+    @id:a3f2b1c4
+    Example: <Concrete scenario title>
+      Given <initial context>
+      When <event or action>
+      Then <observable outcome>
+
+    @deprecated @id:b5c6d7e8
+    Example: <Superseded scenario>
+      Given ...
+      When ...
+      Then ...
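A minimal sketch of how the `@id` tags in this template could be collected when building a test list. It assumes that tags sit on their own line directly above an `Example:` and that examples tagged `@deprecated` are skipped; both are assumptions drawn from the template's layout, not documented behaviour. The function name `active_ids` is hypothetical.

```python
import re

# Matches the `@id:<value>` tag shape shown in the template.
ID_TAG = re.compile(r"@id:(\S+)")


def active_ids(feature_text: str) -> list[str]:
    """Return @id values in file order, skipping tag lines marked @deprecated."""
    ids: list[str] = []
    for line in feature_text.splitlines():
        tags = line.split()
        if not tags or not tags[0].startswith("@"):
            continue  # not a tag line
        if "@deprecated" in tags:
            continue  # assumption: superseded examples are ignored
        ids.extend(ID_TAG.findall(line))
    return ids
```

Applied to the template body above, this would yield only the first example's id, since the second carries `@deprecated`.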
diff --git a/.opencode/skills/define-scope/post-mortem.md.template b/.opencode/skills/define-scope/post-mortem.md.template
new file mode 100644
index 0000000..e455413
--- /dev/null
+++ b/.opencode/skills/define-scope/post-mortem.md.template
@@ -0,0 +1,16 @@
+# <Feature Stem>: <One-line failure>
+
+## Failed At
+Step 5 — stakeholder: "<exact complaint>"
+
+## Root Cause
+<One sentence: what process gap allowed this?>
+
+## Missed Gate
+<Which step's gate failed and why>
+
+## Fix
+<Process change to prevent recurrence>
+
+## Restart Check
+<How SA verifies this mode is handled>
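A small sketch of the naming and selection convention for post-mortem files, assuming the `docs/post-mortem/YYYY-MM-DD-<feature-stem>-<failure-keyword>.md` pattern from the skill and a plain substring match on the filename. The helper names `post_mortem_path` and `select_post_mortems` are hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def post_mortem_path(day: date, feature_stem: str, keyword: str) -> PurePosixPath:
    """Build the post-mortem path for one incident."""
    filename = f"{day.isoformat()}-{feature_stem}-{keyword}.md"
    return PurePosixPath("docs/post-mortem") / filename


def select_post_mortems(filenames: list[str], needle: str) -> list[str]:
    """Pick files whose name mentions a feature stem or failure keyword."""
    return sorted(name for name in filenames if needle in name)
```

Because filenames start with an ISO date, the sorted result is also in chronological order, which may help when a newer post-mortem references an older one.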
diff --git a/.opencode/skills/define-scope/scope-journal.md.template b/.opencode/skills/define-scope/scope-journal.md.template
new file mode 100644
index 0000000..01f6058
--- /dev/null
+++ b/.opencode/skills/define-scope/scope-journal.md.template
@@ -0,0 +1,36 @@
+# Scope Journal: <project-name>
+
+> Append-only record of all discovery session Q&A.
+> Written by the product-owner. Read by the product-owner for resume checks.
+> Never edit past entries — append new session blocks only.
+
+---
+
+## YYYY-MM-DD — Session N
+Status: IN-PROGRESS
+
+### General
+
+| ID | Question | Answer |
+|----|----------|--------|
+| Q1 | Who are the users? | ... |
+| Q2 | What does the product do at a high level? | ... |
+| Q3 | Why does it exist — what problem does it solve? | ... |
+| Q4 | When and where is it used? | ... |
+| Q5 | Success — what does "done" look like? | ... |
+| Q6 | Failure — what must never happen? | ... |
+| Q7 | Out-of-scope — what are we explicitly not building? | ... |
+
+### <Group Name>
+
+| ID | Question | Answer |
+|----|----------|--------|
+| Q8 | ... | ... |
+
+### Feature: <feature-stem>
+
+| ID | Question | Answer |
+|----|----------|--------|
+| Q9 | ... | ... |
+
+Status: COMPLETE
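A hedged sketch of the resume check the journal header mentions. It assumes that each session block starts with a `## YYYY-MM-DD <separator> Session N` heading and that the last `Status:` line inside the most recent block is authoritative. The template does not state that rule, so treat it as an assumption. `last_session_status` is a hypothetical name.

```python
import re
from typing import Optional

# Session heading as laid out in the template; the separator is matched loosely.
SESSION_HEADER = re.compile(r"^## \d{4}-\d{2}-\d{2} \S+ Session \d+\s*$", re.MULTILINE)
STATUS_LINE = re.compile(r"^Status: (IN-PROGRESS|COMPLETE)\s*$", re.MULTILINE)


def last_session_status(journal_text: str) -> Optional[str]:
    """Return the final Status of the most recent session block, if any."""
    headers = list(SESSION_HEADER.finditer(journal_text))
    if not headers:
        return None  # no session recorded yet
    last_block = journal_text[headers[-1].end():]
    statuses = STATUS_LINE.findall(last_block)
    return statuses[-1] if statuses else None
```

A result of `IN-PROGRESS` would suggest the previous session was interrupted before its closing status was appended.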
diff --git a/.opencode/skills/flow/SKILL.md b/.opencode/skills/flow/SKILL.md
new file mode 100644
index 0000000..bdb7d76
--- /dev/null
+++ b/.opencode/skills/flow/SKILL.md
@@ -0,0 +1,271 @@
+---
+name: flow
+version: "1.0"
+description: Feature workflow protocol — read FLOW.md, auto-detect state, resume from checkpoint, update state
+author: software-engineer
+audience: all-agents
+workflow: session-management
+---
+
+# Feature Workflow Protocol
+
+This skill defines the single-feature-at-a-time workflow state machine. Every feature flows through 5 steps. Only ONE feature is in progress at any time. The filesystem enforces this.
+
+## Prerequisites
+
+Before starting any flow, verify these exist. If any are missing, stop and alert the human.
+
+| Requirement | Verification Command | Missing Action |
+|---|---|---|
+| Agent: product-owner | `test -f .opencode/agents/product-owner.md` | Create agent file |
+| Agent: system-architect | `test -f .opencode/agents/system-architect.md` | Create agent file |
+| Agent: software-engineer | `test -f .opencode/agents/software-engineer.md` | Create agent file |
+| Skill: run-session | `test -f .opencode/skills/run-session/SKILL.md` | Install skill |
+| Skill: define-scope | `test -f .opencode/skills/define-scope/SKILL.md` | Install skill |
+| Skill: architect | `test -f .opencode/skills/architect/SKILL.md` | Install skill |
+| Skill: implement | `test -f .opencode/skills/implement/SKILL.md` | Install skill |
+| Skill: verify | `test -f .opencode/skills/verify/SKILL.md` | Install skill |
+| Skill: version-control | `test -f .opencode/skills/version-control/SKILL.md` | Install skill |
+| Tool: uv | `command -v uv` | Install uv |
+| Tool: git | `command -v git` | Install git |
+| Directory: docs/features/ | `test -d docs/features/backlog` | Run setup-project |
+| Directory: docs/adr/ | `test -d docs/adr` | Create directory |
+| FLOW.md | `test -f FLOW.md` | Create from template |
+
+## State Machine
+
+States are checked IN ORDER. The first matching state is the current state.
+
+### Detection Rules
+
+1. **No file in `docs/features/in-progress/`** → [IDLE]
+2. **Feature in in-progress, no `Status: BASELINED`** → [STEP-1-DISCOVERY]
+3. **Feature has `Status: BASELINED`, no `Rule:` blocks** → [STEP-1-STORIES]
+4. **Feature has `Rule:` blocks, no `Example:` with @id** → [STEP-1-CRITERIA]
+5. **Feature has @id tags, no feat/ or fix/ branch exists** → [STEP-2-READY]
+6. **On feature branch, no test stubs in `tests/features/<stem>/`** → [STEP-2-ARCH]
+7. **Test stubs exist, any have `@pytest.mark.skip`** → [STEP-3-READY]
+8. **Unskipped test exists that fails** → [STEP-3-RED]
+9. **All unskipped tests pass, skipped tests remain** → [STEP-3-GREEN]
+10. **All tests pass, no skipped tests** → [STEP-4-READY]
+11. **Manual state set by SA after Step 4 approval** → [STEP-5-READY]
+12. **On main branch, feature still in in-progress/** → [STEP-5-COMPLETE]
+13. **Post-mortem file exists for current feature** → [POST-MORTEM]
+
+### State Details
+
+#### [IDLE] → Waiting for feature selection
+**Owner**: product-owner
+**Detect**: No file in `docs/features/in-progress/`
+**Action**: Select feature from backlog/ and move to in-progress/
+**Next**: [STEP-1-DISCOVERY]
+
+#### [STEP-1-DISCOVERY] → Requirements discovery
+**Owner**: product-owner
+**Detect**: Feature in in-progress/, no `Status: BASELINED` in file
+**Action**: Interview stakeholder, update scope_journal.md, discovery.md, glossary.md
+**Success**: Feature baselined → [STEP-1-STORIES]
+**Failure**: More discovery needed → Stay in [STEP-1-DISCOVERY]
+
+#### [STEP-1-STORIES] → Write user stories
+**Owner**: product-owner
+**Detect**: Feature has `Status: BASELINED`, no `Rule:` blocks
+**Action**: Write Rule: blocks with INVEST criteria
+**Success**: Stories complete → [STEP-1-CRITERIA]
+
+#### [STEP-1-CRITERIA] → Write acceptance criteria
+**Owner**: product-owner
+**Detect**: Feature has `Rule:` blocks, no `Example:` blocks with @id
+**Action**: Write Example: blocks with @id tags
+**Success**: Criteria complete → [STEP-2-READY]
+**Commit**: `feat(criteria): write acceptance criteria for <name>`
+
+#### [STEP-2-READY] → Ready for architecture
+**Owner**: system-architect
+**Detect**: Feature has @id tags, no feat/<stem> or fix/<stem> branch exists
+**Action**: Create branch feat/<stem> from main
+**Success**: Branch created → [STEP-2-ARCH]
+
+#### [STEP-2-ARCH] → Design architecture
+**Owner**: system-architect
+**Detect**: On feat/<stem> or fix/<stem> branch, no test stubs in tests/features/<stem>/
+**Action**: Read feature, design stubs, write ADRs, update domain-model.md
+**Success**: Running `uv run task test-fast` generates stubs → [STEP-3-READY]
+**Failure**: Spec unclear → [STEP-1-DISCOVERY] (escalate to PO)
+**Commit**: `feat(arch): design <feature> architecture`
+
+#### [STEP-3-READY] → Ready for TDD
+**Owner**: software-engineer
+**Detect**: Test stubs exist, some have @pytest.mark.skip
+**Action**: Pick first skipped @id, remove skip, write test
+**Success**: Test written and fails → [STEP-3-RED]
+
+#### [STEP-3-RED] → Test failing
+**Owner**: software-engineer
+**Detect**: Unskipped test exists that fails
+**Action**: Write minimal code to pass
+**Success**: Test passes → [STEP-3-GREEN]
+
+#### [STEP-3-GREEN] → Test passing
+**Owner**: software-engineer
+**Detect**: All unskipped tests pass, more skipped tests remain
+**Action**: Refactor if needed, then pick next @id
+**Success**: More @ids → [STEP-3-READY]
+**Success**: All @ids done → [STEP-4-READY]
+**Commit**: After each @id or logical group
+
+#### [STEP-4-READY] → Ready for verification
+**Owner**: system-architect
+**Detect**: All tests implemented (no @skip) and passing
+**Action**: Run all quality checks, semantic review
+**Success**: All checks pass → [STEP-5-READY]
+**Failure**: Issues found → [STEP-3-READY] (document issues)
+
+#### [STEP-5-READY] → Ready for acceptance
+**Owner**: product-owner
+**Detect**: Manual state (set after Step 4 approval)
+**Action**: Demo and validate against criteria
+**Success**: Feature accepted → [STEP-5-MERGE]
+**Failure**: Not accepted → [POST-MORTEM]
+
+#### [STEP-5-MERGE] → Merge to main
+**Owner**: software-engineer
+**Detect**: Feature accepted, still on feature branch
+**Action**: Merge feat/<stem> to main with --no-ff
+**Success**: Merged → [STEP-5-COMPLETE]
+
+#### [STEP-5-COMPLETE] → Feature complete
+**Owner**: product-owner
+**Detect**: On main branch, feature still in in-progress/
+**Action**: Move feature from in-progress/ to completed/
+**Success**: Feature moved → [IDLE]
+
+#### [POST-MORTEM] → Failed feature analysis
+**Owner**: product-owner
+**Detect**: Post-mortem file exists for current feature
+**Action**: Write post-mortem, create fix/<stem> branch
+**Success**: Post-mortem complete → [STEP-2-ARCH]
+
+## Session Protocol
+
+### Session Start
+
+1. Read `FLOW.md` — find current feature, branch, status.
+2. Run `detect-state` (see below) to verify the state is correct.
+3. If the detected state differs from `FLOW.md` Status, update `FLOW.md` to match reality.
+4. Check prerequisites table (above). If any are missing, stop and report.
+5. If a feature is active, read the in-progress `.feature` file.
+6. Run `git status` and `git branch --show-current` to understand workspace state.
+7. Confirm scope: you are working on exactly one step of one feature.
+
+### Session End
+
+1. Update `FLOW.md`:
+   - Set Status to the detected state
+   - Update Session Log with what was done
+   - Update `Next:` line with one concrete action
+2. Commit any uncommitted work (even WIP):
+   ```bash
+   git add -A
+   git commit -m "WIP(<feature-stem>): <what was done>"
+   ```
+3. If a step is fully complete, use the proper commit message instead of WIP.
+
+### Step Completion Protocol
+
+When a step completes within a session:
+
+1. Update `FLOW.md` to reflect the completed step before doing any other work.
+2. Commit the `FLOW.md` update:
+   ```bash
+   git add FLOW.md
+   git commit -m "chore: complete step <N> for <feature-stem>"
+   ```
+3. Only then begin the next step (in a new session where possible).
+
+## Auto-Detection
+
+To detect the current state automatically, run these checks in order:
+
+```bash
+# 1. Check for in-progress feature
+ls docs/features/in-progress/*.feature 2>/dev/null
+# If empty → [IDLE]
+
+# 2. Check feature baselined
+grep -q "Status: BASELINED" docs/features/in-progress/*.feature
+# If no match → [STEP-1-DISCOVERY]
+
+# 3. Check for Rule blocks
+grep -qE "^[[:space:]]*Rule:" docs/features/in-progress/*.feature
+# If no match → [STEP-1-STORIES]
+
+# 4. Check for Example blocks with @id
+grep -q "@id:" docs/features/in-progress/*.feature
+# If no match → [STEP-1-CRITERIA]
+
+# 5. Check for feature branch
+git branch --show-current | grep -E "^feat/|^fix/"
+# If no match → [STEP-2-READY]
+
+# 6. Check for test stubs
+ls tests/features/<stem>/ 2>/dev/null | head -1
+# If empty → [STEP-2-ARCH]
+
+# 7. Check for skipped tests
+grep -r "@pytest.mark.skip" tests/features/*/ 2>/dev/null
+# If found → [STEP-3-READY] or [STEP-3-GREEN]
+# If not found → [STEP-4-READY]
+
+# 8. Check test failures
+uv run task test-fast 2>&1 | grep -E "FAILED|ERROR"
+# If found → [STEP-3-RED]
+# If not found and on main → [STEP-5-COMPLETE]
+```
+
+## FLOW.md Format
+
+```markdown
+# FLOW Protocol
+
+## Current Feature
+**Feature**: <feature-stem> | [NONE]
+**Branch**: <branch-name> | [NONE]
+**Status**: <state>
+
+## Prerequisites
+- [x] Agents: product-owner, system-architect, software-engineer
+- [x] Skills: run-session, define-scope, architect, implement, verify, version-control
+- [x] Tools: uv, git
+- [x] Directories: docs/features/, docs/adr/
+
+## Session Log
+<!-- Append new entries, never delete old ones -->
+**YYYY-MM-DD HH:MM** — <agent> — <state> — <action>
+
+## Next
+Run @<agent-name> — <one concrete action>
+```
+
+## Rules
+
+1. Never skip reading `FLOW.md` at session start
+2. Never end a session without updating `FLOW.md`
+3. Never leave uncommitted changes — commit as WIP if needed
+4. One step per session where possible; do not start Step N+1 in the same session as Step N
+5. The "Next" line must be actionable enough that a fresh AI can execute it without asking questions
+6. When a step completes, update `FLOW.md` and commit **before** any further work
+7. The Session Log is append-only — never delete old entries
+8. If `FLOW.md` is missing, create it from the template before doing any other work
+9. If detected state differs from `FLOW.md` Status, trust the detected state and update `FLOW.md`
+
+## Output Style
+
+Use minimal output. Every message must contain only what the next agent or stakeholder needs to continue — findings, status, decisions, blockers, and the Next: line.
+
+- Use the fewest, least verbose tool calls necessary to achieve the step's goal
+- Report results, not process
+- No narration before or after tool calls
+- No restating tool output in prose
+- No summaries of what was just done
+- Always close with Next:
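A minimal sketch of the first-match evaluation described under Detection Rules, limited to rules 1 to 5, which depend only on the feature file text and on whether a feature branch exists. The function name `detect_scope_state` and its two inputs are hypothetical; later states (test stubs, skips, failures) are left out and reported as `None`.

```python
import re
from typing import Optional

# `Rule:` may be indented inside the Feature block; `Rules (Business):` must not match.
RULE_BLOCK = re.compile(r"^[ \t]*Rule:", re.MULTILINE)


def detect_scope_state(feature_text: Optional[str], has_feature_branch: bool) -> Optional[str]:
    """Apply detection rules 1 to 5 in order; the first matching rule wins."""
    if feature_text is None:
        return "[IDLE]"  # rule 1: nothing in in-progress/
    if "Status: BASELINED" not in feature_text:
        return "[STEP-1-DISCOVERY]"  # rule 2
    if not RULE_BLOCK.search(feature_text):
        return "[STEP-1-STORIES]"  # rule 3
    if "@id:" not in feature_text:
        return "[STEP-1-CRITERIA]"  # rule 4
    if not has_feature_branch:
        return "[STEP-2-READY]"  # rule 5
    return None  # rules 6 onward need test and git state, not covered here
```

Keeping the checks as a pure function of already-read inputs makes the ordering easy to test without touching the filesystem or git.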
diff --git a/.opencode/skills/flow/flow.md.template b/.opencode/skills/flow/flow.md.template
new file mode 100644
index 0000000..2f0731b
--- /dev/null
+++ b/.opencode/skills/flow/flow.md.template
@@ -0,0 +1,20 @@
+# FLOW Protocol
+
+This file tracks the current feature in progress. Only ONE feature flows through the system at a time.
+
+## Current Feature
+**Feature**: [NONE]
+**Branch**: [NONE]
+**Status**: [IDLE]
+
+## Prerequisites
+- [ ] Agents: product-owner, system-architect, software-engineer
+- [ ] Skills: run-session, define-scope, architect, implement, verify, version-control
+- [ ] Tools: uv, git
+- [ ] Directories: docs/features/, docs/adr/
+
+## Session Log
+<!-- Append new entries, never delete old ones -->
+
+## Next
+Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.
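A short sketch of reading the `Current Feature` fields back out of `FLOW.md` at session start, assuming the `**Feature**:`, `**Branch**:` and `**Status**:` line shapes shown in this template. `read_current_feature` is a hypothetical helper, not part of any skill.

```python
import re

# One bold key per line, followed by its value.
FIELD = re.compile(r"^\*\*(Feature|Branch|Status)\*\*:\s*(.+?)\s*$", re.MULTILINE)


def read_current_feature(flow_text: str) -> dict[str, str]:
    """Return the Feature, Branch and Status values found in FLOW.md text."""
    return {key: value for key, value in FIELD.findall(flow_text)}
```

Session Log entries also start with bold text, but they carry a timestamp rather than one of the three keys, so they are not picked up.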
diff --git a/.opencode/skills/git-release/SKILL.md b/.opencode/skills/git-release/SKILL.md
index f8d66e4..62d701c 100644
--- a/.opencode/skills/git-release/SKILL.md
+++ b/.opencode/skills/git-release/SKILL.md
@@ -2,8 +2,8 @@
 name: git-release
 description: Create releases with hybrid major.minor.calver versioning and optional custom release naming
 version: "1.1"
-author: software-engineer
-audience: software-engineer
+author: stakeholder
+audience: stakeholder
 workflow: release-management
 ---
 
@@ -39,6 +39,16 @@ gh release list --limit 20
 
 ## Release Process
 
+**Guard**: `git branch --show-current` must output `main`. If not, stop — releases happen from `main` only.
+
+```bash
+git checkout main
+git fetch origin main
+git merge --ff-only origin/main   # fast-forward only; if this fails, main has diverged — resolve first
+```
+
+
+
 ### 0. Read branding
 
 Read `docs/branding.md` if it exists:
@@ -93,11 +103,11 @@ Add at the top. If a release name was generated in Step 0, include it; otherwise
 Run the `update-docs` skill to reflect the newly accepted feature in C4 diagrams and the glossary. This step runs inline — do not commit separately.
 
 Load and execute the full `update-docs` skill now:
-- Update `docs/c4/context.md` (C4 Level 1)
-- Update `docs/c4/container.md` (C4 Level 2, if multi-container)
+- Update `docs/context.md` (C4 Level 1)
+- Update `docs/container.md` (C4 Level 2, if multi-container)
 - Update `docs/glossary.md` (living glossary)
 
-The `living-docs` commit step is **skipped** here — all changed files are staged together with the version bump in step 6.
+The `update-docs` commit step is **skipped** here — all changed files are staged together with the version bump in step 6.
 
 ### 6. Regenerate lockfile and commit version bump
 
@@ -106,7 +116,7 @@ After updating `pyproject.toml`, regenerate the lockfile — CI runs `uv sync --
 ```bash
 uv lock
 git add pyproject.toml <package>/__init__.py CHANGELOG.md uv.lock \
-  docs/c4/context.md docs/c4/container.md docs/glossary.md
+  docs/context.md docs/container.md docs/glossary.md
 git commit -m "chore(release): bump version to v{version}[ - {Release Name}]"
 # Include " - {Release Name}" only if a release name was generated in Step 0; omit otherwise.
 ```
diff --git a/.opencode/skills/implement/SKILL.md b/.opencode/skills/implement/SKILL.md
index 8e1aa37..533370b 100644
--- a/.opencode/skills/implement/SKILL.md
+++ b/.opencode/skills/implement/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: implement
-description: Steps 2-3 — Architecture + TDD Loop, one @id at a time
-version: "3.1"
+description: Step 3 — TDD Loop, one @id at a time
+version: "5.0"
 author: software-engineer
 audience: software-engineer
 workflow: feature-lifecycle
@@ -9,11 +9,11 @@ workflow: feature-lifecycle
 
 # Implement
 
-Steps 2 (Architecture) and 3 (TDD Loop) combined into a single skill. The software-engineer owns both.
+Step 3: RED → GREEN → REFACTOR, one @id at a time. The software-engineer owns this step entirely.
 
 ## When to Use
 
-Load this skill when starting Step 2 (Architecture) after the PO has moved a BASELINED feature to `in-progress/`, or when continuing Step 3 (TDD Loop) for an in-progress feature.
+Load this skill when continuing Step 3 (TDD Loop) for an in-progress feature. Architecture stubs must already exist (created by the system-architect at Step 2).
 
 ## Software-Engineer Quality Gate Priority Order
 
@@ -28,141 +28,18 @@ Design correctness is far more important than lint/pyright/coverage compliance.
 
 ---
 
-## Step 2 — Architecture
-
-### Prerequisites (stop if any fail — escalate to PO)
-
-1. `docs/features/in-progress/` contains exactly one `.feature` file (not just `.gitkeep`). If none exists, **STOP** — update TODO.md `Next:` to `Run @product-owner — move the chosen feature to in-progress/` and stop. Never self-select or move a feature yourself.
-2. The feature file's discovery section has `Status: BASELINED`. If not, escalate to PO — Step 1 is incomplete.
-3. The feature file contains `Rule:` blocks with `Example:` blocks and `@id` tags. If not, escalate to PO — criteria have not been written.
-4. Package name confirmed: read `pyproject.toml` → locate `[tool.setuptools]` → confirm directory exists on disk.
-
-### Package Verification (mandatory — before writing any code)
-
-1. Read `pyproject.toml` → locate `[tool.setuptools]` → record `packages = ["<name>"]`
-2. Confirm directory exists: `ls <name>/`
-3. All new source files go under `<name>/`
-
-**Note on feature file moves**: The PO moves `.feature` files between folders. The software-engineer never moves or edits `.feature` files. Update TODO.md `Source:` path to reflect `in-progress/` once the PO has moved the file.
-
-### Read Phase (all before writing anything)
-
-1. Read `docs/discovery.md` (project-level synthesis changelog) and optionally `docs/discovery_journal.md` (Q&A history for context)
-2. Read `docs/glossary.md` if it exists — use existing domain terms when naming classes, methods, and modules; do not invent synonyms for terms already defined
-3. Read **ALL** `.feature` files in `docs/features/backlog/` (discovery + entities sections)
-4. Read in-progress `.feature` file (full: Rules + Examples + @id)
-5. Read **ALL** existing `.py` files in `<package>/` — understand what already exists before adding anything
-
-### Domain Analysis
-
-From the Domain Model table in `docs/discovery.md` + Rules (Business) in the `.feature` file:
-- **Nouns** → named classes, value objects, aggregates
-- **Verbs** → method names with typed signatures
-- **Datasets** → named types (not bare dict/list)
-- **Bounded Context check**: same word, different meaning across features? → module boundary
-- **Cross-feature entities** → candidate shared domain layer
-
-### Silent Pre-mortem (before writing anything)
-
-> "In 6 months this design is a mess. What mistakes did we make?"
-
-For each candidate class:
-- >2 ivars? → split
-- >1 reason to change? → isolate
-
-For each external dep:
-- Is it behind a Protocol? → if not, add
-
-For each noun:
-- Serving double duty across modules? → isolate
-
-If pattern smell detected, load `skill apply-patterns`.
-
-### Write Stubs into Package
-
-From the domain analysis, write or extend `.py` files in `<package>/`. For each entity:
-
-- **If the file already exists**: add the new class or method signature — do not remove or alter existing code.
-- **If the file does not exist**: create it with the new signatures only.
-
-**Stub rules (strictly enforced):**
-- Method bodies must be `...` — no logic, no conditionals, no imports beyond `typing` and domain types
-- No docstrings — signatures will change; add docstrings after GREEN (lint enforces this at quality gate)
-- No inline comments, no TODO comments, no speculative code
-
-**Example — correct stub style:**
-
-```python
-from dataclasses import dataclass
-from typing import Protocol
-
-
-@dataclass(frozen=True, slots=True)
-class EmailAddress:
-    value: str
-
-    def validate(self) -> None: ...
-
-
-class UserRepository(Protocol):
-    def save(self, user: "User") -> None: ...
-    def find_by_email(self, email: EmailAddress) -> "User | None": ...
-```
-
-**File placement (common patterns, not required names):**
-- `<package>/domain/<noun>.py` — entities, value objects
-- `<package>/domain/service.py` — cross-entity operations
-
-Place stubs where responsibility dictates — do not pre-create `ports/` or `adapters/` folders unless a concrete external dependency was identified in scope. Structure follows domain analysis, not a template.
-
-### Record Architectural Decisions
-
-Append a new dated block to `docs/architecture.md` for each significant decision:
-
-```markdown
-## YYYY-MM-DD — <feature-stem>: <short title>
-
-Decision: <what was decided>
-Reason: <why, one sentence>
-Alternatives considered: <what was rejected and why>
-Feature: <feature-stem>
-```
-
-Only write a block for non-obvious decisions with meaningful trade-offs. Routine YAGNI choices do not need a record.
-
-### Architecture Smell Check (hard gate)
-
-Apply to the stub files just written:
-
-- [ ] No class with >2 responsibilities (SOLID-S)
-- [ ] No behavioural class with >2 instance variables (OC-8; dataclasses, Pydantic models, value objects, and TypedDicts are exempt)
-- [ ] All external deps assigned a Protocol (SOLID-D + Hexagonal) — N/A if no external dependencies identified in scope
-- [ ] No noun with different meaning across modules (DDD Bounded Context)
-- [ ] No missing Creational pattern: repeated construction without Factory/Builder
-- [ ] No missing Structural pattern: type-switching without Strategy/Visitor
-- [ ] No missing Behavioral pattern: state machine or scattered notification without State/Observer
-- [ ] Each ADR consistent with each @id AC — no contradictions
-
-If any check fails: fix the stub files before committing.
-
-### Generate Test Stubs
-
-Run `uv run task test-fast` once. It reads the in-progress `.feature` file, assigns `@id` tags to any untagged `Example:` blocks (writing them back to the `.feature` file), and generates `tests/features/<feature_slug>/<rule_slug>_test.py` — one file per `Rule:` block, one skipped function per `@id`. Verify the files were created, then stage all changes (including any `@id` write-backs to the `.feature` file).
-
-Commit: `feat(<feature-stem>): add architecture and test stubs`
-
----
-
 ## Step 3 — TDD Loop
 
 ### Prerequisites
 
 - [ ] Exactly one .feature `in_progress`. If not present, load `skill select-feature`
+- [ ] On `feat/<stem>` or `fix/<stem>` branch (`git branch --show-current`). If on `main`, load `skill version-control` and create/switch to the branch first
 - [ ] Architecture stubs present in `<package>/` (committed by Step 2)
-- [ ] Read `docs/architecture.md` — understand all architectural decisions before writing any test
+- [ ] Read `docs/system.md` — understand current system structure and constraints
+- [ ] Read in-progress `.feature` file — understand acceptance criteria
 - [ ] Test stub files exist in `tests/features/<feature_slug>/<rule_slug>_test.py` — generated by pytest-beehave at Step 2 end; if missing, re-run `uv run task test-fast` and commit the generated files before entering RED
 
-### Build TODO.md Test List
+### Build Test List
 
 1. List all `@id` tags from in-progress `.feature` file
 2. Order: fewest dependencies first; most impactful within that set
@@ -198,7 +75,7 @@ INNER LOOP
     ├── uv run task test-fast after each individual change
     └── EXIT: test-fast passes; no smells remain
 
-Mark @id completed in TODO.md
+Mark @id completed in FLOW.md Session Log
 Commit when a meaningful increment is green
 ```
 
@@ -219,39 +96,48 @@ All must pass before Self-Declaration.
 
 <!-- This list has exactly 25 items — count before submitting. If your count ≠ 25, you missed one. -->
 
-Communicate verbally to the reviewer. Answer honestly for each principle:
-
-1. YAGNI: no code without a failing test — AGREE/DISAGREE | file:line
-2. YAGNI: no speculative abstractions — AGREE/DISAGREE | file:line
-3. KISS: simplest solution that passes — AGREE/DISAGREE | file:line
-4. KISS: no premature optimization — AGREE/DISAGREE | file:line
-5. DRY: no duplication — AGREE/DISAGREE | file:line
-6. DRY: no redundant comments — AGREE/DISAGREE | file:line
-7. SOLID-S: one reason to change per class — AGREE/DISAGREE | file:line
-8. SOLID-O: open for extension, closed for modification — AGREE/DISAGREE | file:line
-9. SOLID-L: subtypes substitutable — AGREE/DISAGREE | file:line
-10. SOLID-I: no forced unused deps — AGREE/DISAGREE | file:line
-11. SOLID-D: depend on abstractions, not concretions — AGREE/DISAGREE | file:line
-12. OC-1: one level of indentation per method — AGREE/DISAGREE | deepest: file:line
-13. OC-2: no else after return — AGREE/DISAGREE | file:line
-14. OC-3: primitive types wrapped — AGREE/DISAGREE | file:line
-15. OC-4: first-class collections — AGREE/DISAGREE | file:line
-16. OC-5: one dot per line — AGREE/DISAGREE | file:line
-17. OC-6: no abbreviations — AGREE/DISAGREE | file:line
-18. OC-7: ≤20 lines per function, ≤50 per class — AGREE/DISAGREE | longest: file:line
-19. OC-8: ≤2 instance variables per class (behavioural classes only; dataclasses, Pydantic models, value objects, and TypedDicts are exempt) — AGREE/DISAGREE | file:line
-20. OC-9: no getters/setters — AGREE/DISAGREE | file:line
-21. Patterns: no good reason remains to refactor using OOP or Design Patterns — AGREE/DISAGREE | file:line
-22. Patterns: no creational smell — AGREE/DISAGREE | file:line
-23. Patterns: no structural smell — AGREE/DISAGREE | file:line
-24. Patterns: no behavioral smell — AGREE/DISAGREE | file:line
-25. Semantic: tests operate at same abstraction as AC — AGREE/DISAGREE | file:line
+Communicate verbally to the system-architect. Answer honestly for each principle:
+
+As a software-engineer I declare that:
+* 1. YAGNI: no code without a failing test — AGREE/DISAGREE | file:line
+* 2. YAGNI: no speculative abstractions — AGREE/DISAGREE | file:line
+* 3. KISS: simplest solution that passes — AGREE/DISAGREE | file:line
+* 4. KISS: no premature optimization — AGREE/DISAGREE | file:line
+* 5. DRY: no duplication — AGREE/DISAGREE | file:line
+* 6. DRY: no redundant comments — AGREE/DISAGREE | file:line
+* 7. SOLID-S: one reason to change per class — AGREE/DISAGREE | file:line
+* 8. SOLID-O: open for extension, closed for modification — AGREE/DISAGREE | file:line
+* 9. SOLID-L: subtypes substitutable — AGREE/DISAGREE | file:line
+* 10. SOLID-I: no forced unused deps — AGREE/DISAGREE | file:line
+* 11. SOLID-D: depend on abstractions, not concretions — AGREE/DISAGREE | file:line
+* 12. OC-1: one level of indentation per method — AGREE/DISAGREE | deepest: file:line
+* 13. OC-2: no else after return — AGREE/DISAGREE | file:line
+* 14. OC-3: primitive types wrapped — AGREE/DISAGREE | file:line
+* 15. OC-4: first-class collections — AGREE/DISAGREE | file:line
+* 16. OC-5: one dot per line — AGREE/DISAGREE | file:line
+* 17. OC-6: no abbreviations — AGREE/DISAGREE | file:line
+* 18. OC-7: ≤20 lines per function, ≤50 per class — AGREE/DISAGREE | longest: file:line
+* 19. OC-8: ≤2 instance variables per class (behavioural classes only; dataclasses, Pydantic models, value objects, and TypedDicts are exempt) — AGREE/DISAGREE | file:line
+* 20. OC-9: no getters/setters — AGREE/DISAGREE | file:line
+* 21. Patterns: no good reason remains to refactor using OOP or Design Patterns — AGREE/DISAGREE | file:line
+* 22. Patterns: no creational smell — AGREE/DISAGREE | file:line
+* 23. Patterns: no structural smell — AGREE/DISAGREE | file:line
+* 24. Patterns: no behavioral smell — AGREE/DISAGREE | file:line
+* 25. Semantic: tests operate at same abstraction as AC — AGREE/DISAGREE | file:line
 
 A `DISAGREE` answer is not automatic rejection — state the reason and fix before handing off.
 
+### Branch Hygiene (before handoff)
+
+Before signalling completion:
+1. `git status` — working tree must be clean. Commit any remaining changes.
+2. `git branch --show-current` — must be `feat/<stem>` or `fix/<stem>`, never `main`.
+3. `git log main..HEAD --oneline` — must show 1+ commits. If empty, nothing was committed on this branch.
+4. `git push origin $(git branch --show-current)` — all commits must be on origin.
+
 ### Hand off to Step 4 (Verify)
 
-Signal completion to the reviewer. Provide:
+Signal completion to the system-architect. Provide:
 - Feature file path
 - Self-Declaration (communicated verbally, as above)
 - Summary of what was implemented
@@ -368,7 +254,7 @@ If testing through the real entry point is infeasible, escalate to PO to adjust
 
 If during implementation you discover a behavior not covered by existing acceptance criteria:
 - **Do not extend criteria yourself** — escalate to PO
-- Note the gap in TODO.md under `## Next`
+- Note the gap in FLOW.md under `## Next`
 - The PO will decide whether to add a new Example to the `.feature` file
 
 Extra tests in `tests/unit/` are allowed freely (coverage, edge cases, etc.) — these do not need `@id` traceability.
@@ -377,7 +263,7 @@ Extra tests in `tests/unit/` are allowed freely (coverage, edge cases, etc.) —
 
 ## Signature Design
 
-<package> signatures are written during Step 2 (Architecture) and refined during Step 3 (RED). They live directly in the package `.py` files — never in the `.feature` file.
+<package> signatures are written during Step 2 (Architecture) by the system-architect and refined during Step 3 (RED) by the software-engineer. They live directly in the package `.py` files — never in the `.feature` file.
 
 Key rules:
 - Bodies are always `...` in the architecture stub
@@ -402,3 +288,13 @@ class UserRepository(Protocol):
     def save(self, user: "User") -> None: ...
     def find_by_email(self, email: EmailAddress) -> "User | None": ...
 ```
+
+---
+
+## Templates
+
+Templates for architecture files live in the `architect` skill's directory:
+
+- `domain-model.md.template` — `docs/domain-model.md` structure
+- `system.md.template` — `docs/system.md` structure
+- `adr.md.template` — individual ADR file structure
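
The Branch Hygiene checklist added to the implement skill above could be automated roughly as follows. This is a minimal sketch, not part of the skill: the function name `branch_hygiene_problems` is hypothetical, and it assumes the caller has already captured the text output of `git status --porcelain` (used here instead of plain `git status` because its output is empty for a clean tree), `git branch --show-current`, and `git log main..HEAD --oneline`. The push check (item 4) is not covered.

```python
import re

# Branch names the checklist accepts: feat/<stem> or fix/<stem>.
ALLOWED_BRANCH = re.compile(r"^(feat|fix)/.+$")


def branch_hygiene_problems(
    status_porcelain: str, branch: str, log_oneline: str
) -> list[str]:
    """Return one message per failed check; an empty list means ready for handoff.

    Hypothetical helper. Each argument is captured git output:
      status_porcelain: `git status --porcelain`
      branch:           `git branch --show-current`
      log_oneline:      `git log main..HEAD --oneline`
    """
    problems = []
    # Item 1: any porcelain output means uncommitted or untracked changes.
    if status_porcelain.strip():
        problems.append("working tree is not clean")
    # Item 2: the branch must be feat/<stem> or fix/<stem>, never main.
    current = branch.strip()
    if not ALLOWED_BRANCH.match(current):
        problems.append(f"branch {current!r} is not feat/<stem> or fix/<stem>")
    # Item 3: an empty log means nothing was committed on this branch.
    if not log_oneline.strip():
        problems.append("no commits on this branch relative to main")
    return problems
```

Keeping the checks as a pure function over captured output leaves the git invocation to the caller, so the rules themselves can be tested without a repository.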
diff --git a/.opencode/skills/implement/adr.md.template b/.opencode/skills/implement/adr.md.template
new file mode 100644
index 0000000..3670892
--- /dev/null
+++ b/.opencode/skills/implement/adr.md.template
@@ -0,0 +1,23 @@
+# ADR: <short title>
+
+> Architectural Decision Record
+> Written by the system-architect during Step 2 for non-obvious decisions with meaningful trade-offs.
+> Routine YAGNI choices do not need a record.
+
+| Field | Value |
+|-------|-------|
+| **Date** | YYYY-MM-DD |
+| **Feature** | <feature-stem> |
+| **Status** | Proposed / Accepted / Superseded |
+
+## Decision
+<what was decided — one sentence>
+
+## Reason
+<why — one sentence>
+
+## Alternatives Considered
+<what was rejected and why>
+
+## Consequences
+<positive and negative consequences of this decision>
diff --git a/.opencode/skills/implement/domain-model.md.template b/.opencode/skills/implement/domain-model.md.template
new file mode 100644
index 0000000..5df4e1a
--- /dev/null
+++ b/.opencode/skills/implement/domain-model.md.template
@@ -0,0 +1,37 @@
+# Domain Model: <project-name>
+
+> Living reference of code-facing domain entities.
+> Owned by the system-architect. Created and updated at Step 2.
+> The product-owner reads this file to check existing entities during discovery, but never writes to it.
+> Append-only: add new entries at the bottom. Deprecate old entries by moving them to the Deprecated section.
+> Never edit existing live entries — code depends on them.
+
+---
+
+## Entities
+
+| Name | Type | Description | Bounded Context | First Appeared |
+|------|------|-------------|-----------------|----------------|
+| <name> | Entity | <description> | <context> | <feature-stem> |
+| <name> | Value Object | <description> | <context> | <feature-stem> |
+| <name> | Aggregate | <description> | <context> | <feature-stem> |
+
+## Verbs
+
+| Name | Actor | Object | Description | First Appeared |
+|------|-------|--------|-------------|----------------|
+| <verb> | <who> | <what> | <description> | <feature-stem> |
+
+## Relationships
+
+| Subject | Relation | Object | Cardinality | Notes |
+|---------|----------|--------|-------------|-------|
+| <A> | <has / uses / emits> | <B> | <1:1 / 1:N / M:N> | <notes> |
+
+---
+
+## Deprecated
+
+| Name | Type | Deprecated Date | Replaced By | Reason |
+|------|------|-----------------|-------------|--------|
+| <name> | Entity | YYYY-MM-DD | <new name> | <reason> |
diff --git a/.opencode/skills/implement/system.md.template b/.opencode/skills/implement/system.md.template
new file mode 100644
index 0000000..05fb2b6
--- /dev/null
+++ b/.opencode/skills/implement/system.md.template
@@ -0,0 +1,27 @@
+# System Overview: <project-name>
+
+> Current-state description of the production system.
+> Rewritten by the system-architect at Step 2 for each feature cycle.
+> Reviewed by the product-owner at Step 5.
+> Contains only completed features — nothing from backlog or in-progress.
+
+## Summary
+<3–5 sentence description of what the system currently does, who uses it, and its primary boundaries.>
+
+## Actors
+- `<role>` — <description> (from completed features)
+
+## Modules / Components
+- `<module>` — <responsibility> (from completed features and ADRs)
+
+## External Dependencies
+- `<system>` — <purpose> (from ADRs)
+
+## Constraints
+- <system-wide constraint from ADRs or completed features>
+
+## Relevant ADRs
+- ADR-YYYY-MM-DD-<slug> — <one-line summary> (only ADRs affecting current system state)
+
+## Completed Features
+- `<feature-stem>` — <one-line description>
diff --git a/.opencode/skills/refactor/SKILL.md b/.opencode/skills/refactor/SKILL.md
index 4f44369..6d2d5c5 100644
--- a/.opencode/skills/refactor/SKILL.md
+++ b/.opencode/skills/refactor/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: refactor
-description: Safe refactoring protocol for TDD — green bar rule, two-hats discipline, preparatory refactoring, and Fowler catalogue
-version: "1.0"
+description: Safe refactoring protocol for TDD — green bar rule, two-hats discipline, preparatory refactoring, and smell catalogue
+version: "2.0"
 author: software-engineer
 audience: software-engineer
 workflow: feature-lifecycle
@@ -11,7 +11,7 @@ workflow: feature-lifecycle
 
 Load this skill when entering the REFACTOR phase of a TDD cycle, or before starting RED on a new `@id` when preparatory refactoring is needed.
 
-Sources: Fowler *Refactoring* 2nd ed. (2018); Beck *Canon TDD* (2023); Beck *Tidy First?* (2023); Martin *SOLID* (2000); Bay *Object Calisthenics* (2005). See `docs/scientific-research/oop-design.md` and `docs/scientific-research/refactoring-empirical.md`.
+Sources: Fowler *Refactoring* 2nd ed. (2018); Beck *Canon TDD* (2023); Beck *Tidy First?* (2023); Martin *SOLID* (2000); Bay *Object Calisthenics* (2005); Shvets *Refactoring.Guru* (2014–present). See `docs/research/oop-design.md` entries 33–36 and `docs/research/refactoring-empirical.md`.
 
 ---
 
@@ -64,24 +64,59 @@ Beck: *"For each desired change, make the change easy (warning: this may be hard
 
 ### Step 1 — Identify the smell
 
-Run the smell checklist from your Self-Declaration or from the Architecture Smell Check:
+Run the smell checklist from your Self-Declaration or from the Architecture Smell Check.
 
-| Smell | Likely catalogue entry |
-|---|---|
-| Function needs a comment to explain it | Extract Function |
-| Class does two jobs | Extract Class |
-| Method uses another class's data more than its own | Move Function |
-| Same parameter group in multiple signatures | Introduce Parameter Object |
-| Primitive with behaviour (money, email, range) | Replace Primitive with Object |
-| Local variable holds a computed result | Replace Temp with Query |
-| `isinstance` / type-flag conditionals | Replace Conditional with Polymorphism |
-| Multiple functions share a data cluster | Combine Functions into Class |
-| Nested conditions beyond 2 levels | Decompose Conditional / Guard Clauses |
-| Object construction scattered without pattern | Factory Method / Builder |
-| Scattered notification or state transition | Observer / State |
-| Type-switching across callers | Strategy / Visitor |
-
-If pattern smell detected: load `skill apply-patterns` for before/after examples.
+Smell categories follow Shvets *Refactoring.Guru* (2014–present); each smell is paired with the catalogue entries most likely to resolve it.
+
+#### Bloaters — structures grown too large
+
+| Smell | Signal | Likely catalogue entry |
+|---|---|---|
+| Long Method | Method body needs a comment to understand any section | Extract Function, Decompose Conditional |
+| Large Class | Class has too many responsibilities or instance variables | Extract Class, Extract Subclass |
+| Primitive Obsession | Domain concept represented as a raw primitive | Replace Primitive with Object, Introduce Parameter Object |
+| Long Parameter List | Function takes 3+ parameters, or parameter group repeats across signatures | Introduce Parameter Object, Replace Parameter with Query |
+| Data Clumps | Same 2–3 data items always appear together across signatures or fields | Introduce Parameter Object, Extract Class |
+
+#### OO Abusers — misapplied OOP
+
+| Smell | Signal | Likely catalogue entry |
+|---|---|---|
+| Switch Statements | Repeated `if/elif` or match on a type flag across callers | Replace Conditional with Polymorphism, Strategy, State |
+| Temporary Field | Instance variable set only in some code paths; `None` in others | Extract Class, Introduce Null Object |
+| Refused Bequest | Subclass inherits methods/data it does not use or overrides to do nothing | Push Down Method/Field, Replace Inheritance with Delegation |
+| Alternative Classes with Different Interfaces | Two classes do the same thing under different names/signatures | Rename Method, Extract Superclass, unify via Protocol |
+
+#### Change Preventers — changes ripple unexpectedly
+
+| Smell | Signal | Likely catalogue entry |
+|---|---|---|
+| Divergent Change | One class must change for multiple unrelated reasons | Extract Class (split by axis of change) |
+| Shotgun Surgery | One concept change touches many classes | Move Function/Field, Inline Class, combine scattered behavior |
+| Parallel Inheritance Hierarchies | Adding a subclass to one hierarchy forces a new subclass in another | Move Function/Field to flatten or unify hierarchies |
+
+#### Dispensables — dead weight
+
+| Smell | Signal | Likely catalogue entry |
+|---|---|---|
+| Comments | Comment explains *what* or *why* when the code could be self-explanatory | Extract Function, Rename Variable/Function |
+| Duplicate Code | Same logic copied in 2+ places | Extract Function, Pull Up Method, Form Template Method |
+| Lazy Class | Class does too little to justify its existence | Inline Class, Collapse Hierarchy |
+| Data Class | Class holds only fields with getters/setters; no behavior | Move Function into class, Encapsulate Field |
+| Dead Code | Unreachable code, unused variable, never-called function | Delete it |
+| Speculative Generality | Abstractions added "for future use" with no current caller | Inline Class/Function, Remove unused parameters |
+
+#### Couplers — excessive inter-object dependency
+
+| Smell | Signal | Likely catalogue entry |
+|---|---|---|
+| Feature Envy | Method uses another class's data more than its own | Move Function, Extract Function |
+| Inappropriate Intimacy | Class accesses another's private fields or implementation details | Move Function/Field, Extract Class, Replace Inheritance with Delegation |
+| Message Chains | `a.b().c().d()` — navigating a chain of objects | Hide Delegate, Extract Function to encapsulate the chain |
+| Middle Man | Class delegates most of its methods to another class | Inline Class, Remove Middle Man |
+| Incomplete Library Class | External class lacks a needed method | Introduce Foreign Method, Introduce Extension Object |
+
+If a pattern smell is detected, load `skill apply-patterns` for pattern selection guidance.
 
 ### Step 2 — Apply one catalogue entry at a time
 
@@ -113,119 +148,34 @@ Commit (see Commit Discipline below).
 ## Key Catalogue Entries
 
 ### Extract Function
-Pull a cohesive fragment into a named function. Trigger: the fragment needs a comment to explain it.
-
-```python
-# Before
-def process(order):
-    # apply 10% discount
-    order.total = order.total * Decimal("0.9")
-    send_confirmation(order)
-
-# After
-def apply_discount(order: Order) -> None:
-    """Apply the standard 10% discount."""
-    order.total = order.total * Decimal("0.9")
-
-def process(order: Order) -> None:
-    """Process an order."""
-    apply_discount(order)
-    send_confirmation(order)
-```
+Pull a cohesive fragment into a named function.
 
-### Extract Class
-Split a class doing two jobs. Trigger: data cluster + related behaviours that travel together.
-
-```python
-# Before
-@dataclass
-class Order:
-    id: str
-    street: str
-    city: str
-    total: Decimal
-
-# After
-@dataclass(frozen=True, slots=True)
-class Address:
-    """A delivery address."""
-    street: str
-    city: str
-
-@dataclass
-class Order:
-    """An order placed by a customer."""
-    id: str
-    address: Address
-    total: Decimal
-```
+**Trigger**: a fragment needs a comment to explain what it does.
+**Outcome**: the extracted function's name makes the comment unnecessary; the caller reads as a sequence of named steps.
 
-### Introduce Parameter Object
-Replace a recurring parameter group with a value object. Trigger: same 2+ params appear together across multiple signatures.
-
-```python
-# Before
-def summarise(start_date: date, end_date: date) -> Report: ...
-def filter_events(start_date: date, end_date: date) -> list[Event]: ...
-
-# After
-@dataclass(frozen=True, slots=True)
-class DateRange:
-    """An inclusive date range."""
-    start: date
-    end: date
-
-def summarise(period: DateRange) -> Report: ...
-def filter_events(period: DateRange) -> list[Event]: ...
-```
-
-### Replace Primitive with Object
-Elevate a domain primitive to a class with behaviour. Trigger: primitive has validation rules or operations.
+### Extract Class
+Split a class that is doing two jobs.
 
-```python
-# Before
-def send_invoice(email: str) -> None: ...
+**Trigger**: a data cluster (2–3 fields that always travel together) with related behaviour that could be named independently.
+**Outcome**: each class has one reason to change; the new class becomes a value object or a collaborator.
 
-# After
-@dataclass(frozen=True, slots=True)
-class EmailAddress:
-    """A validated email address."""
-    value: str
+### Introduce Parameter Object
+Replace a recurring parameter group with a dedicated object.
 
-    def validate(self) -> None:
-        """Validate the email format.
+**Trigger**: the same 2+ parameters appear together across multiple function signatures.
+**Outcome**: a named type captures the concept; callers are simplified; the object can later carry behaviour.
 
-        Raises:
-            ValueError: if the address has no '@' character.
-        """
-        if "@" not in self.value:
-            raise ValueError(f"Invalid email: {self.value!r}")
+### Replace Primitive with Object
+Elevate a domain concept represented as a raw primitive to its own type.
 
-def send_invoice(email: EmailAddress) -> None: ...
-```
+**Trigger**: a primitive has validation rules, formatting logic, or operations that are repeated at every call site.
+**Outcome**: behaviour moves into the type; callers are protected from invalid states; the type can be named and tested independently.
 
 ### Decompose Conditional / Guard Clauses
-Flatten nested logic to ≤2 levels. Trigger: OC-1 violation or deeply nested `if` chains.
-
-```python
-# Before
-def process(order):
-    if order is not None:
-        if order.total > 0:
-            if order.is_confirmed:
-                ship(order)
-
-# After
-def process(order: Order | None) -> None:
-    """Ship a confirmed order."""
-    if order is None:
-        return
-    if order.total <= 0:
-        return
-    if not order.is_confirmed:
-        return
-    ship(order)
-```
+Flatten nested conditional logic to ≤2 levels.
+
+**Trigger**: OC-1 violation (nesting beyond one indent level per method), or multi-level nested `if` chains.
+**Outcome**: each exit condition is expressed as an early return (guard clause); the happy path is at the left margin; no `else` after `return`.
 
 ---
 
@@ -295,123 +245,36 @@ Before marking the `@id` complete, verify all of the following. Each failed item
 | OC-9 | No getters/setters | `def get_name(self)` / `def set_name(self, v)` |
 
 ### SOLID (Martin 2000)
-| Principle | Check |
-|---|---|
-| **S** — Single Responsibility | Does this class have exactly one reason to change? |
-| **O** — Open/Closed | Can new behavior be added without editing this class? |
-| **L** — Liskov Substitution | Do all subtypes honor the full contract of their base type? |
-| **I** — Interface Segregation | Does every implementor use every method in the Protocol? |
-| **D** — Dependency Inversion | Does domain code depend only on Protocols, not concrete I/O? |
-
-#### SOLID Python signals
-
-**S — Single Responsibility**
-```python
-# WRONG — Report handles both data and formatting
-class Report:
-    def generate(self) -> dict: ...
-    def to_pdf(self) -> bytes: ...    # separate concern
-    def to_csv(self) -> str: ...      # separate concern
-
-# RIGHT
-class Report:
-    def generate(self) -> ReportData: ...
-
-class PdfRenderer:
-    def render(self, data: ReportData) -> bytes: ...
-```
-
-**O — Open/Closed**
-```python
-# WRONG — must edit this function to add a new format
-def export(data: ReportData, fmt: str) -> bytes:
-    if fmt == "pdf": ...
-    elif fmt == "csv": ...
-
-# RIGHT — new formats extend without touching existing code
-class Exporter(Protocol):
-    def export(self, data: ReportData) -> bytes: ...
-```
-
-**L — Liskov Substitution**
-```python
-# WRONG — ReadOnlyFile narrows the contract of File
-class ReadOnlyFile(File):
-    def write(self, content: str) -> None:
-        raise PermissionError  # LSP violation
-
-# RIGHT — separate interfaces
-class ReadableFile(Protocol):
-    def read(self) -> str: ...
-
-class WritableFile(Protocol):
-    def write(self, content: str) -> None: ...
-```
-
-**I — Interface Segregation**
-```python
-# WRONG — Printer forced to implement scan() and fax()
-class Machine(Protocol):
-    def print(self, doc: Document) -> None: ...
-    def scan(self, doc: Document) -> None: ...
-    def fax(self, doc: Document) -> None: ...
-
-# RIGHT
-class Printer(Protocol):
-    def print(self, doc: Document) -> None: ...
-
-class Scanner(Protocol):
-    def scan(self, doc: Document) -> None: ...
-```
-
-**D — Dependency Inversion**
-```python
-# WRONG — domain imports infrastructure directly
-from app.db import PostgresConnection
-
-class OrderRepository:
-    def __init__(self) -> None:
-        self.db = PostgresConnection()
-
-# RIGHT — domain defines the Protocol; infra implements it
-class OrderRepository(Protocol):
-    def find(self, order_id: OrderId) -> Order: ...
-    def save(self, order: Order) -> None: ...
-
-class PostgresOrderRepository:      # in adapters/
-    def find(self, order_id: OrderId) -> Order: ...
-    def save(self, order: Order) -> None: ...
-```
+| Principle | Check | Violation signal |
+|---|---|---|
+| **S** — Single Responsibility | Does this class have exactly one reason to change? | Class handles data + formatting, or business logic + persistence |
+| **O** — Open/Closed | Can new behavior be added without editing this class? | Adding a case requires editing an `if/elif` chain inside the class |
+| **L** — Liskov Substitution | Do all subtypes honor the full contract of their base type? | Subclass raises on an inherited method, or narrows a precondition |
+| **I** — Interface Segregation | Does every implementor use every method in the interface? | Implementors stub out methods they don't need |
+| **D** — Dependency Inversion | Does domain code depend only on abstractions, not concrete I/O? | Domain class directly imports a database, file, or network class |
 
 ### Law of Demeter / Tell, Don't Ask / CQS
 
-**Law of Demeter** — a method should only call methods on: `self`, parameters, objects it creates, direct components (`self.x`).
-- Violation signal: `a.b.c()` — two dots. Ask `a` to do the thing instead: `a.do_thing()`.
-
-**Tell, Don't Ask** — tell objects what to do; don't query state and decide externally.
-```python
-# WRONG
-if order.status == OrderStatus.PENDING:
-    order.status = OrderStatus.CONFIRMED
+**Law of Demeter** — a method should only call methods on: `self`, its parameters, objects it creates, and its direct components.
+- Violation signal: chaining through two or more intermediaries (`a.b().c()`). Ask `a` to do the thing instead of navigating through it.
 
-# RIGHT
-order.confirm()
-```
+**Tell, Don't Ask** — tell objects what to do; don't query their state and decide externally.
+- Violation signal: querying an object's status field, then setting it based on that query from outside the object. Move the decision into the object itself.
 
 **Command-Query Separation** — a method either changes state (command) or returns a value (query), never both.
-- Apply to domain objects. Do not fight stdlib (`list.pop()` is a known violation).
+- Apply to domain objects. Standard library collections are a known exception (e.g., pop-style methods).
 
-### Python Zen (PEP 20) signals
+### Design Clarity Signals
 
-| Zen item | Code implication |
+| Principle | Signal |
 |---|---|
-| Explicit is better than implicit | Explicit return types; explicit Protocol dependencies; no magic |
-| Simple is better than complex | One function, one job; prefer a plain function over a class |
-| Flat is better than nested | OC-1 — one indent level; early returns |
-| Readability counts | OC-6 — no abbreviations; docstrings on every public item |
-| Errors should never pass silently | No bare `except:`; no `except Exception: pass` |
-| In the face of ambiguity, refuse to guess | Raise on invalid input; never silently return a default |
-
-### Type and docstring hygiene
-- [ ] Type hints present on all public signatures
-- [ ] Docstrings present on all public classes and methods
+| Explicit over implicit | Dependencies stated at construction; no hidden side effects or magic initialization |
+| Simple over complex | One function, one job; prefer a plain function over a class when no state is needed |
+| Flat over nested | OC-1 — one indent level per method; early returns over deep nesting |
+| Readability | OC-6 — no abbreviations; public items documented |
+| Errors surface explicitly | Raise on invalid input; never silently swallow errors or return a default that hides failure |
+| No ambiguous defaults | Invalid input raises; callers are never surprised by silent fallbacks |
+
+### Type and documentation hygiene
+- [ ] Type annotations present on all public signatures
+- [ ] Documentation present on all public classes and methods
diff --git a/.opencode/skills/run-session/SKILL.md b/.opencode/skills/run-session/SKILL.md
index 469e864..938f044 100644
--- a/.opencode/skills/run-session/SKILL.md
+++ b/.opencode/skills/run-session/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: run-session
-description: Session start and end protocol — read TODO.md, continue from checkpoint, update and commit
-version: "3.0"
+version: "5.0"
+description: Session start and end protocol — read FLOW.md, auto-detect state, resume from checkpoint, update and commit
 author: software-engineer
 audience: all-agents
 workflow: session-management
@@ -11,34 +11,48 @@ workflow: session-management
 
 Every session starts by reading state. Every session ends by writing state. This makes any agent able to continue from where the last session stopped.
 
-## Session Start
+The single source of state is `FLOW.md` in the project root. It tracks the current feature, branch, detected workflow state, and next action.
+
+## Read Policy
+
+Each agent reads only what is operationally necessary for their current step. Do not read files "for context" unless the step explicitly requires it.
 
-1. Read `TODO.md` — find current feature, current step, and the "Next" line.
-   - If `TODO.md` does not exist, create a basic one:
-     ```markdown
-     # Current Work
+| Agent | Reads |
+|---|---|
+| PO (Step 1) | `FLOW.md`, `scope_journal.md` (resume check), `system.md`, `glossary.md`, `domain-model.md` (read-only, entity check), `docs/post-mortem/` (selective scan), in-progress `.feature` |
+| SA (Step 2) | `FLOW.md`, `system.md`, `glossary.md`, in-progress `.feature`, targeted `.py` files |
+| SE (Step 3) | `FLOW.md`, `system.md`, `glossary.md`, in-progress `.feature`, targeted `.py` files |
+| SA (Step 4) | `FLOW.md`, `system.md`, `glossary.md`, `domain-model.md`, in-progress `.feature`, ADR files referenced in `system.md` |
 
-     No feature in progress.
-     Next: Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.
-     ```
-2. **If you are the PO** and Step 1 (SCOPE) is active: check `docs/discovery_journal.md` for the most recent session block.
+## Session Start
+
+1. **Read `FLOW.md`** — find current feature, current branch, detected status, and the "Next" line.
+   - If `FLOW.md` does not exist, create it from `.opencode/skills/flow/flow.md.template`
+   - If `FLOW.md` exists but is empty or malformed, recreate from template
+2. **Run `detect-state`** — execute the auto-detection rules from `skill flow` to determine the actual workflow state from filesystem and git state.
+   - If detected state differs from `FLOW.md` Status, update `FLOW.md` to match reality
+3. **Check prerequisites** — verify the Prerequisites table in `FLOW.md`. If any are unchecked, stop and report.
+4. **If you are the PO** and Step 1 (SCOPE) is active: check `docs/scope_journal.md` for the most recent session block.
    - If the most recent block has `Status: IN-PROGRESS` → the previous session was interrupted. Resume it before starting a new session: finish updating `.feature` files and `docs/discovery.md`, then mark the block `Status: COMPLETE`.
-3. If a feature is active at Step 2–5, read:
+5. If a feature is active at Step 2–5, read:
    - `docs/features/in-progress/<feature-stem>.feature` — feature file (Rules + Examples + @id)
-   - `docs/discovery.md` — project-level synthesis changelog (for context)
-4. Run `git status` — understand what is committed vs. what is not
-5. Confirm scope: you are working on exactly one step of one feature
+   - `docs/system.md` — current system overview and constraints
+6. Run `git status` — understand what is committed vs. what is not
+7. **If Step 2–5 is active**: run `git branch --show-current` and verify:
+   - **SA at Step 2 or Step 4**: must be on `feat/<stem>` or `fix/<stem>`. If on `main`, stop — load `skill version-control` and create the branch first.
+   - **SE at Step 3**: must be on `feat/<stem>` or `fix/<stem>`. If on `main`, stop — load `skill version-control` and create/switch to the branch first.
+8. Confirm scope: you are working on exactly one step of one feature
 
-**If TODO.md says "No feature in progress":**
+**If FLOW.md Status is [IDLE] or says "No feature in progress":**
 
 - **PO**: Load `skill select-feature` — it guides you through scoring and selecting the next BASELINED backlog feature. You must verify the feature has `Status: BASELINED` before moving it to `in-progress/`. Only you may move it.
-- **Software-engineer or reviewer**: Update TODO.md `Next:` line to `Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.` Then **stop**. Never self-select a feature. Never move a `.feature` file.
+- **Software-engineer or system-architect**: Update `FLOW.md` `Next:` line to `Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.` Then **stop**. Never self-select a feature. Never create, edit, or move a `.feature` file.
 
 ## Session End
 
-1. Update TODO.md:
-   - Mark completed criteria `[x]`
-   - Mark in-progress criteria `[~]`
+1. Update `FLOW.md`:
+   - Set Status to the detected state
+   - Append to Session Log with timestamp, agent, state, and action
    - Update the "Next" line with one concrete action
 2. Commit any uncommitted work (even WIP):
    ```bash
@@ -51,98 +65,56 @@ Every session starts by reading state. Every session ends by writing state. This
 
 When a step completes within a session:
 
-1. Update TODO.md to reflect the completed step before doing any other work.
-2. Commit the TODO.md update:
+1. Update `FLOW.md` to reflect the completed step before doing any other work.
+2. Commit the `FLOW.md` update:
    ```bash
-   git add TODO.md
+   git add FLOW.md
    git commit -m "chore: complete step <N> for <feature-stem>"
    ```
 3. Only then begin the next step (in a new session where possible — see Rule 4).
 
-## TODO.md Format
+## FLOW.md Format
 
 ```markdown
-# Current Work
+# FLOW Protocol
+
+## Current Feature
+**Feature**: <feature-stem> | [NONE]
+**Branch**: <branch-name> | [NONE]
+**Status**: <state>
 
-Feature: <feature-stem>
-Step: <1-5> (<step name>)
-Source: docs/features/in-progress/<feature-stem>.feature
+## Prerequisites
+- [x] Agents: product-owner, system-architect, software-engineer
+- [x] Skills: run-session, define-scope, architect, implement, verify, version-control
+- [x] Tools: uv, git
+- [x] Directories: docs/features/, docs/adr/
 
-## Progress
-- [x] `@id:<hex>`: <description>
-- [~] `@id:<hex>`: <description>  ← IN PROGRESS
-- [ ] `@id:<hex>`: <description>
+## Session Log
+**YYYY-MM-DD HH:MM** — <agent> — <state> — <action>
 
 ## Next
 Run @<agent-name> — <one concrete action>
 ```
 
 **"Next" line format**: Always prefix with `Run @<agent-name>` so the human knows exactly which agent to invoke. Agent names are defined in `AGENTS.md` — use the name exactly as listed there. Examples:
-- `Run @<software-engineer-agent> — implement @id:a1b2c3d4 (Step 3 RED)`
-- `Run @<software-engineer-agent> — load skill implement and begin Step 2 (Architecture) for <feature-stem>`
-- `Run @<reviewer-agent> — verify feature <feature-stem> at Step 4`
-- `Run @<product-owner-agent> — pick next BASELINED feature from backlog`
-- `Run @<product-owner-agent> — accept feature <feature-stem> at Step 5`
-
-**Source path by step:**
-- Step 1: `Source: docs/features/backlog/<feature-stem>.feature`
-- Steps 2–4: `Source: docs/features/in-progress/<feature-stem>.feature`
-- Step 5: `Source: docs/features/completed/<feature-stem>.feature`
-
-Status markers:
-- `[ ]` — not started
-- `[~]` — in progress
-- `[x]` — complete
-- `[-]` — cancelled/skipped
-
-When no feature is active:
-```markdown
-# Current Work
-
-No feature in progress.
-Next: Run @<product-owner-agent> — load skill select-feature and pick the next BASELINED feature from backlog.
-```
-
-## Step 3 (TDD Loop) Cycle-Aware TODO Format
-
-During Step 3 (TDD Loop), TODO.md **must** include a `## Cycle State` block to track Red-Green-Refactor progress.
-
-```markdown
-# Current Work
-
-Feature: <feature-stem>
-Step: 3 (TDD Loop)
-Source: docs/features/in-progress/<feature-stem>.feature
-
-## Cycle State
-Test: `@id:<hex>` — <description>
-Phase: RED | GREEN | REFACTOR
-
-## Progress
-- [x] `@id:<hex>`: <description>
-- [~] `@id:<hex>`: <description>          ← in progress (see Cycle State)
-- [ ] `@id:<hex>`: <description>          ← next
-
-## Next
-<One actionable sentence>
-```
-
-### Phase Transitions
-
-- Move from `RED` → `GREEN` when the test fails with a real assertion
-- Move from `GREEN` → `REFACTOR` when the test passes
-- Move from `REFACTOR` → mark `@id` complete in `## Progress` when test-fast passes
+- `Run @software-engineer — implement @id:a1b2c3d4 (Step 3 RED)`
+- `Run @system-architect — load skill architect and begin Step 2 (Architecture) for <feature-stem>`
+- `Run @system-architect — verify feature <feature-stem> at Step 4`
+- `Run @product-owner — pick next BASELINED feature from backlog`
+- `Run @product-owner — accept feature <feature-stem> at Step 5`
 
 ## Rules
 
-1. Never skip reading TODO.md at session start
-2. Never end a session without updating TODO.md
+1. Never skip reading `FLOW.md` at session start
+2. Never end a session without updating `FLOW.md`
 3. Never leave uncommitted changes — commit as WIP if needed
 4. One step per session where possible; do not start Step N+1 in the same session as Step N
 5. The "Next" line must be actionable enough that a fresh AI can execute it without asking questions
-6. During Step 3, always update `## Cycle State` when transitioning between RED/GREEN/REFACTOR phases
-7. When a step completes, update TODO.md and commit **before** any further work
-8. Output is minimal-signal: findings, status, decisions, blockers, Next: line only. Use the fewest, least verbose tool calls necessary. Report results, not process. No redundant prose.
+6. When a step completes, update `FLOW.md` and commit **before** any further work
+7. The Session Log is append-only — never delete old entries
+8. If `FLOW.md` is missing, create it from `.opencode/skills/flow/flow.md.template` before doing any other work
+9. If detected state differs from `FLOW.md` Status, trust the detected state and update `FLOW.md`
+10. Output is minimal-signal: findings, status, decisions, blockers, Next: line only. Use the fewest, least verbose tool calls necessary. Report results, not process. No redundant prose.
 
 ## Output Style
 
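Illustration for the Session End steps in `run-session/SKILL.md` above: a minimal Python sketch of the three `FLOW.md` updates (set Status, append to the Session Log, replace the Next line). It assumes the layout shown in the "FLOW.md Format" template, with `## Session Log` directly followed by `## Next`. The function name `end_session` and its parameters are hypothetical and are not defined anywhere in this patch.

```python
import re
from datetime import datetime


def end_session(flow: str, agent: str, state: str, action: str,
                next_action: str, now: datetime) -> str:
    """Return FLOW.md text updated per the Session End steps (sketch)."""
    # Log entry format copied from the template:
    # **YYYY-MM-DD HH:MM** — <agent> — <state> — <action>
    entry = f"**{now:%Y-%m-%d %H:%M}** — {agent} — {state} — {action}"

    # 1. Set Status to the detected state (first Status line only).
    flow = re.sub(r"^\*\*Status\*\*: .*$", lambda _: f"**Status**: {state}",
                  flow, count=1, flags=re.MULTILINE)

    # 2. Append to the Session Log. Assumption: the log is the section
    #    immediately before "## Next", so the new entry goes at its end.
    head, sep, _old_next = flow.partition("\n## Next\n")
    if not sep:
        raise ValueError("FLOW.md has no '## Next' section")
    head = head.rstrip("\n") + "\n" + entry + "\n"

    # 3. Replace the old Next content with one concrete action.
    return f"{head}\n## Next\n{next_action}\n"
```

Existing log entries are only ever followed by new ones, never rewritten, which matches the append-only rule. A real implementation would probably also handle a missing file by copying the template first, as Rule 8 requires.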
diff --git a/.opencode/skills/select-feature/SKILL.md b/.opencode/skills/select-feature/SKILL.md
index 6e36630..f6583ff 100644
--- a/.opencode/skills/select-feature/SKILL.md
+++ b/.opencode/skills/select-feature/SKILL.md
@@ -11,13 +11,13 @@ workflow: feature-lifecycle
 
 Select the next most valuable, unblocked feature from the backlog using a lightweight scoring model grounded in flow economics and dependency analysis.
 
-**Research basis**: Weighted Shortest Job First (WSJF) — Reinertsen *Principles of Product Development Flow* (2009); INVEST criteria — Wake (2003); Kano model — Kano (1984); Dependency analysis — PMBOK Critical Path Method. See `docs/scientific-research/requirements-elicitation.md`.
+**Research basis**: Weighted Shortest Job First (WSJF) — Reinertsen *Principles of Product Development Flow* (2009); INVEST criteria — Wake (2003); Kano model — Kano (1984); Dependency analysis — PMBOK Critical Path Method. See `docs/research/requirements-elicitation.md`.
 
 **Core principle**: Cost of Delay ÷ Duration. Features with high user value and low implementation effort should start first. Features blocked by unfinished work should wait regardless of value.
 
 ## When to Use
 
-Load this skill when `TODO.md` says "No feature in progress" — before moving any feature to `in-progress/`.
+Load this skill when `FLOW.md` Status is [IDLE] — before moving any feature to `in-progress/`.
 
 ## Step-by-Step
 
@@ -40,7 +40,7 @@ Read each `.feature` file in `docs/features/backlog/`. Check its discovery secti
 
 **IMPORTANT**
 
-**NEVER move a feature to `in-progress/` unless its discovery section has `Status: BASELINED`**
+**NEVER move a feature to `in-progress/` unless its discovery section has `Status: BASELINED`. Only the PO may move `.feature` files — no other agent ever creates, edits, or moves them.**
 
 ### 3. Score Each Candidate
 
@@ -80,20 +80,30 @@ Ties: prefer higher Value (user impact matters more than effort optimization).
 
 If all BASELINED features have Dependency=1: stop and resolve the blocking dependency first — select and complete the depended-upon feature.
 
-### 5. Move and Update TODO.md
+### 5. Move and Update FLOW.md
 
 ```bash
 mv docs/features/backlog/<name>.feature docs/features/in-progress/<name>.feature
 ```
 
-Update `TODO.md`:
+Update `FLOW.md`:
 
 ```markdown
-# Current Work
+# FLOW Protocol
 
-Feature: <name>
-Step: 1 (SCOPE) or 2 (ARCH) — whichever is next
-Source: docs/features/in-progress/<name>.feature
+## Current Feature
+**Feature**: <name>
+**Branch**: [NONE]
+**Status**: [STEP-1-DISCOVERY] or [STEP-2-READY] — whichever is next
+
+## Prerequisites
+- [x] Agents: product-owner, system-architect, software-engineer
+- [x] Skills: run-session, define-scope, architect, implement, verify, version-control
+- [x] Tools: uv, git
+- [x] Directories: docs/features/, docs/adr/
+
+## Session Log
+**YYYY-MM-DD HH:MM** — product-owner — [IDLE] → [<next-state>] — selected <name> from backlog
 
 ## Next
 Run @<agent-name> — <first concrete action for this feature>
@@ -101,12 +111,12 @@ Run @<agent-name> — <first concrete action for this feature>
 
 - If the feature has no `Rule:` blocks yet → Step 1 (SCOPE): `Run @product-owner — load skill define-scope and write stories`
 - If the feature has `Rule:` blocks but no `@id` Examples → Step 1 Stage 2 Step B (Criteria): `Run @product-owner — load skill define-scope and write acceptance criteria`
-- If the feature has `@id` Examples → Step 2 (ARCH): `Run @software-engineer — load skill implement and write architecture stubs`
+- If the feature has `@id` Examples → Step 2 (ARCH): `Run @system-architect — load skill architect and write architecture stubs`
 
 ### 6. Commit
 
 ```bash
-git add docs/features/in-progress/<name>.feature TODO.md
+git add docs/features/in-progress/<name>.feature FLOW.md
 git commit -m "chore: select <name> as next feature"
 ```
 
@@ -118,5 +128,5 @@ git commit -m "chore: select <name> as next feature"
 - [ ] WSJF scores filled for all candidates
 - [ ] Selected feature has highest WSJF among Dependency=0 candidates
 - [ ] Feature moved to `in-progress/`
-- [ ] `TODO.md` updated with correct Step and `Next` line
+- [ ] `FLOW.md` updated with correct Status and `Next` line
 - [ ] Changes committed
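Illustration for the selection rule in `select-feature/SKILL.md` above: a small Python sketch of "highest WSJF among Dependency=0 candidates, ties go to higher Value". The scoring tables are outside the changed lines, so the `Candidate` fields, the integer scales, and the formula `value / effort` (standing in for Cost of Delay ÷ Duration) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One backlog feature with hypothetical scoring fields."""
    name: str
    value: int           # user value, higher is better (assumed scale)
    effort: int          # implementation effort, higher is longer (assumed scale)
    dependency: int      # 0 = unblocked, 1 = blocked by unfinished work
    baselined: bool = True  # discovery section has Status: BASELINED

    @property
    def wsjf(self) -> float:
        # Assumed reading of "Cost of Delay / Duration".
        return self.value / self.effort


def select_next(candidates: list[Candidate]) -> Optional[Candidate]:
    """Pick the best unblocked BASELINED feature, or None if all are blocked."""
    eligible = [c for c in candidates if c.baselined and c.dependency == 0]
    if not eligible:
        # Everything is blocked: the skill says to resolve the dependency first.
        return None
    # Highest WSJF wins; on a tie the higher Value wins.
    return max(eligible, key=lambda c: (c.wsjf, c.value))
```

Returning `None` rather than the best blocked feature keeps the "blocked work waits regardless of value" rule explicit at the call site.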
diff --git a/.opencode/skills/update-docs/SKILL.md b/.opencode/skills/update-docs/SKILL.md
index 2537243..05b9af5 100644
--- a/.opencode/skills/update-docs/SKILL.md
+++ b/.opencode/skills/update-docs/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: update-docs
-description: Generate and update C4 architecture diagrams and the living glossary from existing project docs
-version: "1.0"
+description: Generate and update C4 architecture diagrams, living glossary, and system overview from existing project docs
+version: "2.0"
 author: product-owner
 audience: product-owner
 workflow: feature-lifecycle
@@ -22,13 +22,14 @@ The glossary is a secondary artifact derived from the code, the domain model, an
 
 | Document | Created/Updated by | Inputs read |
 |---|---|---|
-| `docs/c4/context.md` | `update-docs` skill (PO) | `docs/discovery.md`, `docs/features/completed/` |
-| `docs/c4/container.md` | `update-docs` skill (PO) | `docs/architecture.md`, `docs/features/completed/` |
-| `docs/glossary.md` | `update-docs` skill (PO) | `docs/discovery.md`, `docs/glossary.md` (existing), `docs/architecture.md`, `docs/features/completed/` |
-| `docs/architecture.md` | SE only (Step 2) | — |
+| `docs/context.md` | `update-docs` skill (PO) | `docs/discovery.md`, `docs/features/completed/` |
+| `docs/container.md` | `update-docs` skill (PO) | `docs/adr/ADR-*.md`, `docs/features/completed/` |
+| `docs/glossary.md` | `update-docs` skill (PO) | `docs/domain-model.md`, `docs/glossary.md` (existing), `docs/adr/ADR-*.md`, `docs/features/completed/` |
+| `docs/system.md` | SA (Step 2), PO reviews (Step 5) | `docs/discovery.md`, `docs/adr/ADR-*.md`, `docs/features/completed/` |
 | `docs/discovery.md` | PO only (Step 1) | — |
+| `docs/domain-model.md` | SA only (Step 2) | — |
 
-**Never edit `docs/architecture.md` or `docs/discovery.md` in this skill.** Those files are append-only by their respective owners. This skill reads them; it never writes to them.
+**Never edit `docs/adr/ADR-*.md`, `docs/discovery.md`, or `docs/domain-model.md` in this skill.** Those files are owned by their respective agents. This skill reads them; it never writes to them.
 
 ---
 
@@ -36,97 +37,52 @@ The glossary is a secondary artifact derived from the code, the domain model, an
 
 Read in this order:
 
-1. `docs/discovery.md` — project scope, domain model (nouns/verbs), feature list per session
-2. `docs/features/completed/` — all completed `.feature` files (full text: Rules, Examples, Constraints)
-3. `docs/architecture.md` — all architectural decisions (containers, modules, protocols, external deps)
-4. `docs/c4/` — existing C4 diagrams if they exist (update, do not replace from scratch)
-5. `docs/glossary.md` — existing glossary if it exists (extend, never remove existing entries)
-6. `docs/branding.md` — if present, read `Visual > Primary color` and `Accent color`. Apply to C4 Mermaid diagrams via `%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '<primary-hex>', 'lineColor': '<accent-hex>'}}}%%`. If absent or fields blank, use Mermaid defaults.
+1. `docs/discovery.md` — project scope, feature list per session
+2. `docs/domain-model.md` — all entities, nouns, verbs, bounded contexts
+3. `docs/features/completed/` — all completed `.feature` files (full text: Rules, Examples, Constraints)
+4. `docs/adr/` — all architectural decision files (containers, modules, protocols, external deps)
+5. `docs/context.md` and `docs/container.md` — existing C4 diagrams if they exist (update, do not replace from scratch)
+6. `docs/glossary.md` — existing glossary if it exists (extend, never remove existing entries)
+7. `docs/branding.md` — if present, read `Visual > Primary color` and `Accent color`. Apply to C4 Mermaid diagrams via `%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '<primary-hex>', 'lineColor': '<accent-hex>'}}}%%`. If absent or fields blank, use Mermaid defaults.
 
 Identify from the read phase:
 
 - **Actors** — named human roles from feature `As a <role>` clauses and discovery Scope section
 - **External systems** — any system outside the package boundary named in features or architecture decisions
-- **Containers** — deployable/runnable units identified in `docs/architecture.md` (Hexagonal adapters, CLIs, services)
-- **Key domain terms** — all nouns from `docs/discovery.md` Domain Model tables, plus any terms defined in `docs/architecture.md` decisions
+- **Containers** — deployable/runnable units identified in ADR files (Hexagonal adapters, CLIs, services)
+- **Key domain terms** — all nouns and verbs from `docs/domain-model.md`, plus any terms defined in ADR decisions
 
 ---
 
 ## Step 2 — Update C4 Context Diagram (Level 1)
 
-File: `docs/c4/context.md`
+File: `docs/context.md`
 
 The Context diagram answers: **who uses the system and what external systems does it interact with?**
 
-Use Mermaid `C4Context` syntax. Template:
-
-```markdown
-# C4 — System Context
-
-> Last updated: YYYY-MM-DD
-> Source: docs/discovery.md, docs/features/completed/
-
-```mermaid
-C4Context
-  title System Context — <project-name>
-
-  Person(actor1, "<role name>", "<one-line description from feature As a clauses>")
-
-  System(system, "<project-name>", "<3–5 word system description from discovery.md Scope>")
-
-  System_Ext(ext1, "<external system name>", "<what it provides>")
-
-  Rel(actor1, system, "<verb from When clause>")
-  Rel(system, ext1, "<verb from architecture.md decision>")
-```
-```
+Use Mermaid `C4Context` syntax. Use the template in `context.md.template` in this skill's directory.
 
 Rules:
 - One `Person(...)` per distinct actor found in completed feature files
-- One `System_Ext(...)` per external dependency identified in `docs/architecture.md` decisions
+- One `System_Ext(...)` per external dependency identified in ADR files
 - Relationships (`Rel`) use verb phrases from feature `When` clauses or architecture decision labels
-- If no external systems are identified in `docs/architecture.md`, omit `System_Ext` entries
+- If no external systems are identified in ADRs, omit `System_Ext` entries
 - If the file already exists: update only — add new actors/systems, update relationship labels. Never remove an existing entry unless the feature it came from has been explicitly superseded
 
 ---
 
 ## Step 3 — Update C4 Container Diagram (Level 2)
 
-File: `docs/c4/container.md`
+File: `docs/container.md`
 
 The Container diagram answers: **what are the major runnable/deployable units and how do they communicate?**
 
-Only generate this diagram if `docs/architecture.md` contains at least one decision identifying a distinct container boundary (e.g., a CLI entry point separate from a library, a web server, a background worker, an external service adapter). If the project is a single-container system, note this in the file and skip the diagram body.
-
-Use Mermaid `C4Container` syntax. Template:
+Only generate this diagram if `docs/adr/` contains at least one decision identifying a distinct container boundary (e.g., a CLI entry point separate from a library, a web server, a background worker, an external service adapter). If the project is a single-container system, note this in the file and skip the diagram body.
 
-```markdown
-# C4 — Container Diagram
-
-> Last updated: YYYY-MM-DD
-> Source: docs/architecture.md
-
-```mermaid
-C4Container
-  title Container Diagram — <project-name>
-
-  Person(actor1, "<role>", "")
-
-  System_Boundary(sys, "<project-name>") {
-    Container(container1, "<name>", "<technology>", "<responsibility from architecture.md>")
-    Container(container2, "<name>", "<technology>", "<responsibility>")
-  }
-
-  System_Ext(ext1, "<external system>", "")
-
-  Rel(actor1, container1, "<action>")
-  Rel(container1, container2, "<protocol or method>")
-  Rel(container1, ext1, "<protocol>")
-```
-```
+Use Mermaid `C4Container` syntax. Use the template in `container.md.template` in this skill's directory.
 
 Rules:
-- Container names and responsibilities come directly from `docs/architecture.md` decisions — do not invent them
+- Container names and responsibilities come directly from ADR decisions — do not invent them
 - Technology labels come from `pyproject.toml` dependencies when identifiable (e.g., "Python / fire CLI", "Python / FastAPI")
 - If the file already exists: update incrementally — do not regenerate from scratch
 
@@ -138,37 +94,16 @@ File: `docs/glossary.md`
 
 The glossary answers: **what does each domain term mean in this project's context?**
 
-### Format
-
-```markdown
-# Glossary — <project-name>
-
-> Living document. Updated after each completed feature by the `update-docs` skill.
-> Source: docs/discovery.md (Domain Model), docs/features/completed/, docs/architecture.md
-
----
-
-## <Term>
-
-**Type:** Noun | Verb | Domain Event | Concept | Role | External System
-
-**Definition:** <one sentence, plain English, no jargon>
-
-**Bounded context:** <name of the bounded context where this term is defined; required when the project has more than one bounded context; omit only for single-context projects>
-
-**First appeared:** <YYYY-MM-DD discovery session or feature name>
-
----
-```
+Use the template in `glossary.md.template` in this skill's directory.
 
 ### Rules
 
-- Extract all nouns and verbs from every `### Domain Model` table in `docs/discovery.md`
+- Extract all entities and verbs from `docs/domain-model.md`
 - Extract all roles from `As a <role>` clauses in completed `.feature` files
-- Extract all external system names from `docs/architecture.md` decisions
+- Extract all external system names from ADR decisions
 - Extract any term defined or clarified in architectural decision `Reason:` fields
 - **Do not remove existing glossary entries** — if a term's meaning has changed, add a `**Superseded by:**` line pointing to the new entry and write a new entry
-- **Every term must have a traceable source** — completed feature files or `docs/architecture.md` decisions. If a term appears in sources but is never defined, write `Definition: Term appears in [source] but has not been explicitly defined.` Do not invent a definition.
+- **Every term must have a traceable source** — completed feature files or ADR decisions. If a term appears in sources but is never defined, write `Definition: Term appears in [source] but has not been explicitly defined.` Do not invent a definition.
 - Terms are sorted alphabetically within the file
 
 ### Merge with existing glossary
@@ -204,11 +139,21 @@ docs(update-docs): refresh C4 diagrams and glossary
 
 - [ ] Read all source files before writing anything (including `docs/branding.md` if present)
 - [ ] Context diagram reflects all actors from completed feature files
-- [ ] Context diagram reflects all external systems from `docs/architecture.md`
-- [ ] Container diagram present only if multi-container architecture confirmed in `docs/architecture.md`
-- [ ] Glossary contains all nouns and verbs from `docs/discovery.md` Domain Model tables
+- [ ] Context diagram reflects all external systems from ADR files
+- [ ] Container diagram present only if multi-container architecture confirmed in ADR files
+- [ ] Glossary contains all entities and verbs from `docs/domain-model.md`
 - [ ] No existing glossary entry removed
-- [ ] Every new term has a traceable source in completed feature files or `docs/architecture.md`; no term is invented
-- [ ] No edits made to `docs/architecture.md` or `docs/discovery.md`
+- [ ] Every new term has a traceable source in completed feature files or ADRs; no term is invented
+- [ ] No edits made to ADR files, `docs/discovery.md`, or `docs/domain-model.md`
 - [ ] If standalone: committed with `docs(update-docs): ...` message
 - [ ] If called from release: files staged but not committed (release process commits)
+
+---
+
+## Templates
+
+All templates for files written by this skill live in this skill's directory:
+
+- `context.md.template` — `docs/context.md` structure
+- `container.md.template` — `docs/container.md` structure
+- `glossary.md.template` — `docs/glossary.md` entry format
diff --git a/.opencode/skills/update-docs/container.md.template b/.opencode/skills/update-docs/container.md.template
new file mode 100644
index 0000000..cea7bad
--- /dev/null
+++ b/.opencode/skills/update-docs/container.md.template
@@ -0,0 +1,24 @@
+# C4 — Container Diagram
+
+> Last updated: YYYY-MM-DD
+> Source: docs/adr/ADR-*.md
+
+```mermaid
+C4Container
+  title Container Diagram — <project-name>
+
+  Person(actor1, "<role>", "")
+
+  System_Boundary(sys, "<project-name>") {
+    Container(container1, "<name>", "<technology>", "<responsibility from ADR>")
+    Container(container2, "<name>", "<technology>", "<responsibility>")
+  }
+
+  System_Ext(ext1, "<external system>", "")
+
+  Rel(actor1, container1, "<action>")
+  Rel(container1, container2, "<protocol or method>")
+  Rel(container1, ext1, "<protocol>")
+```
+
+> Note: Only generate this diagram if `docs/adr/` contains at least one decision identifying a distinct container boundary. If the project is a single-container system, state that here and skip the diagram body.
diff --git a/.opencode/skills/update-docs/context.md.template b/.opencode/skills/update-docs/context.md.template
new file mode 100644
index 0000000..bab9611
--- /dev/null
+++ b/.opencode/skills/update-docs/context.md.template
@@ -0,0 +1,18 @@
+# C4 — System Context
+
+> Last updated: YYYY-MM-DD
+> Source: docs/discovery.md, docs/features/completed/
+
+```mermaid
+C4Context
+  title System Context — <project-name>
+
+  Person(actor1, "<role name>", "<one-line description from feature As a clauses>")
+
+  System(system, "<project-name>", "<3–5 word system description from discovery.md Scope>")
+
+  System_Ext(ext1, "<external system name>", "<what it provides>")
+
+  Rel(actor1, system, "<verb from When clause>")
+  Rel(system, ext1, "<verb from ADR decision>")
+```
diff --git a/.opencode/skills/update-docs/glossary.md.template b/.opencode/skills/update-docs/glossary.md.template
new file mode 100644
index 0000000..32676e5
--- /dev/null
+++ b/.opencode/skills/update-docs/glossary.md.template
@@ -0,0 +1,18 @@
+# Glossary — <project-name>
+
+> Living document. Updated after each completed feature by the `update-docs` skill.
+> Source: docs/domain-model.md, docs/features/completed/, docs/adr/ADR-*.md
+
+---
+
+## <Term>
+
+**Type:** Noun | Verb | Domain Event | Concept | Role | External System
+
+**Definition:** <one sentence, plain English, no jargon>
+
+**Bounded context:** <name of the bounded context where this term is defined; required when the project has more than one bounded context; omit only for single-context projects>
+
+**First appeared:** <YYYY-MM-DD discovery session or feature name>
+
+---
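Illustration for the glossary rules in `update-docs/SKILL.md` above: a minimal Python sketch of a merge that never removes an existing entry, uses the skill's placeholder text when a term is found but never defined, and keeps terms in alphabetical order. The function name `merge_glossary` and the shape of its inputs are hypothetical. The `**Superseded by:**` handling for changed meanings is deliberately left out.

```python
from typing import Optional

# Placeholder wording taken from the skill's "traceable source" rule.
PLACEHOLDER = "Term appears in {source} but has not been explicitly defined."


def merge_glossary(
    existing: dict[str, str],
    found: dict[str, tuple[str, Optional[str]]],
) -> dict[str, str]:
    """Merge newly found terms into an existing glossary (sketch).

    existing: term -> definition already in docs/glossary.md
    found:    term -> (source, definition or None) extracted this run
    """
    merged = dict(existing)  # existing entries are never removed or overwritten
    for term, (source, definition) in found.items():
        if term in merged:
            continue  # keep the established definition
        # Never invent a definition: fall back to the placeholder sentence.
        merged[term] = definition or PLACEHOLDER.format(source=source)
    # Terms are sorted alphabetically, ignoring case.
    return dict(sorted(merged.items(), key=lambda item: item[0].lower()))
```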
diff --git a/.opencode/skills/verify/SKILL.md b/.opencode/skills/verify/SKILL.md
index 5359d1c..836791f 100644
--- a/.opencode/skills/verify/SKILL.md
+++ b/.opencode/skills/verify/SKILL.md
@@ -1,23 +1,23 @@
 ---
 name: verify
 description: Step 4 — run all verification commands, review code quality, and produce a written report
-version: "4.0"
-author: reviewer
-audience: reviewer
+version: "6.0"
+author: system-architect
+audience: system-architect
 workflow: feature-lifecycle
 ---
 
 # Verify
 
-This skill guides the reviewer through Step 4: independent verification that the feature works correctly and meets quality standards. The output is a written report with a clear APPROVED or REJECTED decision.
+This skill guides the system-architect through Step 4: adversarial verification that the feature works correctly and respects the architecture designed in Step 2. The output is a written report with a clear APPROVED or REJECTED decision.
 
-**Your default hypothesis is that the code is broken despite passing automated checks. Your job is to find the failure mode. If you cannot find one after thorough investigation, APPROVE. If you find one, REJECTED.**
+**Your default hypothesis is that the code is broken despite passing automated checks. You designed the architecture; you know what should have been preserved. Your job is to find the failure mode. If you cannot find one after thorough investigation, APPROVE. If you find one, REJECTED.**
 
 **Every PASS/FAIL cell must have evidence.** Empty evidence = UNCHECKED = REJECTED.
 
-**You never move `.feature` files.** After producing an APPROVED report: update TODO.md `Next:` to `Run @product-owner — accept feature <name> at Step 5.` then stop. The PO accepts the feature and moves the file.
+**You never move, create, or edit `.feature` files.** After producing an APPROVED report: update FLOW.md `Next:` to `Run @product-owner — accept feature <name> at Step 5.` then stop. The PO accepts the feature and moves the file.
 
-The reviewer produces one written report (see template below) that includes: all gate results, the SE Self-Declaration Audit, the **Reviewer Stance Declaration**, and the final APPROVED/REJECTED verdict. Do not start until the software-engineer has committed all work and communicated the Self-Declaration verbally in the handoff message.
+The system-architect produces one written report (see template below) that includes: all gate results, the SE Self-Declaration Audit, the **Architect Review Stance Declaration**, and the final APPROVED/REJECTED verdict. Do not start until the software-engineer has committed all work and communicated the Self-Declaration verbally in the handoff message.
 
 ## When to Use
 
@@ -30,9 +30,13 @@ Load this skill when the software-engineer signals Step 3 complete and hands off
 Read `docs/features/in-progress/<name>.feature`. Extract:
 - All `@id` tags and their Example titles from `Rule:` blocks
 - The interaction model (if the feature involves user interaction)
-- The architectural decisions in `docs/architecture.md` relevant to this feature
+- The current-state overview in `docs/system.md`
+- `docs/domain-model.md` — verify naming consistency of new classes/methods against existing entities
+- `docs/glossary.md` — verify domain terms are used correctly
 - The software-engineer's Self-Declaration (communicated verbally in the handoff message)
 
+Only read specific ADR files if `docs/system.md` references them as relevant to this feature.
+
 ### 2. pyproject.toml Gate
 
 ```bash
@@ -41,7 +45,27 @@ git diff main -- pyproject.toml
 
 Any change → REJECT immediately. The software-engineer must revert and get stakeholder approval.
 
-### 3. Check Commit History
+### 3. Branch Gate
+
+```bash
+git branch --show-current
+```
+
+- Must output `feat/<stem>` or `fix/<stem>`. If `main` → REJECT immediately — the SE is working on the wrong branch.
+
+```bash
+git log main..HEAD --oneline
+```
+
+- Must show 1+ commits. If empty → REJECT — nothing was committed on this branch.
+
+```bash
+git merge-tree $(git merge-base HEAD main) HEAD main
+```
+
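+- No conflict markers (`<<<<<<<`) in the output = clean merge possible. Conflict markers present = conflicts exist → REJECT — the SE must resolve conflicts on the feature branch before handoff. (The three-argument form also prints changes that merge cleanly, so non-empty output alone does not indicate a conflict.)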
+
+### 4. Check Commit History
 
 ```bash
 git log --oneline -20
@@ -53,16 +77,16 @@ Verify:
 - No "fix tests", "wip", "temp" commits
 - No uncommitted changes: `git status` should be clean
 
-### 4. Production-Grade Gate
+### 5. Production-Grade Gate
 
-Run before code review. If any row is FAIL, stop immediately with REJECTED.
+Run before semantic review. If any row is FAIL, stop immediately with REJECTED.
 
 | Check | How to check | PASS | FAIL | Fix |
 |---|---|---|---|---|
 | App exits cleanly | `timeout 10s uv run task run` | Exit 0 or non-124 | Exit 124 (timeout/hang) | Fix the hang |
 | Output changes when input changes | Run app, change an input or condition, observe output | Output changes accordingly | Output is static | Implement real logic |
 
-### 5. Self-Declaration Audit
+### 6. Self-Declaration Audit
 
 **Completeness check (hard gate — REJECT if failed)**: Count the numbered items in the SE's Self-Declaration. The template in `implement/SKILL.md` has exactly 25 items numbered 1–25. If the count is not 25, or any number in the sequence 1–25 is missing, REJECT immediately — do not proceed to item-level audit.
 
@@ -77,11 +101,11 @@ For every **DISAGREE** claim:
 - If the justification is weak, incomplete, or a best-practice alternative exists that the SE did not consider: REJECT with the specific alternative stated.
 - If there is no justification: REJECT.
 
-Undeclared violations found during code review → REJECT.
+Undeclared violations found during semantic review → REJECT.
 
-### 6. Code Review
+### 7. Code Review
 
-Read the source files changed in this feature. **Do this before running lint/static-check/test** — if code review finds a design problem, commands will need to re-run after the fix anyway.
+Read the source files changed in this feature. **Do this before running lint/static-check/test** — if semantic review finds a design problem, commands will need to re-run after the fix anyway.
 
 **Stop on first failure category — do not accumulate issues.**
 
@@ -102,7 +126,17 @@ Read the source files changed in this feature. **Do this before running lint/sta
 | Functions ≤ 20 lines | Count lines | ≤ 20 | > 20 | Extract helper |
 | Classes ≤ 50 lines | Count lines | ≤ 50 | > 50 | Split class |
 
-#### 6c. SOLID — any FAIL → REJECTED
+#### 6c. Naming Consistency — any FAIL → REJECTED
+
+| Check | How to check | PASS | FAIL |
+|---|---|---|---|
+| Classes match domain model | New class names appear in `docs/domain-model.md` or are justified | Yes | No |
+| Methods match glossary | New method names use terms from `docs/glossary.md` | Yes | No |
+| No invented synonyms | Same concept uses same name everywhere | Yes | No |
+
+If a new name is genuinely needed (not in domain model or glossary), the SE should have noted it in the handoff summary or in `docs/discovery.md`. If no justification exists, REJECT.
+
+#### 6d. SOLID — any FAIL → REJECTED
 
 | Principle | Why it matters | What to check | How to check |
 |---|---|---|---|
@@ -112,11 +146,11 @@ Read the source files changed in this feature. **Do this before running lint/sta
 | ISP | Fat interfaces force unused methods | No Protocol forces stub implementations | Check for NotImplementedError |
 | DIP | Concrete I/O makes unit testing impossible | High-level depends on abstractions | Check domain imports no I/O/DB |
 
-#### 6d. Object Calisthenics — any FAIL → REJECTED
+#### 6e. Object Calisthenics — any FAIL → REJECTED
 
 Load `skill apply-patterns` and apply the full OC checklist (9 rules). Record a PASS/FAIL with `file:line` evidence for each rule. Rules 1 and 7 (nesting and entity size) share thresholds with 6b above.
 
-#### 6e. Design Patterns — any FAIL → REJECTED
+#### 6f. Design Patterns — any FAIL → REJECTED
 
 | Code smell | Pattern missed | How to check |
 |---|---|---|
@@ -126,7 +160,7 @@ Load `skill apply-patterns` and apply the full OC checklist (9 rules). Record a
 | External dep without Protocol | Repository/Adapter | Check dep injection |
 | 0 domain classes, many functions | Missing domain model | Count classes vs functions |
 
-#### 6f. Tests — any FAIL → REJECTED
+#### 6g. Tests — any FAIL → REJECTED
 
 | Check | How to check | PASS | FAIL |
 |---|---|---|---|
@@ -138,7 +172,7 @@ Load `skill apply-patterns` and apply the full OC checklist (9 rules). Record a
 | Function naming | Matches `test_<feature_slug>_<8char_hex>` | All match | Mismatch |
 | Hypothesis tests have `@slow` | Read every `@given` for `@slow` marker | All present | Any missing |
 
-#### 6g. Code Quality — any FAIL → REJECTED
+#### 6h. Code Quality — any FAIL → REJECTED
 
 | Check | How to check | PASS | FAIL |
 |---|---|---|---|
@@ -147,7 +181,7 @@ Load `skill apply-patterns` and apply the full OC checklist (9 rules). Record a
 | Public functions have type hints | Read signatures | All annotated | Missing |
 | Public functions have docstrings | Read source | Google-style | Missing |
 
-### 7. Run Verification Commands
+### 8. Run Verification Commands
 
 ```bash
 uv run task lint
@@ -159,13 +193,13 @@ Expected for each: exit 0, no errors. Record exact output on failure.
 
 If a command fails, stop and REJECT immediately. Do not run subsequent commands.
 
-### 8. Interactive Verification
+### 9. Interactive Verification
 
 If the feature involves user interaction: run the app, provide real input, verify output changes.
 
 Record what input was given and what output was observed.
 
-### 9. Write the Report
+### 10. Write the Report
 
 ```markdown
 ## Step 4 Verification Report — <feature-stem>
@@ -175,6 +209,13 @@ Record what input was given and what output was observed.
 |---|---|---|
 | No changes from main | PASS / FAIL | |
 
+### Branch Gate
+| Check | Result | Notes |
+|---|---|---|
+| On feat/<stem> or fix/<stem> | PASS / FAIL | |
+| Commits ahead of main | PASS / FAIL | |
+| No merge conflicts with main | PASS / FAIL | |
+
 ### Production-Grade Gate
 | Check | Result | Notes |
 |---|---|---|
@@ -188,6 +229,13 @@ Record what input was given and what output was observed.
 | uv run task static-check | PASS / FAIL | |
 | uv run task test | PASS / FAIL | |
 
+### Naming Consistency
+| Check | Result | Notes |
+|---|---|---|
+| Classes match domain model | PASS / FAIL | |
+| Methods match glossary | PASS / FAIL | |
+| No invented synonyms | PASS / FAIL | |
+
 ### Self-Declaration Audit
 | # | Claim | SE Claims | Reviewer Verdict | Evidence |
 |---|-------|-----------|------------------|----------|
@@ -217,19 +265,17 @@ Record what input was given and what output was observed.
 | 24 | Patterns: no behavioral smell | AGREE/DISAGREE | PASS/FAIL | |
 | 25 | Semantic: tests operate at same abstraction as AC | AGREE/DISAGREE | PASS/FAIL | |
 
-### Reviewer Stance Declaration
+### Architect Review Stance Declaration
 
 Write this block **before** the Decision. Every `DISAGREE` must include an inline explanation. A `DISAGREE` with no explanation auto-forces `REJECTED`.
 
-```markdown
-## Reviewer Stance Declaration
-As a reviewer I declare:
+As a system-architect I declare:
 * Adversarial: I actively tried to find a failure mode, not just confirm passing — AGREE/DISAGREE | note:
+* Architecture preservation: I verified that stubs, Protocols, and ADR decisions from Step 2 were respected — AGREE/DISAGREE | violations:
 * Manual trace: I traced at least one execution path manually beyond automated output — AGREE/DISAGREE | path:
 * Boundary check: I checked the boundary conditions and edge cases of every Rule — AGREE/DISAGREE | gaps:
 * Semantic read: I read each test against its AC and confirmed it tests the right observable behavior — AGREE/DISAGREE | mismatches:
 * Independence: my verdict was not influenced by how much effort has already been spent — AGREE/DISAGREE
-```
 
 ### Decision
 **APPROVED** — all gates passed, no undeclared violations
@@ -241,5 +287,3 @@ OR
 **If APPROVED**: Run `@product-owner` — accept the feature at Step 5.
 **If REJECTED**: Run `@software-engineer` — apply the fixes listed above, re-run quality gate, update Self-Declaration, then signal Step 4 again.
 ```
-
-
diff --git a/.opencode/skills/version-control/SKILL.md b/.opencode/skills/version-control/SKILL.md
new file mode 100644
index 0000000..7f67230
--- /dev/null
+++ b/.opencode/skills/version-control/SKILL.md
@@ -0,0 +1,216 @@
+---
+name: version-control
+description: Git branching, merge safety, and commit hygiene for feature development
+version: "1.0"
+author: software-engineer
+audience: software-engineer
+workflow: git-management
+---
+
+# Version Control
+
+This skill governs all Git operations during feature development. The software-engineer owns branch creation, commit hygiene, merging to `main`, and post-mortem branch management.
+
+## Git Safety Protocol (read first — never violate)
+
+These rules are absolute. Violating them risks destroying shared history or losing work.
+
+- **No force push**: `git push --force` and `git push --force-with-lease` are forbidden.
+- **No history rewrite on pushed branches**: After a branch has been pushed to `origin`, do not `git rebase -i`, `git commit --amend`, or `git reset --hard` on it. These commands rewrite history that others may have fetched.
+- **Use `git revert` to undo**: If a commit on a pushed branch must be undone, create a new revert commit. This appends history safely.
+- **No commits directly to `main`**: All feature work happens on branches. `main` receives code only via `--no-ff` merge from an approved feature branch.
+
+---
+
+## Branch Lifecycle
+
+### Normal Feature Flow
+
+```
+main ──●────────────────────────────●─────►
+        \                          /
+         \── feat/<stem> ──●──●──●/
+```
+
+1. **Create** from latest `main`
+2. **Develop** all commits on the branch
+3. **Merge** back to `main` with `--no-ff` after Step 5 acceptance
+
+### Post-Mortem Fix Flow
+
+```
+main ──●─────●───────────────────────●─────►
+        \   /                        /
+         \ /                        /
+          ● (start commit)         /
+           \── fix/<stem> ──●──●──●/
+```
+
+1. **Find** the feature's original start commit
+2. **Branch** `fix/<stem>` from that commit
+3. **Commit post-mortem** as the first commit on the new branch
+4. **Redo** Steps 2–5 on `fix/<stem>`
+5. **Merge** back to `main` with `--no-ff`
+
+---
+
+## 1. Create Feature Branch
+
+Run at the start of Step 2 (before the system-architect writes stubs).
+
+```bash
+# Ensure you are on main and it is up to date
+git branch --show-current   # must output: main
+git fetch origin main
+git merge --ff-only origin/main   # fast-forward only; if this fails, main has diverged — escalate
+
+# Create and switch to feature branch
+git checkout -b feat/<feature-stem>
+
+# Push the branch to origin (establishes tracking)
+git push -u origin feat/<feature-stem>
+```
+
+**Branch naming**:
+- `feat/<feature-stem>` — new feature
+- `fix/<feature-stem>` — post-mortem restart of a failed feature
+- `docs/<scope>` — documentation-only changes
+- `chore/<scope>` — tooling, deps, CI
+
+**If the fast-forward fails**: `git merge --ff-only` refuses to run when your local `main` contains commits that are not on `origin/main`, i.e. the two have diverged. Escalate to the PO or SA; do not resolve by merging or rebasing on your own.
+
+---
+
+## 2. Commit Hygiene
+
+Every commit on a feature branch must follow conventional commits:
+
+```
+<type>(<scope>): <description>
+
+Types: feat, fix, test, refactor, chore, docs, perf, ci
+```
+
+**Forbidden commit messages** (reject immediately if you are tempted to use them):
+- `wip`, `temp`, `fix tests`, `oops`, `try again`, `asdf`
+- Any commit without a type prefix
+
+**Commit early, commit often**: A feature branch with 10 small, well-described commits is better than 1 giant commit. But do not commit broken code (tests must pass at each commit during Step 3).
+
+---
+
+## 3. Branch Verification
+
+Run before every session start and before every handoff.
+
+```bash
+# Verify you are on the correct branch
+git branch --show-current   # expect: feat/<feature-stem> or fix/<feature-stem>
+
+# Verify working tree is clean
+git status   # expect: "nothing to commit, working tree clean"
+
+# Verify branch is ahead of main (has commits)
+git log main..HEAD --oneline   # expect: 1+ commits listed
+```
+
+**If any check fails**:
+- Wrong branch → `git checkout feat/<feature-stem>` (or create it if missing)
+- Dirty working tree → commit or stash before continuing
+- No commits ahead of main → you have not started work on this branch
+
+---
+
+## 4. Merge Feature Branch to Main
+
+Run after PO acceptance (Step 5). This is the only way code enters `main`.
+
+```bash
+# Ensure feature branch is clean and all commits are pushed
+git status   # must be clean
+git push origin feat/<feature-stem>
+
+# Switch to main and update it
+git checkout main
+git fetch origin main
+git merge --ff-only origin/main
+
+# Check for merge conflicts before the real merge
+git merge-tree --write-tree HEAD feat/<feature-stem>   # requires Git 2.38+
+# Exit status 0 = clean merge; exit status 1 = conflicts. Resolve conflicts on the feature branch first.
+
+# Merge with --no-ff to preserve feature boundary
+git merge --no-ff feat/<feature-stem> -m "feat(<scope>): merge <feature-stem> to main"
+
+# Push main
+git push origin main
+
+# Delete the feature branch (optional, but recommended)
+git branch -d feat/<feature-stem>
+git push origin --delete feat/<feature-stem>
+```
+
+**Why `--no-ff`**: Fast-forward merges erase the feature boundary from history. With `--no-ff`, the merge commit groups all feature commits together, making the feature revertible as a single unit.
+
+---
+
+## 5. Post-Mortem Branch
+
+Run when a feature fails acceptance and the PO restarts it at Step 2.
+
+```bash
+# Find the feature's original start commit
+# The start commit is the commit on main from which the feature branch was created.
+# It is the parent of the first commit on the old feature branch.
+git log --all --grep="feat(<feature-stem>)" --oneline
+# Or, if the branch still exists:
+git log --reverse main..feat/<feature-stem> --oneline   # first line = first branch commit; its parent is the start commit
+
+# Checkout the start commit and create fix branch
+git checkout -b fix/<feature-stem> <start-commit-sha>
+
+# Commit the post-mortem as the first commit on the new branch
+git add docs/post-mortem/YYYY-MM-DD-<feature-stem>-<keyword>.md
+git commit -m "docs(post-mortem): root cause for <feature-stem> <keyword>"
+
+# Push the fix branch
+git push -u origin fix/<feature-stem>
+```
+
+The system-architect then begins Step 2 on `fix/<feature-stem>`, reading the post-mortem as input. All subsequent work (stubs, tests, implementation) happens on this branch. It merges to `main` with `--no-ff` after acceptance.
+
+**Old feature branch**: Keep it for reference until the fix branch is merged. Do not delete it prematurely — it contains the history the SA may need to consult.
+
+---
+
+## 6. Conflict Detection
+
+Before merging a feature branch to `main`, check if `main` has diverged since the branch was created.
+
+```bash
+# Check if main has new commits not in the feature branch
+git log feat/<feature-stem>..origin/main --oneline
+# If output is non-empty, main has diverged.
+
+# Preview the merge without touching files
+git merge-tree --write-tree main feat/<feature-stem>   # requires Git 2.38+
+# Exit status 0 = clean merge. Exit status 1 = conflicts exist.
+```
+
+**If conflicts exist**: Resolve them on the feature branch before attempting merge to `main`.
+
+```bash
+git checkout feat/<feature-stem>
+git merge main   # resolve conflicts, commit the merge
+git push origin feat/<feature-stem>
+```
+
+Then retry the merge to `main`.
+
+---
+
+## Reference
+
+- Pro Git, Scott Chacon & Ben Straub (free online: git-scm.com/book)
+- Git Cheat Sheet (git-scm.com/cheatsheets)
+- A successful Git branching model, Vincent Driessen (nvie.com/posts/a-successful-git-branching-model/)
diff --git a/AGENTS.md b/AGENTS.md
index 17d4278..ffb5e9d 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -10,40 +10,64 @@ Features flow through 5 steps with a WIP limit of 1 feature at a time. The files
 - `docs/features/completed/<feature-stem>.feature` — accepted and shipped features
 
 ```
-STEP 1: SCOPE          (product-owner)  → discovery + Gherkin stories + criteria
-STEP 2: ARCH           (software-engineer)      → read all features + existing package files, write domain stubs (signatures only, no bodies); decisions appended to docs/architecture.md
-STEP 3: TDD LOOP       (software-engineer)      → RED → GREEN → REFACTOR, one @id at a time
-STEP 4: VERIFY         (reviewer)       → run all commands, review code
-STEP 5: ACCEPT         (product-owner)  → demo, validate, move .feature to completed/ (PO only)
+STEP 1: SCOPE          (product-owner)     → discovery + Gherkin stories + criteria
+STEP 2: ARCH           (system-architect)  → branch from main; read system.md + glossary.md + in-progress feature + targeted package files; write domain stubs; create/update domain-model.md; significant decisions as docs/adr/ADR-YYYY-MM-DD-<slug>.md; system.md rewritten
+STEP 3: TDD LOOP       (software-engineer) → RED → GREEN → REFACTOR, one @id at a time
+STEP 4: VERIFY         (system-architect)  → run all commands, review code against architecture
+STEP 5: ACCEPT         (product-owner)     → demo, validate, SE merges branch to main with --no-ff, move .feature to completed/ (PO only)
 ```
 
-**PO picks the next feature from backlog. Software-engineer never self-selects.**
+### Branch Model
 
-**Verification is adversarial.** The reviewer's job is to try to break the feature, not to confirm it works. The default hypothesis is "it might be broken despite green checks; prove otherwise."
+All feature work happens on branches. `main` is the single source of truth and receives code only via `--no-ff` merge from an approved feature branch.
+
+**Normal flow**:
+1. SE creates `feat/<stem>` from latest `main` at Step 2 start
+2. All commits live on `feat/<stem>` through Steps 2–4
+3. After PO acceptance (Step 5), SE merges `feat/<stem>` to `main` with `--no-ff`
+4. SE deletes the feature branch
+
+**Post-mortem flow** (failed feature restart):
+1. Find the feature's original start commit
+2. SE creates `fix/<stem>` from that commit
+3. Post-mortem is committed as the first commit on `fix/<stem>`
+4. Steps 2–5 rerun on `fix/<stem>`, then merge to `main` with `--no-ff`
+
+**Git Safety Protocol** (absolute — never violate):
+- No force push (`git push --force` forbidden)
+- No history rewrite on pushed branches (no `rebase -i`, `commit --amend`, `reset --hard` after push)
+- Use `git revert` to undo changes on shared history
+- No commits directly to `main`
+
+**Closed loop**: SA designs → SE builds → SA reviews. The same mind that designed the architecture verifies it. No context loss.
+
+**PO picks the next feature from backlog. No agent self-selects.**
+
+**Verification is adversarial.** The system-architect's job is to try to break the feature, not to confirm it works. The default hypothesis is "it might be broken despite green checks; prove otherwise."
 
 ## Roles
 
 - **Product Owner (PO)** — AI agent. Interviews the stakeholder, writes discovery docs, Gherkin features, and acceptance criteria. Accepts or rejects deliveries. **Sole owner of all `.feature` file moves** (backlog → in-progress before Step 2; in-progress → completed after Step 5 acceptance).
 - **Stakeholder** — Human. Answers PO's questions, provides domain knowledge, approves PO syntheses to confirm discovery is complete.
-- **Software Engineer** — AI agent. Architecture, test bodies, implementation, git. Never edits or moves `.feature` files. Escalates spec gaps to PO. If no `.feature` file is in `in-progress/`, stops and escalates to PO.
-- **Reviewer** — AI agent. Adversarial verification. Reports spec gaps to PO. Never moves `.feature` files. After APPROVED report, stops and escalates to PO for Step 5.
+- **System Architect (SA)** — AI agent. Designs architecture, writes domain stubs, records decisions in ADRs, and verifies implementation respects those decisions. Owns `docs/domain-model.md`, `docs/system.md`, and `docs/adr/ADR-*.md`. Never edits or moves `.feature` files. Escalates spec gaps to PO.
+- **Software Engineer (SE)** — AI agent. Implements everything: test bodies, production code, releases. Owns all `.py` files under the package. Never edits or moves `.feature` files. Escalates spec gaps to PO. If no `.feature` file is in `in-progress/`, stops and escalates to PO.
 
 ## Feature File Chain of Responsibility
 
-`.feature` files are owned exclusively by the PO. **No other agent ever moves or edits them.**
+`.feature` files are owned exclusively by the PO. **No other agent ever moves, creates, or edits them.**
 
 | Transition | Who | When |
 |---|---|---|
 | `backlog/` → `in-progress/` | PO only | Before Step 2 begins; only if `Status: BASELINED` |
 | `in-progress/` → `completed/` | PO only | After Step 5 acceptance |
 
-**If an agent (SE or reviewer) finds no `.feature` in `in-progress/`**: update TODO.md with the correct `Next:` escalation line and stop. Never self-select a backlog feature.
+**If an agent (SE or SA) finds no `.feature` in `in-progress/`**: update FLOW.md with the correct `Next:` escalation line and stop. Never self-select a backlog feature.
 
 ## Agents
 
 - **product-owner** — defines scope (Stage 1 Discovery + Stage 2 Specification), picks features, accepts deliveries
-- **software-engineer** — architecture, tests, code, git, releases (Steps 2-3 + release)
-- **reviewer** — runs commands and reviews code at Step 4, produces APPROVED/REJECTED report
+- **system-architect** — architecture and domain design (Step 2), adversarial technical review (Step 4)
+- **software-engineer** — TDD loop, implementation, tests, code, git, releases (Step 3 + release)
 - **designer** — creates and updates visual assets (SVG banners, logos) and maintains `docs/branding.md`
 - **setup-project** — one-time setup to initialize a new project from this template
 
@@ -54,16 +78,19 @@ STEP 5: ACCEPT         (product-owner)  → demo, validate, move .feature to com
 | `run-session` | all agents | every session |
 | `select-feature` | product-owner | between features (idle state) |
 | `define-scope` | product-owner | 1 |
-| `implement` | software-engineer | 2, 3 |
-| `apply-patterns` | software-engineer | 2, 3 (on-demand, when GoF pattern needed) |
+| `architect` | system-architect | 2 |
+| `implement` | software-engineer | 3 |
+| `apply-patterns` | system-architect, software-engineer | 2, 3 (on-demand, when GoF pattern needed) |
 | `refactor` | software-engineer | 3 (REFACTOR phase + preparatory refactoring) |
-| `verify` | reviewer | 4 |
+| `verify` | system-architect | 4 |
 | `check-quality` | software-engineer | pre-handoff (redirects to `verify`) |
-| `create-pr` | software-engineer | 5 |
-| `git-release` | software-engineer | 5 (after acceptance) |
-| `update-docs` | product-owner | 5 (after acceptance) + on stakeholder demand |
+| `version-control` | software-engineer | Step 2 (branch creation), Step 5 (merge to main), post-mortem branches |
+| `create-pr` | system-architect | post-acceptance |
+| `git-release` | stakeholder | post-acceptance |
+| `update-docs` | product-owner | post-acceptance + on stakeholder demand |
 | `design-colors` | designer | branding, color, WCAG compliance |
 | `design-assets` | designer | SVG asset creation and updates |
+| `flow` | all agents | every session — workflow state machine, auto-detection, prerequisites |
 | `create-skill` | software-engineer | meta |
 | `create-agent` | human-user | meta |
 
@@ -77,21 +104,21 @@ Step 1 has two stages:
 
 ### Stage 1 — Discovery (PO + stakeholder, iterative)
 
-Discovery is a continuous process. Sessions happen whenever scope needs to be established or refined — for a new project, new features, or new information. Every session follows the same structure:
+Discovery follows a block structure per session. See `skill define-scope` for the full protocol.
 
-**Session question order:**
-1. **General** (5Ws + Success + Failure + Out-of-scope) — first session only, if the journal doesn't exist yet
-2. **Cross-cutting** — behavior groups, bounded contexts, integration points, lifecycle events
-3. **Per-feature** — one feature at a time; extract entities from `docs/discovery.md` Domain Model; gap-finding with CIT, Laddering, CI Perspective Change
+**Block A — Session Start**: Resume check (if `IN-PROGRESS`), read `domain-model.md` (existing entities), declare scope.
 
-**Real-time split rule**: if the PO detects >2 concerns or >8 candidate Examples for a feature during per-feature questions, split immediately — record the split in the journal, create stub `.feature` files, continue questions for both in the same session.
+**Block B — General & Cross-cutting**: 5Ws, behavioral groups, bounded contexts. Active listening + reconciliation against `glossary.md` and `domain-model.md`.
 
-**After questions (PO alone, in order):**
-1. Append answered Q&A (in groups) to `docs/discovery_journal.md` — only answered questions
-2. Rewrite `.feature` description for each feature touched — others stay unchanged
-3. Append session synthesis block to `docs/discovery.md` — LAST, after all `.feature` updates
+**Block C — Feature Discovery (per feature)**: Detailed questions, pre-mortem, create/update `.feature` files.
 
-**Session status**: the journal session header begins with `Status: IN-PROGRESS` (written before questions). Updated to `Status: COMPLETE` after all writes. If a session is interrupted, the next agent detects `IN-PROGRESS` and resumes the pending writes before starting a new session.
+**Block D — Session Close**: Append Q&A to `scope_journal.md`, update `glossary.md`, append synthesis to `discovery.md`, regression check on completed features, mark `COMPLETE`.
+
+**Key rules**:
+- PO owns `scope_journal.md`, `discovery.md`, `glossary.md`, and `.feature` files
+- PO reads `domain-model.md` but never writes to it — entity suggestions go in `discovery.md` for SA formalization at Step 2
+- Real-time split rule: >2 concerns or >8 candidate Examples → split immediately
+- Completed feature touched and changed → move to `backlog/`
 
 **Baselining**: PO writes `Status: BASELINED (YYYY-MM-DD)` in the `.feature` file when the stakeholder approves that feature's discovery and the decomposition check passes.
 
@@ -113,22 +140,37 @@ Commit: `feat(criteria): write acceptance criteria for <name>`
 
 When a defect is reported:
 1. **PO** adds a `@bug` Example to the relevant `Rule:` in the `.feature` file and moves (or keeps) the feature in `backlog/` for normal scheduling.
-2. **SE** handles the bug when the feature is selected for development (standard Step 2–3 flow): implements the specific `@bug`-tagged test in `tests/features/<feature_slug>/` and also writes a `@given` Hypothesis property test in `tests/unit/` covering the whole class of inputs.
+2. **SA** handles Step 2 (architecture) and **SE** handles Step 3 (TDD loop) when the feature is selected for development. The SE implements the specific `@bug`-tagged test in `tests/features/<feature_slug>/` and also writes a `@given` Hypothesis property test in `tests/unit/` covering the whole class of inputs.
 3. Both tests are required. SE follows the normal TDD loop (Step 3).
 
+### Acceptance Failure & Restart
+
+If the stakeholder reports failure **after the PO has attempted Step 5 acceptance**:
+1. **PO does not move the `.feature` file to `completed/`**. Ensure it remains in `in-progress/`.
+2. **Team compiles a compact post-mortem** (`docs/post-mortem/YYYY-MM-DD-<feature-stem>-<keyword>.md`, max 15 lines, process-level root cause).
+3. **SE creates a fix branch** from the feature's original start commit: `git checkout -b fix/<stem> <start-sha>`. The post-mortem is committed as the first commit on this branch.
+4. **PO scans `docs/post-mortem/`** and selects relevant files by matching `<feature-stem>` or `<failure-keyword>`.
+5. **PO reads selected post-mortems**, then resets FLOW.md Status to [STEP-2-ARCH] with context.
+6. **SA restarts Step 2** on `fix/<stem>`, reading relevant post-mortems as input. The same feature re-enters the ARCH step.
+7. After acceptance, SE merges `fix/<stem>` to `main` with `--no-ff`.
+
+Post-mortems are append-only, never edited. If a failure mode recurs, write a new file referencing the old one.
+
 ## Filesystem Structure
 
 ```
 docs/
-  discovery_journal.md                ← raw Q&A, PO appends after every session
-  discovery.md                        ← synthesis changelog, PO appends after every session
-  architecture.md                     ← all architectural decisions, SE appends after Step 2
-  glossary.md                         ← living glossary, PO updates via update-docs skill
+  scope_journal.md                    ← raw Q&A, PO appends after every session
+  discovery.md                        ← session synthesis changelog, PO appends after every session
+  domain-model.md                     ← living domain model, SA creates/updates at Step 2, PO reads only
+  adr/                                ← one file per decision: ADR-YYYY-MM-DD-<slug>.md, SA creates at Step 2
+  system.md                           ← current-state overview (completed features only), SA rewrites at Step 2, PO reviews at Step 5
+  glossary.md                         ← living glossary, PO updates after each session
   branding.md                         ← project identity, colors, release naming, wording (designer owns)
   assets/                             ← logo.svg, banner.svg, and other visual assets (designer owns)
-  c4/
-    context.md                        ← C4 Level 1 diagram, PO updates via update-docs skill
-    container.md                      ← C4 Level 2 diagram, PO updates via update-docs skill
+  context.md                          ← C4 Level 1 diagram, PO updates via update-docs skill
+  container.md                        ← C4 Level 2 diagram, PO updates via update-docs skill (if multi-container)
+  post-mortem/                        ← compact post-mortems, PO-owned, append-only
   features/
     backlog/<feature-stem>.feature    ← narrative + Rules + Examples
     in-progress/<feature-stem>.feature
@@ -139,6 +181,8 @@ tests/
     <rule_slug>_test.py               ← one per Rule: block, software-engineer-written
   unit/
     <anything>_test.py                ← software-engineer-authored extras (no @id traceability)
+
+FLOW.md                               ← workflow state tracker (feature, branch, status, session log, next action)
 ```
 
 Tests in `tests/unit/` are software-engineer-authored extras not covered by any `@id` criterion. Any test style is valid — plain `assert` or Hypothesis `@given`. Use Hypothesis when the test covers a **property** that holds across many inputs (mathematical invariants, parsing contracts, value object constraints). Use plain pytest for specific behaviors or single edge cases discovered during refactoring.
@@ -155,7 +199,7 @@ tests/features/<feature_slug>/<rule_slug>_test.py
 
 ### Stub Format
 
-Stubs are auto-generated by pytest-beehave. The SE triggers generation at Step 2 end by running `uv run task test-fast`. pytest-beehave reads the in-progress `.feature` file and creates one skipped function per `@id`:
+Stubs are auto-generated by pytest-beehave. The SA triggers generation at Step 2 end by running `uv run task test-fast`. pytest-beehave reads the in-progress `.feature` file and creates one skipped function per `@id`:
 
 ```python
 @pytest.mark.skip(reason="not yet implemented")
@@ -214,12 +258,11 @@ uv run task doc-build
 
 ### Software-Engineer Quality Gate Priority Order
 
-During Step 3 (TDD Loop), correctness priorities are:
+During Step 3 (TDD Loop) and before handoff to Step 4:
 
-1. **Design correctness** — YAGNI > KISS > DRY > SOLID > Object Calisthenics > appropriated design patterns > complex code > complicated code > failing code > no code
+1. **Design correctness** — YAGNI > KISS > DRY > SOLID > Object Calisthenics > appropriate design patterns > complex code > complicated code > failing code > no code
 2. **One test green** — the specific test under work passes, plus `test-fast` still passes
-3. **Reviewer code-design check** — reviewer verifies design + semantic alignment (no lint/pyright/coverage yet)
-5. **Quality tooling** — `lint`, `static-check`, full `test` with coverage run only at software-engineer handoff (before Step 4)
+3. **Quality tooling** — `lint`, `static-check`, full `test` with coverage run at handoff to SA
 
 Design correctness is far more important than lint/pyright/coverage compliance. A well-designed codebase with minor lint issues is better than a lint-clean codebase with poor design.
 
@@ -228,7 +271,7 @@ Design correctness is far more important than lint/pyright/coverage compliance.
 - **Automated checks** (lint, typecheck, coverage) verify **syntax-level** correctness — the code is well-formed.
 - **Human review** (semantic alignment, code review, manual testing) verifies **semantic-level** correctness — the code does what the user needs.
 - Both are required. All-green automated checks are necessary but not sufficient for APPROVED.
-- Reviewer defaults to REJECTED unless correctness is proven.
+- System-architect defaults to REJECTED unless correctness is proven.
 
 ## Release Management
 
@@ -238,13 +281,15 @@ Version format: `v{major}.{minor}.{YYYYMMDD}`
 - Same-day second release: increment minor, keep same date
 - Release name: defined by `docs/branding.md > Release Naming > Convention`; absent or blank defaults to version string only (no name)
 
-Use `@software-engineer /skill git-release` for the full release process. When requested by the stakeholder
+**Releases happen from `main` only.** The SE ensures `main` is up to date with `origin/main` before creating a release. No releases from feature branches.
+
+The stakeholder initiates the release process. When the stakeholder requests a release, the system-architect or software-engineer loads `skill git-release` to execute it.
 
 ## Session Management
 
-Every session: load `skill run-session`. Read `TODO.md` first, update it at the end.
+Every session: load `skill run-session`. Read `FLOW.md` first, update it at the end.
 
-`TODO.md` is a session bookmark — not a project journal. See `.opencode/skills/run-session/SKILL.md` for the full structure including the Cycle State block used during Step 3.
+`FLOW.md` is the workflow state tracker — it records the current feature, branch, detected state, and next action. It is append-only in the Session Log section. See `.opencode/skills/flow/SKILL.md` for the full state machine and auto-detection rules.
 
 ## Setup
 
diff --git a/FLOW.md b/FLOW.md
new file mode 100644
index 0000000..f3e883d
--- /dev/null
+++ b/FLOW.md
@@ -0,0 +1,20 @@
+# FLOW Protocol
+
+This file tracks the current feature in progress. Only ONE feature flows through the system at a time.
+
+## Current Feature
+**Feature**: [NONE]
+**Branch**: [NONE]
+**Status**: [IDLE]
+
+## Prerequisites
+- [x] Agents: product-owner, system-architect, software-engineer
+- [x] Skills: run-session, define-scope, architect, implement, verify, version-control
+- [x] Tools: uv, git
+- [x] Directories: docs/features/, docs/adr/
+
+## Session Log
+<!-- Append new entries, never delete old ones -->
+
+## Next
+Run @product-owner — load skill select-feature and pick the next BASELINED feature from backlog.
diff --git a/README.md b/README.md
index fd00f1d..5d4b6ec 100644
--- a/README.md
+++ b/README.md
@@ -39,9 +39,10 @@ Most Python templates give you a folder structure and a `Makefile`. This one giv
 The goal is to give every project — from its first commit — the same rigour that mature teams take years to establish.
 
 - **No feature starts without written acceptance criteria** — Gherkin `Example:` blocks traced to tests
-- **No feature ships without adversarial review** — the reviewer's default hypothesis is "broken"
+- **No feature ships without adversarial review** — the system-architect's default hypothesis is "broken"
 - **No guesswork on test stubs** — generated automatically from `.feature` files
 - **No manual `@id` tags** — assigned automatically when you run tests
+- **No ambiguity on workflow state** — the current step is auto-detected from filesystem and git state and tracked in `FLOW.md`
 - **AI agents for every role** — each agent has scoped instructions and cannot exceed its authority
 
 ---
@@ -57,9 +58,9 @@ SCOPE → ARCH → TDD LOOP → VERIFY → ACCEPT
 | Step | Role | Output |
 |------|------|--------|
 | **1 · SCOPE** | Product Owner | Discovery interviews + Gherkin stories + acceptance criteria |
-| **2 · ARCH** | Software Engineer | Module stubs, ADRs, auto-generated test stubs |
+| **2 · ARCH** | System Architect | Module stubs, ADRs, auto-generated test stubs |
 | **3 · TDD LOOP** | Software Engineer | RED → GREEN → REFACTOR, one criterion at a time |
-| **4 · VERIFY** | Reviewer | Adversarial check — lint, types, coverage, semantic review |
+| **4 · VERIFY** | System Architect | Adversarial check — lint, types, coverage, semantic review |
 | **5 · ACCEPT** | Product Owner | Demo, validate, ship |
 
 **WIP limit: 1 feature at a time.** Features are `.feature` files that move through folders:
@@ -75,8 +76,8 @@ docs/features/completed/    ← shipped
 | Agent | Responsibility |
 |-------|---------------|
 | `@product-owner` | Scope, stories, acceptance criteria, delivery acceptance |
-| `@software-engineer` | Architecture, TDD loop, git, releases |
-| `@reviewer` | Adversarial verification — default position: broken |
+| `@software-engineer` | TDD loop, implementation, git, releases |
+| `@system-architect` | Architecture, adversarial verification — default position: broken |
 | `@designer` | Visual identity, colour palette, SVG assets |
 | `@setup-project` | One-time project initialisation |
 
diff --git a/TODO.md b/TODO.md
deleted file mode 100644
index 72e090e..0000000
--- a/TODO.md
+++ /dev/null
@@ -1,4 +0,0 @@
-# Current Work
-
-No feature in progress.
-Next: PO picks a feature from docs/features/backlog/ that has Status: BASELINED and moves it to docs/features/in-progress/.
diff --git a/docs/architecture.md b/docs/architecture.md
deleted file mode 100644
index 2edabcd..0000000
--- a/docs/architecture.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Architecture: <project-name>
-
----
-
-## YYYY-MM-DD — <feature-stem>: <short title>
-
-Decision: <what was decided — one sentence>
-Reason: <why — one sentence>
-Alternatives considered: <what was rejected and why>
-Feature: <feature-stem>
-
----
-
-## YYYY-MM-DD — Cross-feature: <short title>
-
-Decision: <what was decided>
-Reason: <why>
-Alternatives considered: <what was rejected and why>
-Affected features: <feature-stem>, <feature-stem>
diff --git a/docs/container.md b/docs/container.md
new file mode 100644
index 0000000..6d8615e
--- /dev/null
+++ b/docs/container.md
@@ -0,0 +1,22 @@
+# C4 — Container Diagram
+
+> Last updated: YYYY-MM-DD
+> Source: docs/adr/ADR-*.md
+
+```mermaid
+C4Container
+  title Container Diagram — <project-name>
+
+  Person(actor1, "<role>", "")
+
+  System_Boundary(sys, "<project-name>") {
+    Container(container1, "<name>", "<technology>", "<responsibility from relevant ADR>")
+    Container(container2, "<name>", "<technology>", "<responsibility>")
+  }
+
+  System_Ext(ext1, "<external system>", "")
+
+  Rel(actor1, container1, "<action>")
+  Rel(container1, container2, "<protocol or method>")
+  Rel(container1, ext1, "<protocol>")
+```
diff --git a/docs/context.md b/docs/context.md
new file mode 100644
index 0000000..9c683d3
--- /dev/null
+++ b/docs/context.md
@@ -0,0 +1,18 @@
+# C4 — System Context
+
+> Last updated: YYYY-MM-DD
+> Source: docs/domain-model.md, docs/glossary.md, docs/features/completed/
+
+```mermaid
+C4Context
+  title System Context — <project-name>
+
+  Person(actor1, "<role name>", "<one-line description from feature As a clauses>")
+
+  System(system, "<project-name>", "<3–5 word system description from discovery.md Scope>")
+
+  System_Ext(ext1, "<external system name>", "<what it provides>")
+
+  Rel(actor1, system, "<verb from When clause>")
+  Rel(system, ext1, "<verb from relevant ADR decision>")
+```
diff --git a/docs/index.html b/docs/index.html
index 5231f6b..10a2366 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -47,6 +47,21 @@
   <h1>Documentation</h1>
   <p class="subtitle">Generated project documentation</p>
   <div class="grid">
+    <a class="card" href="system.md">
+      <div class="card-icon">🏗️</div>
+      <div class="card-title">System Overview</div>
+      <div class="card-desc">Current-state description, layers, and key decisions</div>
+    </a>
+    <a class="card" href="context.md">
+      <div class="card-icon">🗺️</div>
+      <div class="card-title">Context Diagram</div>
+      <div class="card-desc">C4 Level 1 — system boundaries and external actors</div>
+    </a>
+    <a class="card" href="https://github.com/nullhack/python-project-template/tree/main/docs/features">
+      <div class="card-icon">📋</div>
+      <div class="card-title">Features</div>
+      <div class="card-desc">Backlog, in-progress, and completed feature specs</div>
+    </a>
     <a class="card" href="api/app.html">
       <div class="card-icon">📖</div>
       <div class="card-title">API Reference</div>
@@ -57,10 +72,10 @@ <h1>Documentation</h1>
       <div class="card-title">Coverage Report</div>
       <div class="card-desc">Line-by-line test coverage breakdown</div>
     </a>
-    <a class="card" href="tests/report.html">
-      <div class="card-icon">✅</div>
-      <div class="card-title">Test Results</div>
-      <div class="card-desc">Full pytest run results with pass/fail details</div>
+    <a class="card" href="https://github.com/nullhack/python-project-template/tree/main/docs/research">
+      <div class="card-icon">🔬</div>
+      <div class="card-title">Research Library</div>
+      <div class="card-desc">10 papers across OOP, refactoring, testing, and more</div>
     </a>
   </div>
   <footer>Built with pdoc · pytest-cov · pytest-html</footer>
diff --git a/docs/c4/.gitkeep b/docs/post-mortem/.gitkeep
similarity index 100%
rename from docs/c4/.gitkeep
rename to docs/post-mortem/.gitkeep
diff --git a/docs/post-mortem/2026-04-14-ping-pong-cli-workflow-gaps.md b/docs/post-mortem/2026-04-14-ping-pong-cli-workflow-gaps.md
deleted file mode 100644
index 7f1d054..0000000
--- a/docs/post-mortem/2026-04-14-ping-pong-cli-workflow-gaps.md
+++ /dev/null
@@ -1,176 +0,0 @@
-# Post-Mortem: ping-pong-cli — Workflow Gaps (v3.1)
-
-## Release Details
-
-| Field | Value |
-|-------|-------|
-| Version | v3.1.20260414 |
-| Date | April 14, 2026 |
-| Feature | ping-pong-cli |
-| Status | APPROVED and shipped |
-| Broken | Yes — game doesn't work |
-
----
-
-## What Was Shipped
-
-`ping_pong_cli/game.py` — 240 lines:
-
-- 15 top-level functions, zero classes
-- No keyboard input (`get_input()` always returns `""`)
-- Runs a hardcoded 100-frame demo then exits
-- Uses raw `int` and `tuple[int,int]` — no value objects
-- `render_game` has 3 levels of nesting
-- 8-parameter function signatures
-
-Yet it passed: lint, typecheck, 100% coverage, 31 tests, reviewer APPROVED.
-
----
-
-## What Failed
-
-The acceptance criteria said:
-> Given: The game is running and waiting for input
-> When: The left or right arrow key is pressed
-> Then: The paddle moves
-
-The implementation maps this to a unit test of `update_player("W")`. That test proves the function works in isolation. No test verifies that keyboard input actually reaches `update_player`.
-
-The game shipped with the acceptance criterion satisfied in a narrow technical sense ("paddle moves when 'W' is passed to the function") but broken in the broad user sense ("paddle doesn't move when I press W in the running game").
-
----
-
-## Gap 1: Acceptance Criteria Don't Require End-to-End Verification
-
-### Problem
-
-The `scope` skill defines "Then must be a single observable, measurable outcome" but doesn't define **observable by whom**. The developer interpreted this as "observable in a unit test" — test calls `update_player("W")` returns expected result.
-
-### Fix
-
-In `scope` skill, add:
-
-> **Observable means observable by the end user.** If the criterion says "When the user presses W", the test must verify that pressing W in the running app produces the expected result — not just that calling `update_player("W")` returns the right number. If end-to-end testing isn't feasible, the criterion must explicitly state the boundary (e.g., "When update_player receives 'W'") so the gap is visible.
-
-In `verify` skill, add:
-
-> **Acceptance Criteria vs. Reality Check**
->
-> For each criterion whose Given/When/Then describes user-facing behavior:
-> - Read the test that covers it
-> - If the test only exercises an internal function without going through the actual user-facing entry point, flag it as **COVERED BUT NOT VERIFIED**
-> - A criterion that says "When the user presses W" is NOT verified by `test_update_player("W")` — it's verified by a test or manual check that sends W to the running app
->
-> Any COVERED BUT NOT VERIFIED criterion → REJECTED
-
----
-
-## Gap 2: Object Calisthenics Listed But Not Enforced by Reviewer
-
-### Problem
-
-The `verify` skill listed all 9 Object Calisthenics rules. The reviewer read them but approved code with:
-
-| # | Rule | Violation in shipped code |
-|---|------|--------------------------|
-| 3 | Wrap primitives | `PlayerPosition = int`, `BallState = tuple[int,int]` are type aliases, not value objects |
-| 4 | First-class collections | No collection classes |
-| 7 | Small entities | `run_game_loop` is ~40 lines |
-| 8 | ≤ 2 instance vars | No classes at all, but 8-parameter function signatures |
-
-The skill didn't say **what to do when violations are found**. Violations were treated as observations, not blockers.
-
-### Fix
-
-In `verify` skill, replace ObjCal prose with a structured table:
-
-> **Object Calisthenics — ANY violation is a REJECT**
->
-> | # | Rule | How to check | PASS/FAIL |
-> |---|------|-------------|-----------|
-> | 1 | One level of indentation | Check nest depth in source |
-> | 2 | No `else` after return | Search for `else` inside functions |
-> | 3 | Wrap primitives | Bare `int`, `str` as domain concepts = FAIL |
-> | 4 | First-class collections | `list[Type]` not wrapped = FAIL |
-> | 5 | One dot per line | `a.b.c()` = FAIL |
-> | 6 | No abbreviations | `calc`, `mgr` = FAIL |
-> | 7 | Small entities | Lines per function >20 or class >50 = FAIL |
-> | 8 | ≤ 2 instance vars | More than 2 per class = FAIL |
-> | 9 | No getters/setters | `get_x()`, `set_x()` = FAIL |
-
----
-
-## Gap 3: REFACTOR Step Has No Verification Gate
-
-### Problem
-
-The `implementation` skill says to apply DRY, SOLID, Object Calisthenics during REFACTOR, but when done, it only runs `task test`, `task lint`, `task static-check`. None of those tools check nesting depth, function length, or value objects. The developer skips the self-check, runs the three commands, they all pass.
-
-### Fix
-
-In `implementation` skill, add after REFACTOR section:
-
-> **REFACTOR Self-Check (MANDATORY before commit)**
->
-> 1. Count lines per function you changed. Any >20 → extract helper
-> 2. Check nesting. Any >2 levels → extract function
-> 3. Check bare primitives as domain concepts. `int` for paddle position → value object
-> 4. Check parameters per function. >4 positional → group into dataclass
->
-> If you skip this step, the reviewer WILL reject your code.
-
----
-
-## Gap 4: `timeout 10s uv run task run` Is Not a Playability Test
-
-### Problem
-
-The `verify` skill said: "check that startup completes without error before the timeout." The demo ran for 1.6 seconds and exited cleanly — startup completed, no error. The app passed without being interactive at all.
-
-### Fix
-
-In `verify` skill, replace the timeout check with:
-
-> **For apps with user interaction** (games, CLIs with prompts, web servers):
-> - Run the app, provide sample input via stdin/subprocess
-> - Verify output changes in response to input
-> - A hardcoded demo that auto-plays without input is NOT a playability test
->
-> If the app doesn't respond to user input → REJECTED
-
----
-
-## Gap 5: Tests Verify Functions, Not Behavior
-
-### Problem
-
-The `tdd` skill produces unit tests. Every test calls an isolated function. No test sends input to the running game. No test verifies the game loop integrates these functions correctly. 31 tests pass with 100% coverage but none test the actual gameplay loop.
-
-### Fix
-
-In `tdd` skill, add:
-
-> **Integration Test Requirement**
->
-> For features with multiple components (game loops, handlers, pipelines):
-> - Add at least ONE `@pytest.mark.integration` test
-> - Test must exercise the full path from entry point to observable outcome
-> - Must NOT call internal helpers directly — use the public entry point
-
----
-
-## Summary
-
-| Gap | Skill | Problem | Fix |
-|-----|-------|---------|-----|
-| 1 | scope + verify | "Observable" undefined = unit test passes | Define user-observable; add COVERED BUT NOT VERIFIED |
-| 2 | verify | Object Calisthenics listed = suggestions | Any rule FAIL = REJECTED (table) |
-| 3 | implementation | REFACTOR has no self-check gate | Add mandatory line/nesting check |
-| 4 | verify | `timeout` = "doesn't hang" not "works" | Must accept and respond to input |
-| 5 | tdd | All unit, no integration | Require one integration test |
-
----
-
-## Root Cause
-
-The skills already contained the right standards. The problem is that violations were treated as observations, not blockers. Each check needs a clear **FAIL = REJECTED** consequence with a structured table to fill in — so violations can't be glossed over in prose.
diff --git a/docs/post-mortem/2026-04-16-ping-pong-cli-package-and-design-review.md b/docs/post-mortem/2026-04-16-ping-pong-cli-package-and-design-review.md
deleted file mode 100644
index d9b6995..0000000
--- a/docs/post-mortem/2026-04-16-ping-pong-cli-package-and-design-review.md
+++ /dev/null
@@ -1,108 +0,0 @@
-# Post-Mortem: ping-pong-cli — Package Directory and Design Review Gaps
-
-## Context
-
-| Field | Value |
-|-------|-------|
-| Date | April 16, 2026 |
-| Feature | ping-pong-cli (follow-up run after v3.1 workflow fixes) |
-| Branch | feat/po-workflow-redesign-v4 |
-
-This post-mortem was conducted after a second ping-pong-cli test run on the updated v3.1 workflow. Two systemic failures were identified that the v3.1 fixes did not address.
-
----
-
-## Failure 1: Code Created in Wrong Package Directory
-
-### What Happened
-
-The developer created production code under `python_project_template/` (the template's own package) instead of `ping_pong_cli/` (the feature's package). The correct package name was visible in `pyproject.toml` under `[tool.setuptools] packages`, but no step in the workflow required the developer to read it before writing code.
-
-### Why It Happened
-
-The `implementation` skill's Step 2 (Architecture) listed prerequisites and module structure instructions, but contained no explicit step to:
-1. Read `pyproject.toml` to determine the correct package name
-2. Confirm the package directory exists on disk
-3. Record the package name as a hard constraint before writing any files
-
-Without this verification, the developer defaulted to a plausible-looking name rather than the actual configured name.
-
-### Impact
-
-All production code was placed in the wrong directory. The feature appeared to work during development (imports resolved within the wrong package) but would have failed on any fresh install or CI run.
-
-### Fix Applied
-
-Added a **Package Verification** block at the top of Step 2 in `implementation/SKILL.md` (before prerequisites):
-
-```
-1. Read pyproject.toml → [tool.setuptools] → record packages = ["<name>"]
-2. Confirm that directory exists on disk: ls <name>/
-3. Write the correct package name at the top of working notes
-4. All new source files go under <name>/ — never under a template placeholder
-```
-
-Added a corresponding check row to `verify/SKILL.md` section 4g:
-
-> `Imports use correct package name` — confirm all imports match `[tool.setuptools] packages`, not a template placeholder
-
----
-
-## Failure 2: Design Principle Violations Not Caught in Review
-
-### What Happened
-
-The reviewer approved code containing getters and setters (`get_x()` / `set_x()` pairs), violating Object Calisthenics Rule 9. The violation was visible in the code but was not caught because the review process had no structured mechanism for the developer to declare their own compliance before asking for review.
-
-### Why It Happened
-
-The per-test reviewer check asked the reviewer to verify YAGNI > KISS > DRY > SOLID > ObjCal, but provided no structured checklist or required evidence format. The reviewer was scanning for violations rather than verifying explicit claims. When a reviewer is reading unfamiliar code for the first time, getter/setter patterns can be overlooked if they are not explicitly flagged.
-
-Additionally, the reviewer had no "audit target" — there was nothing the developer had committed to that the reviewer could directly compare against the code.
-
-### Impact
-
-OC Rule 9 (tell-don't-ask) was violated. The design choice propagated into the committed codebase, requiring a later refactor.
-
-### Fix Applied
-
-Added a **Design Self-Declaration** step between REFACTOR and REVIEWER CHECK in `implementation/SKILL.md`:
-
-- Developer fills a checklist covering YAGNI, KISS, DRY, SOLID (all 5 principles), and OC Rules 1–9
-- Each item requires `file:line` evidence or an explicit "does not apply" note
-- The filled checklist is sent to the reviewer as the audit target
-
-Updated the **REVIEWER CHECK** response template from a 3-line compact format to an 11-row structured comparison table (YAGNI, KISS, DRY, SOLID-S/O/L/I/D, OC-1-9, Design patterns, Semantic alignment):
-
-- Developer Claims column (what the developer declared)
-- Reviewer Verdict column (independent verification)
-- Evidence column (`file:line` required for every FAIL)
-- Any FAIL row = rejection
-
-Updated the Cycle State phases to include `SELF-DECLARE` between REFACTOR and REVIEWER:
-
-```
-RED → GREEN → REFACTOR → SELF-DECLARE → REVIEWER(code-design) → COMMITTED
-```
-
-Updated `session-workflow/SKILL.md` Cycle State phase list and Rule 6 to include `SELF-DECLARE`.
-
-Updated `reviewer.md` per-test Step 4 section to reference the structured table and load `skill implementation` for the full protocol.
-
----
-
-## Summary
-
-| Failure | Root Cause | Fix |
-|---------|-----------|-----|
-| Code in wrong package | No package verification step before writing code | Package Verification block added to Step 2 |
-| OC Rule 9 violation approved | No structured self-declaration; reviewer had no audit target | Design Self-Declaration checklist per test; 11-row verification table |
-
----
-
-## Systemic Pattern
-
-Both failures share the same root cause: **the workflow relied on agents noticing problems rather than proving compliance**. The fixes shift the burden:
-
-- Package verification: developer must prove the package name is correct before writing the first line
-- Design self-declaration: developer must prove each principle is satisfied before asking for review; reviewer verifies claims rather than scanning from scratch
diff --git a/docs/scientific-research/README.md b/docs/research/README.md
similarity index 83%
rename from docs/scientific-research/README.md
rename to docs/research/README.md
index 3338996..5e60424 100644
--- a/docs/scientific-research/README.md
+++ b/docs/research/README.md
@@ -11,6 +11,6 @@ Theoretical and empirical foundations for the decisions made in this template, o
 | `domain-modeling.md` | 31, 63–68 | DDD bounded contexts, ubiquitous language, feature identification, DDD Reference, Fowler UL/BC bliki, Vernon IDDD, Verraes UL-not-glossary, Whirlpool |
 | `oop-design.md` | 32–35 | Object Calisthenics, Refactoring (Fowler), GoF Design Patterns, SOLID |
 | `refactoring-empirical.md` | 36–41 | QDIR smell prioritization, smells + architectural refactoring, SPIRIT tool, bad OOP engineering properties, CWC complexity metric, metric threshold unreliability |
-| `architecture.md` | 42, 55–58 | Hexagonal Architecture, ADRs, 4+1 View Model, C4 model, information hiding |
+| `adr/ADR-*.md` | 42, 55–58 | Hexagonal Architecture, ADRs, 4+1 View Model, C4 model, information hiding |
 | `ai-agents.md` | 21–27 | Minimal-scope agent design, context isolation, on-demand skills, instruction conflict resolution failure, positional attention degradation, modular prompt de-duplication, three-file separation |
-| `documentation.md` | 59–62 | Developer information needs, docs-as-code, Diátaxis documentation framework, blameless post-mortems |
+| `documentation.md` | 59–62, 69–71 | Developer information needs, docs-as-code, Diátaxis documentation framework, blameless post-mortems, arc42 current-state template, Google design docs, RFC/technical spec pattern |
diff --git a/docs/scientific-research/ai-agents.md b/docs/research/ai-agents.md
similarity index 55%
rename from docs/scientific-research/ai-agents.md
rename to docs/research/ai-agents.md
index aaae407..02fa05d 100644
--- a/docs/scientific-research/ai-agents.md
+++ b/docs/research/ai-agents.md
@@ -14,7 +14,7 @@ Foundations for the agent architecture, file structure, and context management d
 | **Status** | Confirmed — corrects the belief that subagents should be "lean routing agents" |
 | **Core finding** | "Define the smallest agent that can own a clear task. Add more agents only when you need separate ownership, different instructions, different tool surfaces, or different approval policies." The split criterion is ownership boundary, not instruction volume. |
 | **Mechanism** | Multiple agents competing to own the same concern create authority conflicts and inconsistent tool access. The right unit is the smallest coherent domain that requires exclusive responsibility. |
-| **Where used** | Agent design in `.opencode/agents/*.md` — 4 agents, each owning a distinct domain (PO, software-engineer, reviewer, setup). |
+| **Where used** | Agent design in `.opencode/agents/*.md` — 5 agents, each owning a distinct domain (PO, system-architect, software-engineer, designer, setup). |
 
 ---
 
@@ -105,14 +105,98 @@ Foundations for the agent architecture, file structure, and context management d
 
 ---
 
+### 72. Actor Model — Message-Passing Ownership
+
+| | |
+|---|---|
+| **Source** | Hewitt, C., Bishop, P., & Steiger, R. (1973). *A universal modular actor formalism for artificial intelligence*. IJCAI. |
+| **Date** | 1973 |
+| **Status** | Confirmed — foundational for single-ownership agent design |
+| **Core finding** | Actors are computational entities that communicate exclusively via asynchronous message passing. Each actor has a single mailbox, processes messages sequentially, and can spawn child actors. No shared state, no direct method calls. |
+| **Mechanism** | The Actor Model eliminates race conditions by construction: an actor can only modify its own state. Message passing creates explicit handoff points where ownership transfers. This maps directly to AI agent design where each agent owns a distinct domain and communicates via structured handoffs (e.g., PO → SA → SE → SA → PO). |
+| **Where used** | Agent ownership boundaries in `.opencode/agents/*.md`; single-feature-at-a-time WIP limit in `FLOW.md`. |
+
+---
+
+### 73. CSP — Synchronous Communication and Deadlock Freedom
+
+| | |
+|---|---|
+| **Source** | Hoare, C. A. R. (1978). *Communicating sequential processes*. Communications of the ACM, 21(8), 666–677. |
+| **Date** | 1978 |
+| **Status** | Confirmed — formal basis for structured handoff protocols |
+| **Core finding** | Processes communicate via synchronous channels (rendezvous). A process that tries to send on a channel blocks until the receiver is ready. This explicit synchronization prevents the "lost update" problem. |
+| **Mechanism** | CSP's channel-based communication ensures that handoffs are atomic: either both parties are ready (handoff succeeds) or the sender waits (no partial state). Applied to AI workflow design: each step transition in `FLOW.md` is a rendezvous point where the outgoing agent commits state before the incoming agent reads it. |
+| **Where used** | Step transition protocol in `FLOW.md` — commit before handoff; session end protocol in `run-session/SKILL.md`. |
+
+---
+
+### 74. Session Types — Protocol Conformance by Construction
+
+| | |
+|---|---|
+| **Source** | Honda, K. (1993). *Types for dyadic interaction*. CONCUR '93. |
+| **Date** | 1993 |
+| **Status** | Confirmed — type-safe communication protocols |
+| **Core finding** | Session types statically verify that communicating parties follow a prescribed protocol. The type checker ensures send/receive sequences match, preventing protocol violations at compile time. |
+| **Mechanism** | Just as session types enforce "send A then receive B then send C", the `FLOW.md` state machine enforces "Step 1 → Step 2 → Step 3 → Step 4 → Step 5". Each state has a defined owner and valid transitions. The auto-detection rules act as a runtime type checker: if the filesystem state doesn't match the expected state, the protocol halts. |
+| **Where used** | `FLOW.md` state machine definition; `flow/SKILL.md` auto-detection rules. |
+
+---
+
+### 75. Statecharts — Hierarchical State Machines with History
+
+| | |
+|---|---|
+| **Source** | Harel, D. (1987). *Statecharts: A visual formalism for complex systems*. Science of Computer Programming, 8(3), 231–274. |
+| **Date** | 1987 |
+| **Status** | Confirmed — hierarchical states for workflow design |
+| **Core finding** | Statecharts extend finite state machines with hierarchy (nested states), orthogonality (parallel regions), and history (return to previous substate). This makes complex systems tractable without state explosion. |
+| **Mechanism** | The `FLOW.md` state machine uses hierarchical grouping: Step 3 contains substates [READY], [RED], [GREEN]. The history mechanism maps to interruption recovery: when resuming, auto-detection determines the exact substate without manual tracking. |
+| **Where used** | `FLOW.md` state design; `flow/SKILL.md` detection rules for interruption recovery. |
+
+---
+
+### 76. Design by Contract — Preconditions and Postconditions
+
+| | |
+|---|---|
+| **Source** | Meyer, B. (1986). *Eiffel: Programming for reusability and extendability*. SIGPLAN Notices, 22(2), 85–94. |
+| **Date** | 1986 |
+| **Status** | Confirmed — explicit contracts for step boundaries |
+| **Core finding** | Software components should specify contracts: preconditions (what must be true before calling), postconditions (what will be true after), and invariants (what remains true). Violations indicate bugs. |
+| **Mechanism** | Each `FLOW.md` state has preconditions (detect rules) and postconditions (success/failure transitions). The Prerequisites checklist is a system-level precondition. When preconditions fail, the protocol halts rather than proceeding with invalid state. |
+| **Where used** | Prerequisites checklist in `FLOW.md`; per-step preconditions in `flow/SKILL.md`, `architect/SKILL.md`, `implement/SKILL.md`. |
+
+---
+
+### 77. Petri Nets — Places, Transitions, and Token Flow
+
+| | |
+|---|---|
+| **Source** | Petri, C. A. (1962). *Kommunikation mit Automaten*. PhD thesis, University of Bonn. |
+| **Date** | 1962 |
+| **Status** | Confirmed — formal model for concurrent workflow with resource constraints |
+| **Core finding** | Petri Nets model systems as places (conditions), transitions (events), and tokens (resources). A transition fires only when all input places have tokens. This naturally models capacity constraints and competition for resources. |
+| **Mechanism** | The WIP=1 constraint in `FLOW.md` is a Petri Net place with capacity 1: only one feature token can occupy the "in-progress" place at a time. The transition from [IDLE] to [STEP-1-DISCOVERY] requires the "in-progress" place to be empty (no token). This formalizes the single-feature constraint. |
+| **Where used** | WIP limit of 1 in `AGENTS.md` and `FLOW.md`; filesystem-enforced WIP via `docs/features/in-progress/` directory. |
+
+---
+
 ## Bibliography
 
 1. Anthropic. (2024). Building effective agents. https://www.anthropic.com/engineering/building-effective-agents
 2. Anthropic. (2025). Best practices for Claude Code. https://www.anthropic.com/engineering/claude-code-best-practices
 3. Geng et al. (2025). Control Illusion. AAAI-26. arXiv:2502.15851. https://arxiv.org/abs/2502.15851
-4. Liu, N. F. et al. (2023). Lost in the Middle. *TACL*. arXiv:2307.03172. https://arxiv.org/abs/2307.03172
-5. McKinnon, R. (2025). arXiv:2511.05850. https://arxiv.org/abs/2511.05850
-6. OpenAI. (2024). Agent definitions. https://platform.openai.com/docs/guides/agents/define-agents
-7. OpenCode. (2026). Agent Skills. https://opencode.ai/docs/skills/
-8. Sharma, A., & Henley, A. (2026). Modular Prompt Optimization. arXiv:2601.04055. https://arxiv.org/abs/2601.04055
-9. Wallace, E. et al. (2024). The Instruction Hierarchy. arXiv:2404.13208.
+4. Harel, D. (1987). Statecharts: A visual formalism for complex systems. *Science of Computer Programming*, 8(3), 231–274.
+5. Hewitt, C., Bishop, P., & Steiger, R. (1973). A universal modular actor formalism for artificial intelligence. *IJCAI*.
+6. Hoare, C. A. R. (1978). Communicating sequential processes. *Communications of the ACM*, 21(8), 666–677.
+7. Honda, K. (1993). Types for dyadic interaction. *CONCUR '93*.
+8. Liu, N. F. et al. (2023). Lost in the Middle. *TACL*. arXiv:2307.03172. https://arxiv.org/abs/2307.03172
+9. McKinnon, R. (2025). arXiv:2511.05850. https://arxiv.org/abs/2511.05850
+10. Meyer, B. (1986). Eiffel: Programming for reusability and extendability. *SIGPLAN Notices*, 22(2), 85–94.
+11. OpenAI. (2024). Agent definitions. https://platform.openai.com/docs/guides/agents/define-agents
+12. OpenCode. (2026). Agent Skills. https://opencode.ai/docs/skills/
+13. Petri, C. A. (1962). Kommunikation mit Automaten. PhD thesis, University of Bonn.
+14. Sharma, A., & Henley, A. (2026). Modular Prompt Optimization. arXiv:2601.04055. https://arxiv.org/abs/2601.04055
+15. Wallace, E. et al. (2024). The Instruction Hierarchy. arXiv:2404.13208.
diff --git a/docs/research/architecture.md b/docs/research/architecture.md
new file mode 100644
index 0000000..9ccaf7c
--- /dev/null
+++ b/docs/research/architecture.md
@@ -0,0 +1,156 @@
+# Scientific Research — Architecture
+
+Foundations for the architectural decisions and patterns used in this template.
+
+---
+
+### 42. Hexagonal Architecture — Ports and Adapters
+
+| | |
+|---|---|
+| **Source** | Cockburn, A. (2005). "Hexagonal Architecture." *alistair.cockburn.us*. https://alistair.cockburn.us/hexagonal-architecture/ |
+| **Date** | 2005 |
+| **Alternative** | Freeman, S., & Pryce, N. (2009). *Growing Object-Oriented Software, Guided by Tests*. Addison-Wesley. (Chapter 7: "Ports and Adapters") |
+| **Status** | Confirmed — foundational; widely adopted as Clean Architecture, Onion Architecture |
+| **Core finding** | The application domain should have no knowledge of external systems (databases, filesystems, network, UI). All contact between the domain and the outside world passes through a **port** (an interface / Protocol) and an **adapter** (a concrete implementation of that port). The domain is independently testable without any infrastructure. The key structural rule: dependency arrows point inward — domain code never imports from adapters; adapters import from domain. |
+| **Mechanism** | Two distinct sides of any application: the "driving side" (actors who initiate action — tests, UI, CLI) and the "driven side" (actors the application drives — databases, filesystems, external services). Each driven-side dependency is hidden behind a port. Tests supply a test adapter; production supplies a real adapter. Substituting adapters requires no domain code changes. This is SOLID-D at the architectural layer. |
+| **Where used** | Step 2 (Architecture): if an external dependency is identified during domain analysis, assign it a Protocol. `ports/` and `adapters/` folders emerge when a concrete dependency is confirmed — do not pre-create them. The dependency-inversion principle (SOLID-D) is the goal; the folder names are convention, not law. |
+
+---
+
+### 55. Architecture Decision Records (ADRs)
+
+| | |
+|---|---|
+| **Source** | Nygard, M. T. (2011). "Documenting Architecture Decisions." *cognitect.com*. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions |
+| **Date** | 2011 |
+| **Alternative** | Keeling, M. (2017). *Design It!: From Programmer to Software Architect*. Pragmatic Bookshelf. (Chapter 6: "Architectural Decisions") |
+| **Status** | Confirmed — widely adopted industry standard; tooled by adr-tools, ADR Manager, Log4Brains |
+| **Core finding** | Architectural decisions should be recorded as short, immutable documents capturing: what was decided, why, and what alternatives were rejected. Without this record, decisions get re-litigated by every new developer (or AI agent) who encounters the codebase, producing rework and re-divergence. |
+| **Mechanism** | An ADR is written at decision time, never edited afterward. If the decision changes, a new ADR is written that supersedes the old one. The append-only record becomes a reliable audit trail. The constraint "one sentence per field" forces clarity — if you can't state the reason in one sentence, the decision is not yet understood. |
+| **Where used** | `docs/adr/ADR-YYYY-MM-DD-<slug>.md` (one file per decision). SA creates one file per non-obvious decision after Step 2. The `update-docs` skill reads ADRs as input for C4 diagram annotations. |
+
+---
+
+### 56. The 4+1 View Model of Architecture
+
+| | |
+|---|---|
+| **Source** | Kruchten, P. B. (1995). "The 4+1 View Model of Architecture." *IEEE Software*, 12(6), 42–50. https://doi.org/10.1109/52.469759 |
+| **Date** | 1995 |
+| **Alternative** | Bass, L., Clements, P., & Kazman, R. (2021). *Software Architecture in Practice* (4th ed.). Addison-Wesley. |
+| **Status** | Confirmed — 3,000+ citations; foundational IEEE reference for architectural documentation |
+| **Core finding** | A single architectural diagram cannot communicate all relevant aspects of a system. Four distinct views are required: **Logical** (domain objects and relationships), **Process** (runtime behavior and concurrency), **Development** (module organisation and dependencies), **Physical** (deployment topology). A fifth **Scenarios** view (use cases) ties the four together by showing how each scenario exercises each view. |
+| **Mechanism** | Different stakeholders need different views: a developer needs the Development view; an operator needs the Physical view; a domain expert needs the Logical view. Conflating views into one diagram produces a cluttered diagram that satisfies nobody. The 4+1 model assigns each concern to its appropriate view and cross-validates them through scenarios. |
+| **Where used** | Theoretical foundation for the C4 model (entry 57). The `update-docs` skill generates C4 diagrams that map to: Context diagram (Scenarios view), Container diagram (Physical + Development views), Component diagram (Logical + Development views). |
+
+---
+
+### 57. The C4 Model for Software Architecture
+
+| | |
+|---|---|
+| **Source** | Brown, S. (2018). *The C4 Model for Software Architecture*. Leanpub. https://c4model.com |
+| **Date** | 2018 (ongoing) |
+| **Alternative** | Brown, S. (2023). "The C4 model for visualising software architecture." *InfoQ*. |
+| **Status** | Confirmed — widely adopted; tooled by Structurizr, PlantUML C4, Mermaid C4 |
+| **Core finding** | Software architecture can be communicated at four zoom levels: **Level 1 — System Context** (who uses the system and what external systems it talks to), **Level 2 — Container** (major runnable/deployable units), **Level 3 — Component** (major structural building blocks within a container), **Level 4 — Code** (classes, interfaces; usually auto-generated). Each level answers a specific question; mixing levels in one diagram creates confusion. |
+| **Mechanism** | C4 operationalises the 4+1 View Model (entry 56) into a lightweight notation that can be expressed in text (PlantUML, Mermaid) and version-controlled alongside code. The notation is deliberately constrained: boxes (people, systems, containers, components) and unidirectional arrows with labels. No UML formalism is required. Brown's guidance is that the System Context and Container diagrams are sufficient for many teams. |
+| **Where used** | The `update-docs` skill generates and updates C4 diagrams in `docs/context.md` and `docs/container.md`. Context diagram (L1) always generated; Container (L2) generated when multiple containers are identified; Component (L3) generated on demand. Source files are Mermaid so they render in GitHub and are version-controlled. |
+
+---
+
+### 58. Information Hiding — Module Decomposition Criterion
+
+| | |
+|---|---|
+| **Source** | Parnas, D. L. (1972). "On the criteria to be used in decomposing systems into modules." *Communications of the ACM*, 15(12), 1053–1058. https://doi.org/10.1145/361598.361623 |
+| **Date** | 1972 |
+| **Alternative** | Parnas, D. L. (1974). "On a 'buzzword': Hierarchical structure." *Proc. IFIP Congress 74*, 336–339. |
+| **Status** | Confirmed — 4,000+ citations; foundational criterion for all modular decomposition in software engineering |
+| **Core finding** | The correct criterion for decomposing a system into modules is **information hiding**: each module hides a design decision that is likely to change. A module's interface reveals only what callers need; its implementation hides how. Decomposing by execution steps (procedure-based) creates tight coupling to implementation order; decomposing by change-prone decisions (information-hiding) allows each decision to be changed independently. |
+| **Mechanism** | Identify which decisions are most likely to change (data structures, algorithms, I/O formats, external service protocols). Each such decision becomes a module boundary. The module's public interface is designed to stay stable, so the implementation behind it can change without callers noticing. This is the theoretical basis for SOLID-D (depend on abstractions), Hexagonal Architecture (hide external decisions behind ports), and DDD bounded contexts (hide language decisions behind context boundaries). |
+| **Where used** | Step 2 Architecture: bounded context check ("same word, different meaning across features? → module boundary") and external dep Protocol assignment both apply the information-hiding criterion. The `update-docs` skill uses module boundaries as container/component boundaries in `docs/container.md`. |
+
+---
+
+---
+
+### 59. Architecture Tradeoff Analysis Method (ATAM)
+
+| | |
+|---|---|
+| **Source** | Kazman, R., Klein, M., & Clements, P. (2000). "ATAM: Method for Architecture Evaluation" (CMU/SEI-2000-TR-004). Software Engineering Institute, Carnegie Mellon University. https://resources.sei.cmu.edu/asset_files/TechnicalReport/2000_005_001_13706.pdf |
+| **Date** | 2000 (updated 2018) |
+| **Alternative** | Bass, L., Clements, P., & Kazman, R. (2021). *Software Architecture in Practice* (4th ed.). Addison-Wesley. (Chapters 21–23) |
+| **Status** | Confirmed — SEI standard; used by NASA, DoD, and Fortune 500 organizations |
+| **Core finding** | Architecture should be evaluated early through structured scenario analysis. ATAM discovers **trade-offs** and **sensitivity points** before implementation begins, when change cost is minimal. The method produces a risk-mitigation roadmap rather than a pass/fail verdict. |
+| **Mechanism** | Nine-step process: (1) present ATAM, (2) present business drivers, (3) present architecture, (4) identify architectural approaches, (5) generate quality-attribute utility tree, (6) analyze architectural approaches, (7) brainstorm and prioritize scenarios, (8) re-analyze with broader stakeholder input, (9) present results. Key output: a ranked list of **risk themes** with sensitivity points (architectural decisions that most affect quality attributes). |
+| **Where used** | Step 4 (Verify): the system-architect applies ATAM-style adversarial review — testing the implemented architecture against the quality-attribute scenarios identified in Step 2. The SA who designed the architecture reviews it, eliminating the context-loss problem of external reviewers. |
+
+---
+
+### 60. Conway's Law and the Inverse Conway Maneuver
+
+| | |
+|---|---|
+| **Source** | Conway, M. E. (1968). "How Do Committees Invent?" *Datamation*, 14(4), 28–31. https://www.melconway.com/Home/Committees_Paper.html |
+| **Date** | 1968 (dubbed "Conway's Law" by Brooks, 1975) |
+| **Alternative** | Fowler, M. (2022). "Conway's Law." *martinfowler.com*. https://martinfowler.com/bliki/ConwaysLaw.html |
+| **Status** | Confirmed — widely accepted; named "Conway's Law" and popularized by Brooks in *The Mythical Man-Month* (1975) |
+| **Core finding** | Any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure. The **Inverse Conway Maneuver** deliberately alters team organization to encourage the desired software architecture — aligning Conway's Law with architectural intent rather than fighting it. |
+| **Mechanism** | Three responses to Conway's Law: (1) **Ignore** — architecture clashes with team structure, producing friction; (2) **Accept** — ensure architecture does not conflict with existing communication patterns; (3) **Inverse Conway** — restructure teams (and agent roles) to match the desired architecture. In AI-assisted development, this means the agent who designs a module should be the same agent who reviews it, preserving architectural intent through the build-and-review cycle. |
+| **Where used** | AGENTS.md role design: the system-architect → software-engineer → system-architect loop implements a closed communication path. The SA designs the module boundary; the SE builds within it; the SA verifies the boundary was respected. No external reviewer introduces misaligned mental models. |
+
+---
+
+### 61. The Architect as Decision-Maker
+
+| | |
+|---|---|
+| **Source** | Fowler, M. (2003). "Who Needs an Architect?" *IEEE Software*, 20(5), 11–13. https://martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf |
+| **Date** | 2003 |
+| **Alternative** | Martin, R. C. (2017). *Clean Architecture: A Craftsman's Guide to Software Structure and Design*. Prentice Hall. (Chapters 1–3) |
+| **Status** | Confirmed — widely cited *IEEE Software* column; Martin's *Clean Architecture* extends it with the policy/detail separation |
+| **Core finding** | The architect's job is not to draw diagrams — it is to make **significant decisions** that are hard to change later. The architect is a facilitator who builds consensus around technical direction, not a dictator who issues edicts. The best architects are also programmers who understand implementation constraints firsthand. |
+| **Mechanism** | Fowler contrasts two archetypes: *Architectus Reloadus*, who makes all the important decisions up front, and *Architectus Oryzus*, who collaborates closely with the team, stays aware of what is happening in the code, and mentors developers so that fewer decisions need to be irreversible. The template's system-architect role owns the hard-to-change choices (recorded as ADRs) and enforces them through adversarial review. Martin adds the **policy/detail** separation: the architect owns policy (business rules, interfaces); the developer owns detail (algorithms, data structures). |
+| **Where used** | `system-architect.md` agent definition: the SA owns `docs/domain-model.md`, `docs/system.md`, and `docs/adr/ADR-*.md` (policy layer). The SE owns the implementation code (detail layer). The SA reviews to ensure policy was not violated by detail decisions. |
+
+---
+
+### 62. Team Topologies and Cognitive Load
+
+| | |
+|---|---|
+| **Source** | Skelton, M., & Pais, M. (2019). *Team Topologies: Organizing Business and Technology Teams for Fast Flow*. IT Revolution Press. |
+| **Date** | 2019 |
+| **Alternative** | Narayan, S. (2015). *Agile IT Organization Design*. Addison-Wesley. |
+| **Status** | Confirmed — widely adopted in DevOps and platform engineering; 4.5+ star ratings across retailers |
+| **Core finding** | Team structure should minimize **cognitive load**, the total mental effort required to operate within a system. Cognitive load has three types: (1) **intrinsic** (fundamental complexity of the problem), (2) **extraneous** (unnecessary complexity from poor tooling/process), (3) **germane** (effort that goes into learning and into the parts of the task that need special attention). The goal is to minimize extraneous load (friction) so that more capacity is left for germane load (learning). |
+| **Mechanism** | Four team types: **Stream-aligned** (delivers customer value end-to-end), **Platform** (provides internal services), **Enabling** (helps stream teams adopt new capabilities), **Complicated-subsystem** (owns complex domain expertise). Three interaction modes: **Collaboration** (joint discovery), **X-as-a-Service** (clean handoff), **Facilitating** (temporary assistance). The SA→SE→SA loop is a **Collaboration** interaction between policy owner (SA) and detail owner (SE), with the SA providing **X-as-a-Service** interfaces (stubs, ADRs) that the SE consumes. |
+| **Where used** | AGENTS.md workflow design: the SA is a **complicated-subsystem** team (architectural expertise) and the SE is **stream-aligned** (feature delivery). The verify step is a **Collaboration** interaction where the SA reviews whether the SE respected the X-as-a-Service boundaries (stubs, protocols, ADRs). |
+
+---
+
+## Bibliography
+
+1. Bass, L., Clements, P., & Kazman, R. (2021). *Software Architecture in Practice* (4th ed.). Addison-Wesley.
+2. Brown, S. (2018). *The C4 Model for Software Architecture*. Leanpub. https://c4model.com
+3. Cockburn, A. (2005). Hexagonal Architecture. *alistair.cockburn.us*. https://alistair.cockburn.us/hexagonal-architecture/
+4. Conway, M. E. (1968). How Do Committees Invent? *Datamation*, 14(4), 28–31.
+5. Fowler, M. (2003). Who Needs an Architect? *IEEE Software*, 20(5), 11–13.
+6. Fowler, M. (2022). Conway's Law. *martinfowler.com*. https://martinfowler.com/bliki/ConwaysLaw.html
+7. Freeman, S., & Pryce, N. (2009). *Growing Object-Oriented Software, Guided by Tests*. Addison-Wesley.
+8. Kazman, R., Klein, M., & Clements, P. (2000). ATAM: Method for Architecture Evaluation (CMU/SEI-2000-TR-004). SEI, CMU.
+9. Keeling, M. (2017). *Design It!: From Programmer to Software Architect*. Pragmatic Bookshelf.
+10. Kruchten, P. B. (1995). The 4+1 View Model of Architecture. *IEEE Software*, 12(6), 42–50.
+11. Martin, R. C. (2017). *Clean Architecture: A Craftsman's Guide to Software Structure and Design*. Prentice Hall.
+12. Nygard, M. T. (2011). Documenting Architecture Decisions. *cognitect.com*.
+13. Parnas, D. L. (1972). On the criteria to be used in decomposing systems into modules. *CACM*, 15(12), 1053–1058.
+14. Skelton, M., & Pais, M. (2019). *Team Topologies*. IT Revolution Press.
+3. Cockburn, A. (2005). Hexagonal Architecture. *alistair.cockburn.us*. https://alistair.cockburn.us/hexagonal-architecture/
+4. Freeman, S., & Pryce, N. (2009). *Growing Object-Oriented Software, Guided by Tests*. Addison-Wesley.
+5. Keeling, M. (2017). *Design It!: From Programmer to Software Architect*. Pragmatic Bookshelf.
+6. Kruchten, P. B. (1995). The 4+1 View Model of Architecture. *IEEE Software*, 12(6), 42–50. https://doi.org/10.1109/52.469759
+7. Nygard, M. T. (2011). Documenting Architecture Decisions. *cognitect.com*. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
+8. Parnas, D. L. (1972). On the criteria to be used in decomposing systems into modules. *CACM*, 15(12), 1053–1058. https://doi.org/10.1145/361598.361623
diff --git a/docs/scientific-research/cognitive-science.md b/docs/research/cognitive-science.md
similarity index 98%
rename from docs/scientific-research/cognitive-science.md
rename to docs/research/cognitive-science.md
index dad8e2b..2a1b94f 100644
--- a/docs/scientific-research/cognitive-science.md
+++ b/docs/research/cognitive-science.md
@@ -65,7 +65,7 @@ Mechanisms from cognitive and social psychology that justify workflow design dec
 | **Status** | Confirmed |
 | **Core finding** | Highest-quality thinking emerges when parties hold different hypotheses and are charged with finding flaws in each other's reasoning. |
 | **Mechanism** | Explicitly framing the reviewer as "your job is to break this feature" activates the adversarial collaboration mode. The reviewer seeks disconfirmation rather than confirmation. |
-| **Where used** | Adversarial mandate in `reviewer.md` and `verify/SKILL.md`. |
+| **Where used** | Adversarial mandate in `system-architect.md` and `verify/SKILL.md`. |
 
 ---
 
@@ -92,7 +92,7 @@ Mechanisms from cognitive and social psychology that justify workflow design dec
 | **Status** | Confirmed |
 | **Core finding** | Structured tables reduce working memory load vs. narrative text. Chunking related items into table rows enables parallel processing. |
 | **Mechanism** | Replacing prose checklists with structured tables (rows × columns) allows the reviewer to process all items in a single pass. |
-| **Where used** | All enforcement tables in `verify/SKILL.md` and `reviewer.md`. |
+| **Where used** | All enforcement tables in `verify/SKILL.md` and `system-architect.md`. |
 
 ---
 
diff --git a/docs/research/documentation.md b/docs/research/documentation.md
new file mode 100644
index 0000000..aebde01
--- /dev/null
+++ b/docs/research/documentation.md
@@ -0,0 +1,116 @@
+# Scientific Research — Documentation
+
+Foundations for living documentation, docs-as-code, information architecture, and post-mortem practices used in this template.
+
+---
+
+### 59. Information Needs in Collocated Software Development Teams
+
+| | |
+|---|---|
+| **Source** | Ko, A. J., DeLine, R., & Venolia, G. (2007). "Information Needs in Collocated Software Development Teams." *Proc. 29th International Conference on Software Engineering (ICSE 2007)*, pp. 344–353. IEEE. https://doi.org/10.1109/ICSE.2007.45 |
+| **Date** | 2007 |
+| **Alternative** | Dagenais, B., & Robillard, M. P. (2010). "Creating and evolving developer documentation." *Proc. FSE 2010*, pp. 127–136. ACM. |
+| **Status** | Confirmed — empirical study; 600+ citations |
+| **Core finding** | Developers spend 35–50% of their working time not writing code but searching for information — navigating code, reading past decisions, and understanding relationships between components. The most frequently sought information is: who wrote this, why was it written this way, and what does this module depend on. Direct questioning of teammates is the most common fallback when documentation is absent, creating serial bottlenecks. |
+| **Mechanism** | Information seeking is triggered by a task, not by curiosity. A developer encountering an unfamiliar component has a specific decision to make. When documentation is absent, the seek-ask-wait loop (find the right person, ask, wait for a response) dominates time. Persistent documentation (ADRs, architecture diagrams, glossary) short-circuits this loop by making the answer findable without a human intermediary. |
+| **Where used** | Justifies the full `update-docs` skill: C4 diagrams answer "what does this module depend on?"; the ADR record answers "why was it written this way?"; the living glossary answers "what does this term mean in this context?". Together these make the dependency and rationale questions identified by Ko et al. answerable without a human intermediary; the remaining question, "who wrote this?", is already answered by git history. |
+
+---
+
+### 60. Software Engineering at Google — Documentation Chapter
+
+| | |
+|---|---|
+| **Source** | Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google: Lessons Learned from Programming Over Time*. O'Reilly. Chapter 10: "Documentation." https://abseil.io/resources/swe-book/html/ch10.html |
+| **Date** | 2020 |
+| **Alternative** | Fitzpatrick, B., & Collins-Sussman, B. (2012). *Team Geek*. O'Reilly. |
+| **Status** | Confirmed — large-scale industry evidence from a codebase with ~2 billion lines of code |
+| **Core finding** | Documentation that lives outside the code repository decays at a rate proportional to how often the code changes — because there is no mechanism that forces the doc to be updated when the code changes. Docs-as-code (documentation in the same repo, reviewed in the same PRs, tested in the same CI pipeline) dramatically reduces divergence because the cost of updating the doc is incurred at the same moment as the cost of the code change. |
+| **Mechanism** | Google's g3doc system co-locates docs with the code they describe. When a PR changes `payments/service.py`, the reviewer also sees `payments/README.md` in the diff and can flag staleness immediately. At scale, Google found that docs with no co-located tests or CI checks become stale within 3–6 months regardless of team discipline. |
+| **Where used** | Justifies co-locating `docs/` within the project repository. Living docs (`docs/context.md`, `docs/container.md`, `docs/glossary.md`) are updated in the same commits as the code they describe. The `update-docs` skill is the mechanism that enforces this — it runs after Step 5 to regenerate diagrams from the current state of the codebase and discovery docs. |
+
+---
+
+### 61. Diátaxis — A Systematic Framework for Technical Documentation
+
+| | |
+|---|---|
+| **Source** | Procida, D. (2021). "Diátaxis — A systematic approach to technical documentation." *diataxis.fr*. https://diataxis.fr |
+| **Date** | 2021 |
+| **Status** | Confirmed — adopted by Django, NumPy, Gatsby, Cloudflare, and the Python Software Foundation |
+| **Core finding** | Technical documentation fails because it conflates four fundamentally different needs into a single undifferentiated text. The four types are: **Tutorials** (learning-oriented; guides a beginner through a complete task), **How-to guides** (task-oriented; solves a specific problem for a practitioner), **Reference** (information-oriented; describes the system accurately and completely), **Explanation** (understanding-oriented; discusses concepts and decisions). Each type has a different audience mental state and requires a different writing mode. Mixing them degrades all four. |
+| **Mechanism** | The two axes of Diátaxis are: **practical ↔ theoretical** (tutorials and how-to guides are practical; reference and explanation are theoretical) and **acquiring ↔ applying** (tutorials and explanation are for acquiring knowledge; how-to guides and reference are for applying it). A document that tries to be both a tutorial and a reference simultaneously will be a poor tutorial (too much information) and a poor reference (not structured for lookup). |
+| **Where used** | Documentation structure in this template maps to Diátaxis: `README.md` = tutorial (getting started), `AGENTS.md` = reference (complete description of roles, skills, commands) and explanation (why the workflow exists), `docs/context.md` and `docs/container.md` = reference (system structure), post-mortems = explanation (why decisions were made). The `update-docs` skill produces reference-type documentation (C4 diagrams, glossary) — not tutorials. |
+
+---
+
+### 62. Blameless Post-Mortems and a Just Culture
+
+| | |
+|---|---|
+| **Source** | Allspaw, J. (2012). "Blameless PostMortems and a Just Culture." *code.etsy.com* (archived). https://www.etsy.com/codeascraft/blameless-postmortems/ |
+| **Date** | 2012 |
+| **Alternative** | Dekker, S. (2006). *The Field Guide to Understanding Human Error*. Ashgate. |
+| **Status** | Confirmed — foundational DevOps/SRE practice; referenced in Google SRE Book (2016) |
+| **Core finding** | Post-mortems that assign blame produce less information and lower long-term system reliability than blameless post-mortems. When individuals believe they will be blamed, they withhold information about contributing factors, preventing the systemic causes from being identified and fixed. A blameless post-mortem treats the incident as a system failure, not an individual failure — asking "what conditions allowed this to happen?" not "who caused this?" |
+| **Mechanism** | Allspaw's model separates two questions: (1) what happened? (factual, blameless) and (2) what changes would prevent recurrence? (systemic). The post-mortem document records both. The output is not an individual's performance review but a list of system changes — process improvements, documentation gaps, tooling additions. Etsy's incident rate fell after adopting blameless post-mortems because engineers began reporting near-misses that they previously concealed. |
+| **Where used** | `docs/post-mortem/` directory. Post-mortems in this template follow the blameless model: they report workflow gaps found, not who made the mistake. The output of each post-mortem is a list of improvements to skills, agents, or workflow documentation. The `update-docs` skill is one such improvement — it emerged from the discovery that architecture and glossary documentation were falling behind the codebase. |
+
+---
+
+### 69. arc42 — Architecture Documentation Template
+
+| | |
+|---|---|
+| **Source** | Starke, G., & Hruschka, P. (2022). *arc42 — Pragmatic, practical and proven: Template for documentation of software and system architecture*. https://arc42.org |
+| **Date** | 2005 (first release); 2022 (current edition) |
+| **Alternative** | Rozanski, N., & Woods, E. (2011). *Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives* (2nd ed.). Addison-Wesley. |
+| **Status** | Confirmed — ISO 25010-aligned; widely adopted in European enterprise software; open-source; used by Siemens, Deutsche Telekom, and others |
+| **Core finding** | Architecture documentation fails when it conflates two distinct audiences: those who need to understand the system now (operators, new developers, AI agents) and those who need to trace historical decisions (auditors, architects). arc42 separates these explicitly: Section 1 (Introduction and Goals) and Section 4 (Solution Strategy) describe the current state (what the system does and the key decisions governing it), while Section 9 (Architecture Decisions) is the append-only ADR log. The current-state and historical sections coexist but serve different readers. |
+| **Mechanism** | arc42 provides 12 numbered sections with defined scope for each. The critical separation: current-state sections (1, 4, 5, 6) are rewritten when the system changes; historical sections (9) are append-only. This prevents the common failure mode of treating all architecture documentation as a changelog, which makes it unusable as a reference for onboarding. |
+| **Where used** | Justifies the `docs/system.md` pattern: a rewritten current-state snapshot (equivalent to arc42 Sections 1 + 4) that the SA updates at Step 2, distinct from any append-only decision history. Git history provides the audit trail without requiring a separate ADR log file. |
+
+---
+
+### 70. Google Design Docs — Living Specification Pattern
+
+| | |
+|---|---|
+| **Source** | Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google*. O'Reilly. Chapter 10. https://abseil.io/resources/swe-book/html/ch10.html |
+| **Date** | 2020 |
+| **Alternative** | Ousterhout, J. (2018). *A Philosophy of Software Design*. Yaknyam Press. (Chapter 15: "Write the Comments First") |
+| **Status** | Confirmed — large-scale industry evidence; Google's design doc practice predates the book and is widely replicated at Stripe, Notion, Airbnb |
+| **Core finding** | A design doc (also called a technical spec or RFC) is written before implementation and kept current afterward. It is not append-only: it is a living snapshot that reflects how the system works now. Its sections are: goals, non-goals, current state, design decisions, and trade-offs. When the system changes significantly, the design doc is updated (not superseded) so that it remains the authoritative single reference for the system. It is archived (not deleted) only when the system is entirely replaced. |
+| **Mechanism** | The design doc is the canonical answer to "what is this system and why does it work this way?" New team members read the design doc, not the git log. The document is kept current because the cost of updating it is low (it is co-located in the repo) and the cost of not updating it is high (onboarding failures, wrong decisions). Unlike ADRs, design docs answer the current state question directly rather than requiring the reader to replay a sequence of decisions. |
+| **Where used** | Justifies the rewrite-not-append model for `docs/system.md`. The SA rewrites `docs/system.md` at Step 2 to reflect the system after each feature — same lifecycle as a Google design doc. This entry extends entry 60 (docs-as-code) with the specific design doc pattern. |
+
+---
+
+### 71. RFC / Technical Spec Pattern — Authoritative Living Reference
+
+| | |
+|---|---|
+| **Source** | Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google*. O'Reilly. (RFC culture at Google, Stripe, Notion, Airbnb). See also: Skelton, M., & Pais, M. (2019). *Team Topologies*. IT Revolution Press. (Chapter 7: "Team Interaction Modes") |
+| **Date** | 2020 |
+| **Alternative** | RFC 2119 (Bradner, 1997) for the formal RFC model; internal RFC practices at Stripe (public eng blog, 2021) and Notion (public eng blog, 2022) |
+| **Status** | Confirmed — widely adopted industry practice; independently replicated across large engineering organizations |
+| **Core finding** | A technical spec (RFC, design doc, system doc) is the authoritative description of how the system works now. It is a single document that answers: what is this, who uses it, how is it structured, what are the key constraints. It is not a changelog. When the system changes, the spec is updated in place so it always reflects current reality. When a system is retired, the spec is archived (moved, not deleted) so the record is preserved. The spec is kept current because it is the primary onboarding artifact — the first document a new engineer reads. |
+| **Mechanism** | The pattern's authority comes from its singularity: there is exactly one canonical reference. Multiple documents (a design doc here, an ADR log there, a wiki page somewhere else) create the "which one is correct?" problem that degrades onboarding speed. A single rewritten document with git history for audit purposes gives onboarding speed and audit capability simultaneously. |
+| **Where used** | Confirms the single-document model for `docs/system.md`. One file, always current, SA rewrites it at Step 2. Git history provides the full change record without requiring a separate append-only log. Entries 69, 70, and 71 together form the evidence base for `docs/system.md` replacing the ADR-log format of `docs/architecture.md`. |
+
+---
+
+## Bibliography
+
+1. Allspaw, J. (2012). Blameless PostMortems and a Just Culture. *code.etsy.com*. https://www.etsy.com/codeascraft/blameless-postmortems/
+2. Bradner, S. (1997). Key words for use in RFCs to Indicate Requirement Levels. *RFC 2119*. IETF. https://www.rfc-editor.org/rfc/rfc2119
+3. Dagenais, B., & Robillard, M. P. (2010). Creating and evolving developer documentation. *Proc. FSE 2010*, pp. 127–136. ACM.
+4. Dekker, S. (2006). *The Field Guide to Understanding Human Error*. Ashgate.
+5. Ko, A. J., DeLine, R., & Venolia, G. (2007). Information Needs in Collocated Software Development Teams. *Proc. ICSE 2007*, pp. 344–353. https://doi.org/10.1109/ICSE.2007.45
+6. Ousterhout, J. (2018). *A Philosophy of Software Design*. Yaknyam Press.
+7. Procida, D. (2021). Diátaxis — A systematic approach to technical documentation. *diataxis.fr*. https://diataxis.fr
+8. Rozanski, N., & Woods, E. (2011). *Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives* (2nd ed.). Addison-Wesley.
+9. Skelton, M., & Pais, M. (2019). *Team Topologies*. IT Revolution Press.
+10. Starke, G., & Hruschka, P. (2022). arc42 — Pragmatic, practical and proven. https://arc42.org
+11. Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google*. O'Reilly. Chapter 10. https://abseil.io/resources/swe-book/html/ch10.html
diff --git a/docs/scientific-research/domain-modeling.md b/docs/research/domain-modeling.md
similarity index 90%
rename from docs/scientific-research/domain-modeling.md
rename to docs/research/domain-modeling.md
index eb9143e..2b550e7 100644
--- a/docs/scientific-research/domain-modeling.md
+++ b/docs/research/domain-modeling.md
@@ -14,7 +14,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — foundational DDD literature |
 | **Core finding** | A Bounded Context is a boundary within which a particular ubiquitous language is consistent. Features are identified by grouping related user stories that share the same language. The decomposition criterion is "single responsibility per context" + "consistency of language." |
 | **Mechanism** | In DDD: (1) Extract ubiquitous language from requirements → (2) Group by language consistency → (3) Each group is a candidate bounded context → (4) Each bounded context maps to a feature. Context Mapper automates this: User Stories → Subdomains (via noun/verb extraction) → Bounded Contexts of type FEATURE. |
-| **Where used** | Stage 1 Discovery: after session synthesis, verify each feature has consistent language. Noun/verb extraction from discovery answers builds the Domain Model in `docs/discovery.md`. The `Rules (Business):` section in `.feature` files captures the ubiquitous language rules that govern each feature. |
+| **Where used** | Stage 1 Discovery: after session synthesis, verify each feature has consistent language. Noun/verb extraction from discovery answers produces candidate entities, formalized by the SA in `docs/domain-model.md` at Step 2. The `Rules (Business):` section in `.feature` files captures the ubiquitous language rules that govern each feature. |
 
 ---
 
@@ -28,7 +28,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — freely available CC-BY canonical summary; maintained by Evans personally |
 | **Core finding** | The open-access pattern summary of all DDD patterns from the 2003 book. More precisely citable than the book for specific pattern definitions. Key patterns: Ubiquitous Language ("Use the model as the backbone of a language. Commit the team to exercising that language relentlessly in all communication within the team and in the code."), Bounded Context, Context Map, Domain Events, Aggregates, Repositories. |
 | **Mechanism** | Each pattern is described with: intent, prescription, and "therefore" consequences. The Ubiquitous Language pattern prescribes: use the same terms in diagrams, writing, and especially speech. Refactor the code when the language changes. Resolve confusion over terms in conversation, the way confusion over ordinary words is resolved — by agreement and precision. |
-| **Where used** | Primary reference for `docs/discovery.md` Domain Model structure and the ubiquitous language practice. `living-docs` skill glossary entries derive from this: terms must match code identifiers (Evans' "use the same language in code" prescription). `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | Primary reference for `docs/domain-model.md` structure and the ubiquitous language practice. `update-docs` skill glossary entries derive from this: terms must match code identifiers (Evans' "use the same language in code" prescription). `docs/research/domain-modeling.md`. |
 | **Note** | Supersedes entry #31 as the citable source for specific pattern quotes. Entry #31 remains as the book reference. Use this entry when citing a specific Evans pattern definition. |
 
 ---
@@ -43,7 +43,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — widely cited secondary source; Fowler wrote the DDD foreword and is considered the authoritative secondary interpreter of Evans |
 | **Core finding** | The ubiquitous language is a practice, not a document. The glossary is a secondary artifact — a snapshot of the current state of the language. The language itself lives in conversation, in the code, and in all written communication. "By using the model-based language pervasively and not being satisfied until it flows, we approach a model that is complete and comprehensible." Domain experts must object to inadequate terms; developers must flag ambiguity. |
 | **Mechanism** | The key test of a ubiquitous language: can a domain expert read the domain layer code and recognize their domain? If the code uses different names than the glossary, the code must be refactored — not the glossary relaxed. The language evolves through experimentation with alternative expressions, followed by code refactoring to match the new model. |
-| **Where used** | `living-docs` skill — grounds the rule "verify each term matches the identifier used in the code's domain layer." `docs/glossary.md` — the glossary is explicitly secondary to the code. `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | `update-docs` skill — grounds the rule "verify each term matches the identifier used in the code's domain layer." `docs/glossary.md` — the glossary is explicitly secondary to the code. `docs/research/domain-modeling.md`. |
 
 ---
 
@@ -57,7 +57,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — includes a direct Evans quote; the canonical accessible reference for Bounded Context as a design pattern |
 | **Core finding** | "Total unification of the domain model for a large system will not be feasible or cost-effective" (Evans, quoted directly). The same word can mean different things in different Bounded Contexts — this is not a defect but a reflection of domain reality. "You need a different model when the language changes." A Bounded Context is the boundary within which a particular ubiquitous language is internally consistent. Terms must be qualified by their context when a project has more than one bounded context. |
 | **Mechanism** | Fowler's electricity utility example: the word "meter" meant different things in billing, grid management, and customer service. Attempting to unify these into one definition created confusion. Each bounded context maintains its own model and its own language. Context Maps document the relationships and translation rules between bounded contexts. |
-| **Where used** | `living-docs` skill — `**Bounded context:**` field in `docs/glossary.md` entries is mandatory when the project has more than one bounded context (this is the Evans/Fowler requirement). `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | `update-docs` skill — `**Bounded context:**` field in `docs/glossary.md` entries is mandatory when the project has more than one bounded context (this is the Evans/Fowler requirement). `docs/research/domain-modeling.md`. |
 
 ---
 
@@ -71,7 +71,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — second most cited DDD book; ~5,000 citations |
 | **Core finding** | Three additions to Evans: (1) **Domain Events as first-class vocabulary** — past-tense verb phrases ("OrderPlaced," "VersionDisplayed") are part of the ubiquitous language and belong in the glossary as a distinct type. (2) **Context Maps as the organizing principle** for multi-context glossaries — each bounded context has its own language documentation; the Context Map shows translation rules between contexts. (3) **Documentation co-located with the code** — docs in the same repository decay at the same rate as the code, dramatically reducing divergence. |
 | **Mechanism** | Vernon's IDDD samples (github.com/VaughnVernon/IDDD_Samples) demonstrate all three in practice. The Product Owner / Business Analyst plays the domain-expert-representative role in glossary maintenance — validating semantic correctness — while developers own structural precision. Neither writes the glossary unilaterally. |
-| **Where used** | `living-docs` skill — `Domain Event` added as a distinct Type value in `docs/glossary.md` entries. Grounds the PO-owned glossary with SE input via `docs/architecture.md` Reason: fields. `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | `update-docs` skill — `Domain Event` added as a distinct Type value in `docs/glossary.md` entries. Grounds the PO-owned glossary with SE input via `docs/adr/ADR-YYYY-MM-DD-<slug>.md` Reason: fields. `docs/research/domain-modeling.md`. |
 
 ---
 
@@ -85,7 +85,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — original URL is 404; widely documented through community discussion and practitioner secondary accounts; thesis is uncontested in the DDD community |
 | **Core finding** | A glossary is not a ubiquitous language. Teams that maintain a glossary but do not reflect its terms in the code have the *appearance* of a ubiquitous language without the substance. The glossary is a secondary artifact derived from the code and domain-expert conversations — not the reverse. The canonical source of truth is the domain layer code, not the glossary document. A glossary that diverges from the code is lying. |
 | **Mechanism** | The test: can a domain expert read the domain layer code and recognize their domain without a translator? If yes, the ubiquitous language exists. If the only evidence of the language is the glossary document, it does not exist. Consequence: every term added to the glossary must be verified against the corresponding code identifier. |
-| **Where used** | `living-docs` skill — grounds the checklist item "Verify each term matches the identifier used in the code's domain layer." Prevents the common failure mode of glossary-as-theatre. `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | `update-docs` skill — grounds the checklist item "Verify each term matches the identifier used in the code's domain layer." Prevents the common failure mode of glossary-as-theatre. `docs/research/domain-modeling.md`. |
 
 ---
 
@@ -99,7 +99,7 @@ Foundations for bounded context identification, ubiquitous language, and feature
 | **Status** | Confirmed — freely available; Evans' own post-2003 process guidance |
 | **Core finding** | Model exploration is a cycle: Scenario Exploring → Harvesting Abstractions → Probing the Model → Challenging the Model → back to Scenario Exploring. New vocabulary crystallizes at the Harvesting Abstractions step — concrete scenarios surface candidate terms, which are then named, defined, and reflected in the code. The glossary grows at each Harvesting Abstractions step. |
 | **Mechanism** | The Whirlpool is not a development process — it fits within most iterative processes. It is a model-exploration subprocess triggered whenever the team encounters a poorly understood domain concept. The output of each cycle is a refined model expressed in clearer language, with updated code identifiers and glossary entries. |
-| **Where used** | `living-docs` skill — grounds the timing of glossary updates: after each completed feature (Step 5) corresponds to the Harvesting Abstractions step in the Whirlpool. Discovery sessions (Stage 1) correspond to Scenario Exploring. `docs/scientific-research/domain-modeling.md`. |
+| **Where used** | `update-docs` skill — grounds the timing of glossary updates: after each completed feature (Step 5) corresponds to the Harvesting Abstractions step in the Whirlpool. Discovery sessions (Stage 1) correspond to Scenario Exploring. `docs/research/domain-modeling.md`. |
 
 ---
 
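The glossary rule cited in the entries above ("verify each term matches the identifier used in the code's domain layer") can be sketched as a small check. This is a minimal illustration, not the `update-docs` skill itself: the function names, the normalization rule, and the sample domain source are all hypothetical.

```python
import ast
import re


def normalize(name: str) -> str:
    """Lower-case a name and drop separators so 'Order Placed' == 'OrderPlaced'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def domain_identifiers(source: str) -> set[str]:
    """Collect normalized class and function names defined in Python source."""
    tree = ast.parse(source)
    return {
        normalize(node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }


def unmatched_terms(glossary_terms: list[str], source: str) -> list[str]:
    """Return glossary terms that have no matching identifier in the source."""
    known = domain_identifiers(source)
    return [term for term in glossary_terms if normalize(term) not in known]


# Hypothetical domain-layer module and glossary terms, for illustration only.
DOMAIN_SOURCE = '''
class Order:
    def place(self): ...

class OrderPlaced:
    pass
'''

missing = unmatched_terms(["Order", "Order Placed", "Shipment"], DOMAIN_SOURCE)
print(missing)  # ['Shipment']
```

A non-empty result signals that the glossary and the domain code have diverged, which is the failure mode the entries above warn about.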
diff --git a/docs/scientific-research/oop-design.md b/docs/research/oop-design.md
similarity index 64%
rename from docs/scientific-research/oop-design.md
rename to docs/research/oop-design.md
index 4b0637d..2c0ae9c 100644
--- a/docs/scientific-research/oop-design.md
+++ b/docs/research/oop-design.md
@@ -56,9 +56,25 @@ Foundations for object-oriented design principles used in this template.
 
 ---
 
+### 36. refactoring.guru — Code Smells, Refactoring Techniques, and Design Patterns
+
+| | |
+|---|---|
+| **Source** | Shvets, A. (2014–present). *Refactoring.Guru*. https://refactoring.guru |
+| **Date** | 2014–present (continuously updated) |
+| **Status** | Practitioner synthesis — widely used reference |
+| **Core finding** | Three interconnected catalogs: (1) **22 code smells** in 5 categories (Bloaters, OO Abusers, Change Preventers, Dispensables, Couplers); (2) **~70 refactoring techniques** in 6 categories (Composing Methods, Moving Features Between Objects, Organizing Data, Simplifying Conditional Expressions, Simplifying Method Calls, Dealing with Generalization); (3) **22 design patterns from the GoF catalog** with visual diagrams and multi-language examples. The unique value is the **interconnected navigation**: each smell links to the techniques that address it, and techniques link to patterns they lead toward. |
+| **Mechanism** | Navigation chain: smell → techniques → patterns. Smell categories group related structural problems (e.g., Bloaters = classes/methods grown too large; Dispensables = code that can safely be removed; Couplers = excessive dependency between classes). Each technique has a before/after structure, prerequisites, and trade-offs. |
+| **Smell categories** | **Bloaters** (Long Method, Large Class, Primitive Obsession, Long Parameter List, Data Clumps); **OO Abusers** (Switch Statements, Temporary Field, Refused Bequest, Alternative Classes with Different Interfaces); **Change Preventers** (Divergent Change, Shotgun Surgery, Parallel Inheritance Hierarchies); **Dispensables** (Comments, Duplicate Code, Lazy Class, Data Class, Dead Code, Speculative Generality); **Couplers** (Feature Envy, Inappropriate Intimacy, Message Chains, Middle Man, Incomplete Library Class) |
+| **Technique categories** | Composing Methods, Moving Features Between Objects, Organizing Data, Simplifying Conditional Expressions, Simplifying Method Calls, Dealing with Generalization |
+| **Where used** | `refactor/SKILL.md`: expanded smell table with all 5 categories. `apply-patterns/SKILL.md`: cross-reference for GoF pattern selection. |
+
+---
+
 ## Bibliography
 
 1. Bay, J. (~2005). "Object Calisthenics." *IEEE Software/DevX*. https://www.bennadel.com/resources/uploads/2012/objectcalisthenics.pdf
 2. Fowler, M. (1999/2018). *Refactoring: Improving the Design of Existing Code* (2nd ed.). Addison-Wesley. https://martinfowler.com/books/refactoring.html
 3. Gamma, E., Helm, R., Johnson, R., & Vlissides, J. (1995). *Design Patterns: Elements of Reusable Object-Oriented Software*. Addison-Wesley.
 4. Martin, R. C. (2000). "Principles of OOD." *ButUncleBob.com*. https://blog.interface-solv.com/wp-content/uploads/2020/07/Principles-Of-OOD.pdf
+5. Shvets, A. (2014–present). *Refactoring.Guru*. https://refactoring.guru
diff --git a/docs/scientific-research/refactoring-empirical.md b/docs/research/refactoring-empirical.md
similarity index 100%
rename from docs/scientific-research/refactoring-empirical.md
rename to docs/research/refactoring-empirical.md
diff --git a/docs/scientific-research/requirements-elicitation.md b/docs/research/requirements-elicitation.md
similarity index 100%
rename from docs/scientific-research/requirements-elicitation.md
rename to docs/research/requirements-elicitation.md
diff --git a/docs/scientific-research/software-economics.md b/docs/research/software-economics.md
similarity index 100%
rename from docs/scientific-research/software-economics.md
rename to docs/research/software-economics.md
diff --git a/docs/scientific-research/testing.md b/docs/research/testing.md
similarity index 97%
rename from docs/scientific-research/testing.md
rename to docs/research/testing.md
index 2c7f7d7..6ebdd87 100644
--- a/docs/scientific-research/testing.md
+++ b/docs/research/testing.md
@@ -26,7 +26,7 @@ Foundations for test design, TDD, BDD, and property-based testing used in this t
 | **Status** | Confirmed |
 | **Core finding** | Test setup may need to change if implementation changes, but the actual test shouldn't need to change if the code's user-facing behavior doesn't change. |
 | **Mechanism** | Tests that are tightly coupled to implementation break on refactoring and become a drag on design improvement. Behavioral tests survive internal rewrites. |
-| **Where used** | Contract test rule in `implementation/SKILL.md`, reviewer verification check in `reviewer.md`. |
+| **Where used** | Contract test rule in `implement/SKILL.md`, system-architect verification check in `verify/SKILL.md`. |
 
 ---
 
@@ -39,7 +39,7 @@ Foundations for test design, TDD, BDD, and property-based testing used in this t
 | **Status** | Confirmed |
 | **Core finding** | Tests should be treated as first-class citizens of the system — not coupled to implementation. Bad tests are worse than no tests because they give false confidence. |
 | **Mechanism** | Tests written as "contract tests" — describing what the caller observes — remain stable through refactoring. Tests that verify implementation details are fragile and create maintenance burden. |
-| **Where used** | Contract test rule in `implementation/SKILL.md`, verification check in `reviewer.md`. |
+| **Where used** | Contract test rule in `implement/SKILL.md`, verification check in `verify/SKILL.md`. |
 
 ---
 
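The contract-test rule cited in the entries above can be illustrated with two interchangeable implementations and two styles of test. This is a hedged sketch: the class names, `contract_test`, and `coupled_test` are hypothetical and do not come from `implement/SKILL.md`.

```python
class RunningTotalV1:
    """First implementation: stores every amount and sums on demand."""

    def __init__(self) -> None:
        self._amounts: list[int] = []

    def add(self, amount: int) -> None:
        self._amounts.append(amount)

    def total(self) -> int:
        return sum(self._amounts)


class RunningTotalV2:
    """Refactored implementation: keeps only a running sum."""

    def __init__(self) -> None:
        self._sum = 0

    def add(self, amount: int) -> None:
        self._sum += amount

    def total(self) -> int:
        return self._sum


def contract_test(factory) -> bool:
    """Check only what the caller observes through add() and total()."""
    counter = factory()
    counter.add(2)
    counter.add(3)
    return counter.total() == 5


def coupled_test(factory) -> bool:
    """Anti-example: reaches into a private attribute of the first version."""
    counter = factory()
    counter.add(2)
    return getattr(counter, "_amounts", None) == [2]


for impl in (RunningTotalV1, RunningTotalV2):
    print(impl.__name__, contract_test(impl), coupled_test(impl))
```

The contract test passes for both versions, while the coupled test fails after the refactor even though user-facing behavior is unchanged.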
diff --git a/docs/research/version-control.md b/docs/research/version-control.md
new file mode 100644
index 0000000..195e9cc
--- /dev/null
+++ b/docs/research/version-control.md
@@ -0,0 +1,57 @@
+# Version Control & Branching Strategies
+
+## 63. Pro Git — Scott Chacon & Ben Straub
+
+**Source**: Chacon, S., & Straub, B. (2014). *Pro Git* (2nd ed.). Apress. Free online: https://git-scm.com/book
+
+**Key Insight**: Git's distributed model makes branching and merging cheap daily operations, not rare scary events. The book covers the full Git object model (blobs, trees, and commits, plus the refs that point at them), which explains why operations like `rebase` rewrite history while `revert` appends to it. That distinction underpins our "no history rewrite" safety protocol.
+
+**Relevance**: Foundation for all Git operations in the project. The object model chapter explains why `git revert` is safe on shared branches while `rebase` is not.
+
+---
+
+## 64. A Successful Git Branching Model — Vincent Driessen
+
+**Source**: Driessen, V. (2010). A successful Git branching model. https://nvie.com/posts/a-successful-git-branching-model/
+
+**Key Insight**: The "git-flow" model defines `master` and `develop` as infinite-lifetime branches, with feature, `release-*`, and `hotfix-*` branches as supporting branches that have a limited lifetime. The `--no-ff` merge is explicitly recommended to preserve feature boundaries in history, making whole-feature reverts possible.
+
+> "The `--no-ff` flag causes the merge to always create a new commit object, even if the merge could be performed with a fast-forward. This avoids losing information about the historical existence of a feature branch."
+
+**Relevance**: Direct basis for our branch model. We use `feat/<stem>` and `fix/<stem>` branches, merge to `main` with `--no-ff`, and delete branches after merge.
+
+---
+
+## 65. Git Cheat Sheet — Git SCM
+
+**Source**: Git SCM. Git Cheat Sheet. https://git-scm.com/cheat-sheet
+
+**Key Insight**: Quick reference for everyday commands. Covers `git merge-tree` for conflict detection without touching working tree, `git log --follow` for renamed files, and `git reflog` for recovery — all relevant to our workflow.
+
+**Relevance**: Operational reference for the SE when executing branch operations.
+
+---
+
+## 66. Common Git Issues & Anti-Patterns
+
+**Source**: Fowler, M. (2020). Patterns for Managing Source Code Branches. https://martinfowler.com/articles/branching-patterns.html
+
+**Key Insight**: Fowler catalogs branching patterns, among them "feature branching" (all work for a feature happens on its own branch and is integrated into mainline when the feature is complete) and the "release branch" (a branch that accepts only commits made to stabilize a release). Our model is feature branching: branches live only for the duration of one feature, then merge to `main`.
+
+**Anti-patterns to avoid**:
+- **Long-lived feature branches**: increase merge conflict risk and integration pain
+- **Force push on shared branches**: destroys history that others may have fetched
+- **Squash merge on collaborative branches**: erases individual commit authorship and makes bisect harder
+- **Committing directly to main**: bypasses review and breaks the closed loop
+
+**Relevance**: Validates our WIP=1 approach and our safety protocol against force push and history rewrite.
+
+---
+
+## 67. Merge vs. Rebase — When to Use Each
+
+**Source**: Atlassian Git Tutorial. Merging vs. Rebasing. https://www.atlassian.com/git/tutorials/merging-vs-rebasing
+
+**Key Insight**: Rebase rewrites commit history by replaying commits on top of a new base. This is fine for local, unpushed branches but dangerous for shared branches because it changes commit SHAs that others may reference. Merge preserves history but creates merge commits.
+
+**Our rule**: Never rebase a pushed branch. Use `git merge main` on the feature branch to resolve conflicts, then `--no-ff` merge the feature branch to `main`.
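
The rules above (never rebase a pushed branch, merge with `--no-ff`, revert instead of rewriting) follow from how commit ids are derived. The toy model below is a sketch under stated assumptions: `Commit` and `commit` are hypothetical helpers, and the id is a hash of parent ids plus message only, whereas real Git also hashes the tree, author, committer, and timestamps.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple[str, ...]

    @property
    def sha(self) -> str:
        # Toy id: hash of the parent ids plus the message (not Git's real format).
        payload = "\n".join(self.parents) + "\n" + self.message
        return hashlib.sha1(payload.encode()).hexdigest()[:8]


def commit(message: str, *parents: Commit) -> Commit:
    """Create a commit whose identity depends on its parents' ids."""
    return Commit(message, tuple(p.sha for p in parents))


# Shared history: main is root -> m1; the feature branched from root.
root = commit("root")
m1 = commit("main: m1", root)
f1 = commit("feat: f1", root)
f2 = commit("feat: f2", f1)

# Rebase: replay f1 and f2 on top of m1. The messages are unchanged but the
# parents differ, so both replayed commits get new ids.
f1_rebased = commit("feat: f1", m1)
f2_rebased = commit("feat: f2", f1_rebased)

# Merge with --no-ff: one new commit with two parents; f1 and f2 are untouched.
merge = commit("Merge feat into main", m1, f2)

# Revert: a new commit appended on top; nothing already published changes.
revert = commit("Revert merge of feat", merge)

print("rebase changed ids:", (f1.sha, f2.sha) != (f1_rebased.sha, f2_rebased.sha))
print("merge keeps f2 as parent:", f2.sha in merge.parents)
print("revert appends to merge:", revert.parents == (merge.sha,))
```

Because an id depends on its parents, anyone who fetched `f1` or `f2` before a rebase would hold ids that no longer exist on the rewritten branch, while the merge and revert paths only ever add commits.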
diff --git a/docs/scientific-research/architecture.md b/docs/scientific-research/architecture.md
deleted file mode 100644
index 8cf3a9d..0000000
--- a/docs/scientific-research/architecture.md
+++ /dev/null
@@ -1,86 +0,0 @@
-# Scientific Research — Architecture
-
-Foundations for the architectural decisions and patterns used in this template.
-
----
-
-### 42. Hexagonal Architecture — Ports and Adapters
-
-| | |
-|---|---|
-| **Source** | Cockburn, A. (2005). "Hexagonal Architecture." *alistair.cockburn.us*. https://alistair.cockburn.us/hexagonal-architecture/ |
-| **Date** | 2005 |
-| **Alternative** | Freeman, S., & Pryce, N. (2009). *Growing Object-Oriented Software, Guided by Tests*. Addison-Wesley. (Chapter 7: "Ports and Adapters") |
-| **Status** | Confirmed — foundational; widely adopted as Clean Architecture, Onion Architecture |
-| **Core finding** | The application domain should have no knowledge of external systems (databases, filesystems, network, UI). All contact between the domain and the outside world passes through a **port** (an interface / Protocol) and an **adapter** (a concrete implementation of that port). The domain is independently testable without any infrastructure. The key structural rule: dependency arrows point inward — domain code never imports from adapters; adapters import from domain. |
-| **Mechanism** | Two distinct sides of any application: the "driving side" (actors who initiate action — tests, UI, CLI) and the "driven side" (actors the application drives — databases, filesystems, external services). Each driven-side dependency is hidden behind a port. Tests supply a test adapter; production supplies a real adapter. Substituting adapters requires no domain code changes. This is SOLID-D at the architectural layer. |
-| **Where used** | Step 2 (Architecture): if an external dependency is identified during domain analysis, assign it a Protocol. `ports/` and `adapters/` folders emerge when a concrete dependency is confirmed — do not pre-create them. The dependency-inversion principle (SOLID-D) is the goal; the folder names are convention, not law. |
-
----
-
-### 55. Architecture Decision Records (ADRs)
-
-| | |
-|---|---|
-| **Source** | Nygard, M. T. (2011). "Documenting Architecture Decisions." *cognitect.com*. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions |
-| **Date** | 2011 |
-| **Alternative** | Keeling, M. (2017). *Design It!: From Programmer to Software Architect*. Pragmatic Bookshelf. (Chapter 6: "Architectural Decisions") |
-| **Status** | Confirmed — widely adopted industry standard; tooled by adr-tools, ADR Manager, Log4Brains |
-| **Core finding** | Architectural decisions should be recorded as short, immutable documents capturing: what was decided, why, and what alternatives were rejected. Without this record, decisions get re-litigated by every new developer (or AI agent) who encounters the codebase, producing rework and re-divergence. |
-| **Mechanism** | An ADR is written at decision time, never edited afterward. If the decision changes, a new ADR is written that supersedes the old one. The append-only record becomes a reliable audit trail. The constraint "one sentence per field" forces clarity — if you can't state the reason in one sentence, the decision is not yet understood. |
-| **Where used** | `docs/architecture/architecture.md` (ADR template). SE appends one block per non-obvious decision after Step 2. The `living-docs` skill reads ADRs as input for C4 diagram annotations. |
-
----
-
-### 56. The 4+1 View Model of Architecture
-
-| | |
-|---|---|
-| **Source** | Kruchten, P. B. (1995). "The 4+1 View Model of Architecture." *IEEE Software*, 12(6), 42–50. https://doi.org/10.1109/52.469759 |
-| **Date** | 1995 |
-| **Alternative** | Bass, L., Clements, P., & Kazman, R. (2021). *Software Architecture in Practice* (4th ed.). Addison-Wesley. |
-| **Status** | Confirmed — 3,000+ citations; foundational IEEE reference for architectural documentation |
-| **Core finding** | A single architectural diagram cannot communicate all relevant aspects of a system. Four distinct views are required: **Logical** (domain objects and relationships), **Process** (runtime behavior and concurrency), **Development** (module organisation and dependencies), **Physical** (deployment topology). A fifth **Scenarios** view (use cases) ties the four together by showing how each scenario exercises each view. |
-| **Mechanism** | Different stakeholders need different views: a developer needs the Development view; an operator needs the Physical view; a domain expert needs the Logical view. Conflating views into one diagram produces a cluttered diagram that satisfies nobody. The 4+1 model assigns each concern to its appropriate view and cross-validates them through scenarios. |
-| **Where used** | Theoretical foundation for the C4 model (entry 57). The `living-docs` skill generates C4 diagrams that map to: Context diagram (Scenarios view), Container diagram (Physical + Development views), Component diagram (Logical + Development views). |
-
----
-
-### 57. The C4 Model for Software Architecture
-
-| | |
-|---|---|
-| **Source** | Brown, S. (2018). *The C4 Model for Software Architecture*. Leanpub. https://c4model.com |
-| **Date** | 2018 (ongoing) |
-| **Alternative** | Brown, S. (2023). "The C4 model for visualising software architecture." *InfoQ*. |
-| **Status** | Confirmed — widely adopted; tooled by Structurizr, PlantUML C4, Mermaid C4 |
-| **Core finding** | Software architecture can be communicated at four zoom levels: **Level 1 — System Context** (who uses the system and what external systems it talks to), **Level 2 — Container** (major runnable/deployable units), **Level 3 — Component** (major structural building blocks within a container), **Level 4 — Code** (classes, interfaces; usually auto-generated). Each level answers a specific question; mixing levels in one diagram creates confusion. |
-| **Mechanism** | C4 operationalises the 4+1 View Model (entry 56) into a lightweight notation that can be expressed in text (PlantUML, Mermaid) and version-controlled alongside code. The notation is deliberately constrained: boxes (people, systems, containers, components) and unidirectional arrows with labels. No UML formalism required. Context + Container diagrams cover >90% of communication needs for most teams. |
-| **Where used** | The `living-docs` skill generates and updates C4 diagrams in `docs/c4/`. Context diagram (L1) always generated; Container (L2) generated when multiple containers are identified; Component (L3) generated on demand. Source files are Mermaid so they render in GitHub and are version-controlled. |
-
----
-
-### 58. Information Hiding — Module Decomposition Criterion
-
-| | |
-|---|---|
-| **Source** | Parnas, D. L. (1972). "On the criteria to be used in decomposing systems into modules." *Communications of the ACM*, 15(12), 1053–1058. https://doi.org/10.1145/361598.361623 |
-| **Date** | 1972 |
-| **Alternative** | Parnas, D. L. (1974). "On a 'buzzword': Hierarchical structure." *Proc. IFIP Congress 74*, 336–339. |
-| **Status** | Confirmed — 4,000+ citations; foundational criterion for all modular decomposition in software engineering |
-| **Core finding** | The correct criterion for decomposing a system into modules is **information hiding**: each module hides a design decision that is likely to change. A module's interface reveals only what callers need; its implementation hides how. Decomposing by execution steps (procedure-based) creates tight coupling to implementation order; decomposing by change-prone decisions (information-hiding) allows each decision to be changed independently. |
-| **Mechanism** | Identify which decisions are most likely to change (data structures, algorithms, I/O formats, external service protocols). Each such decision becomes a module boundary. The module's public interface is defined to be change-stable; the implementation is change-free from the caller's perspective. This is the theoretical basis for SOLID-D (depend on abstractions), Hexagonal Architecture (hide external decisions behind ports), and DDD bounded contexts (hide language decisions behind context boundaries). |
-| **Where used** | Step 2 Architecture: bounded context check ("same word, different meaning across features? → module boundary") and external dep Protocol assignment both apply the information-hiding criterion. The `living-docs` skill uses module boundaries as container/component boundaries in `docs/c4/` diagrams. |
-
----
-
-## Bibliography
-
-1. Bass, L., Clements, P., & Kazman, R. (2021). *Software Architecture in Practice* (4th ed.). Addison-Wesley.
-2. Brown, S. (2018). *The C4 Model for Software Architecture*. Leanpub. https://c4model.com
-3. Cockburn, A. (2005). Hexagonal Architecture. *alistair.cockburn.us*. https://alistair.cockburn.us/hexagonal-architecture/
-4. Freeman, S., & Pryce, N. (2009). *Growing Object-Oriented Software, Guided by Tests*. Addison-Wesley.
-5. Keeling, M. (2017). *Design It!: From Programmer to Software Architect*. Pragmatic Bookshelf.
-6. Kruchten, P. B. (1995). The 4+1 View Model of Architecture. *IEEE Software*, 12(6), 42–50. https://doi.org/10.1109/52.469759
-7. Nygard, M. T. (2011). Documenting Architecture Decisions. *cognitect.com*. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
-8. Parnas, D. L. (1972). On the criteria to be used in decomposing systems into modules. *CACM*, 15(12), 1053–1058. https://doi.org/10.1145/361598.361623
diff --git a/docs/scientific-research/documentation.md b/docs/scientific-research/documentation.md
deleted file mode 100644
index 9c77a00..0000000
--- a/docs/scientific-research/documentation.md
+++ /dev/null
@@ -1,69 +0,0 @@
-# Scientific Research — Documentation
-
-Foundations for living documentation, docs-as-code, information architecture, and post-mortem practices used in this template.
-
----
-
-### 59. Information Needs in Collocated Software Development Teams
-
-| | |
-|---|---|
-| **Source** | Ko, A. J., DeLine, R., & Venolia, G. (2007). "Information Needs in Collocated Software Development Teams." *Proc. 29th International Conference on Software Engineering (ICSE 2007)*, pp. 344–353. IEEE. https://doi.org/10.1109/ICSE.2007.45 |
-| **Date** | 2007 |
-| **Alternative** | Dagenais, B., & Robillard, M. P. (2010). "Creating and evolving developer documentation." *Proc. FSE 2010*, pp. 127–136. ACM. |
-| **Status** | Confirmed — empirical study; 600+ citations |
-| **Core finding** | Developers spend 35–50% of their working time not writing code but searching for information — navigating code, reading past decisions, and understanding relationships between components. The most frequently sought information is: who wrote this, why was it written this way, and what does this module depend on. Direct questioning of teammates is the most common fallback when documentation is absent, creating serial bottlenecks. |
-| **Mechanism** | Information seeking is triggered by a task, not by curiosity. A developer encountering an unfamiliar component has a specific decision to make. When documentation is absent, the seek-ask-wait loop (find the right person, ask, wait for a response) dominates time. Persistent documentation (ADRs, architecture diagrams, glossary) short-circuits this loop by making the answer findable without a human intermediary. |
-| **Where used** | Justifies the full `living-docs` skill: C4 diagrams answer "what does this module depend on?"; the ADR record answers "why was it written this way?"; the living glossary answers "what does this term mean in this context?". Collectively these eliminate the three most frequent information needs identified by Ko et al. |
-
----
-
-### 60. Software Engineering at Google — Documentation Chapter
-
-| | |
-|---|---|
-| **Source** | Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google: Lessons Learned from Programming Over Time*. O'Reilly. Chapter 10: "Documentation." https://abseil.io/resources/swe-book/html/ch10.html |
-| **Date** | 2020 |
-| **Alternative** | Fitzpatrick, B., & Collins-Sussman, B. (2012). *Team Geek*. O'Reilly. |
-| **Status** | Confirmed — large-scale industry evidence from a codebase with ~2 billion lines of code |
-| **Core finding** | Documentation that lives outside the code repository decays at a rate proportional to how often the code changes — because there is no mechanism that forces the doc to be updated when the code changes. Docs-as-code (documentation in the same repo, reviewed in the same PRs, tested in the same CI pipeline) dramatically reduces divergence because the cost of updating the doc is incurred at the same moment as the cost of the code change. |
-| **Mechanism** | Google's g3doc system co-locates docs with the code they describe. When a PR changes `payments/service.py`, the reviewer also sees `payments/README.md` in the diff and can flag staleness immediately. At scale, Google found that docs with no co-located tests or CI checks become stale within 3–6 months regardless of team discipline. |
-| **Where used** | Justifies co-locating `docs/` within the project repository. Living docs (`docs/architecture/c4/`, `docs/glossary.md`) are updated in the same commits as the code they describe. The `living-docs` skill is the mechanism that enforces this — it runs after Step 5 to regenerate diagrams from the current state of the codebase and discovery docs. |
-
----
-
-### 61. Diátaxis — A Systematic Framework for Technical Documentation
-
-| | |
-|---|---|
-| **Source** | Procida, D. (2021). "Diátaxis — A systematic approach to technical documentation." *diataxis.fr*. https://diataxis.fr |
-| **Date** | 2021 |
-| **Status** | Confirmed — adopted by Django, NumPy, Gatsby, Cloudflare, and the Python Software Foundation |
-| **Core finding** | Technical documentation fails because it conflates four fundamentally different needs into a single undifferentiated text. The four types are: **Tutorials** (learning-oriented; guides a beginner through a complete task), **How-to guides** (task-oriented; solves a specific problem for a practitioner), **Reference** (information-oriented; describes the system accurately and completely), **Explanation** (understanding-oriented; discusses concepts and decisions). Each type has a different audience mental state and requires a different writing mode. Mixing them degrades all four. |
-| **Mechanism** | The two axes of Diátaxis are: **practical ↔ theoretical** (tutorials and how-to guides are practical; reference and explanation are theoretical) and **acquiring ↔ applying** (tutorials and explanation are for acquiring knowledge; how-to guides and reference are for applying it). A document that tries to be both a tutorial and a reference simultaneously will be a poor tutorial (too much information) and a poor reference (not structured for lookup). |
-| **Where used** | Documentation structure in this template maps to Diátaxis: `README.md` = tutorial (getting started), `AGENTS.md` = reference (complete description of roles, skills, commands) and explanation (why the workflow exists), `docs/c4/` = reference (system structure), post-mortems = explanation (why decisions were made). The `living-docs` skill produces reference-type documentation (C4 diagrams, glossary) — not tutorials. |
-
----
-
-### 62. Blameless Post-Mortems and a Just Culture
-
-| | |
-|---|---|
-| **Source** | Allspaw, J. (2012). "Blameless PostMortems and a Just Culture." *code.etsy.com* (archived). https://www.etsy.com/codeascraft/blameless-postmortems/ |
-| **Date** | 2012 |
-| **Alternative** | Dekker, S. (2006). *The Field Guide to Understanding Human Error*. Ashgate. |
-| **Status** | Confirmed — foundational DevOps/SRE practice; referenced in Google SRE Book (2016) |
-| **Core finding** | Post-mortems that assign blame produce less information and lower long-term system reliability than blameless post-mortems. When individuals believe they will be blamed, they withhold information about contributing factors, preventing the systemic causes from being identified and fixed. A blameless post-mortem treats the incident as a system failure, not an individual failure — asking "what conditions allowed this to happen?" not "who caused this?" |
-| **Mechanism** | Allspaw's model separates two questions: (1) what happened? (factual, blameless) and (2) what changes would prevent recurrence? (systemic). The post-mortem document records both. The output is not an individual's performance review but a list of system changes — process improvements, documentation gaps, tooling additions. Etsy's incident rate fell after adopting blameless post-mortems because engineers began reporting near-misses that they previously concealed. |
-| **Where used** | `docs/post-mortem/` directory. Post-mortems in this template follow the blameless model: they report workflow gaps found, not who made the mistake. The output of each post-mortem is a list of improvements to skills, agents, or workflow documentation. The `living-docs` skill is one such improvement — it emerged from the discovery that architecture and glossary documentation were falling behind the codebase. |
-
----
-
-## Bibliography
-
-1. Allspaw, J. (2012). Blameless PostMortems and a Just Culture. *code.etsy.com*. https://www.etsy.com/codeascraft/blameless-postmortems/
-2. Dagenais, B., & Robillard, M. P. (2010). Creating and evolving developer documentation. *Proc. FSE 2010*, pp. 127–136. ACM.
-3. Dekker, S. (2006). *The Field Guide to Understanding Human Error*. Ashgate.
-4. Ko, A. J., DeLine, R., & Venolia, G. (2007). Information Needs in Collocated Software Development Teams. *Proc. ICSE 2007*, pp. 344–353. https://doi.org/10.1109/ICSE.2007.45
-5. Procida, D. (2021). Diátaxis — A systematic approach to technical documentation. *diataxis.fr*. https://diataxis.fr
-6. Winters, T., Manshreck, T., & Wright, H. (2020). *Software Engineering at Google*. O'Reilly. Chapter 10. https://abseil.io/resources/swe-book/html/ch10.html
diff --git a/docs/discovery_journal.md b/docs/scope_journal.md
similarity index 95%
rename from docs/discovery_journal.md
rename to docs/scope_journal.md
index ef538fe..6fe6902 100644
--- a/docs/discovery_journal.md
+++ b/docs/scope_journal.md
@@ -1,4 +1,4 @@
-# Discovery Journal: <project-name>
+# Scope Journal: <project-name>
 
 ---
 
diff --git a/docs/system.md b/docs/system.md
new file mode 100644
index 0000000..efb9650
--- /dev/null
+++ b/docs/system.md
@@ -0,0 +1,40 @@
+# System: <project-name>
+
+> Last updated: YYYY-MM-DD — <feature-stem>
+
+**Purpose:** <one sentence — what problem this system solves>
+
+---
+
+## Actors
+
+| Actor | Needs |
+|-------|-------|
+| <role> | <what they need from the system> |
+
+---
+
+## Structure
+
+| Module | Responsibility |
+|--------|----------------|
+| <actual Python path> | <one sentence> |
+
+---
+
+## Key Decisions
+
+- <current-fact sentence — e.g. "Version is read from pyproject.toml via tomllib; no hardcoded constant.">
+
+---
+
+## External Dependencies
+
+| Dependency | What it provides | Why not replaced |
+|------------|------------------|------------------|
+
+---
+
+## Active Constraints
+
+- <constraint governing future features>