From 7819adf20a6c904490c92702a8693e117c0c639e Mon Sep 17 00:00:00 2001 From: nullhack Date: Sun, 19 Apr 2026 16:35:18 -0400 Subject: [PATCH 1/2] chore(workflow): integrate pytest-beehave and clean up @id conventions - Add pytest-beehave[html]>=3.0 to dev deps; configure [tool.beehave] - Fix test-build task: --cov=pytest_beehave -> --cov=app - Remove manual deprecated skip hook from conftest.py (beehave owns it) - Naming: feature-stem for .feature paths, feature_slug for test dirs - @id tags now auto-assigned on first pytest run; remove all manual generation instructions - Test stubs auto-generated at Step 2 end via test-fast; remove manual stub section - Add pytest-beehave to README tooling table; add Why section and auto-gen stub example - Update product-owner.md, AGENTS.md, and all affected skills accordingly --- .opencode/agents/product-owner.md | 13 ++-- .opencode/skills/implementation/SKILL.md | 36 +++++------ .opencode/skills/living-docs/SKILL.md | 2 +- .opencode/skills/pr-management/SKILL.md | 4 +- .opencode/skills/refactor/SKILL.md | 6 +- .opencode/skills/scope/SKILL.md | 23 +++---- .opencode/skills/session-workflow/SKILL.md | 26 ++++---- .opencode/skills/verify/SKILL.md | 2 +- AGENTS.md | 26 ++++---- README.md | 71 +++++++++++++++------- docs/architecture.md | 6 +- docs/discovery.md | 2 +- docs/discovery_journal.md | 2 +- pyproject.toml | 6 +- tests/conftest.py | 7 --- uv.lock | 20 ++++++ 16 files changed, 142 insertions(+), 110 deletions(-) diff --git a/.opencode/agents/product-owner.md b/.opencode/agents/product-owner.md index 456c339..403da3c 100644 --- a/.opencode/agents/product-owner.md +++ b/.opencode/agents/product-owner.md @@ -51,19 +51,18 @@ When a gap is reported (by software-engineer or reviewer): | Situation | Action | |---|---| -| Edge case within current user stories | Add a new Example with a new `@id` to the relevant `.feature` file. | +| Edge case within current user stories | Add a new Example to the relevant `.feature` file. 
| | New behavior beyond current stories | Add to backlog as a new feature. Do not extend the current feature. | -| Behavior contradicts an existing Example | Write a new Example with new `@id`. | -| Post-merge defect | Move the `.feature` file back to `in-progress/`, add new Example with `@id`, resume at Step 3. | +| Behavior contradicts an existing Example | Add `@deprecated` to the old Example; write a new Example. | +| Post-merge defect | Move the `.feature` file back to `in-progress/`, add new Example, resume at Step 3. | ## Bug Handling When a defect is reported against any feature: -1. Add a `@bug @id:` Example to the relevant `Rule:` block in the `.feature` file. -2. Write the Example using the standard `Given/When/Then` format describing the correct behavior. -3. Update TODO.md to note the new `@id` for the SE to implement. -4. SE implements the `@id` test in `tests/features/` **and** a `@given` Hypothesis property test in `tests/unit/`. Both are required. +1. Add a `@bug` Example to the relevant `Rule:` block in the `.feature` file using the standard `Given/When/Then` format describing the correct behavior. +2. Update TODO.md to note the new bug Example for the SE to implement. +3. SE implements the test in `tests/features/` **and** a `@given` Hypothesis property test in `tests/unit/`. Both are required. ## Available Skills diff --git a/.opencode/skills/implementation/SKILL.md b/.opencode/skills/implementation/SKILL.md index 8fbf5fd..826c6b7 100644 --- a/.opencode/skills/implementation/SKILL.md +++ b/.opencode/skills/implementation/SKILL.md @@ -116,12 +116,12 @@ Place stubs where responsibility dictates — do not pre-create `ports/` or `ada Append a new dated block to `docs/architecture.md` for each significant decision: ```markdown -## YYYY-MM-DD — : +## YYYY-MM-DD — : Decision: Reason: Alternatives considered: -Feature: +Feature: ``` Only write a block for non-obvious decisions with meaningful trade-offs. Routine YAGNI choices do not need a record. 
@@ -141,7 +141,11 @@ Apply to the stub files just written: If any check fails: fix the stub files before committing. -Commit: `feat(): add architecture stubs` +### Generate Test Stubs + +Run `uv run task test-fast` once. It reads the in-progress `.feature` file, assigns `@id` tags to any untagged `Example:` blocks (writing them back to the `.feature` file), and generates `tests/features//_test.py` — one file per `Rule:` block, one skipped function per `@id`. Verify the files were created, then stage all changes (including any `@id` write-backs to the `.feature` file). + +Commit: `feat(): add architecture and test stubs` --- @@ -152,26 +156,14 @@ Commit: `feat(): add architecture stubs` - [ ] Exactly one .feature `in_progress`. If not present, Load `skill feature-selection` - [ ] Architecture stubs present in `/` (committed by Step 2) - [ ] Read `docs/architecture.md` — understand all architectural decisions before writing any test -- [ ] Test stub files exist in `tests/features//_test.py` — one file per `Rule:` block, all `@id` stub functions present with `@pytest.mark.skip`; if missing, write them now before entering RED - -### Write Test Stubs (if not present) - -For each `Rule:` block in the in-progress `.feature` file, create `tests/features//_test.py` if it does not already exist. Write one function per `@id` Example, all skipped: - -```python -@pytest.mark.skip(reason="not yet implemented") -def test__<@id>() -> None: - """ - <@id steps raw text including new lines> - """ -``` +- [ ] Test stub files exist in `tests/features//_test.py` — generated by pytest-beehave at Step 2 end; if missing, re-run `uv run task test-fast` and commit the generated files before entering RED ### Build TODO.md Test List 1. List all `@id` tags from in-progress `.feature` file 2. Order: fewest dependencies first; most impactful within that set 3. Each `@id` = one TODO item, status: `pending` -4. 
Confirm each `@id` has a corresponding skipped stub in `tests/features//` — if any are missing, add them before proceeding +4. Confirm each `@id` has a corresponding skipped stub in `tests/features//` — if any are missing, add them before proceeding ### Outer Loop — One @id at a time @@ -182,7 +174,7 @@ For each pending `@id`: ``` INNER LOOP ├── RED -│ ├── Confirm stub for this @id exists in tests/features//.feature with @pytest.mark.skip +│ ├── Confirm stub for this @id exists in tests/features//_test.py with @pytest.mark.skip │ ├── Read existing stubs in `/` — base the test on the current data model and signatures │ ├── Write test body (Given/When/Then → Arrange/Act/Assert); remove @pytest.mark.skip │ ├── Update stub signatures as needed — edit the `.py` file directly @@ -265,11 +257,11 @@ Signal completion to the reviewer. Provide: ### Test File Layout ``` -tests/features//_test.py +tests/features//_test.py ``` -- `` = the `.feature` file stem -- `` = the `Rule:` title slugified +- `` = the `.feature` file stem with hyphens replaced by underscores, lowercase +- `` = the `Rule:` title slugified (lowercase, underscores) ### Function Naming @@ -299,7 +291,7 @@ def test__<@id>() -> None: ### Markers - `@pytest.mark.slow` — takes > 50ms (Hypothesis, DB, network, terminal I/O) -- `@pytest.mark.deprecated` — auto-skipped by conftest; used for superseded Examples +- `@pytest.mark.deprecated` — auto-skipped by pytest-beehave; used for superseded Examples ```python @pytest.mark.deprecated diff --git a/.opencode/skills/living-docs/SKILL.md b/.opencode/skills/living-docs/SKILL.md index ba2264b..8472547 100644 --- a/.opencode/skills/living-docs/SKILL.md +++ b/.opencode/skills/living-docs/SKILL.md @@ -188,7 +188,7 @@ If `docs/glossary.md` already exists: **When run standalone** (stakeholder on demand): commit after all diagrams and glossary are updated: ``` -docs(living-docs): update C4 and glossary after +docs(living-docs): update C4 and glossary after ``` If triggered 
without a specific feature (general refresh): diff --git a/.opencode/skills/pr-management/SKILL.md b/.opencode/skills/pr-management/SKILL.md index f10605c..94af430 100644 --- a/.opencode/skills/pr-management/SKILL.md +++ b/.opencode/skills/pr-management/SKILL.md @@ -14,7 +14,7 @@ Create and manage pull requests after the reviewer approves the feature (Step 5) ## Branch Naming ``` -feature/ # new feature +feature/ # new feature fix/ # bug fix refactor/ # refactoring docs/ # documentation @@ -42,7 +42,7 @@ git commit -m "chore(deps): add python-dotenv dependency" ```bash # Push branch -git push -u origin feature/ +git push -u origin feature/ # Create PR gh pr create \ diff --git a/.opencode/skills/refactor/SKILL.md b/.opencode/skills/refactor/SKILL.md index 6d84a2e..208d12d 100644 --- a/.opencode/skills/refactor/SKILL.md +++ b/.opencode/skills/refactor/SKILL.md @@ -265,9 +265,9 @@ Refactoring commits are always **separate** from feature commits. | Commit type | Message format | When | |---|---|---| -| Preparatory refactoring | `refactor(): ` | Before RED, to make the feature easier | -| REFACTOR phase | `refactor(): ` | After GREEN, cleaning up the green code | -| Feature addition | `feat(): ` | After GREEN (never mixed with refactor) | +| Preparatory refactoring | `refactor(): ` | Before RED, to make the feature easier | +| REFACTOR phase | `refactor(): ` | After GREEN, cleaning up the green code | +| Feature addition | `feat(): ` | After GREEN (never mixed with refactor) | Never mix a structural cleanup with a behavior addition in one commit. This keeps history bisectable and CI green at every commit. 
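
Reviewer note on the naming convention from the implementation skill hunk above (test directory = the `.feature` file stem with hyphens replaced by underscores, lowercased; one test file per `Rule:` title, slugified). A minimal sketch of that mapping follows. `feature_slug` is the name used in the commit message, while `rule_slug`, `stub_path`, and the exact slugification rule (any run of non-alphanumeric characters collapses to one underscore) are assumptions made for illustration and may differ from what pytest-beehave does internally.

```python
import re
from pathlib import Path


def feature_slug(feature_path: str) -> str:
    """Test directory name for a .feature file: stem, hyphens to underscores, lowercase."""
    stem = Path(feature_path).stem
    return stem.replace("-", "_").lower()


def rule_slug(rule_title: str) -> str:
    """Hypothetical helper: slugify a Rule: title to lowercase with underscores."""
    # Assumption: every run of characters outside a-z and 0-9 becomes a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", rule_title.lower())
    return slug.strip("_")


def stub_path(feature_path: str, rule_title: str) -> Path:
    """Hypothetical helper: where the stub file for one Rule: block would be expected."""
    file_name = f"{rule_slug(rule_title)}_test.py"
    return Path("tests/features") / feature_slug(feature_path) / file_name


if __name__ == "__main__":
    print(stub_path("docs/features/in-progress/display-version.feature",
                    "Version is shown on startup"))
```

Under these assumptions, a feature file named `display-version.feature` maps to a `display_version` test directory, which is consistent with the `test_display_version_a3f2b1c4` stub shown in the README hunk of this patch.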
diff --git a/.opencode/skills/scope/SKILL.md b/.opencode/skills/scope/SKILL.md index af0de0f..cd02bd5 100644 --- a/.opencode/skills/scope/SKILL.md +++ b/.opencode/skills/scope/SKILL.md @@ -130,7 +130,7 @@ Append all answered Q&A to `docs/discovery_journal.md`, in groups (general, cros Group headers use this format: - General group: `### General` - Cross-cutting group: `### ` -- Feature group: `### Feature: ` +- Feature group: `### Feature: ` **Step B — Update .feature descriptions** @@ -216,7 +216,7 @@ Avoid: "As the system, I want..." (no business value). Break down stories that c - [ ] Rules collectively cover all entities in scope from the feature description - [ ] Every Rule passes the INVEST gate -Commit: `feat(stories): write user stories for ` +Commit: `feat(stories): write user stories for ` ### Step B — Criteria @@ -244,7 +244,6 @@ All Rules must have their pre-mortems completed before any Examples are written. ``` **Rules**: -- `@id` tag on the line before `Example:` - `Example:` keyword (not `Scenario:`) - `Given/When/Then` in plain English - `Then` must be a single, observable, measurable outcome — no "and" @@ -271,7 +270,6 @@ All Rules must have their pre-mortems completed before any Examples are written. **Review checklist:** - [ ] Every `Rule:` block has at least one Example -- [ ] Every `@id` is unique within this feature - [ ] Every Example has `Given/When/Then` - [ ] Every `Then` is a single, observable, measurable outcome - [ ] No Example tests implementation details @@ -291,15 +289,14 @@ Communicate verbally to the next agent. 
Every `DISAGREE` is a **hard blocker** - No impl details: no Example tests internal state or implementation — AGREE/DISAGREE | file:line - Coverage: every entity in the feature description appears in at least one Rule — AGREE/DISAGREE | missing: - Distinct: no two Examples test the same observable behavior — AGREE/DISAGREE | file:line -- Unique IDs: all @id values are unique within this feature — AGREE/DISAGREE - Pre-mortem: I ran a pre-mortem on each Rule and found no hidden failure modes — AGREE/DISAGREE | Rule: - Scope: no Example introduces behavior outside the feature boundary — AGREE/DISAGREE | file:line -Commit: `feat(criteria): write acceptance criteria for ` +Commit: `feat(criteria): write acceptance criteria for ` **After this commit, `Example:` blocks are frozen.** Any change requires: 1. Add `@deprecated` tag to the old Example -2. Write a new Example with a new `@id` +2. Write a new Example (the `@id` tag will be assigned automatically) --- @@ -310,14 +307,14 @@ When a defect is reported against a completed or in-progress feature: 1. **PO** adds a new Example to the relevant `Rule:` block in the `.feature` file: ```gherkin - @bug @id: + @bug Example: Given When Then ``` -2. **SE** implements the specific test in `tests/features//` (the `@id` test). +2. **SE** implements the specific test in `tests/features//` (the `@id` test). 3. **SE** also writes a `@given` Hypothesis property test in `tests/unit/` covering the whole class of inputs that triggered the bug — not just the single case. 4. Both tests are required — neither is optional. 5. SE follows the normal TDD loop (Step 3) for the new `@id`. @@ -404,7 +401,7 @@ Status: IN-PROGRESS |----|----------|--------| | Q8 | ... | ... | -### Feature: +### Feature: | ID | Question | Answer | |----|----------|--------| @@ -435,7 +432,7 @@ success/failure conditions, and out-of-scope boundaries.> (First session only. Omit this subsection in subsequent sessions.) 
### Feature List -- `` — +- `` — (Write "No changes" if no features were added or modified this session.) ### Domain Model @@ -459,12 +456,12 @@ Rules: --- -## YYYY-MM-DD — : +## YYYY-MM-DD — : Decision: Reason: Alternatives considered: -Feature: +Feature: ``` Rules: Append-only. When a decision changes, append a new block that supersedes the old one. Cross-feature decisions use `Cross-feature:` in the header. Only write a block for non-obvious decisions with meaningful trade-offs. diff --git a/.opencode/skills/session-workflow/SKILL.md b/.opencode/skills/session-workflow/SKILL.md index ab86736..0281f2c 100644 --- a/.opencode/skills/session-workflow/SKILL.md +++ b/.opencode/skills/session-workflow/SKILL.md @@ -24,7 +24,7 @@ Every session starts by reading state. Every session ends by writing state. This 2. **If you are the PO** and Step 1 (SCOPE) is active: check `docs/discovery_journal.md` for the most recent session block. - If the most recent block has `Status: IN-PROGRESS` → the previous session was interrupted. Resume it before starting a new session: finish updating `.feature` files and `docs/discovery.md`, then mark the block `Status: COMPLETE`. 3. If a feature is active at Step 2–5, read: - - `docs/features/in-progress/.feature` — feature file (Rules + Examples + @id) + - `docs/features/in-progress/.feature` — feature file (Rules + Examples + @id) - `docs/discovery.md` — project-level synthesis changelog (for context) 4. Run `git status` — understand what is committed vs. what is not 5. Confirm scope: you are working on exactly one step of one feature @@ -43,7 +43,7 @@ Every session starts by reading state. Every session ends by writing state. This 2. Commit any uncommitted work (even WIP): ```bash git add -A - git commit -m "WIP(): " + git commit -m "WIP(): " ``` 3. If a step is fully complete, use the proper commit message instead of WIP. @@ -55,7 +55,7 @@ When a step completes within a session: 2. 
Commit the TODO.md update: ```bash git add TODO.md - git commit -m "chore: complete step for " + git commit -m "chore: complete step for " ``` 3. Only then begin the next step (in a new session where possible — see Rule 4). @@ -64,9 +64,9 @@ When a step completes within a session: ```markdown # Current Work -Feature: +Feature: Step: <1-5> () -Source: docs/features/in-progress/.feature +Source: docs/features/in-progress/.feature ## Progress - [x] `@id:`: @@ -79,15 +79,15 @@ Run @ **"Next" line format**: Always prefix with `Run @` so the human knows exactly which agent to invoke. Agent names are defined in `AGENTS.md` — use the name exactly as listed there. Examples: - `Run @ — implement @id:a1b2c3d4 (Step 3 RED)` -- `Run @ — load skill implementation and begin Step 2 (Architecture) for ` -- `Run @ — verify feature at Step 4` +- `Run @ — load skill implementation and begin Step 2 (Architecture) for ` +- `Run @ — verify feature at Step 4` - `Run @ — pick next BASELINED feature from backlog` -- `Run @ — accept feature at Step 5` +- `Run @ — accept feature at Step 5` **Source path by step:** -- Step 1: `Source: docs/features/backlog/.feature` -- Steps 2–4: `Source: docs/features/in-progress/.feature` -- Step 5: `Source: docs/features/completed/.feature` +- Step 1: `Source: docs/features/backlog/.feature` +- Steps 2–4: `Source: docs/features/in-progress/.feature` +- Step 5: `Source: docs/features/completed/.feature` Status markers: - `[ ]` — not started @@ -110,9 +110,9 @@ During Step 3 (TDD Loop), TODO.md **must** include a `## Cycle State` block to t ```markdown # Current Work -Feature: +Feature: Step: 3 (TDD Loop) -Source: docs/features/in-progress/.feature +Source: docs/features/in-progress/.feature ## Cycle State Test: `@id:` — diff --git a/.opencode/skills/verify/SKILL.md b/.opencode/skills/verify/SKILL.md index 5e2ee07..f95f9ba 100644 --- a/.opencode/skills/verify/SKILL.md +++ b/.opencode/skills/verify/SKILL.md @@ -162,7 +162,7 @@ Record what input was given and 
what output was observed. ### 9. Write the Report ```markdown -## Step 4 Verification Report — +## Step 4 Verification Report — ### pyproject.toml Gate | Check | Result | Notes | diff --git a/AGENTS.md b/AGENTS.md index 30c737b..e483107 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -5,9 +5,9 @@ A Python template to quickstart any project with a production-ready workflow, qu ## Workflow Overview Features flow through 5 steps with a WIP limit of 1 feature at a time. The filesystem enforces WIP: -- `docs/features/backlog/.feature` — features waiting to be worked on -- `docs/features/in-progress/.feature` — exactly one feature being built right now -- `docs/features/completed/.feature` — accepted and shipped features +- `docs/features/backlog/.feature` — features waiting to be worked on +- `docs/features/in-progress/.feature` — exactly one feature being built right now +- `docs/features/completed/.feature` — accepted and shipped features ``` STEP 1: SCOPE (product-owner) → discovery + Gherkin stories + criteria @@ -107,8 +107,8 @@ Commit: `feat(criteria): write acceptance criteria for ` ### Bug Handling When a defect is reported: -1. **PO** adds a `@bug @id:` Example to the relevant `Rule:` in the `.feature` file and moves (or keeps) the feature in `backlog/` for normal scheduling. -2. **SE** handles the bug when the feature is selected for development (standard Step 2–3 flow): implements the specific `@bug`-tagged test in `tests/features//` and also writes a `@given` Hypothesis property test in `tests/unit/` covering the whole class of inputs. +1. **PO** adds a `@bug` Example to the relevant `Rule:` in the `.feature` file and moves (or keeps) the feature in `backlog/` for normal scheduling. +2. **SE** handles the bug when the feature is selected for development (standard Step 2–3 flow): implements the specific `@bug`-tagged test in `tests/features//` and also writes a `@given` Hypothesis property test in `tests/unit/` covering the whole class of inputs. 3. 
Both tests are required. SE follows the normal TDD loop (Step 3). ## Filesystem Structure @@ -123,12 +123,12 @@ docs/ context.md ← C4 Level 1 diagram, PO updates via living-docs skill container.md ← C4 Level 2 diagram, PO updates via living-docs skill features/ - backlog/.feature ← narrative + Rules + Examples - in-progress/.feature - completed/.feature + backlog/.feature ← narrative + Rules + Examples + in-progress/.feature + completed/.feature tests/ - features// + features// _test.py ← one per Rule: block, software-engineer-written unit/ _test.py ← software-engineer-authored extras (no @id traceability) @@ -143,10 +143,12 @@ Tests in `tests/unit/` are software-engineer-authored extras not covered by any ## Test File Layout ``` -tests/features//_test.py +tests/features//_test.py ``` -### Stub Format (mandatory) +### Stub Format + +Stubs are auto-generated by pytest-beehave. The SE triggers generation at Step 2 end by running `uv run task test-fast`. pytest-beehave reads the in-progress `.feature` file and creates one skipped function per `@id`: ```python @pytest.mark.skip(reason="not yet implemented") @@ -158,7 +160,7 @@ def test__<@id>() -> None: ### Markers - `@pytest.mark.slow` — takes > 50ms; applied to Hypothesis tests and any test with I/O, network, or DB -- `@pytest.mark.deprecated` — auto-skipped by conftest; used for superseded Examples +- `@pytest.mark.deprecated` — auto-skipped by pytest-beehave; used for superseded Examples ## Development Commands diff --git a/README.md b/README.md index cc4aed1..a89bb28 100644 --- a/README.md +++ b/README.md @@ -32,23 +32,35 @@ uv run task test && uv run task lint && uv run task static-check --- -## What You Get +## Why this template? -### A structured 5-step development cycle +Most Python templates give you a folder structure and a `Makefile`. 
This one gives you a **complete delivery system**: + +- **No feature starts without written acceptance criteria** — Gherkin `Example:` blocks traced to tests +- **No feature ships without adversarial review** — the reviewer's default hypothesis is "broken" +- **No guesswork on test stubs** — they are generated automatically from your `.feature` files +- **No manual `@id` tags** — assigned automatically when you run tests +- **AI agents for every role** — PO, SE, and reviewer each have scoped instructions; none can exceed their authority + +--- + +## How it works + +### 5-step delivery cycle ``` SCOPE → ARCH → TDD LOOP → VERIFY → ACCEPT ``` -| Step | Who | What | -|------|-----|------| -| **SCOPE** | Product Owner | Discovery interviews → Gherkin stories → `@id` criteria | -| **ARCH** | Software Engineer | Module design, ADRs, test stubs | -| **TDD LOOP** | Software Engineer | RED → GREEN → REFACTOR, one `@id` at a time | -| **VERIFY** | Reviewer | Adversarial verification — default hypothesis: broken | -| **ACCEPT** | Product Owner | Demo, validate, ship | +| Step | Role | Output | +|------|------|--------| +| **1 · SCOPE** | Product Owner | Discovery interviews + Gherkin stories + acceptance criteria | +| **2 · ARCH** | Software Engineer | Module stubs, ADRs, auto-generated test stubs | +| **3 · TDD LOOP** | Software Engineer | RED → GREEN → REFACTOR, one criterion at a time | +| **4 · VERIFY** | Reviewer | Adversarial check — lint, types, coverage, semantic review | +| **5 · ACCEPT** | Product Owner | Demo, validate, ship | -WIP limit of 1. 
Features are `.feature` files that move between filesystem folders: +**WIP limit: 1 feature at a time.** Features are `.feature` files that move through folders: ``` docs/features/backlog/ ← waiting @@ -58,12 +70,12 @@ docs/features/completed/ ← shipped ### AI agents included -``` -@product-owner — scope, stories, acceptance -@software-engineer — architecture, TDD, git, releases -@reviewer — adversarial verification -@setup-project — one-time project initialisation -``` +| Agent | Responsibility | +|-------|---------------| +| `@product-owner` | Scope, stories, acceptance criteria, delivery acceptance | +| `@software-engineer` | Architecture, TDD loop, git, releases | +| `@reviewer` | Adversarial verification — default position: broken | +| `@setup-project` | One-time project initialisation | ### Quality tooling, pre-configured @@ -73,6 +85,7 @@ docs/features/completed/ ← shipped | `ruff` | Lint + format (Google docstrings) | | `pyright` | Static type checking — 0 errors | | `pytest` + `hypothesis` | Tests + property-based testing | +| `pytest-beehave` | Auto-generates test stubs from `.feature` files | | `pytest-cov` | Coverage — 100% required | | `pdoc` | API docs → GitHub Pages | | `taskipy` | Task runner | @@ -91,7 +104,7 @@ uv run task run # Run the app --- -## Code Standards +## Code standards | | | |---|---| @@ -104,19 +117,31 @@ uv run task run # Run the app --- -## Test Convention +## Test convention + +Write acceptance criteria in Gherkin: + +```gherkin +@id:a3f2b1c4 +Example: User sees version on startup + Given the application starts + When no arguments are passed + Then the version string is printed to stdout +``` + +Run tests once — a traced, skipped stub appears automatically: ```python @pytest.mark.skip(reason="not yet implemented") -def test_feature_a3f2b1c4() -> None: +def test_display_version_a3f2b1c4() -> None: """ - Given: ... - When: ... - Then: ... 
+ Given the application starts + When no arguments are passed + Then the version string is printed to stdout """ ``` -Each test is traced to exactly one `@id` acceptance criterion. +Each test is traced to exactly one acceptance criterion. No orphan tests. No untested criteria. --- diff --git a/docs/architecture.md b/docs/architecture.md index 5ad7cbd..2edabcd 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -2,12 +2,12 @@ --- -## YYYY-MM-DD — : +## YYYY-MM-DD — : Decision: Reason: Alternatives considered: -Feature: +Feature: --- @@ -16,4 +16,4 @@ Feature: Decision: Reason: Alternatives considered: -Affected features: , +Affected features: , diff --git a/docs/discovery.md b/docs/discovery.md index 1c91da9..9b8a33f 100644 --- a/docs/discovery.md +++ b/docs/discovery.md @@ -10,7 +10,7 @@ success/failure conditions, and explicit out-of-scope boundaries.> (First session only. Omit this subsection in subsequent sessions.) ### Feature List -- `` — +- `` — (Write "No changes" if no features were added or modified this session.) ### Domain Model diff --git a/docs/discovery_journal.md b/docs/discovery_journal.md index 4c4fd41..ef538fe 100644 --- a/docs/discovery_journal.md +++ b/docs/discovery_journal.md @@ -23,7 +23,7 @@ Status: IN-PROGRESS |----|----------|--------| | Q8 | ... | ... 
| -### Feature: +### Feature: | ID | Question | Answer | |----|----------|--------| diff --git a/pyproject.toml b/pyproject.toml index 2c1bab2..f3c6b83 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -23,6 +23,7 @@ Documentation = "https://github.com/nullhack/python-project-template/tree/main/d dev = [ "pdoc>=14.0", "pytest>=9.0.3", + "pytest-beehave[html]>=3.0", "pytest-cov>=6.1.1", "pytest-html>=4.1.1", "pytest-mock>=3.14.0", @@ -121,7 +122,7 @@ pytest \ --cov-config=pyproject.toml \ --cov-report html:docs/coverage \ --cov-report term:skip-covered \ - --cov=pytest_beehave \ + --cov=app \ --cov-fail-under=100 \ --hypothesis-show-statistics \ --html=docs/tests/report.html \ @@ -153,3 +154,6 @@ dev = [ "gherkin-official>=39.0.0", "safety>=3.7.0", ] + +[tool.beehave] +features_path = "docs/features" diff --git a/tests/conftest.py b/tests/conftest.py index 9a606f7..a5c8f50 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -21,10 +21,3 @@ def pytest_html_results_table_header(cells): def pytest_html_results_table_row(report, cells): docstring = getattr(report, "docstrings", "") or "" cells.insert(2, f"{docstring}") - - -def pytest_collection_modifyitems(items): - """Automatically skip tests marked as deprecated.""" - for item in items: - if item.get_closest_marker("deprecated"): - item.add_marker(pytest.mark.skip(reason="deprecated")) diff --git a/uv.lock b/uv.lock index 0914f45..89d4c79 100644 --- a/uv.lock +++ b/uv.lock @@ -670,6 +670,24 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/d4/24/a372aaf5c9b7208e7112038812994107bc65a84cd00e0354a88c2c77a617/pytest-9.0.3-py3-none-any.whl", hash = "sha256:2c5efc453d45394fdd706ade797c0a81091eccd1d6e4bccfcd476e2b8e0ab5d9", size = 375249, upload-time = "2026-04-07T17:16:16.13Z" }, ] +[[package]] +name = "pytest-beehave" +version = "3.0.20260419" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "fire" }, + { name = "gherkin-official" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/6b/45/a64788db805fc079792d28670846f8320045bd82e67ea2528f842857606b/pytest_beehave-3.0.20260419.tar.gz", hash = "sha256:bc114a0f809e3b437f09f5d42da0a36a105dc8b7b7e311410a7fdcdc915398f0", size = 28685, upload-time = "2026-04-19T19:11:15.811Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/50/24/0bfacd345c1b75497f84d83ee3a4459ec30cc4e54fc4530c376f18346ccc/pytest_beehave-3.0.20260419-py3-none-any.whl", hash = "sha256:be3843af1e8691f6023007de147b4f92a8a4ca505f94439f2df210137e746acd", size = 30323, upload-time = "2026-04-19T19:11:14.168Z" }, +] + +[package.optional-dependencies] +html = [ + { name = "pytest-html" }, +] + [[package]] name = "pytest-cov" version = "6.1.1" @@ -748,6 +766,7 @@ dev = [ { name = "pdoc" }, { name = "pyright" }, { name = "pytest" }, + { name = "pytest-beehave", extra = ["html"] }, { name = "pytest-cov" }, { name = "pytest-html" }, { name = "pytest-mock" }, @@ -769,6 +788,7 @@ requires-dist = [ { name = "pdoc", marker = "extra == 'dev'", specifier = ">=14.0" }, { name = "pyright", marker = "extra == 'dev'", specifier = ">=1.1.407" }, { name = "pytest", marker = "extra == 'dev'", specifier = ">=9.0.3" }, + { name = "pytest-beehave", extras = ["html"], marker = "extra == 'dev'", specifier = ">=3.0" }, { name = "pytest-cov", marker = "extra == 'dev'", specifier = ">=6.1.1" }, { name = "pytest-html", marker = "extra == 'dev'", specifier = ">=4.1.1" }, { name = "pytest-mock", marker = "extra == 'dev'", specifier = ">=3.14.0" }, From 35aec6e8803628c019d0c48ff1a1a1f6fd8fef41 Mon Sep 17 00:00:00 2001 From: nullhack Date: Sun, 19 Apr 2026 16:42:59 -0400 Subject: [PATCH 2/2] chore(workflow): number self-declaration items and add completeness enforcement - implementation/SKILL.md: number all 25 items 1-25; add count reminder comment - verify/SKILL.md: add completeness hard gate (count must be 25, sequence must be gapless); expand report table from 21 to 25 numbered rows matching implementation 
template exactly --- .opencode/skills/implementation/SKILL.md | 52 ++++++++++++------------ .opencode/skills/verify/SKILL.md | 52 +++++++++++++----------- 2 files changed, 56 insertions(+), 48 deletions(-) diff --git a/.opencode/skills/implementation/SKILL.md b/.opencode/skills/implementation/SKILL.md index 826c6b7..61ada18 100644 --- a/.opencode/skills/implementation/SKILL.md +++ b/.opencode/skills/implementation/SKILL.md @@ -213,33 +213,35 @@ All must pass before Self-Declaration. ### Self-Declaration (once, after all quality gates pass) + + Communicate verbally to the reviewer. Answer honestly for each principle: -- YAGNI: no code without a failing test — AGREE/DISAGREE | file:line -- YAGNI: no speculative abstractions — AGREE/DISAGREE | file:line -- KISS: simplest solution that passes — AGREE/DISAGREE | file:line -- KISS: no premature optimization — AGREE/DISAGREE | file:line -- DRY: no duplication — AGREE/DISAGREE | file:line -- DRY: no redundant comments — AGREE/DISAGREE | file:line -- SOLID-S: one reason to change per class — AGREE/DISAGREE | file:line -- SOLID-O: open for extension, closed for modification — AGREE/DISAGREE | file:line -- SOLID-L: subtypes substitutable — AGREE/DISAGREE | file:line -- SOLID-I: no forced unused deps — AGREE/DISAGREE | file:line -- SOLID-D: depend on abstractions, not concretions — AGREE/DISAGREE | file:line -- OC-1: one level of indentation per method — AGREE/DISAGREE | deepest: file:line -- OC-2: no else after return — AGREE/DISAGREE | file:line -- OC-3: primitive types wrapped — AGREE/DISAGREE | file:line -- OC-4: first-class collections — AGREE/DISAGREE | file:line -- OC-5: one dot per line — AGREE/DISAGREE | file:line -- OC-6: no abbreviations — AGREE/DISAGREE | file:line -- OC-7: ≤20 lines per function, ≤50 per class — AGREE/DISAGREE | longest: file:line -- OC-8: ≤2 instance variables per class (behavioural classes only; dataclasses, Pydantic models, value objects, and TypedDicts are exempt) — AGREE/DISAGREE | file:line 
-- OC-9: no getters/setters — AGREE/DISAGREE | file:line
-- Patterns: I have no good reason to refactor parts of the code using OOP or Design Patterns — AGREE/DISAGREE | file:line
-- Patterns: no creational smell — AGREE/DISAGREE | file:line
-- Patterns: no structural smell — AGREE/DISAGREE | file:line
-- Patterns: no behavioral smell — AGREE/DISAGREE | file:line
-- Semantic: tests operate at same abstraction as AC — AGREE/DISAGREE | file:line
+1. YAGNI: no code without a failing test — AGREE/DISAGREE | file:line
+2. YAGNI: no speculative abstractions — AGREE/DISAGREE | file:line
+3. KISS: simplest solution that passes — AGREE/DISAGREE | file:line
+4. KISS: no premature optimization — AGREE/DISAGREE | file:line
+5. DRY: no duplication — AGREE/DISAGREE | file:line
+6. DRY: no redundant comments — AGREE/DISAGREE | file:line
+7. SOLID-S: one reason to change per class — AGREE/DISAGREE | file:line
+8. SOLID-O: open for extension, closed for modification — AGREE/DISAGREE | file:line
+9. SOLID-L: subtypes substitutable — AGREE/DISAGREE | file:line
+10. SOLID-I: no forced unused deps — AGREE/DISAGREE | file:line
+11. SOLID-D: depend on abstractions, not concretions — AGREE/DISAGREE | file:line
+12. OC-1: one level of indentation per method — AGREE/DISAGREE | deepest: file:line
+13. OC-2: no else after return — AGREE/DISAGREE | file:line
+14. OC-3: primitive types wrapped — AGREE/DISAGREE | file:line
+15. OC-4: first-class collections — AGREE/DISAGREE | file:line
+16. OC-5: one dot per line — AGREE/DISAGREE | file:line
+17. OC-6: no abbreviations — AGREE/DISAGREE | file:line
+18. OC-7: ≤20 lines per function, ≤50 per class — AGREE/DISAGREE | longest: file:line
+19. OC-8: ≤2 instance variables per class (behavioural classes only; dataclasses, Pydantic models, value objects, and TypedDicts are exempt) — AGREE/DISAGREE | file:line
+20. OC-9: no getters/setters — AGREE/DISAGREE | file:line
+21. Patterns: no good reason remains to refactor using OOP or Design Patterns — AGREE/DISAGREE | file:line
+22. Patterns: no creational smell — AGREE/DISAGREE | file:line
+23. Patterns: no structural smell — AGREE/DISAGREE | file:line
+24. Patterns: no behavioral smell — AGREE/DISAGREE | file:line
+25. Semantic: tests operate at same abstraction as AC — AGREE/DISAGREE | file:line
 
 A `DISAGREE` answer is not automatic rejection — state the reason and fix before handing off.
 
diff --git a/.opencode/skills/verify/SKILL.md b/.opencode/skills/verify/SKILL.md
index f95f9ba..3d5c449 100644
--- a/.opencode/skills/verify/SKILL.md
+++ b/.opencode/skills/verify/SKILL.md
@@ -60,6 +60,8 @@ Run before code review. If any row is FAIL, stop immediately with REJECTED.
 
 ### 5. Self-Declaration Audit
 
+**Completeness check (hard gate — REJECT if failed)**: Count the numbered items in the SE's Self-Declaration. The template in `implementation/SKILL.md` has exactly 25 items numbered 1–25. If the count is not 25, or any number in the sequence 1–25 is missing, REJECT immediately — do not proceed to item-level audit.
+
 Read the software-engineer's Self-Declaration from the handoff message.
 
 For every **AGREE** claim:
@@ -183,29 +185,33 @@ Record what input was given and what output was observed.
 | uv run task test | PASS / FAIL | |
 
 ### Self-Declaration Audit
-| Claim | Software-Engineer Claims | Reviewer Verdict | Evidence |
-|------|-------------------------|------------------|----------|
-| YAGNI | AGREE/DISAGREE | PASS/FAIL | |
-| KISS | AGREE/DISAGREE | PASS/FAIL | |
-| DRY | AGREE/DISAGREE | PASS/FAIL | |
-| SOLID-S | AGREE/DISAGREE | PASS/FAIL | |
-| SOLID-O | AGREE/DISAGREE | PASS/FAIL | |
-| SOLID-L | AGREE/DISAGREE | PASS/FAIL | |
-| SOLID-I | AGREE/DISAGREE | PASS/FAIL | |
-| SOLID-D | AGREE/DISAGREE | PASS/FAIL | |
-| OC-1 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-2 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-3 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-4 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-5 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-6 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-7 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-8 | AGREE/DISAGREE | PASS/FAIL | |
-| OC-9 | AGREE/DISAGREE | PASS/FAIL | |
-| Patterns Creational | AGREE/DISAGREE | PASS/FAIL | |
-| Patterns Structural | AGREE/DISAGREE | PASS/FAIL | |
-| Patterns Behavioral | AGREE/DISAGREE | PASS/FAIL | |
-| Semantic | AGREE/DISAGREE | PASS/FAIL | |
+| # | Claim | SE Claims | Reviewer Verdict | Evidence |
+|---|-------|-----------|------------------|----------|
+| 1 | YAGNI: no code without a failing test | AGREE/DISAGREE | PASS/FAIL | |
+| 2 | YAGNI: no speculative abstractions | AGREE/DISAGREE | PASS/FAIL | |
+| 3 | KISS: simplest solution that passes | AGREE/DISAGREE | PASS/FAIL | |
+| 4 | KISS: no premature optimization | AGREE/DISAGREE | PASS/FAIL | |
+| 5 | DRY: no duplication | AGREE/DISAGREE | PASS/FAIL | |
+| 6 | DRY: no redundant comments | AGREE/DISAGREE | PASS/FAIL | |
+| 7 | SOLID-S: one reason to change per class | AGREE/DISAGREE | PASS/FAIL | |
+| 8 | SOLID-O: open for extension, closed for modification | AGREE/DISAGREE | PASS/FAIL | |
+| 9 | SOLID-L: subtypes substitutable | AGREE/DISAGREE | PASS/FAIL | |
+| 10 | SOLID-I: no forced unused deps | AGREE/DISAGREE | PASS/FAIL | |
+| 11 | SOLID-D: depend on abstractions, not concretions | AGREE/DISAGREE | PASS/FAIL | |
+| 12 | OC-1: one level of indentation per method | AGREE/DISAGREE | PASS/FAIL | |
+| 13 | OC-2: no else after return | AGREE/DISAGREE | PASS/FAIL | |
+| 14 | OC-3: primitive types wrapped | AGREE/DISAGREE | PASS/FAIL | |
+| 15 | OC-4: first-class collections | AGREE/DISAGREE | PASS/FAIL | |
+| 16 | OC-5: one dot per line | AGREE/DISAGREE | PASS/FAIL | |
+| 17 | OC-6: no abbreviations | AGREE/DISAGREE | PASS/FAIL | |
+| 18 | OC-7: ≤20 lines per function, ≤50 per class | AGREE/DISAGREE | PASS/FAIL | |
+| 19 | OC-8: ≤2 instance variables (behavioural classes only) | AGREE/DISAGREE | PASS/FAIL | |
+| 20 | OC-9: no getters/setters | AGREE/DISAGREE | PASS/FAIL | |
+| 21 | Patterns: no good reason remains to refactor using OOP or Design Patterns | AGREE/DISAGREE | PASS/FAIL | |
+| 22 | Patterns: no creational smell | AGREE/DISAGREE | PASS/FAIL | |
+| 23 | Patterns: no structural smell | AGREE/DISAGREE | PASS/FAIL | |
+| 24 | Patterns: no behavioral smell | AGREE/DISAGREE | PASS/FAIL | |
+| 25 | Semantic: tests operate at same abstraction as AC | AGREE/DISAGREE | PASS/FAIL | |
 
 ### Reviewer Stance Declaration