The "harness" is the set of mechanical controls that make LLM-driven coding produce production-grade output regardless of which agent or contributor is at the keyboard. This document is the umbrella — every other doc in docs/ is a layer of it.
| Layer | What it enforces | Where it lives |
|---|---|---|
| Lint | Style, simple bugs, security smells | pyproject.toml [tool.ruff] (E W F I N UP B SIM TCH S RUF), .pre-commit-config.yaml |
| Format | Consistent line shape | ruff format, prettier (frontend) |
| Type check | No untyped code | pyproject.toml [tool.mypy] strict = true; tsc --noEmit for the frontend |
| Architecture | One-way layer flow | pyproject.toml [tool.importlinter] + docs/BOUNDARIES.md |
| Tests | Behaviour | pytest tests/, pytest eval/, vitest |
| Coverage | ≥ 75% on src/ | pyproject.toml [tool.coverage.report] |
| Pre-commit | Local-first defence | .pre-commit-config.yaml (ruff, gitleaks, commitizen, mypy, hygiene) |
| CI | Non-bypassable | .github/workflows/ci.yml + security.yml + pr-title.yml (21 required contexts) plus release and maintenance workflows |
| Branch protection | Declarative, drift-checked | .github/branch-protection/{develop,main}.json + branch-protection.yml apply workflow + check_required_contexts.py meta-gate |
| Commit format | Seven prefixes only | [tool.commitizen] schema + pr-title.yml allowlist + check_commit_types.py meta-gate |
| Secret scan | Three checkpoints | local hook → pre-commit → security.yml gitleaks |
| Container scan | HIGH/CRITICAL CVEs block | security.yml trivy-action |
| Dep scan | Pinned + audited | pip-audit, npm audit |
| Release | Reproducible artefacts | release.yml (image push to GHCR + CycloneDX SBOM) |
| Eval | LLM-output regressions | src/eval/, eval/, eval-nightly.yml (workflow_dispatch by default) |
| Issue execution | GitHub stays canonical; Beads can drive local ready/blocked work | GitHub issue templates + PR template + optional docs/BEADS.md queue guidance |
| Agent hooks | Enforcement on the LLM coder's side | .claude/hooks/{pretooluse_bash, posttooluse_writeedit, sessionstart}.py + settings.local.json.example |
| Skills | Auto-activated agent guidance | .claude/skills/{architect, code-reviewer, devops, frontend, qa-engineer, technical-writer} |
Each layer above catches something specific; none catches everything. Stacked, they form defence in depth: a mistake that slips past one layer is still exposed to every later layer, which bounds how far it can travel and what it can cost.
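To make the idea of a meta-gate from the table concrete, here is a minimal sketch of the kind of drift check a script such as check_required_contexts.py could perform. It is not the template's actual implementation. The function names `required_contexts` and `context_drift` and the sample context names are hypothetical, and the JSON shape assumes GitHub's branch-protection payload (`required_status_checks.contexts`).

```python
import json


def required_contexts(protection_json: str) -> set[str]:
    """Extract required status-check contexts from a branch-protection payload.

    Assumes the shape {"required_status_checks": {"contexts": [...]}}.
    """
    payload = json.loads(protection_json)
    checks = payload.get("required_status_checks") or {}
    return set(checks.get("contexts", []))


def context_drift(required: set[str], produced: set[str]) -> dict[str, list[str]]:
    """Compare what branch protection requires with what CI actually reports."""
    return {
        # Required but never reported: a PR would wait forever on this check.
        "missing_from_ci": sorted(required - produced),
        # Reported but not required: the check runs but cannot block a merge.
        "not_required": sorted(produced - required),
    }


# Hypothetical data for illustration; real names come from the repo's own files.
protection = json.dumps(
    {"required_status_checks": {"strict": True, "contexts": ["lint", "typecheck", "tests"]}}
)
ci_jobs = {"lint", "tests", "coverage"}

drift = context_drift(required_contexts(protection), ci_jobs)
print(drift)
```

A gate built this way would fail the build whenever either list is non-empty, so the declared protection and the workflows cannot silently diverge.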
The harness is independent of the project's domain. The included src/api/echo and eval/golden_qa.json are scaffolding so every gate has something to operate on. Once you replace them with your real domain, the same harness keeps the new code under the same posture.
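As a rough illustration of why scaffolding is enough to exercise the eval gate, the sketch below runs a set of golden cases through an echo function. Everything here is an assumption made for the example: the `input`/`expected` schema, the `echo` stand-in, and the `run_golden_cases` helper are hypothetical and may not match eval/golden_qa.json or src/api/echo.

```python
import json
from collections.abc import Callable


def echo(text: str) -> str:
    """Hypothetical stand-in for the scaffold endpoint: returns its input unchanged."""
    return text


def run_golden_cases(
    cases_json: str, system: Callable[[str], str]
) -> tuple[float, list[str]]:
    """Run every golden case through `system`; return the pass rate and failing inputs."""
    cases = json.loads(cases_json)
    failures = [
        case["input"] for case in cases if system(case["input"]) != case["expected"]
    ]
    # An empty case file counts as passing so the gate stays green on a bare scaffold.
    pass_rate = 1.0 if not cases else (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures


# Hypothetical file contents; the real schema may differ.
GOLDEN = json.dumps(
    [
        {"input": "hello", "expected": "hello"},
        {"input": "ping", "expected": "ping"},
    ]
)

pass_rate, failures = run_golden_cases(GOLDEN, echo)
print(f"pass rate: {pass_rate:.0%}, failures: {failures}")
```

Swapping `echo` for a real domain function leaves the runner untouched, which is the sense in which the gate is independent of the domain.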
If you haven't worked in a harness like this before, start with docs/HARNESS_PRIMER.md. It covers the same surface as this doc but is written for non-engineers.
For an engineer setting up the template:
- docs/INVARIANTS.md: the load-bearing rules. Every PR is checked against them.
- docs/BOUNDARIES.md: module layering and the import-linter contracts.
- docs/DEVELOPMENT.md: local setup, the justfile, the CI pipeline.
- docs/EVAL_HARNESS.md: the eval flywheel; how to add a case, how to opt the nightly into running.
- docs/BEADS.md: optional local execution queue layered under GitHub Issues.
- docs/SECURITY.md: threat model + the defence-in-depth map.
- docs/ARCHITECTURE.md: scaffold-level diagram; expand as your domain lands.
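For a feel of what the agent-hooks layer in the table does, here is a minimal sketch of a pre-tool-use check for shell commands. It is not the contents of the template's pretooluse_bash hook. It assumes the Claude Code hook convention of a JSON payload on stdin carrying `tool_name` and `tool_input`, with exit code 2 blocking the call and stderr shown to the agent. The deny-list and the `decide` and `main` helpers are hypothetical.

```python
import json
import re
import sys

# Hypothetical deny-list; a real hook may check different commands.
DENIED = [
    (re.compile(r"\bgit\s+commit\b.*--no-verify"), "do not skip pre-commit hooks"),
    (re.compile(r"\bgit\s+push\b.*--force\b"), "do not force-push"),
]


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message): 0 allows the call, 2 blocks it."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, reason in DENIED:
        if pattern.search(command):
            return 2, f"Blocked by harness: {reason}"
    return 0, ""


def main() -> int:
    """Read the hook payload from stdin and translate the decision to an exit code."""
    raw = sys.stdin.read()
    code, message = decide(json.loads(raw) if raw.strip() else {})
    if message:
        # Assumed convention: stderr is what the agent sees when a call is blocked.
        print(message, file=sys.stderr)
    return code


# Illustrative call with a hand-built payload.
print(decide({"tool_name": "Bash", "tool_input": {"command": "git commit -m wip --no-verify"}}))
```

A script wired in as a hook would end with `sys.exit(main())`. Keeping the decision in a pure function like `decide` means the hook itself can be covered by the same pytest and coverage gates as the rest of the code.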