The harness

The "harness" is the set of mechanical controls that make LLM-driven coding produce production-grade output regardless of which agent or contributor is at the keyboard. This document is the umbrella: every other doc in docs/ describes one layer of it.

What's in the harness

| Layer | What it enforces | Where it lives |
| --- | --- | --- |
| Lint | Style, simple bugs, security smells | `pyproject.toml` `[tool.ruff]` (E W F I N UP B SIM TCH S RUF), `.pre-commit-config.yaml` |
| Format | Consistent line shape | `ruff format`, `prettier` (frontend) |
| Type check | No untyped code | `pyproject.toml` `[tool.mypy]` `strict = true`; `tsc --noEmit` for the frontend |
| Architecture | One-way layer flow | `pyproject.toml` `[tool.importlinter]` + `docs/BOUNDARIES.md` |
| Tests | Behaviour | `pytest tests/`, `pytest eval/`, `vitest` |
| Coverage | ≥ 75% on `src/` | `pyproject.toml` `[tool.coverage.report]` |
| Pre-commit | Local-first defence | `.pre-commit-config.yaml` (ruff, gitleaks, commitizen, mypy, hygiene) |
| CI | Non-bypassable | `.github/workflows/ci.yml` + `security.yml` + `pr-title.yml` (21 required contexts) plus release and maintenance workflows |
| Branch protection | Declarative, drift-checked | `.github/branch-protection/{develop,main}.json` + `branch-protection.yml` apply workflow + `check_required_contexts.py` meta-gate |
| Commit format | Seven prefixes only | `[tool.commitizen]` schema + `pr-title.yml` allowlist + `check_commit_types.py` meta-gate |
| Secret scan | Three checkpoints | local hook → pre-commit → `security.yml` gitleaks |
| Container scan | HIGH/CRITICAL CVEs block | `security.yml` trivy-action |
| Dep scan | Pinned + audited | pip-audit, npm audit |
| Release | Reproducible artefacts | `release.yml` (image push to GHCR + CycloneDX SBOM) |
| Eval | LLM-output regressions | `src/eval/`, `eval/`, `eval-nightly.yml` (workflow_dispatch by default) |
| Issue execution | GitHub stays canonical; Beads can drive local ready/blocked work | GitHub issue templates + PR template + optional `docs/BEADS.md` queue guidance |
| Agent hooks | LLM coder side enforcement | `.claude/hooks/{pretooluse_bash, posttooluse_writeedit, sessionstart}.py` + `settings.local.json.example` |
| Skills | Auto-activated agent guidance | `.claude/skills/{architect, code-reviewer, devops, frontend, qa-engineer, technical-writer}` |
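
The "drift-checked" idea in the Branch protection row can be illustrated with a short sketch. This is not the repository's `check_required_contexts.py`. It assumes the branch-protection JSON follows the GitHub REST API shape, with a `required_status_checks.contexts` list, and that the names of the checks CI actually reports have already been collected into a set. The function name `find_context_drift` and the sample context names are hypothetical.

```python
import json


def find_context_drift(protection_json: str, reported_checks: set[str]) -> dict[str, list[str]]:
    """Compare declared required contexts with the checks CI actually reports.

    Assumes the GitHub REST API payload shape:
    {"required_status_checks": {"contexts": [...]}}.
    """
    payload = json.loads(protection_json)
    checks = payload.get("required_status_checks") or {}
    declared = set(checks.get("contexts", []))
    return {
        # Required by branch protection but never reported by CI:
        # a PR would wait on a check that cannot arrive.
        "missing_in_ci": sorted(declared - reported_checks),
        # Reported by CI but not required: the check can fail without blocking a merge.
        "not_required": sorted(reported_checks - declared),
    }


# Hypothetical sample data, not the template's real required contexts.
sample = json.dumps(
    {"required_status_checks": {"strict": True, "contexts": ["lint", "typecheck", "tests"]}}
)
drift = find_context_drift(sample, {"lint", "tests", "coverage"})
print(drift)
```

A meta-gate built along these lines would fail the build whenever either list is non-empty, so the declared protection and the real workflows cannot silently diverge.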

Why "harness"

Each layer above catches something specific. None catches everything. Stacked, they form defence in depth: the cost of a mistake is bounded by the highest-fidelity layer it slips past.
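
One way to picture the stacking is as an ordered sequence of checks, each of which recognises only some kinds of mistake. The sketch below is a toy model, not part of the template: the layer names are taken from the table above, but the `Gate` type, the `first_catch` helper, and the mapping from mistake kinds to layers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Gate:
    name: str
    catches: frozenset  # kinds of mistake this layer can detect (illustrative)


# Ordered from earliest to latest; which layer catches what is an assumption.
GATES = (
    Gate("Lint", frozenset({"unused-import", "security-smell"})),
    Gate("Type check", frozenset({"untyped-def"})),
    Gate("Architecture", frozenset({"upward-import"})),
    Gate("Tests", frozenset({"wrong-behaviour"})),
    Gate("Secret scan", frozenset({"committed-secret"})),
)


def first_catch(mistake: str, gates: Sequence[Gate] = GATES) -> Optional[str]:
    """Return the name of the first layer that detects the mistake, or None."""
    for gate in gates:
        if mistake in gate.catches:
            return gate.name
    return None


print(first_catch("upward-import"))             # Architecture
print(first_catch("wrong-behaviour"))           # Tests
print(first_catch("subtle-prompt-regression"))  # None: no modelled layer sees it
```

In this model a mistake that returns `None` has slipped past every layer, which is the argument for stacking more layers instead of relying on any single one.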

The harness is independent of the project's domain. The included src/api/echo and eval/golden_qa.json are scaffolding so every gate has something to operate on. Once you replace them with your real domain, the same harness keeps the new code under the same posture.

Reading order

If you haven't worked in a harness like this before, start with docs/HARNESS_PRIMER.md, which covers the same surface as this doc but is written for non-engineers.

For an engineer setting up the template:

1. docs/INVARIANTS.md: the load-bearing rules. Every PR is checked against them.
2. docs/BOUNDARIES.md: module layering and the import-linter contracts.
3. docs/DEVELOPMENT.md: local setup, the justfile, the CI pipeline.
4. docs/EVAL_HARNESS.md: the eval flywheel; how to add a case, how to opt the nightly into running.
5. docs/BEADS.md: optional local execution queue layered under GitHub Issues.
6. docs/SECURITY.md: threat model + the defence-in-depth map.
7. docs/ARCHITECTURE.md: scaffold-level diagram; expand as your domain lands.
