The harness

The "harness" is the set of mechanical controls that make LLM-driven coding produce production-grade output regardless of which agent or contributor is at the keyboard. This document is the umbrella: every other doc in docs/ describes one layer of it.

What's in the harness

| Layer | What it enforces | Where it lives |
| --- | --- | --- |
| Lint | Style, simple bugs, security smells | pyproject.toml [tool.ruff] (E W F I N UP B SIM TCH S RUF), .pre-commit-config.yaml |
| Format | Consistent line shape | ruff format, prettier (frontend) |
| Type check | No untyped code | pyproject.toml [tool.mypy] strict = true; tsc --noEmit for the frontend |
| Architecture | One-way layer flow | pyproject.toml [tool.importlinter] + docs/BOUNDARIES.md |
| Tests | Behaviour | pytest tests/, pytest eval/, vitest |
| Coverage | ≥ 75% on src/ | pyproject.toml [tool.coverage.report] |
| Pre-commit | Local-first defence | .pre-commit-config.yaml (ruff, gitleaks, commitizen, mypy, hygiene) |
| CI | Non-bypassable | .github/workflows/ci.yml + security.yml + pr-title.yml (21 required contexts) plus release and maintenance workflows |
| Branch protection | Declarative, drift-checked | .github/branch-protection/{develop,main}.json + branch-protection.yml apply workflow + check_required_contexts.py meta-gate |
| Commit format | Seven prefixes only | [tool.commitizen] schema + pr-title.yml allowlist + check_commit_types.py meta-gate |
| Secret scan | Three checkpoints | local hook → pre-commit → security.yml gitleaks |
| Container scan | HIGH/CRITICAL CVEs block | security.yml trivy-action |
| Dep scan | Pinned + audited | pip-audit, npm audit |
| Release | Reproducible artefacts | release.yml (image push to GHCR + CycloneDX SBOM) |
| Eval | LLM-output regressions | src/eval/, eval/, eval-nightly.yml (workflow_dispatch by default) |
| Issue execution | GitHub stays canonical; Beads can drive local ready/blocked work | GitHub issue templates + PR template + optional docs/BEADS.md queue guidance |
| Agent hooks | LLM coder side enforcement | .claude/hooks/{pretooluse_bash, posttooluse_writeedit, sessionstart}.py + settings.local.json.example |
| Skills | Auto-activated agent guidance | .claude/skills/{architect, code-reviewer, devops, frontend, qa-engineer, technical-writer} |
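
To make the "drift-checked" idea in the Branch protection row concrete, here is a minimal sketch of what a meta-gate of that kind might do. It is not the contents of check_required_contexts.py. It assumes the protection JSON mirrors GitHub's branch-protection REST body (names under required_status_checks.contexts), and the function names and check names (lint, typecheck, tests, coverage) are hypothetical.

```python
import json


def required_contexts(protection_json: str) -> set[str]:
    """Extract required status-check names from a branch-protection payload.

    Assumption: the payload follows GitHub's branch-protection REST body,
    with names listed under required_status_checks.contexts.
    """
    payload = json.loads(protection_json)
    checks = payload.get("required_status_checks") or {}
    return set(checks.get("contexts", []))


def find_drift(required: set[str], produced: set[str]) -> dict[str, list[str]]:
    """Compare required contexts with the check names CI actually reports."""
    return {
        # Required but never reported: a PR would wait on these indefinitely.
        "missing_from_ci": sorted(required - produced),
        # Reported but not required: these can fail without blocking a merge.
        "not_required": sorted(produced - required),
    }


if __name__ == "__main__":
    # Hypothetical payload and job names, for illustration only.
    example = json.dumps(
        {
            "required_status_checks": {
                "strict": True,
                "contexts": ["lint", "typecheck", "tests"],
            }
        }
    )
    drift = find_drift(required_contexts(example), {"lint", "tests", "coverage"})
    print(drift)
```

A gate like this would fail the build whenever either list is non-empty, so the declared protection and the workflows cannot silently diverge.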

Why "harness"

Each layer above catches something specific. None catches everything. Stacked, they form defence in depth: a mistake that slips past one layer must still pass every layer after it, so its cost is bounded by the first layer that catches it.

The harness is independent of the project's domain. The included src/api/echo and eval/golden_qa.json are scaffolding so every gate has something to operate on. Once you replace them with your real domain, the same harness keeps the new code under the same posture.
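
As a rough illustration of why the scaffolding can be swapped out without touching the gate, the sketch below runs a set of golden cases against any callable. The case schema (id, input, expected), the echo stand-in, and the function names are all assumptions made for this example; the real layout of eval/golden_qa.json and src/api/echo may differ.

```python
import json
from typing import Callable


def echo(text: str) -> str:
    """Stand-in for the scaffold endpoint: returns its input unchanged."""
    return text


def failing_cases(cases_json: str, system: Callable[[str], str]) -> list[str]:
    """Return the ids of golden cases whose output differs from the expected answer.

    Assumption: cases are a JSON list of objects with id, input, and expected keys.
    """
    cases = json.loads(cases_json)
    return [case["id"] for case in cases if system(case["input"]) != case["expected"]]


if __name__ == "__main__":
    # Hypothetical golden cases, for illustration only.
    golden = json.dumps(
        [
            {"id": "echo-1", "input": "hello", "expected": "hello"},
            {"id": "echo-2", "input": "ping", "expected": "pong"},
        ]
    )
    print(failing_cases(golden, echo))
```

Replacing echo with a real domain function changes what is being checked but not the check itself, which is the sense in which the gate is domain-independent.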

Reading order

For someone who hasn't worked in a harness like this before, start with docs/HARNESS_PRIMER.md — it's the same surface as this doc but written for non-engineers.

For an engineer setting up the template:

  1. docs/INVARIANTS.md — the load-bearing rules. Every PR is checked against them.
  2. docs/BOUNDARIES.md — module layering and the import-linter contracts.
  3. docs/DEVELOPMENT.md — local setup, the justfile, the CI pipeline.
  4. docs/EVAL_HARNESS.md — the eval flywheel; how to add a case, how to opt the nightly into running.
  5. docs/BEADS.md — optional local execution queue layered under GitHub Issues.
  6. docs/SECURITY.md — threat model + the defence-in-depth map.
  7. docs/ARCHITECTURE.md — scaffold-level diagram; expand as your domain lands.