diff --git a/docs/ai-engineering-operating-system/README.md b/docs/ai-engineering-operating-system/README.md
new file mode 100644
index 0000000..80438ae
--- /dev/null
+++ b/docs/ai-engineering-operating-system/README.md
@@ -0,0 +1,5 @@
+# AI Engineering Operating System
+
+Reusable SDLC loop framework for AI coding agents.
+
+See ai-engineering-operating-system.md, loop-catalog.md and verifier-catalog.md.
diff --git a/docs/ai-engineering-operating-system/adr-template.md b/docs/ai-engineering-operating-system/adr-template.md
new file mode 100644
index 0000000..834372f
--- /dev/null
+++ b/docs/ai-engineering-operating-system/adr-template.md
@@ -0,0 +1,21 @@
+# ADR-0000: Title
+
+## Status
+
+Proposed / Accepted / Superseded
+
+## Context
+
+What problem required a decision?
+
+## Decision
+
+What did we decide?
+
+## Alternatives Considered
+
+## Consequences
+
+## Rollback
+
+## References
diff --git a/docs/ai-engineering-operating-system/ai-engineering-operating-system.md b/docs/ai-engineering-operating-system/ai-engineering-operating-system.md
new file mode 100644
index 0000000..5581523
--- /dev/null
+++ b/docs/ai-engineering-operating-system/ai-engineering-operating-system.md
@@ -0,0 +1,67 @@
+# AI Engineering Operating System v3.0
+
+This document defines a reusable working model for AI coding agents.
+
+## Master instruction
+
+```text
+You are a senior software engineering agent operating through a deterministic SDLC loop.
+
+For every task:
+1. Understand the goal.
+2. Discover repository context.
+3. Analyze impact.
+4. Plan small reversible steps.
+5. Design the simplest safe solution.
+6. Review risk.
+7. Implement one increment at a time.
+8. Self-review.
+9. Verify with tools.
+10. Review security and performance.
+11. Update documentation.
+12. Update project memory.
+13. Stop only when the Definition of Done passes.
+
+Never claim success without verification evidence.
+Never overwrite user changes.
+Never skip planning or verification.
+If verification fails, return to the earliest failing phase and continue.
+If an action is risky or irreversible, request human approval.
+```
+
+## State machine
+
+```mermaid
+flowchart TD
+    A[Understand] --> B[Discover]
+    B --> C[Analyze]
+    C --> D[Plan]
+    D --> E[Design]
+    E --> F[Risk Review]
+    F --> G[Implement]
+    G --> H[Self Review]
+    H --> I[Verify]
+    I --> J{Passed?}
+    J -- No --> K[Diagnose]
+    K --> C
+    J -- Yes --> L[Document]
+    L --> M[Update Memory]
+    M --> N[Done]
+```
+
+## Definition of Done
+
+A task is complete only when:
+
+- goal achieved
+- acceptance criteria satisfied
+- applicable tests pass
+- build/lint/typecheck are clean when available
+- documentation is updated when behavior changes
+- security and performance were reviewed
+- rollback or revert path is known
+- no known critical defects remain
+
+## Evidence rule
+
+Every final answer must state what was verified. If a command was not run, say so clearly.
diff --git a/docs/ai-engineering-operating-system/context-template.md b/docs/ai-engineering-operating-system/context-template.md
new file mode 100644
index 0000000..991fd95
--- /dev/null
+++ b/docs/ai-engineering-operating-system/context-template.md
@@ -0,0 +1,33 @@
+# Project Context Template
+
+## Repository Purpose
+
+TBD
+
+## Architecture
+
+TBD
+
+## Stack
+
+TBD
+
+## Build Commands
+
+TBD
+
+## Test Commands
+
+TBD
+
+## Lint Commands
+
+TBD
+
+## Deployment Model
+
+TBD
+
+## Human Approval Boundaries
+
+TBD
diff --git a/docs/ai-engineering-operating-system/loop-catalog.md b/docs/ai-engineering-operating-system/loop-catalog.md
new file mode 100644
index 0000000..27ec8f7
--- /dev/null
+++ b/docs/ai-engineering-operating-system/loop-catalog.md
@@ -0,0 +1,57 @@
+# Loop Catalog
+
+## Universal Loop
+
+```mermaid
+flowchart TD
+    G[Goal] --> A[Analyze]
+    A --> P[Plan]
+    P --> I[Implement]
+    I --> V[Verify]
+    V --> Q{Verifier Passed?}
+    Q -- No --> D[Diagnose Failure]
+    D --> A
+    Q -- Yes --> C{Goal Met?}
+    C -- No --> P
+    C -- Yes --> Done[Done]
+```
+
+## Feature Loop
+
+Goal → user story → acceptance criteria → design → implement → tests → docs → CI → done.
+
+Verifier: acceptance criteria pass, tests pass, no regression.
+
+## Bug Fix Loop
+
+Bug report → reproduce → failing test → root cause → minimal fix → regression tests → done.
+
+Rule: do not patch symptoms when root cause can be found.
+
+## Refactoring Loop
+
+Baseline → characterization tests → small refactor → verify → repeat → done.
+
+Rule: no behavior change unless explicitly requested.
+
+## CI/CD Repair Loop
+
+CI failure → logs → classify failure → root cause → minimal patch → verify locally → PR → done.
+
+Failure classes: dependency, test, lint, build, environment, flaky test, timeout, permission, secret/config.
+
+## Infrastructure Loop
+
+Goal → validate → plan → policy review → approval if risky → apply to dev/test → smoke test → promote.
+
+Rule: no destructive infrastructure changes without approval.
+
+## Security Loop
+
+Threat → control → test → abuse case → fix → verify → document residual risk.
+
+## Performance Loop
+
+Measure → bottleneck → hypothesis → change → benchmark → compare → keep or rollback.
+
+Rule: do not optimize without measurement.
diff --git a/docs/ai-engineering-operating-system/memory.md b/docs/ai-engineering-operating-system/memory.md
new file mode 100644
index 0000000..a61b12b
--- /dev/null
+++ b/docs/ai-engineering-operating-system/memory.md
@@ -0,0 +1,3 @@
+# Memory
+
+Store decisions, lessons and reusable patterns here.
diff --git a/docs/ai-engineering-operating-system/model-routing.md b/docs/ai-engineering-operating-system/model-routing.md
new file mode 100644
index 0000000..0c3b58d
--- /dev/null
+++ b/docs/ai-engineering-operating-system/model-routing.md
@@ -0,0 +1,20 @@
+# Model Routing
+
+Use cheaper and local models first. Escalate only when confidence is low or risk is high.
+
+| Work type | Model class |
+|---|---|
+| File search | local or small |
+| Summaries | local or small |
+| Boilerplate | local or small |
+| Test generation | medium |
+| CI log review | medium |
+| Bug diagnosis | medium or strong |
+| Architecture | strong |
+| Final acceptance | strong |
+
+Rule:
+
+```text
+cheap first -> verify -> escalate only when needed
+```
diff --git a/docs/ai-engineering-operating-system/verifier-catalog.md b/docs/ai-engineering-operating-system/verifier-catalog.md
new file mode 100644
index 0000000..0e1b456
--- /dev/null
+++ b/docs/ai-engineering-operating-system/verifier-catalog.md
@@ -0,0 +1,37 @@
+# Verifier Catalog
+
+A verifier proves progress.
+
+## Strong verifiers
+
+- build
+- typecheck
+- unit tests
+- integration tests
+- smoke tests
+- formatter
+- linter
+- static analysis
+- dependency audit
+- container build
+- infrastructure validation
+- deployment health check
+- benchmark
+
+## Weak verifiers
+
+- self review
+- checklist
+- code reading
+
+## Record format
+
+```text
+Verifier:
+Command:
+Result:
+Evidence:
+Failures:
+Fix:
+Final status:
+```
diff --git a/docs/ai-engineering-operating-system/wiki/Home.md b/docs/ai-engineering-operating-system/wiki/Home.md
new file mode 100644
index 0000000..b019205
--- /dev/null
+++ b/docs/ai-engineering-operating-system/wiki/Home.md
@@ -0,0 +1,13 @@
+# AI Engineering Operating System Wiki
+
+## Pages
+
+- Operating System: ../ai-engineering-operating-system.md
+- Loop Catalog: ../loop-catalog.md
+- Verifier Catalog: ../verifier-catalog.md
+- Model Routing: ../model-routing.md
+- Memory: ../memory.md
+
+## Core idea
+
+Prompt once. Loop until verified. Preserve useful context. Improve every cycle.