ReleaseGuard

MR-level release gating: four specialised agents inspect the diff in parallel; an arbitration layer collapses their signals into a single HOLD / REVIEW / PROCEED recommendation that lands at the top of the MR comment.

Why this exists

Traditional CI runs the full test suite plus rule checks and reports a single red/green bit. Real release risk isn't that flat. A single diff raises four questions at once: which tests need to run, how risky the rollout is, who knows this code, and whether the change introduces code-level anti-patterns. Squashing those four orthogonal views into one pass/fail throws away the signal that reviewers actually need.

ReleaseGuard splits these views into four specialised agents that run in parallel and emit structured signals, then a Decision Arbitration Layer collapses them into one actionable recommendation. The reviewer opens the MR and sees the verdict (HOLD / REVIEW / PROCEED) plus the specific triggered_signals that drove it.
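
As a rough illustration of how structured signals could collapse into one verdict, here is a minimal Go sketch. The `Signal` and `Verdict` types, the severity levels, and the worst-severity-wins rule are all assumptions made for this example; the project's actual decision matrix is documented in docs/architecture.md and may differ.

```go
package main

import "fmt"

// Severity is a hypothetical ordering of how serious a signal is.
type Severity int

const (
	Low Severity = iota
	Medium
	Critical
)

// Signal is an assumed shape for what a single agent emits.
type Signal struct {
	Agent    string // e.g. "rollout-risk" (illustrative name)
	Name     string // e.g. "config-drift" (illustrative name)
	Severity Severity
}

// Verdict holds the single recommendation plus the signals that drove it.
type Verdict struct {
	Recommendation   string
	TriggeredSignals []Signal
}

// Arbitrate applies a worst-severity-wins rule (an assumption, not the
// project's real matrix): any Critical signal yields HOLD, any Medium
// signal yields REVIEW, otherwise PROCEED.
func Arbitrate(signals []Signal) Verdict {
	worst := Low
	for _, s := range signals {
		if s.Severity > worst {
			worst = s.Severity
		}
	}
	v := Verdict{Recommendation: "PROCEED"}
	switch worst {
	case Critical:
		v.Recommendation = "HOLD"
	case Medium:
		v.Recommendation = "REVIEW"
	default:
		return v // nothing above Low, so there are no triggered signals to report
	}
	// Keep only the signals at the deciding severity so the verdict stays traceable.
	for _, s := range signals {
		if s.Severity == worst {
			v.TriggeredSignals = append(v.TriggeredSignals, s)
		}
	}
	return v
}

func main() {
	v := Arbitrate([]Signal{
		{Agent: "rollout-risk", Name: "config-drift", Severity: Medium},
		{Agent: "ai-reviewer", Name: "style-nit", Severity: Low},
	})
	fmt.Println(v.Recommendation, len(v.TriggeredSignals)) // prints: REVIEW 1
}
```

Returning the deciding signals alongside the recommendation mirrors the idea that the reviewer should see why a verdict was reached, not just the verdict.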

Visual proof

Three MR comments produced by docker compose up against the local harness — the same analyzer code path, three different diffs, three different recommendations:

  • ✅ PROCEED: docs-only change
  • 🟡 REVIEW: handler + config drift (4 medium zones)
  • 🔴 HOLD: critical drift in secrets config

Side-by-side breakdown plus the Topology 0 integration validation (real LCOV + reverse BFS over 17K symbols) live in docs/case_study.md.

How it works

flowchart LR
    A[GitLab MR] --> B[Analyzer]
    B --> C{4 agents in parallel}
    C --> D1[Selective Test]
    C --> D2[Rollout Risk]
    C --> D3[Ownership]
    C --> D4[AI Reviewer]
    D1 --> E[Arbitration]
    D2 --> E
    D3 --> E
    D4 --> E
    E --> F[HOLD / REVIEW / PROCEED]
    F --> G[MR comment]

Full pipeline, T0/T1 topology comparison, and the arbitration decision matrix (three mermaid diagrams) are in docs/architecture.md.
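
The fan-out/fan-in shape in the flowchart can be sketched in Go with goroutines and a `sync.WaitGroup`. This is only an illustration: the `Agent` function type, `RunAgents`, and the string results are hypothetical stand-ins, since this README does not show the analyzer's real interfaces.

```go
package main

import (
	"fmt"
	"sync"
)

// Agent is a hypothetical signature: it inspects a diff and returns a result.
type Agent func(diff string) string

// RunAgents runs every agent concurrently on the same diff and returns
// the results in the same order as the agents slice.
func RunAgents(diff string, agents []Agent) []string {
	results := make([]string, len(agents))
	var wg sync.WaitGroup
	for i, a := range agents {
		wg.Add(1)
		go func(i int, a Agent) {
			defer wg.Done()
			// Each goroutine writes only its own slot, so no lock is needed.
			results[i] = a(diff)
		}(i, a)
	}
	wg.Wait() // fan-in: block until all agents have reported
	return results
}

func main() {
	agents := []Agent{
		func(d string) string { return "selective-test saw " + d },
		func(d string) string { return "rollout-risk saw " + d },
		func(d string) string { return "ownership saw " + d },
		func(d string) string { return "ai-reviewer saw " + d },
	}
	for _, r := range RunAgents("example.diff", agents) {
		fmt.Println(r)
	}
}
```

Writing results into a pre-sized slice keeps the output order deterministic even though the agents finish in arbitrary order, which makes the downstream arbitration step easier to test.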

Design highlights & trade-offs

  • Many insights → one decision. The arbitration layer reconciles "four views speak at once" with "the reviewer wants one clear call". The triggered_signals array keeps every contributing signal traceable.
  • Ownership stays neutral. No reviewer scores, no auto-written approval rules, and the suggestion order is shuffled (a lint test enforces that score and kind are absent from the JSON schema). The system surfaces "who has touched this area" without making personnel judgements.
  • L1 / L2 / L3 confidence ladder. Selective Test uses the strongest data available and degrades gracefully when it is missing: path heuristics (confidence 0.6) → coverage_map intersection (0.75) → callgraph reverse BFS (0.9, subject to a dynamic-ratio confidence penalty). Real coverage is generated by cmd/cover2lcov, which runs go test -coverprofile per Test* across the whole repo and converts the output to LCOV before feeding the indexer; it is not a hand-crafted seed.
  • Topology 1 invites the zero-infra adopter. The full topology (Topology 0) needs Postgres + a nightly indexer; Topology 1 runs the analyzer container alone (degraded but usable), so an infrastructure prerequisite never blocks a first trial.
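
The confidence ladder described above might look roughly like the following Go sketch. The three confidence values come from this README; everything else is assumed for illustration, including the `Inputs` and `Selection` types, the mapping of L1/L2/L3 to the three strategies, and the linear form of the dynamic-ratio penalty.

```go
package main

import "fmt"

// Selection is an assumed result shape: which tests to run and how much to trust the pick.
type Selection struct {
	Level      string
	Tests      []string
	Confidence float64
}

// Inputs lists the data sources available for one run (hypothetical fields).
type Inputs struct {
	CallgraphTests []string // tests reached by reverse BFS over the call graph; nil if unavailable
	DynamicRatio   float64  // assumed meaning: share of call edges that cannot be resolved statically, in [0, 1]
	CoverageTests  []string // tests whose coverage intersects the changed lines; nil if unavailable
	PathTests      []string // tests guessed from file paths
}

// SelectTests picks the strongest available level and falls back otherwise.
func SelectTests(in Inputs) Selection {
	switch {
	case in.CallgraphTests != nil:
		// Assumed penalty: scale the 0.9 base down linearly with the dynamic ratio.
		conf := 0.9 * (1 - in.DynamicRatio)
		return Selection{Level: "L3", Tests: in.CallgraphTests, Confidence: conf}
	case in.CoverageTests != nil:
		return Selection{Level: "L2", Tests: in.CoverageTests, Confidence: 0.75}
	default:
		return Selection{Level: "L1", Tests: in.PathTests, Confidence: 0.6}
	}
}

func main() {
	// No call graph or coverage map available: fall back to path heuristics.
	s := SelectTests(Inputs{PathTests: []string{"TestHandler"}})
	fmt.Printf("%s %v %.2f\n", s.Level, s.Tests, s.Confidence)

	// Call graph available, with 20% of edges assumed to be dynamic.
	s = SelectTests(Inputs{CallgraphTests: []string{"TestHandler", "TestRouter"}, DynamicRatio: 0.2})
	fmt.Printf("%s %v %.2f\n", s.Level, s.Tests, s.Confidence)
}
```

Reporting the level next to the confidence lets a consumer distinguish "few tests selected because the change is small" from "few tests selected because only path heuristics were available".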

Quick start

make build
./bin/analyzer

Local end-to-end demo

cd deploy/compose
mkdir -p artifacts
docker compose up --build --abort-on-container-exit
for f in artifacts/note-proj*.md; do echo "=== $f ==="; cat "$f"; echo; done

Three analyzer instances run in parallel against a mock GitLab and post the three arbitration outcomes (PROCEED / REVIEW / HOLD) as separate MR comments. See deploy/compose/README.md for details.

Caller usage

include:
  - project: 'platform/releaseguard'
    ref: main
    file: 'deploy/ci/releaseguard.yaml'

releaseguard-review:
  extends: .releaseguard-full
  variables:
    TARGET_SERVICE_NAME: my-service
    TARGET_SERVICE_TYPE: backend

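Documentation

  • docs/architecture.md: full pipeline, T0/T1 topology comparison, and the arbitration decision matrix.
  • docs/case_study.md: side-by-side breakdown of the three demo verdicts and the Topology 0 integration validation.
  • deploy/compose/README.md: details of the local end-to-end demo.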
