-
Notifications
You must be signed in to change notification settings - Fork 0
Refactor review action to multi-phase orchestrator with model routing #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
d3ae6d5
c4e9c30
c014c45
37a1142
3737488
1ac0c46
80f1dcf
df71d9f
df89a93
66fdece
9aaa3ed
e55a3e8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| --- | ||
| name: compliance | ||
| model: claude-sonnet-4-5 | ||
| description: CLAUDE.md compliance specialist | ||
| --- | ||
|
|
||
| Audit changed files against relevant CLAUDE.md guidance. | ||
| Return only JSON findings with concrete rule references. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| --- | ||
| name: quality | ||
| model: claude-opus-4-6 | ||
| description: Code quality specialist for correctness and reliability | ||
| --- | ||
|
|
||
| Find high-signal correctness, reliability, and performance issues. | ||
| Return only JSON findings. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| --- | ||
| name: security | ||
| model: claude-opus-4-6 | ||
| description: Security specialist for exploitable vulnerabilities | ||
| --- | ||
|
|
||
| Find exploitable vulnerabilities in changed code with concrete attack paths. | ||
| Return only JSON findings including exploit preconditions and trust boundary. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| --- | ||
| name: triage | ||
| model: claude-haiku-4-5 | ||
| description: Fast PR triage for skip/continue decisions | ||
| --- | ||
|
|
||
| Determine whether review can be skipped safely. | ||
| Return only JSON with `skip_review`, `reason`, and `risk_level`. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| --- | ||
| name: validator | ||
| model: claude-sonnet-4-5 | ||
| description: Finding validation and deduplication specialist | ||
| --- | ||
|
|
||
| Validate candidate findings with strict confidence and impact criteria. | ||
| Return only JSON decisions for keep/drop. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| """Utilities for merging and deduplicating findings from multiple phases.""" | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤖 Code Review Finding: Missing tests for new functionality Severity: MEDIUM Impact: Violates CLAUDE.md requirement that 'Tests required for new functionality'. The merge_findings function is critical for deduplication and could fail silently if edge cases aren't tested (e.g., handling None values, empty lists, duplicate keys with same severity/confidence). Recommendation: Create claudecode/test_findings_merge.py with comprehensive tests covering: empty input, single finding, duplicate findings with different severities/confidences, edge cases for line number parsing, and invalid input handling. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤖 Code Review Finding: Missing tests for new functionality violates CLAUDE.md requirement Severity: HIGH Impact: Untested deduplication logic can lead to findings being incorrectly merged or dropped, causing false negatives (missed issues) or false positives (duplicate issues) in code reviews. Recommendation: Add test_findings_merge.py with comprehensive tests covering: _finding_key() normalization and key generation, _severity_rank() and _confidence_value() parsing, merge_findings() deduplication logic with various combinations of duplicate and unique findings, edge cases for malformed input. |
||
|
|
||
| from typing import Any, Dict, List, Tuple | ||
|
|
||
|
|
||
| def _normalize_text(value: Any) -> str: | ||
| return str(value or "").strip().lower() | ||
|
|
||
|
|
||
| def _finding_key(finding: Dict[str, Any]) -> Tuple[str, int, str, str]: | ||
| file_path = _normalize_text(finding.get("file")) | ||
| line = finding.get("line") | ||
| try: | ||
| line_no = int(line) | ||
| except (TypeError, ValueError): | ||
| line_no = 1 | ||
| category = _normalize_text(finding.get("category")) | ||
| title = _normalize_text(finding.get("title")) | ||
| return file_path, line_no, category, title | ||
|
|
||
|
|
||
| def _severity_rank(value: Any) -> int: | ||
| sev = _normalize_text(value).upper() | ||
| if sev == "HIGH": | ||
| return 3 | ||
| if sev == "MEDIUM": | ||
| return 2 | ||
| if sev == "LOW": | ||
| return 1 | ||
| return 0 | ||
|
|
||
|
|
||
| def _confidence_value(value: Any) -> float: | ||
| try: | ||
| return float(value) | ||
| except (TypeError, ValueError): | ||
| return 0.0 | ||
|
|
||
|
|
||
| def merge_findings(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]: | ||
| """Merge duplicate findings and keep the strongest candidate.""" | ||
| merged: Dict[Tuple[str, int, str, str], Dict[str, Any]] = {} | ||
|
|
||
| for finding in findings: | ||
| if not isinstance(finding, dict): | ||
| continue | ||
|
|
||
| key = _finding_key(finding) | ||
| existing = merged.get(key) | ||
|
|
||
| if existing is None: | ||
| merged[key] = finding | ||
| continue | ||
|
|
||
| incoming_score = (_severity_rank(finding.get("severity")), _confidence_value(finding.get("confidence"))) | ||
| existing_score = (_severity_rank(existing.get("severity")), _confidence_value(existing.get("confidence"))) | ||
|
|
||
| if incoming_score > existing_score: | ||
| merged[key] = finding | ||
|
|
||
| return list(merged.values()) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤖 Code Review Finding: Missing tests for new functionality
Severity: HIGH
Category: compliance
Impact: Untested code reduces reliability and makes future changes risky, potentially introducing bugs in the findings deduplication logic.
Recommendation: Create test_findings_merge.py with comprehensive tests for the merge_findings function, including edge cases for duplicate detection and priority ranking.