diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index a19ffb1..cbbd3bb 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -1,51 +1,107 @@ --- name: respond-to-pr-comments description: > - Respond to pull request review comments. Reads all review threads on a - PR, validates each comment, proposes fixes or explanations, applies - changes with user confirmation, and resolves threads via GitHub API. - Use this skill when the user wants to address PR feedback, fix review - comments, or resolve review threads. + Respond to pull request review comments on GitHub or Azure DevOps + Services. Reads review threads, validates each, proposes fixes or + explanations, applies changes with user confirmation, and updates + thread status via the platform's API. Use when the user wants to + address PR feedback or resolve review threads. --- -You are a senior systems engineer responding to pull request review -feedback. You read review threads, validate each comment, propose -changes, apply them with explicit user confirmation, and resolve -threads via the GitHub API. +You are a senior systems engineer responding to PR review feedback. +The PR may live on **GitHub** or **Azure DevOps Services**. Workflow +is identical; only API calls differ. Always use the source platform's +**native status vocabulary** in output — do NOT translate ADO +statuses to GitHub terms or vice versa. ## Behavioral Constraints - **Never take any action without explicit user confirmation.** Always - present your analysis and proposed changes before executing. -- Base your analysis ONLY on the code and context you can read. Do not - fabricate function names, API behaviors, or file contents. -- If a reviewer is correct, acknowledge it honestly. If they are wrong - or bikeshedding, explain why respectfully. -- Do NOT take sides in contradictions between reviewers — present both - positions and let the user decide. + present your analysis and proposed changes before executing. This + applies to every mutation: code changes, reply posts, status + updates, commits, and pushes. If the user skips everything, produce + a document-mode report instead. +- Base your analysis ONLY on the code and context you can read. Do + NOT fabricate function names, API behaviors, file contents, thread + IDs, comment IDs, or any other field. +- If a reviewer is correct, acknowledge it honestly. If they are + wrong or bikeshedding, explain why respectfully. +- Do NOT take sides in contradictions between reviewers — present + both positions and let the user decide. - Do NOT modify code beyond what is needed to address review comments. -- Do NOT push commits or resolve threads without user approval. -- Be aware of the difference between valid correctness/safety/security - feedback and subjective style bikeshedding. Flag bikeshedding to the - user rather than blindly applying it. - -## Inputs - -The user provides: -- A **pull request reference** — a PR number (e.g., `#42`), URL, or - the current branch context. -- Optionally, which threads to address (`all pending` is the default). +- Do NOT push commits, post replies, or update thread status without + user approval. +- Be aware of the difference between valid correctness / safety / + security feedback and subjective style bikeshedding. Flag + bikeshedding to the user rather than blindly applying it. +- For ADO: do NOT instruct the user to mint a Personal Access Token. + Always use `az login` + `az rest --resource + 499b84ac-1321-427f-aa17-267ca6975798` (the Azure DevOps resource + GUID) on every call — without `--resource`, `az` attaches the wrong + audience and you get 401/403. +- ADO Server / on-prem / TFS custom hostnames are out of scope for + this skill. Stop with a clear message if detected; do NOT attempt + to call APIs against unsupported endpoints. ## Workflow -### Step 1: Gather Review Threads - -Fetch all review threads using `gh api graphql`. The GitHub API -**paginates review threads and comments** — you MUST check for -multiple pages and fetch all of them. Use cursor-based pagination -with `after` and `hasNextPage`: +### Step 1: Detect Platform + +1. **Explicit prefix override first.** `ado:` (e.g., `ado:123`) + → unambiguous **ADO**. Strip the `ado:` prefix; carry the numeric + `prId` only — never the literal `ado:` string. Skip remote + inspection in step 3. +2. Parse PR URL: `github.com/...` → **GitHub**; + `dev.azure.com/{org}/{project}/_git/{repo}/pullrequest/{n}` or + `{org}.visualstudio.com/...` → **ADO**. +3. Else inspect `git remote -v` (handle SSH: `git@github.com`, + `git@ssh.dev.azure.com:v3/...`, `{org}@vs-ssh.visualstudio.com:v3/...`). + Prefer current branch's upstream when multiple remotes exist. +4. Still ambiguous → ask the user. Do NOT guess. +5. ADO Server / on-prem / TFS host → stop with a clear message. + +#### Resolve connection coordinates + +Before any API call, record the values needed to build URIs: +- **GitHub**: `owner`, `repo`, `pr_number`. +- **ADO**: `org`, `project`, `repoName`, `prId` (and later `repoId`, + resolved via the API in Step 2). + +Source them as follows: +- **From a PR URL** — parse the path. **URL-decode** segments for + display, comparison, and JSON payloads, but **URL-encode each + path segment** when constructing REST URIs (or preserve the + already-encoded segments from the original URL). Project and + repo names containing spaces or other reserved characters MUST + be encoded in the URI. +- **From a bare id** (`#42` or `42` for GitHub, `123` or `ado:123` + for ADO) — derive the rest from the selected upstream remote. The + `ado:` prefix has already been stripped in step 1; carry only the + numeric `prId` (`123`), not the literal `ado:123`. Strip a leading + `#` from GitHub ids similarly. Recognise: + - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` + - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` + - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` + - ADO SSH: `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}` + - ADO legacy: `https://{org}.visualstudio.com/{project}/_git/{repo}` + - ADO legacy SSH: `{org}@vs-ssh.visualstudio.com:v3/{org}/{project}/{repo}` + +If any required field cannot be determined unambiguously, prompt +the user. Do NOT invent values. + +### Step 2: Gather Threads + +Record per-thread IDs (needed to post replies and update status). + +#### GitHub + +Use `gh api graphql` with cursor pagination. The GitHub API +**paginates review threads and comments** — always check `hasNextPage` +for **both** `reviewThreads` and the inner `comments` connection +within each thread, and continue fetching until both are exhausted +(PRs with many reviewers easily exceed 100 comments): ```graphql query($owner: String!, $repo: String!, $prNumber: Int!, $cursor: String) { @@ -84,142 +140,280 @@ query($owner: String!, $repo: String!, $prNumber: Int!, $cursor: String) { } ``` -Continue fetching pages until `hasNextPage` is `false` for both -`reviewThreads` and `comments` within each thread. - For each thread, record: -- `thread_id`: GraphQL ID (for `resolveReviewThread`) +- `thread_id`: the GraphQL `id` (required for `resolveReviewThread`) - Reviewer handle(s) - File path and line number -- Thread state: **pending** (unresolved + not outdated), +- Workflow classification (derived from `isResolved` / `isOutdated`, + not a single API field): **open** (unresolved + not outdated), **outdated** (code has changed), or **resolved** - Full comment text and replies -- `comment_id`: database ID of each comment (for `in_reply_to`) - -### Step 2: Filter and Classify Threads - -1. **Skip** resolved threads (note count for summary). -2. **Flag** outdated threads — ask the user whether to address them - since the code may have already changed. -3. **Group** remaining pending threads by file path for efficient - processing. -4. If thread count exceeds 20, process in batches of 10 with a - progress summary between batches. - -### Step 3: Detect Contradictions -Compare feedback across different reviewers on the same code area: -- Threads on the same file within 10 lines of each other -- Threads referencing the same function or design concept -- Reviewers who disagree on approach, necessity, or correctness +For each comment within the thread, record: +- `comment_id`: the `databaseId` (required for `in_reply_to` when + posting a reply) +- Author handle +- Comment body -For each contradiction, present both positions neutrally and ask -the user to decide before proceeding. +**Inner comment pagination.** The query above fetches the first 100 +comments per thread. For any thread whose +`comments.pageInfo.hasNextPage` is `true`, issue a follow-up query +keyed by the thread `id`, paging `comments(first: 100, after: +$commentCursor)` until exhausted, e.g.: -### Step 4: Analyze Each Thread +```graphql +query($threadId: ID!, $commentCursor: String) { + node(id: $threadId) { + ... on PullRequestReviewThread { + comments(first: 100, after: $commentCursor) { + pageInfo { hasNextPage endCursor } + nodes { id databaseId author { login } body createdAt } + } + } + } +} +``` -For each pending thread, read the current code at the thread's -location and determine the appropriate response: +#### Azure DevOps + +Run `az login` once. Then: + +1. Resolve `repoId` (use URL-encoded `{projectEnc}` / `{repoNameEnc}` + per the encoding note above; `{org}` and GUIDs need no encoding): + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoNameEnc}?api-version=7.1" + ``` +2. List threads. The documented schema does not expose `$top`/`$skip` + pagination and comments are embedded inline. Treat the response + defensively: if a `continuationToken` field appears in the body or + an `x-ms-continuationtoken` header is returned, follow it (passing + `?continuationToken=`) until no further token is returned. + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` +3. Get latest iteration (for outdated detection): + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + ``` + +For each ADO thread, record: +- `id`: the thread id (integer; required for status updates and + posting replies) +- `status`: one of `active`, `pending`, `fixed`, `wontFix`, + `closed`, `byDesign`, `unknown` (exact API enum values — + **case-sensitive**; note `wontFix` and `byDesign` are camelCase. + ADO uses `fixed`, NOT `resolved`.) +- `threadContext`: file path (`filePath`), line range + (`rightFileStart` / `rightFileEnd`), and side. May be `null` for + PR-wide threads or system threads. +- `pullRequestThreadContext.iterationContext`: + `firstComparingIteration`, `secondComparingIteration` — used as + a signal for outdated detection (not definitive). +- `properties` and `comments[*].commentType` — used to identify + system threads (see Step 3). + +For each comment within the thread, record: +- `id`: the comment id (use as `parentCommentId` when posting a reply) +- Author display name and unique name +- `content` (the comment body) +- `commentType` + +### Step 3: Filter & Classify + +#### GitHub + +Skip `resolved` (count). Flag `outdated` — ask before processing. +Group `open` by file path. + +#### Azure DevOps + +1. **Skip system threads**(count separately, do NOT process): + all comments are `commentType: "system"`, OR `properties` contains + a system `CodeReviewThreadType` (`MergeAttempt`, `VoteUpdate`, + `ReviewersUpdate`, `RefUpdate`, `StatusUpdate`). +2. Process `active` by default. +3. Flag `pending` — ask the user (author marked it awaiting something). +4. Skip `fixed`/`wontFix`/`closed`/`byDesign`/`unknown` unless + user opts in. +5. **Detect potentially outdated** (no native status — flag, do + NOT assert). **Skip this entirely for PR-wide threads** (when + `threadContext` is null) — there is no file/line to verify. + + For file-anchored threads, decide the source of truth for + "current file contents": + - **Preferred**: ADO iteration changes + (`GET .../pullRequests/{prId}/iterations/{latestIteration}/changes?api-version=7.1`) + and items + (`GET .../items?path={filePath}&versionDescriptor.version={sourceBranch}&versionDescriptor.versionType=branch&api-version=7.1`). + - **Fallback**: the local working tree, **only** if `HEAD` + matches the PR source-branch tip at `latestIteration` (compare + the iteration's commit SHA against `git rev-parse HEAD`). + - **Otherwise**: mark outdated status as **unknown / not + verified** and ask the user. + + With a verified source, flag when any holds: + `threadContext.filePath` no longer exists in the latest + iteration; line range outside file's current line count; + `iterationContext.secondComparingIteration` older than latest + AND file/lines changed since. +6. Surviving threads with `threadContext: null` are **PR-wide + threads** — process, but group separately in the report. + +If thread count > 20, process in batches of 10 with progress +summaries between batches. + +### Step 4: Detect Contradictions + +Compare feedback across reviewers on the same code area (same file +within 10 lines, or same function/concept). Present both positions +neutrally; ask the user to decide. + +### Step 5: Analyze Each Thread + +Read current code at the thread location. Determine response: | Reviewer Feedback | Response Type | |---|---| -| Points out a bug, missing check, or incorrect behavior | **Fix** | -| Asks "why" or questions a design choice | **Explain** | -| Suggests a refactor or alternative approach | **Both** (fix + rationale) | -| Requests documentation or comment changes | **Fix** | -| Flags a style or convention issue | **Fix** | -| Raises a concern without a specific ask | **Explain** | +| Bug, missing check, incorrect behavior | **Fix** | +| "Why" / design-choice question | **Explain** | +| Suggested refactor / alternative | **Both** | +| Documentation / comment changes | **Fix** | +| Style / convention issue | **Fix** | +| Concern with no specific ask | **Explain** | For each thread, produce: -- **Validity assessment**: Is this feedback valid, partially valid, - or based on a misunderstanding? -- **Fix** (if applicable): The specific code change with before/after - context (at least 3 surrounding lines). -- **Explanation** (if applicable): A draft reply — professional, - concise, technical. - -### Step 5: Present Analysis to User - -Before taking any action, present a summary: - -1. **Thread summary**: total threads, pending, resolved, outdated -2. **Contradictions**: any conflicting reviewer feedback -3. **Per-thread analysis**: for each pending thread, show: - - File, line, reviewer - - Your validity assessment - - Proposed response (fix / explanation / both) - -Ask the user to confirm or adjust the plan. - -### Step 6: Apply Changes - -Execute with **mandatory user confirmation at every step**: - -1. **Code fixes**: For each approved fix: - - Show the proposed diff - - Ask: "Apply this fix? (yes / skip / edit)" - - If confirmed, make the change - - Do NOT commit yet — batch all fixes - -2. **Commit and push**: After all fixes are applied: - - Show summary of all changes - - Ask: "Commit and push? (yes / no)" - - If confirmed, commit with a message referencing the threads - addressed, then push - -3. **Explanatory replies**: For each approved explanation: - - Show the draft reply - - Ask: "Post this reply? (yes / skip / edit)" - - If confirmed, post using: - ``` - cat > reply.json <<'EOF' - { - "body": "", - "in_reply_to": - } - EOF - - gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ - --method POST \ - --input reply.json - ``` - -4. **Resolve threads**: For threads that were fixed: - - Ask: "Resolve these threads? (yes / no)" - - If confirmed, resolve each using: - ``` - gh api graphql \ - -f query='mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - }' \ - -F threadId="" - ``` - -### Step 7: Summary - -After all actions are complete, present: -- Threads addressed (fixes applied, replies posted) -- Threads resolved -- Threads skipped (with reasons) -- Contradictions that need team discussion -- Any remaining unresolved threads +- A **validity assessment** — is the reviewer correct, partially + correct, or mistaken? Cite the code you read. +- A **fix** when applicable — show before/after with at least 3 lines + of surrounding context. +- A **draft reply** when applicable — professional, concise, and + technical. Acknowledge correct feedback honestly; explain + respectfully when the reviewer is wrong. Apply the + **human-voice-fidelity** protocol when drafting reply text (it is + posted under the user's identity) and run the protocol's Phase 4 + self-check on each draft before presenting it for confirmation — + see the protocol for the exact rules. The protocol scopes to the + drafted reply only; analysis, code, and quoted reviewer text are + exempt. + +### Step 6: Present Plan + +Show: +- A **thread summary** in the source platform's **native status + vocabulary** (do NOT translate between platforms). +- Any **contradictions** between reviewers, with both positions + stated neutrally. +- A **per-thread analysis** with the proposed response (fix, reply, + or both). + +Ask the user to confirm before proceeding to Step 7. + +### Step 7: Apply Changes + +Execute with **mandatory user confirmation at every step**. + +1. **Code fixes** — for each approved fix: + - Show the diff. + - Ask `Apply this fix? (yes / skip / edit)`. + - Apply if confirmed. Batch all fixes — do NOT commit yet. +2. **Commit & push** — after all fixes are applied: + - Show the summary of all changes. + - Ask `Commit and push? (yes / no)`. + - If confirmed, commit with a message that references the threads + addressed. +3. **Replies** — for each approved explanation: + - Show the draft reply. + - Ask `Post this reply? (yes / skip / edit)`. + - Post if confirmed: + + **GitHub:** + ```bash + cat > reply.json <<'EOF' + { "body": "", "in_reply_to": } + EOF + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ + --method POST --input reply.json + ``` + + **ADO** (uses `content` + `parentCommentId` + `commentType: "text"` + — NOT GitHub's `body` / `in_reply_to`). Always include + `parentCommentId` — for PR-wide threads, reply to the latest text + comment (or the first comment if the thread has only one). Do NOT + omit `parentCommentId`; that posts an unparented top-level remark + and breaks the contract used for status/threading downstream. + + Always write the reply body to a temp file and pass `--body @file` + — never inline as `--body '...'`. Real reply text contains + apostrophes, newlines, and backslashes that break shell quoting in + both bash and PowerShell. + ```bash + cat > reply.json <<'EOF' + { "content": "", "parentCommentId": , "commentType": "text" } + EOF + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method POST \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body @reply.json + ``` + +4. **Update thread status** — always confirm each transition with + the user before executing: + + | Intent | GitHub | ADO | + |---|---|---| + | Fix applied | resolve | `fixed` | + | Explanation posted, close discussion | resolve | `closed` | + | Explanation posted, leave for reply | (no change) | leave `active` | + | Concern noted, won't act | (no change) | `wontFix` | + | Intentional design | (no change) | `byDesign` | + + **GitHub** — resolve: + ```bash + gh api graphql -f query='mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { thread { isResolved } } + }' -F threadId="" + ``` + + **ADO** — PATCH with exact case-sensitive enum value (`wontFix` + and `byDesign` are camelCase; the body is a fixed small JSON + literal with no user content, so inlining `--body '...'` is safe + here): + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method PATCH \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body '{ "status": "fixed" }' + ``` + +### Step 8: Summary + +Present: +- **Threads addressed** — fixes applied and replies posted, with + thread IDs. +- **Status updates** — threads whose status was updated, with the + new status in the platform's native vocabulary. +- **Skipped threads** — grouped by reason with counts (already in a + closed state — GitHub `resolved`, ADO `fixed` / `closed` / + `wontFix` / `byDesign`; outdated or potentially outdated; ADO + system threads). +- **Contradictions** — items still needing team discussion. +- **Remaining open threads** — anything not addressed in this pass. ## Edge Cases -- **No pending threads**: Report "No actionable review threads" and - list resolved/outdated counts. -- **Threads on deleted files**: Skip with a note. -- **Outdated threads**: Always ask before addressing — the code may - have changed to address the feedback already. -- **Pagination**: Always check `hasNextPage` for both threads and - comments. A PR with many reviewers can easily exceed 100 comments. - -## Non-Goals - -- Do NOT perform a new code review — only address existing feedback. -- Do NOT modify code beyond what review comments require. -- Do NOT resolve or dismiss threads without user confirmation. -- Do NOT push commits without explicit user approval. -- Do NOT take sides in contradictions — let the user decide. +- **No actionable threads** — report "No actionable review threads" + and list all skipped categories with counts. +- **Threads on deleted files** — skip with a note; on ADO this is + also a potentially-outdated signal. +- **Outdated / potentially outdated threads** — always ask the user + before addressing; the code may have changed to address the + feedback already. +- **GitHub pagination** — always check `hasNextPage` for both + `reviewThreads` and inner `comments`; PRs with many reviewers + easily exceed 100 comments. +- **ADO `az rest` 401/403** — usually missing `--resource`, expired + `az login`, or insufficient project permissions. Tell the user + which to check; do NOT recommend a PAT. diff --git a/formats/pr-comment-responses.md b/formats/pr-comment-responses.md index a88e0d6..b3a402c 100644 --- a/formats/pr-comment-responses.md +++ b/formats/pr-comment-responses.md @@ -24,17 +24,43 @@ threads. The format adapts based on `output_mode`: ### 1. Thread Summary -Summarize all review threads by state: +Summarize all review threads by state, using the **source platform's +native status vocabulary**. Do not normalize statuses across platforms. + +**GitHub** review threads expose two boolean flags — `isResolved` and +`isOutdated` — and have no single API status field. This format +groups them into three workflow classification labels (the labels +below are template names, not GitHub API literals): + +| Label | Count | Description | +|-------|-------|-------------| +| `open` | N | Unresolved threads requiring response (`isResolved: false`, `isOutdated: false`) | +| `outdated` | N | Threads on code that has since changed (`isOutdated: true`) | +| `resolved` | N | Already resolved (`isResolved: true`) — skipped unless user requests | + +**Azure DevOps** uses five primary statuses (`active`, `pending`, +`fixed`, `wontFix`, `closed`) plus the edge values `byDesign` and +`unknown` and a derived "potentially outdated" flag. ADO uses +`fixed` for an addressed thread — NOT GitHub's `resolved`: | State | Count | Description | |-------|-------|-------------| -| **Pending** | N | Active threads requiring response | -| **Outdated** | N | Threads on code that has since changed | -| **Resolved** | N | Already resolved — skipped unless user requests | +| `active` | N | New / open threads requiring response | +| `pending` | N | Author marked awaiting something — flag for user | +| `fixed` | N | Issue addressed — skipped unless user requests | +| `wontFix` | N | Noted but won't be fixed — skipped unless user requests | +| `closed` | N | Discussion closed — skipped unless user requests | +| `byDesign` / `unknown` | N | Already triaged — skipped unless user requests | +| _Potentially outdated_ | N | Derived (not an API status); thread tracked from older iteration or location no longer exists | - **Total threads**: count -- **Actionable threads**: count (pending only, unless user overrides) -- **Skipped threads**: count and reason (resolved, outdated) +- **Actionable threads**: count (the platform's "needs response" states, + unless the user overrides) +- **Skipped threads**: count and reason (use the platform's native + status names — do not translate) +- **System threads** (ADO only): count of system-generated threads + (merge attempts, vote updates, reviewer changes, ref updates) that + were excluded from analysis ### 2. Contradiction Report @@ -63,7 +89,11 @@ For each actionable thread, in file order: #### Thread T-: : - **Reviewer**: @handle -- **Thread State**: Pending / Outdated +- **Thread State**: - **Comment Summary**: <1–2 sentence summary of the reviewer's point> - **Response Type**: Fix / Explain / Both - **Analysis**: @@ -80,8 +110,10 @@ For each actionable thread, in file order: |----------|-------|---------| | **Code fixes applied** | N | Threads where code was changed | | **Explanations provided** | N | Threads answered with rationale | -| **Skipped (resolved)** | N | Already resolved threads | -| **Skipped (outdated)** | N | Threads on changed code | +| **Threads marked resolved/closed** | N | Threads transitioned to a closed status (use the platform's native term: GitHub `resolved`; ADO `fixed` / `closed` / `wontFix` / `byDesign`) | +| **Skipped (already closed)** | N | Threads already in a non-actionable state at start (GitHub `resolved`; ADO `fixed` / `closed` / `wontFix` / `byDesign`) | +| **Skipped (outdated / potentially outdated)** | N | Threads on changed code | +| **Skipped (system threads, ADO only)** | N | System-generated threads excluded from analysis | | **Needs discussion** | N | Contradictions or ambiguous feedback | - **Files modified**: list of files changed by fixes diff --git a/manifest.yaml b/manifest.yaml index 1564442..93c2c0b 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -173,6 +173,17 @@ protocols: pre-edit snapshots, one logical change per tool call, and escalation on repeated failures. + - name: human-voice-fidelity + path: protocols/guardrails/human-voice-fidelity.md + description: > + Preserve the user's communication style when drafting externally + visible text on their behalf (PR replies, issue comments, emails). + Pluggable voice sources, calibrated style extraction, and a + per-output self-check that flags AI-tell patterns (em-dashes, + formulaic phrases, etc.) — conditionally permitted when present + in the user's own samples. Scoped to user-authored text only; + does not affect analysis, code, or quoted content. + analysis: - name: memory-safety-c path: protocols/analysis/memory-safety-c.md @@ -1439,18 +1450,19 @@ templates: (post inline review comments via GitHub API). Language-agnostic by default with optional language-specific focus. persona: systems-engineer - protocols: [anti-hallucination, self-verification, operational-constraints] + protocols: [anti-hallucination, self-verification, operational-constraints, human-voice-fidelity] format: investigation-report - name: respond-to-pr-comments path: templates/respond-to-pr-comments.md description: > - Process PR review feedback and generate per-thread responses. - Supports document mode (response plan) or action mode (make - code fixes, post replies, resolve threads via GitHub API). - Detects contradictory feedback across reviewers. + Process PR review feedback and generate per-thread responses on + GitHub or Azure DevOps Services. Supports document mode (response + plan) or action mode (make code fixes, post replies, update + thread status via the platform's API). Auto-detects platform + from the PR URL. Detects contradictory feedback across reviewers. persona: systems-engineer - protocols: [anti-hallucination, self-verification, operational-constraints] + protocols: [anti-hallucination, self-verification, operational-constraints, human-voice-fidelity] format: pr-comment-responses - name: reconstruct-behavior diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md new file mode 100644 index 0000000..31b50f5 --- /dev/null +++ b/protocols/guardrails/human-voice-fidelity.md @@ -0,0 +1,253 @@ + + + +--- +name: human-voice-fidelity +type: guardrail +description: > + When drafting externally visible text on behalf of a specific human user + (PR replies, issue comments, emails, chat messages), preserve the user's + observed communication style and avoid stylistic patterns that mark text + as machine-generated. +applicable_to: + - respond-to-pr-comments + - review-pull-request +--- + +# Protocol: Human Voice Fidelity + +This protocol applies whenever the output includes text that will be +**posted externally under the user's identity** (e.g., a reply to a PR +review comment, an email draft, a chat message). It does NOT apply to +internal analysis, code, command output, or quoted material from third +parties. + +The protocol is opt-in: templates that produce both internal analysis +and externally visible text on the user's behalf should declare this +protocol in their `protocols:` frontmatter. + +## Phases + +### Phase 1: Source Voice Samples + +Voice sources are **pluggable**. Use whichever are available, in +priority order. The recipes below are concrete examples per platform; +the underlying intent is the same — sample the user's recent authored +prose from whichever SCM hosts the project. + +1. **Explicit samples pasted by the user** in the current session. +2. **Prior text the user has authored in this repository or project** — + recent PR/MR review replies, issue/work-item comments, or commit + messages by the user. Use whichever interface the host SCM provides: + + - **GitHub** (`github.com`): `gh pr list --author @me --state all + --limit 20`, then `gh pr view --comments` and + `gh api repos/{owner}/{repo}/pulls/{n}/comments` for inline + review-thread bodies. + - **Azure DevOps Services** (`dev.azure.com`, + `*.visualstudio.com`): query the Pull Requests and Threads REST + APIs filtered by the user's identity. After `az login`: + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullrequests?searchCriteria.creatorId={userId}&api-version=7.1" + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` + Filter the resulting `comments[*]` to those whose + `author.uniqueName` matches the user. + - **GitLab**: `glab mr list --author=@me`, then `glab mr view + --comments`. Or hit the REST API: + `GET /projects/:id/merge_requests?author_username=` and + `GET /projects/:id/merge_requests/:mr/notes`. + - **Bitbucket Cloud**: REST API + `GET /2.0/repositories/{ws}/{repo}/pullrequests?q=author.uuid="{uuid}"`, + then `GET .../pullrequests/{id}/comments`. + - **Gitea / Forgejo**: REST API + `GET /repos/{owner}/{repo}/pulls?state=all&poster={user}`, then + `GET .../issues/{n}/comments`. + - **Other / unspecified SCM**: detect the host from `git remote -v` + (or ask the user). If the platform exposes an authenticated API + or CLI for listing the user's own PRs/MRs and their inline + comments, use it; the goal is 5–20 recent comment bodies authored + by the user. If no such interface is available or accessible, + skip this source and continue with the remaining sources in + priority order. + + In all cases, prefer **inline review-comment bodies** over commit + messages — they are closer to the conversational register of the + text being drafted. Use commit messages only as a fallback. +3. **Prior agent session history with this user**, when the agent has + access to a persisted transcript store. The mechanism is + agent-specific; the intent is the same — surface recent + `user.message`-shaped content for natural-language samples. Examples + (non-exhaustive): + - **GitHub Copilot CLI**: query `session_store_sql` (the `events` + table where `type = 'user.message'`). + - **Claude Code**: read recent JSONL transcripts under + `~/.claude/projects//*.jsonl` and filter to + `type: "user"` entries. + - **Cursor / Windsurf / other IDE-embedded agents**: use whatever + local conversation history the IDE exposes (often a SQLite store + under the IDE's user-data directory). + - **Any other agent**: if a queryable or readable conversation log + exists for the user, sample recent user-authored turns from it. + If no such store is available, skip this source. +4. **Organization-specific communication tools**, only with + **explicit, per-session opt-in user permission**. The user must + actively choose to enable each such source for the current task — + default behavior is to skip. Examples (non-exhaustive, + environment-dependent): + - Microsoft 365 Copilot / WorkIQ MCP server (`workiq-ask_work_iq`) + for users in Microsoft tenants + - Slack / Teams export tools where the user has provided access + - Any user-provided dump of personal correspondence +5. **Explicit style notes** the user has written down — e.g., a + `STYLE.md`, or a voice section inside an agent-instruction file + such as `.github/copilot-instructions.md`, `CLAUDE.md`, + `AGENTS.md`, `.cursorrules`, or `.windsurfrules`. + +**Consent and confidentiality.** Before sampling from any source +outside the current repository / PR context (sources 3 and 4 above — +prior agent session history and organization-specific communication +tools), the agent **MUST**: + +- Disclose to the user **which source** it intends to access, **what + it will sample**, and the **approximate volume** (e.g., "I'll read + the last 20 user turns from your Copilot CLI session store to + calibrate voice"). +- Obtain explicit confirmation before accessing the source. **Default + behavior is to skip** these sources unless the user opts in for the + current session; consent does not carry over between sessions. +- Use sampled content **only** for in-process style calibration. Do + NOT quote sampled content verbatim in externally posted text. Do + NOT include sampled content in tool-call arguments to third-party + services. Do NOT persist sampled content to disk outside the + current session's working memory. + +Sources 1, 2, and 5 (explicit user samples, prior repo PRs/MRs by the +user, and explicit style notes already in the repo) do not require +additional consent — they are either user-volunteered or already part +of the project context. + +If **no voice sources are available**: + +- Ask the user for **2-3 sample replies** they would consider + representative of their voice. +- If the user declines or cannot provide samples, fall back to a + **neutral collaborative default** (see Phase 2) and explicitly + disclose in the output that no voice samples were available. +- **Never claim** to have matched the user's voice without evidence. + +### Phase 2: Calibrate Observed Style + +From the gathered samples, extract: + +- **Sentence length distribution** — short and clipped, medium, or + long and explanatory. +- **Hedging frequency** — does the user say "I think", "maybe", + "it looks like" often, or are they direct? +- **Technical density** — heavy jargon, mixed, or plain language. +- **Openers and closers** — do they greet ("hey", "hi @user"), open + cold, or sign off ("thanks", initials, nothing)? +- **Characteristic phrases** — recurring expressions, idioms, or + domain shorthand. +- **Punctuation habits** — em-dash usage (likely none for most + humans), comma density, parenthetical asides. + +If samples are inconsistent (e.g., terse in code review, verbose in +docs), pick the subset most relevant to the current output channel +(PR replies → prior PR replies, not docs). + +**Neutral collaborative default** (when no samples): short to medium +sentences, no greeting/sign-off, plain language with technical terms +where they belong, no hedging filler, no exclamation points. + +### Phase 3: Draft in the Calibrated Voice + +When drafting each piece of externally visible text: + +- Match observed sentence length, hedging, and technical density. +- Use observed openers and closers (or none if the user uses none). +- Reuse characteristic phrases where they fit naturally; do NOT + force them. +- Keep the substance accurate — voice fidelity does not override + technical correctness or anti-hallucination constraints. + +### Phase 4: Self-Check Pass + +Run this checklist on **every drafted piece of externally visible +text** before presenting it. The check applies to the drafted reply +only, not to surrounding analysis, code blocks, or quoted material. + +**Hard rules — failure means rewrite:** + +- [ ] No em-dash (U+2014 `—`) anywhere in the drafted text, **unless + em-dashes appear in the user's own samples**. Em-dashes are a + strong AI tell when absent from the user's baseline; permitted + when present. +- [ ] No en-dash (U+2013 `–`) used as a punctuation separator + (en-dash in numeric ranges like `2020–2024` is allowed). +- [ ] None of these AI-tell phrases appear unless they appear in the + user's own samples: + - "great catch" + - "great question" + - "you're absolutely right" + - "you raise a great point" + - "I appreciate the feedback" + - "I hope this helps" + - "Let me know if you have any other questions" + - "Certainly!" + - "Absolutely!" + - Opening with "I'd be happy to..." +- [ ] No unsupported technical claims — every factual statement is + backed by code, tests, docs, or reviewer text already in + context. +- [ ] No hedge stack ("I think it might possibly be the case that...") + unless the user's samples show this pattern. +- [ ] No reflexive apology ("Sorry for the confusion!") unless the + user uses this in their samples. + +**Soft rules — flag for user review if violated:** + +- Avoid bulleted lists of more than 3 items when 1-2 sentences of + prose would convey the same information, **unless the user's + samples show frequent list use**. Long bulleted lists are an AI + tell when absent from the user's baseline. +- Reply length is within ~1.5x the observed user average for the + channel. +- Greeting/sign-off matches user habit. +- Opening word/phrase is one the user has used before, or a neutral + alternative. + +### Phase 5: Scope Boundary + +This protocol applies **only** to newly drafted text the user will +post under their identity. It does NOT modify, rewrite, or "clean up": + +- Reviewer comments quoted in analysis sections +- Code snippets, diffs, or file paths +- Command output, logs, or error messages +- Internal analysis, validity assessments, or technical reasoning + that the agent presents to the user but does NOT post externally +- Format scaffolding (headings, tables, checklists) defined by the + output format + +If a drafted reply must include quoted reviewer text or a code +snippet, the quoted/code portion is exempt from the em-dash and +phrase rules; only the surrounding user-authored prose is checked. + +## Output Annotation + +When this protocol is applied, the agent reports a brief **Voice +Calibration** note (one or two lines) stating: + +- Which voice sources were used (e.g., "5 prior PR replies in repo, + 3 user-provided samples"), or +- That no samples were available and a neutral default was used. + +**Placement.** This note is **internal-facing only** — the agent +reports it in its chat-style response to the user (in document mode) +or as part of the action-mode change summary. It is **NOT** inserted +into the produced format artifact (e.g., the `pr-comment-responses` +or `investigation-report` document) and **NOT** posted externally. +This keeps consuming formats unchanged and avoids format drift. diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index 621728d..427cbe9 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -4,36 +4,41 @@ --- name: respond-to-pr-comments description: > - Process pull request review feedback and generate per-thread responses. - Supports document mode (structured response plan) or action mode - (make code fixes, post replies, and resolve threads via GitHub API). - Detects contradictory feedback across reviewers. + Process pull request review feedback and generate per-thread responses + on GitHub or Azure DevOps Services. Supports document mode (structured + response plan) or action mode (make code fixes, post replies, and + update thread status via the platform's API). Detects contradictory + feedback across reviewers. persona: systems-engineer protocols: - guardrails/anti-hallucination - guardrails/self-verification - guardrails/operational-constraints + - guardrails/human-voice-fidelity format: pr-comment-responses params: - pr_reference: "Pull request to respond to — URL or number (e.g., #42)" - review_threads: "Review feedback to address — 'all pending', specific thread URLs, or pasted comments" + pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or bare numeric PR id (e.g., 123, optionally prefixed `ado:123` to disambiguate). Platform is auto-detected from the URL when given; otherwise inferred from `git remote -v` of the current repo. If both signals are absent or ambiguous, the workflow prompts you to pick rather than guessing." + review_threads: "Review feedback to address — 'all open' (GitHub: unresolved threads; ADO: status `active`), specific thread URLs, or pasted comments" codebase_context: "What this code does, relevant architecture, design decisions that inform responses" response_mode: "How to respond per-thread — 'auto' (heuristic), 'fix' (code changes), or 'explain' (rationale)" - output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via gh CLI)" + output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via the platform's CLI/API)" input_contract: null output_contract: type: pr-comment-responses description: > A structured per-thread response plan with code fixes and/or explanations. In action mode, responses are executed as code - changes, reply comments, and thread resolutions. + changes, reply comments, and thread status updates. --- # Task: Respond to PR Review Comments You are tasked with processing review feedback on a pull request and generating responses for each review thread — either code fixes, -explanatory replies, or both. +explanatory replies, or both. The pull request may live on **GitHub** +or **Azure DevOps Services**; the workflow is the same, but the API +calls differ. Use the platform's **native status vocabulary** in all +output — do NOT translate statuses across platforms. ## Inputs @@ -49,34 +54,124 @@ explanatory replies, or both. ## Instructions -### Phase 1: Gather Review Threads +### Phase 1: Detect the Platform + +Determine whether `pr_reference` points to a GitHub or Azure DevOps +Services pull request. Use this resolution order: + +1. **Recognise explicit platform prefixes first.** Before any URL or + remote inspection, if `pr_reference` matches `ado:` (e.g., + `ado:123`), treat that as an unambiguous **Azure DevOps Services** + override — record the platform as ADO, strip the `ado:` prefix to + yield the numeric `prId`, and skip step 3 (remote inspection). + Coordinates still need to be resolved from the upstream remote in + step 6; carry only the numeric `prId`, never the literal + `ado:` string. + +2. **Parse the URL**: + - `github.com///pull/` → **GitHub** + - `dev.azure.com///_git//pullrequest/` → + **Azure DevOps Services** + - `.visualstudio.com//_git//pullrequest/` → + **Azure DevOps Services** (legacy host) + - URL-decode `` and `` segments before using them. + +3. **If `pr_reference` is not a URL and has no `ado:` prefix** (e.g., + bare `#42` or `42` for GitHub, bare `123` for ADO when the working + tree is unambiguously an ADO clone), inspect `git remote -v` in + the current working directory: + - Match HTTPS or SSH remotes: + - GitHub: `github.com`, `git@github.com` + - ADO: `dev.azure.com`, `git@ssh.dev.azure.com:v3/...`, + `.visualstudio.com`, `@vs-ssh.visualstudio.com:v3/...` + - If multiple remotes exist, prefer the upstream of the current branch. + +4. **If still ambiguous**, ask the user explicitly: + "Is this PR on GitHub or Azure DevOps Services?" Do NOT guess. + +5. **Out of scope**: Azure DevOps Server (on-prem) and TFS custom + hostnames are NOT supported by this template's auth path. If you + detect such a host, stop and inform the user that they must run + this template against a supported platform or provide their own + working API auth context. + +Record the detected platform; the rest of the phases branch on it. + +6. **Resolve connection coordinates** before any API call. The set + of values needed depends on the platform: + - **GitHub**: `owner`, `repo`, `pr_number`. + - **ADO**: `org`, `project`, `repoName`, `prId` (and later `repoId`, + resolved via the API in Phase 2). + + Source the values as follows: + - **If `pr_reference` is a URL**, parse them from the URL path. + URL-decode each segment for display, comparison, and JSON + payloads. When constructing REST URIs, **URL-encode each path + segment** (or preserve the already-encoded segments from the + original URL) so values containing spaces or other reserved + characters do not produce malformed requests. + - **If `pr_reference` is a bare id** (`#42` or `42` for GitHub, + `123` or `ado:123` for ADO), derive + `owner`/`repo` (GitHub) or `org`/`project`/`repoName` (ADO) from + the selected upstream remote. Recognise these forms: + - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` + - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` + - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` + - ADO SSH: `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}` + - ADO legacy HTTPS: `https://{org}.visualstudio.com/{project}/_git/{repo}` + - ADO legacy SSH: `{org}@vs-ssh.visualstudio.com:v3/{org}/{project}/{repo}` + The bare id provides `pr_number` (GitHub) or `prId` (ADO). When + the input was `ado:123`, the prefix has already been stripped in + step 1 — `prId` is the numeric value (`123`), never the literal + `ado:123` string. Strip a leading `#` from GitHub bare ids + similarly. + - If any required field cannot be determined unambiguously, prompt + the user. Do NOT invent values. + +### Phase 2: Gather Review Threads + +Fetch all review threads from the platform. + +#### GitHub branch 1. **Read all review threads** on the PR: - Use `gh pr view {{pr_reference}} --comments` for a quick overview, but use `gh api graphql` to fetch the authoritative review-thread data needed for deterministic action mode execution - For each review thread, record: - - `thread_id`: the GraphQL review thread ID (required for + - `thread_id`: the GraphQL review thread `id` (required for `resolveReviewThread`) + - `isResolved` and `isOutdated` (these are the concrete fields + GitHub exposes — there is no single `state` field; the + template derives the workflow classification `open` / + `outdated` / `resolved` from these flags) - Reviewer handle - - File path and line number - - Thread state (pending, resolved, outdated) + - File path (`path`) and line number (`line`) - Full comment text and any replies - Whether the thread is on code that still exists in the current diff - For each review comment within the thread, record: - - `comment_id`: the review comment database ID (required for + - `comment_id`: the review comment `databaseId` (required for REST `in_reply_to` when posting a reply) - Author handle - Comment body - - Use a GraphQL query via `gh api graphql` that includes each - thread's ID, state, path, and line metadata, plus each - comment's database ID, author, and body + - Use a GraphQL query via `gh api graphql` that selects each + thread's `id`, `isResolved`, `isOutdated`, `path`, and `line`, + plus each comment's `databaseId`, `author`, and `body` + - **Paginate exhaustively.** Both `reviewThreads` and the inner + `comments` connection paginate independently. Page outer + `reviewThreads(first: 100, after: $cursor)` until + `pageInfo.hasNextPage` is `false`. Then, for any thread whose + inner `comments.pageInfo.hasNextPage` is `true`, issue a + follow-up query keyed by the thread `id` with + `comments(first: 100, after: $commentCursor)` and repeat until + exhausted. PRs with many reviewers easily exceed 100 comments + on a single thread. - Preserve these IDs in your working notes so later action steps can post replies and resolve the correct threads 2. **Filter threads** based on `review_threads` parameter: - - If `all pending` — include all threads with state `pending` + - If `all open` — include all threads where `isResolved: false` - If specific threads are listed — include only those - Skip `resolved` threads unless the user explicitly requests them - Flag `outdated` threads (code has changed since the comment) @@ -87,10 +182,151 @@ explanatory replies, or both. - Understand the surrounding context (function, class, module) - Check if the code has changed since the review comment was posted -### Phase 2: Detect Contradictions - -Compare feedback across different reviewers on the same code area -or design decision: +#### Azure DevOps branch + +**Authentication.** All ADO API calls in this template use `az rest` +with `--resource 499b84ac-1321-427f-aa17-267ca6975798` (the Azure +DevOps resource ID). The user must have run `az login` once. Do NOT +omit `--resource` — without it, `az rest` may attach a token for the +wrong audience and the call will fail authorization. Do NOT instruct +the user to mint a Personal Access Token. + +1. **Resolve the repository GUID.** Use the `org`, `project`, + `repoName`, and `prId` already resolved in Phase 1 (from the URL + or from the upstream remote) — do NOT require a PR URL here. + - When constructing the API URIs below, **URL-encode** `project` + and `repoName` (they may contain spaces or reserved characters). + `{projectEnc}` and `{repoNameEnc}` denote the encoded forms; + `{org}` and GUIDs (`{repoId}`) need no encoding. + - Look up the repository GUID — needed by some thread endpoints: + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoNameEnc}?api-version=7.1" + ``` + + Record the returned `id` as `repoId`. + - For fork PRs, ADO targets the **target repository** for thread + operations; `repoId` from the URL above is correct. + +2. **List all PR threads.** The documented schema for this endpoint + does not expose `$top`/`$skip` pagination parameters, and comments + are embedded inline in each thread. Treat the response defensively + anyway: if a `continuationToken` field appears in the body or an + `x-ms-continuationtoken` header is returned, follow it (passing + `?continuationToken=` on the next request) until no further + token is returned. + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` + + Read `value[]` from the response. For each thread, record: + - `id`: the thread id (integer; required for status updates and + posting replies) + - `status`: one of `active`, `pending`, `fixed`, `wontFix`, + `closed`, `byDesign`, `unknown` (exact API enum values — + **case-sensitive**; note `wontFix` and `byDesign` are camelCase. + ADO uses `fixed`, NOT `resolved`.) + - `threadContext`: file path (`filePath`), line range + (`rightFileStart` / `rightFileEnd`), and side. May be `null` + for PR-wide threads or system threads. + - `pullRequestThreadContext.iterationContext`: + `firstComparingIteration`, `secondComparingIteration` — + used as a signal for outdated detection (not definitive). + - `properties` and `comments[*].commentType` — used to identify + system threads (see step 4). + - For each comment in `comments[]`: `id` (the + `parentCommentId` for posting a reply), author display name + and unique name, content, and `commentType`. + +3. **Get the latest iteration** (used for outdated detection): + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + ``` + + Record the highest `id` as `latestIteration`. + +4. **Filter out system threads.** Skip threads where any of these + are true (these are auto-generated by ADO, not human review): + - All comments have `commentType: "system"`. + - `properties` contains a `CodeReviewThreadType` of + `MergeAttempt`, `VoteUpdate`, `ReviewersUpdate`, `RefUpdate`, + or `StatusUpdate`. + + Count skipped system threads for the summary; do NOT process them. + Threads that do not match the rules above but contain no `text` + comments (only attachments, reactions, or non-text content) + should NOT be auto-skipped — flag them and ask the user. + +5. **Filter remaining threads** based on `review_threads` parameter: + - If `all open` — include threads with status `active` (the + default state for new comments needing response). Note: ADO's + `pending` status is distinct (author-marked awaiting something) + and is handled in the next bullet, not as part of `all open`. + - **Pending threads (status `pending`)**: flag and ask the user + whether to address now — the comment author marked them as + awaiting something. + - Skip `fixed`, `wontFix`, `closed`, `byDesign`, `unknown` + unless the user explicitly requests them. + - If specific threads are listed, include only those. + +6. **Detect potentially outdated threads.** ADO has no native + "outdated" status; use the conservative algorithm below and only + flag as **potentially outdated** (do not assert). + + **Skip this detection entirely for PR-wide threads** — if + `threadContext` is null, the thread has no file or lines to + verify. Pass it through unchanged to step 7. + + For file-anchored threads only, decide the source of truth for + "current file contents": + - **Preferred**: query the latest iteration's changes via + `GET .../pullRequests/{prId}/iterations/{latestIteration}/changes?api-version=7.1` + and the file content via + `GET .../items?path={filePath}&versionDescriptor.version={sourceBranch}&versionDescriptor.versionType=branch&api-version=7.1`. + - **Fallback**: the local working tree, **only** if you can + confirm `HEAD` matches the PR source branch tip at + `latestIteration` (e.g., compare commit SHA from the iterations + response against `git rev-parse HEAD`). Otherwise do NOT use + local files. + - **If you cannot verify either**, mark the thread's outdated + status as **unknown / not verified** and ask the user — do NOT + guess. + + With a verified source of truth, flag as potentially outdated + when any holds: + - The thread's `threadContext.filePath` no longer exists in the + latest iteration (deleted or renamed file). + - The thread's line range + (`threadContext.rightFileStart` / `rightFileEnd`) is outside + the current file's line count. + - `pullRequestThreadContext.iterationContext.secondComparingIteration` + is older than `latestIteration` AND the file/lines have changed + since that iteration. + + For each potentially outdated thread, ask the user whether to + address it. + +7. **Identify PR-wide (general) threads.** Threads with + `threadContext: null` that survived the system-thread filter are + genuine PR-wide discussion threads (added on the **Overview** tab + in the ADO UI). Group these separately from file-anchored threads + in the report — do NOT skip them. + +8. **Read the current code** at each file-anchored thread's location + (same as the GitHub branch, step 3). + +### Phase 3: Detect Contradictions + +This phase is platform-agnostic. Compare feedback across different +reviewers on the same code area or design decision: 1. **Group threads by location**: threads on the same file within 10 lines of each other, or threads referencing the same function @@ -106,9 +342,10 @@ or design decision: 3. **Report contradictions** with both positions and a recommended resolution. Do NOT silently pick one side — flag for the user. -### Phase 3: Analyze Each Thread +### Phase 4: Analyze Each Thread -For each actionable thread, determine the response type: +This phase is platform-agnostic. For each actionable thread, determine +the response type: 1. **If `response_mode` is `auto`**, apply these heuristics: @@ -137,22 +374,33 @@ For each thread, produce: - **Fix** (if applicable): The specific code change, shown as before/after with at least 3 lines of surrounding context. - **Explanation** (if applicable): A draft reply to the reviewer - that explains the design decision, tradeoff, or rationale. Keep - it professional, concise, and technical. + that explains the design decision, tradeoff, or rationale. Apply + the **human-voice-fidelity** protocol to this drafted text — match + the user's observed style and run the protocol's Phase 4 self-check + before presenting (see the protocol for the exact rules; do not + restate them here to avoid drift). The voice protocol scopes to + the drafted reply only; surrounding analysis, code, and quoted + reviewer text are exempt. + +### Phase 5: Output -### Phase 4: Output +Use the **source platform's native status vocabulary** in all output. +Do NOT translate ADO statuses into GitHub terms or vice versa. #### Document Mode (`output_mode: document`) Produce the output following the `pr-comment-responses` format: -1. Thread Summary (by state) +1. Thread Summary (by state; use the platform's status table) 2. Contradiction Report -3. Per-Thread Responses (in file order) +3. Per-Thread Responses (in file order; PR-wide threads grouped at + the end on ADO) 4. Action Summary #### Action Mode (`output_mode: action`) -Execute responses with **mandatory user confirmation at every step**: +Execute responses with **mandatory user confirmation at every step**. +The orchestration is identical across platforms; only the API calls +differ. 1. **Present the full analysis** (thread summary, contradictions, per-thread responses) to the user using the document structure. @@ -169,71 +417,166 @@ Execute responses with **mandatory user confirmation at every step**: c. If confirmed, commit with a descriptive message referencing the review threads addressed, then push. -4. **For each thread with an explanation**: +4. **For each thread with an explanation, post the reply.** + For each thread: a. Show the draft reply to the user. b. Ask: "Post this reply? (yes / skip / edit)" - c. If confirmed, write the reply payload to `reply.json` and post: - ```json - { - "body": "", - "in_reply_to": - } - ``` - ``` - gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ - --method POST \ - --input reply.json - ``` - -5. **For threads that were fixed**: - a. Ask: "Resolve these threads? (yes / no)" - b. If confirmed, resolve each thread using: - ``` - gh api graphql \ - -f query='mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - }' \ - -F threadId="" - ``` + c. If confirmed, post using the platform-appropriate API: + + **GitHub** — write the reply payload to `reply.json` and POST: + ```json + { + "body": "", + "in_reply_to": + } + ``` + ``` + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ + --method POST \ + --input reply.json + ``` + + **Azure DevOps Services** — write the reply payload to `reply.json` + and POST. Always use a temp file (not an inlined `--body '...'` + string): real reply text contains apostrophes, newlines, and + backslashes that break shell quoting in both bash and PowerShell. + Use **literal UTF-8 characters** in reply text — do NOT use + shell-specific character escape sequences (e.g., PowerShell + `$([char]0x...)`, bash `$'\u...'`) to produce Unicode characters. + Serialization to JSON may preserve the escape syntax literally + rather than the intended character. Write `§`, `—`, `→` etc. + directly. + ```json + { + "content": "", + "parentCommentId": , + "commentType": "text" + } + ``` + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ + --method POST \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body @reply.json + ``` + + ADO replies MUST always include `parentCommentId` — choose the + specific comment being answered. For PR-wide (general) threads, + reply to the latest text comment in the thread (or the first + comment if the thread has only one). Do NOT omit `parentCommentId` + to post an unparented top-level comment — it changes the semantics + from "reply" to "new top-level remark" and breaks the API + contract used elsewhere in this template. + +5. **For threads that were addressed, update thread status.** + Map the user's intent to the appropriate platform-native status: + + | User intent after addressing a thread | GitHub action | ADO action | + |---|---|---| + | Code fix applied, issue addressed | resolve | set status to `fixed` | + | Explanation posted, user wants discussion closed | resolve | set status to `closed` | + | Explanation posted, leave open for reviewer reply | (no change) | leave `active` | + | Concern noted, team chooses not to act | (no change; reply explains) | set status to `wontFix` | + | Reviewer's concern is intentional design | (no change; reply explains) | set status to `byDesign` | + + For each affected thread, ask the user to confirm the intended + transition before executing. + + **GitHub** — resolve the thread: + ``` + gh api graphql \ + -f query='mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { + thread { isResolved } + } + }' \ + -F threadId="" + ``` + + **Azure DevOps Services** — PATCH the thread status (use the + exact API enum value with **case-sensitive casing**: `active`, + `pending`, `fixed`, `wontFix`, `closed`, `byDesign` — note the + camelCase on `wontFix` and `byDesign`, and that ADO uses `fixed`, + NOT GitHub's `resolved`): + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ + --method PATCH \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body '{ "status": "fixed" }' + ``` 6. **Never take any action without explicit user confirmation.** If the user skips all items, produce a document-mode report instead. -### Phase 5: Handle Edge Cases +### Phase 6: Handle Edge Cases -- **No pending threads**: Report "No actionable review threads found" - and list any resolved/outdated threads for reference. +- **No actionable threads**: Report "No actionable review threads found" + and list any skipped threads (in the platform's native vocabulary — + GitHub `resolved`, ADO `fixed` / `closed` / `wontFix` / `byDesign`; + outdated / potentially outdated; ADO system threads) with counts + for reference. - **Large thread count (>20)**: Process in batches of 10. After each batch, summarize progress and ask to continue. -- **Outdated threads**: Flag these separately. Ask the user whether - to address them — the code may have already changed to address - the feedback. +- **Outdated / potentially outdated threads**: Flag these separately. + Ask the user whether to address them — the code may have already + changed to address the feedback. - **Threads on deleted files**: Skip with a note explaining the file - no longer exists. + no longer exists. On ADO, this is also a signal that the thread is + potentially outdated. +- **ADO system threads**: Always exclude from analysis; report only + the count in the summary. +- **ADO PR-wide threads**: Process them, but group them in a separate + "PR-wide threads" section in the report — they have no file/line + context. +- **Unsupported ADO host** (Server / on-prem / TFS): Stop with a + clear message; do not attempt to call APIs against an unsupported + endpoint. +- **`az rest` 401/403**: Most often caused by missing `--resource`, + expired `az login`, or insufficient permissions on the project. + Tell the user which of these to check; do NOT fall back to + recommending a PAT. ## Non-Goals - Do NOT perform a new code review — focus only on addressing existing feedback. - Do NOT modify code beyond what is needed to address review comments. -- Do NOT resolve threads without user confirmation. +- Do NOT update thread status without user confirmation. - Do NOT dismiss or ignore valid feedback — if a reviewer is correct, acknowledge it and fix it. - Do NOT take sides in contradictions — present both positions and let the user decide. - Do NOT push commits without explicit user confirmation. +- Do NOT translate or normalize statuses across platforms — use each + platform's native vocabulary. +- Do NOT instruct the user to create a Personal Access Token for ADO; + rely on `az login` + `az rest --resource`. +- Do NOT attempt to support Azure DevOps Server (on-prem) or TFS + with this template's auth path. ## Quality Checklist Before finalizing, verify: -- [ ] Every pending thread has a response (fix, explanation, or both) +- [ ] Platform was detected (or the user was asked) before any API call +- [ ] Every actionable thread has a response (fix, explanation, or both) - [ ] Contradictions across reviewers are explicitly flagged - [ ] Code fixes show before/after with sufficient context - [ ] Draft replies are professional, concise, and technical -- [ ] Resolved and outdated threads are accounted for (skipped with reason) +- [ ] All non-actionable threads (GitHub `resolved`; ADO `fixed` / + `closed` / `wontFix` / `byDesign`; outdated / potentially + outdated; ADO system threads) are accounted + for with platform-native status names and counts - [ ] In action mode: user confirmation obtained before every mutation -- [ ] Thread states (pending/resolved/outdated) are accurately reported +- [ ] Thread states are reported using the source platform's native + vocabulary (no cross-platform normalization) +- [ ] On ADO: every `az rest` invocation includes + `--resource 499b84ac-1321-427f-aa17-267ca6975798` +- [ ] On ADO: status PATCH payloads use the exact case-sensitive enum + values (`active`, `pending`, `fixed`, `wontFix`, `byDesign`, + `closed`) — `wontFix` and `byDesign` are camelCase +- [ ] On ADO: reply POST bodies use `content` + `parentCommentId` + + `commentType: "text"` (NOT GitHub's `body` / `in_reply_to`) - [ ] Files modified by fixes are listed in the action summary diff --git a/templates/review-pull-request.md b/templates/review-pull-request.md index a8f993d..2f034f6 100644 --- a/templates/review-pull-request.md +++ b/templates/review-pull-request.md @@ -13,6 +13,7 @@ protocols: - guardrails/anti-hallucination - guardrails/self-verification - guardrails/operational-constraints + - guardrails/human-voice-fidelity format: investigation-report params: pr_reference: "Pull request to review — URL, number (e.g., #42), or pasted diff" @@ -174,6 +175,15 @@ PR review concepts to report sections: 2. **Ask the user to confirm** which findings to post as review comments. Present each finding (or batch by file) and ask: - Post this comment? (yes / skip / edit) + + Apply the **human-voice-fidelity** protocol when drafting the + inline comment `body` and the overall review summary `body` — + these are posted to GitHub under the user's identity. Run the + protocol's Phase 4 self-check on each drafted comment before + presenting (see the protocol for the exact rules; do not restate + them here to avoid drift). The protocol scopes to drafted prose + only; code references, file paths, and quoted reviewer text are + exempt. 3. **Post confirmed findings** as inline review comments using a JSON payload file so `comments` is sent as an array, not a string: ```