From a0c4e973f0e57374ef3f16b5dc9ab93fed15ebb8 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 12:12:41 -0600 Subject: [PATCH 01/12] feat(respond-to-pr-comments): add Azure DevOps Services support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Extends the respond-to-pr-comments template, skill, and format to handle Azure DevOps Services PRs alongside GitHub. Platform is auto-detected from the PR URL (with git-remote fallback); the workflow shape is shared and only API recipes branch per platform. - Auto-detect platform from PR URL or git remote (handles SSH and legacy visualstudio.com hosts); prompt on ambiguity, do not guess. - ADO auth uses 'az login' + 'az rest --resource ' on every call; no Personal Access Token path. - Preserve each platform's native status vocabulary in output (no cross-platform normalization). ADO uses 'fixed' (not 'resolved') per the CommentThreadStatus REST enum. - ADO reply payload uses content + parentCommentId + commentType ('text'); always set parentCommentId, including for PR-wide threads. - Filter ADO system threads (commentType 'system' or system CodeReviewThreadType properties); flag, do not auto-skip, threads with no text comments. - Conservative outdated detection: prefer ADO iteration/items API, fall back to local working tree only when HEAD matches the iteration's source-branch tip; otherwise mark unverified. - GitHub recipe paginates both reviewThreads and inner comments via follow-up cursored queries. - ADO Server / on-prem / TFS / custom hostnames are out of scope — stop with a clear message. - Update format file with per-platform status tables and add byDesign to the closed-state action summary. - Update manifest description to mention ADO Services. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 472 ++++++++++++------ formats/pr-comment-responses.md | 41 +- manifest.yaml | 9 +- templates/respond-to-pr-comments.md | 403 ++++++++++++--- 4 files changed, 704 insertions(+), 221 deletions(-) diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index a19ffb1..bf4d531 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -1,51 +1,100 @@ --- name: respond-to-pr-comments description: > - Respond to pull request review comments. Reads all review threads on a - PR, validates each comment, proposes fixes or explanations, applies - changes with user confirmation, and resolves threads via GitHub API. - Use this skill when the user wants to address PR feedback, fix review - comments, or resolve review threads. + Respond to pull request review comments on GitHub or Azure DevOps + Services. Reads review threads, validates each, proposes fixes or + explanations, applies changes with user confirmation, and updates + thread status via the platform's API. Use when the user wants to + address PR feedback or resolve review threads. --- -You are a senior systems engineer responding to pull request review -feedback. You read review threads, validate each comment, propose -changes, apply them with explicit user confirmation, and resolve -threads via the GitHub API. +You are a senior systems engineer responding to PR review feedback. +The PR may live on **GitHub** or **Azure DevOps Services**. Workflow +is identical; only API calls differ. Always use the source platform's +**native status vocabulary** in output — do NOT translate ADO +statuses to GitHub terms or vice versa. ## Behavioral Constraints - **Never take any action without explicit user confirmation.** Always - present your analysis and proposed changes before executing. -- Base your analysis ONLY on the code and context you can read. Do not - fabricate function names, API behaviors, or file contents. -- If a reviewer is correct, acknowledge it honestly. If they are wrong - or bikeshedding, explain why respectfully. -- Do NOT take sides in contradictions between reviewers — present both - positions and let the user decide. + present your analysis and proposed changes before executing. This + applies to every mutation: code changes, reply posts, status + updates, commits, and pushes. If the user skips everything, produce + a document-mode report instead. +- Base your analysis ONLY on the code and context you can read. Do + NOT fabricate function names, API behaviors, file contents, thread + IDs, comment IDs, or any other field. +- If a reviewer is correct, acknowledge it honestly. If they are + wrong or bikeshedding, explain why respectfully. +- Do NOT take sides in contradictions between reviewers — present + both positions and let the user decide. - Do NOT modify code beyond what is needed to address review comments. -- Do NOT push commits or resolve threads without user approval. -- Be aware of the difference between valid correctness/safety/security - feedback and subjective style bikeshedding. Flag bikeshedding to the - user rather than blindly applying it. - -## Inputs - -The user provides: -- A **pull request reference** — a PR number (e.g., `#42`), URL, or - the current branch context. -- Optionally, which threads to address (`all pending` is the default). +- Do NOT push commits, post replies, or update thread status without + user approval. +- Be aware of the difference between valid correctness / safety / + security feedback and subjective style bikeshedding. Flag + bikeshedding to the user rather than blindly applying it. +- For ADO: do NOT instruct the user to mint a Personal Access Token. + Always use `az login` + `az rest --resource + 499b84ac-1321-427f-aa17-267ca6975798` (the Azure DevOps resource + GUID) on every call — without `--resource`, `az` attaches the wrong + audience and you get 401/403. +- ADO Server / on-prem / TFS custom hostnames are out of scope for + this skill. Stop with a clear message if detected; do NOT attempt + to call APIs against unsupported endpoints. ## Workflow -### Step 1: Gather Review Threads - -Fetch all review threads using `gh api graphql`. The GitHub API -**paginates review threads and comments** — you MUST check for -multiple pages and fetch all of them. Use cursor-based pagination -with `after` and `hasNextPage`: +### Step 1: Detect Platform + +1. Parse PR URL: `github.com/...` → **GitHub**; + `dev.azure.com/{org}/{project}/_git/{repo}/pullrequest/{n}` or + `{org}.visualstudio.com/...` → **ADO**. +2. Else inspect `git remote -v` (handle SSH: `git@github.com`, + `git@ssh.dev.azure.com:v3/...`, `{org}@vs-ssh.visualstudio.com:v3/...`). + Prefer current branch's upstream when multiple remotes exist. +3. Still ambiguous → ask the user. Do NOT guess. +4. ADO Server / on-prem / TFS host → stop with a clear message. + +#### Resolve connection coordinates + +Before any API call, record the values needed to build URIs: +- **GitHub**: `owner`, `repo`, `pr_number`. +- **ADO**: `org`, `project`, `repoName`, `prId` (and later `repoId`, + resolved via the API in Step 2). + +Source them as follows: +- **From a PR URL** — parse the path. **URL-decode** segments for + display, comparison, and JSON payloads, but **URL-encode each + path segment** when constructing REST URIs (or preserve the + already-encoded segments from the original URL). Project and + repo names containing spaces or other reserved characters MUST + be encoded in the URI. +- **From a bare id** (`#42`, `!123`) — derive the rest from the + selected upstream remote. Recognise: + - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` + - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` + - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` + - ADO SSH: `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}` + - ADO legacy: `https://{org}.visualstudio.com/{project}/_git/{repo}` + - ADO legacy SSH: `{org}@vs-ssh.visualstudio.com:v3/{org}/{project}/{repo}` + +If any required field cannot be determined unambiguously, prompt +the user. Do NOT invent values. + +### Step 2: Gather Threads + +Record per-thread IDs (needed to post replies and update status). + +#### GitHub + +Use `gh api graphql` with cursor pagination. The GitHub API +**paginates review threads and comments** — always check `hasNextPage` +for **both** `reviewThreads` and the inner `comments` connection +within each thread, and continue fetching until both are exhausted +(PRs with many reviewers easily exceed 100 comments): ```graphql query($owner: String!, $repo: String!, $prNumber: Int!, $cursor: String) { @@ -84,142 +133,257 @@ query($owner: String!, $repo: String!, $prNumber: Int!, $cursor: String) { } ``` -Continue fetching pages until `hasNextPage` is `false` for both -`reviewThreads` and `comments` within each thread. - For each thread, record: -- `thread_id`: GraphQL ID (for `resolveReviewThread`) +- `thread_id`: the GraphQL `id` (required for `resolveReviewThread`) - Reviewer handle(s) - File path and line number - Thread state: **pending** (unresolved + not outdated), **outdated** (code has changed), or **resolved** - Full comment text and replies -- `comment_id`: database ID of each comment (for `in_reply_to`) - -### Step 2: Filter and Classify Threads - -1. **Skip** resolved threads (note count for summary). -2. **Flag** outdated threads — ask the user whether to address them - since the code may have already changed. -3. **Group** remaining pending threads by file path for efficient - processing. -4. If thread count exceeds 20, process in batches of 10 with a - progress summary between batches. - -### Step 3: Detect Contradictions -Compare feedback across different reviewers on the same code area: -- Threads on the same file within 10 lines of each other -- Threads referencing the same function or design concept -- Reviewers who disagree on approach, necessity, or correctness +For each comment within the thread, record: +- `comment_id`: the `databaseId` (required for `in_reply_to` when + posting a reply) +- Author handle +- Comment body -For each contradiction, present both positions neutrally and ask -the user to decide before proceeding. +**Inner comment pagination.** The query above fetches the first 100 +comments per thread. For any thread whose +`comments.pageInfo.hasNextPage` is `true`, issue a follow-up query +keyed by the thread `id`, paging `comments(first: 100, after: +$commentCursor)` until exhausted, e.g.: -### Step 4: Analyze Each Thread +```graphql +query($threadId: ID!, $commentCursor: String) { + node(id: $threadId) { + ... on PullRequestReviewThread { + comments(first: 100, after: $commentCursor) { + pageInfo { hasNextPage endCursor } + nodes { id databaseId author { login } body createdAt } + } + } + } +} +``` -For each pending thread, read the current code at the thread's -location and determine the appropriate response: +#### Azure DevOps + +Run `az login` once. Then: + +1. Resolve `repoId`: + ``` + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoName}?api-version=7.1" + ``` +2. List threads (returns full `value[]` in one response — no + `$top`/`$skip` pagination on this endpoint; comments are embedded): + ``` + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` +3. Get latest iteration (for outdated detection): + ``` + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + ``` + +For each ADO thread, record: +- `id`: the thread id (integer; required for status updates and + posting replies) +- `status`: one of `active`, `pending`, `fixed`, `wontFix`, + `closed`, `byDesign`, `unknown` (lowercase API enum values — + ADO uses `fixed`, NOT `resolved`) +- `threadContext`: file path (`filePath`), line range + (`rightFileStart` / `rightFileEnd`), and side. May be `null` for + PR-wide threads or system threads. +- `pullRequestThreadContext.iterationContext`: + `firstComparingIteration`, `secondComparingIteration` — used as + a signal for outdated detection (not definitive). +- `properties` and `comments[*].commentType` — used to identify + system threads (see Step 3). + +For each comment within the thread, record: +- `id`: the comment id (use as `parentCommentId` when posting a reply) +- Author display name and unique name +- `content` (the comment body) +- `commentType` + +### Step 3: Filter & Classify + +#### GitHub + +Skip `resolved` (count). Flag `outdated` — ask before processing. +Group `pending` by file path. + +#### Azure DevOps + +1. **Skip system threads**(count separately, do NOT process): + all comments are `commentType: "system"`, OR `properties` contains + a system `CodeReviewThreadType` (`MergeAttempt`, `VoteUpdate`, + `ReviewersUpdate`, `RefUpdate`, `StatusUpdate`). +2. Process `active` by default. +3. Flag `pending` — ask the user (author marked it awaiting something). +4. Skip `fixed`/`wontFix`/`closed`/`byDesign`/`unknown` unless + user opts in. +5. **Detect potentially outdated** (no native status — flag, do + NOT assert). **Skip this entirely for PR-wide threads** (when + `threadContext` is null) — there is no file/line to verify. + + For file-anchored threads, decide the source of truth for + "current file contents": + - **Preferred**: ADO iteration changes + (`GET .../pullRequests/{prId}/iterations/{latestIteration}/changes?api-version=7.1`) + and items + (`GET .../items?path={filePath}&versionDescriptor.version={sourceBranch}&versionDescriptor.versionType=branch&api-version=7.1`). + - **Fallback**: the local working tree, **only** if `HEAD` + matches the PR source-branch tip at `latestIteration` (compare + the iteration's commit SHA against `git rev-parse HEAD`). + - **Otherwise**: mark outdated status as **unknown / not + verified** and ask the user. + + With a verified source, flag when any holds: + `threadContext.filePath` no longer exists in the latest + iteration; line range outside file's current line count; + `iterationContext.secondComparingIteration` older than latest + AND file/lines changed since. +6. Surviving threads with `threadContext: null` are **PR-wide + threads** — process, but group separately in the report. + +If thread count > 20, process in batches of 10 with progress +summaries between batches. + +### Step 4: Detect Contradictions + +Compare feedback across reviewers on the same code area (same file +within 10 lines, or same function/concept). Present both positions +neutrally; ask the user to decide. + +### Step 5: Analyze Each Thread + +Read current code at the thread location. Determine response: | Reviewer Feedback | Response Type | |---|---| -| Points out a bug, missing check, or incorrect behavior | **Fix** | -| Asks "why" or questions a design choice | **Explain** | -| Suggests a refactor or alternative approach | **Both** (fix + rationale) | -| Requests documentation or comment changes | **Fix** | -| Flags a style or convention issue | **Fix** | -| Raises a concern without a specific ask | **Explain** | +| Bug, missing check, incorrect behavior | **Fix** | +| "Why" / design-choice question | **Explain** | +| Suggested refactor / alternative | **Both** | +| Documentation / comment changes | **Fix** | +| Style / convention issue | **Fix** | +| Concern with no specific ask | **Explain** | For each thread, produce: -- **Validity assessment**: Is this feedback valid, partially valid, - or based on a misunderstanding? -- **Fix** (if applicable): The specific code change with before/after - context (at least 3 surrounding lines). -- **Explanation** (if applicable): A draft reply — professional, - concise, technical. - -### Step 5: Present Analysis to User - -Before taking any action, present a summary: - -1. **Thread summary**: total threads, pending, resolved, outdated -2. **Contradictions**: any conflicting reviewer feedback -3. **Per-thread analysis**: for each pending thread, show: - - File, line, reviewer - - Your validity assessment - - Proposed response (fix / explanation / both) - -Ask the user to confirm or adjust the plan. - -### Step 6: Apply Changes - -Execute with **mandatory user confirmation at every step**: - -1. **Code fixes**: For each approved fix: - - Show the proposed diff - - Ask: "Apply this fix? (yes / skip / edit)" - - If confirmed, make the change - - Do NOT commit yet — batch all fixes - -2. **Commit and push**: After all fixes are applied: - - Show summary of all changes - - Ask: "Commit and push? (yes / no)" - - If confirmed, commit with a message referencing the threads - addressed, then push - -3. **Explanatory replies**: For each approved explanation: - - Show the draft reply - - Ask: "Post this reply? (yes / skip / edit)" - - If confirmed, post using: - ``` - cat > reply.json <<'EOF' - { - "body": "", - "in_reply_to": - } - EOF - - gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ - --method POST \ - --input reply.json - ``` - -4. **Resolve threads**: For threads that were fixed: - - Ask: "Resolve these threads? (yes / no)" - - If confirmed, resolve each using: - ``` - gh api graphql \ - -f query='mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - }' \ - -F threadId="" - ``` - -### Step 7: Summary - -After all actions are complete, present: -- Threads addressed (fixes applied, replies posted) -- Threads resolved -- Threads skipped (with reasons) -- Contradictions that need team discussion -- Any remaining unresolved threads +- A **validity assessment** — is the reviewer correct, partially + correct, or mistaken? Cite the code you read. +- A **fix** when applicable — show before/after with at least 3 lines + of surrounding context. +- A **draft reply** when applicable — professional, concise, and + technical. Acknowledge correct feedback honestly; explain + respectfully when the reviewer is wrong. + +### Step 6: Present Plan + +Show: +- A **thread summary** in the source platform's **native status + vocabulary** (do NOT translate between platforms). +- Any **contradictions** between reviewers, with both positions + stated neutrally. +- A **per-thread analysis** with the proposed response (fix, reply, + or both). + +Ask the user to confirm before proceeding to Step 7. + +### Step 7: Apply Changes + +Execute with **mandatory user confirmation at every step**. + +1. **Code fixes** — for each approved fix: + - Show the diff. + - Ask `Apply this fix? (yes / skip / edit)`. + - Apply if confirmed. Batch all fixes — do NOT commit yet. +2. **Commit & push** — after all fixes are applied: + - Show the summary of all changes. + - Ask `Commit and push? (yes / no)`. + - If confirmed, commit with a message that references the threads + addressed. +3. **Replies** — for each approved explanation: + - Show the draft reply. + - Ask `Post this reply? (yes / skip / edit)`. + - Post if confirmed: + + **GitHub:** + ``` + cat > reply.json <<'EOF' + { "body": "", "in_reply_to": } + EOF + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ + --method POST --input reply.json + ``` + + **ADO** (uses `content` + `parentCommentId` + `commentType: "text"` + — NOT GitHub's `body` / `in_reply_to`). Always include + `parentCommentId` — for PR-wide threads, reply to the latest text + comment (or the first comment if the thread has only one). Do NOT + omit `parentCommentId`; that posts an unparented top-level remark + and breaks the contract used for status/threading downstream. + ``` + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method POST \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body '{ "content": "", "parentCommentId": , "commentType": "text" }' + ``` + +4. **Update thread status** — always confirm each transition with + the user before executing: + + | Intent | GitHub | ADO | + |---|---|---| + | Fix applied | resolve | `fixed` | + | Explanation posted, close discussion | resolve | `closed` | + | Explanation posted, leave for reply | (no change) | leave `active` | + | Concern noted, won't act | (no change) | `wontFix` | + | Intentional design | (no change) | `byDesign` | + + **GitHub** — resolve: + ``` + gh api graphql -f query='mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { thread { isResolved } } + }' -F threadId="" + ``` + + **ADO** — PATCH with exact lowercase enum: + ``` + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method PATCH \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body '{ "status": "fixed" }' + ``` + +### Step 8: Summary + +Present: +- **Threads addressed** — fixes applied and replies posted, with + thread IDs. +- **Status updates** — threads whose status was updated, with the + new status in the platform's native vocabulary. +- **Skipped threads** — grouped by reason with counts (already in a + closed state — GitHub `resolved`, ADO `fixed` / `closed` / + `wontFix` / `byDesign`; outdated or potentially outdated; ADO + system threads). +- **Contradictions** — items still needing team discussion. +- **Remaining open threads** — anything not addressed in this pass. ## Edge Cases -- **No pending threads**: Report "No actionable review threads" and - list resolved/outdated counts. -- **Threads on deleted files**: Skip with a note. -- **Outdated threads**: Always ask before addressing — the code may - have changed to address the feedback already. -- **Pagination**: Always check `hasNextPage` for both threads and - comments. A PR with many reviewers can easily exceed 100 comments. - -## Non-Goals - -- Do NOT perform a new code review — only address existing feedback. -- Do NOT modify code beyond what review comments require. -- Do NOT resolve or dismiss threads without user confirmation. -- Do NOT push commits without explicit user approval. -- Do NOT take sides in contradictions — let the user decide. +- **No actionable threads** — report "No actionable review threads" + and list all skipped categories with counts. +- **Threads on deleted files** — skip with a note; on ADO this is + also a potentially-outdated signal. +- **Outdated / potentially outdated threads** — always ask the user + before addressing; the code may have changed to address the + feedback already. +- **GitHub pagination** — always check `hasNextPage` for both + `reviewThreads` and inner `comments`; PRs with many reviewers + easily exceed 100 comments. +- **ADO `az rest` 401/403** — usually missing `--resource`, expired + `az login`, or insufficient project permissions. Tell the user + which to check; do NOT recommend a PAT. diff --git a/formats/pr-comment-responses.md b/formats/pr-comment-responses.md index a88e0d6..dd951cd 100644 --- a/formats/pr-comment-responses.md +++ b/formats/pr-comment-responses.md @@ -24,17 +24,40 @@ threads. The format adapts based on `output_mode`: ### 1. Thread Summary -Summarize all review threads by state: +Summarize all review threads by state, using the **source platform's +native status vocabulary**. Do not normalize statuses across platforms. + +**GitHub** uses three states: `pending`, `outdated`, `resolved`. | State | Count | Description | |-------|-------|-------------| -| **Pending** | N | Active threads requiring response | +| **Pending** | N | Unresolved threads requiring response | | **Outdated** | N | Threads on code that has since changed | | **Resolved** | N | Already resolved — skipped unless user requests | +**Azure DevOps** uses five primary statuses (`active`, `pending`, +`fixed`, `wontFix`, `closed`) plus the edge values `byDesign` and +`unknown` and a derived "potentially outdated" flag. ADO uses +`fixed` for an addressed thread — NOT GitHub's `resolved`: + +| State | Count | Description | +|-------|-------|-------------| +| **Active** | N | New / open threads requiring response | +| **Pending** | N | Author marked awaiting something — flag for user | +| **Fixed** | N | Issue addressed — skipped unless user requests | +| **Won't fix** | N | Noted but won't be fixed — skipped unless user requests | +| **Closed** | N | Discussion closed — skipped unless user requests | +| **By design** / **Unknown** | N | Already triaged — skipped unless user requests | +| **Potentially outdated** | N | Thread tracked from older iteration / location no longer exists | + - **Total threads**: count -- **Actionable threads**: count (pending only, unless user overrides) -- **Skipped threads**: count and reason (resolved, outdated) +- **Actionable threads**: count (the platform's "needs response" states, + unless the user overrides) +- **Skipped threads**: count and reason (use the platform's native + status names — do not translate) +- **System threads** (ADO only): count of system-generated threads + (merge attempts, vote updates, reviewer changes, ref updates) that + were excluded from analysis ### 2. Contradiction Report @@ -63,7 +86,9 @@ For each actionable thread, in file order: #### Thread T-: : - **Reviewer**: @handle -- **Thread State**: Pending / Outdated +- **Thread State**: - **Comment Summary**: <1–2 sentence summary of the reviewer's point> - **Response Type**: Fix / Explain / Both - **Analysis**: @@ -80,8 +105,10 @@ For each actionable thread, in file order: |----------|-------|---------| | **Code fixes applied** | N | Threads where code was changed | | **Explanations provided** | N | Threads answered with rationale | -| **Skipped (resolved)** | N | Already resolved threads | -| **Skipped (outdated)** | N | Threads on changed code | +| **Threads marked resolved/closed** | N | Threads transitioned to a closed status (use the platform's native term: GitHub `resolved`; ADO `fixed` / `closed` / `wontFix` / `byDesign`) | +| **Skipped (already closed)** | N | Threads already in a non-actionable state at start (GitHub `resolved`; ADO `fixed` / `closed` / `wontFix` / `byDesign`) | +| **Skipped (outdated / potentially outdated)** | N | Threads on changed code | +| **Skipped (system threads, ADO only)** | N | System-generated threads excluded from analysis | | **Needs discussion** | N | Contradictions or ambiguous feedback | - **Files modified**: list of files changed by fixes diff --git a/manifest.yaml b/manifest.yaml index 1564442..819a44b 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -1445,10 +1445,11 @@ templates: - name: respond-to-pr-comments path: templates/respond-to-pr-comments.md description: > - Process PR review feedback and generate per-thread responses. - Supports document mode (response plan) or action mode (make - code fixes, post replies, resolve threads via GitHub API). - Detects contradictory feedback across reviewers. + Process PR review feedback and generate per-thread responses on + GitHub or Azure DevOps Services. Supports document mode (response + plan) or action mode (make code fixes, post replies, update + thread status via the platform's API). Auto-detects platform + from the PR URL. Detects contradictory feedback across reviewers. persona: systems-engineer protocols: [anti-hallucination, self-verification, operational-constraints] format: pr-comment-responses diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index 621728d..b50deac 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -4,10 +4,11 @@ --- name: respond-to-pr-comments description: > - Process pull request review feedback and generate per-thread responses. - Supports document mode (structured response plan) or action mode - (make code fixes, post replies, and resolve threads via GitHub API). - Detects contradictory feedback across reviewers. + Process pull request review feedback and generate per-thread responses + on GitHub or Azure DevOps Services. Supports document mode (structured + response plan) or action mode (make code fixes, post replies, and + update thread status via the platform's API). Detects contradictory + feedback across reviewers. persona: systems-engineer protocols: - guardrails/anti-hallucination @@ -15,25 +16,28 @@ protocols: - guardrails/operational-constraints format: pr-comment-responses params: - pr_reference: "Pull request to respond to — URL or number (e.g., #42)" + pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or PR id (e.g., !123 for ADO). Platform is auto-detected from the URL." review_threads: "Review feedback to address — 'all pending', specific thread URLs, or pasted comments" codebase_context: "What this code does, relevant architecture, design decisions that inform responses" response_mode: "How to respond per-thread — 'auto' (heuristic), 'fix' (code changes), or 'explain' (rationale)" - output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via gh CLI)" + output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via the platform's CLI/API)" input_contract: null output_contract: type: pr-comment-responses description: > A structured per-thread response plan with code fixes and/or explanations. In action mode, responses are executed as code - changes, reply comments, and thread resolutions. + changes, reply comments, and thread status updates. --- # Task: Respond to PR Review Comments You are tasked with processing review feedback on a pull request and generating responses for each review thread — either code fixes, -explanatory replies, or both. +explanatory replies, or both. The pull request may live on **GitHub** +or **Azure DevOps Services**; the workflow is the same, but the API +calls differ. Use the platform's **native status vocabulary** in all +output — do NOT translate statuses across platforms. ## Inputs @@ -49,7 +53,69 @@ explanatory replies, or both. ## Instructions -### Phase 1: Gather Review Threads +### Phase 1: Detect the Platform + +Determine whether `pr_reference` points to a GitHub or Azure DevOps +Services pull request. Use this resolution order: + +1. **Parse the URL**: + - `github.com///pull/` → **GitHub** + - `dev.azure.com///_git//pullrequest/` → + **Azure DevOps Services** + - `.visualstudio.com//_git//pullrequest/` → + **Azure DevOps Services** (legacy host) + - URL-decode `` and `` segments before using them. + +2. **If `pr_reference` is not a URL** (e.g., bare `#42` or `!123`), + inspect `git remote -v` in the current working directory: + - Match HTTPS or SSH remotes: + - GitHub: `github.com`, `git@github.com` + - ADO: `dev.azure.com`, `git@ssh.dev.azure.com:v3/...`, + `.visualstudio.com`, `@vs-ssh.visualstudio.com:v3/...` + - If multiple remotes exist, prefer the upstream of the current branch. + +3. **If still ambiguous**, ask the user explicitly: + "Is this PR on GitHub or Azure DevOps Services?" Do NOT guess. + +4. **Out of scope**: Azure DevOps Server (on-prem) and TFS custom + hostnames are NOT supported by this template's auth path. If you + detect such a host, stop and inform the user that they must run + this template against a supported platform or provide their own + working API auth context. + +Record the detected platform; the rest of the phases branch on it. + +5. **Resolve connection coordinates** before any API call. The set + of values needed depends on the platform: + - **GitHub**: `owner`, `repo`, `pr_number`. + - **ADO**: `org`, `project`, `repoName`, `prId` (and later `repoId`, + resolved via the API in Phase 2). + + Source the values as follows: + - **If `pr_reference` is a URL**, parse them from the URL path. + URL-decode each segment for display, comparison, and JSON + payloads. When constructing REST URIs, **URL-encode each path + segment** (or preserve the already-encoded segments from the + original URL) so values containing spaces or other reserved + characters do not produce malformed requests. + - **If `pr_reference` is a bare id** (`#42`, `!123`), derive + `owner`/`repo` (GitHub) or `org`/`project`/`repoName` (ADO) from + the selected upstream remote. Recognise these forms: + - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` + - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` + - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` + - ADO SSH: `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}` + - ADO legacy HTTPS: `https://{org}.visualstudio.com/{project}/_git/{repo}` + - ADO legacy SSH: `{org}@vs-ssh.visualstudio.com:v3/{org}/{project}/{repo}` + The bare id provides `pr_number` / `prId`. + - If any required field cannot be determined unambiguously, prompt + the user. Do NOT invent values. + +### Phase 2: Gather Review Threads + +Fetch all review threads from the platform. + +#### GitHub branch 1. **Read all review threads** on the PR: - Use `gh pr view {{pr_reference}} --comments` for a quick overview, but use @@ -72,6 +138,15 @@ explanatory replies, or both. - Use a GraphQL query via `gh api graphql` that includes each thread's ID, state, path, and line metadata, plus each comment's database ID, author, and body + - **Paginate exhaustively.** Both `reviewThreads` and the inner + `comments` connection paginate independently. Page outer + `reviewThreads(first: 100, after: $cursor)` until + `pageInfo.hasNextPage` is `false`. Then, for any thread whose + inner `comments.pageInfo.hasNextPage` is `true`, issue a + follow-up query keyed by the thread `id` with + `comments(first: 100, after: $commentCursor)` and repeat until + exhausted. PRs with many reviewers easily exceed 100 comments + on a single thread. - Preserve these IDs in your working notes so later action steps can post replies and resolve the correct threads @@ -87,10 +162,141 @@ explanatory replies, or both. - Understand the surrounding context (function, class, module) - Check if the code has changed since the review comment was posted -### Phase 2: Detect Contradictions - -Compare feedback across different reviewers on the same code area -or design decision: +#### Azure DevOps branch + +**Authentication.** All ADO API calls in this template use `az rest` +with `--resource 499b84ac-1321-427f-aa17-267ca6975798` (the Azure +DevOps resource ID). The user must have run `az login` once. Do NOT +omit `--resource` — without it, `az rest` may attach a token for the +wrong audience and the call will fail authorization. Do NOT instruct +the user to mint a Personal Access Token. + +1. **Resolve the repository GUID and PR id.** Parse the PR URL: + - `org`, `project`, `repoName`, `prId` from the URL path + (URL-decode `project` and `repoName`). + - Look up the repository GUID — needed by some thread endpoints: + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoName}?api-version=7.1" + ``` + + Record the returned `id` as `repoId`. + - For fork PRs, ADO targets the **target repository** for thread + operations; `repoId` from the URL above is correct. + +2. **List all PR threads.** This endpoint returns the full collection + in a single response — there is **no `$top`/`$skip` pagination** + documented for this endpoint, and comments are embedded inline + in each thread: + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` + + Read `value[]` from the response. For each thread, record: + - `id`: the thread id (integer; required for status updates and + posting replies) + - `status`: one of `active`, `pending`, `fixed`, `wontFix`, + `closed`, `byDesign`, `unknown` (lowercase API enum values — + ADO uses `fixed`, NOT `resolved`) + - `threadContext`: file path (`filePath`), line range + (`rightFileStart` / `rightFileEnd`), and side. May be `null` + for PR-wide threads or system threads. + - `pullRequestThreadContext.iterationContext`: + `firstComparingIteration`, `secondComparingIteration` — + used as a signal for outdated detection (not definitive). + - `properties` and `comments[*].commentType` — used to identify + system threads (see step 4). + - For each comment in `comments[]`: `id` (the + `parentCommentId` for posting a reply), author display name + and unique name, content, and `commentType`. + +3. **Get the latest iteration** (used for outdated detection): + + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method GET ` + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + ``` + + Record the highest `id` as `latestIteration`. + +4. **Filter out system threads.** Skip threads where any of these + are true (these are auto-generated by ADO, not human review): + - All comments have `commentType: "system"`. + - `properties` contains a `CodeReviewThreadType` of + `MergeAttempt`, `VoteUpdate`, `ReviewersUpdate`, `RefUpdate`, + or `StatusUpdate`. + + Count skipped system threads for the summary; do NOT process them. + Threads that do not match the rules above but contain no `text` + comments (only attachments, reactions, or non-text content) + should NOT be auto-skipped — flag them and ask the user. + +5. **Filter remaining threads** based on `review_threads` parameter: + - If `all pending` — include threads with status `active` (the + default state for new comments needing response). + - **Pending threads (status `pending`)**: flag and ask the user + whether to address now — the comment author marked them as + awaiting something. + - Skip `fixed`, `wontFix`, `closed`, `byDesign`, `unknown` + unless the user explicitly requests them. + - If specific threads are listed, include only those. + +6. **Detect potentially outdated threads.** ADO has no native + "outdated" status; use the conservative algorithm below and only + flag as **potentially outdated** (do not assert). + + **Skip this detection entirely for PR-wide threads** — if + `threadContext` is null, the thread has no file or lines to + verify. Pass it through unchanged to step 7. + + For file-anchored threads only, decide the source of truth for + "current file contents": + - **Preferred**: query the latest iteration's changes via + `GET .../pullRequests/{prId}/iterations/{latestIteration}/changes?api-version=7.1` + and the file content via + `GET .../items?path={filePath}&versionDescriptor.version={sourceBranch}&versionDescriptor.versionType=branch&api-version=7.1`. + - **Fallback**: the local working tree, **only** if you can + confirm `HEAD` matches the PR source branch tip at + `latestIteration` (e.g., compare commit SHA from the iterations + response against `git rev-parse HEAD`). Otherwise do NOT use + local files. + - **If you cannot verify either**, mark the thread's outdated + status as **unknown / not verified** and ask the user — do NOT + guess. + + With a verified source of truth, flag as potentially outdated + when any holds: + - The thread's `threadContext.filePath` no longer exists in the + latest iteration (deleted or renamed file). + - The thread's line range + (`threadContext.rightFileStart` / `rightFileEnd`) is outside + the current file's line count. + - `pullRequestThreadContext.iterationContext.secondComparingIteration` + is older than `latestIteration` AND the file/lines have changed + since that iteration. + + For each potentially outdated thread, ask the user whether to + address it. + +7. **Identify PR-wide (general) threads.** Threads with + `threadContext: null` that survived the system-thread filter are + genuine PR-wide discussion threads (added on the **Overview** tab + in the ADO UI). Group these separately from file-anchored threads + in the report — do NOT skip them. + +8. **Read the current code** at each file-anchored thread's location + (same as the GitHub branch, step 3). + +### Phase 3: Detect Contradictions + +This phase is platform-agnostic. Compare feedback across different +reviewers on the same code area or design decision: 1. **Group threads by location**: threads on the same file within 10 lines of each other, or threads referencing the same function @@ -106,9 +312,10 @@ or design decision: 3. **Report contradictions** with both positions and a recommended resolution. Do NOT silently pick one side — flag for the user. -### Phase 3: Analyze Each Thread +### Phase 4: Analyze Each Thread -For each actionable thread, determine the response type: +This phase is platform-agnostic. For each actionable thread, determine +the response type: 1. **If `response_mode` is `auto`**, apply these heuristics: @@ -140,19 +347,25 @@ For each thread, produce: that explains the design decision, tradeoff, or rationale. Keep it professional, concise, and technical. -### Phase 4: Output +### Phase 5: Output + +Use the **source platform's native status vocabulary** in all output. +Do NOT translate ADO statuses into GitHub terms or vice versa. #### Document Mode (`output_mode: document`) Produce the output following the `pr-comment-responses` format: -1. Thread Summary (by state) +1. Thread Summary (by state; use the platform's status table) 2. Contradiction Report -3. Per-Thread Responses (in file order) +3. Per-Thread Responses (in file order; PR-wide threads grouped at + the end on ADO) 4. Action Summary #### Action Mode (`output_mode: action`) -Execute responses with **mandatory user confirmation at every step**: +Execute responses with **mandatory user confirmation at every step**. +The orchestration is identical across platforms; only the API calls +differ. 1. **Present the full analysis** (thread summary, contradictions, per-thread responses) to the user using the document structure. @@ -169,71 +382,149 @@ Execute responses with **mandatory user confirmation at every step**: c. If confirmed, commit with a descriptive message referencing the review threads addressed, then push. -4. **For each thread with an explanation**: +4. **For each thread with an explanation, post the reply.** + For each thread: a. Show the draft reply to the user. b. Ask: "Post this reply? (yes / skip / edit)" - c. If confirmed, write the reply payload to `reply.json` and post: - ```json - { - "body": "", - "in_reply_to": - } - ``` - ``` - gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ - --method POST \ - --input reply.json - ``` - -5. **For threads that were fixed**: - a. Ask: "Resolve these threads? (yes / no)" - b. If confirmed, resolve each thread using: - ``` - gh api graphql \ - -f query='mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - }' \ - -F threadId="" - ``` + c. If confirmed, post using the platform-appropriate API: + + **GitHub** — write the reply payload to `reply.json` and POST: + ```json + { + "body": "", + "in_reply_to": + } + ``` + ``` + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ + --method POST \ + --input reply.json + ``` + + **Azure DevOps Services** — POST inline (no temp file needed): + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method POST ` + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" ` + --headers "Content-Type=application/json" ` + --body '{ "content": "", "parentCommentId": , "commentType": "text" }' + ``` + + ADO replies MUST always include `parentCommentId` — choose the + specific comment being answered. For PR-wide (general) threads, + reply to the latest text comment in the thread (or the first + comment if the thread has only one). Do NOT omit `parentCommentId` + to post an unparented top-level comment — it changes the semantics + from "reply" to "new top-level remark" and breaks the API + contract used elsewhere in this template. + +5. **For threads that were addressed, update thread status.** + Map the user's intent to the appropriate platform-native status: + + | User intent after addressing a thread | GitHub action | ADO action | + |---|---|---| + | Code fix applied, issue addressed | resolve | set status to `fixed` | + | Explanation posted, user wants discussion closed | resolve | set status to `closed` | + | Explanation posted, leave open for reviewer reply | (no change) | leave `active` | + | Concern noted, team chooses not to act | (no change; reply explains) | set status to `wontFix` | + | Reviewer's concern is intentional design | (no change; reply explains) | set status to `byDesign` | + + For each affected thread, ask the user to confirm the intended + transition before executing. + + **GitHub** — resolve the thread: + ``` + gh api graphql \ + -f query='mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { + thread { isResolved } + } + }' \ + -F threadId="" + ``` + + **Azure DevOps Services** — PATCH the thread status (use the + exact lowercase enum value: `active`, `pending`, `fixed`, + `wontFix`, `closed`, `byDesign` — note ADO uses `fixed`, NOT + GitHub's `resolved`): + ```powershell + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` + --method PATCH ` + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" ` + --headers "Content-Type=application/json" ` + --body '{ "status": "fixed" }' + ``` 6. **Never take any action without explicit user confirmation.** If the user skips all items, produce a document-mode report instead. -### Phase 5: Handle Edge Cases +### Phase 6: Handle Edge Cases -- **No pending threads**: Report "No actionable review threads found" - and list any resolved/outdated threads for reference. +- **No actionable threads**: Report "No actionable review threads found" + and list any skipped threads (in the platform's native vocabulary — + GitHub `resolved`, ADO `fixed` / `closed` / `wontFix` / `byDesign`; + outdated / potentially outdated; ADO system threads) with counts + for reference. - **Large thread count (>20)**: Process in batches of 10. After each batch, summarize progress and ask to continue. -- **Outdated threads**: Flag these separately. Ask the user whether - to address them — the code may have already changed to address - the feedback. +- **Outdated / potentially outdated threads**: Flag these separately. + Ask the user whether to address them — the code may have already + changed to address the feedback. - **Threads on deleted files**: Skip with a note explaining the file - no longer exists. + no longer exists. On ADO, this is also a signal that the thread is + potentially outdated. +- **ADO system threads**: Always exclude from analysis; report only + the count in the summary. +- **ADO PR-wide threads**: Process them, but group them in a separate + "PR-wide threads" section in the report — they have no file/line + context. +- **Unsupported ADO host** (Server / on-prem / TFS): Stop with a + clear message; do not attempt to call APIs against an unsupported + endpoint. +- **`az rest` 401/403**: Most often caused by missing `--resource`, + expired `az login`, or insufficient permissions on the project. + Tell the user which of these to check; do NOT fall back to + recommending a PAT. ## Non-Goals - Do NOT perform a new code review — focus only on addressing existing feedback. - Do NOT modify code beyond what is needed to address review comments. -- Do NOT resolve threads without user confirmation. +- Do NOT update thread status without user confirmation. - Do NOT dismiss or ignore valid feedback — if a reviewer is correct, acknowledge it and fix it. - Do NOT take sides in contradictions — present both positions and let the user decide. - Do NOT push commits without explicit user confirmation. +- Do NOT translate or normalize statuses across platforms — use each + platform's native vocabulary. +- Do NOT instruct the user to create a Personal Access Token for ADO; + rely on `az login` + `az rest --resource`. +- Do NOT attempt to support Azure DevOps Server (on-prem) or TFS + with this template's auth path. ## Quality Checklist Before finalizing, verify: -- [ ] Every pending thread has a response (fix, explanation, or both) +- [ ] Platform was detected (or the user was asked) before any API call +- [ ] Every actionable thread has a response (fix, explanation, or both) - [ ] Contradictions across reviewers are explicitly flagged - [ ] Code fixes show before/after with sufficient context - [ ] Draft replies are professional, concise, and technical -- [ ] Resolved and outdated threads are accounted for (skipped with reason) +- [ ] All non-actionable threads (GitHub `resolved`; ADO `fixed` / + `closed` / `wontFix` / `byDesign`; outdated / potentially + outdated; ADO system threads) are accounted + for with platform-native status names and counts - [ ] In action mode: user confirmation obtained before every mutation -- [ ] Thread states (pending/resolved/outdated) are accurately reported +- [ ] Thread states are reported using the source platform's native + vocabulary (no cross-platform normalization) +- [ ] On ADO: every `az rest` invocation includes + `--resource 499b84ac-1321-427f-aa17-267ca6975798` +- [ ] On ADO: status PATCH payloads use the exact lowercase enum + values (`active`, `pending`, `fixed`, `wontFix`, `closed`, + `byDesign`) +- [ ] On ADO: reply POST bodies use `content` + `parentCommentId` + + `commentType: "text"` (NOT GitHub's `body` / `in_reply_to`) - [ ] Files modified by fixes are listed in the action summary From 41243592af6cecb90a22ef8fdeb083ce0513f4db Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 13:06:42 -0600 Subject: [PATCH 02/12] fix(respond-to-pr-comments): address PR #253 review feedback - ADO reply POST switched to temp-file pattern (--body @reply.json) to handle apostrophes/newlines/backslashes in real reply text; mirrors the GitHub recipe. - pr_reference param doc clarified: URL auto-detect with git-remote fallback and ambiguity prompt (covers #42 / !123 inputs). - All shell command fences in SKILL labeled bash for clarity. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 28 ++++++++++----- templates/respond-to-pr-comments.md | 36 ++++++++++++------- 2 files changed, 42 insertions(+), 22 deletions(-) diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index bf4d531..702ffd0 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -171,18 +171,18 @@ query($threadId: ID!, $commentCursor: String) { Run `az login` once. Then: 1. Resolve `repoId`: - ``` + ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoName}?api-version=7.1" ``` 2. List threads (returns full `value[]` in one response — no `$top`/`$skip` pagination on this endpoint; comments are embedded): - ``` + ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" ``` 3. Get latest iteration (for outdated detection): - ``` + ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" ``` @@ -311,7 +311,7 @@ Execute with **mandatory user confirmation at every step**. - Post if confirmed: **GitHub:** - ``` + ```bash cat > reply.json <<'EOF' { "body": "", "in_reply_to": } EOF @@ -325,11 +325,19 @@ Execute with **mandatory user confirmation at every step**. comment (or the first comment if the thread has only one). Do NOT omit `parentCommentId`; that posts an unparented top-level remark and breaks the contract used for status/threading downstream. - ``` + + Always write the reply body to a temp file and pass `--body @file` + — never inline as `--body '...'`. Real reply text contains + apostrophes, newlines, and backslashes that break shell quoting in + both bash and PowerShell. + ```bash + cat > reply.json <<'EOF' + { "content": "", "parentCommentId": , "commentType": "text" } + EOF az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method POST \ --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ --headers "Content-Type=application/json" \ - --body '{ "content": "", "parentCommentId": , "commentType": "text" }' + --body @reply.json ``` 4. **Update thread status** — always confirm each transition with @@ -344,14 +352,16 @@ Execute with **mandatory user confirmation at every step**. | Intentional design | (no change) | `byDesign` | **GitHub** — resolve: - ``` + ```bash gh api graphql -f query='mutation($threadId: ID!) { resolveReviewThread(input: {threadId: $threadId}) { thread { isResolved } } }' -F threadId="" ``` - **ADO** — PATCH with exact lowercase enum: - ``` + **ADO** — PATCH with exact lowercase enum (the body is a fixed + small JSON literal with no user content, so inlining `--body '...'` + is safe here): + ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method PATCH \ --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ --headers "Content-Type=application/json" \ diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index b50deac..7011ce8 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -16,7 +16,7 @@ protocols: - guardrails/operational-constraints format: pr-comment-responses params: - pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or PR id (e.g., !123 for ADO). Platform is auto-detected from the URL." + pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or PR id (e.g., !123 for ADO). Platform is auto-detected from the URL when given; otherwise inferred from `git remote -v` of the current repo. If both signals are absent or ambiguous, the workflow prompts you to pick rather than guessing." review_threads: "Review feedback to address — 'all pending', specific thread URLs, or pasted comments" codebase_context: "What this code does, relevant architecture, design decisions that inform responses" response_mode: "How to respond per-thread — 'auto' (heuristic), 'fix' (code changes), or 'explain' (rationale)" @@ -401,13 +401,23 @@ differ. --input reply.json ``` - **Azure DevOps Services** — POST inline (no temp file needed): - ```powershell - az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` - --method POST ` - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" ` - --headers "Content-Type=application/json" ` - --body '{ "content": "", "parentCommentId": , "commentType": "text" }' + **Azure DevOps Services** — write the reply payload to `reply.json` + and POST. Always use a temp file (not an inlined `--body '...'` + string): real reply text contains apostrophes, newlines, and + backslashes that break shell quoting in both bash and PowerShell. + ```json + { + "content": "", + "parentCommentId": , + "commentType": "text" + } + ``` + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ + --method POST \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --headers "Content-Type=application/json" \ + --body @reply.json ``` ADO replies MUST always include `parentCommentId` — choose the @@ -447,11 +457,11 @@ differ. exact lowercase enum value: `active`, `pending`, `fixed`, `wontFix`, `closed`, `byDesign` — note ADO uses `fixed`, NOT GitHub's `resolved`): - ```powershell - az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` - --method PATCH ` - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" ` - --headers "Content-Type=application/json" ` + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ + --method PATCH \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --headers "Content-Type=application/json" \ --body '{ "status": "fixed" }' ``` From 24c9ac70a12f2a3a5df7620e5d319324d4e5c186 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 13:15:31 -0600 Subject: [PATCH 03/12] feat(human-voice-fidelity): add opt-in voice-preservation guardrail protocol MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a new opt-in guardrail protocol that preserves the user's communication style when an agent drafts externally visible text on the user's behalf. Scoped narrowly to user-authored prose; analysis, code, command output, and quoted reviewer text are exempt. Protocol features: - Pluggable voice sources (session samples, prior repo PRs, session history, org tools, explicit style notes), in priority order. - Calibrated style extraction (sentence length, hedging, technical density, openers/closers, characteristic phrases, punctuation). - Per-output self-check that bans em-dashes and a list of AI-tell phrases unless they appear in the user's own samples. - Neutral collaborative default + explicit disclosure when no voice samples are available; never claims voice match without evidence. - Output annotation requiring a Voice Calibration note. Integration: - protocols/guardrails/human-voice-fidelity.md (new, 163 lines) - manifest.yaml: register protocol under guardrails. - respond-to-pr-comments template: add protocol to frontmatter and manifest protocols list; reference the protocol when drafting reviewer replies. Out of scope (deferred): delegation matrix, work-item proposal phase, expanded format schema, and SKILL rewrite from the prior exploration branch — those will land in separate PRs if pursued. Validation: python tests/validate-manifest.py passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- manifest.yaml | 12 +- protocols/guardrails/human-voice-fidelity.md | 163 +++++++++++++++++++ templates/respond-to-pr-comments.md | 9 +- 3 files changed, 181 insertions(+), 3 deletions(-) create mode 100644 protocols/guardrails/human-voice-fidelity.md diff --git a/manifest.yaml b/manifest.yaml index 819a44b..5c3d38e 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -173,6 +173,16 @@ protocols: pre-edit snapshots, one logical change per tool call, and escalation on repeated failures. + - name: human-voice-fidelity + path: protocols/guardrails/human-voice-fidelity.md + description: > + Preserve the user's communication style when drafting externally + visible text on their behalf (PR replies, issue comments, emails). + Pluggable voice sources, calibrated style extraction, and a + per-output self-check that bans em-dashes and AI-tell phrases in + drafted replies. Scoped to user-authored text only; does not + affect analysis, code, or quoted content. + analysis: - name: memory-safety-c path: protocols/analysis/memory-safety-c.md @@ -1451,7 +1461,7 @@ templates: thread status via the platform's API). Auto-detects platform from the PR URL. Detects contradictory feedback across reviewers. persona: systems-engineer - protocols: [anti-hallucination, self-verification, operational-constraints] + protocols: [anti-hallucination, self-verification, operational-constraints, human-voice-fidelity] format: pr-comment-responses - name: reconstruct-behavior diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md new file mode 100644 index 0000000..94d147b --- /dev/null +++ b/protocols/guardrails/human-voice-fidelity.md @@ -0,0 +1,163 @@ + + + +--- +name: human-voice-fidelity +type: guardrail +description: > + When drafting externally visible text on behalf of a specific human user + (PR replies, issue comments, emails, chat messages), preserve the user's + observed communication style and avoid stylistic patterns that mark text + as machine-generated. +applicable_to: + - respond-to-pr-comments +--- + +# Protocol: Human Voice Fidelity + +This protocol applies whenever the output includes text that will be +**posted externally under the user's identity** (e.g., a reply to a PR +review comment, an email draft, a chat message). It does NOT apply to +internal analysis, code, command output, or quoted material from third +parties. + +The protocol is opt-in: templates that produce both internal analysis +and externally visible text on the user's behalf should declare this +protocol in their `protocols:` frontmatter. + +## Phases + +### Phase 1: Source Voice Samples + +Voice sources are **pluggable**. Use whichever are available, in +priority order: + +1. **Explicit samples pasted by the user** in the current session. +2. **Prior text the user has authored in this repository** — recent PR + review replies, issue comments, or commit messages by the user. Use + `gh pr list --author @me --state all --limit 20`, then sample + `gh pr view --comments` and `gh api` for review-thread bodies. +3. **Prior Copilot CLI session history**, when the agent has access to + `session_store_sql` or equivalent. Query recent `events` where + `type = 'user.message'` for natural-language samples. +4. **Organization-specific communication tools**, where available and + permitted. Examples (non-exhaustive, environment-dependent): + - Microsoft 365 Copilot / WorkIQ MCP server (`workiq-ask_work_iq`) + for users in Microsoft tenants + - Slack / Teams export tools where the user has provided access + - Any user-provided dump of personal correspondence +5. **Explicit style notes** the user has written down (e.g., a + `STYLE.md` or `.github/copilot-instructions.md` voice section). + +If **no voice sources are available**: + +- Ask the user for **2-3 sample replies** they would consider + representative of their voice. +- If the user declines or cannot provide samples, fall back to a + **neutral collaborative default** (see Phase 2) and explicitly + disclose in the output that no voice samples were available. +- **Never claim** to have matched the user's voice without evidence. + +### Phase 2: Calibrate Observed Style + +From the gathered samples, extract: + +- **Sentence length distribution** — short and clipped, medium, or + long and explanatory. +- **Hedging frequency** — does the user say "I think", "maybe", + "it looks like" often, or are they direct? +- **Technical density** — heavy jargon, mixed, or plain language. +- **Openers and closers** — do they greet ("hey", "hi @user"), open + cold, or sign off ("thanks", initials, nothing)? +- **Characteristic phrases** — recurring expressions, idioms, or + domain shorthand. +- **Punctuation habits** — em-dash usage (likely none for most + humans), comma density, parenthetical asides. + +If samples are inconsistent (e.g., terse in code review, verbose in +docs), pick the subset most relevant to the current output channel +(PR replies → prior PR replies, not docs). + +**Neutral collaborative default** (when no samples): short to medium +sentences, no greeting/sign-off, plain language with technical terms +where they belong, no hedging filler, no exclamation points. + +### Phase 3: Draft in the Calibrated Voice + +When drafting each piece of externally visible text: + +- Match observed sentence length, hedging, and technical density. +- Use observed openers and closers (or none if the user uses none). +- Reuse characteristic phrases where they fit naturally; do NOT + force them. +- Keep the substance accurate — voice fidelity does not override + technical correctness or anti-hallucination constraints. + +### Phase 4: Self-Check Pass + +Run this checklist on **every drafted piece of externally visible +text** before presenting it. The check applies to the drafted reply +only, not to surrounding analysis, code blocks, or quoted material. + +**Hard rules — failure means rewrite:** + +- [ ] No em-dash (U+2014 `—`) anywhere in the drafted text. +- [ ] No en-dash (U+2013 `–`) used as a punctuation separator + (en-dash in numeric ranges like `2020–2024` is allowed). +- [ ] None of these AI-tell phrases appear unless they appear in the + user's own samples: + - "great catch" + - "great question" + - "you're absolutely right" + - "you raise a great point" + - "I appreciate the feedback" + - "I hope this helps" + - "Let me know if you have any other questions" + - "Certainly!" + - "Absolutely!" + - Opening with "I'd be happy to..." +- [ ] No exhaustive bulleted list when 1-2 sentences of prose would + read more naturally. +- [ ] No unsupported technical claims — every factual statement is + backed by code, tests, docs, or reviewer text already in + context. +- [ ] No hedge stack ("I think it might possibly be the case that...") + unless the user's samples show this pattern. +- [ ] No reflexive apology ("Sorry for the confusion!") unless the + user uses this in their samples. + +**Soft rules — flag for user review if violated:** + +- Reply length is within ~1.5x the observed user average for the + channel. +- Greeting/sign-off matches user habit. +- Opening word/phrase is one the user has used before, or a neutral + alternative. + +### Phase 5: Scope Boundary + +This protocol applies **only** to newly drafted text the user will +post under their identity. It does NOT modify, rewrite, or "clean up": + +- Reviewer comments quoted in analysis sections +- Code snippets, diffs, or file paths +- Command output, logs, or error messages +- Internal analysis, validity assessments, or technical reasoning + that the agent presents to the user but does NOT post externally +- Format scaffolding (headings, tables, checklists) defined by the + output format + +If a drafted reply must include quoted reviewer text or a code +snippet, the quoted/code portion is exempt from the em-dash and +phrase rules; only the surrounding user-authored prose is checked. + +## Output Annotation + +When this protocol is applied, the output should include a brief +**Voice Calibration** note (one or two lines) stating: + +- Which voice sources were used (e.g., "5 prior PR replies in repo, + 3 user-provided samples"), or +- That no samples were available and a neutral default was used. + +This note is internal-facing and is not posted externally. diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index 7011ce8..f952d0a 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -14,6 +14,7 @@ protocols: - guardrails/anti-hallucination - guardrails/self-verification - guardrails/operational-constraints + - guardrails/human-voice-fidelity format: pr-comment-responses params: pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or PR id (e.g., !123 for ADO). Platform is auto-detected from the URL when given; otherwise inferred from `git remote -v` of the current repo. If both signals are absent or ambiguous, the workflow prompts you to pick rather than guessing." @@ -344,8 +345,12 @@ For each thread, produce: - **Fix** (if applicable): The specific code change, shown as before/after with at least 3 lines of surrounding context. - **Explanation** (if applicable): A draft reply to the reviewer - that explains the design decision, tradeoff, or rationale. Keep - it professional, concise, and technical. + that explains the design decision, tradeoff, or rationale. Apply + the **human-voice-fidelity** protocol to this drafted text — match + the user's observed style and run the self-check pass (no em-dash, + no AI-tell phrases) before presenting. The voice protocol scopes + to the drafted reply only; surrounding analysis, code, and quoted + reviewer text are exempt. ### Phase 5: Output From 57a37544b75f9027ba1f5c8cc8fab89ad529758a Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 14:22:25 -0600 Subject: [PATCH 04/12] feat(human-voice-fidelity): apply to review-pull-request Adds human-voice-fidelity to the review-pull-request template, the second template that drafts text posted externally under the user's identity. Action mode of review-pull-request POSTs inline review comments and an overall review summary to GitHub via the Reviews API; both bodies are user-voice prose where AI tells (em-dashes, AI-tell phrases) would betray non-human authorship. Changes: - protocols/guardrails/human-voice-fidelity.md: add review-pull-request to applicable_to. - manifest.yaml: add human-voice-fidelity to review-pull-request protocols list. - templates/review-pull-request.md: add to frontmatter; reference the protocol in Phase 5 action-mode step where comment bodies are drafted, with the same scope note (drafted prose only; code/paths /quoted text exempt). Validation: python tests/validate-manifest.py passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- manifest.yaml | 2 +- protocols/guardrails/human-voice-fidelity.md | 1 + templates/review-pull-request.md | 8 ++++++++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/manifest.yaml b/manifest.yaml index 5c3d38e..264c755 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -1449,7 +1449,7 @@ templates: (post inline review comments via GitHub API). Language-agnostic by default with optional language-specific focus. persona: systems-engineer - protocols: [anti-hallucination, self-verification, operational-constraints] + protocols: [anti-hallucination, self-verification, operational-constraints, human-voice-fidelity] format: investigation-report - name: respond-to-pr-comments diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md index 94d147b..ba50758 100644 --- a/protocols/guardrails/human-voice-fidelity.md +++ b/protocols/guardrails/human-voice-fidelity.md @@ -11,6 +11,7 @@ description: > as machine-generated. applicable_to: - respond-to-pr-comments + - review-pull-request --- # Protocol: Human Voice Fidelity diff --git a/templates/review-pull-request.md b/templates/review-pull-request.md index a8f993d..10bda1c 100644 --- a/templates/review-pull-request.md +++ b/templates/review-pull-request.md @@ -13,6 +13,7 @@ protocols: - guardrails/anti-hallucination - guardrails/self-verification - guardrails/operational-constraints + - guardrails/human-voice-fidelity format: investigation-report params: pr_reference: "Pull request to review — URL, number (e.g., #42), or pasted diff" @@ -174,6 +175,13 @@ PR review concepts to report sections: 2. **Ask the user to confirm** which findings to post as review comments. Present each finding (or batch by file) and ask: - Post this comment? (yes / skip / edit) + + Apply the **human-voice-fidelity** protocol when drafting the + inline comment `body` and the overall review summary `body` — these + are posted to GitHub under the user's identity. Run the self-check + pass (no em-dash, no AI-tell phrases) on each drafted comment + before presenting. The protocol scopes to drafted prose only; + code references, file paths, and quoted reviewer text are exempt. 3. **Post confirmed findings** as inline review comments using a JSON payload file so `comments` is sent as an array, not a string: ``` From 08c1152a490092a2701f51a0f6c6958cc53ace8b Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 14:37:02 -0600 Subject: [PATCH 05/12] Generalize Phase 1 voice-sample sourcing to multiple SCMs Replace the GitHub-only gh recipe in Phase 1 item 2 with a per-platform list (GitHub, Azure DevOps Services, GitLab, Bitbucket Cloud, Gitea/Forgejo) plus a fallback bullet for unspecified SCMs. The underlying intent is unchanged: sample 5-20 recent self-authored comment bodies, preferring inline review-comment bodies over commit messages. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- protocols/guardrails/human-voice-fidelity.md | 48 ++++++++++++++++++-- 1 file changed, 43 insertions(+), 5 deletions(-) diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md index ba50758..9f9e806 100644 --- a/protocols/guardrails/human-voice-fidelity.md +++ b/protocols/guardrails/human-voice-fidelity.md @@ -31,13 +31,51 @@ protocol in their `protocols:` frontmatter. ### Phase 1: Source Voice Samples Voice sources are **pluggable**. Use whichever are available, in -priority order: +priority order. The recipes below are concrete examples per platform; +the underlying intent is the same — sample the user's recent authored +prose from whichever SCM hosts the project. 1. **Explicit samples pasted by the user** in the current session. -2. **Prior text the user has authored in this repository** — recent PR - review replies, issue comments, or commit messages by the user. Use - `gh pr list --author @me --state all --limit 20`, then sample - `gh pr view --comments` and `gh api` for review-thread bodies. +2. **Prior text the user has authored in this repository or project** — + recent PR/MR review replies, issue/work-item comments, or commit + messages by the user. Use whichever interface the host SCM provides: + + - **GitHub** (`github.com`): `gh pr list --author @me --state all + --limit 20`, then `gh pr view --comments` and + `gh api repos/{owner}/{repo}/pulls/{n}/comments` for inline + review-thread bodies. + - **Azure DevOps Services** (`dev.azure.com`, + `*.visualstudio.com`): query the Pull Requests and Threads REST + APIs filtered by the user's identity. After `az login`: + ```bash + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullrequests?searchCriteria.creatorId={userId}&api-version=7.1" + az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ + --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + ``` + Filter the resulting `comments[*]` to those whose + `author.uniqueName` matches the user. + - **GitLab**: `glab mr list --author=@me`, then `glab mr view + --comments`. Or hit the REST API: + `GET /projects/:id/merge_requests?author_username=` and + `GET /projects/:id/merge_requests/:mr/notes`. + - **Bitbucket Cloud**: REST API + `GET /2.0/repositories/{ws}/{repo}/pullrequests?q=author.uuid="{uuid}"`, + then `GET .../pullrequests/{id}/comments`. + - **Gitea / Forgejo**: REST API + `GET /repos/{owner}/{repo}/pulls?state=all&poster={user}`, then + `GET .../issues/{n}/comments`. + - **Other / unspecified SCM**: detect the host from `git remote -v` + (or ask the user). If the platform exposes an authenticated API + or CLI for listing the user's own PRs/MRs and their inline + comments, use it; the goal is 5–20 recent comment bodies authored + by the user. If no such interface is available or accessible, + skip this source and continue with the remaining sources in + priority order. + + In all cases, prefer **inline review-comment bodies** over commit + messages — they are closer to the conversational register of the + text being drafted. Use commit messages only as a fallback. 3. **Prior Copilot CLI session history**, when the agent has access to `session_store_sql` or equivalent. Query recent `events` where `type = 'user.message'` for natural-language samples. From 2ec935504fb96eca679e30e5067f75d5db58bade Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 14:40:03 -0600 Subject: [PATCH 06/12] Generalize Phase 1 agent-history and style-notes sources to be LLM-agnostic Item 3 was Copilot-CLI-specific; reframe as 'prior agent session history' with examples for GitHub Copilot CLI (session_store_sql), Claude Code (~/.claude/projects JSONL transcripts), and Cursor/Windsurf/IDE-embedded agents, plus a fallback for other agents. Item 5 was copilot-instructions.md-only; expand the example list to also include CLAUDE.md, AGENTS.md, .cursorrules, and .windsurfrules. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- protocols/guardrails/human-voice-fidelity.md | 25 ++++++++++++++++---- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md index 9f9e806..40302c4 100644 --- a/protocols/guardrails/human-voice-fidelity.md +++ b/protocols/guardrails/human-voice-fidelity.md @@ -76,17 +76,32 @@ prose from whichever SCM hosts the project. In all cases, prefer **inline review-comment bodies** over commit messages — they are closer to the conversational register of the text being drafted. Use commit messages only as a fallback. -3. **Prior Copilot CLI session history**, when the agent has access to - `session_store_sql` or equivalent. Query recent `events` where - `type = 'user.message'` for natural-language samples. +3. **Prior agent session history with this user**, when the agent has + access to a persisted transcript store. The mechanism is + agent-specific; the intent is the same — surface recent + `user.message`-shaped content for natural-language samples. Examples + (non-exhaustive): + - **GitHub Copilot CLI**: query `session_store_sql` (the `events` + table where `type = 'user.message'`). + - **Claude Code**: read recent JSONL transcripts under + `~/.claude/projects//*.jsonl` and filter to + `type: "user"` entries. + - **Cursor / Windsurf / other IDE-embedded agents**: use whatever + local conversation history the IDE exposes (often a SQLite store + under the IDE's user-data directory). + - **Any other agent**: if a queryable or readable conversation log + exists for the user, sample recent user-authored turns from it. + If no such store is available, skip this source. 4. **Organization-specific communication tools**, where available and permitted. Examples (non-exhaustive, environment-dependent): - Microsoft 365 Copilot / WorkIQ MCP server (`workiq-ask_work_iq`) for users in Microsoft tenants - Slack / Teams export tools where the user has provided access - Any user-provided dump of personal correspondence -5. **Explicit style notes** the user has written down (e.g., a - `STYLE.md` or `.github/copilot-instructions.md` voice section). +5. **Explicit style notes** the user has written down — e.g., a + `STYLE.md`, or a voice section inside an agent-instruction file + such as `.github/copilot-instructions.md`, `CLAUDE.md`, + `AGENTS.md`, `.cursorrules`, or `.windsurfrules`. If **no voice sources are available**: From 8e4ff0d47cb6fb2d406f3084d45a6c1409de6327 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 15:13:31 -0600 Subject: [PATCH 07/12] Address PR #254 review feedback (10 threads) Voice-fidelity protocol (4 threads): - Em-dash rule now conditional on user's own samples (T1, T5) - Bullet-list rule moved from hard rules to soft rules (T6) - Add consent and confidentiality requirements before sampling agent transcripts or org-tool history (T9) respond-to-pr-comments template/skill (6 threads): - Soften ADO threads pagination claim and add defensive continuationToken handling (T2) - URL-encode {project} and {repoName} in all ADO az rest example URIs via {projectEnc}/{repoNameEnc} placeholders (T3, T4, T10) - Replace GitLab-style !123 PR id notation with bare 123 or ado:123 prefix (T7) - Rename 'all pending' selector to 'all open' to avoid collision with ADO's distinct 'pending' status (T8) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 24 ++++++----- protocols/guardrails/human-voice-fidelity.md | 35 ++++++++++++--- templates/respond-to-pr-comments.md | 43 ++++++++++++------- 3 files changed, 71 insertions(+), 31 deletions(-) diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index 702ffd0..e74f112 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -72,8 +72,8 @@ Source them as follows: already-encoded segments from the original URL). Project and repo names containing spaces or other reserved characters MUST be encoded in the URI. -- **From a bare id** (`#42`, `!123`) — derive the rest from the - selected upstream remote. Recognise: +- **From a bare id** (`#42` for GitHub, `123` or `ado:123` for ADO) — + derive the rest from the selected upstream remote. Recognise: - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` @@ -170,21 +170,25 @@ query($threadId: ID!, $commentCursor: String) { Run `az login` once. Then: -1. Resolve `repoId`: +1. Resolve `repoId` (use URL-encoded `{projectEnc}` / `{repoNameEnc}` + per the encoding note above; `{org}` and GUIDs need no encoding): ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoName}?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoNameEnc}?api-version=7.1" ``` -2. List threads (returns full `value[]` in one response — no - `$top`/`$skip` pagination on this endpoint; comments are embedded): +2. List threads. The documented schema does not expose `$top`/`$skip` + pagination and comments are embedded inline. Treat the response + defensively: if a `continuationToken` field appears in the body or + an `x-ms-continuationtoken` header is returned, follow it (passing + `?continuationToken=`) until no further token is returned. ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" ``` 3. Get latest iteration (for outdated detection): ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" ``` For each ADO thread, record: @@ -335,7 +339,7 @@ Execute with **mandatory user confirmation at every step**. { "content": "", "parentCommentId": , "commentType": "text" } EOF az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method POST \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ --headers "Content-Type=application/json" \ --body @reply.json ``` @@ -363,7 +367,7 @@ Execute with **mandatory user confirmation at every step**. is safe here): ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method PATCH \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ --headers "Content-Type=application/json" \ --body '{ "status": "fixed" }' ``` diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md index 40302c4..28a5d49 100644 --- a/protocols/guardrails/human-voice-fidelity.md +++ b/protocols/guardrails/human-voice-fidelity.md @@ -49,9 +49,9 @@ prose from whichever SCM hosts the project. APIs filtered by the user's identity. After `az login`: ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullrequests?searchCriteria.creatorId={userId}&api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullrequests?searchCriteria.creatorId={userId}&api-version=7.1" az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method GET \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" ``` Filter the resulting `comments[*]` to those whose `author.uniqueName` matches the user. @@ -103,6 +103,26 @@ prose from whichever SCM hosts the project. such as `.github/copilot-instructions.md`, `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, or `.windsurfrules`. +**Consent and confidentiality.** Before sampling from any source +outside the current repository / PR context (sources 3 and 4 above — +prior agent session history and organization-specific communication +tools), the agent **MUST**: + +- Disclose to the user which source it intends to access and what it + will sample (e.g., "I'll read the last 20 user turns from your + Copilot CLI session store to calibrate voice"). +- Obtain explicit confirmation before accessing the source. +- Use sampled content **only** for in-process style calibration. Do + NOT quote sampled content verbatim in externally posted text. Do + NOT include sampled content in tool-call arguments to third-party + services. Do NOT persist sampled content to disk outside the + current session's working memory. + +Sources 1, 2, and 5 (explicit user samples, prior repo PRs/MRs by the +user, and explicit style notes already in the repo) do not require +additional consent — they are either user-volunteered or already part +of the project context. + If **no voice sources are available**: - Ask the user for **2-3 sample replies** they would consider @@ -155,7 +175,10 @@ only, not to surrounding analysis, code blocks, or quoted material. **Hard rules — failure means rewrite:** -- [ ] No em-dash (U+2014 `—`) anywhere in the drafted text. +- [ ] No em-dash (U+2014 `—`) anywhere in the drafted text, **unless + em-dashes appear in the user's own samples**. Em-dashes are a + strong AI tell when absent from the user's baseline; permitted + when present. - [ ] No en-dash (U+2013 `–`) used as a punctuation separator (en-dash in numeric ranges like `2020–2024` is allowed). - [ ] None of these AI-tell phrases appear unless they appear in the @@ -170,8 +193,6 @@ only, not to surrounding analysis, code blocks, or quoted material. - "Certainly!" - "Absolutely!" - Opening with "I'd be happy to..." -- [ ] No exhaustive bulleted list when 1-2 sentences of prose would - read more naturally. - [ ] No unsupported technical claims — every factual statement is backed by code, tests, docs, or reviewer text already in context. @@ -182,6 +203,10 @@ only, not to surrounding analysis, code blocks, or quoted material. **Soft rules — flag for user review if violated:** +- Avoid bulleted lists of more than 3 items when 1-2 sentences of + prose would convey the same information, **unless the user's + samples show frequent list use**. Long bulleted lists are an AI + tell when absent from the user's baseline. - Reply length is within ~1.5x the observed user average for the channel. - Greeting/sign-off matches user habit. diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index f952d0a..b9837e7 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -17,8 +17,8 @@ protocols: - guardrails/human-voice-fidelity format: pr-comment-responses params: - pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or PR id (e.g., !123 for ADO). Platform is auto-detected from the URL when given; otherwise inferred from `git remote -v` of the current repo. If both signals are absent or ambiguous, the workflow prompts you to pick rather than guessing." - review_threads: "Review feedback to address — 'all pending', specific thread URLs, or pasted comments" + pr_reference: "Pull request to respond to — full URL (GitHub or Azure DevOps Services), PR number (e.g., #42 for GitHub), or bare numeric PR id (e.g., 123, optionally prefixed `ado:123` to disambiguate). Platform is auto-detected from the URL when given; otherwise inferred from `git remote -v` of the current repo. If both signals are absent or ambiguous, the workflow prompts you to pick rather than guessing." + review_threads: "Review feedback to address — 'all open' (GitHub: unresolved threads; ADO: status `active`), specific thread URLs, or pasted comments" codebase_context: "What this code does, relevant architecture, design decisions that inform responses" response_mode: "How to respond per-thread — 'auto' (heuristic), 'fix' (code changes), or 'explain' (rationale)" output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via the platform's CLI/API)" @@ -67,7 +67,8 @@ Services pull request. Use this resolution order: **Azure DevOps Services** (legacy host) - URL-decode `` and `` segments before using them. -2. **If `pr_reference` is not a URL** (e.g., bare `#42` or `!123`), +2. **If `pr_reference` is not a URL** (e.g., bare `#42` for GitHub or + `123` / `ado:123` for ADO), inspect `git remote -v` in the current working directory: - Match HTTPS or SSH remotes: - GitHub: `github.com`, `git@github.com` @@ -99,7 +100,8 @@ Record the detected platform; the rest of the phases branch on it. segment** (or preserve the already-encoded segments from the original URL) so values containing spaces or other reserved characters do not produce malformed requests. - - **If `pr_reference` is a bare id** (`#42`, `!123`), derive + - **If `pr_reference` is a bare id** (`#42` for GitHub, `123` or + `ado:123` for ADO), derive `owner`/`repo` (GitHub) or `org`/`project`/`repoName` (ADO) from the selected upstream remote. Recognise these forms: - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` @@ -152,7 +154,7 @@ Fetch all review threads from the platform. steps can post replies and resolve the correct threads 2. **Filter threads** based on `review_threads` parameter: - - If `all pending` — include all threads with state `pending` + - If `all open` — include all threads where `isResolved: false` - If specific threads are listed — include only those - Skip `resolved` threads unless the user explicitly requests them - Flag `outdated` threads (code has changed since the comment) @@ -175,27 +177,34 @@ the user to mint a Personal Access Token. 1. **Resolve the repository GUID and PR id.** Parse the PR URL: - `org`, `project`, `repoName`, `prId` from the URL path (URL-decode `project` and `repoName`). + - When constructing the API URIs below, **URL-encode** `project` + and `repoName` (they may contain spaces or reserved characters). + `{projectEnc}` and `{repoNameEnc}` denote the encoded forms; + `{org}` and GUIDs (`{repoId}`) need no encoding. - Look up the repository GUID — needed by some thread endpoints: ```powershell az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` --method GET ` - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoName}?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoNameEnc}?api-version=7.1" ``` Record the returned `id` as `repoId`. - For fork PRs, ADO targets the **target repository** for thread operations; `repoId` from the URL above is correct. -2. **List all PR threads.** This endpoint returns the full collection - in a single response — there is **no `$top`/`$skip` pagination** - documented for this endpoint, and comments are embedded inline - in each thread: +2. **List all PR threads.** The documented schema for this endpoint + does not expose `$top`/`$skip` pagination parameters, and comments + are embedded inline in each thread. Treat the response defensively + anyway: if a `continuationToken` field appears in the body or an + `x-ms-continuationtoken` header is returned, follow it (passing + `?continuationToken=` on the next request) until no further + token is returned. ```powershell az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` --method GET ` - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads?api-version=7.1" ``` Read `value[]` from the response. For each thread, record: @@ -221,7 +230,7 @@ the user to mint a Personal Access Token. ```powershell az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 ` --method GET ` - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/iterations?api-version=7.1" ``` Record the highest `id` as `latestIteration`. @@ -239,8 +248,10 @@ the user to mint a Personal Access Token. should NOT be auto-skipped — flag them and ask the user. 5. **Filter remaining threads** based on `review_threads` parameter: - - If `all pending` — include threads with status `active` (the - default state for new comments needing response). + - If `all open` — include threads with status `active` (the + default state for new comments needing response). Note: ADO's + `pending` status is distinct (author-marked awaiting something) + and is handled in the next bullet, not as part of `all open`. - **Pending threads (status `pending`)**: flag and ask the user whether to address now — the comment author marked them as awaiting something. @@ -420,7 +431,7 @@ differ. ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ --method POST \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}/comments?api-version=7.1" \ --headers "Content-Type=application/json" \ --body @reply.json ``` @@ -465,7 +476,7 @@ differ. ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ --method PATCH \ - --uri "https://dev.azure.com/{org}/{project}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ + --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ --headers "Content-Type=application/json" \ --body '{ "status": "fixed" }' ``` From 765048955ae3096b1ed3642f5c00dda6b4565e91 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 15:18:33 -0600 Subject: [PATCH 08/12] Address PR #254 second-pass review feedback ADO status enum casing (5 threads): - Replace 'lowercase API enum values' wording in template, skill, PATCH instructions, and checklist with 'exact case-sensitive enum values' noting wontFix and byDesign are camelCase Voice fidelity (2 threads): - Tighten source 4 (org communication tools) to require explicit, per-session opt-in; default behavior is to skip - Tighten Consent block: disclose what, approximate volume, and that consent does not carry between sessions - Clarify Voice Calibration note placement: reported in agent chat output / action summary, NOT inserted into the produced format artifact (prevents format drift in pr-comment-responses and investigation-report) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 12 +- cli/specs/audit-2026-03-30-summary.md | 27 ++ cli/specs/audit-2026-03-30.md | 230 ++++++++++++++++++ protocols/guardrails/human-voice-fidelity.md | 29 ++- templates/respond-to-pr-comments.md | 18 +- 5 files changed, 294 insertions(+), 22 deletions(-) create mode 100644 cli/specs/audit-2026-03-30-summary.md create mode 100644 cli/specs/audit-2026-03-30.md diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index e74f112..57251a3 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -195,8 +195,9 @@ For each ADO thread, record: - `id`: the thread id (integer; required for status updates and posting replies) - `status`: one of `active`, `pending`, `fixed`, `wontFix`, - `closed`, `byDesign`, `unknown` (lowercase API enum values — - ADO uses `fixed`, NOT `resolved`) + `closed`, `byDesign`, `unknown` (exact API enum values — + **case-sensitive**; note `wontFix` and `byDesign` are camelCase. + ADO uses `fixed`, NOT `resolved`.) - `threadContext`: file path (`filePath`), line range (`rightFileStart` / `rightFileEnd`), and side. May be `null` for PR-wide threads or system threads. @@ -362,9 +363,10 @@ Execute with **mandatory user confirmation at every step**. }' -F threadId="" ``` - **ADO** — PATCH with exact lowercase enum (the body is a fixed - small JSON literal with no user content, so inlining `--body '...'` - is safe here): + **ADO** — PATCH with exact case-sensitive enum value (`wontFix` + and `byDesign` are camelCase; the body is a fixed small JSON + literal with no user content, so inlining `--body '...'` is safe + here): ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 --method PATCH \ --uri "https://dev.azure.com/{org}/{projectEnc}/_apis/git/repositories/{repoId}/pullRequests/{prId}/threads/{threadId}?api-version=7.1" \ diff --git a/cli/specs/audit-2026-03-30-summary.md b/cli/specs/audit-2026-03-30-summary.md new file mode 100644 index 0000000..31b1305 --- /dev/null +++ b/cli/specs/audit-2026-03-30-summary.md @@ -0,0 +1,27 @@ +# PromptKit CLI — Audit Findings Summary (2026-03-30) + +All findings from maintenance audit of v0.5.0 code against v0.3.0 spec baseline. + +| Finding | Status on main | Verdict | +|---------|---------------|---------| +| F-001 🔴 | TC-CLI-080/081 reads source text instead of verifying spawn cmd/args | Confirmed — needs spawn-intercepting test | +| F-002 🔴 | TC-CLI-076 mapped to REQ-CLI-019 but only tests "no CLI found", not spawn failure | Confirmed — needs `--cli nonexistent` error path test | +| F-003 🔴 | Spec docs say v0.3.0, package.json says v0.5.0 | Confirmed — needs version alignment + revision history entry | +| F-004 🟡 | REQ-CLI-082 (publishConfig) verified by inspection only — no TC-NNN | Confirmed — add package.json assertion test | +| F-005 🟡 | REQ-CLI-091 (cross-platform) has no TC-NNN and no CI matrix | Confirmed — needs CI platform matrix | +| F-006 🟡 | TC-CLI-071 (gh-copilot detection) not in test files | Confirmed — missing test | +| F-007 🟡 | TC-CLI-075 (--cli flag override) not in test files | Confirmed — missing test | +| F-008 🟡 | TC-CLI-079 (temp dir cleanup on exit) not in test files | Confirmed — missing test | +| F-009 🟡 | TC-CLI-100/121 (npm pack distribution) not automated — inspection only | Confirmed — needs npm pack dry-run test | +| F-010 🟡 | TC-CLI-073 implemented as integration test; plan specifies unit test calling detectCli() | Confirmed — weak isolation, add unit test | +| F-011 🟡 | REQ-CLI-094 (exactly two dependencies) verified by inspection only — no TC-NNN | Confirmed — add package.json assertion test | +| F-012 🟢 | REQ-CLI-082 and REQ-CLI-090 absent from design doc (no D1 coverage) | Confirmed — add brief notes to design §6 | +| F-013 🟢 | GAP-010 still marked "To be resolved" in design; code already resolves it | Confirmed — mark GAP-010 resolved in design.md | +| F-014 🟢 | launch.js prints "content staged at" and "Launching…" — not in any requirement | Accept — benign informational output | +| F-015 🟢 | Usage hint says "Run promptkit" not "Run promptkit interactive" | Confirmed — ambiguous; decide: fix hint or relax acceptance criterion | +| F-016 🟢 | TC-CLI-077 (fallback warning to claude) not in test files | Confirmed — missing test | +| F-017 🟢 | TC-CLI-101 (content/ gitignored) not automated — requires git context | Confirmed — replace with static .gitignore check | +| F-018 🟢 | TC-CLI-078 only checks personas/ and templates/, not protocols/, formats/, taxonomies/ | Confirmed — add 3 missing directory assertions | +| F-019 🟢 | TC-CLI-002 only tests --version, not -V short flag | Confirmed — add -V assertion | +| F-020 🟢 | TC-CLI-053 asserts output.includes("promptkit") but not "interactive" | Confirmed — strengthen assertion | +| F-021 🟢 | CON-001 through CON-004 (constraints) absent from traceability matrix entirely | Accept — constraints are structural; code is compliant | diff --git a/cli/specs/audit-2026-03-30.md b/cli/specs/audit-2026-03-30.md new file mode 100644 index 0000000..65952ea --- /dev/null +++ b/cli/specs/audit-2026-03-30.md @@ -0,0 +1,230 @@ +--- +title: "PromptKit CLI — Maintenance Audit Report" +date: "2026-03-30" +auditor: "maintenance-workflow (specification-analyst)" +package_version: "0.5.0" +spec_version: "0.3.0" +--- + +# PromptKit CLI — Maintenance Audit Report + +**Date**: 2026-03-30 +**Package version**: 0.5.0 (code) vs 0.3.0 (specs — see F-003) +**Scope**: Full audit — requirements, design, validation, implementation, tests + +--- + +## Summary + +| Severity | Count | +|----------|-------| +| High | 3 | +| Medium | 8 | +| Low | 10 | +| **Total**| **21**| + +The spec baseline is broadly coherent. Retirement of the assemble engine is faithfully reflected across all artifacts. The three critical issues are: (1) a test that creates illusory coverage for spawn argument verification, (2) an untested spawn-failure error path, and (3) spec documents frozen at v0.3.0 while the package is at v0.5.0. + +--- + +## High Severity + +### F-001 — TC-CLI-080/081 assertions do not verify spawn cmd/args +**Category**: D13_ASSERTION_MISMATCH +**Spec**: `validation.md` TC-CLI-080/TC-CLI-081 → REQ-CLI-016/017 +**Code**: `tests/launch.test.js` line 207 + +The combined test reads `launch.js` as a source string and checks function exports. The validation plan requires verifying the exact `cmd` and `args` passed to `spawn` for each supported CLI (`copilot`, `gh-copilot`, `claude`). No spawn interception occurs. The switch/case in `launch.js:89–105` is behaviorally untested. + +**Fix**: Intercept `child_process.spawn` in the test, invoke `launchInteractive` for each CLI name, and assert exact `cmd`/`args` per the validation plan specification. + +--- + +### F-002 — REQ-CLI-019 spawn failure path untested +**Category**: D2_UNTESTED_REQUIREMENT + D7_ACCEPTANCE_CRITERIA_MISMATCH +**Spec**: `requirements.md` REQ-CLI-019; `validation.md` traceability matrix +**Code**: `tests/launch.test.js` line 54; `lib/launch.js` lines 112–119 + +The traceability matrix maps REQ-CLI-019 (spawn error → print error, cleanup temp, exit 1) to TC-CLI-076. TC-CLI-076 exercises the "no CLI found" path (REQ-CLI-012) by clearing PATH — it never triggers the `child.on("error", …)` handler. The acceptance criterion (`promptkit --cli nonexistent`) is not tested. + +**Fix**: Add a test that runs `promptkit interactive --cli nonexistent`, asserts stderr contains an error, temp directory is cleaned up, and exit code is 1. + +--- + +### F-003 — Spec documents frozen at v0.3.0; package is at v0.5.0 +**Category**: D5_ASSUMPTION_DRIFT +**Spec**: `requirements.md`, `design.md`, `validation.md` (all header: `version: "0.3.0"`) +**Code**: `package.json` line 4: `"version": "0.5.0"` + +Two minor version increments occurred with no spec revision history entry. Any tooling or future audit based on the spec version header will see stale metadata. + +**Fix**: Update all three spec document headers to `version: "0.5.0"` and add a revision history entry (even if "no spec content changed — version alignment only"). Establish a convention tying spec version to package.json version. + +--- + +## Medium Severity + +### F-004 — REQ-CLI-082 (publishConfig) has no automated test +**Category**: D2_UNTESTED_REQUIREMENT +**Spec**: `validation.md` matrix: `(verified by package.json inspection)` — no TC-NNN + +No test asserts `publishConfig.access === "public"` or that the package name is scoped under `@alan-jowett`. A one-line parse of `package.json` would suffice. + +--- + +### F-005 — REQ-CLI-091 (cross-platform) has no automated test +**Category**: D2_UNTESTED_REQUIREMENT +**Spec**: `validation.md` matrix: `(platform-specific testing)` — no TC-NNN, no CI matrix + +The platform-aware `where`/`which` branch and Windows junction-symlink code exist but no CI runs tests on all three platforms. Needs at minimum a documented CI matrix job. + +--- + +### F-006 — TC-CLI-071 (gh-copilot detection) not implemented +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-071 → REQ-CLI-010 + +The mock CLI section covers copilot (TC-CLI-070), claude (TC-CLI-072), and gh-without-extension (TC-CLI-074). TC-CLI-071 — `detectCli` returning `"gh-copilot"` when `copilot` is absent but `gh` with the extension is present — is absent. The `execFileSync("gh", ["copilot", "--help"], …)` branch in `launch.js:26–30` is unvalidated. + +--- + +### F-007 — TC-CLI-075 (--cli flag override) not implemented +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-075 → REQ-CLI-011 + +No test passes `--cli ` to the CLI and verifies it overrides auto-detection. The `opts.cli || null` logic in `cli.js:52` and the `cliName` parameter to `launchInteractive` are not end-to-end validated. + +--- + +### F-008 — TC-CLI-079 (temp dir cleanup on exit) not implemented +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-079 → REQ-CLI-018 + +The `fs.rmSync` cleanup in `launch.js:124–128` is not validated by any test. A regression that leaves temp directories behind would not be caught by CI. + +--- + +### F-009 — TC-CLI-100 and TC-CLI-121 (npm pack distribution) not implemented +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-100, TC-CLI-121 → REQ-CLI-080, REQ-CLI-101 + +No test runs `npm pack --dry-run` to verify package contents. The `files` field in `package.json` and the absence of `lib/assemble.js`/`lib/manifest.js` are verified only by manual inspection. + +--- + +### F-010 — TC-CLI-073 implemented as integration test; plan specifies unit test +**Category**: D13_ASSERTION_MISMATCH +**Spec**: `validation.md` TC-CLI-073 ("Type: Unit — call `detectCli()`") +**Code**: `tests/launch.test.js` line 54 (full CLI invocation via `execFileSync`) + +The test runs the CLI binary and checks stderr, rather than calling `detectCli()` directly. Behavioral isolation is lost; a failure is harder to attribute. + +--- + +### F-011 — REQ-CLI-094 (dependency count) has no automated test +**Category**: D2_UNTESTED_REQUIREMENT +**Spec**: `validation.md` matrix: `(verified by package.json inspection)` — no TC-NNN + +One assertion — `Object.keys(pkg.dependencies).length === 2` with specific names — would fully automate this. + +--- + +## Low Severity + +### F-012 — REQ-CLI-082 and REQ-CLI-090 not traced in design +**Category**: D1_UNTRACED_REQUIREMENT +**Spec**: `design.md` §6 (no entry for `engines` or `publishConfig`) + +Both REQ-CLI-082 (public npm access) and REQ-CLI-090 (Node >= 18) are absent from the design document's dependency/distribution sections. Brief notes in §6 would close these. + +--- + +### F-013 — GAP-010 resolved in code but design still marks it open +**Category**: D9_UNDOCUMENTED_BEHAVIOR +**Spec**: `design.md` §7.3 GAP-010: "Status: To be resolved in implementation" +**Code**: `cli.js` line 47: `"LLM CLI to use (copilot, gh-copilot, claude)"` + +The `--cli` help text already documents valid values. The design's gap tracker is stale. + +**Fix**: Update GAP-010 to `[RESOLVED]` in `design.md`. + +--- + +### F-014 — Informational launch messages not specified +**Category**: D9_UNDOCUMENTED_BEHAVIOR +**Code**: `launch.js` lines 83–84: `"PromptKit content staged at: …"` and `"Launching ${cli}…"` + +These console outputs are not mentioned in any requirement or design section. They appear on stdout before the LLM session begins. Low risk; benign behavior that nonetheless has no spec home. + +--- + +### F-015 — Usage hint says "Run promptkit" not "Run promptkit interactive" +**Category**: D10_CONSTRAINT_VIOLATION_IN_CODE +**Spec**: `requirements.md` REQ-CLI-023 acceptance: "referencing `promptkit interactive` (or equivalent)" +**Code**: `cli.js` line 98: `"Run promptkit to start an interactive session…"` + +The hint mentions `promptkit` and "interactive session" but not the `interactive` subcommand by name. Whether "or equivalent" covers this is a judgment call. Either update the hint or relax the acceptance criterion. + +--- + +### F-016 — TC-CLI-077 (fallback warning) not implemented +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-077 → REQ-CLI-013 + +No test runs `promptkit interactive` with only `claude` on PATH and asserts stderr contains the fallback warning. The warning logic in `launch.js:73–79` is unvalidated. + +--- + +### F-017 — TC-CLI-101 (content/ gitignored) not automated +**Category**: D11_UNIMPLEMENTED_TEST_CASE +**Spec**: `validation.md` TC-CLI-101 → REQ-CLI-081 + +Requires git context. Could be replaced with a static test asserting `.gitignore` contains `content/`. + +--- + +### F-018 — TC-CLI-078 missing assertions for protocols/, formats/, taxonomies/ +**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION +**Spec**: `validation.md` TC-CLI-078: "contains … component subdirectories" +**Code**: `tests/launch.test.js` lines 195–202: checks `personas/` and `templates/` only + +Three of the five component directories are not asserted. + +--- + +### F-019 — TC-CLI-002 doesn't test `-V` short flag +**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION +**Spec**: `requirements.md` REQ-CLI-003: "--version **or -V**" +**Code**: `tests/cli.test.js` line 136: only `--version` tested + +--- + +### F-020 — TC-CLI-053 assertion too weak for usage hint +**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION +**Spec**: `validation.md` TC-CLI-053: "referencing `promptkit interactive`" +**Code**: `tests/list.test.js` line 77: `output.includes("promptkit")` only + +--- + +### F-021 — TC-CLI-082 (public access) not automated +**Category**: D2_UNTESTED_REQUIREMENT +*(Duplicate of F-004 from a test-level view — same gap, test-code perspective)* + +--- + +## Remediation Priority + +| Priority | Findings | Action | +|----------|----------|--------| +| **P1 — Before next release** | F-001, F-002 | Add spawn-intercepting test; add spawn-error integration test | +| **P2 — Next maintenance cycle** | F-003–F-011 | Version alignment; add missing automated tests (publishConfig, cross-platform CI, gh-copilot detection, --cli override, cleanup, npm pack, detectCli unit) | +| **P3 — Doc cleanup** | F-012–F-021 | Mark GAP-010 resolved; add design notes for D1s; fix usage hint or acceptance criterion; add missing assertions | + +--- + +## Open Questions + +1. **F-015**: Is `"Run promptkit to start an interactive session"` acceptable as "or equivalent" for `promptkit interactive`? → determines doc fix vs code fix. +2. **F-003**: What changed in v0.3, v0.4, v0.5? Any unreflected spec changes, or purely publish version bumps? +3. **F-005**: Is a CI platform matrix already configured (e.g., `.github/workflows/`)? Not in audit scope. +4. **F-009**: Is there an external build validation step covering TC-CLI-100/121 outside `npm test`? diff --git a/protocols/guardrails/human-voice-fidelity.md b/protocols/guardrails/human-voice-fidelity.md index 28a5d49..31b50f5 100644 --- a/protocols/guardrails/human-voice-fidelity.md +++ b/protocols/guardrails/human-voice-fidelity.md @@ -92,8 +92,11 @@ prose from whichever SCM hosts the project. - **Any other agent**: if a queryable or readable conversation log exists for the user, sample recent user-authored turns from it. If no such store is available, skip this source. -4. **Organization-specific communication tools**, where available and - permitted. Examples (non-exhaustive, environment-dependent): +4. **Organization-specific communication tools**, only with + **explicit, per-session opt-in user permission**. The user must + actively choose to enable each such source for the current task — + default behavior is to skip. Examples (non-exhaustive, + environment-dependent): - Microsoft 365 Copilot / WorkIQ MCP server (`workiq-ask_work_iq`) for users in Microsoft tenants - Slack / Teams export tools where the user has provided access @@ -108,10 +111,13 @@ outside the current repository / PR context (sources 3 and 4 above — prior agent session history and organization-specific communication tools), the agent **MUST**: -- Disclose to the user which source it intends to access and what it - will sample (e.g., "I'll read the last 20 user turns from your - Copilot CLI session store to calibrate voice"). -- Obtain explicit confirmation before accessing the source. +- Disclose to the user **which source** it intends to access, **what + it will sample**, and the **approximate volume** (e.g., "I'll read + the last 20 user turns from your Copilot CLI session store to + calibrate voice"). +- Obtain explicit confirmation before accessing the source. **Default + behavior is to skip** these sources unless the user opts in for the + current session; consent does not carry over between sessions. - Use sampled content **only** for in-process style calibration. Do NOT quote sampled content verbatim in externally posted text. Do NOT include sampled content in tool-call arguments to third-party @@ -232,11 +238,16 @@ phrase rules; only the surrounding user-authored prose is checked. ## Output Annotation -When this protocol is applied, the output should include a brief -**Voice Calibration** note (one or two lines) stating: +When this protocol is applied, the agent reports a brief **Voice +Calibration** note (one or two lines) stating: - Which voice sources were used (e.g., "5 prior PR replies in repo, 3 user-provided samples"), or - That no samples were available and a neutral default was used. -This note is internal-facing and is not posted externally. +**Placement.** This note is **internal-facing only** — the agent +reports it in its chat-style response to the user (in document mode) +or as part of the action-mode change summary. It is **NOT** inserted +into the produced format artifact (e.g., the `pr-comment-responses` +or `investigation-report` document) and **NOT** posted externally. +This keeps consuming formats unchanged and avoids format drift. diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index b9837e7..67819ea 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -211,8 +211,9 @@ the user to mint a Personal Access Token. - `id`: the thread id (integer; required for status updates and posting replies) - `status`: one of `active`, `pending`, `fixed`, `wontFix`, - `closed`, `byDesign`, `unknown` (lowercase API enum values — - ADO uses `fixed`, NOT `resolved`) + `closed`, `byDesign`, `unknown` (exact API enum values — + **case-sensitive**; note `wontFix` and `byDesign` are camelCase. + ADO uses `fixed`, NOT `resolved`.) - `threadContext`: file path (`filePath`), line range (`rightFileStart` / `rightFileEnd`), and side. May be `null` for PR-wide threads or system threads. @@ -470,9 +471,10 @@ differ. ``` **Azure DevOps Services** — PATCH the thread status (use the - exact lowercase enum value: `active`, `pending`, `fixed`, - `wontFix`, `closed`, `byDesign` — note ADO uses `fixed`, NOT - GitHub's `resolved`): + exact API enum value with **case-sensitive casing**: `active`, + `pending`, `fixed`, `wontFix`, `closed`, `byDesign` — note the + camelCase on `wontFix` and `byDesign`, and that ADO uses `fixed`, + NOT GitHub's `resolved`): ```bash az rest --resource 499b84ac-1321-427f-aa17-267ca6975798 \ --method PATCH \ @@ -548,9 +550,9 @@ Before finalizing, verify: vocabulary (no cross-platform normalization) - [ ] On ADO: every `az rest` invocation includes `--resource 499b84ac-1321-427f-aa17-267ca6975798` -- [ ] On ADO: status PATCH payloads use the exact lowercase enum - values (`active`, `pending`, `fixed`, `wontFix`, `closed`, - `byDesign`) +- [ ] On ADO: status PATCH payloads use the exact case-sensitive enum + values (`active`, `pending`, `fixed`, `wontFix`, `byDesign`, + `closed`) — `wontFix` and `byDesign` are camelCase - [ ] On ADO: reply POST bodies use `content` + `parentCommentId` + `commentType: "text"` (NOT GitHub's `body` / `in_reply_to`) - [ ] Files modified by fixes are listed in the action summary From 0e4dc137569011fa257c091edd199b7e06039400 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 15:18:46 -0600 Subject: [PATCH 09/12] Revert: remove unrelated cli/specs/audit-2026-03-30* files These were accidentally included via git add -A in 7650489. They are not part of the human-voice-fidelity protocol work. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- cli/specs/audit-2026-03-30-summary.md | 27 --- cli/specs/audit-2026-03-30.md | 230 -------------------------- 2 files changed, 257 deletions(-) delete mode 100644 cli/specs/audit-2026-03-30-summary.md delete mode 100644 cli/specs/audit-2026-03-30.md diff --git a/cli/specs/audit-2026-03-30-summary.md b/cli/specs/audit-2026-03-30-summary.md deleted file mode 100644 index 31b1305..0000000 --- a/cli/specs/audit-2026-03-30-summary.md +++ /dev/null @@ -1,27 +0,0 @@ -# PromptKit CLI — Audit Findings Summary (2026-03-30) - -All findings from maintenance audit of v0.5.0 code against v0.3.0 spec baseline. - -| Finding | Status on main | Verdict | -|---------|---------------|---------| -| F-001 🔴 | TC-CLI-080/081 reads source text instead of verifying spawn cmd/args | Confirmed — needs spawn-intercepting test | -| F-002 🔴 | TC-CLI-076 mapped to REQ-CLI-019 but only tests "no CLI found", not spawn failure | Confirmed — needs `--cli nonexistent` error path test | -| F-003 🔴 | Spec docs say v0.3.0, package.json says v0.5.0 | Confirmed — needs version alignment + revision history entry | -| F-004 🟡 | REQ-CLI-082 (publishConfig) verified by inspection only — no TC-NNN | Confirmed — add package.json assertion test | -| F-005 🟡 | REQ-CLI-091 (cross-platform) has no TC-NNN and no CI matrix | Confirmed — needs CI platform matrix | -| F-006 🟡 | TC-CLI-071 (gh-copilot detection) not in test files | Confirmed — missing test | -| F-007 🟡 | TC-CLI-075 (--cli flag override) not in test files | Confirmed — missing test | -| F-008 🟡 | TC-CLI-079 (temp dir cleanup on exit) not in test files | Confirmed — missing test | -| F-009 🟡 | TC-CLI-100/121 (npm pack distribution) not automated — inspection only | Confirmed — needs npm pack dry-run test | -| F-010 🟡 | TC-CLI-073 implemented as integration test; plan specifies unit test calling detectCli() | Confirmed — weak isolation, add unit test | -| F-011 🟡 | REQ-CLI-094 (exactly two dependencies) verified by inspection only — no TC-NNN | Confirmed — add package.json assertion test | -| F-012 🟢 | REQ-CLI-082 and REQ-CLI-090 absent from design doc (no D1 coverage) | Confirmed — add brief notes to design §6 | -| F-013 🟢 | GAP-010 still marked "To be resolved" in design; code already resolves it | Confirmed — mark GAP-010 resolved in design.md | -| F-014 🟢 | launch.js prints "content staged at" and "Launching…" — not in any requirement | Accept — benign informational output | -| F-015 🟢 | Usage hint says "Run promptkit" not "Run promptkit interactive" | Confirmed — ambiguous; decide: fix hint or relax acceptance criterion | -| F-016 🟢 | TC-CLI-077 (fallback warning to claude) not in test files | Confirmed — missing test | -| F-017 🟢 | TC-CLI-101 (content/ gitignored) not automated — requires git context | Confirmed — replace with static .gitignore check | -| F-018 🟢 | TC-CLI-078 only checks personas/ and templates/, not protocols/, formats/, taxonomies/ | Confirmed — add 3 missing directory assertions | -| F-019 🟢 | TC-CLI-002 only tests --version, not -V short flag | Confirmed — add -V assertion | -| F-020 🟢 | TC-CLI-053 asserts output.includes("promptkit") but not "interactive" | Confirmed — strengthen assertion | -| F-021 🟢 | CON-001 through CON-004 (constraints) absent from traceability matrix entirely | Accept — constraints are structural; code is compliant | diff --git a/cli/specs/audit-2026-03-30.md b/cli/specs/audit-2026-03-30.md deleted file mode 100644 index 65952ea..0000000 --- a/cli/specs/audit-2026-03-30.md +++ /dev/null @@ -1,230 +0,0 @@ ---- -title: "PromptKit CLI — Maintenance Audit Report" -date: "2026-03-30" -auditor: "maintenance-workflow (specification-analyst)" -package_version: "0.5.0" -spec_version: "0.3.0" ---- - -# PromptKit CLI — Maintenance Audit Report - -**Date**: 2026-03-30 -**Package version**: 0.5.0 (code) vs 0.3.0 (specs — see F-003) -**Scope**: Full audit — requirements, design, validation, implementation, tests - ---- - -## Summary - -| Severity | Count | -|----------|-------| -| High | 3 | -| Medium | 8 | -| Low | 10 | -| **Total**| **21**| - -The spec baseline is broadly coherent. Retirement of the assemble engine is faithfully reflected across all artifacts. The three critical issues are: (1) a test that creates illusory coverage for spawn argument verification, (2) an untested spawn-failure error path, and (3) spec documents frozen at v0.3.0 while the package is at v0.5.0. - ---- - -## High Severity - -### F-001 — TC-CLI-080/081 assertions do not verify spawn cmd/args -**Category**: D13_ASSERTION_MISMATCH -**Spec**: `validation.md` TC-CLI-080/TC-CLI-081 → REQ-CLI-016/017 -**Code**: `tests/launch.test.js` line 207 - -The combined test reads `launch.js` as a source string and checks function exports. The validation plan requires verifying the exact `cmd` and `args` passed to `spawn` for each supported CLI (`copilot`, `gh-copilot`, `claude`). No spawn interception occurs. The switch/case in `launch.js:89–105` is behaviorally untested. - -**Fix**: Intercept `child_process.spawn` in the test, invoke `launchInteractive` for each CLI name, and assert exact `cmd`/`args` per the validation plan specification. - ---- - -### F-002 — REQ-CLI-019 spawn failure path untested -**Category**: D2_UNTESTED_REQUIREMENT + D7_ACCEPTANCE_CRITERIA_MISMATCH -**Spec**: `requirements.md` REQ-CLI-019; `validation.md` traceability matrix -**Code**: `tests/launch.test.js` line 54; `lib/launch.js` lines 112–119 - -The traceability matrix maps REQ-CLI-019 (spawn error → print error, cleanup temp, exit 1) to TC-CLI-076. TC-CLI-076 exercises the "no CLI found" path (REQ-CLI-012) by clearing PATH — it never triggers the `child.on("error", …)` handler. The acceptance criterion (`promptkit --cli nonexistent`) is not tested. - -**Fix**: Add a test that runs `promptkit interactive --cli nonexistent`, asserts stderr contains an error, temp directory is cleaned up, and exit code is 1. - ---- - -### F-003 — Spec documents frozen at v0.3.0; package is at v0.5.0 -**Category**: D5_ASSUMPTION_DRIFT -**Spec**: `requirements.md`, `design.md`, `validation.md` (all header: `version: "0.3.0"`) -**Code**: `package.json` line 4: `"version": "0.5.0"` - -Two minor version increments occurred with no spec revision history entry. Any tooling or future audit based on the spec version header will see stale metadata. - -**Fix**: Update all three spec document headers to `version: "0.5.0"` and add a revision history entry (even if "no spec content changed — version alignment only"). Establish a convention tying spec version to package.json version. - ---- - -## Medium Severity - -### F-004 — REQ-CLI-082 (publishConfig) has no automated test -**Category**: D2_UNTESTED_REQUIREMENT -**Spec**: `validation.md` matrix: `(verified by package.json inspection)` — no TC-NNN - -No test asserts `publishConfig.access === "public"` or that the package name is scoped under `@alan-jowett`. A one-line parse of `package.json` would suffice. - ---- - -### F-005 — REQ-CLI-091 (cross-platform) has no automated test -**Category**: D2_UNTESTED_REQUIREMENT -**Spec**: `validation.md` matrix: `(platform-specific testing)` — no TC-NNN, no CI matrix - -The platform-aware `where`/`which` branch and Windows junction-symlink code exist but no CI runs tests on all three platforms. Needs at minimum a documented CI matrix job. - ---- - -### F-006 — TC-CLI-071 (gh-copilot detection) not implemented -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-071 → REQ-CLI-010 - -The mock CLI section covers copilot (TC-CLI-070), claude (TC-CLI-072), and gh-without-extension (TC-CLI-074). TC-CLI-071 — `detectCli` returning `"gh-copilot"` when `copilot` is absent but `gh` with the extension is present — is absent. The `execFileSync("gh", ["copilot", "--help"], …)` branch in `launch.js:26–30` is unvalidated. - ---- - -### F-007 — TC-CLI-075 (--cli flag override) not implemented -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-075 → REQ-CLI-011 - -No test passes `--cli ` to the CLI and verifies it overrides auto-detection. The `opts.cli || null` logic in `cli.js:52` and the `cliName` parameter to `launchInteractive` are not end-to-end validated. - ---- - -### F-008 — TC-CLI-079 (temp dir cleanup on exit) not implemented -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-079 → REQ-CLI-018 - -The `fs.rmSync` cleanup in `launch.js:124–128` is not validated by any test. A regression that leaves temp directories behind would not be caught by CI. - ---- - -### F-009 — TC-CLI-100 and TC-CLI-121 (npm pack distribution) not implemented -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-100, TC-CLI-121 → REQ-CLI-080, REQ-CLI-101 - -No test runs `npm pack --dry-run` to verify package contents. The `files` field in `package.json` and the absence of `lib/assemble.js`/`lib/manifest.js` are verified only by manual inspection. - ---- - -### F-010 — TC-CLI-073 implemented as integration test; plan specifies unit test -**Category**: D13_ASSERTION_MISMATCH -**Spec**: `validation.md` TC-CLI-073 ("Type: Unit — call `detectCli()`") -**Code**: `tests/launch.test.js` line 54 (full CLI invocation via `execFileSync`) - -The test runs the CLI binary and checks stderr, rather than calling `detectCli()` directly. Behavioral isolation is lost; a failure is harder to attribute. - ---- - -### F-011 — REQ-CLI-094 (dependency count) has no automated test -**Category**: D2_UNTESTED_REQUIREMENT -**Spec**: `validation.md` matrix: `(verified by package.json inspection)` — no TC-NNN - -One assertion — `Object.keys(pkg.dependencies).length === 2` with specific names — would fully automate this. - ---- - -## Low Severity - -### F-012 — REQ-CLI-082 and REQ-CLI-090 not traced in design -**Category**: D1_UNTRACED_REQUIREMENT -**Spec**: `design.md` §6 (no entry for `engines` or `publishConfig`) - -Both REQ-CLI-082 (public npm access) and REQ-CLI-090 (Node >= 18) are absent from the design document's dependency/distribution sections. Brief notes in §6 would close these. - ---- - -### F-013 — GAP-010 resolved in code but design still marks it open -**Category**: D9_UNDOCUMENTED_BEHAVIOR -**Spec**: `design.md` §7.3 GAP-010: "Status: To be resolved in implementation" -**Code**: `cli.js` line 47: `"LLM CLI to use (copilot, gh-copilot, claude)"` - -The `--cli` help text already documents valid values. The design's gap tracker is stale. - -**Fix**: Update GAP-010 to `[RESOLVED]` in `design.md`. - ---- - -### F-014 — Informational launch messages not specified -**Category**: D9_UNDOCUMENTED_BEHAVIOR -**Code**: `launch.js` lines 83–84: `"PromptKit content staged at: …"` and `"Launching ${cli}…"` - -These console outputs are not mentioned in any requirement or design section. They appear on stdout before the LLM session begins. Low risk; benign behavior that nonetheless has no spec home. - ---- - -### F-015 — Usage hint says "Run promptkit" not "Run promptkit interactive" -**Category**: D10_CONSTRAINT_VIOLATION_IN_CODE -**Spec**: `requirements.md` REQ-CLI-023 acceptance: "referencing `promptkit interactive` (or equivalent)" -**Code**: `cli.js` line 98: `"Run promptkit to start an interactive session…"` - -The hint mentions `promptkit` and "interactive session" but not the `interactive` subcommand by name. Whether "or equivalent" covers this is a judgment call. Either update the hint or relax the acceptance criterion. - ---- - -### F-016 — TC-CLI-077 (fallback warning) not implemented -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-077 → REQ-CLI-013 - -No test runs `promptkit interactive` with only `claude` on PATH and asserts stderr contains the fallback warning. The warning logic in `launch.js:73–79` is unvalidated. - ---- - -### F-017 — TC-CLI-101 (content/ gitignored) not automated -**Category**: D11_UNIMPLEMENTED_TEST_CASE -**Spec**: `validation.md` TC-CLI-101 → REQ-CLI-081 - -Requires git context. Could be replaced with a static test asserting `.gitignore` contains `content/`. - ---- - -### F-018 — TC-CLI-078 missing assertions for protocols/, formats/, taxonomies/ -**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION -**Spec**: `validation.md` TC-CLI-078: "contains … component subdirectories" -**Code**: `tests/launch.test.js` lines 195–202: checks `personas/` and `templates/` only - -Three of the five component directories are not asserted. - ---- - -### F-019 — TC-CLI-002 doesn't test `-V` short flag -**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION -**Spec**: `requirements.md` REQ-CLI-003: "--version **or -V**" -**Code**: `tests/cli.test.js` line 136: only `--version` tested - ---- - -### F-020 — TC-CLI-053 assertion too weak for usage hint -**Category**: D12_UNTESTED_ACCEPTANCE_CRITERION -**Spec**: `validation.md` TC-CLI-053: "referencing `promptkit interactive`" -**Code**: `tests/list.test.js` line 77: `output.includes("promptkit")` only - ---- - -### F-021 — TC-CLI-082 (public access) not automated -**Category**: D2_UNTESTED_REQUIREMENT -*(Duplicate of F-004 from a test-level view — same gap, test-code perspective)* - ---- - -## Remediation Priority - -| Priority | Findings | Action | -|----------|----------|--------| -| **P1 — Before next release** | F-001, F-002 | Add spawn-intercepting test; add spawn-error integration test | -| **P2 — Next maintenance cycle** | F-003–F-011 | Version alignment; add missing automated tests (publishConfig, cross-platform CI, gh-copilot detection, --cli override, cleanup, npm pack, detectCli unit) | -| **P3 — Doc cleanup** | F-012–F-021 | Mark GAP-010 resolved; add design notes for D1s; fix usage hint or acceptance criterion; add missing assertions | - ---- - -## Open Questions - -1. **F-015**: Is `"Run promptkit to start an interactive session"` acceptable as "or equivalent" for `promptkit interactive`? → determines doc fix vs code fix. -2. **F-003**: What changed in v0.3, v0.4, v0.5? Any unreflected spec changes, or purely publish version bumps? -3. **F-005**: Is a CI platform matrix already configured (e.g., `.github/workflows/`)? Not in audit scope. -4. **F-009**: Is there an external build validation step covering TC-CLI-100/121 outside `npm test`? From 8359da0fdb395be008a3b2abaecc25329c0ef00c Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 15:27:16 -0600 Subject: [PATCH 10/12] Address PR #254 third-pass review feedback (5 threads) - Soften voice-fidelity self-check paraphrases in template, manifest, and review-pull-request to reflect the protocol's conditional rules (avoid restating the rules to prevent future drift) - Switch ADO/GitHub status tables in pr-comment-responses format to API enum literals for consistency with surrounding text - Add human-voice-fidelity reference to SKILL.md draft-reply step Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 8 +++++++- formats/pr-comment-responses.md | 20 +++++++++---------- manifest.yaml | 7 ++++--- templates/respond-to-pr-comments.md | 7 ++++--- templates/review-pull-request.md | 12 ++++++----- 5 files changed, 32 insertions(+), 22 deletions(-) diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index 57251a3..7cea964 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -283,7 +283,13 @@ For each thread, produce: of surrounding context. - A **draft reply** when applicable — professional, concise, and technical. Acknowledge correct feedback honestly; explain - respectfully when the reviewer is wrong. + respectfully when the reviewer is wrong. Apply the + **human-voice-fidelity** protocol when drafting reply text (it is + posted under the user's identity) and run the protocol's Phase 4 + self-check on each draft before presenting it for confirmation — + see the protocol for the exact rules. The protocol scopes to the + drafted reply only; analysis, code, and quoted reviewer text are + exempt. ### Step 6: Present Plan diff --git a/formats/pr-comment-responses.md b/formats/pr-comment-responses.md index dd951cd..6572dfc 100644 --- a/formats/pr-comment-responses.md +++ b/formats/pr-comment-responses.md @@ -31,9 +31,9 @@ native status vocabulary**. Do not normalize statuses across platforms. | State | Count | Description | |-------|-------|-------------| -| **Pending** | N | Unresolved threads requiring response | -| **Outdated** | N | Threads on code that has since changed | -| **Resolved** | N | Already resolved — skipped unless user requests | +| `pending` | N | Unresolved threads requiring response | +| `outdated` | N | Threads on code that has since changed | +| `resolved` | N | Already resolved — skipped unless user requests | **Azure DevOps** uses five primary statuses (`active`, `pending`, `fixed`, `wontFix`, `closed`) plus the edge values `byDesign` and @@ -42,13 +42,13 @@ native status vocabulary**. Do not normalize statuses across platforms. | State | Count | Description | |-------|-------|-------------| -| **Active** | N | New / open threads requiring response | -| **Pending** | N | Author marked awaiting something — flag for user | -| **Fixed** | N | Issue addressed — skipped unless user requests | -| **Won't fix** | N | Noted but won't be fixed — skipped unless user requests | -| **Closed** | N | Discussion closed — skipped unless user requests | -| **By design** / **Unknown** | N | Already triaged — skipped unless user requests | -| **Potentially outdated** | N | Thread tracked from older iteration / location no longer exists | +| `active` | N | New / open threads requiring response | +| `pending` | N | Author marked awaiting something — flag for user | +| `fixed` | N | Issue addressed — skipped unless user requests | +| `wontFix` | N | Noted but won't be fixed — skipped unless user requests | +| `closed` | N | Discussion closed — skipped unless user requests | +| `byDesign` / `unknown` | N | Already triaged — skipped unless user requests | +| _Potentially outdated_ | N | Derived (not an API status); thread tracked from older iteration or location no longer exists | - **Total threads**: count - **Actionable threads**: count (the platform's "needs response" states, diff --git a/manifest.yaml b/manifest.yaml index 264c755..93c2c0b 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -179,9 +179,10 @@ protocols: Preserve the user's communication style when drafting externally visible text on their behalf (PR replies, issue comments, emails). Pluggable voice sources, calibrated style extraction, and a - per-output self-check that bans em-dashes and AI-tell phrases in - drafted replies. Scoped to user-authored text only; does not - affect analysis, code, or quoted content. + per-output self-check that flags AI-tell patterns (em-dashes, + formulaic phrases, etc.) — conditionally permitted when present + in the user's own samples. Scoped to user-authored text only; + does not affect analysis, code, or quoted content. analysis: - name: memory-safety-c diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index 67819ea..b22dcd3 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -359,9 +359,10 @@ For each thread, produce: - **Explanation** (if applicable): A draft reply to the reviewer that explains the design decision, tradeoff, or rationale. Apply the **human-voice-fidelity** protocol to this drafted text — match - the user's observed style and run the self-check pass (no em-dash, - no AI-tell phrases) before presenting. The voice protocol scopes - to the drafted reply only; surrounding analysis, code, and quoted + the user's observed style and run the protocol's Phase 4 self-check + before presenting (see the protocol for the exact rules; do not + restate them here to avoid drift). The voice protocol scopes to + the drafted reply only; surrounding analysis, code, and quoted reviewer text are exempt. ### Phase 5: Output diff --git a/templates/review-pull-request.md b/templates/review-pull-request.md index 10bda1c..2f034f6 100644 --- a/templates/review-pull-request.md +++ b/templates/review-pull-request.md @@ -177,11 +177,13 @@ PR review concepts to report sections: - Post this comment? (yes / skip / edit) Apply the **human-voice-fidelity** protocol when drafting the - inline comment `body` and the overall review summary `body` — these - are posted to GitHub under the user's identity. Run the self-check - pass (no em-dash, no AI-tell phrases) on each drafted comment - before presenting. The protocol scopes to drafted prose only; - code references, file paths, and quoted reviewer text are exempt. + inline comment `body` and the overall review summary `body` — + these are posted to GitHub under the user's identity. Run the + protocol's Phase 4 self-check on each drafted comment before + presenting (see the protocol for the exact rules; do not restate + them here to avoid drift). The protocol scopes to drafted prose + only; code references, file paths, and quoted reviewer text are + exempt. 3. **Post confirmed findings** as inline review comments using a JSON payload file so `comments` is sent as an array, not a string: ``` From 6e3914bb1566725873d22629e82f733f3420faf0 Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Mon, 4 May 2026 17:18:34 -0600 Subject: [PATCH 11/12] fix(respond-to-pr-comments): warn against shell escape sequences in reply text Serialization to JSON may preserve shell-specific character escape sequences literally rather than the intended Unicode character. Add guidance to use literal UTF-8 characters directly in reply text. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- templates/respond-to-pr-comments.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index b22dcd3..07bc882 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -423,6 +423,12 @@ differ. and POST. Always use a temp file (not an inlined `--body '...'` string): real reply text contains apostrophes, newlines, and backslashes that break shell quoting in both bash and PowerShell. + Use **literal UTF-8 characters** in reply text — do NOT use + shell-specific character escape sequences (e.g., PowerShell + `$([char]0x...)`, bash `$'\u...'`) to produce Unicode characters. + Serialization to JSON may preserve the escape syntax literally + rather than the intended character. Write `§`, `—`, `→` etc. + directly. ```json { "content": "", From 727ef67b46c21a9735d0ae662df1be4de09d885b Mon Sep 17 00:00:00 2001 From: Andrew Beltrano Date: Tue, 5 May 2026 11:15:44 -0600 Subject: [PATCH 12/12] Address PR #254 fourth-pass review feedback (6 threads) - Phase 1: make `ado:` prefix an explicit platform override that bypasses remote inspection; carry only the numeric `prId` - Phase 2 GitHub: reference concrete GraphQL fields (`isResolved`/`isOutdated`) instead of a non-existent `state` field - Phase 2 ADO: reuse Phase 1 coordinates instead of re-parsing the URL - Format: reframe GitHub status table as derived workflow labels (`open`/`outdated`/`resolved`); cascade `pending`->`open` in skill - Format: standardize per-thread placeholder examples to lowercase code literals matching the Thread Summary tables - SKILL.md: mirror the `ado:` prefix handling Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../skills/respond-to-pr-comments/SKILL.md | 24 +++++--- formats/pr-comment-responses.md | 21 ++++--- templates/respond-to-pr-comments.md | 57 ++++++++++++------- 3 files changed, 66 insertions(+), 36 deletions(-) diff --git a/.github/skills/respond-to-pr-comments/SKILL.md b/.github/skills/respond-to-pr-comments/SKILL.md index 7cea964..cbbd3bb 100644 --- a/.github/skills/respond-to-pr-comments/SKILL.md +++ b/.github/skills/respond-to-pr-comments/SKILL.md @@ -49,14 +49,18 @@ statuses to GitHub terms or vice versa. ### Step 1: Detect Platform -1. Parse PR URL: `github.com/...` → **GitHub**; +1. **Explicit prefix override first.** `ado:` (e.g., `ado:123`) + → unambiguous **ADO**. Strip the `ado:` prefix; carry the numeric + `prId` only — never the literal `ado:` string. Skip remote + inspection in step 3. +2. Parse PR URL: `github.com/...` → **GitHub**; `dev.azure.com/{org}/{project}/_git/{repo}/pullrequest/{n}` or `{org}.visualstudio.com/...` → **ADO**. -2. Else inspect `git remote -v` (handle SSH: `git@github.com`, +3. Else inspect `git remote -v` (handle SSH: `git@github.com`, `git@ssh.dev.azure.com:v3/...`, `{org}@vs-ssh.visualstudio.com:v3/...`). Prefer current branch's upstream when multiple remotes exist. -3. Still ambiguous → ask the user. Do NOT guess. -4. ADO Server / on-prem / TFS host → stop with a clear message. +4. Still ambiguous → ask the user. Do NOT guess. +5. ADO Server / on-prem / TFS host → stop with a clear message. #### Resolve connection coordinates @@ -72,8 +76,11 @@ Source them as follows: already-encoded segments from the original URL). Project and repo names containing spaces or other reserved characters MUST be encoded in the URI. -- **From a bare id** (`#42` for GitHub, `123` or `ado:123` for ADO) — - derive the rest from the selected upstream remote. Recognise: +- **From a bare id** (`#42` or `42` for GitHub, `123` or `ado:123` + for ADO) — derive the rest from the selected upstream remote. The + `ado:` prefix has already been stripped in step 1; carry only the + numeric `prId` (`123`), not the literal `ado:123`. Strip a leading + `#` from GitHub ids similarly. Recognise: - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` - GitHub SSH: `git@github.com:{owner}/{repo}(.git)?` - ADO HTTPS: `https://dev.azure.com/{org}/{project}/_git/{repo}` @@ -137,7 +144,8 @@ For each thread, record: - `thread_id`: the GraphQL `id` (required for `resolveReviewThread`) - Reviewer handle(s) - File path and line number -- Thread state: **pending** (unresolved + not outdated), +- Workflow classification (derived from `isResolved` / `isOutdated`, + not a single API field): **open** (unresolved + not outdated), **outdated** (code has changed), or **resolved** - Full comment text and replies @@ -218,7 +226,7 @@ For each comment within the thread, record: #### GitHub Skip `resolved` (count). Flag `outdated` — ask before processing. -Group `pending` by file path. +Group `open` by file path. #### Azure DevOps diff --git a/formats/pr-comment-responses.md b/formats/pr-comment-responses.md index 6572dfc..b3a402c 100644 --- a/formats/pr-comment-responses.md +++ b/formats/pr-comment-responses.md @@ -27,13 +27,16 @@ threads. The format adapts based on `output_mode`: Summarize all review threads by state, using the **source platform's native status vocabulary**. Do not normalize statuses across platforms. -**GitHub** uses three states: `pending`, `outdated`, `resolved`. +**GitHub** review threads expose two boolean flags — `isResolved` and +`isOutdated` — and have no single API status field. This format +groups them into three workflow classification labels (the labels +below are template names, not GitHub API literals): -| State | Count | Description | +| Label | Count | Description | |-------|-------|-------------| -| `pending` | N | Unresolved threads requiring response | -| `outdated` | N | Threads on code that has since changed | -| `resolved` | N | Already resolved — skipped unless user requests | +| `open` | N | Unresolved threads requiring response (`isResolved: false`, `isOutdated: false`) | +| `outdated` | N | Threads on code that has since changed (`isOutdated: true`) | +| `resolved` | N | Already resolved (`isResolved: true`) — skipped unless user requests | **Azure DevOps** uses five primary statuses (`active`, `pending`, `fixed`, `wontFix`, `closed`) plus the edge values `byDesign` and @@ -86,9 +89,11 @@ For each actionable thread, in file order: #### Thread T-: : - **Reviewer**: @handle -- **Thread State**: +- **Thread State**: - **Comment Summary**: <1–2 sentence summary of the reviewer's point> - **Response Type**: Fix / Explain / Both - **Analysis**: diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md index 07bc882..427cbe9 100644 --- a/templates/respond-to-pr-comments.md +++ b/templates/respond-to-pr-comments.md @@ -59,7 +59,16 @@ output — do NOT translate statuses across platforms. Determine whether `pr_reference` points to a GitHub or Azure DevOps Services pull request. Use this resolution order: -1. **Parse the URL**: +1. **Recognise explicit platform prefixes first.** Before any URL or + remote inspection, if `pr_reference` matches `ado:` (e.g., + `ado:123`), treat that as an unambiguous **Azure DevOps Services** + override — record the platform as ADO, strip the `ado:` prefix to + yield the numeric `prId`, and skip step 3 (remote inspection). + Coordinates still need to be resolved from the upstream remote in + step 6; carry only the numeric `prId`, never the literal + `ado:` string. + +2. **Parse the URL**: - `github.com///pull/` → **GitHub** - `dev.azure.com///_git//pullrequest/` → **Azure DevOps Services** @@ -67,19 +76,20 @@ Services pull request. Use this resolution order: **Azure DevOps Services** (legacy host) - URL-decode `` and `` segments before using them. -2. **If `pr_reference` is not a URL** (e.g., bare `#42` for GitHub or - `123` / `ado:123` for ADO), - inspect `git remote -v` in the current working directory: +3. **If `pr_reference` is not a URL and has no `ado:` prefix** (e.g., + bare `#42` or `42` for GitHub, bare `123` for ADO when the working + tree is unambiguously an ADO clone), inspect `git remote -v` in + the current working directory: - Match HTTPS or SSH remotes: - GitHub: `github.com`, `git@github.com` - ADO: `dev.azure.com`, `git@ssh.dev.azure.com:v3/...`, `.visualstudio.com`, `@vs-ssh.visualstudio.com:v3/...` - If multiple remotes exist, prefer the upstream of the current branch. -3. **If still ambiguous**, ask the user explicitly: +4. **If still ambiguous**, ask the user explicitly: "Is this PR on GitHub or Azure DevOps Services?" Do NOT guess. -4. **Out of scope**: Azure DevOps Server (on-prem) and TFS custom +5. **Out of scope**: Azure DevOps Server (on-prem) and TFS custom hostnames are NOT supported by this template's auth path. If you detect such a host, stop and inform the user that they must run this template against a supported platform or provide their own @@ -87,7 +97,7 @@ Services pull request. Use this resolution order: Record the detected platform; the rest of the phases branch on it. -5. **Resolve connection coordinates** before any API call. The set +6. **Resolve connection coordinates** before any API call. The set of values needed depends on the platform: - **GitHub**: `owner`, `repo`, `pr_number`. - **ADO**: `org`, `project`, `repoName`, `prId` (and later `repoId`, @@ -100,8 +110,8 @@ Record the detected platform; the rest of the phases branch on it. segment** (or preserve the already-encoded segments from the original URL) so values containing spaces or other reserved characters do not produce malformed requests. - - **If `pr_reference` is a bare id** (`#42` for GitHub, `123` or - `ado:123` for ADO), derive + - **If `pr_reference` is a bare id** (`#42` or `42` for GitHub, + `123` or `ado:123` for ADO), derive `owner`/`repo` (GitHub) or `org`/`project`/`repoName` (ADO) from the selected upstream remote. Recognise these forms: - GitHub HTTPS: `https://github.com/{owner}/{repo}(.git)?` @@ -110,7 +120,11 @@ Record the detected platform; the rest of the phases branch on it. - ADO SSH: `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}` - ADO legacy HTTPS: `https://{org}.visualstudio.com/{project}/_git/{repo}` - ADO legacy SSH: `{org}@vs-ssh.visualstudio.com:v3/{org}/{project}/{repo}` - The bare id provides `pr_number` / `prId`. + The bare id provides `pr_number` (GitHub) or `prId` (ADO). When + the input was `ado:123`, the prefix has already been stripped in + step 1 — `prId` is the numeric value (`123`), never the literal + `ado:123` string. Strip a leading `#` from GitHub bare ids + similarly. - If any required field cannot be determined unambiguously, prompt the user. Do NOT invent values. @@ -125,22 +139,25 @@ Fetch all review threads from the platform. `gh api graphql` to fetch the authoritative review-thread data needed for deterministic action mode execution - For each review thread, record: - - `thread_id`: the GraphQL review thread ID (required for + - `thread_id`: the GraphQL review thread `id` (required for `resolveReviewThread`) + - `isResolved` and `isOutdated` (these are the concrete fields + GitHub exposes — there is no single `state` field; the + template derives the workflow classification `open` / + `outdated` / `resolved` from these flags) - Reviewer handle - - File path and line number - - Thread state (pending, resolved, outdated) + - File path (`path`) and line number (`line`) - Full comment text and any replies - Whether the thread is on code that still exists in the current diff - For each review comment within the thread, record: - - `comment_id`: the review comment database ID (required for + - `comment_id`: the review comment `databaseId` (required for REST `in_reply_to` when posting a reply) - Author handle - Comment body - - Use a GraphQL query via `gh api graphql` that includes each - thread's ID, state, path, and line metadata, plus each - comment's database ID, author, and body + - Use a GraphQL query via `gh api graphql` that selects each + thread's `id`, `isResolved`, `isOutdated`, `path`, and `line`, + plus each comment's `databaseId`, `author`, and `body` - **Paginate exhaustively.** Both `reviewThreads` and the inner `comments` connection paginate independently. Page outer `reviewThreads(first: 100, after: $cursor)` until @@ -174,9 +191,9 @@ omit `--resource` — without it, `az rest` may attach a token for the wrong audience and the call will fail authorization. Do NOT instruct the user to mint a Personal Access Token. -1. **Resolve the repository GUID and PR id.** Parse the PR URL: - - `org`, `project`, `repoName`, `prId` from the URL path - (URL-decode `project` and `repoName`). +1. **Resolve the repository GUID.** Use the `org`, `project`, + `repoName`, and `prId` already resolved in Phase 1 (from the URL + or from the upstream remote) — do NOT require a PR URL here. - When constructing the API URIs below, **URL-encode** `project` and `repoName` (they may contain spaces or reserved characters). `{projectEnc}` and `{repoNameEnc}` denote the encoded forms;