fix(resolve-pr-threads): cut 40% of lines to eliminate hallucination failure modes #52

Merged
JacobPEvans merged 2 commits into main from fix/resolve-pr-threads-reliability
Feb 23, 2026

@JacobPEvans JacobPEvans commented Feb 23, 2026

Summary

  • Reduces SKILL.md from 357 → 212 lines (40% cut) by removing the verbose CRITICAL section, duplicate PR context block, Special Cases section, and example blocks
  • Trims sub-agent prompt templates from ~45 lines to ~25 lines by removing redundant DO NOT items already in the parent list
  • Inlines the exact reply command in the sub-agent template instead of "For REST API details: read rest-api-patterns.md" — sub-agents no longer need to find/read another file
  • Adds "DO NOT write Python/shell scripts" to the DO NOT list and reinforces it inline in both templates
  • Step 4 now explicitly resolves threads sequentially (one at a time) to prevent cascade failures from parallel mutations
  • graphql-queries.md: 88 → 71 lines — removes duplicated Common Errors table
  • rest-api-patterns.md: 113 → 85 lines — removes When-to-Use table, duplicate Parameters table, trailing cross-reference footer
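
The line counts above can be checked with a little arithmetic. This is a minimal sketch using only the numbers reported in the summary; `percent_cut` is an illustrative helper name, not something from the PR.

```python
def percent_cut(before: int, after: int) -> float:
    """Return the percentage of lines removed, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)


# Line counts exactly as reported in the PR summary.
counts = {
    "SKILL.md": (357, 212),
    "graphql-queries.md": (88, 71),
    "rest-api-patterns.md": (113, 85),
}

for name, (before, after) in counts.items():
    print(f"{name}: {before} -> {after} ({percent_cut(before, after)}% cut)")
```

By this calculation the 40% figure in the title applies to SKILL.md; the two reference files shrink by smaller fractions.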

Root Cause

Seven fix attempts over 10 days failed because the skill was too long and too indirect. Sub-agents were told to read another file but hallucinated the path or mutation names. Parallel mutations cascaded on first failure. The verbose CRITICAL section was ignored while the actual commands got lost in noise.
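
The contrast between parallel and sequential resolution can be sketched as follows. This is an illustration only, not the skill's actual implementation: `resolve_sequentially` and `fake_resolve` are hypothetical names, and stopping at the first failure is an assumption about what "prevent cascade failures" means in practice.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def resolve_sequentially(
    thread_ids: Iterable[str],
    resolve_one: Callable[[str], bool],
) -> Tuple[List[str], Optional[str]]:
    """Resolve threads one at a time and stop at the first failure.

    Returns (resolved_ids, failed_id); failed_id is None if every call succeeded.
    """
    resolved: List[str] = []
    for thread_id in thread_ids:
        if not resolve_one(thread_id):
            # Assumption: halt here so one bad mutation cannot cascade
            # into the remaining threads.
            return resolved, thread_id
        resolved.append(thread_id)
    return resolved, None


def fake_resolve(thread_id: str) -> bool:
    """Hypothetical stand-in for the real mutation call; fails on 'T2'."""
    return thread_id != "T2"


resolved, failed = resolve_sequentially(["T1", "T2", "T3"], fake_resolve)
print(resolved, failed)
```

Because each call completes before the next begins, a failure leaves a clear record of which threads were resolved and which one needs attention.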

Test plan

  • Verify SKILL.md has "DO NOT write Python/shell scripts" in DO NOT list
  • Verify no "read rest-api-patterns.md" or "read graphql-queries.md" references in sub-agent prompts
  • Verify Step 4 says "sequentially" / "one at a time"
  • Run /resolve-pr-threads against an actual PR with unresolved threads
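
The first three items of the test plan are text checks, so they could be automated roughly as below. This is a coarse sketch under stated assumptions: it scans the whole file rather than only the sub-agent prompts or Step 4, and `check_skill_text` and the sample excerpt are hypothetical.

```python
from typing import Dict

REQUIRED_RULE = "DO NOT write Python/shell scripts"
FORBIDDEN_REFS = ("read rest-api-patterns.md", "read graphql-queries.md")
SEQUENTIAL_MARKERS = ("sequentially", "one at a time")


def check_skill_text(skill_text: str) -> Dict[str, bool]:
    """Run the three textual test-plan checks against the skill file's contents."""
    lowered = skill_text.lower()
    return {
        "has_no_scripts_rule": REQUIRED_RULE in skill_text,
        "no_indirect_file_refs": not any(ref in lowered for ref in FORBIDDEN_REFS),
        "mentions_sequential": any(marker in lowered for marker in SEQUENTIAL_MARKERS),
    }


# Hypothetical excerpt standing in for the real SKILL.md contents.
sample = """\
## DO NOT List
- DO NOT write Python/shell scripts
## Step 4
Resolve threads sequentially, one at a time.
"""

results = check_skill_text(sample)
print(results)
```

To run it against the real file, pass in the text of github-workflows/skills/resolve-pr-threads/SKILL.md instead of `sample`. The fourth item, running the skill against an actual PR, still needs a manual pass.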

🤖 Generated with Claude Code


Important

Reduces markdown file sizes and improves resolve-pr-threads skill by removing redundancies and ensuring sequential thread resolution.

  • Behavior:
    • Reduces SKILL.md from 357 to 212 lines by removing verbose and duplicate sections.
    • Trims sub-agent prompt templates from ~45 to ~25 lines by removing redundant items.
    • Inlines exact reply command in sub-agent template, removing need to read other files.
    • Adds "DO NOT write Python/shell scripts" to the DO NOT list.
    • Step 4 now resolves threads sequentially to prevent cascade failures.
  • Files:
    • graphql-queries.md: Reduced from 88 to 71 lines by removing duplicated Common Errors table.
    • rest-api-patterns.md: Reduced from 113 to 85 lines by removing redundant tables and footers.
  • Misc:
    • Ensures no references to "read rest-api-patterns.md" or "read graphql-queries.md" in sub-agent prompts.
    • Verifies Step 4 explicitly states "sequentially" or "one at a time".

This description was created by Ellipsis for f7d4926.

…failure modes

Recurring failures traced to: sub-agents told to "read rest-api-patterns.md"
without knowing the path, verbose CRITICAL sections that got ignored, parallel
mutations causing cascade failures, and Python scripts written instead of direct
gh commands.

Changes:
- SKILL.md: 357 → 212 lines. Remove CRITICAL section, Determine PR Context
  duplicate block, Special Cases section, and example blocks. Trim sub-agent
  templates from ~45 lines to ~25 lines by removing redundant DO NOT items.
  Add "DO NOT write Python/shell scripts" rule. Inline reply command directly in
  template instead of "go read rest-api-patterns.md". Step 4 now explicitly
  resolves threads sequentially (not in parallel) to prevent cascade failures.
- graphql-queries.md: 88 → 71 lines. Remove duplicated Common Errors table.
- rest-api-patterns.md: 113 → 85 lines. Remove When-to-Use table, duplicate
  Parameters table, and trailing cross-reference footer.

(claude)
Copilot AI review requested due to automatic review settings February 23, 2026 22:39

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed everything up to f7d4926 in 12 seconds.
  • Reviewed 500 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments.

Workflow ID: wflow_Ocr3jCPUWkj8TulU

@gemini-code-assist

Summary of Changes

Hello @JacobPEvans, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on refining and streamlining the documentation and operational logic for resolving pull request threads. The primary goal is to reduce cognitive load and potential 'hallucination' failure modes in automated agents by making instructions more concise, direct, and less verbose. This involves significant cuts to markdown files, simplifying sub-agent prompts, and enforcing sequential processing for critical steps to improve reliability and efficiency.

Highlights

  • Documentation Reduction: The SKILL.md file was significantly reduced by 40% (from 357 to 212 lines) by removing verbose sections like 'CRITICAL', duplicate PR context, 'Special Cases', and example blocks. Similar reductions were made in graphql-queries.md (88 to 71 lines) and rest-api-patterns.md (113 to 85 lines) by removing redundant tables and information.
  • Sub-Agent Prompt Optimization: Sub-agent prompt templates were trimmed from approximately 45 lines to 25 lines. This was achieved by removing redundant 'DO NOT' items already present in the parent list and by inlining the exact reply command, eliminating the need for sub-agents to read external documentation files.
  • Enhanced Safety and Clarity: A new 'DO NOT write Python/shell scripts' rule was added to the general 'DO NOT' list and reinforced within sub-agent templates. Additionally, Step 4 for resolving threads was explicitly updated to process threads sequentially, preventing potential cascade failures from parallel mutations.

Changelog

  • github-workflows/skills/resolve-pr-threads/SKILL.md
    • Removed verbose 'CRITICAL: GraphQL Query Format Requirements' section.
    • Removed 'Determine PR Context' section.
    • Simplified 'DO NOT List' and added a new entry: 'DO NOT write Python/shell scripts'.
    • Streamlined extraction details for fetched threads in Step 1a.
    • Clarified parallel execution notes for Step 1.
    • Simplified grouping descriptions in Step 2a and 2b.
    • Refactored sub-agent prompt templates to be more concise, removing verbose intros and external documentation references.
    • Inlined the exact REST API reply command directly into the sub-agent prompt.
    • Updated Step 4 to explicitly state that threads are resolved 'sequentially' to prevent cascade failures.
    • Removed the 'Special Cases for Sub-Agents' section.
    • Simplified the 'Output Format' description.
    • Updated the troubleshooting table for 'REST reply fails' to specify 'Invalid databaseId'.
  • github-workflows/skills/resolve-pr-threads/graphql-queries.md
    • Removed redundant instruction 'Run these three commands to get substitution values' from 'Get Context' section.
    • Removed the note 'All queries use camelCase placeholders for string substitution.' from 'Placeholder Convention' section.
    • Removed the 'Common Errors' table.
  • github-workflows/skills/resolve-pr-threads/rest-api-patterns.md
    • Removed the 'When to Use REST vs GraphQL' table.
    • Simplified the note for extracting databaseId from GraphQL response.
    • Removed the introductory sentence for 'Read Non-Thread Comments'.
    • Removed detailed extraction points for comments from 'Fetch Top-Level PR Comments Since Last Commit' section.
    • Removed the 'Parameters' table.
    • Removed the cross-reference footer to graphql-queries.md.

Activity

  • No human activity has been recorded on this pull request yet.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request significantly refactors the SKILL.md, graphql-queries.md, and rest-api-patterns.md files to improve clarity, reduce verbosity, and streamline the resolve-pr-threads skill. All original comments were kept as they align with good practices and do not contradict any of the provided rules. The changes aim to eliminate hallucination failure modes by removing redundant information, inlining critical commands, and explicitly enforcing sequential thread resolution. The sub-agent prompts have been trimmed, and a new DO NOT write Python/shell scripts rule has been added. Overall, these changes enhance the maintainability and correctness of the skill by making the instructions more direct and less prone to misinterpretation.

Copilot AI left a comment

Pull request overview

This PR streamlines the resolve-pr-threads skill documentation to reduce verbosity and remove indirect cross-references that were contributing to agent execution errors and command hallucinations.

Changes:

  • Condenses SKILL.md by removing redundant/verbose sections and shortening sub-agent prompt templates.
  • Updates the workflow to explicitly resolve threads sequentially (one mutation at a time) to reduce cascading failures.
  • Trims duplicated/auxiliary content from graphql-queries.md and rest-api-patterns.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| github-workflows/skills/resolve-pr-threads/SKILL.md | Shortens orchestrator + sub-agent prompts, adds “no scripts” rule, and documents sequential thread resolution. |
| github-workflows/skills/resolve-pr-threads/graphql-queries.md | Removes duplicated “Common Errors” content while retaining the canonical query/mutation formats. |
| github-workflows/skills/resolve-pr-threads/rest-api-patterns.md | Removes advisory/duplicate tables and tightens non-thread comment guidance. |

- Align placeholder {commentId} -> {databaseId} in rest-api-patterns.md
  to match SKILL.md and the actual GraphQL field name
- Broaden troubleshooting row: REST reply fails can be permissions too,
  not just invalid databaseId
- Clarify context inference block: explicitly state that shell variable
  values should be substituted for {placeholder} syntax in commands

(claude)
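
The placeholder convention mentioned in this commit, where concrete values are substituted for `{placeholder}` tokens before a command is run, might look like the sketch below. The endpoint template is illustrative and the values are made up; the exact command inlined in SKILL.md is not shown in this conversation, and `fill_placeholders` is a hypothetical helper.

```python
import re
from typing import Mapping

PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def fill_placeholders(template: str, values: Mapping[str, object]) -> str:
    """Replace each {name} token with its value; raise if a value is missing."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than send a literal "{name}" to the API.
            raise KeyError(f"no value supplied for placeholder {{{name}}}")
        return str(values[name])

    return PLACEHOLDER.sub(replace, template)


# Illustrative template and made-up values (assumptions, not taken from the PR).
template = "repos/{owner}/{repo}/pulls/{number}/comments/{databaseId}/replies"
filled = fill_placeholders(
    template,
    {"owner": "example-owner", "repo": "example-repo", "number": 1, "databaseId": 123456},
)
print(filled)
```

Raising on a missing value is a design choice in this sketch: an unsubstituted placeholder surfaces immediately instead of producing a confusing API error later.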
@JacobPEvans JacobPEvans merged commit 949151c into main Feb 23, 2026
5 checks passed
@JacobPEvans JacobPEvans deleted the fix/resolve-pr-threads-reliability branch February 23, 2026 23:00