Add conversational replies to PR review feedback by derekmisler · Pull Request #64 · docker/cagent-action

derekmisler · 2026-03-02T18:33:59Z

Summary

Adds a reply-to-feedback job that responds directly in PR review threads when a developer replies to an agent comment
New pr-review-reply agent (Sonnet-powered) analyzes reply type and posts contextual responses
Runs in parallel with existing capture-feedback job (sync reply + async artifact fallback)
Mirrored in self-review-pr.yml for dogfooding

Architecture

User replies to <!-- cagent-review --> comment
        ↓
┌───────────────────────────────┬────────────────────────────────────┐
│ capture-feedback (existing)   │ reply-to-feedback (NEW)            │
│ • No secrets needed           │ • Needs API keys + App token       │
│ • Saves artifact for async    │ • Fetches full thread context      │
│   learning on next review     │ • Runs pr-review-reply agent       │
│                               │ • Agent posts reply + learns       │
└───────────────────────────────┴────────────────────────────────────┘

Loop prevention (4 layers)

user.type != 'Bot' in workflow condition
Distinct markers:  vs 
in_reply_to_id always points to thread root (GitHub threading model)
5-minute timeout cap

Security

Authorization gate: OWNER/MEMBER/COLLABORATOR only
Context values passed via env: blocks (no ${{ }} in shell scripts)
Randomized GITHUB_OUTPUT heredoc delimiters (prevents delimiter poisoning)
Agent reply uses jq | gh api --input - pattern (no shell quoting issues)

Test plan

Create a test PR on this repo
Let auto-review run and post inline comments
Reply to an agent comment with a question — verify reply-to-feedback triggers and agent responds in-thread
Reply again to the agent's response — verify multi-turn works
Check Actions cache for pr-review-memory-*-reply-* entries
Verify capture-feedback still runs in parallel and saves artifacts
Verify unauthorized users (non-org members) don't trigger the reply agent

docker-agent

Review Summary

✅ Approved — No issues found in the changed code.

This PR adds conversational reply functionality to the PR review system. The implementation looks solid:

Proper authorization gates (OWNER/MEMBER/COLLABORATOR only)
Good security practices (env blocks, randomized heredoc delimiters, jq piping pattern)
Effective loop prevention (4 layers including bot detection, distinct markers, threading model, timeout)
Well-structured parallel architecture (sync reply + async artifact fallback)

The shell scripting for GitHub API interactions is correctly implemented with proper variable expansion and jq processing. Great work on the multi-turn conversation support!

derekmisler · 2026-03-02T19:16:39Z

/review

docker-agent

Review Summary

Reviewed the new conversational reply feature for PR reviews. The implementation is generally well-designed with good security considerations (randomized delimiters, auth checks, safe variable passing). One medium-severity issue was found regarding overly permissive shell access in the agent configuration.

review-pr/agents/pr-review-reply.yaml

derekmisler · 2026-03-02T19:49:37Z

/review

docker-agent

Review Summary

Found a critical issue with the thread context building logic. The GitHub webhook triggers when a developer posts a reply, but at that moment the comment may not yet be available via the API when the workflow fetches the thread context. This means the agent analyzes a stale thread without seeing the actual triggering comment.

Additional concerns include overly broad permissions, silent error handling, and fragile comment detection logic. See inline comments for details.

.github/workflows/review-pr.yml

review-pr/agents/pr-review-reply.yaml

review-pr/reply/action.yml

.github/workflows/review-pr.yml

derekmisler · 2026-03-02T20:04:55Z

/review

docker-agent

Review Summary

Reviewed the conversational PR reply feature implementation. Found one issue related to error handling in the thread context builder that could result in incomplete reply data being sent to the agent.

The overall architecture looks solid with good security practices (authorization gates, randomized delimiters, safe jq/gh api patterns). The main concern is ensuring robustness when processing reply threads.

.github/workflows/review-pr.yml

derekmisler · 2026-03-02T20:10:24Z

/review

docker-agent

Review Summary

Excellent implementation of conversational reply functionality! The code demonstrates strong security practices and robust error handling:

✅ Shell Security: Uses randomized heredoc delimiters (THREAD_CONTEXT_$(openssl rand -hex 8)) to prevent injection attacks — this is the correct approach for handling untrusted user input in heredocs.

✅ Thread Context Building: Properly handles eventual consistency by sourcing the triggering comment from the webhook payload and explicitly deduplicating it from API results.

✅ Variable Handling: Shell variable expansion in jq filters ($ROOT_ID) works correctly within double quotes — the numeric value is properly passed to jq.

✅ Authorization & Loop Prevention: Multi-layer defense (Bot type check, marker validation, OWNER/MEMBER/COLLABORATOR gate, 5-min timeout) provides robust protection.

✅ Error Handling: Uses continue-on-error: true and adds "confused" reaction on failure so developers know their reply wasn't processed.

The architecture is well-designed with sync reply + async artifact capture providing a resilient fallback. No bugs found in the changed code.

derekmisler · 2026-03-03T17:56:06Z

/review

derekmisler

test

derekmisler · 2026-03-03T18:01:14Z

/review

derekmisler · 2026-03-03T18:08:42Z

/review

derekmisler · 2026-03-03T18:22:34Z

/review

derekmisler · 2026-03-03T18:58:46Z

/review

github-actions

🟡 Code Review: Needs Attention

Summary

Found 3 medium-severity and 2 low-severity issues in the conversational reply feature implementation. The most critical issues are:

Loop failures with zero replies: Both workflow files have a bug where seq 0 -1 fails when there are no previous replies in a thread, causing thread context building to silently skip
Overly permissive agent shell commands: The pr-review-reply agent allows grep * and cat *, creating prompt injection attack surface for file content leakage

Notable Positives

✅ Good authorization gating (OWNER/MEMBER/COLLABORATOR only)
✅ Proper use of env: blocks to avoid ${{ }} injection in shell
✅ Randomized heredoc delimiters reduce collision probability
✅ Loop prevention strategy with distinct markers

Recommendations

Add if [ "$reply_count" -gt 0 ]; then guards before the seq loops
Remove grep * and cat * from agent permissions (use read_file tool instead)
Consider sanitizing user input before embedding in heredocs

Review posted by cagent-action

github-actions · 2026-03-03T19:02:42Z

.github/workflows/review-pr.yml

+            # Add earlier replies from the API (excludes the triggering comment
+            # to avoid duplication if the API already has it)
+            reply_count=$(echo "$all_comments" | jq 'length')
+            for i in $(seq 0 $((reply_count - 1))); do


MEDIUM: Loop iteration fails when reply_count is 0

The loop uses seq 0 $((reply_count - 1)) which fails when reply_count=0 (expands to seq 0 -1). The seq command exits with a non-zero status for invalid ranges, causing the loop to silently skip without executing. This means when a developer replies to the root comment (with no previous replies in the thread), the context-building logic fails to collect any thread data.

Recommendation: Add a guard condition:

if [ "$reply_count" -gt 0 ]; then for i in $(seq 0 $((reply_count - 1))); do # ... done fi

github-actions · 2026-03-03T19:02:42Z

.github/workflows/review-pr.yml

+          # Build the thread context and save as step output.
+          # Use a randomized delimiter to prevent comment body content from
+          # colliding with the GITHUB_OUTPUT heredoc terminator.
+          DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)"


LOW: Heredoc delimiter collision risk

While the heredoc delimiter uses openssl rand -hex 8 (2^64 combinations), the TRIGGER_COMMENT_BODY variable is inserted without sanitization. A malicious developer could intentionally craft a comment containing the delimiter pattern to prematurely terminate the heredoc, potentially truncating output or exposing subsequent commands.

Recommendation: Sanitize user input or use a fixed, non-guessable delimiter:

DELIM="EOF_$(uuidgen | tr -d -)_EOF"

Or escape/filter the comment body before embedding it.

github-actions · 2026-03-03T19:02:42Z

review-pr/agents/pr-review-reply.yaml

+  allow:
+    - shell:cmd=gh api repos/*/pulls/*/comments*
+    - shell:cmd=jq *
+    - shell:cmd=grep *


MEDIUM: Overly permissive shell commands enable prompt injection

The agent permissions include shell:cmd=grep * and shell:cmd=cat * (line 100-101), which allow the AI to execute arbitrary file read operations. A sophisticated prompt injection attack in a developer reply could trick the agent into running commands like grep -r SECRET_KEY / or cat .env, leaking repository structure, workflow files, or sensitive data into the PR thread.

While the agent instructions encourage safe patterns (jq -n --arg), the broad shell permissions create unnecessary attack surface.

Recommendation: Remove grep * and cat * permissions. The agent already has read_file via the filesystem tool, which provides safe, controlled file access without shell injection risk.

github-actions · 2026-03-03T19:02:42Z

.github/workflows/self-review-pr.yml

+            # Add earlier replies from the API (excludes the triggering comment
+            # to avoid duplication if the API already has it)
+            reply_count=$(echo "$all_comments" | jq 'length')
+            for i in $(seq 0 $((reply_count - 1))); do


MEDIUM: Loop iteration fails when reply_count is 0

Identical issue to review-pr.yml:420. The seq 0 $((reply_count - 1)) loop fails when reply_count=0, silently skipping thread context collection.

Recommendation: Add the same guard condition as suggested for the other workflow file.

github-actions · 2026-03-03T19:02:42Z

.github/workflows/self-review-pr.yml

+          # Build the thread context and save as step output.
+          # Use a randomized delimiter to prevent comment body content from
+          # colliding with the GITHUB_OUTPUT heredoc terminator.
+          DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)"


LOW: Heredoc delimiter collision risk

Identical issue to review-pr.yml:399. The heredoc delimiter can be intentionally injected via crafted comment body.

Recommendation: Apply the same sanitization or delimiter hardening as suggested for the other workflow file.

derekmisler · 2026-03-03T19:52:27Z

/review

derekmisler · 2026-03-03T22:50:15Z

/review

docker-agent

Review Summary

Assessment: 🟡 NEEDS_ATTENTION

This PR adds a solid conversational reply feature to PR reviews with good security controls (authorization gates, loop prevention, safe context passing). However, there are 4 confirmed issues with error handling in the thread context building logic that could cause the agent to respond incorrectly when GitHub API calls fail (e.g., rate limits, network issues, invalid IDs).

Key Issues:

Missing error handling when fetching root comments (both workflows)
Missing error handling when fetching thread replies (both workflows)
If API calls fail, malformed/incomplete context is silently passed to the agent
Could lead to incorrect responses or hallucinations

Impact: Medium severity - will manifest in production when API errors occur

Findings Summary

NOTABLE (medium severity): 4 confirmed issues
MINOR: 4 low severity findings not detailed in this review

All issues are in the thread context building bash scripts, duplicated across both workflow files.

.github/workflows/review-pr.yml

.github/workflows/self-review-pr.yml

derekmisler · 2026-03-04T02:39:46Z

/review

docker-agent

Review Summary

Assessment: 🟡 NEEDS_ATTENTION

The PR adds conversational reply functionality to PR review feedback with solid architecture:

✅ Authorization gates (OWNER/MEMBER/COLLABORATOR only)
✅ Loop prevention via distinct markers and user type checks
✅ Security-conscious shell scripting (env blocks, randomized delimiters, jq piping)
✅ Dual-track approach (sync reply + async artifact fallback)

Found 1 issue requiring attention:

Medium severity: Insufficient API response validation in the 'Check if reply is to agent comment' step

The heredoc delimiter randomization and error handling patterns are well-thought-out. The one concern is that field extractions from the GitHub API response could silently produce empty values if the API returns unexpected JSON structure (even after the user type check passes).

.github/workflows/review-pr.yml

derekmisler · 2026-03-04T15:37:39Z

/review

docker-agent

Review Summary

Assessment: 🟡 NEEDS_ATTENTION

This PR adds conversational reply functionality with generally solid security practices (authorization gates, randomized delimiters, JSON construction via jq). However, there are 2 confirmed medium-severity issues that should be addressed:

Bot replies not filtered from thread context — The agent could read and respond to its own previous replies, causing confusion in multi-turn conversations
Potential heredoc injection risk — While mitigated by randomized delimiters, safer quoting strategies exist

Findings

🟡 2 MEDIUM severity issues
🟢 2 LOW severity issues (downgraded after verification)

The loop prevention mechanisms (marker distinction, job-level bot filter) are well-designed, but the thread context building needs bot filtering to fully prevent self-responses.

docker-agent · 2026-03-04T15:40:13Z

.github/workflows/review-pr.yml

+            # to avoid duplication if the API already has it)
+            reply_count=$(echo "$all_comments" | jq 'length')
+            if [ "$reply_count" -gt 0 ]; then
+              for i in $(seq 0 $((reply_count - 1))); do


🟡 MEDIUM: Bot replies not filtered from thread context, could cause agent to respond to itself

The workflow checks github.event.comment.user.type != 'Bot' at the job level, which correctly filters the triggering comment. However, when building thread context, the code iterates through all_comments and includes all of them without filtering bot comments:

for i in $(seq 0 $((reply_count - 1))); do comment_id=$(echo "$all_comments" | jq -r ".[$i].id") || continue # ... adds comment to thread context without checking user.type done

If the agent (or any bot) has previously replied in the thread, those bot replies will be included in the thread context. This could cause the agent to read and respond to its own previous replies, creating confusion in multi-turn conversations.

Suggested fix: Filter bot comments when building the thread context:

user_type=$(echo "$all_comments" | jq -r ".[$i].user.type") || continue if [ "$user_type" = "Bot" ]; then continue fi

docker-agent · 2026-03-04T15:40:13Z

.github/workflows/self-review-pr.yml

+            # to avoid duplication if the API already has it)
+            reply_count=$(echo "$all_comments" | jq 'length')
+            if [ "$reply_count" -gt 0 ]; then
+              for i in $(seq 0 $((reply_count - 1))); do


🟡 MEDIUM: Bot replies not filtered from thread context, could cause agent to respond to itself

The same issue exists in the self-review workflow. The code iterates through all_comments and includes all of them in the prompt without filtering bot comments:

for i in $(seq 0 $((reply_count - 1))); do comment_id=$(echo "$all_comments" | jq -r ".[$i].id") || continue # ... adds comment to thread context without checking user.type done

If a bot has previously replied in the thread, that reply will be included in the thread context, potentially causing the agent to respond to its own previous replies.

Suggested fix: Filter bot comments when building the thread context:

user_type=$(echo "$all_comments" | jq -r ".[$i].user.type") || continue if [ "$user_type" = "Bot" ]; then continue fi

docker-agent · 2026-03-04T15:40:13Z

.github/workflows/review-pr.yml

+          # Build the thread context and save as step output.
+          # Use a randomized delimiter to prevent comment body content from
+          # colliding with the GITHUB_OUTPUT heredoc terminator.
+          DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)"


🟢 LOW: Potential heredoc injection via TRIGGER_COMMENT_BODY (severity downgraded after verification)

The code uses a randomized heredoc delimiter (DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)") to prevent injection when writing user-controlled comment bodies into GITHUB_OUTPUT. While the 16-character cryptographically random delimiter makes exploitation extremely difficult (attacker would need to predict the value in advance), safer quoting strategies exist.

Current approach (adequate):

DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)" echo "$TRIGGER_COMMENT_BODY"

More robust approach:

printf '%s\n' "$TRIGGER_COMMENT_BODY" # or use $'...' quoting

This is a defense-in-depth suggestion rather than a critical fix — the randomized delimiter is an effective practical defense.

docker-agent · 2026-03-04T15:40:13Z

.github/workflows/self-review-pr.yml

+          # Build the thread context and save as step output.
+          # Use a randomized delimiter to prevent comment body content from
+          # colliding with the GITHUB_OUTPUT heredoc terminator.
+          DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)"


🟢 LOW: Potential heredoc injection via TRIGGER_COMMENT_BODY (severity downgraded after verification)

The same heredoc injection concern exists in the self-review workflow. The randomized delimiter (DELIM="THREAD_CONTEXT_$(openssl rand -hex 8)") makes exploitation extremely difficult, but safer quoting strategies like printf '%s\n' would eliminate the theoretical risk entirely.

This is a defense-in-depth suggestion — the current implementation is adequate for practical purposes.

Signed-off-by: Derek Misler <derek.misler@docker.com>

derekmisler marked this pull request as ready for review March 2, 2026 18:36

derekmisler requested a review from a team as a code owner March 2, 2026 18:36

docker-agent bot approved these changes Mar 2, 2026

View reviewed changes

docker-agent bot reviewed Mar 2, 2026

View reviewed changes

review-pr/agents/pr-review-reply.yaml Outdated Show resolved Hide resolved

docker-agent bot requested changes Mar 2, 2026

View reviewed changes

docker-agent bot reviewed Mar 2, 2026

View reviewed changes

.github/workflows/review-pr.yml Outdated Show resolved Hide resolved

docker-agent bot previously approved these changes Mar 2, 2026

View reviewed changes

derekmisler enabled auto-merge (squash) March 3, 2026 14:35

derekmisler commented Mar 3, 2026

View reviewed changes

github-actions bot reviewed Mar 3, 2026

View reviewed changes

derekmisler force-pushed the worktree-feedback-replies branch from 8046f94 to 1e50cd9 Compare March 3, 2026 20:51

derekmisler self-assigned this Mar 3, 2026

derekmisler dismissed docker-agent[bot]’s stale review via 1e50cd9 March 3, 2026 21:22

docker-agent bot reviewed Mar 3, 2026

View reviewed changes

.github/workflows/review-pr.yml Outdated Show resolved Hide resolved

.github/workflows/review-pr.yml Show resolved Hide resolved

.github/workflows/self-review-pr.yml Outdated Show resolved Hide resolved

.github/workflows/self-review-pr.yml Show resolved Hide resolved

derekmisler force-pushed the worktree-feedback-replies branch from 1e50cd9 to e00099e Compare March 4, 2026 02:39

docker-agent bot reviewed Mar 4, 2026

View reviewed changes

.github/workflows/review-pr.yml Outdated Show resolved Hide resolved

derekmisler force-pushed the worktree-feedback-replies branch from e00099e to a2fc939 Compare March 4, 2026 02:46

docker-agent bot reviewed Mar 4, 2026

View reviewed changes

Add conversational replies to PR review feedback

b66169d

Signed-off-by: Derek Misler <derek.misler@docker.com>

derekmisler force-pushed the worktree-feedback-replies branch from a2fc939 to b66169d Compare March 4, 2026 16:02

Conversation

derekmisler commented Mar 2, 2026

Summary

Architecture

Loop prevention (4 layers)

Security

Test plan

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

derekmisler commented Mar 2, 2026

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

Uh oh!

derekmisler commented Mar 2, 2026

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

derekmisler commented Mar 2, 2026

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

Uh oh!

derekmisler commented Mar 2, 2026

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!

derekmisler left a comment

Choose a reason for hiding this comment

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

🟡 Code Review: Needs Attention

Summary

Notable Positives

Recommendations

Uh oh!

github-actions bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

derekmisler commented Mar 3, 2026

Uh oh!