Skip to content

Skill pairing: scope by session_id and only drop the canonical payload#124

Merged
cboos merged 1 commit intodev/pair-skill-user-messagefrom
dev/skill-pairing-followup
Apr 24, 2026
Merged

Skill pairing: scope by session_id and only drop the canonical payload#124
cboos merged 1 commit intodev/pair-skill-user-messagefrom
dev/skill-pairing-followup

Conversation

@cboos
Copy link
Copy Markdown
Collaborator

@cboos cboos commented Apr 24, 2026

Follow-up to #121 addressing CodeRabbit's review (#121 (review)).

Superseded — fix consolidated directly onto #121's branch (dev/pair-skill-user-message) at a584d7a. This stacked PR is no longer needed.

Why this exists

CodeRabbit flagged two failure modes in _pair_skill_tool_uses that the
original commit (dc1770a) didn't defend against:

  1. Cross-session pairing collision — the lookup was keyed on
    tool_use_id alone. RenderingContext.messages spans combined
    transcripts (multiple sessions), but Anthropic tool_use ids are only
    session-unique. A stray collision would let session B's slash body
    fold into session A's Skill tool_use silently.

  2. Real error tool_results silently dropped — the original code
    dropped EVERY tool_result with the matching tool_use_id, on the
    assumption that only the canonical "Launching skill: " result
    carries that id. A real error result (is_error=True) sharing the id
    would get swallowed too.

What changes

  • Lookup keyed by (render_session_id, source_tool_use_id), not bare tool_use_id.
  • New _is_launching_skill_payload(output) helper checks ToolResultContent.content for the "Launching skill:" prefix (handles both string- and list-shaped content).
  • Tool_result removal restricted to: same session, is_error=False, payload prefix matches.

Tests (3 new)

  • test_same_tool_use_id_across_sessions_does_not_cross_pair
  • test_error_tool_result_with_same_id_is_preserved
  • test_non_launching_skill_result_with_same_id_is_preserved

All three fail on the pre-fix code and pass after.

Refs #93 / #121.

CodeRabbit on PR #121 caught two failure modes in `_pair_skill_tool_uses`
that the original commit (dc1770a) didn't defend against. Both are
session-/payload-precision concerns that surface in combined transcripts
or with malformed/error tool_results.

## #1 — Cross-session pairing collision

The lookup was keyed on `tool_use_id` alone. `RenderingContext.messages`
spans combined transcripts (multiple sessions), but Anthropic tool_use
ids are only session-unique. A stray collision would let session B's
slash body fold into session A's Skill tool_use (or vice versa) silently.

Fix: key the lookup by `(render_session_id, source_tool_use_id)` and
match the same compound on the consuming side. `render_session_id` is
fork-aware so within-session branches stay distinct too.

## #2 — Real error tool_results silently dropped

The original code dropped EVERY tool_result with the matching
`tool_use_id`, on the assumption that only the canonical
`"Launching skill: <name>"` result carries that id. A real error result
("Skill 'X' not found", `is_error=True`) would have the same id and get
swallowed too — hiding the failure from the user entirely.

Fix: only drop tool_results that are (a) in the same session,
(b) `is_error=False`, AND (c) whose payload starts with
`"Launching skill:"`. New `_is_launching_skill_payload` helper handles
both string- and list-shaped `ToolResultContent.content`.

## Tests (3 new in TestSkillPairing)

- `test_same_tool_use_id_across_sessions_does_not_cross_pair` — combined
  transcript with two sessions sharing `tool_use_id="toolu_DUP"` but
  different bodies; asserts each Skill keeps its own session's body.
- `test_error_tool_result_with_same_id_is_preserved` — a Skill failure
  flow where the error result shares the tool_use_id; asserts the error
  survives while the canonical "Launching skill:" result is still dropped.
- `test_non_launching_skill_result_with_same_id_is_preserved` — divergent
  payload sharing the id (e.g. malformed transcript) survives.

All three fail on the pre-fix code (cross-pair, swallow error, swallow
divergent payload) and pass after.

Refs #93 / #121.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 24, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5802c0b-4936-41c6-ae1c-35d9441110c6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev/skill-pairing-followup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@cboos cboos merged commit a584d7a into dev/pair-skill-user-message Apr 24, 2026
1 check passed
@cboos cboos deleted the dev/skill-pairing-followup branch April 24, 2026 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant