Skip to content

Add submit_feedback tool with consent-gated session transcript capture (POC)#22

Draft
swarup-padhi-glean wants to merge 1 commit into
mainfrom
feedback-session-capture-poc
Draft

Add submit_feedback tool with consent-gated session transcript capture (POC)#22
swarup-padhi-glean wants to merge 1 commit into
mainfrom
feedback-session-capture-poc

Conversation

@swarup-padhi-glean

Copy link
Copy Markdown
Contributor

What

POC for capturing the Claude/Codex session transcript on a feedback event in the Glean plugin, per the #actions-working-group thread (EN-1845292) and the security review there.

A plugin-owned submit_feedback tool records a vote (up/down) + optional comment. When the admin has enabled capture and the user consents, it also captures this session's transcript — the user's prompts and every other tool/MCP server's activity, not just Glean's — so Glean can debug multi-step skill orchestrations that run inside the host.

Draft / POC. Upload to Glean is stubbed — the redacted transcript is written to local disk only (~/.glean/feedback/, 0600). No backend/gateway changes. Not for production rollout: still needs Legal sign-off, production-grade (Presidio) redaction, geo-based opt-out defaults, and a feature-security review. Transcript data is Restricted customer data with mixed third-party provenance.

Sunil's 4 requirements → controls

# Requirement Implementation
1 Debugging justification Stated in the tool description + the consent prompt
2 Admin opt-out GLEAN_SESSION_CAPTURE env gate, default off; capture skipped with a clear message when disabled
3 User opt-out Per-event consent elicitation (fail-closed — errors/timeouts/no-elicitation-surface ⇒ no capture) + a persistent feedback-capture-optout marker file
4 Redact credentials, keep PII redact.ts best-effort scrub (PEM, JWT, OpenAI/AWS/GitHub/Slack/Google keys, Bearer/Authorization, user:pass@host, generic secret=value); PII intentionally preserved

How it works

  • Locate the transcript from the host session id (resolveSessionId()GLEAN_SESSION_ID):
    • Claude Code: ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl
    • Codex: ~/.codex/sessions/YYYY/MM/DD/rollout-<ts>-<session-id>.jsonl
    • Cursor: no session-id env var → locateTranscript returns null, capture is skipped gracefully.
  • Read with tail-truncation (most-recent activity kept; default 25 MB cap, GLEAN_FEEDBACK_MAX_BYTES).
  • Redact credentials best-effort, then write a local 0600 artifact. Upload is stubbed.

Files

  • src/redact.ts — best-effort credential scrubber (PII left intact; Presidio extension point noted).
  • src/session-transcript.tslocateTranscript() + readTranscript() (tail-truncation).
  • src/tools/submit-feedback.ts — the tool + handleSubmitFeedback().
  • tests/* — 29 new tests for the three modules.
  • src/index.ts, .mcp.json / .mcp.codex.json, three plugin manifests — registration + GLEAN_SESSION_CAPTURE + version bump.

Verification

  • tsc --noEmit clean · npm run build clean · 184/184 vitest tests pass (29 new).
  • Real-data demo against this live session's transcript: located the Claude Code transcript → cross-tool content present (mcp__conductor, Bash, Workflow, …) → planted sk-… / AKIA… / ghp_… scrubbed to «redacted:…».

Deferred (as scoped)

Actual upload + backend chat_transcript wiring · Presidio-grade redaction · geo-based opt-out defaults + Legal sign-off · Cursor support · auto/bulk scraping (separate, higher-scrutiny effort).

…e (POC)

Plugin-owned submit_feedback records a vote + optional comment, and — when
the admin enables capture and the user consents — captures this session's
transcript so Glean can debug multi-step skill orchestrations that run inside
the host.

Implements the four controls from the security review:
- Admin opt-out: GLEAN_SESSION_CAPTURE env gate, default off.
- User opt-out: per-event consent elicitation (fail-closed) + persistent
  feedback-capture-optout marker file.
- Credential redaction: best-effort regex pass (PEM, JWT, OpenAI/AWS/GitHub/
  Slack/Google keys, Bearer/Authorization, url creds, generic secret=value).
  PII intentionally preserved per the review.
- Justification surfaced in the tool description + consent text.

Transcript is located from the host session id (~/.claude/projects for Claude
Code, ~/.codex/sessions rollout files for Codex; Cursor has no session-id env
var and is skipped gracefully) and tail-truncated. Upload to Glean is STUBBED;
the redacted artifact is written locally (0600) only.

Adds GLEAN_SESSION_CAPTURE to both .mcp configs and bumps the three plugin
manifests 0.2.29 -> 0.2.30. 29 new tests; 184/184 pass, typecheck + build clean.
swarup-padhi-glean added a commit that referenced this pull request Jun 22, 2026
Three open PRs (#21/#22/#23) were stacked on 0.2.31; move this one to the
next version so it stays strictly above main and mergeable regardless of
order against a single 0.2.31 PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant