Skip to content

Render user content as Markdown with raw fallback toggle#119

Merged
cboos merged 13 commits intomainfrom
dev/user-markdown
Apr 26, 2026
Merged

Render user content as Markdown with raw fallback toggle#119
cboos merged 13 commits intomainfrom
dev/user-markdown

Conversation

@cboos
Copy link
Copy Markdown
Collaborator

@cboos cboos commented Apr 19, 2026

Summary

User messages have always been escaped into <pre> blocks, which is safe but visually coarse — markdown formatting in the source (**bold**, # headings, backticks) showed up as literal asterisks and pounds. This branch renders user text through the same mistune pipeline as assistant text while preserving the raw view as a one-click fallback.

Rendering safety

  • render_user_markdown() uses escape=True so raw <script> or <iframe> tags are escaped rather than passed through. Assistant-side rendering keeps escape=False (by design, since assistants emit trusted structured HTML).
  • Every rendered output is validated by is_well_formed_html() (stdlib HTMLParser tracking a tag stack). XHTML-style self-closing tags (<br />, <hr />, <img />) are recognised as atomic — mistune emits these with hard_wrap=True.
  • If the output is ill-formed (unclosed tag, mismatched nesting), the message falls back to a bare <pre> with no toggle. Well-formed output emits both .user-md and .user-raw inside a .user-content wrapper.

Toggle UX

  • Per-message: the .user-view-toggle button flips one message's wrapper by setting data-user-view="raw" or "md".
  • Global: the floating #toggleUserView button flips body.show-raw-user, which cascades to every wrapper that has no explicit data-user-view attribute. A user who explicitly chose md on a specific message keeps that choice even if global raw is on.
  • Default view is Markdown. Default state ships with no data-user-view attribute so the global toggle actually matches default messages (an earlier iteration baked data-user-view="md" in, which broke the global toggle — fixed in 659bce8).
  • Global preference persists via localStorage["claude-code-log:user-view"].

Validation

  • 955 tests / 7 skipped across unit, snapshot, and Playwright browser suites.
  • Fixture audit: 0/102 user messages rejected by is_well_formed_html (down from 27/102 before the XHTML self-close fix in 3c6fe86).
  • pyright, ty, and ruff all clean on the branch tip.

Test plan

  • Unit tests: uv run pytest -n auto -m \"not (tui or browser)\" -v
  • Browser tests: uv run pytest -n auto -m browser -v (5 Playwright scenarios covering default load, global toggle, per-message toggle, explicit-md precedence, and localStorage persistence across reload)
  • Type checks: uv run pyright and uv run ty check
  • Lint: uv run ruff check
  • Snapshot tests: uv run pytest -n auto test/test_snapshot_html.py
  • Fixture audit: render every user message in test/test_data/ and confirm 0/102 fall back to the raw-only path

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Per-message and global floating toggles to switch user messages between rendered Markdown and raw text; global choice persists across sessions and shows active state and updated labels.
  • Behavior / Fixes

    • Well-formed Markdown now displays as rendered HTML alongside a preserved raw view; ill-formed Markdown falls back to raw-only. Per-message explicit choices override the global setting and button labels reflect the next view.
  • Tests

    • Added unit and browser tests validating dual-view rendering, precedence, persistence, HTML-validation, and edge cases.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds server-side Markdown rendering with HTML validation and a dual-view UI for user messages (rendered Markdown vs escaped raw), plus per-message and global toggles (persisted in localStorage), CSS visibility rules, and expanded unit and Playwright browser tests.

Changes

Cohort / File(s) Summary
CSS & Styling
claude_code_log/html/templates/components/global_styles.css
Adds .user-content, .user-md, .user-raw, .user-view-toggle, and .user-view-global.floating-btn rules; implements selector precedence for data-user-view and body.show-raw-user and floating button .active styling.
Frontend HTML & JS
claude_code_log/html/templates/transcript.html
Adds global floating toggle #toggleUserView, DOMContentLoaded initialization, localStorage persistence (claude-code-log:user-view), per-message toggle labels, and delegated handlers to flip data-user-view.
Server-side formatters
claude_code_log/html/user_formatters.py
format_user_text_content() now attempts render_user_markdown() + is_well_formed_html() and emits a dual-view <div class='user-content'> with .user-md, .user-raw, and a toggle when valid; falls back to legacy <pre> when rendering is ill-formed.
Rendering utils
claude_code_log/html/utils.py
Adds cached Mistune renderer _get_user_markdown_renderer, render_user_markdown(text), _VOID_HTML_TAGS, and is_well_formed_html(fragment) for fragment validation and timing instrumentation.
Markdown renderer logic
claude_code_log/markdown/renderer.py
Adds _TagProtectingMarkdownRenderer, _protect_html_tags() and cached pipeline; format_UserTextMessage now uses render_user_markdown + is_well_formed_html to choose inline-protected Markdown or fenced fallback.
Snapshots & Unit Tests
test/__snapshots__/*, test/test_ide_tags.py, test/test_user_renderer.py, test/test_message_types.py, test/test_markdown_helpers.py, test/test_markdown_rendering.py
Updates snapshots and assertions to expect dual-view output; adds tests for HTML validation, fallback on ill-formed HTML, _protect_html_tags, newline/escape edge cases, and adjusts expectations for duplicate emissions (MD + raw).
Browser E2E tests
test/test_user_view_toggle_browser.py
New Playwright test suite validating default Markdown view, global toggle persistence, per-message toggles and overrides, deterministic localStorage handling, and interactions across reloads.

Sequence Diagram(s)

sequenceDiagram
    participant Browser as Browser/Client
    participant JS as JS Toggle Handler
    participant DOM as DOM & CSS
    participant Storage as localStorage
    participant Py as Python Renderer

    rect rgba(200, 150, 150, 0.5)
    Note over Py,Browser: Server-side render during page generation
    Py->>Py: render_user_markdown(text)
    Py->>Py: is_well_formed_html(rendered)
    Py-->>Browser: emit HTML (`.user-content` with `.user-md` & `.user-raw` or fallback `<pre>`)
    end

    rect rgba(150, 200, 150, 0.5)
    Note over Browser,JS: Page init & global toggle state
    JS->>Storage: read 'claude-code-log:user-view'
    Storage-->>JS: value or null
    JS->>DOM: apply/remove `body.show-raw-user`
    JS->>DOM: update `#toggleUserView` UI
    end

    rect rgba(150, 150, 200, 0.5)
    Note over Browser,DOM: User interactions
    Browser->>JS: click `#toggleUserView` or `.user-view-toggle`
    JS->>DOM: toggle body class or `data-user-view` on `.user-content`
    JS->>Storage: persist global choice when applicable
    DOM->>DOM: CSS rules determine visible `.user-md` / `.user-raw`
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • daaain

Poem

A rabbit hopped through CSS and script,
Toggling views with a tiny skip,
Markdown blooms while raw stays near,
Per-message clicks make modes appear,
LocalStorage hums — the toggle's clear. 🐰

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: rendering user content as Markdown with a raw fallback toggle mechanism, which is the core feature across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 83.87% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev/user-markdown

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (6)
claude_code_log/html/utils.py (1)

323-383: Validator is reasonable; one permissiveness note worth being aware of.

handle_startendtag as a no-op correctly handles mistune's XHTML-style <br /> / <hr /> output, but it also silently accepts non-void self-closing tags like <div />, <span />, or <p /> as a zero-op (neither opening nor closing). Mistune won't emit those today, so this isn't a real bug — just flagging in case the underlying renderer ever changes or a new plugin is added that could emit them. If you want to tighten, you could check tag in _VOID_HTML_TAGS inside handle_startendtag and fall back to handle_starttag + handle_endtag for non-voids.

Also: because the check runs on every well-formed render, consider moving the from html.parser import HTMLParser and the _Checker class definition to module scope so the class isn't re-defined per call (minor — unlikely to matter unless you render thousands of messages per request).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@claude_code_log/html/utils.py` around lines 323 - 383, The current
is_well_formed_html treats every XHTML-style self-closing tag as a no-op which
silently accepts non-void self-closing tags (e.g., <div />) — update
_Checker.handle_startendtag to only treat tags in _VOID_HTML_TAGS as no-ops and
otherwise invoke handle_starttag + handle_endtag (or push/pop) so non-void
self-closing tags are validated; additionally move the "from html.parser import
HTMLParser" import and the _Checker class definition out of is_well_formed_html
into module scope to avoid re-defining the class on every call.
claude_code_log/html/templates/components/global_styles.css (1)

288-288: Stylelint: unnecessary quotes around SFMono-Regular.

Stylelint's font-family-name-quotes rule flagged lines 288 and 330. The identifier is a valid CSS <custom-ident> and doesn't need quoting. Same nit applies to the older .debug-toggle.floating-btn block above, so you may want to fix it repo-wide or add a stylelint exception — either is fine, but the current CI warning should be resolved.

♻️ Suggested fix (lines 288 and 330)
-    font-family: 'SFMono-Regular', Consolas, monospace;
+    font-family: SFMono-Regular, Consolas, monospace;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@claude_code_log/html/templates/components/global_styles.css` at line 288,
Remove the unnecessary quotes around the SFMono-Regular identifier in the
font-family declarations flagged by Stylelint: update the occurrences in the
global styles where you set font-family to 'SFMono-Regular', Consolas, monospace
(and the older .debug-toggle.floating-btn block) to use SFMono-Regular without
quotes; this resolves the font-family-name-quotes lint rule by treating
SFMono-Regular as a valid custom-ident rather than a quoted string.
test/__snapshots__/test_snapshot_html.ambr (2)

6059-6068: Per-message toggle doesn't account for global state.

When the global toggle is active (showing raw) and a user clicks the per-message toggle on a message without an explicit data-user-view attribute, the code assumes the current view is 'md' and switches to 'raw', but the message is already showing raw due to the global toggle. This means the first click produces no visual change—only the button text updates. The user must click a second time to see the expected switch to Markdown.

The current detection logic defaults to 'md' when no attribute is present, but should also check the global state:

♻️ Proposed fix to account for global state
 // Delegated per-message toggle.
 document.addEventListener('click', function (event) {
     const btn = event.target.closest('.user-view-toggle');
     if (!btn) return;
     const container = btn.closest('.user-content');
     if (!container) return;
-    const current = container.getAttribute('data-user-view') || 'md';
+    const current = container.getAttribute('data-user-view') || 
+                    (document.body.classList.contains('show-raw-user') ? 'raw' : 'md');
     const next = current === 'md' ? 'raw' : 'md';
     container.setAttribute('data-user-view', next);
     btn.textContent = next === 'md' ? 'raw' : 'md';
 });

Also applies to: 11512-11521, 16991-17033, 22345-22387

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/__snapshots__/test_snapshot_html.ambr` around lines 6059 - 6068, The
per-message toggle handler (the document.addEventListener('click'...) that
queries .user-view-toggle and .user-content and uses
container.getAttribute('data-user-view') to compute current/next) currently
defaults missing per-message state to 'md' and ignores the global state; change
the logic so current = container.getAttribute('data-user-view') ||
document.documentElement.getAttribute('data-user-view') || 'md' (or read the
equivalent global toggle attribute/place your app uses for global view), then
compute next as before and set container.setAttribute('data-user-view', next)
and btn.textContent accordingly so the first click respects the global toggle.

6037-6043: Minor: Inconsistent button label conventions.

The global toggle button displays the current state ('raw' when showing raw, 'md' when showing markdown), while per-message toggles display the next action ('raw' when currently showing markdown). This inconsistency may cause confusion about what the labels indicate.

Consider standardizing to one convention—either both show current state or both show next action.

Also applies to: 6067-6067

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/__snapshots__/test_snapshot_html.ambr` around lines 6037 - 6043, The
user/global toggle uses the current-state labels while per-message toggles use
next-action labels; standardize them by updating applyUserViewButtonLabel to
match the per-message convention (or vice versa). In applyUserViewButtonLabel,
change userViewButton.textContent and userViewButton.title to present the next
action (e.g., textContent = showingRaw ? 'md' : 'raw' and title = showingRaw ?
'Show user messages as Markdown' : 'Show user messages as raw text' swapped
accordingly), and ensure the active class logic
(userViewButton.classList.toggle('active', ...)) remains consistent with the
chosen convention; also audit the per-message label function (e.g.,
applyMessageViewButtonLabel or messageViewButton handlers) to make both use the
same "next action" wording.
test/test_user_view_toggle_browser.py (1)

79-88: Prefer Path.as_uri() for cross-platform file URLs.

f"file://{html}" works on POSIX where temp paths start with /, but on Windows a path like C:\Users\...\foo.html would produce file://C:\Users\... which is not a valid file URI. Path.as_uri() handles both platforms correctly.

♻️ Proposed refactor
-        page.goto(f"file://{html}")
+        page.goto(html.as_uri())
         page.evaluate("() => localStorage.clear()")
         page.reload()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/test_user_view_toggle_browser.py` around lines 79 - 88, In _goto_clean,
replace the manual f"file://{html}" URL with the platform-safe file URI from
Path.as_uri() so the page.goto call uses html.as_uri() (or equivalent) to
produce a valid file:// URL on Windows and POSIX; update the call site in the
_goto_clean function where page.goto is invoked to use the Path.as_uri() value
before calling page.evaluate and page.reload.
test/test_markdown_rendering.py (1)

113-118: Fragile string-equality assertions on rendered HTML.

"<pre class='user-raw'># A rendered heading" and the raw-asterisks check both assume the renderer emits single-quoted attributes and exact substring layout. If the template (or any future HTML-escape pass) switches to double quotes or inserts whitespace/attributes before the text, these assertions break without signalling a real regression. Consider asserting on DOM structure (e.g. parse with html.parser/BeautifulSoup and check .user-raw text content) or at least matching both quote styles.

Non-blocking — flagging because this pattern is replicated across several new tests (also lines 139-141, 188-190).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/test_markdown_rendering.py` around lines 113 - 118, The brittle
substring assertions that expect exact attribute quoting should be replaced with
DOM checks: parse the rendered HTML with html.parser or BeautifulSoup, locate
the <pre> element with class "user-raw" (e.g. find('pre', class_='user-raw')),
and assert its .text contains the raw content "# A rendered heading" and "**And
some bold text**" instead of relying on literal quote/spacing; apply the same
change to the other similar assertions (the other raw-view checks) so they
verify element existence and text content rather than exact attribute string
formatting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@claude_code_log/html/templates/components/global_styles.css`:
- Around line 282-305: The per-message toggle (.user-content .user-view-toggle)
is hidden via opacity:0 and only shown on parent :hover, making it invisible to
keyboard users; add keyboard-visible focus styles instead of relying on :hover
by adding :focus and/or :focus-visible selectors (e.g., .user-content
.user-view-toggle:focus and .user-content .user-view-toggle:focus-visible) that
set opacity:1 and a clear focus outline/box-shadow and visible color/background
so the control becomes apparent when tabbed to; ensure these rules coexist with
the existing .user-content:hover .user-view-toggle so mouse hover and keyboard
focus both reveal the button.

In `@claude_code_log/html/templates/transcript.html`:
- Around line 242-252: The click handler assumes absence of data-user-view means
'md', causing a no-op when the global raw toggle is active; change the logic in
the delegated listener (the function that reads
container.getAttribute('data-user-view') || 'md') to compute the effective
current view by checking for an explicit attribute first and otherwise deriving
it from the global state (e.g.,
document.body.classList.contains('show-raw-user') ? 'raw' : 'md'), then flip to
the opposite and update data-user-view and btn.textContent accordingly; also
update the initial-label routine (applyUserViewButtonLabel or a new helper that
runs on load) to correct server-rendered labels by setting each
.user-view-toggle.textContent to the effective opposite view when the container
has no data-user-view and global raw is active so the button accurately reflects
the action.
- Line 186: The hard-coded button label/title in the static HTML causes a flash
before JS applies the persisted state; remove the initial text and title from
the element with id="toggleUserView" (render it empty, maybe with hidden or
aria-hidden while JS initializes) and let applyUserViewButtonLabel() set both
textContent and title based on the restored localStorage value (and remove the
hidden flag after updating). Ensure applyUserViewButtonLabel() is called during
startup before the button is shown so the DOM reflects the persisted
"show-raw-user" state immediately.

In `@test/test_markdown_rendering.py`:
- Around line 204-210: The __main__ test runner omitted calling
test_user_markdown_with_newline_keeps_dual_view so that test (defined around
line 181) doesn't run when executing the file directly; add a call to
test_user_markdown_with_newline_keeps_dual_view() alongside the other test_*
calls in the if __name__ == "__main__": block so the standalone python
test_markdown_rendering.py invocation covers that regression.

---

Nitpick comments:
In `@claude_code_log/html/templates/components/global_styles.css`:
- Line 288: Remove the unnecessary quotes around the SFMono-Regular identifier
in the font-family declarations flagged by Stylelint: update the occurrences in
the global styles where you set font-family to 'SFMono-Regular', Consolas,
monospace (and the older .debug-toggle.floating-btn block) to use SFMono-Regular
without quotes; this resolves the font-family-name-quotes lint rule by treating
SFMono-Regular as a valid custom-ident rather than a quoted string.

In `@claude_code_log/html/utils.py`:
- Around line 323-383: The current is_well_formed_html treats every XHTML-style
self-closing tag as a no-op which silently accepts non-void self-closing tags
(e.g., <div />) — update _Checker.handle_startendtag to only treat tags in
_VOID_HTML_TAGS as no-ops and otherwise invoke handle_starttag + handle_endtag
(or push/pop) so non-void self-closing tags are validated; additionally move the
"from html.parser import HTMLParser" import and the _Checker class definition
out of is_well_formed_html into module scope to avoid re-defining the class on
every call.

In `@test/__snapshots__/test_snapshot_html.ambr`:
- Around line 6059-6068: The per-message toggle handler (the
document.addEventListener('click'...) that queries .user-view-toggle and
.user-content and uses container.getAttribute('data-user-view') to compute
current/next) currently defaults missing per-message state to 'md' and ignores
the global state; change the logic so current =
container.getAttribute('data-user-view') ||
document.documentElement.getAttribute('data-user-view') || 'md' (or read the
equivalent global toggle attribute/place your app uses for global view), then
compute next as before and set container.setAttribute('data-user-view', next)
and btn.textContent accordingly so the first click respects the global toggle.
- Around line 6037-6043: The user/global toggle uses the current-state labels
while per-message toggles use next-action labels; standardize them by updating
applyUserViewButtonLabel to match the per-message convention (or vice versa). In
applyUserViewButtonLabel, change userViewButton.textContent and
userViewButton.title to present the next action (e.g., textContent = showingRaw
? 'md' : 'raw' and title = showingRaw ? 'Show user messages as Markdown' : 'Show
user messages as raw text' swapped accordingly), and ensure the active class
logic (userViewButton.classList.toggle('active', ...)) remains consistent with
the chosen convention; also audit the per-message label function (e.g.,
applyMessageViewButtonLabel or messageViewButton handlers) to make both use the
same "next action" wording.

In `@test/test_markdown_rendering.py`:
- Around line 113-118: The brittle substring assertions that expect exact
attribute quoting should be replaced with DOM checks: parse the rendered HTML
with html.parser or BeautifulSoup, locate the <pre> element with class
"user-raw" (e.g. find('pre', class_='user-raw')), and assert its .text contains
the raw content "# A rendered heading" and "**And some bold text**" instead of
relying on literal quote/spacing; apply the same change to the other similar
assertions (the other raw-view checks) so they verify element existence and text
content rather than exact attribute string formatting.

In `@test/test_user_view_toggle_browser.py`:
- Around line 79-88: In _goto_clean, replace the manual f"file://{html}" URL
with the platform-safe file URI from Path.as_uri() so the page.goto call uses
html.as_uri() (or equivalent) to produce a valid file:// URL on Windows and
POSIX; update the call site in the _goto_clean function where page.goto is
invoked to use the Path.as_uri() value before calling page.evaluate and
page.reload.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a98bed0-7736-4d04-9155-287d8e7a212a

📥 Commits

Reviewing files that changed from the base of the PR and between 1dd392d and 9503f0f.

📒 Files selected for processing (10)
  • claude_code_log/html/templates/components/global_styles.css
  • claude_code_log/html/templates/transcript.html
  • claude_code_log/html/user_formatters.py
  • claude_code_log/html/utils.py
  • test/__snapshots__/test_snapshot_html.ambr
  • test/test_ide_tags.py
  • test/test_markdown_rendering.py
  • test/test_message_types.py
  • test/test_user_renderer.py
  • test/test_user_view_toggle_browser.py

Comment on lines +282 to +305
.user-content .user-view-toggle {
position: absolute;
top: 2px;
right: 2px;
padding: 1px 6px;
font-size: 0.7em;
font-family: 'SFMono-Regular', Consolas, monospace;
background: transparent;
border: 1px solid #ccc;
border-radius: 3px;
color: #888;
cursor: pointer;
opacity: 0;
transition: opacity 0.15s;
}

.user-content:hover .user-view-toggle {
opacity: 1;
}

.user-content .user-view-toggle:hover {
background-color: #f0f0f0;
color: #333;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Per-message toggle is unreachable/invisible for keyboard users.

.user-view-toggle sits at opacity: 0 and is only revealed by the parent's :hover. A keyboard user tabbing through the page will focus the button but see no visible control, and there's no :focus / :focus-visible rule to surface it. Since the button is inside every well-formed user message, this is the most-repeated focusable element in a transcript.

♻️ Suggested fix
 .user-content:hover .user-view-toggle {
     opacity: 1;
 }

+.user-content .user-view-toggle:focus,
+.user-content .user-view-toggle:focus-visible {
+    opacity: 1;
+    outline: 2px solid `#4a90e2`;
+    outline-offset: 1px;
+}
+
 .user-content .user-view-toggle:hover {
     background-color: `#f0f0f0`;
     color: `#333`;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.user-content .user-view-toggle {
position: absolute;
top: 2px;
right: 2px;
padding: 1px 6px;
font-size: 0.7em;
font-family: 'SFMono-Regular', Consolas, monospace;
background: transparent;
border: 1px solid #ccc;
border-radius: 3px;
color: #888;
cursor: pointer;
opacity: 0;
transition: opacity 0.15s;
}
.user-content:hover .user-view-toggle {
opacity: 1;
}
.user-content .user-view-toggle:hover {
background-color: #f0f0f0;
color: #333;
}
.user-content .user-view-toggle {
position: absolute;
top: 2px;
right: 2px;
padding: 1px 6px;
font-size: 0.7em;
font-family: 'SFMono-Regular', Consolas, monospace;
background: transparent;
border: 1px solid `#ccc`;
border-radius: 3px;
color: `#888`;
cursor: pointer;
opacity: 0;
transition: opacity 0.15s;
}
.user-content:hover .user-view-toggle {
opacity: 1;
}
.user-content .user-view-toggle:focus,
.user-content .user-view-toggle:focus-visible {
opacity: 1;
outline: 2px solid `#4a90e2`;
outline-offset: 1px;
}
.user-content .user-view-toggle:hover {
background-color: `#f0f0f0`;
color: `#333`;
}
🧰 Tools
🪛 Stylelint (17.7.0)

[error] 288-288: Expected no quotes around "SFMono-Regular" (font-family-name-quotes)

(font-family-name-quotes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@claude_code_log/html/templates/components/global_styles.css` around lines 282
- 305, The per-message toggle (.user-content .user-view-toggle) is hidden via
opacity:0 and only shown on parent :hover, making it invisible to keyboard
users; add keyboard-visible focus styles instead of relying on :hover by adding
:focus and/or :focus-visible selectors (e.g., .user-content
.user-view-toggle:focus and .user-content .user-view-toggle:focus-visible) that
set opacity:1 and a clear focus outline/box-shadow and visible color/background
so the control becomes apparent when tabbed to; ensure these rules coexist with
the existing .user-content:hover .user-view-toggle so mouse hover and keyboard
focus both reveal the button.

Comment thread claude_code_log/html/templates/transcript.html
Comment thread claude_code_log/html/templates/transcript.html
Comment on lines 204 to 210
if __name__ == "__main__":
test_server_side_markdown_rendering()
test_user_message_not_markdown_rendered()
test_user_message_markdown_rendered_with_raw_preserved()
test_user_ill_formed_markdown_falls_back_to_raw()
test_is_well_formed_html_unit()
test_render_user_markdown_escapes_html()
print("\n✅ All markdown rendering tests passed!")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

__main__ runner missing test_user_markdown_with_newline_keeps_dual_view.

The four other new tests are wired in, but test_user_markdown_with_newline_keeps_dual_view (line 181) is skipped when running this file directly. Not a correctness issue when running via pytest, but it makes the python test_markdown_rendering.py path under-cover the regression suite.

♻️ Proposed fix
     test_user_ill_formed_markdown_falls_back_to_raw()
     test_is_well_formed_html_unit()
+    test_user_markdown_with_newline_keeps_dual_view()
     test_render_user_markdown_escapes_html()
     print("\n✅ All markdown rendering tests passed!")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if __name__ == "__main__":
test_server_side_markdown_rendering()
test_user_message_not_markdown_rendered()
test_user_message_markdown_rendered_with_raw_preserved()
test_user_ill_formed_markdown_falls_back_to_raw()
test_is_well_formed_html_unit()
test_render_user_markdown_escapes_html()
print("\n✅ All markdown rendering tests passed!")
if __name__ == "__main__":
test_server_side_markdown_rendering()
test_user_message_markdown_rendered_with_raw_preserved()
test_user_ill_formed_markdown_falls_back_to_raw()
test_is_well_formed_html_unit()
test_user_markdown_with_newline_keeps_dual_view()
test_render_user_markdown_escapes_html()
print("\n✅ All markdown rendering tests passed!")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/test_markdown_rendering.py` around lines 204 - 210, The __main__ test
runner omitted calling test_user_markdown_with_newline_keeps_dual_view so that
test (defined around line 181) doesn't run when executing the file directly; add
a call to test_user_markdown_with_newline_keeps_dual_view() alongside the other
test_* calls in the if __name__ == "__main__": block so the standalone python
test_markdown_rendering.py invocation covers that regression.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@claude_code_log/markdown/renderer.py`:
- Around line 85-118: _protect_html_tags currently rewrites HTML tags even when
they appear inside inline code spans or indented code blocks; update it to (1)
detect and skip indented code lines (lines that start with four or more spaces
or a tab) similarly to fenced blocks, and (2) for non-fenced, non-indented lines
split the line into literal inline-code spans and non-code segments (using
backtick parsing) and apply _HTML_TAG_RE.sub(r"`\1`", ...) only to non-code
segments so constructs like `x <br> y` are preserved; keep using _FENCE_OPEN_RE
and _HTML_TAG_RE to locate fences and tags and ensure backtick-delimited
segments are left untouched.
- Around line 370-374: The code currently checks
is_well_formed_html(render_user_markdown(item.text)) and then emits the original
Markdown via _protect_html_tags(item.text), which leaves unsafe URL destinations
like javascript: or data: active; instead, parse and sanitize link/image
destinations or emit the sanitized/rendered HTML: run a sanitizer over the
rendered HTML (or rewrite Markdown AST) to remove or rewrite unsafe schemes
(allow only http, https, mailto, etc.), and replace the
parts.append(_protect_html_tags(item.text)) branch with
parts.append(sanitized_rendered_html) or
parts.append(_protect_markdown_with_sanitized_urls(item.text)) so that
link/image URLs are validated and unsafe schemes stripped/neutralized; use the
existing functions render_user_markdown, is_well_formed_html and replace
_protect_html_tags/_code_fence usage accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ced12254-a95d-47f9-a339-266dafe56b51

📥 Commits

Reviewing files that changed from the base of the PR and between 9503f0f and babc990.

📒 Files selected for processing (3)
  • claude_code_log/markdown/renderer.py
  • test/__snapshots__/test_snapshot_markdown.ambr
  • test/test_markdown_helpers.py
💤 Files with no reviewable changes (1)
  • test/snapshots/test_snapshot_markdown.ambr

Comment thread claude_code_log/markdown/renderer.py Outdated
Comment thread claude_code_log/markdown/renderer.py
cboos added a commit that referenced this pull request Apr 19, 2026
The regex-based approach mangled text the user had already taken
care to quote themselves — ``use `x <br> y` here`` would get the
``<br>`` rewritten a second time, making the situation worse for
anyone who'd already done the right thing. Handling inline-code
spans, fenced blocks, and indented code blocks via regex gets
hairy fast (coderabbit flagged this on #119).

Switch to the approach mistune already supports: subclass
``MarkdownRenderer`` (the stock re-emitter that parses Markdown
and writes it back out), override ``inline_html`` to backtick-wrap
and ``block_html`` to fence-wrap. The parser does the hard work of
distinguishing raw HTML from content inside code spans / fences /
indented blocks, so we only ever transform the tokens that would
actually let a downstream viewer interpret markup.

Net effect for the user's stated concern:
  before: ``use `x <br> y` here`` → ``use `x `<br>` y` here``  ← breaks it
  after:  ``use `x <br> y` here`` → ``use `x <br> y` here``   ← preserved

Acceptable round-trip quirks (mistune normalisation, not
regressions):
- Standalone HTML on its own line (``<script>…`` alone) becomes a
  fenced code block instead of backtick-wrapped inline. Stronger
  protection, same semantics.
- Indented HTML becomes fenced. Equivalent literal rendering.
- ``\`\`<br>\`\`` (double-backtick inline code) renders with a
  single-backtick delimiter when single suffices. Semantically
  the same inline code span.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cboos added a commit that referenced this pull request Apr 19, 2026
When the global raw toggle was active and a user clicked a
per-message toggle, the first click changed nothing visible:

1. Container has no `data-user-view` attribute (default state).
2. JS read `current = attr || 'md'`, setting `next = 'raw'`.
3. CSS `[data-user-view="raw"]` already matched the global-raw
   display → no visual change, just a button-label flip.
4. User had to click a second time to actually reach markdown.

The default-to-'md' fallback was only correct while the global
toggle was off. Fix: derive the *effective* current view from the
explicit attribute first, then fall back to the global state.

Also correct the per-message button labels on load and after
global toggles — every server-rendered button ships with text
"raw", which is wrong when the effective current view is already
raw (i.e. the user would flip TO md, not to raw). New helper
`applyPerMessageToggleLabels()` runs on init and after the
global toggle so the label always names the view the user would
switch to.

Added browser test `test_per_message_click_under_global_raw_flips_in_one_click`
as the regression guard. Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@claude_code_log/markdown/renderer.py`:
- Around line 93-99: Neutralising HTML tokens uses fixed backticks which can be
broken by backticks inside the token; update inline_html and block_html to
mirror _code_fence's adaptive delimiter approach: retrieve raw as you do now,
compute a safe delimiter with _backtick_delimiter(raw, minimum=1) for
inline_html and _backtick_delimiter(raw, minimum=3) for block_html, then wrap
the content using that computed fence instead of hard-coded backticks (keep
trimming/strip logic for block_html as before). Ensure you reference the
existing helper _backtick_delimiter and the pattern used in _code_fence when
implementing the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f324fad1-0073-42f5-8405-61cd59a8ca23

📥 Commits

Reviewing files that changed from the base of the PR and between babc990 and fc1488f.

📒 Files selected for processing (5)
  • claude_code_log/html/templates/transcript.html
  • claude_code_log/markdown/renderer.py
  • test/__snapshots__/test_snapshot_html.ambr
  • test/test_markdown_helpers.py
  • test/test_user_view_toggle_browser.py
✅ Files skipped from review due to trivial changes (1)
  • test/snapshots/test_snapshot_html.ambr
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/test_markdown_helpers.py

Comment thread claude_code_log/markdown/renderer.py Outdated
cboos added a commit that referenced this pull request Apr 19, 2026
Fixed-length delimiters (1 backtick for inline, 3 for block) can
be broken by backticks the user embedded in the HTML token itself:

  input : <span title="`">x</span>
  wrap  : `<span title="`">`x`</span>`
             ^^^^^^^^^^^^^^
  problem: the inner ` closes the inline code span early, so
           `">x` leaks back to the Markdown renderer as live text.

Same class of break for block HTML that happens to contain a
triple fence:

  input : <div>\n```\n</div>
  wrap  : ```\n<div>\n```\n</div>\n```
  problem: the inner ``` closes the outer fence, `</div>` leaks.

Mirrors the existing `_code_fence` pattern: compute an adaptive
delimiter that's one backtick longer than the longest run in the
token (or at least the minimum — 1 for inline spans, 3 for
fences). Verified that the resulting Markdown, when re-rendered,
emits the tag as entity-escaped text inside <code>, with no live
markup leakage.

Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@claude_code_log/markdown/renderer.py`:
- Around line 113-116: inline_html currently chooses a backtick wrapper using
only the token's raw content, which can merge with adjacent backticks and allow
live HTML; change inline_html (and its use of _backtick_delimiter) to emit
HTML-escaped content instead of raw when token represents inline HTML (escape
'<', '>', '&', etc.), or alternatively make the delimiter context-aware by
inspecting surrounding text via the state parameter to ensure the opening and
closing backtick runs match; implement the safer option: HTML-escape
token.get("raw") before computing/wrapping with _backtick_delimiter in
inline_html so adjacent backticks cannot create mismatched code spans.

In `@test/test_markdown_helpers.py`:
- Around line 437-441: The test is using mistune.create_markdown() which
defaults to escaping HTML, so change the renderer used in the assertions to a
permissive one to ensure escaping comes from _protect_html_tags() and not the
renderer; replace calls to mistune.create_markdown() with creating a Markdown
renderer using mistune.HTMLRenderer(escape=False) (e.g.,
mistune.create_markdown(renderer=mistune.HTMLRenderer(escape=False))) for the
checks that assert "&lt;span" and "<span title" (also update the analogous
assertions at the 449-453 block) so the test verifies the protection function
itself rather than renderer-level escaping.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5a6dc03d-2fec-4abe-8d51-9b8b05ee25a1

📥 Commits

Reviewing files that changed from the base of the PR and between fc1488f and e6f62be.

📒 Files selected for processing (2)
  • claude_code_log/markdown/renderer.py
  • test/test_markdown_helpers.py

Comment thread claude_code_log/markdown/renderer.py Outdated
Comment thread test/test_markdown_helpers.py Outdated
cboos added a commit that referenced this pull request Apr 19, 2026
Adaptive backtick delimiters still leaked the moment a user typed
a stray backtick adjacent to a tag — ``x `<br> y`` ended up as
``x ``<br>` y``, where the opening two-backtick run never matched
the closing one-backtick and ``<br>`` reached the downstream
renderer as live markup.

The earlier switch to mistune's round-trip was supposed to
sidestep this class of edge cases by leaning on the parser; the
point got lost when we then hand-built a new class of edge cases
in the wrapper itself.

Simpler and robust: feed the raw HTML token through
``html.escape`` in both ``inline_html`` and ``block_html``. The
output is plain Markdown text — no delimiter to merge with, no
surrounding-context interactions, no precedence for the user to
outsmart us on. Permissive downstream renderers (GitHub, VS Code
previews) display the tag text; strict ones show entity-encoded
text. Either way the tag never reaches the HTML output live.

Dropped ``_backtick_delimiter``, the adaptive-delim tests, and
the ``<code>``-wrapping assumptions. Added three "does not leak"
tests for the breakage cases we previously handled (stray
backtick adjacent to tag, backtick in attribute, block HTML
carrying a fence) plus covers for entity-escape semantics across
the rest of the scenarios.

Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cboos
Copy link
Copy Markdown
Collaborator Author

cboos commented Apr 19, 2026

This one might be a bit controversial... I mostly always use Markdown in my prompts, so I think it would be nice to see them rendered as such, I made it the default. But I'd also be OK with a more conservative approach (keep the status quo), as there's a global switch md/raw.

@cboos cboos requested a review from daaain April 19, 2026 21:08
@cboos cboos mentioned this pull request Apr 19, 2026
Copy link
Copy Markdown
Owner

@daaain daaain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I sometimes use Markdown too and also often paste in random stuff that might break it, but having a toggle as an escape hatch is a good solution

cboos and others added 13 commits April 26, 2026 18:44
Default user-text rendering now tries Markdown via mistune. When the
rendered HTML is well-formed (balanced tags, no parser errors),
both views are emitted so the reader can toggle between them; when
the HTML is ill-formed — strong evidence the source wasn't actually
Markdown — only the bare raw `<pre>` is emitted and the toggle is
suppressed. Markdown is the default view.

Mechanism:

- `html/utils.py`:
  - `render_user_markdown(text)` — mistune with `escape=True`. Unlike
    the shared `render_markdown` used for assistant output, user
    content must escape raw HTML so a user typing `<script>` sees
    the literal characters instead of injecting a tag.
  - `is_well_formed_html(fragment)` — walks the rendered output with
    stdlib `html.parser.HTMLParser`, tracks an open-tag stack (void
    elements excluded), and returns False on any unexpected close,
    unclosed tag at end of input, or parser exception.

- `html/user_formatters.py::format_user_text_content(text)`:
  - Well-formed: emits a `.user-content[data-user-view="md"]` wrapper
    with `.user-md` (rendered), `.user-raw` (escaped `<pre>`), and a
    small `.user-view-toggle` button.
  - Ill-formed: returns the bare `<pre>{escaped}</pre>` as before.

UI:

- Template `transcript.html` gains a floating `md`/`raw` button next
  to the existing uuid/details toggles. Clicking flips a body class
  (`show-raw-user`) and persists the choice in localStorage.
- Delegated click handler toggles `data-user-view` on individual
  `.user-content` wrappers for per-message control.
- CSS precedence: per-message `data-user-view="md"` beats the global
  class, so a user who explicitly chose Markdown on one message isn't
  surprised when they flip the global toggle.

Tests:

- `test_is_well_formed_html_unit` — balanced/self-closing/mismatched/
  unclosed coverage.
- `test_render_user_markdown_escapes_html` — raw `<script>` escaped.
- `test_user_ill_formed_markdown_falls_back_to_raw` — monkey-patches
  the renderer to emit unclosed HTML and asserts the bare `<pre>`
  fallback.
- `test_user_message_markdown_rendered_with_raw_preserved` —
  replaces the former "user messages are never markdown-rendered"
  test (directly contradicting the new policy) and asserts the
  dual view is present with raw preserved.

Updated tests that asserted the old `<pre>` shape. The queue-operation
test now expects the message text to appear twice (once per view)
rather than once.

Snapshot updates absorb the dual-view HTML + CSS/JS additions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Monk's review caught a real bug: mistune emits XHTML self-closing
syntax for void elements (``<br />``, ``<hr />``, ``<img />``) but
stdlib `HTMLParser`'s default `handle_startendtag` calls
`handle_starttag` followed by `handle_endtag`. My checker then saw the
endtag, found an empty stack (because void tags don't push), and
logged "unexpected </br>" — rejecting every user message with a
newline as ill-formed and silently demoting the dual view to bare
`<pre>`.

Override `handle_startendtag` as a no-op for tracking purposes.
`<tag />` opens and closes in a single token — neither push nor pop
is needed whether the tag is void or not.

Regression tests:
- `is_well_formed_html` unit cases now cover `<br />`, `<hr />`, and
  `<img src="x" alt="y" />`.
- `test_user_markdown_with_newline_keeps_dual_view` asserts
  `format_user_text_content("line1\nline2")` keeps the dual-view
  wrapper (the exact failure mode monk surfaced).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Monk's second review caught that every `.user-content` wrapper was
emitted with `data-user-view="md"` baked in. The global raw-toggle CSS
uses `:not([data-user-view="md"])` to skip messages the user
explicitly chose to keep as Markdown — but with the attribute present
on every untouched message, the selector never matched and the global
toggle had no visible effect.

Drop the baked attribute. The per-message toggle JS already sets
`data-user-view` after a click, and the CSS selectors interpret a
missing attribute as "untouched" — which is what we want the global
raw-view to flip. A user who explicitly locks a message to Markdown
still gets the attribute set to "md" after their first round-trip
through the toggle, so per-message overrides continue to win over
global.

Verified end-to-end with Playwright:
- Default: md visible.
- Global toggle (no prior per-message click): md hidden, raw visible.
- Global toggle again: md visible.

Updated `test_format_user_text_content` to assert the attribute is
absent on fresh output. Snapshots refreshed — purely the removed
attribute, no other structural change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the scenarios monk's review flagged as unexercised:
- Default load: markdown visible, raw hidden
- Global toggle flips untouched messages (regression guard for the
  bake-in bug where data-user-view='md' broke the global selector)
- Per-message toggle affects only that message
- Per-message explicit 'md' overrides global 'raw'
- Global choice persists across reload via localStorage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Swap the untyped lambda for unittest.mock.patch. ty (stricter than
pyright) rejected `uf.render_user_markdown = lambda ...` as an
invalid-assignment against a typed module attribute. `mock.patch`
sidesteps that check and also restores the original automatically,
so the try/finally goes away.

Reported by monk in #2208.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chromium treats every file:// URL as the same null origin, so
localStorage persists across Playwright tests in a session even
though each test gets a fresh BrowserContext. That made the test
order sensitive:
- If a prior test left 'raw' in localStorage, the next default-md
  test started in raw mode and the initial assertion failed.
- If a prior test left 'md', the reload test's click toggled
  out of raw instead of into it.

Fix: every test now goes through `_goto_clean`, which clears
localStorage and reloads once before the test body runs. Verified
deterministic across 2 sequential runs + a parallel -n auto run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wrapper no longer ships with `data-user-view='md'` — the
attribute is absent until the per-message toggle is clicked.
Reported by monk in #2220.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The HTML renderer runs user text through mistune and falls back to
raw only when the output is ill-formed; the Markdown renderer was
still wrapping every user message in an unconditional code fence.
That erased headings, bold, lists, and links from the Markdown
output for no good reason — the same well-formed-HTML gate is
just as informative about "is this actually Markdown?" regardless
of the final format.

format_UserTextMessage now:
- Tries render_user_markdown + is_well_formed_html.
- On pass, emits the user's text as raw Markdown so downstream
  viewers (GitHub, IDEs, etc.) render it naturally.
- On fail, keeps the current code-fence fallback — literal content
  is preserved when mistune can't cleanly make sense of it.

Raw HTML/XML tags in otherwise-clean text get wrapped in inline
backticks by `_protect_html_tags`, mirroring the HTML path's
`escape=True` safety posture. Without that, a permissive viewer
would interpret user-typed `<script>` or `<iframe>` as markup.
The wrapper skips fenced code blocks (tags are already literal
there) and existing inline code spans (via negative lookaround).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The regex-based approach mangled text the user had already taken
care to quote themselves — ``use `x <br> y` here`` would get the
``<br>`` rewritten a second time, making the situation worse for
anyone who'd already done the right thing. Handling inline-code
spans, fenced blocks, and indented code blocks via regex gets
hairy fast (coderabbit flagged this on #119).

Switch to the approach mistune already supports: subclass
``MarkdownRenderer`` (the stock re-emitter that parses Markdown
and writes it back out), override ``inline_html`` to backtick-wrap
and ``block_html`` to fence-wrap. The parser does the hard work of
distinguishing raw HTML from content inside code spans / fences /
indented blocks, so we only ever transform the tokens that would
actually let a downstream viewer interpret markup.

Net effect for the user's stated concern:
  before: ``use `x <br> y` here`` → ``use `x `<br>` y` here``  ← breaks it
  after:  ``use `x <br> y` here`` → ``use `x <br> y` here``   ← preserved

Acceptable round-trip quirks (mistune normalisation, not
regressions):
- Standalone HTML on its own line (``<script>…`` alone) becomes a
  fenced code block instead of backtick-wrapped inline. Stronger
  protection, same semantics.
- Indented HTML becomes fenced. Equivalent literal rendering.
- ``\`\`<br>\`\`` (double-backtick inline code) renders with a
  single-backtick delimiter when single suffices. Semantically
  the same inline code span.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the global raw toggle was active and a user clicked a
per-message toggle, the first click changed nothing visible:

1. Container has no `data-user-view` attribute (default state).
2. JS read `current = attr || 'md'`, setting `next = 'raw'`.
3. CSS `[data-user-view="raw"]` already matched the global-raw
   display → no visual change, just a button-label flip.
4. User had to click a second time to actually reach markdown.

The default-to-'md' fallback was only correct while the global
toggle was off. Fix: derive the *effective* current view from the
explicit attribute first, then fall back to the global state.

Also correct the per-message button labels on load and after
global toggles — every server-rendered button ships with text
"raw", which is wrong when the effective current view is already
raw (i.e. the user would flip TO md, not to raw). New helper
`applyPerMessageToggleLabels()` runs on init and after the
global toggle so the label always names the view the user would
switch to.

Added browser test `test_per_message_click_under_global_raw_flips_in_one_click`
as the regression guard. Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixed-length delimiters (1 backtick for inline, 3 for block) can
be broken by backticks the user embedded in the HTML token itself:

  input : <span title="`">x</span>
  wrap  : `<span title="`">`x`</span>`
             ^^^^^^^^^^^^^^
  problem: the inner ` closes the inline code span early, so
           `">x` leaks back to the Markdown renderer as live text.

Same class of break for block HTML that happens to contain a
triple fence:

  input : <div>\n```\n</div>
  wrap  : ```\n<div>\n```\n</div>\n```
  problem: the inner ``` closes the outer fence, `</div>` leaks.

Mirrors the existing `_code_fence` pattern: compute an adaptive
delimiter that's one backtick longer than the longest run in the
token (or at least the minimum — 1 for inline spans, 3 for
fences). Verified that the resulting Markdown, when re-rendered,
emits the tag as entity-escaped text inside <code>, with no live
markup leakage.

Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adaptive backtick delimiters still leaked the moment a user typed
a stray backtick adjacent to a tag — ``x `<br> y`` ended up as
``x ``<br>` y``, where the opening two-backtick run never matched
the closing one-backtick and ``<br>`` reached the downstream
renderer as live markup.

The earlier switch to mistune's round-trip was supposed to
sidestep this class of edge cases by leaning on the parser; the
point got lost when we then hand-built a new class of edge cases
in the wrapper itself.

Simpler and robust: feed the raw HTML token through
``html.escape`` in both ``inline_html`` and ``block_html``. The
output is plain Markdown text — no delimiter to merge with, no
surrounding-context interactions, no precedence for the user to
outsmart us on. Permissive downstream renderers (GitHub, VS Code
previews) display the tag text; strict ones show entity-encoded
text. Either way the tag never reaches the HTML output live.

Dropped ``_backtick_delimiter``, the adaptive-delim tests, and
the ``<code>``-wrapping assumptions. Added three "does not leak"
tests for the breakage cases we previously handled (stray
backtick adjacent to tag, backtick in attribute, block HTML
carrying a fence) plus covers for entity-escape semantics across
the rest of the scenarios.

Flagged by coderabbit on PR #119.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The teammates fixture (added in PR #122 / merged via #125) contains
user messages that the user-markdown rendering now displays
differently:

- HTML: emits a `.user-content` wrapper with markdown/raw toggle UI;
  CSS for `.user-md`, `.user-raw`, `.user-view-toggle`, and the
  per-message + body-class precedence rules.
- Markdown: drops the triple-backtick fence around user text, since
  user content is now rendered as Markdown directly.

Snapshot churn is purely additive (HTML) or the intended shape change
(Markdown). No behavioral regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cboos cboos force-pushed the dev/user-markdown branch from a8517cd to 2dc1f78 Compare April 26, 2026 16:59
@cboos cboos merged commit 9b9d3c5 into main Apr 26, 2026
11 checks passed
cboos added a commit that referenced this pull request Apr 27, 2026
Main flagged that the rendered async-agents output didn't look right
on their side. Rebasing onto current main (PR #119, user-content
markdown rendering) didn't change the rendering, but spotting the
issue did:

``TaskNotificationMessage`` is a User entry, but
``_get_message_hierarchy_level`` had no explicit branch for the
``"task_notification"`` type. It fell through to the default level
1, and since the next conversation turn is an assistant (level 2),
the assistant ended up nested as a *child* of the notification —
the notification claimed every subsequent turn as its descendant
("1 assistant + 3 tools" hanging off d-118 in the test fixture).

Conceptually the notification is more like a tool_result: it's a
delayed status update for work the previous assistant initiated, not
a new user turn that the next assistant is responding to. Place it
at level 3 explicitly:

- Pops anything ≥ 3 from the stack but keeps the level-2 spawning
  assistant on top.
- Sits as a sibling of the spawning assistant's tool_use/tool_result
  entries.
- Subsequent level-2 assistants pop the notification (≥ 2) and
  start a fresh turn — siblings of the spawning assistant, not
  descendants of the notification.

Verified on the clmail-monk fixture: d-118 (a8b740b notification)
now renders with zero descendants, and the following assistant d-119
becomes its sibling under their shared parent d-11.

Doc: ``dev-docs/FOLD_STATE_DIAGRAM.md`` level table updated to call
out ``task_notification`` alongside the other Level-3 types.

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cboos added a commit that referenced this pull request Apr 27, 2026
* async-agents: typed models + parsers for TaskOutput / TaskNotification

Phase 1 of dev/async-agents (issue #90). Adds the data layer
underneath the upcoming rendering work; nothing visible changes yet.

## What's new

- ``TaskOutputInput`` (Pydantic): typed input for the ``TaskOutput``
  polling tool — ``task_id`` / ``block`` / ``timeout``.
- ``TaskOutputResult`` (dataclass): the parsed XML-tagged tool result
  — ``retrieval_status``, ``task_id``, ``task_type``, ``status``,
  plus a flag + path captured from the ``[Truncated. Full output: …]``
  marker. The bulky ``<output>`` snapshot itself is **not** kept; the
  agent's full transcript already lands inline as a sidechain in our
  rendering, and the completion result reaches the trunk via the
  ``<task-notification>`` user entry.
- ``TaskNotificationUsage`` (dataclass): ``total_tokens`` /
  ``tool_uses`` / ``duration_ms``.
- ``TaskNotificationMessage`` (MessageContent): typed shape for the
  User entry Claude Code injects on async-agent completion.
  Mirrors ``TeammateMessage``'s data-layer shape: fields for
  ``task_id`` / ``status`` / ``summary``, the ``<result>`` body
  (markdown), the parsed ``<usage>``, and the trailing
  ``Full transcript available at:`` path.

## Parsers

- ``factories/tool_factory.py``: register
  ``TOOL_INPUT_MODELS["TaskOutput"] = TaskOutputInput`` and
  ``TOOL_OUTPUT_PARSERS["TaskOutput"] = parse_taskoutput_output``.
  The parser captures the four metadata tags + truncation marker;
  malformed payloads return ``None`` so the generic raw fallback
  keeps the visible content.
- ``factories/task_notification_factory.py`` (new): mirror of
  ``teammate_factory`` — ``has_task_notification`` for the cheap
  detector, ``create_task_notification_message`` for the typed
  payload. Single-tag fields, ``<result>`` block, ``<usage>``
  key:value lines, trailing transcript path. Returns ``None`` for
  empty / malformed payloads so the User card falls back to its
  default text rendering.
- ``factories/user_factory.create_user_message``: hook
  ``create_task_notification_message`` ahead of the default text
  path, right after the teammate detection. ``UserMessageContent``
  union extended.

## Plan / tracker

- ``work/async-agents.md`` carries the full 4-phase plan (this is
  Phase 1) plus data-shape notes verified against the test fixture
  ``d602eb5f-…/.jsonl`` from the clmail-monk session in #90.

## Verified

- All 1040 unit tests still pass; pyright + ruff clean.
- End-to-end on the real fixture: 4 ``TaskNotificationMessage`` and
  3 ``TaskOutput`` tool_use/tool_result pairs are parsed into the
  typed shapes (rendering still falls back to the generic
  formatters until Phase 2 lands).

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: HTML + Markdown rendering for TaskOutput / TaskNotification

Phase 2 of dev/async-agents (issue #90). Wires Phase 1's typed
models into the renderers; the user-visible cards now look right
(without the Phase 3 dedup / fold-into-anchor work yet).

## HTML

New ``html/async_formatter.py`` module (mirrors
``teammate_formatter`` style):

- ``format_taskoutput_input`` / ``format_taskoutput_output`` — minimal
  cards with ``<dl class="teammate-tool-card task-output-card">``. The
  result card shows ``retrieval_status`` / ``task_type`` / ``status``
  + a transcript-path hint when truncation was reported, and
  deliberately drops the bulky ``<output>`` snapshot (the agent's
  full transcript already lands inline as a sidechain).
- ``format_task_notification_content`` — metadata ``<dl
  class="task-notification-card">`` (task_id, status pill, usage
  fields, transcript path) + ``render_markdown_collapsible`` for the
  ``<result>`` body.

``HtmlRenderer`` adds the dispatch methods + the title formatters:

- ``title_TaskOutputInput`` → ``🔍 TaskOutput #<task_id>`` (the
  ``🔍`` short-circuits the template's default ``🛠️`` and reads as
  "look up / inspect" — distinct from the spawning ``🔧 Task``).
- ``title_TaskNotificationMessage`` → ``🔄 Async result • <summary>``.
- ``title_TaskInput`` extended: when ``run_in_background=True``,
  appends a muted ``[async]`` hint so the reader can tell async
  spawns from sync ones at a glance.

CSS additions in ``teammate_styles.css``:
- ``.task-async-hint`` (blue, muted) for the ``[async]`` title tag.
- ``.task-output-card`` (cyan border).
- ``.task-notification-card`` (blue border).

## Markdown

Mirrors the HTML in ``markdown/renderer.py``:

- ``format_TaskOutputInput`` / ``format_TaskOutputResult`` — terse
  ``key:value`` lines, transcript path appended.
- ``format_TaskNotificationMessage`` — bulleted metadata + a
  ``<details><summary>Result</summary>`` block carrying the result
  Markdown.
- ``title_TaskInput`` extended with ``*[async]*`` italic hint when
  ``run_in_background=True``.
- ``title_TaskOutputInput`` and ``title_TaskNotificationMessage``
  parallel the HTML titles.

## Verified on the clmail-monk fixture

Rendering the test JSONL produces:
- 8 ``[async]`` hints on Task tool_use titles (every async spawn).
- 4 ``🔄 Async result`` notifications.
- 3 ``🔍 TaskOutput`` polling cards.

All 1040 unit tests + 5 snapshot fixtures green; pyright + ruff
clean.

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: dedup notification bodies + spawn-backlink anchor

Phase 3 of dev/async-agents (issue #90).

The async-agent flow ends with a User entry whose ``<task-
notification>`` ``<result>`` body duplicates the spawning Task's
last sidechain sub-assistant — the agent's actual answer is rendered
twice (once inline as the spawn's sidechain content, once below in
the notification card). This commit collapses the second copy to a
backlink-only stub.

## What's new

- ``TaskNotificationMessage`` gains two fields:
  - ``result_is_duplicate: bool`` — set when the renderer pass
    confirms the body matches the spawning sidechain content.
  - ``spawning_task_message_index: Optional[int]`` — the message
    index of the spawning Task's tool_use, used as a backlink
    anchor for the reader to navigate to the actual spawn.

- New ``_link_async_notifications`` pass in ``renderer.py``,
  scheduled after ``_populate_task_metadata`` (post tree-build):
  - Indexes notifications by ``task_id``.
  - Walks Task/Agent tool_results, extracting the async-agent's
    ``agent_id`` via ``_async_agent_id_from_tool_result``
    (preferring ``TaskOutput.metadata.agent_id`` set by
    ``parse_agent_result_metadata``, then ``TaskOutput.agent_id``,
    then a regex fallback on the raw text).
  - For each match: wires the spawning Task's ``pair_first``
    (tool_use index) as the backlink anchor, then walks the
    tool_result's descendants in document order via
    ``_last_sidechain_assistant_text`` and compares with
    ``_normalize_for_dedup`` against the notification's
    ``result_text``. On match, flags ``result_is_duplicate``.

- ``format_task_notification_content`` (HTML) renders a ``Spawn: ↱
  Task`` row with a ``<a class="task-notification-backlink"
  href="#msg-d-N">`` anchor, and elides the markdown body when the
  flag is set.

- ``format_TaskNotificationMessage`` (Markdown) mirrors:
  ``- **Spawn:** ↱ Task `#d-N``` line + body suppression on dup.

- New ``.task-notification-backlink`` CSS rule (blue, no underline,
  underline on hover).

## Verified on the clmail-monk fixture

All 4 notifications (a8b740b / a5de609 / a9d6832 / a70b9c2) match
their spawning Task's last sidechain sub-assistant (after fixing a
walk-order bug in the descendant traversal — the naive
``stack.pop()`` after ``stack.extend(children)`` reverses
document order; corrected to push reversed). Each notification card
now ends with ``Spawn: ↱ Task`` linked to the spawn's ``msg-d-N``,
and the duplicated markdown body is gone.

Test count holds at 1065; pyright + ruff clean.

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: place task_notification at level 3 in the hierarchy

Main flagged that the rendered async-agents output didn't look right
on their side. Rebasing onto current main (PR #119, user-content
markdown rendering) didn't change the rendering, but spotting the
issue did:

``TaskNotificationMessage`` is a User entry, but
``_get_message_hierarchy_level`` had no explicit branch for the
``"task_notification"`` type. It fell through to the default level
1, and since the next conversation turn is an assistant (level 2),
the assistant ended up nested as a *child* of the notification —
the notification claimed every subsequent turn as its descendant
("1 assistant + 3 tools" hanging off d-118 in the test fixture).

Conceptually the notification is more like a tool_result: it's a
delayed status update for work the previous assistant initiated, not
a new user turn that the next assistant is responding to. Place it
at level 3 explicitly:

- Pops anything ≥ 3 from the stack but keeps the level-2 spawning
  assistant on top.
- Sits as a sibling of the spawning assistant's tool_use/tool_result
  entries.
- Subsequent level-2 assistants pop the notification (≥ 2) and
  start a fresh turn — siblings of the spawning assistant, not
  descendants of the notification.

Verified on the clmail-monk fixture: d-118 (a8b740b notification)
now renders with zero descendants, and the following assistant d-119
becomes its sibling under their shared parent d-11.

Doc: ``dev-docs/FOLD_STATE_DIAGRAM.md`` level table updated to call
out ``task_notification`` alongside the other Level-3 types.

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: fold last sub-assistant into spawning Task tool_result

Main flagged that Phase 3 only got the *notification* dedup right —
the spawning Task tool_result still showed only "Async agent
launched successfully", and the agent's actual answer remained
buried at the tail of the relocated sidechain. The reader had to
scroll past the agent's working tools (Reads, Bash) to find the
final summary.

This commit closes the gap: the agent's last sub-assistant content
folds *into* the spawning Task's tool_result rendering, with the
matching sidechain message removed so the answer appears exactly
once at the natural reading position.

## Mechanics

- ``TaskOutput`` gains ``async_final_answer: Optional[str]``.
  Populated by ``_link_async_notifications`` on every async-agent
  match; ``None`` for sync Tasks.
- ``_last_sidechain_assistant`` (replaces ``…_text``) returns
  ``(msg, parent, index)`` so the caller can both inspect the text
  and ``del parent.children[index]`` — same pattern
  ``_cleanup_sidechain_duplicates`` uses for sync Tasks.
- ``_link_async_notifications`` now does three things on each match:
  copies the answer onto ``TaskOutput.async_final_answer``, drops
  the duplicate sub-assistant from the sidechain tree, and flags
  the notification body as duplicate (with the spawn backlink).
  Three views — spawn / sidechain / notification — converge on a
  single visible copy at the spawn.
- ``format_task_output`` (HTML) renders the folded answer as a
  ``<div class="task-async-answer-label">Result <small>(from async
  notification)</small></div>`` followed by the answer in a
  ``render_markdown_collapsible`` block.
- ``format_TaskOutput`` (Markdown) appends a second
  ``<details><summary>Result (from async notification)</summary>``
  block.
- New ``.task-async-answer-label`` CSS rule.

## Verified on the clmail-monk fixture

All 4 async Task tool_results (a8b740b / a5de609 / a9d6832 /
a70b9c2) now render:

- their original "Async agent launched successfully" stub,
- followed by the "Result (from async notification)" fold,
- followed by the collapsible agent answer.

Sub-assistant count drops by 4 (the duplicates removed from the
sidechain trees). Notification cards stay backlink-only.

Refs: #90.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: stop tool_result negative-margin from yanking the async-answer fold over its label

The .tool_result .collapsible-code rule applies margin-top:-2.5em to
tuck the *first* collapsible under the tool title. Inside an async-task
fold, the second collapsible (.task-async-answer) was also caught by it
and overlapped the "Result (from async notification)" label. Reset the
margin only on .task-async-answer .collapsible-code; the first .task-
result collapsible keeps its tucked-up alignment.

Snapshots refreshed for the CSS bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: Phase 4 — tests + minimal sliced fixture

Adds a synthetic fixture sliced down from the canonical clmail-monk
session (`d602eb5f-…`) and a focused test module exercising the new
async-agents pipeline end-to-end:

- `test/test_data/async_agents/` — 7-entry main session + 3-entry
  sidechain. Covers the four pieces the renderer has to handle:
  `Task` with `run_in_background=true`, the canonical
  `Async agent launched successfully\nagentId: …` tool_result,
  a `TaskOutput` poll with `<retrieval_status>/<task_id>/<output>
  [Truncated…]` shape, and a `<task-notification>` whose `<result>`
  matches the last sub-assistant verbatim.

- `test/test_async_agents.py` — 25 tests:
  * `has_task_notification` and `create_task_notification_message`
    parser coverage (positive + edge cases: empty, missing usage,
    partial usage).
  * `parse_taskoutput_output` coverage (full payload, in-progress
    status without `<output>`, non-TaskOutput rejection).
  * Dispatch-table assertions (defensive against accidental churn).
  * Fixture loading and factory dispatch.
  * Phase 3 rendering pipeline assertions:
    - notification flagged `result_is_duplicate=True`
    - `TaskOutput.async_final_answer` populated on the spawning Task
    - duplicate sub-assistant dropped from the rendered tree
    - the agent's final answer text appears exactly once across
      the entire tree (folded under the spawn).

- Snapshot coverage in both `test_snapshot_html.py` and
  `test_snapshot_markdown.py` for the new fixture, locking in:
  the `*[async]*` hint badge on the Task title, the
  ``Async agent launched successfully`` stub, the
  ``Result (from async notification)`` fold, the TaskOutput poll
  card, and the notification-collapsed-to-backlink stub.

Verified: 1115 unit tests pass, pyright + ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: source the spawn-fold from the notification (Plan A)

The Phase 3 fold disappeared at `--detail low`: the linker walked the
sidechain looking for the last sub-assistant to fold, but
`_filter_by_detail` had already dropped sidechain entries pre-render.
With nothing to match against, the fold was skipped and the agent's
final answer was buried in the (now-collapsed) `<task-notification>`
card.

Plan A splits the pass:

- **Spawn-fold (FULL/HIGH/LOW):** when a notification's `task_id`
  matches the spawning Task tool_result's `agent_id`, fold the
  notification's `result_text` directly onto
  `TaskOutput.async_final_answer` and flag the notification
  `result_is_duplicate`. Sidechain text is no longer required for the
  fold itself — the notification body is the canonical source.

- **Sidechain dedup (FULL/HIGH only):** when the last sub-assistant
  text matches the notification's `result_text`, drop it from the
  tree. This is the only piece that needs the sidechain — at LOW the
  sidechain is gone and there's nothing to remove anyway.

- **MINIMAL/USER_ONLY:** the spawn fold is skipped (the spawning
  Task tool_result is dropped post-render — there's nothing to fold
  onto). The notification card retains its body so the agent's
  answer stays visible.

`_link_async_notifications` now takes the active `DetailLevel` so it
can decide whether the spawn target survives.

Tests:
- 5 new parametrized cases in `TestAsyncAgentsDetailLevels` cover
  FULL/HIGH/LOW (fold present + notification flagged duplicate) and
  MINIMAL/USER_ONLY (no fold + notification body kept visible).
- New `test_async_agents_fixture_html_low` snapshot locks in the
  rendered LOW shape — guard against silent regressions to the fold
  pipeline at that detail level.

Verified: 1121 unit tests pass, pyright + ruff clean. Fold count
on the canonical clmail-monk fixture: 4 at FULL/HIGH/LOW (was 0 at
LOW before the fix), 0 at MINIMAL/USER_ONLY (notification body is
the surviving copy at those levels).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: drop duplicate-flagged notifications at --detail low

At LOW the spawn-fold survives (Plan A), so the standalone
`<task-notification>` card duplicates the answer the reader is
already seeing folded under the spawning Task tool_result. FULL/HIGH
keep the card for transcript fidelity; MINIMAL/USER_ONLY don't flag
duplicates (the Task tool_result is filtered there, so the
notification body is the surviving copy).

Implementation note: main's mail #2631 suggested adding the rule
inside `_filter_template_by_detail`. That filter runs before
`_link_async_notifications` in the rendering pipeline (line 754 vs
826 in renderer.py), so `result_is_duplicate` is still False when
the filter visits each message. Instead, a small post-link pass
`_drop_duplicate_notifications_at_low` runs right after the linker
when `detail == LOW`. Survivors are remapped via the existing
`_reindex_filtered_context`; tree children are pruned so the
notification doesn't linger as a sub-message of its parent.

Tests:
- New `test_duplicate_notification_dropped_at_low` confirms the
  notification is gone from `ctx.messages` at LOW.
- New `test_notification_flagged_duplicate_at_full_and_high` keeps
  the assertions previously bundled into the LOW test (notification
  remains in ctx, flagged duplicate, with backlink wired).
- Existing detail-level parametrized cases retained.

Snapshot: `test_async_agents_fixture_html_low` regenerated to lock
in the new LOW shape — fold visible at the spawn, no notification
card.

Verified: 1123 unit tests pass, pyright + ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: register TaskNotificationMessage in CSS_CLASS_REGISTRY

The runtime message-type filter in transcript.html (line 484) hides any
message whose CSS classes don't include one of the toolbar's known
types: `user, system, assistant, thinking, tool_use, tool_result,
sidechain, image`. `TaskNotificationMessage` was missing from
`CSS_CLASS_REGISTRY`, so `css_class_from_message` fell back to bare
`msg.type` = `task_notification` — matching no toolbar type, hence
permanently flagged `filtered-hidden` even when "All filters" was
active.

Register the content type with the same shape as the other user-
variant entries (`UserSteeringMessage: ["user", "steering"]`,
`TeammateMessage: ["user", "teammate"]`, etc.). The notification's
underlying JSONL entry is a plain User message — Claude Code injects
it as `type: "user"` with the `<task-notification>` block in
`message.content` — so the User toggle controlling its visibility
is the natural mapping.

Snapshot impact: the FULL async-agents fixture's notification div
class string changes from `task_notification` to `user task_notification`
(single-line diff). The LOW snapshot is unchanged because the
duplicate notification is dropped pre-template by
`_drop_duplicate_notifications_at_low`.

Verified: 1123 unit tests pass, pyright + ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: re-pair messages after dropping duplicate notifications at LOW

`_drop_duplicate_notifications_at_low` calls `_reindex_filtered_context`
to remap message_index after removing the dropped notifications, but
that helper *clears* every message's pair_first/pair_middle/pair_last
on the assumption the caller will re-run pair identification.
`_identify_message_pairs` runs at renderer.py:775 — before
`_link_async_notifications` (826) and the drop pass (829) — so nothing
re-establishes the pairs the helper just cleared.

This broke two LOW-only behaviors:

- **Markdown LOW**: tool_use renders Instructions but not Report or
  the async-fold "Result (from async notification)" body. The
  Markdown renderer's `_render_message` only emits a tool_result
  body when its tool_use is `is_first_in_pair` (renderer.py:1417);
  without `pair_last` set, the body is dropped, and the tool_result
  has no `title_ToolResultMessage` to render itself.

- **HTML LOW**: Task tool_use and tool_result render with a visual
  gap because the `pair_first`/`pair_last` CSS classes that flush
  adjacent cards together are absent.

Fix: re-run `_identify_message_pairs(ctx.messages)` immediately
after `_reindex_filtered_context` in the drop pass. Pairs are
reconstructed from scratch via the standard two-pass algorithm.

Verified on the canonical clmail-monk fixture at LOW: 7
`pair_first`/`pair_last` HTML divs (Task tool_use ↔ tool_result),
Markdown shows Instructions + Report + "Result (from async
notification)" for every async Task. The LOW snapshot diff is
limited to two added pair classes on the synthetic fixture's Task
tool_use and tool_result.

Verified: 1123 unit tests pass, pyright + ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: drop unused type-ignore comments in test helpers

`ty check` flagged four `# type: ignore[no-untyped-def]` comments in
test/test_async_agents.py as unused — the inferred parameter types
already satisfy ty's checks. Carried over from when I sketched the
helpers without type annotations; no longer needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: document the async-task-agent flow + clarify TaskOutput naming

- New `dev-docs/agents.md` covers all three flavors of Task-spawned
  agents (sync sub-agents #79, async task agents #90, teammates #91).
  The async-agents § is the new detail: pipeline shape diagram,
  the two `TaskOutput`-named dataclasses (`TaskOutput` on the Task
  tool_result vs `TaskOutputResult` on the polling tool), the
  Phase 3 fold mechanics, the per-detail-level visibility matrix,
  key files, and the test fixture pointer.

- `dev-docs/messages.md` gains:
  - A `TaskOutput` polling-tool row in the Tool Results table and
    the Available Tools matrix (was previously absent).
  - A note on the `TaskOutput` vs `TaskOutputResult` name collision,
    forwarding to agents.md § 2.2.
  - A new "Async Task Notification" subsection under user content
    documenting `TaskNotificationMessage` (Phase 3 dedup markers
    included), forwarding to agents.md § 2 for the end-to-end flow.

- `dev-docs/teammates.md` § 10.1 ("Standard sub-agents and async
  task agents") now reflects that #90 is shipped — points readers
  at agents.md § 2 for the as-built reference. The doc's top-of-
  file companion-doc list and References block both gain agents.md.

Surfaced by user feedback on PR #132: the `TaskOutput` ↔
`TaskOutputResult` name collision is genuinely confusing; documenting
the distinction in the canonical message-type reference + a
focused agents.md should keep future readers (and Claude) on track.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: only fold + flag duplicate when fold target is a TaskOutput

CodeRabbit review on PR #132: `_link_async_notifications` set
`notification.result_is_duplicate = True` whenever a Task tool_result
matched a notification by `agent_id`, even when `content.output` was
not a parsed `TaskOutput`. `_async_agent_id_from_tool_result` has
three pathways — the third (regex fallback on raw text) supports
shapes the parser couldn't structure into a `TaskOutput`. On those
shapes we'd skip the actual fold (no `async_final_answer` field to
write into) but still suppress the notification body, silently losing
the agent's only visible answer.

Move the duplicate flag inside the `isinstance(content.output,
TaskOutput)` guard. The notification body now stays visible whenever
the fold can't land — preserving "answer visible at least once" at
every detail level.

Also rebuild `session_nav` after `_drop_duplicate_notifications_at_low`
runs at LOW. The drop pass remaps `ctx.session_first_message` indices,
but `session_nav` was built earlier with the pre-drop indices baked
into its `message_index` and `parent_message_index` fields. The
single-session canonical fixture didn't surface this (notifications
sit after the only session header, so dropping them doesn't shift the
header), but a multi-session transcript with notifications between
session headers would land nav anchors one (or more) message slot off
after the LOW reindex. Reuses the same `prepare_session_navigation`
call with the up-to-date ctx.

Verified: 1123 unit tests pass, pyright + ruff + ty clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: rename CSS modifier to task-notification, surface in timeline

CodeRabbit review on PR #132 flagged two paired issues:

- Naming convention: every other modifier in `CSS_CLASS_REGISTRY`
  uses hyphens (`system-hook`, `slash-command`, `command-output`,
  `bash-input`, `bash-output`); the actual CSS rules in
  `teammate_styles.css` also use hyphens
  (`task-notification-card`, `task-notification-backlink`). The
  registry entry I added for `TaskNotificationMessage` was the
  underscored `"task_notification"` — no `.task_notification` CSS
  rule exists, making the underscore version dead.

- Timeline misclassification: `timeline.html` only knew the seven
  toolbar types plus `teammate`/`sidechain`/etc. With both `user`
  and `task-notification` classes on the rendered div, the
  generic `.find()` returned `user` first — async notifications
  landed in the User row of the timeline rather than getting their
  own group.

Rename the modifier to `task-notification` (Python message-type
identifier `"task_notification"` in `models.py` / `renderer.py`
unchanged — that's a separate code identifier), add a
`task-notification` entry to `messageTypeGroups` in `timeline.html`,
and add a `classList.includes('task-notification')` branch ahead of
the generic `.find()` (mirroring the existing `teammate` branch, for
the same reason: both carry the `user` class). The new group is
inserted in the timeline's `groupOrder` between `teammate` and
`system`.

Snapshot: HTML async-agents fixture now renders the div as
`message user task-notification …`; LOW snapshot still drops the
duplicate-flagged notification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: ghost duplicate-flagged notifications at LOW (no reindex)

CodeRabbit review on PR #132 found a third remap cascade — after
session_nav and pair refs, `TemplateMessage.ancestry` and the
`spawning_task_message_index` / `parent_message_index` backlink
fields are also frozen at hierarchy-build time, so any reindex
after the tree is built leaves them pointing at stale slots. The
"each fix unblocks the next bug" pattern was a sign the approach
was wrong.

Switch to "ghosting": the duplicate notification stays in
`ctx.messages` with its original `message_index`. Only its
*rendered output* disappears at LOW. Implementation: gate
`format_TaskNotificationMessage` and `title_TaskNotificationMessage`
in both `HtmlRenderer` and `MarkdownRenderer` on
`self.detail == LOW and content.result_is_duplicate` → return
`""`. The rendering loop's existing "skip empty messages" elision
(HTML: `if title or html or msg.children:`; Markdown:
`_render_message` returning `""` when there's no title and no
content) drops the entry from the visible output without touching
ancestry classes, backlinks, session nav, or pair refs.

This deletes:

- `_drop_duplicate_notifications_at_low` (the survivor list +
  `_reindex_filtered_context` + `_identify_message_pairs` re-run
  + tree-children prune).
- The post-link call site at the end of `generate_template_messages`.
- The `session_nav` rebuild that the reindex required.
- The `_identify_message_pairs` re-run that the pair-clear required.

Net: -88 lines in `renderer.py`, +42 across the two formatter
gates, no behavior change in the rendered HTML/Markdown at any
detail level.

Test refresh: `test_duplicate_notification_dropped_at_low` →
`test_duplicate_notification_ghosted_at_low`. Asserts the
notification is *still* in `ctx.messages` with
`result_is_duplicate=True`, and that both `HtmlRenderer` and
`MarkdownRenderer` return `""` for its title and body when
configured at LOW.

LOW snapshot regenerated: `message_id` indices for messages after
the (now ghosted) notification stay at their original values
instead of shifting down by one. Visible rendered output is
otherwise byte-equal.

Verified: 1123 unit tests pass, pyright + ruff + ty clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* work: capture follow-up plan to replace _reindex_filtered_context with ghosting

Surfaced during the PR #132 review pass: every "drop messages and
reindex" pass that adds new index-bearing fields creates a fresh
"remember to remap X" trap. The async-agents PR hit three in
succession (pair refs, session_nav, ancestry/backlinks) before
switching to ghosting. PR #131 added a fourth remap target
(`SessionHeaderMessage.parent_message_index`).

The note records the architectural assessment of generalizing
ghosting to the two existing `_reindex_filtered_context` callers
(`_pair_skill_tool_uses` — easy, same shape as the async-agents
fix; `_filter_template_by_detail` — medium refactor, needs
tree-build child grafting + pair-id skip + render-loop elision
flag), and proposes a migration path that ends with deleting
`_reindex_filtered_context` entirely.

Not work for this PR — captured here so the design rationale is
preserved when someone (likely Claude later) picks it up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: don't let <result> body bleed into header-field parsing

CodeRabbit review on PR #132: ``_FIELD_RE`` (the regex matching
``<task-id>``, ``<status>``, ``<summary>``) was running over the
full notification body, including the ``<result>`` payload. The
``<result>`` body is agent-authored markdown and frequently
contains literal HTML/XML — an agent quoting a ``<summary>`` tag
verbatim, for instance, would clobber the real notification
``<summary>`` field. Same risk for ``<status>`` and ``<task-id>``.
Downstream this poisons the fold/dedup path: the spawning Task
tool_result wouldn't match the right notification, and the wrong
status badge would render on the card.

Extract ``<result>`` and ``<usage>`` first, strip their full
match strings from the search surface, then run ``_FIELD_RE``
over the residual header. The result body is unchanged — it
still ships the agent's verbatim XML-ish content.

New regression test: a notification whose ``<result>`` includes
``<task-id>fake999</task-id> <status>failed</status>
<summary>Bogus summary</summary>`` must not overwrite the
real ``real123`` / ``completed`` / ``Real summary`` header
metadata. Inline tags are preserved verbatim in ``result_text``.

Verified: 1129 unit tests pass (33 in test_async_agents),
pyright + ruff + ty clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* async-agents: drop the unconditional Task/Agent tool_name early-skip

CodeRabbit review on PR #132: ``_link_async_notifications`` skipped
every tool_result whose ``tool_name`` was not exactly ``"Task"`` or
``"Agent"`` before even calling ``_async_agent_id_from_tool_result``.
``tool_name`` is populated by pair-id, which can leave a tool_result
orphaned in fork/branch shapes where the spawning tool_use sits in
a different branch — yet that orphan still carries the canonical
``agentId:`` line, so the notification ought to fold onto it.

Drop the unconditional pre-filter. After the agent-id detector
returns a hit, gate the non-Task/Agent path on a stronger signal —
a parsed ``TaskOutput`` output OR an ``agentId`` already tagged on
the entry's meta — so an unrelated tool_result that happens to
mention "agentId:" in its raw text doesn't hijack a notification
meant for a real spawn.

The canonical path (paired Task/Agent tool_result with parsed
``TaskOutput``) is unaffected; only the fallback fork/branch case
gains coverage. No behavior change on any existing fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants