export conversations to json #36
Conversation

Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough: Adds JSON export support: new JsonRenderer and package, CLI …
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant CLI as CLI
    participant Factory as RendererFactory
    participant JSON as JsonRenderer
    participant Converter as Converter
    participant FS as FileSystem
    User->>CLI: run CLI with --format json
    CLI->>Factory: get_renderer("json", detail, compact)
    Factory->>JSON: import & instantiate JsonRenderer
    Factory-->>CLI: return JsonRenderer
    CLI->>JSON: generate(messages, title, ...)
    JSON->>JSON: serialize messages -> nested dicts
    JSON-->>CLI: return JSON payload / filename
    CLI->>Converter: request projects index generation
    Converter->>JSON: generate_projects_index(...)
    JSON-->>Converter: return index JSON
    Converter->>FS: write all-projects-summary.json and per-project files
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
This makes sense, can be really useful, and only takes a few lines, so why not!
Are you still interested in this @raedatoui? We have a SQLite cache now with a table of every single message (amongst others) so this could be a very flexible feature. |
hey, apologies for not following up earlier. SQLite support is cool, I'll check it out. |
I'd rather go for a structured output of the processed messages, in other words another renderer, in addition to HTML and Markdown. The Markdown output (and especially the lower-detail ones, like …)
Good call — agreed that a raw TranscriptEntry dump wasn't pulling its weight alongside HTML/Markdown. Pushed a rewrite that runs the JSON renderer through the same generate_template_messages pipeline as the other renderers, so it now:
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@claude_code_log/converter.py`:
- Around line 2680-2682: The cleanup in
claude_code_log/cli.py::_clear_output_files() must match the new JSON index
filename used in converter.py (all-projects-summary.json); update
_clear_output_files to compute the index filename the same way as conversion by
adding or calling a helper like get_index_filename/get_file_extension (or
implement the equivalent logic) so that for JSON it deletes
"all-projects-summary.json" and for other formats it deletes f"index.{ext}"
(refer to converter.py's index_filename logic and ensure _clear_output_files
uses the same symbol/name to derive the filename).
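A sketch of the helper the comment describes (the name `get_index_filename` comes from the comment; the body is an assumption about how the format-to-filename mapping would be centralized):

```python
def get_index_filename(output_format: str) -> str:
    """Derive the top-level index filename for a given output format.

    Hypothetical helper: centralizing this lets converter.py and
    _clear_output_files() agree on the same name.
    """
    if output_format == "json":
        # JSON uses a distinct name so the index doesn't collide with
        # per-project session exports.
        return "all-projects-summary.json"
    # "md" and "markdown" both map to the .md extension.
    ext = "md" if output_format in ("md", "markdown") else output_format
    return f"index.{ext}"
```

With a single source of truth, both the conversion path and the cleanup path compute the same filename and can't drift apart.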
In `@claude_code_log/json/renderer.py`:
- Around line 102-117: The current session_messages comprehension filters only
by msg.sessionId == session_id which crashes on messages lacking sessionId and
omits sidechain/subagent entries rewritten to synthetic IDs; update the filter
used to build session_messages to (1) safely handle missing sessionId (guarding
against AttributeError), and (2) include messages whose sessionId equals
session_id, whose sessionId starts with f"{session_id}#" (synthetic sub-session
IDs created by _integrate_agent_entries), or whose parent/session parent field
points to session_id (e.g., parentSessionId or equivalent), then pass that list
into the existing call to generate(...) unchanged.
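One way the broadened filter could look; the attribute names `sessionId` and `parentSessionId` follow the comment's wording and may not match the real models exactly:

```python
from types import SimpleNamespace

def belongs_to_session(msg, session_id: str) -> bool:
    """Match direct, synthetic sub-session, and parent-linked messages."""
    sid = getattr(msg, "sessionId", None)  # guard: some entries lack sessionId
    if sid == session_id:
        return True
    if isinstance(sid, str) and sid.startswith(f"{session_id}#"):
        return True  # synthetic IDs minted by _integrate_agent_entries
    return getattr(msg, "parentSessionId", None) == session_id

# Tiny stand-ins for the real message objects:
messages = [
    SimpleNamespace(sessionId="s1"),          # direct member
    SimpleNamespace(sessionId="s1#agent-0"),  # sidechain sub-session
    SimpleNamespace(parentSessionId="s1"),    # entry with no sessionId at all
    SimpleNamespace(sessionId="s2"),          # different session
]
session_messages = [m for m in messages if belongs_to_session(m, "s1")]
print(len(session_messages))  # → 3
```

The resulting `session_messages` list then feeds the existing `generate(...)` call unchanged.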
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5c45cc45-7f4e-4763-8dab-db3f3c087146
📒 Files selected for processing (5)
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/json/__init__.py
- claude_code_log/json/renderer.py
- claude_code_log/renderer.py
Force-pushed 24247dc to 82ac060
Actionable comments posted: 1
🧹 Nitpick comments (3)
test/test_cli.py (1)

243-311: Consider adding JSON coverage for `_clear_output_files`.

Since this PR introduces `--format json`, it would be worth adding a `TestClearOutputFiles` case that exercises `output_format="json"`, particularly to lock in the special top-level index filename (`all-projects-summary.json` via `get_index_filename`); that branch is unique to JSON and currently uncovered here. Same rationale for extending `test_format_option_md` with a JSON variant.

🧪 Suggested additional tests

```python
def test_clear_json_all_projects(
    self, cli_projects_setup: ProjectsSetup, sample_jsonl_content: list[dict]
):
    """Clears JSON files and the all-projects-summary.json index."""
    projects_dir = cli_projects_setup.projects_dir
    for i in range(2):
        project_dir = create_project_with_jsonl(
            projects_dir, f"project-{i}", sample_jsonl_content
        )
        (project_dir / "combined_transcripts.json").write_text("{}")
    # JSON uses a distinct top-level index name to avoid colliding with per-project exports
    (projects_dir / "all-projects-summary.json").write_text("{}")

    _clear_output_files(projects_dir, all_projects=True, output_format="json")

    assert not (projects_dir / "all-projects-summary.json").exists()
    for i in range(2):
        assert len(list((projects_dir / f"project-{i}").glob("*.json"))) == 0
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/test_cli.py` around lines 243 - 311: Add tests exercising `output_format="json"` in `TestClearOutputFiles` and extend `test_format_option_md` with a JSON variant: create per-project JSON files (e.g., "combined_transcripts.json") and the special top-level index file returned by `get_index_filename` (all-projects-summary.json), call `_clear_output_files` with `all_projects=True` and `output_format="json"`, and assert the index file and all per-project .json files are removed; reference `_clear_output_files` and `get_index_filename` to locate the cleanup logic and `test_format_option_md` to add the JSON case.

claude_code_log/json/renderer.py (1)
34-38: Defensive guard for `dataclasses.asdict` on non-dataclass content.

`dataclasses.asdict` raises `TypeError` if `msg.content` isn't a dataclass instance. Every current `MessageContent` subclass is a dataclass, so this works today, but a future non-dataclass content type (e.g., a Pydantic model) would crash the entire JSON export for that transcript. A fallback keeps the exporter robust without changing current output.

🩹 Proposed fix

```diff
-    content_dump = dataclasses.asdict(msg.content)
+    if dataclasses.is_dataclass(msg.content):
+        content_dump = dataclasses.asdict(msg.content)
+    else:
+        content_dump = {"message_type": msg.content.message_type}
     # Meta is surfaced at the node level; drop the nested copy for clarity.
     content_dump.pop("meta", None)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@claude_code_log/json/renderer.py` around lines 34 - 38: Guard the `dataclasses.asdict` call so non-dataclass `msg.content` doesn't raise: before calling `dataclasses.asdict` on `msg.content` (used to produce `content_dump`), check `dataclasses.is_dataclass(msg.content)` and only call `asdict` in that case; otherwise, fall back to a safe conversion (prefer `msg.content.dict()` if present, else `vars(msg.content)` or `dict(msg.content)` if it's already a mapping) to produce `content_dump`, then continue to pop "meta" and "message_index" as before; keep references to `msg.content`, `content_dump`, and the pop operations unchanged.

claude_code_log/renderer.py (1)
2971-2998: Optional: update the `get_renderer` docstring to list `"json"`.

The dispatch addition is correct and idiomatic, but the `format` parameter doc at line 2972 still only advertises `"html"`, `"md"`, and `"markdown"`. Worth a one-line update so CLI maintainers discover the new value from the signature.

📝 Proposed docstring tweak

```diff
     Args:
-        format: The output format ("html", "md", or "markdown").
+        format: The output format ("html", "md"/"markdown", or "json").
         image_export_mode: Image export mode ("placeholder", "embedded", "referenced").
             If None, defaults to "embedded" for HTML and "referenced" for Markdown.
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@claude_code_log/renderer.py` around lines 2971 - 2998: Update the `get_renderer` docstring to include "json" in the description of the `format` parameter: locate the `get_renderer` function and its docstring (look for the "format" parameter description) and add "json" alongside "html", "md", and "markdown" so the docstring reads that valid format values include "html", "md", "markdown", and "json"; ensure the wording matches existing style and keep the one-line change in the param list without altering any code logic (references: `get_renderer`, `format` parameter, HtmlRenderer/MarkdownRenderer/JsonRenderer).
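The defensive-guard nitpick above can be exercised end-to-end in a few lines; `TextContent` and `ExoticContent` here are made-up stand-ins, not the project's actual classes:

```python
import dataclasses

@dataclasses.dataclass
class TextContent:
    message_type: str
    text: str

class ExoticContent:
    """Stand-in for a hypothetical future non-dataclass content type."""
    message_type = "exotic"

def content_to_dict(content) -> dict:
    # asdict() raises TypeError on non-dataclass instances, so guard first.
    if dataclasses.is_dataclass(content):
        content_dump = dataclasses.asdict(content)
    else:
        content_dump = {"message_type": content.message_type}
    content_dump.pop("meta", None)  # meta is surfaced at the node level
    return content_dump

print(content_to_dict(TextContent("text", "hi")))  # → {'message_type': 'text', 'text': 'hi'}
print(content_to_dict(ExoticContent()))            # → {'message_type': 'exotic'}
```

The fallback degrades gracefully: an unknown content type still produces a node with its `message_type` instead of aborting the whole export.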
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@claude_code_log/json/renderer.py`:
- Around line 32-64: The serialized node omits the current message's uuid (it
pops meta from content_dump so downstream consumers can't resolve parent_uuid),
and it also doesn't emit render_session_id when it differs from session_id;
update _message_to_dict to add the node's own UUID (e.g., set node["uuid"] =
msg.meta.uuid or msg.uuid depending on the TemplateMessage representation) after
building node, keep popping meta from content_dump to avoid duplication, and
also conditionally include node["render_session_id"] = msg.render_session_id
when msg.render_session_id is present and not equal to msg.session_id so forked
branches can be grouped correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d37ca592-f4a2-49b9-8e64-1a6b748fabd5
📒 Files selected for processing (6)
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/json/__init__.py
- claude_code_log/json/renderer.py
- claude_code_log/renderer.py
- test/test_cli.py
✅ Files skipped from review due to trivial changes (1)
- claude_code_log/json/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
- claude_code_log/converter.py
- claude_code_log/cli.py
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claude_code_log/cli.py (1)

384-386: ⚠️ Potential issue | 🟠 Major: Avoid deleting arbitrary `.json` files during `--clear-output --format json`.

The current glob removes every `*.json` in project dirs, not just generated transcript outputs. That can remove unrelated user files.

🛠️ Proposed fix

```diff
-        output_files = list(project_dir.glob(f"*.{file_ext}"))
+        if file_ext == "json":
+            output_files = [
+                *project_dir.glob("combined_transcripts*.json"),
+                *project_dir.glob("session-*.json"),
+            ]
+        else:
+            output_files = list(project_dir.glob(f"*.{file_ext}"))
@@
-        output_files = list(input_path.glob(f"*.{file_ext}"))
+        if file_ext == "json":
+            output_files = [
+                *input_path.glob("combined_transcripts*.json"),
+                *input_path.glob("session-*.json"),
+            ]
+        else:
+            output_files = list(input_path.glob(f"*.{file_ext}"))
```

Also applies to: 415-417
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claude_code_log/cli.py` around lines 384 - 386, The code currently unlinks every file matching f"*.{file_ext}" (variables project_dir, file_ext, output_file.unlink()), which deletes any .json in the directory; change the deletion to target only generated transcript outputs by narrowing the glob or filtering before unlinking—e.g., use a specific pattern/suffix/prefix for generated outputs (like f"*{GENERATED_TRANSCRIPT_SUFFIX}.{file_ext}" or check output_file.stem.startswith(GENERATED_PREFIX)) or cross-check output_file against the known list of generated filenames, and then call output_file.unlink() only for those matched files (apply the same fix for the similar block around lines 415-417).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@claude_code_log/json/renderer.py`:
- Around line 190-193: In is_outdated(), defensively handle non-object JSON by
checking the parsed data type before calling .get(): after json.load(f) verify
isinstance(data, dict) and only call data.get("version") when it is a dict; if
it’s not a dict (e.g., list or scalar) return True to treat the file as
outdated. Keep the existing except (OSError, json.JSONDecodeError,
UnicodeDecodeError) block unchanged and continue to use get_library_version()
for the version comparison when data is a dict.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aa023496-cad5-4f42-b94b-48d18bfa39ca
📒 Files selected for processing (6)
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/json/__init__.py
- claude_code_log/json/renderer.py
- claude_code_log/renderer.py
- test/test_cli.py
✅ Files skipped from review due to trivial changes (1)
- claude_code_log/json/__init__.py
🚧 Files skipped from review as they are similar to previous changes (1)
- test/test_cli.py
🧹 Nitpick comments (1)
claude_code_log/json/renderer.py (1)
32-70: Verify `dataclasses.asdict` round-trip for non-dataclass fields (Enums, datetimes, Paths).

`dataclasses.asdict` recurses through dataclasses/dicts/lists/tuples but leaves other objects unchanged. Those are then serialized via `json.dumps(..., default=str)` at line 96, which calls `str()` on them. For `Enum` members this yields `"ClassName.MEMBER"` rather than the underlying `.value`, and for `datetime`/`Path` it produces the `__str__` form; both are potentially lossy or awkward for downstream consumers of the JSON contract. Worth confirming which concrete field types can appear inside `MessageContent` subclasses.

Consider replacing the `default=str` fallback with a serializer that normalizes Enums to `.value` and datetimes to ISO 8601 to lock down a stable schema.

```shell
#!/bin/bash
# Inspect MessageContent subclasses and their field types to see what non-dataclass types
# can reach the JSON serializer via dataclasses.asdict.
fd -e py . claude_code_log | xargs rg -nP '@dataclass|class\s+\w+Content\b|class\s+MessageContent\b' -C2

# Look for Enum/datetime/Path fields on content-ish dataclasses.
rg -nP '^\s*\w+\s*:\s*(Enum|datetime|date|Path|PurePath|UUID|bytes)\b' --type=py

# Confirm whether any MessageContent subclass is NOT a `@dataclass` (which would
# make asdict(msg.content) raise TypeError at runtime).
ast-grep --pattern $'class $NAME(MessageContent): $$$'
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@claude_code_log/json/renderer.py` around lines 32 - 70: The `dataclasses.asdict` call in `_message_to_dict` can leave Enum, datetime, Path, UUID, and bytes objects intact, which later get serialized with `json.dumps(..., default=str)`, producing unstable representations; update `_message_to_dict` to recurse the `content_dump` (after `dataclasses.asdict(msg.content)`) and normalize: convert Enum instances to `.value`, datetimes/dates to ISO 8601 via `.isoformat()`, Path/PurePath to `str()`, UUID to `str()`, and bytes to a stable encoding (e.g., base64 or hex) so the JSON contract is stable; alternatively, implement a custom serializer used by the `json.dumps` call that handles these types, but do the normalization in or before `_message_to_dict` to ensure children/embedded fields (from `dataclasses.asdict`) are normalized consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aaeddd3d-2906-4305-91c0-e978579d65de
📒 Files selected for processing (1)
claude_code_log/json/renderer.py
🧹 Nitpick comments (1)
claude_code_log/json/renderer.py (1)
83-88: Minor: truthiness check on `pair_duration` drops legitimate `0` values.

`pair_first`/`pair_last` correctly use `is not None`, but `pair_duration` uses a truthy check at line 87, which will omit a duration of `0` (e.g., back-to-back messages in the same second). Low impact, but worth aligning for consistency.

♻️ Proposed tweak

```diff
-        if msg.pair_duration:
+        if msg.pair_duration is not None:
             node["pair_duration"] = msg.pair_duration
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@claude_code_log/json/renderer.py` around lines 83 - 88: The check for `pair_duration` currently uses truthiness and will drop valid zero values; update the condition that sets `node["pair_duration"]` to use an explicit None check (like the existing checks for `msg.pair_first` and `msg.pair_last`) so that `msg.pair_duration == 0` is preserved when serializing (i.e., change the truthy check on `msg.pair_duration` to an `is not None` style check).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 06037f0f-43ae-483c-b4ed-38084a55a33a
📒 Files selected for processing (2)
- claude_code_log/cli.py
- claude_code_log/json/renderer.py
🚧 Files skipped from review as they are similar to previous changes (1)
- claude_code_log/cli.py
If the JSON file contains a list or scalar, data.get() would raise AttributeError and break the staleness check. Treat any non-dict payload as outdated so the next run regenerates cleanly.
Two bugs addressed from PR review:
1. Tool inputs/outputs (ToolUseContent, ReadInput, TextContent, etc.)
are Pydantic BaseModel instances embedded inside dataclasses.
dataclasses.asdict does not recurse into them, so json.dumps was
falling back to default=str and serialising them as __repr__ strings
like "type='tool_use' id='toolu_...' input={...}" instead of nested
objects. Added a custom default handler that calls model_dump on
Pydantic models (and normalises Enums/Paths) so the JSON output
contains the structured content consumers expect.
2. --clear-output --format json was globbing *.json across project
directories and deleting arbitrary user files. Narrowed to the
filename patterns this tool actually produces
(combined_transcripts*.json, session-*.json).
Covers JsonRenderer.generate / generate_session / generate_projects_index / is_outdated and the CLI --format json flow, including the safety contract that --clear-output preserves unrelated user .json files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two type-checker fixes the rebase exposed (both type-inference
tightening that's landed via main since the PR was opened — same
code passed both checkers when originally committed):
- `claude_code_log/json/renderer.py:is_outdated`: pyright (1.1.408)
narrows `json.load(f)` after `isinstance(data, dict)` to
`dict[Unknown, Unknown]`, which then flags `data.get("version")`
as `reportUnknownMemberType`. Cast to `dict[str, Any]` after the
isinstance check.
- `test/test_json_rendering.py:test_detail_minimal_filters_tool_messages`:
ty narrows `get_renderer("json", ...).generate(...)` to the base
`Optional[str]` return type and rejects passing it directly to
`json.loads`. Hoist the call to a local + assert it's not None,
matching how the other tests in this file already do it.
Runtime semantics unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR daaain#127 introduced `pair_middle` for the (UserSlash → Slash → CommandOutput) triple, but the JSON renderer was never updated to emit it — `_message_to_dict` had `pair_first`/`pair_last`/ `pair_duration` but silently dropped the middle. Downstream tools walking the JSON would lose the middle linkage and see what looks like an unrelated message wedged between two paired siblings. Mirror the existing pair_* emission pattern. New test asserts the three pair members emit pair_first/pair_middle/pair_last as the exclusive role properties dictate (first carries middle+last; middle carries first+last; last carries first only). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Windows-specific failures the GH Actions runner caught (this
repo's `just ci` only runs on Linux):
- `test_shape_and_totals`: `Path("/tmp/alpha")` is a `WindowsPath`
on Windows whose `str()` yields `\tmp\alpha`. Compare against
`str(Path(...))` so the expected value matches the platform's own
`Path` rendering.
- `test_generates_combined_and_session_files` and
`test_all_projects_index_uses_dedicated_filename`: `read_text()`
defaults to the system codec (cp1252 on Windows), which can't
decode the UTF-8 multi-byte sequences the JSON renderer emits
(`ensure_ascii=False`). Add explicit `encoding="utf-8"` to all
`read_text` calls in this file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parametrizes over every session-level JSONL in `test/test_data/real_projects/` (48 fixtures, ~384 invariant checks) and asserts shape rather than specific values. Real fixtures are the only way to exercise sidechains, within-session forks, multi-session resumes, teammates, and the long tail of structural diversity that hand-written tests can't reach.

Invariants asserted per fixture:

1. JSON parses (would catch `_json_default` gaps for new content types).
2. Top-level + per-node required keys present and well-typed.
3. `index` is unique tree-wide.
4. All `pair_first` / `pair_middle` / `pair_last` references resolve to nodes that exist in the document.
5. Pair role exclusivity: `pair_middle` (the field) only appears on the triple's pair_first (which has `pair_last` set, no `pair_first`).
6. Pair cross-references resolve to equivalent partners; relaxed from strict symmetry because real fixtures have stitched / forked transcripts where duplicate `tool_use_id` values produce multiple tool_use nodes pointing at the same tool_result, and the back-reference only mirrors the latest one. The invariant that does hold is that the back-reference resolves to a node sharing the same `tool_use_id` (the logical pair equivalence class).
7. `type` strings match an identifier-shaped pattern (catches obvious garbage without hardcoding the closed set).
8. Tree is acyclic (no node appears in its own ancestor chain).

Investigated finding worth flagging: bug-like behavior on duplicate `tool_use_id` is pre-existing renderer behavior (not specific to JSON output). `_try_pair_by_index`'s dict-based pair index keeps only the last write, leaving earlier tool_use nodes with stale `pair_last` references. Acceptable for now; documented in the relaxed invariant above. Worth tracking as a separate cleanup if we ever revisit stitching.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
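A couple of these invariants (unique `index`, resolvable pair references) can be sketched against a minimal in-memory document; the node shape here is illustrative, not the renderer's exact schema:

```python
def check_pair_invariants(nodes: list[dict]) -> None:
    """Assert unique indices and resolvable pair_* references."""
    indices = [n["index"] for n in nodes]
    assert len(indices) == len(set(indices)), "index must be unique tree-wide"
    known = set(indices)
    for n in nodes:
        for ref in ("pair_first", "pair_middle", "pair_last"):
            if ref in n:
                assert n[ref] in known, f"{ref}={n[ref]} does not resolve"

# A UserSlash -> Slash -> CommandOutput triple, following the exclusive role
# properties described earlier (first carries middle+last; middle carries
# first+last; last carries first only).
doc = [
    {"index": 0, "pair_middle": 1, "pair_last": 2},
    {"index": 1, "pair_first": 0, "pair_last": 2},
    {"index": 2, "pair_first": 0},
]
check_pair_invariants(doc)  # passes silently
```

Shape-only checks like these scale to any fixture count, since they never assert fixture-specific values.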
Any interest in something like this? I find it useful for other workflows. Happy to make it an optional flag, etc.
Summary by CodeRabbit
- New Features
- Bug Fixes
- Tests