refactor: extract shared TranscriptedCaptureKit for CLI and MCP tools #1073
Open
r3dbars wants to merge 2 commits into
Capture-library resolution, capture-markdown detection, and transcript parsing were duplicated nearly verbatim between TranscriptedCLI (ContextStore.swift) and TranscriptedMCP (DataDirectories.swift + TranscriptLoader.swift), and had already drifted: MCP resolved symlinks before enumerating legacy candidates while the CLI did not, MCP only attached frontmatter speaker metadata to system speakers, and looksLikeCaptureMarkdown existed four times (plus extractTitle three times, counting the copy in ToolHandlers.swift).

Add Tools/TranscriptedCaptureKit, a dependency-free local SPM package both tools consume via a relative path dependency:

- CaptureLibraryResolver: full resolution chain (shared data dir, per-kind overrides, mcp-directories.json manifest, transcriptSaveLocation preference, defaults + legacy fallback), returning sharedDataRoot so MCP can derive its index dir
- CaptureMarkdown: capture-markdown detection, directory probing, frontmatter title extraction
- CaptureMarkdownParser: frontmatter, speaker metadata, styled + legacy transcript entries, dictation day entries, into superset models each tool maps to its own output types

Both tools keep their existing facade types so public APIs and test suites are unchanged. Drift resolved by taking the safer variant of each: symlink-resolved enumeration (MCP behavior) and metadata matching by system id or unique normalized name for all speakers (CLI behavior).

run-e2e-smoke.sh now pre-compiles the kit as a swiftmodule + static lib before its raw swiftc compile, mirroring the TranscriptedCore pattern. Also fixes a pre-existing smoke breakage on main: SWIFT_SOURCES was missing Sources/Support/LocalMeetingSummaryPreferences.swift, which defines LocalMeetingSummaryProvider used by LocalMeetingSummarizer.

Verification map gains a Tools/TranscriptedCaptureKit/** rule (kit + both consumer suites + e2e smoke) in .agents/test-matrix.yml and agent-preflight.sh; docs updated to match.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
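The resolution chain named in the commit message above (shared data dir, per-kind overrides, manifest, preference, defaults + legacy fallback) is an ordered first-match lookup. The sketch below is only an illustration of that shape, not the kit's actual API: `ResolutionStep` and `resolveCaptureDirectory` are hypothetical names, and the real precedence and probing rules live in CaptureLibraryResolver.

```swift
import Foundation

// Hypothetical types for illustration; not the real TranscriptedCaptureKit API.
struct ResolutionStep {
    let name: String            // label for diagnostics, e.g. "shared data dir"
    let candidate: () -> URL?   // returns nil when this source is not configured
}

/// Walks the steps in order and returns the first candidate that exists as a
/// directory. Symlinks are resolved before probing, mirroring the
/// "symlink-resolved enumeration" variant the commit message says was kept.
func resolveCaptureDirectory(steps: [ResolutionStep],
                             fileManager: FileManager = .default) -> (source: String, url: URL)? {
    for step in steps {
        guard let raw = step.candidate() else { continue }
        let url = raw.resolvingSymlinksInPath()
        var isDirectory: ObjCBool = false
        if fileManager.fileExists(atPath: url.path, isDirectory: &isDirectory),
           isDirectory.boolValue {
            return (step.name, url)
        }
    }
    return nil
}

// Usage: an environment override first, then a fallback (both assumed here).
let environment = ProcessInfo.processInfo.environment
let exampleSteps = [
    ResolutionStep(name: "shared data dir") {
        environment["TRANSCRIPTED_DATA_DIR"].map { URL(fileURLWithPath: $0) }
    },
    ResolutionStep(name: "fallback") { FileManager.default.temporaryDirectory },
]
if let hit = resolveCaptureDirectory(steps: exampleSteps) {
    print("resolved via \(hit.source): \(hit.url.path)")
}
```

Returning the name of the winning step alongside the URL makes it easy for a self-test to report which source was used.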
Core's CLAUDE.md (TranscriptFormatter/TranscriptFrontmatter) and the dictation day-file notes now cross-reference the kit so a written-format change updates the standalone tools' parsers in the same change instead of drifting silently. Kit-side cross-reference already exists in Tools/TranscriptedCaptureKit/CLAUDE.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
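The first commit describes the speaker-metadata drift fix as "metadata matching by system id or unique normalized name for all speakers". A minimal sketch of that rule, assuming a simple trim-and-lowercase normalization, could look like the following; `SpeakerMetadata`, `normalizedName`, and `matchMetadata` are hypothetical names and the kit's real models and normalization may differ.

```swift
import Foundation

// Hypothetical model; field names are illustrative, not the kit's real types.
struct SpeakerMetadata: Equatable {
    let systemID: String?
    let name: String
}

// Assumed normalization: trim surrounding whitespace and lowercase.
func normalizedName(_ name: String) -> String {
    name.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
}

/// Prefers an exact system-id match. Otherwise falls back to the normalized
/// name, but only when exactly one metadata entry carries that name, so an
/// ambiguous name never attaches the wrong metadata.
func matchMetadata(speakerID: String?,
                   speakerName: String,
                   in entries: [SpeakerMetadata]) -> SpeakerMetadata? {
    if let id = speakerID,
       let byID = entries.first(where: { $0.systemID == id }) {
        return byID
    }
    let key = normalizedName(speakerName)
    let byName = entries.filter { normalizedName($0.name) == key }
    return byName.count == 1 ? byName[0] : nil
}
```

The uniqueness check is the conservative part: two frontmatter entries that normalize to the same name yield no match rather than an arbitrary one.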
Why
Capture-library resolution and markdown transcript parsing were duplicated nearly verbatim between Tools/TranscriptedCLI (ContextStore.swift) and Tools/TranscriptedMCP (DataDirectories.swift + TranscriptLoader.swift), and drift had already started: MCP resolved symlinks before enumerating legacy candidates while the CLI did not, MCP only attached frontmatter speaker metadata to system speakers, and looksLikeCaptureMarkdown existed four times (plus a third copy of extractTitle hiding in ToolHandlers.swift). One implementation now serves both tools so future fixes land once.
Product Impact
- agent artifacts
- agent workflow
What changed
- Tools/TranscriptedCaptureKit local SPM package (dependency-free), consumed by both tools via .package(path: "../TranscriptedCaptureKit"):
  - CaptureLibraryResolver: full resolution chain (shared data dir, per-kind overrides, mcp-directories.json manifest, transcriptSaveLocation preference, defaults + legacy Draft / ~/Documents/Transcripted fallback), returning sharedDataRoot so MCP derives its index dir
  - CaptureMarkdown: capture-markdown detection, directory probing, frontmatter title: extraction
  - CaptureMarkdownParser: frontmatter, speaker metadata, styled + legacy transcript entries, dictation day entries, producing superset models each tool maps into its own output types
- Both tools keep their existing facade types (CLIContextDirectories, CLIContextStore, TranscriptedDataDirectories, TranscriptLoader), so public APIs are unchanged; ContextStore.swift drops from ~1,140 to ~480 lines
- run-e2e-smoke.sh pre-compiles the kit as a swiftmodule + static lib before its raw swiftc compile (same pattern as TranscriptedCore)
- Fixes a pre-existing smoke breakage on main (verified via git stash): SWIFT_SOURCES was missing Sources/Support/LocalMeetingSummaryPreferences.swift, which defines LocalMeetingSummaryProvider used by LocalMeetingSummarizer.swift
- Verification map gains a Tools/TranscriptedCaptureKit/** rule in .agents/test-matrix.yml + agent-preflight.sh (kit tests + both consumer suites + e2e smoke); docs updated (CLAUDE.md, Tools/README.md, tool CLAUDE.mds, docs/agent-onboarding.md, new kit CLAUDE.md)
- testMalformedDurationFallsBackToZero asserts on parser source text; repointed at the kit file where the guards now live
How I checked it
- scripts/dev/agent-preflight.sh (suggests the new rule union correctly)
- .agents/test-matrix.yml for the files changed
- bash build.sh --no-open (no Sources/** or root-test changes; not required by the matrix for these paths)
- bash run-tests.sh (same)
- bash run-integration-smoke.sh (no Sources/Meeting/ or Sources/TranscriptedCore/ changes)
- swift test for the core seam (root Package.swift untouched)
- swift test --package-path Tools/TranscriptedCaptureKit 19/19, swift test --package-path Tools/TranscriptedCLI 40/40, swift test --package-path Tools/TranscriptedMCP 68/68 (zero MCP test edits), bash run-e2e-smoke.sh green, swift build -c release --package-path Tools/TranscriptedMCP (the path build.sh uses for the bundled helper) green, transcripted-mcp --self-test against a temp TRANSCRIPTED_DATA_DIR resolves correctly through the kit
Risk Review
- build.sh / build-beta.sh build the MCP helper via swift build, which handles the path dependency transparently (verified with a release build)
- .agent-review/visuals/evidence (no UI changes)
Notes
- The e2e smoke was already broken on main before this branch (missing source in SWIFT_SOURCES from the recent Gemma summary work); this PR fixes it since a green smoke was needed to verify the script change.
- Sources/TranscriptedCore/Storage/TranscriptScanner.swift: left out of scope deliberately (build-system boundary) and flagged as a follow-up.
Agent handoff
COORD_DONE: GREEN | https://github.com/r3dbars/transcripted/pull/1073 | extracted shared TranscriptedCaptureKit, refactored CLI+MCP to consume it, fixed pre-existing e2e smoke breakage, updated test matrix + docs | none | none | kit 19/19, CLI 40/40, MCP 68/68, e2e smoke green, MCP release build + self-test green, preflight | review and merge

🤖 Generated with Claude Code