feat(DRC-3526): MCP App widget POC — 15 widget tools + config-install helper#1397
Open
iamcxa wants to merge 33 commits into
Added during the /sync-gbrain session that backfilled embeddings after Conductor's GSTACK_OPENAI_API_KEY shim was wired up. The .gbrain-source file (gitignored) pins this worktree to its scoped code source for kubectl-style routing. CLAUDE.md gets a new "GBrain Search Guidance" section so future agents in this worktree prefer gbrain over Grep for semantic queries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…_TOOLS filter (iter 1 Day 1)
Per design-20260521-234647 amended — adds a parallel FastMCP-based widget server
that delegates to RecceMCPServer's existing _tool_row_count_diff and
_tool_schema_diff methods, gated by RECCE_MCP_WIDGETS=1 env var.
- recce/widget_server.py: NEW, FastMCP("recce-widgets") with @mcp.tool delegates
and @mcp.resource handlers (graceful when recce/data/mcp/*.html missing).
- recce/cli.py: NEW `recce mcp-widget-server` subcommand (local mode only).
- recce/mcp_server.py: WIDGET_TOOLS set, _widgets_enabled helper, list_tools
filter, call_tool defensive raise.
Day 1 scope only — widget HTML assets + tests deferred to Day 1.5 / Day 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
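The env-var gate described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in recce/mcp_server.py: the names WIDGET_TOOLS, _widgets_enabled and RECCE_MCP_WIDGETS come from the commit message, while the function signatures, the exact-match check on "1", and the error type are assumptions.

```python
import os

# Tool names taken from the commit message; the real set may differ.
WIDGET_TOOLS = {"row_count_diff", "schema_diff"}


def _widgets_enabled(environ=None) -> bool:
    """Return True when RECCE_MCP_WIDGETS is "1" (assumed exact-match semantics)."""
    env = os.environ if environ is None else environ
    return env.get("RECCE_MCP_WIDGETS") == "1"


def list_tools(all_tools, environ=None):
    """Hide widget-capable tools from the main server when widgets are enabled."""
    if not _widgets_enabled(environ):
        return list(all_tools)
    return [name for name in all_tools if name not in WIDGET_TOOLS]


def call_tool(name, environ=None):
    """Defensive raise: a filtered tool should not be dispatched by the main server."""
    if _widgets_enabled(environ) and name in WIDGET_TOOLS:
        raise ValueError(f"{name} is served by the widget server when RECCE_MCP_WIDGETS=1")
    return f"dispatched {name}"
```

With the variable unset both servers would expose the same tool; with it set, the main server drops the widget tools so the client sees each tool once.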
…eedback)
Fix #1: run_widget_server() is synchronous: mcp.run(transport="stdio") manages its own asyncio event loop internally. asyncio.run(run_widget_server(...)) raised ValueError because the function returns None, not a coroutine. Drop asyncio.run() in cli.py and remove the unused asyncio import.
Fix #2: replace **arguments with typed params so FastMCP infers a proper JSON inputSchema for tools/list. Without typed params, Claude Desktop sees the tool registered but cannot construct a tools/call (no schema to fill). Params translated from existing low-level Tool inputSchema definitions:
- row_count_diff: node_names, node_ids, select, exclude (all Optional)
- schema_diff: select, exclude, packages (all Optional)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
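Fix #1 can be reproduced in isolation. The sketch below uses a hypothetical stub in place of run_widget_server(); it only shows that handing asyncio.run() the return value of a synchronous function fails before any event loop work happens. The commit reports ValueError; the sketch also accepts TypeError on the assumption that the exception type may vary across Python versions.

```python
import asyncio


def run_widget_server_stub() -> None:
    """Hypothetical stand-in: synchronous and returns None, as the commit describes."""
    return None


def call_wrongly() -> str:
    """Pass a non-coroutine to asyncio.run() and report which exception it raised."""
    try:
        asyncio.run(run_widget_server_stub())  # evaluates to asyncio.run(None)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return "no error"
```

The fix in the commit is simply to call the synchronous function directly.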
- recce/data/mcp/row_count_diff.html: NEW, ~110 LOC tier 1 widget (status
pills + diff numbers; base/curr null guards using === null)
- recce/data/mcp/schema_diff.html: NEW, ~110 LOC tier 2 widget (added/
removed/type_changed sections; empty state shows unchanged_count)
- recce/mcp_server.py: extract _compute_schema_changes() helper that returns
rich per-model dict {added, removed, type_changed, unchanged_count};
existing _tool_schema_diff flattens output back to DataFrame for low-level
mcp-server consumers — zero regression on existing response shape
- recce/widget_server.py: row_count_diff delegate wraps result as
{"models": result}; schema_diff delegate calls _compute_schema_changes
directly and returns {"models": rich_result} for widget consumption
- .gitignore: replace broad recce/data ignore with per-extension rules so
recce/data/mcp/*.html (widget source, not build output) can be committed
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
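A rough sketch of the per-model shape that _compute_schema_changes() is described as returning. The four result keys come from the commit message; the function signature, the column-dict input format, and the field names inside type_changed are assumptions made for illustration.

```python
from typing import Any, Dict


def compute_schema_changes(
    base_columns: Dict[str, Dict[str, Any]],
    current_columns: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Compare two {column_name: {"type": ...}} mappings for one model."""
    added = [col for col in current_columns if col not in base_columns]
    removed = [col for col in base_columns if col not in current_columns]
    type_changed = []
    unchanged_count = 0
    for col in base_columns:
        if col not in current_columns:
            continue
        base_type = base_columns[col].get("type")
        curr_type = current_columns[col].get("type")
        if base_type != curr_type:
            # Field names here are illustrative, not the real response shape.
            type_changed.append(
                {"column": col, "base_type": base_type, "current_type": curr_type}
            )
        else:
            unchanged_count += 1
    return {
        "added": added,
        "removed": removed,
        "type_changed": type_changed,
        "unchanged_count": unchanged_count,
    }
```

A widget can render the rich dict directly, while a flattening step can turn the same data back into rows for consumers that expect a table.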
- tests/test_widget_server.py: 5 tests covering WIDGET_TOOLS env-var coordination, widget server tool/resource registration, file-missing graceful degradation, and tool enumeration regression.
- docs/mcp-widgets.md: template for Scott: file layout, Claude Desktop config example, add-widget walkthrough using row_count_diff as worked reference, structuredContent contract, gotchas (SDK version pin, typed-params requirement, dual-key meta, mcp.run sync nature, .gitignore allowlist, stdout discipline).
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…de Desktop cwd=/ workaround)
Claude Desktop spawns MCP servers with cwd=/. Two issues surfaced when manually testing iter 1 widgets:
1. recce/track.py:61 used bare print(), which goes to stdout. When RecceConfig couldn't find/create recce.yml at cwd=/, the resulting traceback corrupted the JSON-RPC stdio channel. Fixed to print to stderr. Pre-existing bug; affected mcp-server equally.
2. recce/cli.py mcp_server and mcp_widget_server now chdir to --project-dir before RecceConfig so the config file is searched relative to the user's dbt project instead of cwd. Eliminates need for a bash wrapper in Claude Desktop config.
Verified by reproducing the cwd=/ failure before fix (32 lines of traceback to stdout) and confirming clean stdout (0 lines) after.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
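The stdout discipline behind the first fix can be checked with a small stdlib-only sketch. The helper names are hypothetical; the point is only that a traceback printed with file=sys.stderr leaves stdout, which carries the JSON-RPC frames in stdio mode, untouched.

```python
import contextlib
import io
import sys
import traceback


def report_failure() -> None:
    """Print a traceback to stderr, as the fixed recce/track.py is described to do."""
    try:
        raise FileNotFoundError("recce.yml")
    except FileNotFoundError:
        print(traceback.format_exc(), file=sys.stderr)


def capture_streams():
    """Run report_failure() and return what landed on (stdout, stderr)."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        report_failure()
    return out.getvalue(), err.getvalue()
```

The second fix is an os.chdir(project_dir) call made before config discovery, so relative lookups resolve against the dbt project rather than "/".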
Claude Desktop manual test revealed widget rendered but showed empty
state ("No models found") while agent text summary correctly reported
5 models. Root cause: FastMCP with @mcp.tool returning Dict[str, Any]
may not populate the JSON-RPC structuredContent field — only content
text payload. Widget HTML only read structuredContent.
- Both widgets now try structuredContent first, fall back to parsing
content[0].text as JSON. Works regardless of FastMCP serialization
behavior.
- row_count_diff also skips keys starting with "_" when iterating
models (so _maybe_add_single_env_warning's _warning key doesn't
break the per-row render); _warning text is rendered as a notice
banner above the table when present.
- schema_diff gets the same underscore-key skip for defensive uniformity.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
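The widget-side fallback lives in JavaScript; the same logic is sketched here in Python for clarity. Function names are hypothetical and the result dict mimics a tools/call result with structuredContent and content keys.

```python
import json


def extract_payload(result: dict) -> dict:
    """Prefer structuredContent; otherwise try to parse content[0].text as JSON."""
    structured = result.get("structuredContent")
    if structured:
        return structured
    content = result.get("content") or []
    if content and isinstance(content[0].get("text"), str):
        try:
            parsed = json.loads(content[0]["text"])
        except json.JSONDecodeError:
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


def split_models(models: dict):
    """Separate per-model rows from underscore-prefixed metadata such as _warning."""
    rows = {name: row for name, row in models.items() if not name.startswith("_")}
    return rows, models.get("_warning")
```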
…Content
Per MCP log inspection: FastMCP synthesizes an outputSchema like
`{properties: {result: {...}}, required: ["result"]}` when @mcp.tool
returns a Dict[str, Any] without a Pydantic return type. The actual
structuredContent sent to the widget has shape {result: {models: ...}}
not {models: ...}. Widget HTML now unwraps the result key when present,
so both flat returns and FastMCP-wrapped returns render correctly.
Defensive: only unwraps if data.result exists AND data.models does not,
so tools that legitimately return both keys aren't mishandled.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
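The defensive unwrap rule from this commit, expressed as a Python sketch of the JavaScript logic (the function name is hypothetical):

```python
def unwrap_result(data: dict) -> dict:
    """Unwrap {result: {...}} only when a result key exists and models does not."""
    if "result" in data and "models" not in data and isinstance(data["result"], dict):
        return data["result"]
    return data
```

A payload that carries both keys is returned unchanged, which is the case the commit calls out for tools that legitimately return both.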
…ntegration
Manual Claude Desktop test showed widget rendered real data correctly but
hand-rolled hardcoded colors looked washed-out in dark mode and out of place
next to Claude's native UI. Refactored both widgets to use Claude's CSS
custom properties (var(--color-text-primary), var(--color-background-*),
var(--font-sans), etc.) with light-mode hex fallbacks and a
prefers-color-scheme dark fallback for non-Claude hosts.
- row_count_diff.html: redesigned with header + 4-up summary cards + main
table, semantic status badges (warning/success/info/secondary), tabular
numeric alignment, monospace model names.
- schema_diff.html: same design-token migration, per-model sections with
semantic colors for added/removed/type_changed.
- No new external dependencies (kept ext-apps SDK as only unpkg load,
CSP resourceDomains unchanged). Used Unicode glyphs instead of icon
font to stay dependency-light.
Preserves all existing JS: unwrap fallback for FastMCP {result: ...}
wrapping, _warning notice rendering, === null guards, _-prefix key skip.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…lt + annotations
Per anthropic/skills/mcp-builder Python idiom checklist (python_mcp_server.md:691).
Closes the gap between iter 1's "make it work" shortcuts and SDK best practices.
- Pydantic BaseModel for tool inputs (RowCountDiffInput, SchemaDiffInput) with
Field(description=...) on every param — replaces bare Optional[...] params,
gives the LLM richer inputSchema descriptions.
- Pydantic BaseModel for tool outputs (RowCountDiffOutput, SchemaDiffOutput,
nested per-model models) — replaces Dict[str, Any] return, generates proper
outputSchema, eliminates FastMCP's {result: ...} wrapping (widget JS unwrap
fallback removed in commit 2).
- Return CallToolResult explicitly with short content (one sentence) + data in
structuredContent only. Agent receives a summary; widget receives the dict.
- annotations dict on every @mcp.tool: title, readOnlyHint, destructiveHint,
idempotentHint, openWorldHint per checklist line 691.
- _warning moved from top-level result dict to RowCountDiffOutput.warning
named field — cleaner Pydantic shape, no more _-prefix key skip in widget JS.
- logging.basicConfig explicitly sets stream=sys.stderr per stdio discipline
(mcp_best_practices.md:139).
- prefersBorder: False added to @mcp.resource meta per spec UIResourceMeta.
- Tool docstrings expanded per python_mcp_server.md:278-328 pattern:
what / Use when / Don't use when / Returns / Error Handling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Per add-app-to-server/SKILL.md JS-side patterns. Replaces manual CSS var()
fallback with SDK helpers that actively apply host design tokens via
postMessage context.
- Import applyDocumentTheme, applyHostStyleVariables, applyHostFonts from
ext-apps SDK. Register app.onhostcontextchanged to invoke them on theme/
font change. Apply initial context after connect() if exposed.
- Remove dead defensive layers no longer needed after commit 1:
- structuredContent unwrap (FastMCP no longer wraps Pydantic returns)
- content[0].text JSON.parse fallback (content is one-sentence string)
- _-prefix key skip (warning is now a named field, models is clean dict)
- Add app.onteardown handler returning {} per SDK pattern.
- CSS fallback values: rename --color-text-error → --color-text-danger and
--color-background-error → --color-background-danger (and badge-error →
badge-danger) to match spec enum McpUiStyleVariableKey (no "error"
semantic; "danger" is canonical).
- CSS var() fallbacks retained as defensive layer in case SDK helper races.
- warning now read from data.warning (named field) not data.models._warning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…ests
Closes iter 1 idiomatic refactor.
- tests/test_widget_server.py: 3 new tests (5→8 total):
- test_row_count_diff_returns_calltoolresult_with_short_content: verifies
content is short one-sentence string and structuredContent has correct keys.
- test_structured_content_matches_pydantic_model: RowCountDiffOutput.model_validate
round-trip on structuredContent proves clean Pydantic shape.
- test_widget_tool_annotations_present: asserts readOnlyHint/destructiveHint/
idempotentHint/openWorldHint/title on both tools.
- docs/mcp-widgets.md:
- Step 2 (HTML template): updated to use applyDocumentTheme/applyHostStyleVariables/
applyHostFonts from ext-apps SDK + onteardown handler.
- Step 3 (@mcp.tool template): Pydantic BaseModel input/output + CallToolResult
replaces bare typed params + Dict[str, Any] return pattern.
- Step 4 (@mcp.resource): added prefersBorder: False to meta example.
- structuredContent contract: updated to explain CallToolResult + short content
discipline; added note on why bare Dict causes {result: ...} wrapping.
- Gotchas: replaced bare-Dict warning with Pydantic requirement; added CSS
danger vs error naming gotcha; removed {result: ...} unwrap entry (resolved).
- Added "Python vs TypeScript SDK Support" subsection explaining qr-server
pattern and why Recce stays Python.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…classes
Captain's manual test in Claude Desktop dark mode revealed summary card values invisible and model name cells with unexpected styling. Root cause: the @media (prefers-color-scheme: dark) blocks in both widget HTML files only covered .card, th, border, .na, .empty, .warning, leaving .card-value, .card-label, .header-*, .model-name, .diff-*, and .badge-* with light-mode fallback hex values that become invisible on Claude's dark card backgrounds.
Both widgets now have exhaustive @media dark overrides so rendering is correct in all four (mode × SDK helpers fire/no-fire) combinations:
- Dark + SDK fire: host tokens drive
- Dark + no SDK: @media dark fallback hex drives (this commit's job)
- Light + SDK fire: host tokens drive
- Light + no SDK: default var() inline fallback drives
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Codecov Report
❌ Patch coverage is ... and 25 files with indirect coverage changes
Pull request overview
Adds an Iter 1 POC for MCP Apps widgets by introducing a parallel FastMCP-based “widget server” that serves widget-capable versions of row_count_diff and schema_diff, coordinated with the existing stdio MCP server via RECCE_MCP_WIDGETS=1.
Changes:
- Added `recce mcp-widget-server` (FastMCP) with widget tool delegates + HTML widget resources for `row_count_diff` and `schema_diff`.
- Coordinated tool routing between servers using `WIDGET_TOOLS` + `_widgets_enabled()` filtering in `recce/mcp_server.py`.
- Added tests and documentation for the widget workflow; fixed one stdout→stderr issue for stdio MCP safety.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_widget_server.py | Adds test coverage for widget server registration, env-var coordination, and tool result shapes. |
| recce/widget_server.py | Implements FastMCP widget server with two widget tools and two HTML resources. |
| recce/mcp_server.py | Adds widget tool filtering/guardrails and extracts _compute_schema_changes() for shared schema-diff shape. |
| recce/cli.py | Adds mcp-widget-server CLI command and ensures --project-dir becomes cwd for config discovery. |
| recce/track.py | Sends traceback output to stderr to avoid corrupting stdio JSON-RPC. |
| recce/data/mcp/row_count_diff.html | Self-contained MCP Apps widget for row count diffs. |
| recce/data/mcp/schema_diff.html | Self-contained MCP Apps widget for schema diffs. |
| docs/mcp-widgets.md | Developer guide/template for adding future widgets. |
| .gitignore | Stops ignoring the entire recce/data tree; ignores only build output paths so recce/data/mcp/*.html can be committed. |
| CLAUDE.md | Adds gbrain search guidance block. |
Comment on lines +109 to +112
        ref = importlib.resources.files("recce.data.mcp") / f"{name}.html"
        return ref.read_text(encoding="utf-8")
    except (FileNotFoundError, TypeError, ModuleNotFoundError):
        return f"<html><body>Widget asset missing: {name}.html. Run pnpm run build.</body></html>"
Comment on lines +1555 to +1556
    base_type = base_columns[col].get("type")
    curr_type = current_columns[col].get("type")
        row_count_diff.html  # allowlist - see .gitignore). Self-contained HTML files.
        schema_diff.html
    tests/
      test_widget_server.py  # 5 tests covering WIDGET_TOOLS coordination + widget server.
Comment on lines +353 to +358
    - **`recce/data/` is gitignored as build output.** Widget HTML files use a
      per-extension allowlist in `.gitignore` to escape the broad `recce/data`
      ignore rule. The allowlist currently covers `*.html`. If you add new file
      types (`.css`, `.svg`, `.js`) under `recce/data/mcp/`, check `.gitignore`
      and add an allowlist entry if needed; otherwise `git add` will silently skip
      your file.
Comment on lines 57 to +61
      console = Console()
      if params.get("debug"):
          console.print_exception(show_locals=True)
      else:
    -     print(traceback.format_exc())
    +     print(traceback.format_exc(), file=sys.stderr)
…onfig
POC users had to manually edit claude_desktop_config.json to register both recce + recce-widgets MCP servers and set RECCE_MCP_WIDGETS=1 on both. This helper does it idempotently: merges entries (preserves other servers), auto-sets the env var, backs up the old config, and prints next-steps.
- Required: --project-dir <PATH> (validated for dbt_project.yml presence)
- Optional: --config <PATH>, --yes, --dry-run
- iter 1 scope: macOS only, local mode only (errors on cloud flags / other OS with explicit fix message)
- 5 unit tests using CliRunner + tmp_path (write/preserve/validate/dry-run/backup)
Closes the manual install friction surfaced in iter 1 manual testing. Cuts "5 minutes editing JSON + finding absolute recce path + restarting" to one command + Cmd+Q.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
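The idempotent merge this helper performs might look roughly like the sketch below. It is not the helper's actual code: the function name, the backup filename, and the exact command/args written for each server are assumptions. The mcpServers layout follows the usual Claude Desktop config format.

```python
import json
import shutil
import tempfile
from pathlib import Path


def install_config(config_path: Path, project_dir: Path, dry_run: bool = False) -> dict:
    """Merge recce + recce-widgets entries into the config, preserving other servers."""
    if not (project_dir / "dbt_project.yml").is_file():
        raise FileNotFoundError(f"no dbt_project.yml in {project_dir}")
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    for name, subcommand in (("recce", "mcp-server"), ("recce-widgets", "mcp-widget-server")):
        servers[name] = {
            "command": "recce",  # assumed to be on PATH; the real helper may resolve a path
            "args": [subcommand, "--project-dir", str(project_dir)],
            "env": {"RECCE_MCP_WIDGETS": "1"},
        }
    if dry_run:
        return config
    if config_path.exists():
        # Backup name is an assumption made for this sketch.
        shutil.copy2(config_path, config_path.with_suffix(config_path.suffix + ".bak"))
    config_path.write_text(json.dumps(config, indent=2))
    return config


def demo() -> dict:
    """Run the merge twice against a temp directory to show it is idempotent."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "dbt_project.yml").write_text("name: demo\n")
        cfg = root / "claude_desktop_config.json"
        cfg.write_text(json.dumps({"mcpServers": {"other": {"command": "x"}}}))
        install_config(cfg, root)
        install_config(cfg, root)
        names = sorted(json.loads(cfg.read_text())["mcpServers"])
        return {"servers": names, "backup": (root / "claude_desktop_config.json.bak").exists()}
```

Running the merge a second time overwrites the two Recce entries with identical values, so unrelated servers survive and no duplicates appear.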
Adds the third widget in the iter 1 widget POC line: tier 1 (status pill + key/value grid) for server runtime state. First widget added post-iter-1 idiomatic refactor, so it's born following the canonical pattern:
- Pydantic output model matching _tool_get_server_info return shape
- @mcp.tool with annotations dict (readOnly/idempotent/title)
- CallToolResult with short content + structuredContent (no agent dual-render)
- @mcp.resource with mime text/html;profile=mcp-app + CSP + prefersBorder
- Widget HTML: SDK helpers (applyDocumentTheme + applyHostStyleVariables + applyHostFonts), onhostcontextchanged + onteardown, exhaustive @media dark for every class with var(--color-*, fallback), which closes the dark-mode fallback gap that bit row_count_diff/schema_diff earlier (commit b882697)
- WIDGET_TOOLS set updated to filter from main mcp-server when widgets enabled
- 2 new tests + updated enumeration assertions (10 total, was 8)
- docs/mcp-widgets.md "Add widget N+1" walkthrough now references this as the canonical worked example (born idiomatic, no later-cycle defensive layers)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Fourth widget. tier 2 (list / simple table) — saved Recce checks for a PR
session, rendered as a 3-up summary + status table.
- Pydantic CheckSummary + ListChecksOutput models matching actual
_tool_list_checks shape (minimal field set — name/type/status/description,
NOT internal params). approved count from raw return ("approved" key);
pending derived as total - approved in the delegate.
- @mcp.tool with annotations, CallToolResult short content + structuredContent
- @mcp.resource with mime/CSP/prefersBorder
- Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive
@media dark overrides (354 lines, 13 KB)
- WIDGET_TOOLS filter updated to 4 widget tools
- 2 new tests; enumeration assertion bumped to 4; annotation loop includes
list_checks; filter test asserts list_checks absent when widgets enabled
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Fifth widget, completing Phase A of the post-iter-1 widget expansion. Single-model detail view (base/current column tables, constraints, not-found state).
- Pydantic ColumnInfo + ModelEnvironment + GetModelOutput + GetModelInput matching actual _tool_get_model return shape (columns dict → list normalised in _parse_model_env helper; primary_key preserved; raw_code omitted)
- @mcp.tool with annotations, CallToolResult short content + structuredContent
- @mcp.resource with mime/CSP/prefersBorder
- Widget HTML: SDK helpers + onhostcontextchanged + onteardown + exhaustive @media dark; adaptive 2-col/3-col column table layout (constraints visible only when present); per-env base/current sections; not-found empty state
- WIDGET_TOOLS filter at 5 tools
- 2 new tests (test_get_model_widget_registered, test_get_model_returns_calltoolresult_with_pydantic_shape); enumeration assertions bumped to 5 tools/resources
Phase A (tier 1-2 multipliers) complete: row_count_diff, schema_diff, get_server_info, list_checks, get_model. ~half the design's 12 widget candidates done; design Open Q #6 >=50% FastMCP migration trigger approaches. Iter 2 mini-doc to reevaluate.
No Pydantic reserved-name conflicts in actual handler shape (columns, primary_key, raw_code are all safe); Gotcha documented in mcp-widgets.md for future implementors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…st[str]
Captain hit Pydantic validation error in Claude Desktop: actual
_tool_get_server_info returns support_tasks as {"query": True, ...,
"change_analysis": True} (dict of task slug → enabled bool), not a list
of enabled slugs. Fix the Pydantic field type and widget HTML iteration.
- ServerInfoOutput.support_tasks: Dict[str, bool] (was List[str])
- Widget HTML iterates Object.entries(...).filter(([_,v]) => v) for enabled
tasks; renders as info badges (same visual as before)
- Test fixture updated to use dict shape
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
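The corrected iteration over support_tasks, mirrored in Python (the widget itself uses Object.entries in JavaScript; the function name here is hypothetical):

```python
from typing import Dict, List


def enabled_tasks(support_tasks: Dict[str, bool]) -> List[str]:
    """Return the task slugs whose flag is true, in insertion order."""
    return [slug for slug, enabled in support_tasks.items() if enabled]
```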
… invocation)
Captain hit Pydantic validation error in Claude Desktop:
Error executing tool list_checks: 1 validation error for list_checksArguments
args Field required [type=missing, input_value={}, input_type=dict]
Root cause: list_checks delegate signature was `async def list_checks(args:
ListChecksInput)` where ListChecksInput was an empty BaseModel (no real args
to pass). FastMCP exposed this as an inputSchema requiring `args` as a top-
level field — agent called with no params, MCP sent {}, validation failed.
Fix: drop the empty ListChecksInput class and signature param. Match the
no-args pattern of get_server_info. `_tool_list_checks` still receives {}
internally (the handler accepts but ignores arguments).
Confirmed only list_checks affected: row_count_diff / schema_diff have
Optional-fields-only inputs (OK), get_server_info has no-args (OK),
get_model has required model_id (OK).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…dget)
Sixth widget; first tier-3 widget. Establishes the data-table rendering pattern for the other Phase B widgets (query_diff, value_diff, value_diff_detail, top_k_diff).
- QueryInput / QueryOutput Pydantic models matching actual DataFrame.model_dump shape (READ from source: columns/data/limit/more/total_row_count)
- Sticky-header scrollable table inside ~400px container
- Cell type rendering (null → "—" italic secondary, number → tabular-nums right, string → truncated with title attr, bool → check/dash, date → display as-is)
- Truncation badge when result is capped (DataFrame.more)
- Empty state with SQL echo for debug context
- @mcp.tool with annotations (openWorldHint=True since queries hit the warehouse)
- CallToolResult short content + structuredContent
- WIDGET_TOOLS filter expanded to 6 tools
- 2 new tests (tests 15-16); enumeration assertions bumped to 6
- docs/mcp-widgets.md gains "Adding a tier-3 (data table) widget" section as template for the remaining Phase B widgets
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
…mparison)
Seventh widget. Builds on query widget (46251d5)'s data-table pattern. Compares SQL execution against base + current environments with per-row status rendering. Handles both QueryDiffTask result shapes from source.
- QueryDiffInput / QueryDiffOutput Pydantic models verified against actual QueryDiffTask.execute() return shape (QueryDiffResult)
- Two render modes detected at runtime from structuredContent:
  * Shape A (no primary_keys): side-by-side base / current tables
  * Shape B (primary_keys): join-diff table with in_a/in_b stripped, status pills (Added/Removed), row tinting, filter buttons
- _parse_dataframe() helper mirrors _parse_model_env pattern
- _warning extracted to output.warning named field (consistent with row_count_diff Day 1.5 pattern)
- Widget HTML: sticky-header scrollable tables + status pills + filter buttons (All / Added / Removed with counts) + dark-mode coverage
- @mcp.tool with openWorldHint=True (queries hit warehouse)
- WIDGET_TOOLS updated to 7 tools
- 2 new tests (registration + both Shape A and B); enumeration bumped to 7
- test_widget_tool_annotations_present updated for query_diff open-world group
- docs/mcp-widgets.md: widget count + reference table updated
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
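A sketch of the Shape B handling described above: strip the in_a/in_b flag columns and derive a status per row. It assumes in_a marks presence in base and in_b presence in current, and it simplifies columns to a plain list of names; the real DataFrame shape may carry more per-column metadata.

```python
def split_join_diff(columns, rows):
    """Drop in_a/in_b from each row and attach an Added/Removed status."""
    a, b = columns.index("in_a"), columns.index("in_b")
    keep = [i for i in range(len(columns)) if i not in (a, b)]
    out = []
    for row in rows:
        if row[a] and not row[b]:
            status = "Removed"  # only in base (assumed meaning of in_a)
        elif row[b] and not row[a]:
            status = "Added"  # only in current (assumed meaning of in_b)
        else:
            status = None
        out.append({"status": status, "cells": [row[i] for i in keep]})
    return [columns[i] for i in keep], out


def filter_counts(rows):
    """Counts for the All / Added / Removed filter buttons."""
    return {
        "All": len(rows),
        "Added": sum(1 for r in rows if r["status"] == "Added"),
        "Removed": sum(1 for r in rows if r["status"] == "Removed"),
    }
```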
…comparison)
Eighth widget. Aggregate stats + per-column match breakdown for value
comparison across two environments (primary key matching).
- ValueDiffInput / ValueDiffColumnRow / ValueDiffSummary / ValueDiffOutput
Pydantic models matching actual ValueDiffTask return shape (verified from source):
summary={total, added, removed}, data.data=[[col, matched, matched_p], ...]
where matched_p is 0.0-1.0 fraction (not percentage)
- _warning extracted to output.warning named field
- 4-up summary cards (total/common/columns-affected/added+removed) + per-column
table with inline bar visualization (CSS-only, no chart lib)
- All-match empty state when every column is 100% identical
- @mcp.tool with annotations (openWorldHint=True, queries warehouse)
- WIDGET_TOOLS at 8 tools
- 2 new tests (registration + Pydantic shape); enumeration bumped to 8
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
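Because matched_p is a 0.0-1.0 fraction, the widget has to scale it before display. A minimal sketch under that assumption, with hypothetical function names and the [col, matched, matched_p] row layout quoted in the commit:

```python
def to_display_rows(data_rows):
    """Convert [column, matched, matched_p] rows into display dicts with a percentage."""
    return [
        {
            "column": column,
            "matched": matched,
            "percent": round(matched_p * 100, 2),
            "all_match": matched_p == 1.0,
        }
        for column, matched, matched_p in data_rows
    ]


def columns_affected(data_rows) -> int:
    """Number of columns with at least one mismatch (fraction below 1.0)."""
    return sum(1 for _, _, matched_p in data_rows if matched_p < 1.0)
```

When columns_affected() returns 0, every column is fully matched, which corresponds to the all-match empty state.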
…el inspection)
Ninth widget. Row-level companion to value_diff. Shows the actual rows with mismatched values for investigation.
- ValueDiffDetailInput / Output Pydantic models matching actual ValueDiffDetailTask return shape (plain DataFrame with in_a/in_b flags)
- Scrollable table with sticky primary-key column (CSS sticky left), filter buttons (All/Removed/Added) with counts, status pill per row, row tinting (red=removed, green=added), cell type-aware rendering
- _warning extracted to output.warning named field
- @mcp.tool with annotations (openWorldHint=True)
- WIDGET_TOOLS at 9 tools
- 2 new tests (Tests 21-22); enumeration bumped to 9; 27 total passing
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Tenth widget; closes Phase B (5 tier-3 widgets). Side-by-side ranked
lists of top-K most frequent values across base + current envs.
- TopKDiffInput / TopKEnvStats / TopKDiffOutput Pydantic models matching
actual TopKDiffTask return shape (READ from source):
base/current: {values[], counts[], valids, total} parallel lists.
values[] is the SAME union list in both envs (SQL FULL OUTER JOIN order:
curr_count DESC, base_count DESC). count=0 means absent from that env.
Param is column_name (not column), default k=10 (not 50).
- 2-column grid with ranked lists per env, inline bar viz, rank-change
arrows (up/down), New/Gone badges for env-exclusive categories
- _warning extracted to output.warning named field
- @mcp.tool annotations (openWorldHint=True — executes warehouse SQL)
- WIDGET_TOOLS at 10 tools — 50% coverage (10/20), design Open Q #6
FastMCP migration trigger threshold REACHED. iter 2 to reevaluate.
- 2 new tests (23: registration + inputSchema, 24: Pydantic shape);
enumeration bumped to 10
Phase B (tier-3 data tables): query, query_diff, value_diff,
value_diff_detail, top_k_diff. Each hand-rolled its table/list layout
because patterns diverged enough that a shared <recce-table> would
need extreme flexibility. Iter 2 mini-doc: evaluate renderRankedList()
JS helper extraction; full component abstraction deferred until 3+
widgets converge on same layout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
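The New/Gone badges and rank-change arrows can be derived from the parallel lists described above. This sketch assumes a shared values list, one counts list per environment, and count 0 meaning absent; the function name and the label strings other than New and Gone are illustrative.

```python
def rank_changes(values, base_counts, current_counts):
    """Label each value as New, Gone, up, down, same, or absent."""

    def ranks(counts):
        present = [i for i, count in enumerate(counts) if count > 0]
        ordered = sorted(present, key=lambda i: -counts[i])  # stable for ties
        return {values[i]: rank + 1 for rank, i in enumerate(ordered)}

    base_rank, curr_rank = ranks(base_counts), ranks(current_counts)
    labels = {}
    for value in values:
        b, c = base_rank.get(value), curr_rank.get(value)
        if b is None and c is None:
            labels[value] = "absent"
        elif b is None:
            labels[value] = "New"
        elif c is None:
            labels[value] = "Gone"
        elif c < b:
            labels[value] = "up"
        elif c > b:
            labels[value] = "down"
        else:
            labels[value] = "same"
    return labels
```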
…t widget)
Eleventh widget; first tier-4 (statistics/chart) widget. Hand-rolled SVG
bars — no new chart library, no CSP/CDN changes. Establishes the
"tier-4 via hand-rolled SVG" pattern that profile_diff and any future
chart widgets will follow.
Actual HistogramDiffTask return shape (read from recce/tasks/histogram.py):
base/current: {counts: List[int], total: int} (or {} if env failed)
min/max: overall min/max across both envs (numeric or date object)
bin_edges: N+1 edge values (int/float for numeric, date for datetime)
labels: List[str] for numeric cols ("lo-hi" format); None for datetime
_tool_histogram_diff auto-detects column_type from catalog — tool only
needs model + column_name (+ optional num_bins).
- HistogramDiffInput / HistogramEnvStats / HistogramDiffOutput Pydantic
models matching actual HistogramDiffTask return shape
- Date bin_edges serialised to ISO strings in widget delegate
(date objects are not JSON-serialisable natively)
- Hand-rolled SVG: viewBox 600x180, base bars (blue 45% opacity) behind
current bars (green 70% opacity), x-axis label density auto-reduced
(every Nth label for bins > 10), hover tooltip via mousemove
- CSS tokens for bar colors with exhaustive @media dark fallback
- _warning extracted to output.warning named field
- @mcp.tool annotations (openWorldHint=True, hits the warehouse)
- WIDGET_TOOLS at 11 tools
- 2 new tests (25, 26); enumeration bumped to 11; annotations test
extended to cover histogram_diff openWorldHint=True
- docs/mcp-widgets.md: widget count 10→11, new "Tier-4 (Chart) Widget
Architecture" section explaining hand-roll SVG approach + iter-2
upgrade criteria for swapping to a real chart library
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
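The date handling mentioned for bin_edges comes down to the fact that the standard json module rejects date objects. A small sketch of the conversion, with hypothetical helper names:

```python
import json
from datetime import date


def serialize_bin_edges(bin_edges):
    """Convert date/datetime edges to ISO strings; leave numeric edges untouched."""
    # datetime is a subclass of date, so one isinstance check covers both.
    return [edge.isoformat() if isinstance(edge, date) else edge for edge in bin_edges]


def is_json_safe(value) -> bool:
    """Return True if json.dumps accepts the value as-is."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True
```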
…ete)
Twelfth widget; closes Phase C. Per-column statistical profile comparison
(count, null proportion, distinct count, min, max, avg, median) across base
+ current envs rendered as a card grid.
- ProfileColumnStats / ProfileColumnDiff / ProfileDiffOutput / ProfileDiffInput
Pydantic models matching actual ProfileDiffResult return shape (base + current
DataFrames with columns: column_name, data_type, row_count, not_null_proportion,
distinct_proportion, distinct_count, is_unique, min, max, avg, median)
- _parse_profile_dataframe() helper: DataFrame dict -> {col_name: ProfileColumnStats}
with _to_float / _to_int / _to_str / _to_bool coercions for agate type variants
- _parse_data_type_map() helper: DataFrame dict -> {col_name: data_type}
- Union column ordering: base columns first, then current-only appended;
columns absent from one env have base=None or current=None in the diff
- min/max arrive as str (SQL CAST to text type in PROFILE_COLUMN_JINJA_TEMPLATE)
-- no isoformat() conversion needed (unlike histogram_diff bin_edges)
- _warning extracted to output.warning named field
- @mcp.tool annotations: openWorldHint=True (warehouse queries)
- WIDGET_TOOLS now 12 tools -- 60% coverage, Open Q #6 FastMCP migration
trigger threshold EXCEEDED (was 50% hypothesis)
- 2 new tests (27, 28); enumeration bumped to 12 in test 3; profile_diff
added to annotation loop in test 8; openWorldHint assertion added
- HTML: 521 lines, ~22KB; CSS grid card layout (260px min per card); stat
mini-table per card with Base->Current columns; delta chips for numeric
changes; pp (percentage-point) delta for proportions; type classification
(numeric/text/date-time/boolean/other) for badge and stat visibility
- No sparklines -- ProfileDiffTask returns aggregate stats only, no per-bin data
Phase C (tier 4 chart/stats): histogram_diff (SVG bars), profile_diff (CSS grid
cards). Both keep CSP at single unpkg origin. If iter 3+ needs richer chart types
(stacked bar, line, heatmap), evaluate Chart.js / Vega-Lite then.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
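The union column ordering rule (base columns first, then current-only columns, with None for the missing side) can be sketched as below. The function name and the per-column stats dicts are placeholders, not the real ProfileColumnStats models.

```python
def union_columns(base: dict, current: dict) -> list:
    """Order columns base-first, append current-only ones, and pair up the stats."""
    order = list(base) + [col for col in current if col not in base]
    return [
        {"column": col, "base": base.get(col), "current": current.get(col)}
        for col in order
    ]
```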
Thirteenth widget; first tier-5 (mini graph) widget. Column-level lineage rendered as a hand-rolled SVG mini-DAG with layered layout (sources left, target middle, downstream right). Establishes the "tier-5 via hand-rolled SVG with bail-out" pattern.
Actual CllData shape (from recce/models/types.py) uses nodes/columns dicts and parent_map/child_map sets, significantly different from the placeholder nodes+edges list shape in the mission. Column-to-column edges come from CllColumn.depends_on, not the top-level parent_map (which is node-level). Adapter logic in the delegate normalises sets → lists for JSON serialisation.
- GetCllInput / GetCllColumnDep / GetCllColumnInfo / GetCllNodeInfo / GetCllOutput Pydantic models matching actual CllData shape
- Hand-roll SVG: BFS layered layout, bezier edge routing via depends_on, model cards with per-column rows, target column highlighted in blue
- Complexity bail-out: >12 nodes or >30 edges → text summary list with hint to use Recce web app lineage view for full DAG
- @mcp.tool annotations (openWorldHint=False: manifest read, no warehouse)
- WIDGET_TOOLS bumped to 13 tools
- 2 new tests (enumeration bumped to 13, annotations test updated)
- docs/mcp-widgets.md gains Tier-5 Widget Architecture section explaining layered layout, bezier routing, bail-out approach, iter-2 considerations
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
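A Python sketch of the two ideas in this commit: assigning each node to a layer by breadth-first search, and bailing out to a text summary when the graph is too large. The thresholds come from the commit message; the function names and the (parent, child) edge-list input are assumptions, and the actual widget does this in JavaScript.

```python
from collections import deque

MAX_NODES = 12  # thresholds quoted in the commit message
MAX_EDGES = 30


def should_bail_out(node_count: int, edge_count: int) -> bool:
    """True when the graph exceeds what the hand-rolled SVG is meant to draw."""
    return node_count > MAX_NODES or edge_count > MAX_EDGES


def layer_nodes(edges, roots):
    """Assign each reachable node its BFS distance from the nearest root."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    layer = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in layer:
                layer[child] = layer[node] + 1
                queue.append(child)
    return layer
```

Each layer index can then map to an x position, with nodes inside a layer stacked vertically. BFS places a node at its shortest distance from a root, so an edge may connect two nodes in the same layer.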
…mplete)
Fourteenth widget; closes Phase D (2 mini-graph widgets). Model-level blast radius: confirmed/potential impact per downstream model with next-action hints. Hand-roll SVG mini-DAG + below-SVG actionable list.
- ImpactedModel / NextAction / ImpactAnalysisOutput Pydantic models matching actual _tool_impact_analysis return shape (_guidance, row_count, value_diff summary, schema_changes, next_action per model)
- ImpactValueDiffSummary (renamed to avoid shadowing existing ValueDiffSummary)
- Hand-roll SVG: 2-layer BFS DAG (modified left, downstream right) with per-node impact badge chips, row-count/value-diff metric, next_action hint
- 'What to investigate next' actionable list grouped by priority (high/medium/low)
- Bail-out for >15 nodes: skip SVG, show summary counts + actionable list only
- _warning extracted to output.warning, _guidance to output.guidance
- WIDGET_TOOLS at 14 tools, 70% coverage (only lineage_diff in Phase E left)
- 2 new tests; enumeration bumped to 14 (37 total pass)
Phase D complete (get_cll, impact_analysis). Both use hand-roll SVG mini-DAG with BFS layered layout. Approach scales to ~15 model nodes; anything larger hits bail-out. Phase E (lineage_diff, full DAG with potentially hundreds of nodes) needs ReactFlow + the @datarecce/ui lineage borrow contract; can't be served by hand-roll SVG. Captain decision required before Phase E.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
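As a rough illustration of the "grouped by priority" actionable list, the sketch below buckets impacted models by the priority of their next action. The dict shape (`name`, `next_action.priority`, `next_action.hint`) and the helper name `group_by_priority` are assumptions made for the example; the commit only states that each model carries a `next_action` and that the list is grouped as high/medium/low.

```python
PRIORITY_ORDER = ("high", "medium", "low")


def group_by_priority(impacted_models):
    """Bucket models by next_action priority, preserving input order.

    Each item is assumed to look like
    {"name": ..., "next_action": {"priority": ..., "hint": ...}}.
    Models without a next_action are skipped.
    """
    groups = {priority: [] for priority in PRIORITY_ORDER}
    for model in impacted_models:
        action = model.get("next_action")
        if not action:
            continue
        # Missing priority falls back to "low" (an assumption of this sketch).
        priority = action.get("priority", "low")
        groups.setdefault(priority, []).append(
            (model["name"], action.get("hint", ""))
        )
    return groups
```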
…ema_diff
Phase D impact_analysis introduced a second `class SchemaChange` at module
top level, which silently shadowed the schema_diff model defined earlier in
the file. Module-global binding meant schema_diff serialization tried to fit
{added, removed, type_changed, unchanged_count} into the impact_analysis
shape and failed Pydantic validation with "column / change_status Field
required".
Rename the impact_analysis variant to `ColumnSchemaChange` and add a
regression test that pins both field surfaces so a future widget tool
cannot reintroduce the shadow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
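A minimal sketch of the kind of regression test described above, pinning the field surface of both models so that reusing a class name at module level is caught. It uses stdlib dataclasses as a stand-in for the Pydantic models so it runs without dependencies. The field names come from the commit message, while `field_names` and the test name are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class SchemaChange:
    """schema_diff shape: lists of column changes per model."""
    added: list
    removed: list
    type_changed: list
    unchanged_count: int


@dataclass
class ColumnSchemaChange:
    """impact_analysis shape: one record per changed column."""
    column: str
    change_status: str


def field_names(model_cls):
    """Return the declared field names of a dataclass as a set."""
    return {f.name for f in fields(model_cls)}


def test_schema_models_keep_distinct_field_surfaces():
    # If a later class definition reuses the name SchemaChange, the
    # module-level name points at the newer class and this assertion fails.
    assert field_names(SchemaChange) == {
        "added", "removed", "type_changed", "unchanged_count"
    }
    assert field_names(ColumnSchemaChange) == {"column", "change_status"}
```

The point of pinning both surfaces is that Python raises no error when a module defines the same class name twice; the second definition simply rebinds the name, so only a test on the expected fields notices.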
…e cap)
15th widget tool. Hand-roll SVG mini-DAG (no Mermaid/ReactFlow dep) copying the impact_analysis BFS layout pattern. Hard cap MAX_INLINE_NODES=10: over cap shows graceful empty-state message pointing to Recce web UI, no truncated view. Full ReactFlow plan from design Phase E remains deferred.
Motivation: without a lineage_diff widget, Claude Desktop's agent falls back to rendering lineage as Mermaid text after seeing impact_analysis output, bypassing the widget rendering pipeline. Adding this widget keeps lineage visualization inside the same theming + interaction surface as the other 14 widgets and brings widget coverage to 15/20 (75%).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
These two Claude Code plugins back the MCP App widget work in this branch (ext-apps SDK reference, MCP server scaffolding). Enabling them in the repo settings keeps the dev environment consistent for anyone continuing the widget POC.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
The impact_analysis value_diff builder derived its per-column diff list from model_info["columns"], which reflects only the CURRENT relation (get_model -> get_columns(base=False)). It then applied `b."col" IS DISTINCT FROM c."col"` to BOTH the base (b) and current (c) relations. When a column had drifted (present in the current physical table but not the base), the warehouse binder failed hard (Snowflake: `Table "b" does not have a column named "<col>"`).
Restrict the diff to the intersection of base and current columns via get_model(node_id, base=True) (which manages its own warehouse connection), mirroring ValueDiffTask. Drifted columns are already reported via schema_changes. Also skip value_diff when the PK itself has drifted, since it is the FULL OUTER JOIN key.
Adds a regression test asserting drifted columns never reach the generated SQL and that value_diff covers only common non-PK columns.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
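The column-intersection fix can be illustrated with a small sketch. It assumes the base and current column names are already available as plain lists; the function name `build_value_diff_conditions` and its return shape are hypothetical, and only the `IS DISTINCT FROM` comparison and the skip-on-PK-drift rule come from the commit message.

```python
def build_value_diff_conditions(base_columns, current_columns, primary_key):
    """Return per-column IS DISTINCT FROM conditions, or None to skip.

    Only columns present in BOTH relations are compared, so the generated
    SQL never references a column that one side lacks.
    """
    base_set = set(base_columns)
    current_set = set(current_columns)

    # The primary key is the FULL OUTER JOIN key; if it is missing on
    # either side there is nothing to join on, so value_diff is skipped.
    if primary_key not in base_set or primary_key not in current_set:
        return None

    # Keep the current relation's column order; drop drifted columns and the PK.
    common = [
        col for col in current_columns
        if col in base_set and col != primary_key
    ]
    return {col: f'b."{col}" IS DISTINCT FROM c."{col}"' for col in common}
```

With base columns `id, amount` and current columns `id, amount, coupon`, only `amount` is compared; the drifted `coupon` column never reaches the SQL and is left to the schema-change report.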
`logs/` (dbt run logs) and `.hypothesis/` (hypothesis testing example cache) are runtime/test artifacts regenerated on every test run, but the existing rules only covered the root `dbt.log` file, not the `logs/` directory dbt writes to, nor `integration_tests/dbt/logs/`. They showed as persistent untracked changes in the workspace UI. Ignore the dir forms (no leading slash → matches at any depth).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Summary
Recce MCP App POC (DRC-3526). Ships 15 widget-enabled MCP tools rendered as interactive HTML widgets in Claude Desktop and other MCP App-capable hosts, plus a `recce mcp-config-install` helper that auto-writes the Claude Desktop config so POC users skip the hand-edited JSON.

Architecture is a parallel FastMCP-based widget server (`recce mcp-widget-server`) alongside the existing low-level `recce mcp-server`, coordinated by `RECCE_MCP_WIDGETS=1`. Day 0 spike found low-level `mcp.server.Server` doesn't render widgets in Claude Desktop while `mcp.server.fastmcp.FastMCP` does, hence two servers rather than upgrading the existing one. Full FastMCP migration is deferred (design Open Q #6; now at 15/20 = 75% widget coverage, past the ≥50% reconsideration threshold).

Demo journeys (recorded)
Both journeys use plain reviewer-style prompts, with no `mcp__recce-widgets__<tool>` tool-naming, to validate that a non-technical reviewer's natural phrasing reliably triggers the right widget.

Demo 1: Quick scan (Journey A) · Loom
Replaces the AI-summary "block of text" with inline widgets so a reviewer grasps "what changed" in seconds. PR scenario: jaffle_shop_golden#15 (refactor `customers.sql` → two intermediate models).
Prompt 1 → `row_count_diff` widget:
Prompt 2 → `schema_diff` widget:

Demo 2: Investigate impact (Journey 2) · Loom
Blast-radius summary → evidence drilldown, with the agent following the `next_action` advisory hint. PR scenario: jaffle_shop_golden#16 (`amount` DOUBLE → DECIMAL on `stg_payments`).
Prompt 1 → `impact_analysis` widget:
Prompt 2 → `profile_diff` widget:

What's in this PR
New surface:
- `recce mcp-widget-server` CLI subcommand (local mode only; iter 1 scope)
- `recce mcp-config-install` CLI helper: writes both server entries + `RECCE_MCP_WIDGETS=1` into `claude_desktop_config.json` (macOS, iter 1 scope), with `--dry-run` and `--yes`, and a backup of the existing config
- `recce/widget_server.py`: FastMCP server with 15 `@mcp.tool` delegates + matching `@mcp.resource` handlers, idiomatic shape (Pydantic input/output models, `CallToolResult` with short `content` + `structuredContent`, tool annotations, ext-apps SDK theme helpers)
- `recce/data/mcp/*.html`: 15 self-contained HTML widgets, each with exhaustive `@media (prefers-color-scheme: dark)` overrides and Claude design-token theming
- `WIDGET_TOOLS` filter in `recce/mcp_server.py`: when `RECCE_MCP_WIDGETS=1`, removes the 15 widget-eligible tools from the main server's `list_tools` (defers to widget server)

15 widgets by tier:
- `row_count_diff`, `schema_diff`, `get_server_info`, `list_checks`, `get_model`
- `query`, `query_diff`, `value_diff`, `value_diff_detail`, `top_k_diff`
- `histogram_diff`, `profile_diff`
- `get_cll`, `impact_analysis`
- `lineage_diff`: hand-roll SVG, hard 10-node cap, graceful empty-state over cap

Bug fixes (pre-existing, surfaced by the POC):
- `recce/track.py` `print(traceback.format_exc())` → `file=sys.stderr`: stdout traceback is fatal for any stdio MCP server when RecceConfig fails
- `recce mcp-server` / `mcp-widget-server` now `os.chdir(--project-dir)` before `RecceConfig`, so Claude Desktop's `cwd=/` spawn finds the dbt project + recce.yml (removes the bash-wrapper workaround)
- `SchemaChange` class-name collision: Phase D's `impact_analysis` defined a second module-level `SchemaChange` that shadowed `schema_diff`'s model, breaking `schema_diff` serialization with a Pydantic "column / change_status Field required" error. Renamed to `ColumnSchemaChange`; regression test pins both field surfaces.

Quality / docs:
- `tests/test_widget_server.py`: 35 tests (registration via public FastMCP API, env-var coordination 18-vs-20 tools, file-missing graceful degradation, `CallToolResult` shape, Pydantic structuredContent round-trip, annotations, collision guard, `lineage_diff` 10-node-cap branch)
- `tests/test_mcp_config_install.py`: 5 tests (project-dir validation, dry-run no-write, backup creation)
- `docs/mcp-widgets.md`: template doc for Scott's hand-off, covering file layout, config example, "Add widget N+1" walkthrough, structuredContent contract, gotchas, Python-vs-TypeScript SDK note

Test plan
- `pytest tests/test_widget_server.py` → 35 passed
- `pytest tests/test_mcp_config_install.py` → 5 passed
- `pytest tests/ -k "mcp or widget or schema_diff or impact or lineage"` → 363 passed, no widget-related regressions
- `recce mcp-widget-server --project-dir <dbt>`): exit 0, clean stderr, no stdout pollution

Note: `tests/test_server.py::test_spa_route_lineage` fails locally because the frontend SPA build artifact (`recce/data/lineage/index.html`) isn't present in this worktree. This is pre-existing and unrelated to this PR (fails identically with the diff stashed; resolved by `cd js && pnpm run build`).

User-facing changes
For Claude Desktop POC users:
- `recce mcp-config-install --project-dir /path/to/dbt` (writes both entries + env var, backs up existing config). Or hand-edit per `docs/mcp-widgets.md`.
- `RECCE_MCP_WIDGETS=0`/unset → zero change, all 20 tools text-only from `recce mcp-server`, backward compatible.
- `RECCE_MCP_WIDGETS=1` → `recce mcp-server` serves 5 non-widget tools, `recce mcp-widget-server` serves the 15 widget tools. No name collision.
- `_meta.ui.resourceUri`, uses the tool's one-sentence `content` text.

What's NOT in this PR
- `recce mcp-widget-server` (iter 1 = local mode only). This is the strategic next step: MCP ↔ Recce Cloud is what lets non-technical users skip git/dbt entirely and is where PR-selection + artifact-packaging are properly solved (not as extra local tools).
- `lineage_diff` beyond v1: progressive degradation for >10 nodes (shrink / zoom / aggregate) and the ReactFlow `@datarecce/ui/lineage` borrow contract
- `_recce_server` still module-global)

Reviewer notes
- `docs/mcp-widgets.md` is the canonical Scott hand-off doc; if the "Add widget N+1" walkthrough leaves a cold reader unsure, that's PR-blocking feedback.
- `rgb()`, no `@datarecce/ui` internal-path imports (widget HTML is plain JS).

🤖 Generated with Claude Code