feat(mcp): scaffold api/mcp module (T1 #648)#666
Conversation
… entry point Add the bare MCP server module (api/mcp/) using the official FastMCP SDK, wire the cgraph-mcp console script in pyproject.toml, and include a protocol smoke test that spawns the server over stdio and verifies list_tools returns an empty tool set. Also copies the MCP design docs into docs/. Closes #648 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughScaffolds an MCP stdio server under ChangesMCP Server Scaffolding & Delivery
Sequence Diagram(s)sequenceDiagram
participant CLI as "cgraph-mcp (process)"
participant Server as "api.mcp.server (FastMCP)"
participant Client as "mcp.ClientSession (stdio client)"
CLI->>Server: start (stdio transport)
Client->>Server: initialize() (MCP handshake)
Server-->>Client: initialize response
Client->>Server: tools/list()
Server-->>Client: tools: []
Client->>CLI: exit
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Scaffolds an MCP (Model Context Protocol) server module within the existing Python backend so external agents can connect via stdio, and establishes the packaging/test wiring needed to evolve into a full “code-graph” MCP server.
Changes:
- Add
api/mcp/with a bareFastMCP("code-graph")app plus a stdiomain()runner. - Wire a new
cgraph-mcpconsole script and add themcp>=1,<2dependency (with updated lockfile). - Add MCP stdio smoke tests and check in MCP design documentation assets.
Reviewed changes
Copilot reviewed 5 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
api/mcp/server.py |
Introduces the FastMCP app instance and stdio runner entry point. |
api/mcp/__init__.py |
Adds module-level documentation for the MCP server package. |
pyproject.toml |
Adds mcp dependency and cgraph-mcp console-script wiring. |
uv.lock |
Locks transitive dependencies introduced by mcp (and its dependency graph). |
tests/mcp/test_scaffold.py |
Adds a stdio protocol smoke test verifying handshake + empty tool list. |
tests/mcp/__init__.py |
Establishes the tests.mcp package. |
docs/MCP_SERVER_DESIGN.md |
Adds MCP server design summary and roadmap for Phase 1+. |
docs/code-graph-mcp-v4.docx |
Adds the full design document (binary). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| [project.scripts] | ||
| cgraph = "api.cli:app" | ||
| cgraph-mcp = "api.mcp.server:main" | ||
|
|
There was a problem hiding this comment.
The design/issue text specifies the console-script entry point as api.mcp.server:app, but pyproject.toml wires cgraph-mcp to api.mcp.server:main. Please align the entry point target with the documented contract (either change the script target back to :app if supported by the MCP SDK, or update the docs/issue references to :main).
| "javatools>=1.6.0,<2.0.0", | ||
| "pygit2>=1.17.0,<2.0.0", | ||
| "typer>=0.24.0,<1.0.0", | ||
| "mcp>=1.0.0,<2.0.0", | ||
| ] |
There was a problem hiding this comment.
Adding mcp as a core dependency pulls in heavier transitive deps (notably pyjwt[crypto] → cryptography, plus python-multipart, sse-starlette, etc. per uv.lock). Please confirm this footprint is acceptable for all installs of the base package; if MCP is optional, consider moving it under an extra to avoid requiring compiled wheels for users who don’t use cgraph-mcp.
| ## Key Decisions Made | ||
|
|
||
| - **Mono-repo**: Build inside `code-graph/api/mcp/`, not a separate project. One pip package, one repo. | ||
| - **Module path is `api/mcp/`, NOT top-level `mcp/`**: A top-level `mcp/` directory would shadow the installed `mcp` PyPI SDK and break `from mcp.server.fastmcp import FastMCP`. Entry point: `cgraph-mcp = "api.mcp.server:app"`. |
There was a problem hiding this comment.
This section documents the entry point as cgraph-mcp = "api.mcp.server:app", but the PR wires the console script to api.mcp.server:main. Please update this doc (and any other occurrences) so the documented entry point matches the actual packaging.
| - **Module path is `api/mcp/`, NOT top-level `mcp/`**: A top-level `mcp/` directory would shadow the installed `mcp` PyPI SDK and break `from mcp.server.fastmcp import FastMCP`. Entry point: `cgraph-mcp = "api.mcp.server:app"`. | |
| - **Module path is `api/mcp/`, NOT top-level `mcp/`**: A top-level `mcp/` directory would shadow the installed `mcp` PyPI SDK and break `from mcp.server.fastmcp import FastMCP`. Entry point: `cgraph-mcp = "api.mcp.server:main"`. |
| │ ├── test_ask.py # T11 — mocked LLM, real Cypher against fixture | ||
| │ ├── test_auto_init.py # T12 | ||
| │ └── test_init_agent.py # T13 | ||
| └── pyproject.toml # Adds `cgraph-mcp = "api.mcp.server:app"` and `mcp>=1.0,<2.0` | ||
| ``` | ||
|
|
There was a problem hiding this comment.
The directory structure example also states cgraph-mcp = "api.mcp.server:app", but the package currently exports cgraph-mcp = "api.mcp.server:main". Please keep the design doc consistent with the actual pyproject.toml script target.
| - **GraphRAG SDK powers the ask tool**: `kg.ask()` does NL-to-Cypher. Code-graph's `api/llm.py` already integrates this — repackage for MCP. | ||
| - **Reuse the existing hand-coded ontology** from `api/llm.py:_define_ontology()` (lines 26–233) rather than auto-extracting via `Ontology.from_kg_graph()`. The hand-coded version has richer entity attributes and descriptions tuned for code. Refactor: rename `_define_ontology` → `define_ontology` so the MCP module can import it. | ||
| - **Ship with 3 languages** (Python/Java/C#), add tree-sitter for broad coverage in Phase 2. | ||
| - **No incremental indexing in v1**: Full re-indexing is sufficient. Deferred to Phase 3. |
There was a problem hiding this comment.
The “Key Decisions Made” list says “No incremental indexing in v1” (deferred), but later in this same document it describes incremental indexing as “default-on” in Phase 1. Please reconcile these statements so the design doc has a single, consistent plan.
| - **No incremental indexing in v1**: Full re-indexing is sufficient. Deferred to Phase 3. | |
| - **Incremental indexing in v1**: First-time indexing is full; once a `(project, branch)` graph exists, incremental indexing is the default. Allow explicit full re-index via `--full` (CLI) or `incremental=False` (MCP). |
| params = StdioServerParameters(command=cgraph_mcp, args=[]) | ||
| async with stdio_client(params) as (read, write): | ||
| async with ClientSession(read, write) as session: | ||
| await session.initialize() | ||
| result = await session.list_tools() | ||
| assert result.tools == [] |
There was a problem hiding this comment.
This protocol smoke test can hang indefinitely if the spawned server fails to start or the MCP handshake stalls (no timeout around initialize() / list_tools()). Consider wrapping the stdio session block in an anyio.fail_after(...) (or equivalent pytest timeout) to keep CI runs from blocking forever on failures.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/MCP_SERVER_DESIGN.md`:
- Line 18: Update the documented console-script entry point from
"api.mcp.server:app" to the actual runtime target "api.mcp.server:main" wherever
it appears in the design doc (e.g., the references currently stating
api.mcp.server:app around the console-script/entrypoint description), so the doc
matches the pyproject wiring and avoid misleading setup instructions.
- Line 37: Several fenced code blocks in the document (the ones showing
"code-graph/ ..." the ASCII graph starting with "T1 ──┬─> T2 ..." and the cli
example "claude mcp add-json \"code-graph\" ...") are missing language
identifiers; update those backtick-fenced blocks to include appropriate language
tags (e.g., ```text for directory/listing and ASCII art, ```bash for the CLI
command) so markdownlint MD040 is satisfied and the blocks render correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 208a6980-6e4c-4d2e-92e2-51f4193a35af
⛔ Files ignored due to path filters (2)
docs/code-graph-mcp-v4.docxis excluded by!**/*.docxuv.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
api/mcp/__init__.pyapi/mcp/server.pydocs/MCP_SERVER_DESIGN.mdpyproject.tomltests/mcp/__init__.pytests/mcp/test_scaffold.py
- Fix stale entry point references in design doc: api.mcp.server:app → :main - Remove contradicting decisions about tree-sitter/incremental indexing scope - Add language tags to fenced code blocks (MD040) - Add anyio.fail_after timeout to stdio smoke test to prevent CI hangs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- server: pass transport="stdio" explicitly to guard against future FastMCP default changes - test: drop STDIO_TIMEOUT to 10s (a stuck handshake should fail fast) - test: pin anyio backend to asyncio via fixture so transitive trio installs cannot silently double-run the test - pyproject: add anyio to test extras since the smoke test imports it directly (was previously available only via mcp's transitives) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without this, uv sync detects pyproject/lockfile drift on CI and silently re-resolves the entire dep tree to newer versions (uvicorn 0.41.0 → 0.46.0 was observed), which broke the e2e playwright suite. Lock now matches pyproject so installs are reproducible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 0c7e3db.
The falkordb/falkordb:latest base image is now Debian Trixie-based and arrives with apt in a state where the t64 ABI deps that git and build-essential require (libcurl3t64-gnutls, libtinfo6, libc6-dev, etc.) are held back. apt itself recommends `apt --fix-broken install`. Running `apt-get install -y -f` between update and the real install clears the broken state so the install can proceed. Verified locally against the exact base image digest CI uses (sha256:aaf67c724bba36b9fb8d43a2671fd57e89c536b971d72b692a63a168c8053ff4). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GraphRAG-SDK released v1.0 (April 16) and force-pushed history during the release, dropping the pre-v1.0 API surface that the e2e tests were built against. Cloning HEAD now produces a graph without the merge_with/combine/import_data/add_node/add_edge/ask Function nodes the tests interact with. Switch to analyzing the installed graphrag-sdk package (pinned to 0.8.2 via uv.lock — immutable on PyPI). flask clone stays for autocomplete variety on set/lo/as substrings. ensure_calls_edges keeps acting as a safety net for the two required CALLS edges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
e2e/seed_test_data.py (1)
20-21: 💤 Low valueConsider removing the hardcoded version reference.
The comment explicitly mentions "0.8.2," which will require manual updates if the pinned version changes. Consider either:
- Removing the version number and keeping only the rationale ("Upstream HEAD has the new v1.0 API...")
- Making the comment more generic: "pinned via uv.lock to avoid v1.0 API incompatibility"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/seed_test_data.py` around lines 20 - 21, Update the inline comment that currently includes the hardcoded version "0.8.2" in the line starting with "Use the installed graphrag-sdk..." so it does not pin a specific version; either remove "0.8.2" entirely and keep the rationale ("Upstream HEAD has the new v1.0 API which the tests aren't built for") or reword to a generic note such as "pinned via uv.lock to avoid v1.0 API incompatibility" to avoid future manual edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@e2e/seed_test_data.py`:
- Around line 20-21: Update the inline comment that currently includes the
hardcoded version "0.8.2" in the line starting with "Use the installed
graphrag-sdk..." so it does not pin a specific version; either remove "0.8.2"
entirely and keep the rationale ("Upstream HEAD has the new v1.0 API which the
tests aren't built for") or reword to a generic note such as "pinned via uv.lock
to avoid v1.0 API incompatibility" to avoid future manual edits.
Two follow-ups to address the remaining 7 of 31 e2e failures: 1. Copy installed graphrag-sdk to a tempdir before analyzing. When the source path lives under .venv/lib/.../site-packages/, LSP treats it as an installed library and stops resolving call sites between functions (analyzer produced 0 CALLS edges vs 392 on the April 12 baseline). Copying to /tmp lets LSP treat it as a project and restores organic call-graph extraction. 2. Synthesize missing Function nodes in ensure_calls_edges. import_data has no `def` in any graphrag-sdk version (was a phantom from LSP resolution into a transitive dep). MERGE both source and dest Function nodes with minimal properties so the e2e path tests can find them. Adds the Searchable label so autocomplete works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the last 3 of the original 31 e2e failures. 1. Pass url= to Project() so save_repo_info populates Redis. The /api/repo_info endpoint returns 400 if repo_info is None, which broke canvas:167 with TypeError on response.info.node_count. 2. Synthesize test_<module> Function nodes for the search-bar tests. testData.ts parametrizes over searchInput "test", but graphrag-sdk 0.8.2 has zero functions whose names contain "test", so the auto-scroll dropdown isn't scrollable and the auto-complete count is 0. 12 synthesized names give the dropdown enough to scroll. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
api/mcp/module with a bareFastMCP("code-graph")server instancecgraph-mcpconsole-script entry point inpyproject.tomlmcp>=1.0,<2.0dependencydocs/MCP_SERVER_DESIGN.md,docs/code-graph-mcp-v4.docx) into the repotests/mcp/test_scaffold.py) that spawns the server over stdio, completes the MCP handshake, and assertslist_toolsreturns an empty setTest plan
uv run pytest tests/mcp/test_scaffold.py -v— 3/3 pass (import, entry point, stdio round-trip)ruff check api/mcp/ tests/mcp/— cleancgraph-mcpresolves on PATH afteruv syncCloses #648
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores