Commit 150e14f

docs: rewrite codegraph-mcp readme

1 parent c7a1e68 · commit 150e14f
1 file changed: crates/codegraph-mcp/README.md (+127 additions, -114 deletions)

@@ -1,145 +1,158 @@

Removed (original README.md):

# CodeGraph MCP

An async, type-safe Rust implementation for the Model Context Protocol (MCP) with a comprehensive CLI for server management and project indexing.

Features
- Async/await with Tokio
- JSON-RPC 2.0 message handling
- MCP initialize handshake with version negotiation
- WebSocket transport (tokio-tungstenite)
- Heartbeat with websocket ping/pong
- Connection pooling with least-busy selection
- Comprehensive error types

Quick Start
```rust
use codegraph_mcp::{connection::McpClientConfig, McpConnection};
use url::Url;

#[tokio::main]
async fn main() -> codegraph_mcp::Result<()> {
    let url = Url::parse("wss://localhost:8081/mcp").unwrap();
    let cfg = McpClientConfig::new(url);
    let client = McpConnection::connect(&cfg).await?;

    // Typed request example
    #[derive(serde::Serialize)]
    struct EchoParams { value: String }
    #[derive(serde::Deserialize)]
    struct EchoResult { echoed: String }

    let res: EchoResult = client
        .send_request_typed("codegraph/echo", &EchoParams { value: "hello".into() })
        .await?;
    println!("echoed={}", res.echoed);

    client.close().await?;
    Ok(())
}
```

Notes
- Supported protocol versions: 2024-11-05, 2025-03-26, 2025-06-18 (default)
- Uses websocket ping/pong for heartbeat; integrates with HeartbeatManager
- Requests are timed out; responses routed via in-flight map

## CLI Usage

The `codegraph` CLI provides comprehensive tools for managing MCP servers and indexing projects.

### Installation

```bash
cargo install --path .
```

### Server Management

```bash
# Start MCP server with STDIO transport
codegraph start stdio

# Start with HTTP transport
codegraph start http --host 127.0.0.1 --port 3000

# Start with dual transport (STDIO + HTTP)
codegraph start dual --port 3000

# Check server status
codegraph status --detailed

# Stop server
codegraph stop
```

### Project Indexing

```bash
# Index current directory
codegraph index .

# Index with specific languages
codegraph index . --languages rust,python,typescript

# Watch for changes and auto-reindex
codegraph index . --watch

# Force reindex with multiple workers
codegraph index . --force --workers 8
```

### Code Search

```bash
# Semantic search
codegraph search "authentication handler"

# Exact match
codegraph search "fn process_data" --search-type exact

# Regex search
codegraph search "fn \w+_handler" --search-type regex

# Output as JSON
codegraph search "database" --format json --limit 20
```

### Configuration

```bash
# Show configuration
codegraph config show

# Set configuration values
codegraph config set embedding_model openai
codegraph config set vector_dimension 1536

# Validate configuration
codegraph config validate
```

### Statistics

```bash
# Show all statistics
codegraph stats

# Index statistics only
codegraph stats --index --format json
```

### Project Initialization

```bash
# Initialize new project
codegraph init --name my-project
```

### Cleanup

```bash
# Clean all resources
codegraph clean --all --yes
```

For more detailed documentation, run `codegraph --help` or `codegraph <command> --help`.

Added (new README.md):

# CodeGraph MCP – Agentic Tool Server

CodeGraph MCP is the SurrealDB-backed Model Context Protocol server that powers the `agentic_*` tool suite used by AutoAgents and other reasoning-first assistants. The crate wraps our AutoAgents orchestrator, Surreal graph functions, and semantic search pipeline into a single binary so LLM agents can issue rich analysis requests over MCP (STDIO or HTTP) without having to understand the underlying codebase structure.

## Highlights

- **AutoAgents orchestration** – every tool request spins up a ReAct-style plan (tier-aware prompts, max-step guards, intermediate reasoning logs) so LLMs explore the graph incrementally instead of returning shallow answers.
- **Seven advanced MCP tools** – multi-hop graph queries, semantic synthesis, and architecture diagnostics wrapped behind stable MCP endpoints.
- **SurrealDB for graph + embeddings** – all structure resides in SurrealDB (nodes, edges, embeddings, context caches). No RocksDB/FAISS dependencies remain.
- **Flexible transports** – STDIO for local agents (Claude, GPTs, AutoAgents) and an optional HTTP transport for remote evaluation (`test_http_mcp.py`).
- **Structured outputs** – every response follows our JSON schemas so downstream agents can capture file paths, node IDs, and prompt traces deterministically.

## Agentic Tool Suite

| Tool name | Purpose |
|---------------------------------|-----------------------------------------------------------------------------------------------|
| `agentic_code_search` | Multi-step semantic/code search with AutoAgents planning. |
| `agentic_dependency_analysis` | Explores transitive `Imports`/`Calls` edges for a symbol, ranks hotspots, flags potential risks. |
| `agentic_call_chain_analysis` | Traces execution from an entrypoint (e.g., `execute_agentic_workflow`) to graph tools. |
| `agentic_architecture_analysis` | Calculates coupling metrics, hub nodes, and layering issues for a subsystem. |
| `agentic_api_surface_analysis` | Enumerates public functions/structs of a component and links them back to files/lines. |
| `agentic_context_builder` | Gathers everything needed for prompt construction (files, dependencies, semantic neighbors). |
| `agentic_semantic_question` | Free-form “explain X” queries that blend embeddings + structured graph walks. |

Each handler lives in `official_server.rs` and funnels into `execute_agentic_workflow`, which:

1. Detects the LLM tier (`Small`, `Medium`, `Large`, `Massive`) using context-window metadata.
2. Picks the corresponding prompt template from `prompts/`.
3. Executes SurrealDB graph functions (`codegraph_graph::GraphFunctions`) and semantic search helpers.
4. Streams reasoning steps + final answers back through MCP.
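
The first two steps can be pictured with a small, self-contained sketch. The tier thresholds, prompt-file layout, and function names below are illustrative assumptions, not the actual code in `official_server.rs` or `context_aware_limits.rs`:

```rust
// Sketch only: tier detection + prompt selection as described above.
// Thresholds and naming are assumptions for illustration.
#[derive(Debug)]
enum LlmTier { Small, Medium, Large, Massive }

// Assumption: the tier is derived from the model's context-window size.
fn detect_tier(context_window_tokens: usize) -> LlmTier {
    match context_window_tokens {
        0..=8_192 => LlmTier::Small,
        8_193..=32_768 => LlmTier::Medium,
        32_769..=131_072 => LlmTier::Large,
        _ => LlmTier::Massive,
    }
}

// Assumption: prompt templates are keyed by tool name and tier.
fn pick_prompt(tool: &str, tier: &LlmTier) -> String {
    format!("prompts/{tool}.{tier:?}.md")
}

fn main() {
    // Steps 1-2 of the flow; steps 3-4 (graph functions, semantic
    // search, MCP streaming) need a live SurrealDB instance.
    let tier = detect_tier(32_000);
    let prompt = pick_prompt("agentic_dependency_analysis", &tier);
    println!("tier={tier:?} prompt={prompt}");
}
```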

## Architecture at a Glance

- **AutoAgents + CodeGraph AI** – enabled via the `ai-enhanced` feature flag. The orchestrator uses AutoAgents’ planner/critic/executor roles, but with our own prompts and structured tool calls.
- **Surreal Graph Storage** – compile with `codegraph-graph/surrealdb` (the default in this repo) and provide Surreal credentials via `CODEGRAPH_SURREALDB_*` env vars or `config/surrealdb_example.toml`.
- **Embedding Providers** – choose via `CODEGRAPH_LLM_PROVIDER` and the `embeddings-*` Cargo features (Ollama, OpenAI, Jina, etc.). All agentic tools require embeddings because they mix symbolic + vector context.
- **Transports** – the STDIO transport listens on stdin/stdout (ideal for MCP-compliant hosts) and the optional HTTP server exposes `/mcp` for streaming JSON-RPC over HTTP.
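
As a minimal illustration of how a Cargo feature flag like `ai-enhanced` gates code at compile time (the module name and messages are placeholders, not the crate's real layout):

```rust
// Placeholder sketch of compile-time feature gating.
#[cfg(feature = "ai-enhanced")]
mod orchestrator {
    pub fn describe() -> &'static str { "AutoAgents orchestrator enabled" }
}

#[cfg(not(feature = "ai-enhanced"))]
mod orchestrator {
    pub fn describe() -> &'static str { "built without ai-enhanced; agentic tools unavailable" }
}

fn main() {
    // Toggle with: cargo run --features ai-enhanced
    println!("{}", orchestrator::describe());
}
```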

## Quick Start

### 1. Prepare configuration

```bash
cp config/surrealdb_example.toml ~/.codegraph/config.toml
export CODEGRAPH_SURREALDB_URL=ws://localhost:3004
export CODEGRAPH_SURREALDB_NAMESPACE=ouroboros
export CODEGRAPH_SURREALDB_DATABASE=codegraph
export CODEGRAPH_SURREALDB_USERNAME=root
export CODEGRAPH_SURREALDB_PASSWORD=root
export CODEGRAPH_LLM_PROVIDER=ollama              # or openai / anthropic / xai
export MCP_CODE_AGENT_MAX_OUTPUT_TOKENS=4096      # optional override
```

### 2. Run the STDIO MCP server

```bash
cargo run -p codegraph-mcp \
  --features "ai-enhanced,embeddings-ollama,autoagents-experimental" \
  -- start stdio
```

Hook this process up to your MCP host (Claude Desktop, custom AutoAgents runner, etc.).

### 3. (Optional) Run the HTTP transport

```bash
cargo run -p codegraph-mcp \
  --features "ai-enhanced,embeddings-ollama,server-http,autoagents-experimental" \
  -- start http --host 127.0.0.1 --port 3003
```

Point the Python harness at it:

```bash
CODEGRAPH_HTTP_HOST=127.0.0.1 CODEGRAPH_HTTP_PORT=3003 \
python test_http_mcp.py
```

Or use the streaming tester (`test_agentic_tools_http.py`), which logs each request under `test_output_http/`.

## Invoking Tools Manually

All MCP requests follow the standard JSON-RPC envelope. Example HTTP payload for `agentic_dependency_analysis`:

```jsonc
{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "agentic_dependency_analysis",
  "params": {
    "query": "Analyze the dependency chain for the AgenticOrchestrator. What does it depend on?",
    "workspaceId": "default"
  }
}
```

The response contains:

- `reasoning`: step-by-step chain of thought
- `tool_call`: the Surreal function invocation (e.g., `get_transitive_dependencies`)
- `analysis`: markdown/JSON summary
- `components` + `file_locations`: structured references with file paths + line numbers
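
If you want to script this outside the Python harnesses, a minimal Rust client sketch follows. It assumes the `reqwest` crate (with the `blocking` and `json` features) plus `serde_json`, and that the fields above arrive under the standard JSON-RPC `result` key:

```rust
use serde_json::{json, Value};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The JSON-RPC envelope from the example above.
    let payload = json!({
        "jsonrpc": "2.0",
        "id": "1",
        "method": "agentic_dependency_analysis",
        "params": {
            "query": "Analyze the dependency chain for the AgenticOrchestrator. What does it depend on?",
            "workspaceId": "default"
        }
    });

    // Assumes the HTTP transport from Quick Start step 3 is listening.
    let resp: Value = reqwest::blocking::Client::new()
        .post("http://127.0.0.1:3003/mcp")
        .json(&payload)
        .send()?
        .json()?;

    // Fields described above; absent keys print as null.
    println!("reasoning: {}", resp["result"]["reasoning"]);
    println!("analysis:  {}", resp["result"]["analysis"]);
    Ok(())
}
```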

## Configuration + Environment Variables

| Variable | Purpose |
|----------------------------------------|------------------------------------------------------------------------|
| `CODEGRAPH_SURREALDB_URL` (+ namespace/db/user/password) | Points the server at your SurrealDB instance. |
| `CODEGRAPH_LLM_PROVIDER` | `ollama`, `openai`, `anthropic`, `xai`, etc. |
| `CODEGRAPH_EMBEDDING_PROVIDER` | Chooses embedding backend (see Cargo feature flags above). |
| `MCP_CODE_AGENT_MAX_OUTPUT_TOKENS` | Hard override for AutoAgents’ final response length. |
| `CODEGRAPH_HTTP_HOST` / `CODEGRAPH_HTTP_PORT` | Used by the HTTP server & test harnesses. |

For advanced tuning (batch sizes, prompt overrides, tier thresholds) see `crates/codegraph-mcp/src/context_aware_limits.rs` and the prompt files under `src/prompts/`.

## Testing

- `python test_agentic_tools.py` – exercises all tools over STDIO.
- `python test_agentic_tools_http.py` – the same over the HTTP transport (writes logs to `test_output_http/`).
- `python test_http_mcp.py` – minimal MCP smoke test for custom HTTP clients.
- `cargo test -p codegraph-mcp --features "ai-enhanced"` – Rust-level tests for orchestrator pieces.

Every successful run drops JSON logs into `test_output/` (STDIO) or `test_output_http/` (HTTP) so you can diff reasoning traces between commits.

## Observability Tips

- Set `--debug` when starting the server to tee AutoAgents traces into `~/.codegraph/logs`.
- Each tool emits structured output; archive it if you need regression baselines.
- The Python harness prints timing data (e.g., 26.6s for `agentic_code_search`) so you can monitor throughput after embedding/provider changes.

## Need More?

- See `docs/faiss_rocksdb_deprecation_plan.md` for the Surreal-only roadmap.
- `TESTING.md` documents the recommended feature flags for Ollama / OpenAI / Anthropic setups.
- The `codegraph-api` crate exposes the same Surreal graph via GraphQL/REST if you need to build custom dashboards.
