coregraph builds an in-memory code symbol graph from your repo (tree-sitter for
symbol extraction, stack-graphs for cross-file name resolution) and lets you query
callers, callees, impact, dead code, and cross-language inconsistencies. Every edge
carries a confidence score so you (or an LLM) know how much to trust it.
This page is the full command reference. For the model behind the numbers see confidence.md and graph-model.md.

    # Install the CLI (puts `coregraph` on your PATH)
    npm install -g @coregraph/cli

    # 1. Index the repo (builds the symbol graph)
    coregraph index --stats

    # 2. Who calls this function?
    coregraph query compute_impact --direction incoming --edge-kind calls --hop-limit 1

    # 3. What breaks if I change it?
    coregraph impact build_router --risk

    # 4. What looks dead?
    coregraph orphans --exclude-tests

`index --stats` prints something like:

    coregraph: skipped 1 minified/generated file(s) (e.g. ./vscode-extension/media/cytoscape.min.js)
    Index complete — 281 files, 3396 symbols, 21342 edges (2337ms)

The first query auto-starts a background daemon and reuses the cached graph for every later command, so subsequent queries are fast. You never have to start the daemon by hand (see Daemon auto-start).

| Command | What it does |
|---|---|
| `index` | Index source files and build the symbol graph |
| `query` | Query a symbol's neighbors (callers, callees, references) |
| `inspect` | Show the symbol at `FILE:LINE` with surrounding source |
| `stats` | Graph statistics (`--breakdown` for histograms) |
| `orphans` | List orphan symbols (dead-code candidates) |
| `impact` | Impact analysis for a symbol (`--risk` adds scoring) |
| `diff` | Impact of a git diff: which symbols a change reaches |
| `review` | Auto-comment a GitHub PR with the diff impact summary |
| `inconsistencies` | Detect cross-enum / api-path / config-key / doc-drift issues |
| `export` | Export the graph as `dot`, `cypher`, or `json-graph` |
| `snapshot` | `save` / `load` a binary snapshot |
| `config` | `init` / `show` / `unset` / `path` configuration |
| `server` | Daemon mgmt: `start` `stop` `status` `restart` `install` `uninstall` |
| `lsp` | LSP stdio bridge (IDE code intelligence) |
| `mcp` | MCP stdio bridge (LLM agent tools) |
| `watch` | Watch files and rebuild the graph |
| `batch` | Run multiple queries from a JSON file |
| `plugin` | Manage plugin hooks (`list` / `run`) |

These apply to every subcommand.

| Option | Default | Meaning |
|---|---|---|
| `-C, --project <PATH>` | `.` | Project root directory |
| `-c, --config <PATH>` | platform config dir + `coregraph/config.toml` | Config file path |
| `--output-format <FMT>` | `human` | `human` \| `llm` \| `json` |
| `--color <WHEN>` | `auto` | `auto` \| `always` \| `never` |
| `--token-budget <N>` | `8000` | Max tokens for LLM-shaped output |
| `--hop-limit <N>` | `3` | Max graph traversal depth |
| `--min-confidence <F>` | `0.7` | Minimum edge confidence filter, 0.0–1.0 |
| `--include-stale` | off | Include stale nodes/edges in results |
| `--lang <LANG>` | — | Filter by language; repeatable (`java`, `rust`, …) |
| `-v, --verbose` | — | Verbose logging |
| `-q, --quiet` | — | Errors only |
| `--log-level <LEVEL>` | `info` | `trace` \| `debug` \| `info` \| `warn` \| `error` |
| `--no-auto-start` | off | Don't spawn the daemon; build in-process instead |

Confidence is kind-base × origin-base, so the cutoff depends on the edge kind as
well as its origin. The filter is a strict less-than: an edge is dropped only when
its confidence is below the cutoff. At the default 0.70, SyntaxMatched imports
(0.7225) and calls (0.765) survive, and all PatternMatched guesses (0.60 baseline)
are dropped. Not every non-pattern edge passes, though: SyntaxMatched
references/generic-param (0.68) and string-match (0.595), plus low-base kinds like
DependsOn/Configures, also fall below the cutoff. At 0.90 only kind-base ≈1.0
edges (Resolves/Contains/BelongsTo/Documents) remain, so even CompilerDerived
calls (0.891) are dropped.

| Value | Keeps |
|---|---|
| `0.0` | The full unfiltered graph |
| `0.70` (default) | SyntaxMatched imports/calls and above; drops pattern guesses and low-base kinds |
| `0.85` | Keeps Resolves and higher kind-base edges; drops most SyntaxMatched |
| `0.90` | Keeps only top kind-base edges (e.g. Resolves/Contains) |

See confidence.md for how these numbers are computed.
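
The cutoff arithmetic can be reproduced in a few lines of Python. This is only an illustrative sketch: the origin bases are the approximate values quoted on this page, and the kind bases are back-computed from the quoted products (for example 0.765 / 0.85 = 0.9 for calls), so both tables are assumptions rather than coregraph's actual constants.

```python
# Illustrative only: these bases are inferred from the numbers quoted on this
# page, not read from coregraph itself.
ORIGIN_BASE = {
    "SyntaxMatched": 0.85,
    "NameResolved": 0.95,
    "CompilerDerived": 0.99,
}
KIND_BASE = {  # back-computed from the quoted products (assumption)
    "resolves": 1.0,
    "calls": 0.9,
    "imports": 0.85,
    "references": 0.8,
    "string-match": 0.7,
}


def confidence(kind: str, origin: str) -> float:
    """Edge confidence as kind-base x origin-base."""
    return KIND_BASE[kind] * ORIGIN_BASE[origin]


def survives(kind: str, origin: str, cutoff: float) -> bool:
    """Strict less-than filter: an edge is dropped only if it is below the cutoff."""
    return not (confidence(kind, origin) < cutoff)


if __name__ == "__main__":
    pairs = [
        ("imports", "SyntaxMatched"),
        ("calls", "SyntaxMatched"),
        ("references", "SyntaxMatched"),
        ("calls", "CompilerDerived"),
        ("resolves", "NameResolved"),
    ]
    for kind, origin in pairs:
        kept_at = [c for c in (0.70, 0.85, 0.90) if survives(kind, origin, c)]
        print(f"{kind:<12} {origin:<16} {confidence(kind, origin):.4f} kept at {kept_at}")
```

Under these assumed bases, SyntaxMatched references (0.68) fall out at the default cutoff while SyntaxMatched calls (0.765) stay, which matches the table above.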
Presets are shorthand for common flag combinations — mostly less typing, though
--fast and --full also adjust --min-confidence (see below).

| Preset | Expands to | Use for |
|---|---|---|
| `--fast` | `--min-confidence 0.9 --hop-limit 1 --token-budget 2000` | Quick one-hop lookups |
| `--standard` | the defaults above (no-op) | Everyday use |
| `--full` | `--min-confidence 0.0 --hop-limit 5 --include-stale --token-budget 16000` | Deep analysis, refactoring |

--fast tightens confidence to 0.9 so only top kind-base edges survive, and
--full drops confidence to 0.0 so even PatternMatched edges are admitted.
Preset precedence: a preset only fills fields still at their clap defaults, so an
explicit flag normally wins (e.g. `--fast --hop-limit 10` keeps 10). The two
exceptions both concern `--min-confidence`: `--fast` force-overrides it to 0.9
even when you pass 0.7 or 0.85, and `--full` force-overrides it to 0.0 when it is
at 0.70 or 0.85.
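
The precedence rule can be modelled as follows. This is a sketch of one reading of the paragraph above, not coregraph's implementation: the dictionaries and the `apply_preset` helper are hypothetical, and the assumption is that an explicit `--min-confidence` other than 0.7 or 0.85 is left alone.

```python
# Hypothetical model of preset resolution; names and structure are assumptions.
DEFAULTS = {
    "min_confidence": 0.7,
    "hop_limit": 3,
    "token_budget": 8000,
    "include_stale": False,
}
PRESETS = {
    "fast": {"min_confidence": 0.9, "hop_limit": 1, "token_budget": 2000},
    "standard": {},
    "full": {
        "min_confidence": 0.0,
        "hop_limit": 5,
        "token_budget": 16000,
        "include_stale": True,
    },
}
# min_confidence values that --fast/--full overwrite even when passed explicitly.
FORCED_MIN_CONFIDENCE = (0.7, 0.85)


def apply_preset(explicit: dict, preset: str) -> dict:
    """Overlay a preset on explicit flags, filling only fields still at default."""
    resolved = {**DEFAULTS, **explicit}
    for field, value in PRESETS[preset].items():
        at_default = resolved[field] == DEFAULTS[field]
        forced = field == "min_confidence" and resolved[field] in FORCED_MIN_CONFIDENCE
        if at_default or forced:
            resolved[field] = value
    return resolved


if __name__ == "__main__":
    print(apply_preset({"hop_limit": 10}, "fast"))        # hop_limit stays 10
    print(apply_preset({"min_confidence": 0.85}, "fast"))  # forced to 0.9
```

One consequence of "fills fields still at their defaults" in this model: a flag passed explicitly with its default value cannot be told apart from an unset flag, so the preset overwrites it.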

    coregraph query UserController --fast    # 1 hop, high confidence only
    coregraph impact CardService --full      # deep analysis, stale data included

| Format | For | Notes |
|---|---|---|
| `human` (default) | Terminal reading | Colors, box-drawing, an interactive pager |
| `llm` | Feeding an LLM | Token-efficient Markdown; uncertainty tagged inline |
| `json` | Scripts / pipelines | Stable schema; see JSON shape |


    coregraph index [OPTIONS]

Indexes the project and builds the graph. The CLI index command performs a full
rebuild on every run — it re-extracts every file and does not load or diff a
snapshot. (Incremental invalidate+rebuild exists only in the daemon's file-watch
path; see watch.) No external toolchain install is required.

| Flag | Meaning |
|---|---|
| `--full` | Accepted but a no-op: the CLI always reindexes everything (it is only echoed into the JSON output) |
| `--dry-run` | Detect changes only; don't rebuild the graph. The baseline is `git diff --name-only HEAD` (uncommitted changes), so committed-but-unindexed changes report 0 |
| `--stats` | Print file/symbol/edge counts and elapsed time (also printed under `--verbose`) |
| `--snapshot <PATH>` | Also write the resulting graph to this snapshot file (write-only; `index` never reads it back) |

How the graph is built:

- tree-sitter extracts symbol nodes for every supported language. Syntactic matches are recorded as `SyntaxMatched` (confidence ~0.85).
- stack-graphs resolves cross-file names into `Resolves` edges (`NameResolved`, confidence ~0.95). This now covers all seven code languages (see Languages).
- Structurally certain edges the extractor observes directly (file→symbol `Contains`, symbol→module `BelongsTo`) are recorded as `CompilerDerived` (confidence 0.99).


    coregraph query <SYMBOL> [OPTIONS]

Looks up a symbol by exact name or substring and shows its graph neighborhood.

| Flag | Default | Meaning |
|---|---|---|
| `--kind <KIND>` | — | Filter the center symbol's kind (see kinds) |
| `--direction <DIR>` | `both` | `incoming` \| `outgoing` \| `both` |
| `--edge-kind <KIND>` | — | Filter by edge kind; repeatable (see edge kinds) |
| `--depth <N>` | — | Traversal depth; overrides the global `--hop-limit` |
| `--aggregate` | off | Union the neighborhood across every same-name definition (recall over precision) |
| `--page-size <N>` | `50` | Page size (combined hard cap with the token budget) |
| `--cursor <TOKEN>` | — | Opaque pagination cursor from a previous response |
| `--expand <NODE_ID>` | — | Drill into one node id for detailed context |
| `--no-heal` | off | Skip on-demand healing before the query |

The clean way to list callers is to narrow direction and edge kind:

    coregraph query compute_impact --direction incoming --edge-kind calls --hop-limit 1

    ── query: compute_impact ──────────────────────────────────
    ✓ compute_impact [crates/query/src/impact.rs:27]
    kind: Function | package: query (cargo)
    Incoming (14):
    ├── calls ← run [Function] @ crates/cli/src/commands/diff.rs [0.85] ✓
    ├── calls ← run [Function] @ crates/cli/src/commands/impact.rs [0.85] ✓
    ├── calls ← cached_impact [Function] @ crates/cli/src/dispatch.rs [0.85] ✓
    ├── calls ← api_impact [Function] @ crates/server/src/handlers.rs [0.85] ✓
    └── ... (14 total)
    ✓ trust: all paths verified
    ── page 1/1 | 14 edges total | budget: 506/5600 tokens ──
    [n]ext page | [e]xpand <id> | [f]ilter --edge-kind | [q]uit

The last two lines are the interactive pager (human format only). When results
span more than one page you can press n for the next page, e <id> to expand a
node, f to filter by edge kind, or q to quit. The budget readout (506/5600)
shows tokens used against the effective budget — the effective budget is the
advertised --token-budget scaled by a 0.7 safety margin, since token counts are
estimated from byte counts rather than a real tokenizer.
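
As a worked example of the budget readout, the sketch below scales the advertised budget by the 0.7 margin and estimates tokens from UTF-8 byte length. The `bytes_per_token` ratio of 4 is a hypothetical placeholder; this page does not state the ratio coregraph uses.

```python
SAFETY_MARGIN_PERCENT = 70  # the 0.7 safety margin described above


def effective_budget(token_budget: int = 8000) -> int:
    """Advertised budget scaled by the safety margin (integer arithmetic)."""
    return token_budget * SAFETY_MARGIN_PERCENT // 100


def estimate_tokens(text: str, bytes_per_token: int = 4) -> int:
    """Rough token estimate from byte count; bytes_per_token is an assumption."""
    byte_count = len(text.encode("utf-8"))
    return -(-byte_count // bytes_per_token)  # ceiling division


def fits(text: str, token_budget: int = 8000) -> bool:
    """True if the estimated token count is within the effective budget."""
    return estimate_tokens(text) <= effective_budget(token_budget)


if __name__ == "__main__":
    print(effective_budget(8000))  # 5600, the denominator in the readout above
    print(effective_budget(2000))  # 1400, the effective budget under --fast
```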

`query` is the command that takes `--depth`. The depth flag on `impact` is named `--max-depth`; they are not interchangeable.


    coregraph inspect <FILE:LINE> [OPTIONS]

Shows the symbol(s) covering a source location, with surrounding code.

| Flag | Default | Meaning |
|---|---|---|
| `--context-lines <N>` | `5` | Surrounding source lines to include |


    coregraph inspect crates/query/src/impact.rs:33

    ── inspect: crates/query/src/impact.rs:33 ──
    compute_impact [Function] bytes 1128..3581
    doc::compute_impact [DocComment] bytes 531..1128
    31 /// this repo's graph via shared callees), not an impact measure. What X itself
    32 /// depends on (outgoing) does not break when X changes.
    → 33 pub fn compute_impact(graph: &SymbolGraph, seed_id: SymbolId, max_depth: usize) -> ImpactResult {
    34 let mut visited: HashSet<SymbolId> = HashSet::new();


    coregraph stats [OPTIONS]

| Flag | Default | Meaning |
|---|---|---|
| `--breakdown` | off | Symbol/edge kind histograms, per-crate counts, top in-degree symbols, heaviest files |
| `--top <N>` | `20` | Top-N cut-off for breakdown lists |

    coregraph stats

    symbols: 3396
    edges: 21357

`stats --breakdown --top 8` adds histograms:

    Indexed 281 files
    symbols: 3396
    edges: 21342
    ## Symbol kinds
    Function 1191
    DocComment 593
    Method 459
    ...
    ## Edge kinds
    Resolves 7669
    Calls 4365
    Contains 2262
    ...
    ## Analysis origins
    SyntaxMatched 9237
    NameResolved 6699
    CompilerDerived 4524
    PatternMatched 861
    ConventionInferred 21


    coregraph orphans [OPTIONS]

Lists symbols with no incoming or outgoing edges: dead-code candidates.

| Flag | Default | Meaning |
|---|---|---|
| `--public-only[=true\|false]` | `true` | Report only public symbols; pass `--public-only=false` to also include private ones (higher-confidence dead code) |
| `--exclude-tests` | off | Exclude symbols from test files/directories |

    coregraph orphans --exclude-tests

    Orphan symbols (12): 7 likely dead, 5 library API surface, 0 test code
    as_kebab [Method] — crates/cli/src/commands/query.rs
    strip_api_path_prefix [Function] — crates/extractor/src/string_literal_extractor.rs [library API]
    unregister [Method] — crates/graph/src/hooks.rs [library API]
    outputChannel [Constant] — vscode-extension/src/extension.ts

A [library API] tag marks a public orphan whose file belongs to a package the
LibraryClassifier classifies as a library (from its manifest — Cargo/npm/etc.):
it has no internal callers but may be called from outside the package, so it is
lower-confidence "dead." Public orphans in application packages, or in packages the
classifier can't decide, are reported untagged as likely dead; when no manifest
signal exists at all nothing is tagged and the human output instead appends a note
that public orphans may be external API if the project is a library. Override the
classification project-wide with [project] kind = "library"|"application" in
.coregraph/config.toml (see
manifest-parser.md).

    coregraph impact <SYMBOL> [OPTIONS]

Computes which symbols are reachable from (affected by) a change to `<SYMBOL>`.

| Flag | Default | Meaning |
|---|---|---|
| `--max-depth <N>` | `5` | Maximum impact propagation depth. Applies only with `--transitive`; otherwise it is ignored and the effective depth is the global `--hop-limit` (default 3) |
| `--transitive` | off | Compute the transitive closure (runs BFS at `--max-depth`) |
| `--risk` | off | Add confidence-weighted risk scoring |

The depth flag here is `--max-depth`, not `--depth`. Note that a default `impact` run (without `--transitive`) ignores `--max-depth` and uses the global `--hop-limit`.

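
The depth rule is easy to get wrong, so here it is as a tiny decision function. The helper name is hypothetical and the defaults are the ones documented on this page; it only restates the rule and is not taken from coregraph's source.

```python
def effective_impact_depth(transitive: bool, max_depth: int = 5, hop_limit: int = 3) -> int:
    """Depth an `impact` run uses: --max-depth only counts with --transitive."""
    return max_depth if transitive else hop_limit


if __name__ == "__main__":
    # Plain `coregraph impact X --max-depth 9`: --max-depth is ignored.
    print(effective_impact_depth(transitive=False, max_depth=9))
    # `coregraph impact X --transitive --max-depth 9`: --max-depth applies.
    print(effective_impact_depth(transitive=True, max_depth=9))
```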
--risk adds a blast-radius score, a confidence-weighted impact total, and the set
of affected tests:

    coregraph impact build_router --risk

    Impact of 'build_router': 1251 reachable symbols, 1251 edges, depth 3
    Risk Score: 0.96 (Critical)
    Blast Radius: Critical (16 modules, 910 callers)
    Confidence-Weighted Impact: 653.500
    Affected tests: 334
    test_app (distance 2, path_confidence 0.90) — ./crates/server/src/handlers.rs
    create_app_returns_router (distance 2, path_confidence 0.90) — ./crates/server/src/lib.rs
    ... (more affected tests)
    post [Method] — ./crates/graph/src/hooks.rs
    create_app [Function] — ./crates/server/src/lib.rs
    ... (reachable symbols listed)

The risk score blends visibility, direct-caller count, module spread, and impact kind into a 0–1 value classified Low / Medium / High / Critical. See graph-model.md for the exact weights and thresholds.

    coregraph diff <BASE> [OPTIONS]

Maps a git diff onto the graph: which symbols the change touches, and which symbols those reach.

| Flag | Default | Meaning |
|---|---|---|
| `--to <REF>` | `HEAD` | Compare this ref instead of the working tree |
| `--max-depth <N>` | global `--hop-limit` | Impact propagation depth from each touched symbol |
| `--exclude-tests` | off | Skip symbols defined under test directories |

    coregraph diff HEAD~1 --exclude-tests

    coregraph: skipped 1 minified/generated file(s) (e.g. vscode-extension/media/cytoscape.min.js)
    Diff HEAD~1..HEAD: 52 file(s), 974 touched symbol(s), 1659 reachable (depth 3)
    • reindex_latency.rs [File] @ crates/cli/examples/reindex_latency.rs
    • main [Function] @ crates/cli/examples/reindex_latency.rs
    … and 954 more


    coregraph review --pr <N> [OPTIONS]

Posts (or prints) a GitHub PR comment summarizing the diff impact. `--pr` is
required; the PR is looked up in the current repo via `gh pr view`, so the `gh`
CLI must be authenticated.

| Flag | Default | Meaning |
|---|---|---|
| `--pr <N>` | required | PR number in the current repo |
| `--dry-run` | off | Print the comment body to stdout instead of posting |
| `--max-depth <N>` | `3` | Impact propagation depth from each touched symbol |
| `--exclude-tests` | off | Skip symbols defined under test directories |

    coregraph review --pr 42 --dry-run

    coregraph inconsistencies [OPTIONS]

Detects cross-language drift the compiler can't see.

| Flag | Meaning |
|---|---|
| `--category <CAT>` | Restrict to one category |

Categories:

| Category | Detects |
|---|---|
| `enum-mismatch` | The same value appearing under different enums/roles |
| `api-path` | The same API path declared in places that disagree |
| `config-key` | Config keys that don't line up across files |
| `doc-drift` | A `@param`/`:param` naming a parameter the signature no longer has (JS/TS/Java/Python) |

    coregraph inconsistencies

    Inconsistencies (63):
    [enum-mismatch] 'admin' appears in:
    - Permission.ADMIN (./tests/e2e/golden/04-inconsistencies/src/permissions.ts)
    - Role.ADMIN (./tests/e2e/golden/04-inconsistencies/src/roles.ts)
    [api-path] /a.rs vs /b.rs
    ...

On a repo that contains test fixtures (like this one) the raw output includes
fixture noise. Narrow with --category to focus.

    coregraph export [OPTIONS]

| Flag | Default | Meaning |
|---|---|---|
| `--format <FMT>` | `dot` | `dot` \| `cypher` \| `json-graph` |
| `--subgraph <SYMBOL>` | — | Restrict to a subgraph centered on this symbol (+`--hop-limit` hops) |

    coregraph export --format dot --subgraph build_router > graph.dot

    coregraph snapshot save --out <PATH>
    coregraph snapshot load <FILE>

`save` indexes the project and writes a binary snapshot; `load` reads one back and
prints its summary. Snapshots are bincode blobs (schema v6).

| Subcommand | Flag/arg | Meaning |
|---|---|---|
| `save` | `-o, --out <PATH>` (required) | Output snapshot path |
| `load` | `<FILE>` (positional) | Snapshot file to load |


    coregraph config <init|show|unset|path>

| Subcommand | Meaning |
|---|---|
| `init` | Create a default config file at the configured path |
| `show` | Print the effective config (on-disk values + defaults) |
| `unset <KEY>` | Remove a key |
| `path` | Print the config file path |

A legacy positional form (coregraph config <KEY> <VALUE>) still works for writing
a single key.

    coregraph config show

    Global config: ~/Library/Application Support/coregraph/config.toml
    Project config: ./.coregraph/config.toml
    limits.token_budget = 8000 [project]
    # Default token budget for LLM output
    limits.hop_limit = 3 [project]
    # Default graph traversal depth
    limits.min_confidence = 0.7 [project]
    # Default minimum edge confidence (matches clap default)
    server.max_loaded_projects = 5 [project]
    # Maximum projects held in the daemon cache (LRU eviction above this)
    server.graceful_shutdown_sec = 30 [project]
    # Seconds the daemon waits for in-flight queries before hard-exit on SIGTERM

Per-project config lives at <project>/.coregraph/config.toml (created on first
index); global config under the platform config directory (~/Library/Application Support/coregraph/config.toml on macOS, $XDG_CONFIG_HOME/coregraph/config.toml
or ~/.config/coregraph/config.toml on Linux). Config files use
[limits], [server], and [index] sections — [index] exclude = [...] accepts
gitignore-style patterns.

    coregraph server <start|stop|status|restart|install|uninstall> [OPTIONS]

Manage the background daemon directly. You normally don't need this, because queries auto-start the daemon, but it's here for explicit control and HTTP exposure.

| Subcommand | Meaning |
|---|---|
| `start` | Start the daemon (detached by default) |
| `stop` | Stop the running daemon (SIGTERM + drain) |
| `status` | Show daemon status (add `--json` for machine output) |
| `restart` | Stop + start in one command |
| `install` | Register the daemon as an OS service (launchd on macOS, systemd on Linux) |
| `uninstall` | Remove the OS service registration |

`server start` options:

| Flag | Default | Meaning |
|---|---|---|
| `--http [<ADDR>]` | off | Also expose an HTTP API; bare `--http` binds `127.0.0.1:27787` |
| `--allow-external` | off | Allow binding to non-localhost interfaces |
| `--foreground` | off | Run in the foreground (the process is the daemon itself) |
| `--auto-stop-minutes <N>` | `30` | Self-terminate after N idle minutes; `0` disables. Only honored with `--foreground`: on the detached path (`start`/`restart`) the flag is not forwarded to the spawned daemon, so any non-default value (including `0`) is silently discarded and the daemon uses the default 30 |

    coregraph server start --http          # bind 127.0.0.1:27787
    coregraph server start --http 127.0.0.1:9120
    coregraph server status --json

    coregraph watch [OPTIONS]

Watches the project and rebuilds the graph on change.

| Flag | Meaning |
|---|---|
| `--diff` | Show the graph diff (before/after) on each rebuild |
| `--no-incremental` | Force a full rebuild on each change instead of incremental invalidate+heal |


    coregraph batch <QUERIES_FILE> [OPTIONS]

Runs many symbol queries from a single JSON file (an array of names) in one
in-process invocation: it builds the graph locally via build_graph and does not
contact the daemon, so the daemon's cached graph is not reused. (Daemon-cached
batched queries exist only as the HTTP server's POST /batch endpoint.)

For example, given this `queries.json`:

    ["compute_impact", "build_router", "query_symbol"]

run:

    coregraph batch queries.json

`batch` always prints pretty JSON; it ignores `--output-format`.

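
When the symbol list comes from another tool, the queries file can be generated rather than written by hand. The sketch below is one way to do that from Python; the helper names are hypothetical, and `run_batch` makes no assumption about the result schema beyond it being JSON, since this page does not document that schema.

```python
import json
import subprocess
from pathlib import Path


def write_queries_file(names, path) -> Path:
    """Write symbol names as the JSON array that `coregraph batch` reads."""
    path = Path(path)
    path.write_text(json.dumps(list(names)) + "\n", encoding="utf-8")
    return path


def batch_argv(queries_file) -> list:
    """Command line for a batch run, mirroring the invocation shown above."""
    return ["coregraph", "batch", str(queries_file)]


def run_batch(queries_file):
    """Run the batch and parse stdout; requires coregraph on PATH."""
    done = subprocess.run(
        batch_argv(queries_file), capture_output=True, text=True, check=True
    )
    return json.loads(done.stdout)


if __name__ == "__main__":
    queries = write_queries_file(["compute_impact", "build_router"], "queries.json")
    print(" ".join(batch_argv(queries)))
```

Because `batch` builds the graph in-process on every invocation, it is usually worth putting all names into one file instead of looping over single-name runs.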

    coregraph plugin <list|run>

| Subcommand | Meaning |
|---|---|
| `list` | List all registered plugin hooks |
| `run` | Dry-run the default registry against a directory, firing all pre/post hooks |


    coregraph lsp    # LSP stdio bridge; your IDE launches this
    coregraph mcp    # MCP stdio bridge; your LLM client launches this

Both are lightweight stdio bridges: they connect to the daemon over IPC (starting it if needed) and translate protocol. The daemon holds the graph; the bridge just relays. See Integrations.
Thin-client commands (query, impact, orphans, inconsistencies, stats,
diff, …) connect to a background daemon over IPC — a Unix domain socket on
macOS/Linux, a named pipe (\\.\pipe\coregraph-<user>) on Windows. On the first
command:

    coregraph query build_router
    │
    ├─ IPC socket present? ── yes ─→ connect → send query → return result
    │
    └─ no → auto-start enabled?
       ├─ yes → spawn the daemon (detached), wait for the socket, then query
       └─ no → fall back to an in-process build_graph for this one command

The daemon caches the graph so later queries skip the rebuild. It evicts projects
beyond server.max_loaded_projects (default 5, LRU) and self-terminates after
--auto-stop-minutes of full idleness (default 30).
To suppress auto-start:

- `--no-auto-start`: one command, build in-process instead
- `COREGRAPH_NO_AUTO_START=1`: for the whole session

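
The connection flow and the two opt-outs can be summarized as a small decision function. This is a sketch of the documented behavior, not the client's code: the function names and the returned labels are hypothetical, and it assumes only the literal value `1` of the environment variable disables auto-start.

```python
import os


def auto_start_enabled(no_auto_start_flag: bool, env=None) -> bool:
    """Auto-start is on unless the flag or the environment variable disables it."""
    env = os.environ if env is None else env
    # Assumption: only the literal value "1" counts as "disabled".
    return not no_auto_start_flag and env.get("COREGRAPH_NO_AUTO_START") != "1"


def plan_connection(socket_present: bool, no_auto_start_flag: bool = False, env=None) -> str:
    """Return which branch of the flow above a thin-client command takes."""
    if socket_present:
        return "connect"
    if auto_start_enabled(no_auto_start_flag, env):
        return "spawn-daemon-then-connect"
    return "in-process-build"


if __name__ == "__main__":
    print(plan_connection(socket_present=False, env={}))
    print(plan_connection(socket_present=False, env={"COREGRAPH_NO_AUTO_START": "1"}))
```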
--kind accepts: function, method, class, struct, interface, trait,
enum, enum-variant, constant, variable, field, type-alias, module,
namespace, config-key, string-literal, package, external-package.
--edge-kind accepts: resolves, calls, implements, extends, overrides,
references, imports, string-match, configures, depends-on.
Symbol extraction (tree-sitter): Rust, Java, TypeScript, JavaScript, Go,
Python. Kotlin symbol extraction is regex-based (tree-sitter-kotlin-ng is used
only in the stack-graphs resolution backend, via the hand-authored kotlin.tsg).
Config files (YAML / TOML / JSON / .properties → ConfigKey nodes) are parsed
with serde/toml parsers, and Markdown (the documentation layer) with a regex line
scanner — neither uses tree-sitter.
Cross-file name resolution (stack-graphs): all seven code languages.
- Upstream stack-graphs rules: Java, TypeScript, JavaScript, Python
- Hand-authored `.tsg` rules (`crates/stack/rules/{go,rust,kotlin}.tsg`): Go, Rust, Kotlin

Resolution falls back to tree-sitter syntactic matching only when a language has no rules at all, or when resolution produces no binding.
--output-format json produces a stable shape. A trimmed query example:

    {
      "query": "compute_impact",
      "center": {
        "id": 1296, "name": "compute_impact", "kind": "Function",
        "file": "crates/query/src/impact.rs", "span_start": 926, "span_end": 2903,
        "context": { "package": "query (cargo)", "generated": false, "generator": null }
      },
      "edges": [
        {
          "direction": "incoming", "kind": "calls", "depth": 1,
          "other_id": 40, "other_name": "run",
          "confidence": 0.8549999594688416,
          "trust": "NameResolved", "origin": "NameResolved",
          "trust_model": "SourceEvidenced",
          "stale_evidence_count": 0, "current_confidence": 0.95
        }
      ]
    }

Each edge carries `confidence` (computed at index time), `origin`/`trust`
(how the edge was derived), `trust_model`, `stale_evidence_count`, and
`current_confidence` (after decay). See confidence.md.
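
A script consuming this shape might pull out the callers like so. The sample is a cut-down copy of the response above, and `callers` is a hypothetical helper; only the field names shown in the example are relied on.

```python
import json

# Cut-down copy of the trimmed response shown above.
SAMPLE = """
{
  "query": "compute_impact",
  "center": {"id": 1296, "name": "compute_impact", "kind": "Function"},
  "edges": [
    {
      "direction": "incoming", "kind": "calls", "depth": 1,
      "other_id": 40, "other_name": "run",
      "confidence": 0.8549999594688416,
      "origin": "NameResolved", "current_confidence": 0.95
    }
  ]
}
"""


def callers(response: dict, min_confidence: float = 0.7) -> list:
    """Names and confidences of incoming `calls` edges at or above the cutoff."""
    return [
        (edge["other_name"], edge["confidence"])
        for edge in response["edges"]
        if edge["direction"] == "incoming"
        and edge["kind"] == "calls"
        and not (edge["confidence"] < min_confidence)
    ]


if __name__ == "__main__":
    for name, score in callers(json.loads(SAMPLE)):
        print(f"{name}\t{score:.3f}")
```

In a pipeline the same function would be fed `json.loads` of the stdout of a `--output-format json` query instead of the embedded sample.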
coregraph mcp is a JSON-RPC stdio server exposing initialize, tools/list, and
tools/call. It exposes exactly five tools (plain names, no prefix):

| Tool | Input | Returns |
|---|---|---|
| `query` | `{ "name": string }` | Symbols matching the name |
| `impact` | `{ "name": string, "depth": integer = 5 }` | Transitive impact for a symbol name |
| `orphans` | `{}` | Symbols with no incoming or outgoing edges |
| `inconsistencies` | `{}` | Cross-enum value mismatches |
| `stats` | `{}` | Graph summary: nodes, edges, file count |

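
For reference, a `tools/call` request for one of these tools can be built as below. The envelope follows the general MCP convention (`params.name` plus `params.arguments`) on top of JSON-RPC 2.0; the helper name is hypothetical, and how messages are framed on stdio is not covered by this page, so the sketch stops at producing the JSON text.

```python
import json


def tools_call_request(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # `impact` takes a symbol name and an optional depth (default 5 per the table).
    print(tools_call_request("impact", {"name": "build_router", "depth": 3}))
    # Tools with an empty input schema take an empty arguments object.
    print(tools_call_request("stats", {}, request_id=2))
```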
Register it with a Claude Code .mcp.json (or claude_desktop_config.json):

    { "mcpServers": { "coregraph": { "command": "coregraph", "args": ["mcp"] } } }

`coregraph lsp` is a stdio LSP bridge. It advertises:

| Capability | Request |
|---|---|
| `definitionProvider` | `textDocument/definition` |
| `referencesProvider` | `textDocument/references` |
| `workspaceSymbolProvider` | `workspace/symbol` |

Start the HTTP API with `coregraph server start --http [ADDR]`. The default bind is
`127.0.0.1:27787` (off the common 8080/8000/3000 band). Use `--allow-external` to
bind a non-localhost interface.

| Method | Route | Params / body | Returns |
|---|---|---|---|
| `GET` | `/health` | — | `{ status, version, symbol_count }` |
| `POST` | `/query` | `{ name, limit=50 }` | `{ name, count, symbols[] }` |
| `POST` | `/batch` | `{ queries: [name, …] }` | `{ results: [...] }` |
| `GET` | `/api/query` | `?symbol=&page=0&page_size=50&budget=8000` | `{ query, matches[], pagination, budget }` |
| `GET` | `/api/expand` | `?node=<id>&budget=2000` | `{ node, incoming[], outgoing[], budget }` |
| `GET` | `/api/impact` | `?symbol=&depth=5` | `{ symbol, depth, reachable_count, edge_count, nodes[] }` |
| `GET` | `/api/source` | `?file=&line=1&context=5` | `{ file, target_line, context_lines, total_lines, snippet[] }` |

    curl http://127.0.0.1:27787/health
    curl 'http://127.0.0.1:27787/api/query?symbol=SymbolGraph&page=0&page_size=50'

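
The same requests can be made from Python's standard library. This is a minimal sketch assuming the default bind address and the query parameters listed in the table; `api_query_url` and `fetch_json` are hypothetical helper names, and the fetch only works while a daemon started with `--http` is running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:27787"  # default bind from `server start --http`


def api_query_url(symbol: str, page: int = 0, page_size: int = 50,
                  budget: int = 8000, base_url: str = BASE_URL) -> str:
    """Build a GET /api/query URL with properly escaped parameters."""
    params = urlencode(
        {"symbol": symbol, "page": page, "page_size": page_size, "budget": budget}
    )
    return f"{base_url}/api/query?{params}"


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = api_query_url("SymbolGraph")
    print(url)
    # Uncomment once a daemon is serving HTTP:
    # print(fetch_json(f"{BASE_URL}/health"))
```

Using `urlencode` rather than string concatenation keeps symbol names with spaces or other reserved characters from producing a malformed query string.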