From e13b8ef09df4c57c6bc6ab7f9c4b2ed34c614c24 Mon Sep 17 00:00:00 2001
From: Laith Al-Saadoon <9553966+theagenticguy@users.noreply.github.com>
Date: Fri, 29 May 2026 16:53:51 -0500
Subject: [PATCH] docs(repo): clarify `sql` targets the temporal store, not the node/edge graph
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`codehub sql` and the MCP `sql` tool's `sql` arg run against the DuckDB
temporal store (cochanges + symbol_summaries). The node/edge graph moved
to lbug in ADR 0016 and is reached via the typed tools or Cypher. Docs
across the repo still said "SQL against the graph store", and the
opencodehub-guide skill shipped a whole "Graph schema" + "SQL
cheat-sheet" of `SELECT FROM nodes/relations` queries that ERROR against
the current store (field-report Issue 4).

Reworded every user-facing site: CLAUDE.md, AGENTS.md,
packages/cli/README.md, the `sql` --help in cli/src/index.ts,
cli/src/agent-context.ts, docs reference/cli.md +
tool-decision-matrix.mdx, and both copies of the opencodehub-guide
SKILL.md (.claude + plugins). The SKILL cheat-sheet is rewritten to REAL
lbug Cypher: single `:CodeNode` label with `kind` as a property,
snake_case props, per-type relationship labels, plus a small
temporal-SQL cochanges example.

Also fixed two MCP next-step hints (dependencies, list-findings) that
told users to run a relations SELECT — now a Cypher MATCH on the
relationship label.

No production query behavior changes — guidance and strings only.
Typecheck and biome clean.

A pre-existing, unrelated mcp unit test ("impact surfaces cochanges")
fails on clean main from cochange-harness drift since the #117
DuckDB-graph rip; not in any path this PR touches and not a required
check. Tracked separately.
---
 .claude/skills/opencodehub-guide/SKILL.md  | 107 ++++++++++++------
 AGENTS.md                                  |   2 +-
 CLAUDE.md                                  |   2 +-
 packages/cli/README.md                     |   2 +-
 packages/cli/src/agent-context.ts          |   2 +-
 packages/cli/src/index.ts                  |   4 +-
 .../docs/agents/tool-decision-matrix.mdx   |   2 +-
 .../docs/src/content/docs/reference/cli.md |   6 +-
 packages/mcp/src/tools/dependencies.ts     |   2 +-
 packages/mcp/src/tools/list-findings.ts    |   2 +-
 .../skills/opencodehub-guide/SKILL.md      | 106 +++++++++++------
 11 files changed, 156 insertions(+), 81 deletions(-)

diff --git a/.claude/skills/opencodehub-guide/SKILL.md b/.claude/skills/opencodehub-guide/SKILL.md
index d47419dc..dd420fa2 100644
--- a/.claude/skills/opencodehub-guide/SKILL.md
+++ b/.claude/skills/opencodehub-guide/SKILL.md
@@ -5,7 +5,7 @@ description: "Use when the user asks about OpenCodeHub itself — available MCP
 
 # OpenCodeHub Guide
 
-Quick reference for every OpenCodeHub MCP tool, MCP resource, and the DuckDB-backed graph schema.
+Quick reference for every OpenCodeHub MCP tool, MCP resource, and the graph + temporal store schema.
 
 ## Always Start Here
 
@@ -40,6 +40,7 @@ for the scope rationale.
 | Draft a PR description from the current diff | `codehub-pr-description` | "write the PR description", "summarize this branch" |
 | Write an onboarding guide with reading order | `codehub-onboarding` | "write ONBOARDING.md", "what should a new hire read first" |
 | Map inter-repo contracts for a group | `codehub-contract-map` | "map the contracts", "show the contract matrix for " |
+| Build a deterministic 9-item code-pack BOM | `codehub-code-pack` | "pack this repo for an LLM", "deterministic code pack", "pack the platform group" |
 | Draft an ADR (P1 — not yet shipped) | `codehub-adr` *(P1 backlog)* | — |
 
 Fire these directly; do not nest them inside analysis skills. Each is a
@@ -57,7 +58,7 @@ standalone artifact producer with its own preconditions and output path.
 | `mcp__opencodehub__impact` | Blast radius with risk tier + `confidenceBreakdown` |
 | `mcp__opencodehub__detect_changes` | Map an uncommitted or committed diff to affected symbols and flows |
 | `mcp__opencodehub__rename` | Graph-assisted multi-file rename; dry-run by default |
-| `mcp__opencodehub__sql` | Read-only DuckDB SQL against the graph (5 s timeout) |
+| `mcp__opencodehub__sql` | Read-only query: `sql` arg → temporal DuckDB (cochanges/summaries); `cypher` arg → lbug graph (5 s timeout) |
 | `mcp__opencodehub__signature` | Function signature lookup for a target symbol |
 
 ### HTTP / RPC surface
@@ -111,63 +112,97 @@ Lightweight reads for navigation (every URI uses the `codehub://` scheme):
 | `codehub://repo/{name}/context` | Stats + staleness envelope |
 | `codehub://repo/{name}/schema` | Live node kinds / relation types for `sql` |
 
-> Cluster and process navigation resources (`codehub://repo/{name}/clusters`, `codehub://repo/{name}/processes`, etc.) are slated for a later wave. Use `sql` against the `nodes` table filtered to `kind = 'Community'` or `kind = 'Process'` in the meantime.
+> Cluster and process navigation resources (`codehub://repo/{name}/clusters`, `codehub://repo/{name}/processes`, etc.) are slated for a later wave. Until then, use the typed tools or Cypher (below) filtered to `kind = 'Community'` / `kind = 'Process'`.
 
-## Graph schema
+## Where the graph lives (ADR 0016)
 
-The graph is a DuckDB-backed store. One unified `nodes` table, one `relations` table, an `embeddings` table, a `cochanges` side table, and `store_meta`.
+There are **two stores**, and they are queried differently:
 
-**Node kinds** (load-bearing order — new kinds are appended):
-File, Folder, Function, Class, Method, Interface, Constructor, Struct, Enum, Macro, Typedef, Union, Namespace, Trait, Impl, TypeAlias, Const, Static, Variable, Property, Record, Delegate, Annotation, Template, Module, CodeElement, Community, Process, Route, Tool.
+- **Graph tier — `graph.lbug`** (ladybug, Cypher dialect). Holds nodes, edges,
+  and embeddings. Query it via the typed tools (`query` / `context` / `impact` /
+  `route_map` / …) or, for bespoke questions, **Cypher** via the MCP `sql`
+  tool's `cypher` argument. There is NO `nodes` or `relations` SQL table.
+- **Temporal tier — `temporal.duckdb`** (DuckDB SQL). Holds only the
+  `cochanges` and `symbol_summaries` tables. The `sql` argument of the MCP
+  `sql` tool (and `codehub sql` on the CLI) targets THIS store.
 
-**Relation types** (append-only):
-CONTAINS, DEFINES, IMPORTS, CALLS, EXTENDS, IMPLEMENTS, HAS_METHOD, HAS_PROPERTY, ACCESSES, METHOD_OVERRIDES, OVERRIDES, METHOD_IMPLEMENTS, MEMBER_OF, PROCESS_STEP, HANDLES_ROUTE, FETCHES, HANDLES_TOOL, ENTRY_POINT_OF, WRAPS, QUERIES, REFERENCES, FOUND_IN, DEPENDS_ON, OWNED_BY.
+Pass exactly one of `sql` (temporal DuckDB) or `cypher` (lbug graph) to the MCP
+`sql` tool.
 
-Cochange edges live in a **separate `cochanges` table**, NOT in `relations`. Do not query `relations` for them.
+### Graph schema (lbug / Cypher)
 
-## SQL cheat-sheet (use `mcp__opencodehub__sql`)
+One node label `CodeNode` carrying `kind` as a **property** (NOT a per-kind
+label). One relationship table per relation type. Properties are **snake_case**
+(`file_path`, `start_line`, `inferred_label`, `step_count`, `entry_point_id`);
+a camelCase RETURN alias comes back as the alias you give it, but the stored
+property names are snake_case.
+
+**Node kinds** (`n.kind` values): File, Folder, Function, Class, Method,
+Interface, Constructor, Struct, Enum, Macro, Typedef, Union, Namespace, Trait,
+Impl, TypeAlias, Const, Static, Variable, Property, Record, Delegate,
+Annotation, Template, Module, CodeElement, Community, Process, Route, Tool,
+Finding, Dependency, Contributor, Repo, ProjectProfile, Section.
+
+**Relationship types** (each is its own edge label): CONTAINS, DEFINES, IMPORTS,
+CALLS, EXTENDS, IMPLEMENTS, HAS_METHOD, HAS_PROPERTY, ACCESSES, METHOD_OVERRIDES,
+OVERRIDES, METHOD_IMPLEMENTS, MEMBER_OF, PROCESS_STEP, HANDLES_ROUTE, FETCHES,
+HANDLES_TOOL, ENTRY_POINT_OF, WRAPS, QUERIES, REFERENCES, FOUND_IN, DEPENDS_ON,
+OWNED_BY.
+
+Cochanges live only in the **temporal** `cochanges` table (DuckDB SQL), never as
+graph edges.
+
+## Cypher cheat-sheet (MCP `sql` tool, `cypher` arg)
 
 All inbound callers of a function by name:
 
-```sql
-SELECT caller.name, caller.file_path, caller.start_line, r.confidence, r.reason
-FROM relations r
-JOIN nodes caller ON caller.id = r.from_id
-JOIN nodes callee ON callee.id = r.to_id
-WHERE r.type = 'CALLS'
-  AND callee.name = 'validateUser'
-  AND callee.kind = 'Function'
+```cypher
+MATCH (caller:CodeNode)-[r:CALLS]->(callee:CodeNode)
+WHERE callee.name = 'validateUser' AND callee.kind = 'Function'
+RETURN caller.name AS name, caller.file_path AS file, caller.start_line AS line,
+       r.confidence AS confidence, r.reason AS reason
 ORDER BY r.confidence DESC
-LIMIT 50;
+LIMIT 50
 ```
 
 Top communities by cohesion:
 
-```sql
-SELECT name, inferred_label, cohesion, symbol_count, keywords
-FROM nodes
-WHERE kind = 'Community'
-ORDER BY cohesion DESC
-LIMIT 20;
+```cypher
+MATCH (n:CodeNode)
+WHERE n.kind = 'Community'
+RETURN n.name AS name, n.inferred_label AS label, n.cohesion AS cohesion,
+       n.symbol_count AS symbols
+ORDER BY n.cohesion DESC
+LIMIT 20
 ```
 
 Process entry points:
 
-```sql
-SELECT n.name, n.inferred_label, n.step_count, entry.name AS entry_point
-FROM nodes n
-LEFT JOIN nodes entry ON entry.id = n.entry_point_id
+```cypher
+MATCH (n:CodeNode)
 WHERE n.kind = 'Process'
-ORDER BY n.step_count DESC;
+RETURN n.name AS name, n.inferred_label AS label, n.step_count AS steps,
+       n.entry_point_id AS entry_point
+ORDER BY n.step_count DESC
+```
+
+SCIP-confirmed CALLS edges only (strict impact):
+
+```cypher
+MATCH ()-[r:CALLS]->()
+WHERE r.confidence >= 0.95 AND r.reason STARTS WITH 'scip:'
+RETURN r
 ```
 
-SCIP-confirmed edges only (for strict impact queries):
+### Temporal SQL cheat-sheet (MCP `sql` tool, `sql` arg)
+
+Tightest co-change pairs (DuckDB SQL — temporal store):
 
 ```sql
-SELECT from_id, to_id, type, reason
-FROM relations
-WHERE confidence >= 0.95
-  AND reason LIKE 'scip:%';
+SELECT source_file, target_file, lift, cocommit_count
+FROM cochanges
+ORDER BY lift DESC
+LIMIT 20;
 ```
 
 ## Invariants agents must respect
diff --git a/AGENTS.md b/AGENTS.md
index b01398bf..bc86bf5b 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -14,7 +14,7 @@ tiers.
 - `impact` — dependents of a target up to a configurable depth, with a risk tier.
 - `detect_changes` — map an uncommitted or committed diff to affected symbols.
 - `rename` — graph-assisted multi-file rename; dry-run is the default.
-- `sql` — read-only SQL against the local graph store with a 5 s timeout.
+- `sql` — read-only SQL against the local temporal store (the `cochanges` and `symbol_summaries` tables), 5 s timeout. The node/edge graph lives in `graph.lbug` (ADR 0016) and is reached via the typed tools (`query`/`context`/`impact`) or Cypher via the MCP `sql` tool's `cypher` arg — NOT via this SQL path.
 
 Run `codehub analyze` after pulling new commits so the index stays aligned with the
 working tree. `codehub status` reports staleness.
diff --git a/CLAUDE.md b/CLAUDE.md
index 73db824c..e781f027 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -11,7 +11,7 @@ tiers.
 - `impact` — dependents of a target up to a configurable depth, with a risk tier.
 - `detect_changes` — map an uncommitted or committed diff to affected symbols.
 - `rename` — graph-assisted multi-file rename; dry-run is the default.
-- `sql` — read-only SQL against the local graph store with a 5 s timeout.
+- `sql` — read-only SQL against the local temporal store (the `cochanges` and `symbol_summaries` tables), 5 s timeout. The node/edge graph lives in `graph.lbug` (ADR 0016) and is reached via the typed tools (`query`/`context`/`impact`) or Cypher via the MCP `sql` tool's `cypher` arg — NOT via this SQL path.
 
 Run `codehub analyze` after pulling new commits so the index stays aligned with the
 working tree. `codehub status` reports staleness.
diff --git a/packages/cli/README.md b/packages/cli/README.md
index 90094782..d02719bf 100644
--- a/packages/cli/README.md
+++ b/packages/cli/README.md
@@ -63,7 +63,7 @@ top-level subcommands by phase of the workflow.
 | `doctor` | Probe the local environment and print actionable hints |
 | `ci-init` | Emit GitHub Actions / GitLab CI workflow scaffolds |
 | `augment` | Fast-path BM25 enrichment for editor PreToolUse hooks |
-| `sql` | Read-only SQL against the graph store with a 5 s timeout |
+| `sql` | Read-only SQL against the temporal store (cochanges + symbol_summaries) |
 | `group ` | Cross-repo groups: `create`, `list`, `delete`, `status`, `query`, `sync` |
 
 ## Design
diff --git a/packages/cli/src/agent-context.ts b/packages/cli/src/agent-context.ts
index 20fe28cd..1db8fb95 100644
--- a/packages/cli/src/agent-context.ts
+++ b/packages/cli/src/agent-context.ts
@@ -38,7 +38,7 @@ tiers.
 - \`impact\` — dependents of a target up to a configurable depth, with a risk tier.
 - \`detect_changes\` — map an uncommitted or committed diff to affected symbols.
 - \`rename\` — graph-assisted multi-file rename; dry-run is the default.
-- \`sql\` — read-only SQL against the local graph store with a 5 s timeout.
+- \`sql\` — read-only SQL against the local temporal store (cochanges + symbol_summaries), 5 s timeout; the node/edge graph is queried via the typed tools or Cypher via the MCP \`sql\` tool.
 
 Run \`codehub analyze\` after pulling new commits so the index stays aligned with the
 working tree. \`codehub status\` reports staleness.
diff --git a/packages/cli/src/index.ts b/packages/cli/src/index.ts
index 26bc9c90..3f996d3e 100644
--- a/packages/cli/src/index.ts
+++ b/packages/cli/src/index.ts
@@ -771,7 +771,9 @@ program
 
 program
   .command("sql ")
-  .description("Run a read-only SQL query against the graph store")
+  .description(
+    "Run a read-only SQL query against the temporal store (cochanges + symbol_summaries); the node/edge graph is queried via the typed tools or Cypher",
+  )
   .option("--repo ", "Registered repo name (default: current directory)")
   .option("--timeout ", "Per-query timeout in ms", (v) => Number.parseInt(v, 10), 5_000)
   .option("--json", "Emit JSON on stdout")
diff --git a/packages/docs/src/content/docs/agents/tool-decision-matrix.mdx b/packages/docs/src/content/docs/agents/tool-decision-matrix.mdx
index 738b6f2a..61b6b09d 100644
--- a/packages/docs/src/content/docs/agents/tool-decision-matrix.mdx
+++ b/packages/docs/src/content/docs/agents/tool-decision-matrix.mdx
@@ -36,7 +36,7 @@ anti-pattern column says what _not_ to reach for first.
 | "What's the license tier of my deps?" | `license_audit` | Tiers each transitive dep: permissive / weak-copyleft / strong-copyleft / proprietary / unknown. | `license-checker` raw output. |
 | "Which areas are getting riskier?" | `risk_trends` | Per-community trend lines + 30-day projection from temporal data. | One-off risk snapshots. |
 | "Who is changing what most, and where" | `risk_trends` + `owners` | Trends point to communities; `owners` names the people. | Either alone. |
-| "Bespoke graph query I can't express above" | `sql` | Read-only SQL against the local graph store, 5s timeout. | When a typed tool covers it — typed tools return `next_steps`. |
+| "Bespoke temporal query (cochanges / summaries)" | `sql` | Read-only SQL against the temporal store (cochanges + symbol_summaries), 5s timeout. NOT the node/edge graph. | A typed tool covers it; or you need the graph — use Cypher (MCP `sql` `cypher` arg). |
 
 ## Cross-repo group intents
 
diff --git a/packages/docs/src/content/docs/reference/cli.md b/packages/docs/src/content/docs/reference/cli.md
index 0409a7ec..d89928de 100644
--- a/packages/docs/src/content/docs/reference/cli.md
+++ b/packages/docs/src/content/docs/reference/cli.md
@@ -380,7 +380,11 @@ codehub augment
 
 ## `sql`
 
-Read-only SQL against the graph store. 5-second timeout by default.
+Read-only SQL against the **temporal store** — the DuckDB-backed `cochanges` and
+`symbol_summaries` tables. 5-second timeout by default. The node/edge graph lives
+in `graph.lbug` (see ADR 0016) and is **not** reachable from this SQL path; query
+it via the typed tools (`query` / `context` / `impact`) or Cypher via the MCP `sql`
+tool.
 
 ```bash title="usage"
 codehub sql
diff --git a/packages/mcp/src/tools/dependencies.ts b/packages/mcp/src/tools/dependencies.ts
index f4ea7204..b5f42843 100644
--- a/packages/mcp/src/tools/dependencies.ts
+++ b/packages/mcp/src/tools/dependencies.ts
@@ -122,7 +122,7 @@ export async function runDependencies(
         ]
       : [
           "call `query` with one of the names above to find import sites",
-          "call `sql` with 'SELECT * FROM relations WHERE type = ''DEPENDS_ON''' for the raw edges",
+          "call `sql` with cypher 'MATCH ()-[r:DEPENDS_ON]->() RETURN r' for the raw edges",
         ];
 
   return withNextSteps(
diff --git a/packages/mcp/src/tools/list-findings.ts b/packages/mcp/src/tools/list-findings.ts
index a4bbe117..bd85ce08 100644
--- a/packages/mcp/src/tools/list-findings.ts
+++ b/packages/mcp/src/tools/list-findings.ts
@@ -149,7 +149,7 @@ export async function runListFindings(
         ]
       : [
           "call `context` with a finding's file path for caller/callee neighbours",
-          "call `sql` with 'SELECT * FROM relations WHERE type = ''FOUND_IN''' for raw edges",
+          "call `sql` with cypher 'MATCH ()-[r:FOUND_IN]->() RETURN r' for raw edges",
         ];
 
   return withNextSteps(
diff --git a/plugins/opencodehub/skills/opencodehub-guide/SKILL.md b/plugins/opencodehub/skills/opencodehub-guide/SKILL.md
index 466713ae..dd420fa2 100644
--- a/plugins/opencodehub/skills/opencodehub-guide/SKILL.md
+++ b/plugins/opencodehub/skills/opencodehub-guide/SKILL.md
@@ -5,7 +5,7 @@ description: "Use when the user asks about OpenCodeHub itself — available MCP
 
 # OpenCodeHub Guide
 
-Quick reference for every OpenCodeHub MCP tool, MCP resource, and the DuckDB-backed graph schema.
+Quick reference for every OpenCodeHub MCP tool, MCP resource, and the graph + temporal store schema.
 
 ## Always Start Here
 
@@ -58,7 +58,7 @@ standalone artifact producer with its own preconditions and output path.
 | `mcp__opencodehub__impact` | Blast radius with risk tier + `confidenceBreakdown` |
 | `mcp__opencodehub__detect_changes` | Map an uncommitted or committed diff to affected symbols and flows |
 | `mcp__opencodehub__rename` | Graph-assisted multi-file rename; dry-run by default |
-| `mcp__opencodehub__sql` | Read-only DuckDB SQL against the graph (5 s timeout) |
+| `mcp__opencodehub__sql` | Read-only query: `sql` arg → temporal DuckDB (cochanges/summaries); `cypher` arg → lbug graph (5 s timeout) |
 | `mcp__opencodehub__signature` | Function signature lookup for a target symbol |
 
 ### HTTP / RPC surface
@@ -112,63 +112,97 @@ Lightweight reads for navigation (every URI uses the `codehub://` scheme):
 | `codehub://repo/{name}/context` | Stats + staleness envelope |
 | `codehub://repo/{name}/schema` | Live node kinds / relation types for `sql` |
 
-> Cluster and process navigation resources (`codehub://repo/{name}/clusters`, `codehub://repo/{name}/processes`, etc.) are slated for a later wave. Use `sql` against the `nodes` table filtered to `kind = 'Community'` or `kind = 'Process'` in the meantime.
+> Cluster and process navigation resources (`codehub://repo/{name}/clusters`, `codehub://repo/{name}/processes`, etc.) are slated for a later wave. Until then, use the typed tools or Cypher (below) filtered to `kind = 'Community'` / `kind = 'Process'`.
 
-## Graph schema
+## Where the graph lives (ADR 0016)
 
-The graph is a DuckDB-backed store. One unified `nodes` table, one `relations` table, an `embeddings` table, a `cochanges` side table, and `store_meta`.
+There are **two stores**, and they are queried differently:
 
-**Node kinds** (load-bearing order — new kinds are appended):
-File, Folder, Function, Class, Method, Interface, Constructor, Struct, Enum, Macro, Typedef, Union, Namespace, Trait, Impl, TypeAlias, Const, Static, Variable, Property, Record, Delegate, Annotation, Template, Module, CodeElement, Community, Process, Route, Tool.
+- **Graph tier — `graph.lbug`** (ladybug, Cypher dialect). Holds nodes, edges,
+  and embeddings. Query it via the typed tools (`query` / `context` / `impact` /
+  `route_map` / …) or, for bespoke questions, **Cypher** via the MCP `sql`
+  tool's `cypher` argument. There is NO `nodes` or `relations` SQL table.
+- **Temporal tier — `temporal.duckdb`** (DuckDB SQL). Holds only the
+  `cochanges` and `symbol_summaries` tables. The `sql` argument of the MCP
+  `sql` tool (and `codehub sql` on the CLI) targets THIS store.
 
-**Relation types** (append-only):
-CONTAINS, DEFINES, IMPORTS, CALLS, EXTENDS, IMPLEMENTS, HAS_METHOD, HAS_PROPERTY, ACCESSES, METHOD_OVERRIDES, OVERRIDES, METHOD_IMPLEMENTS, MEMBER_OF, PROCESS_STEP, HANDLES_ROUTE, FETCHES, HANDLES_TOOL, ENTRY_POINT_OF, WRAPS, QUERIES, REFERENCES, FOUND_IN, DEPENDS_ON, OWNED_BY.
+Pass exactly one of `sql` (temporal DuckDB) or `cypher` (lbug graph) to the MCP
+`sql` tool.
 
-Cochange edges live in a **separate `cochanges` table**, NOT in `relations`. Do not query `relations` for them.
+### Graph schema (lbug / Cypher)
 
-## SQL cheat-sheet (use `mcp__opencodehub__sql`)
+One node label `CodeNode` carrying `kind` as a **property** (NOT a per-kind
+label). One relationship table per relation type. Properties are **snake_case**
+(`file_path`, `start_line`, `inferred_label`, `step_count`, `entry_point_id`);
+a camelCase RETURN alias comes back as the alias you give it, but the stored
+property names are snake_case.
+
+**Node kinds** (`n.kind` values): File, Folder, Function, Class, Method,
+Interface, Constructor, Struct, Enum, Macro, Typedef, Union, Namespace, Trait,
+Impl, TypeAlias, Const, Static, Variable, Property, Record, Delegate,
+Annotation, Template, Module, CodeElement, Community, Process, Route, Tool,
+Finding, Dependency, Contributor, Repo, ProjectProfile, Section.
+
+**Relationship types** (each is its own edge label): CONTAINS, DEFINES, IMPORTS,
+CALLS, EXTENDS, IMPLEMENTS, HAS_METHOD, HAS_PROPERTY, ACCESSES, METHOD_OVERRIDES,
+OVERRIDES, METHOD_IMPLEMENTS, MEMBER_OF, PROCESS_STEP, HANDLES_ROUTE, FETCHES,
+HANDLES_TOOL, ENTRY_POINT_OF, WRAPS, QUERIES, REFERENCES, FOUND_IN, DEPENDS_ON,
+OWNED_BY.
+
+Cochanges live only in the **temporal** `cochanges` table (DuckDB SQL), never as
+graph edges.
+
+## Cypher cheat-sheet (MCP `sql` tool, `cypher` arg)
 
 All inbound callers of a function by name:
 
-```sql
-SELECT caller.name, caller.file_path, caller.start_line, r.confidence, r.reason
-FROM relations r
-JOIN nodes caller ON caller.id = r.from_id
-JOIN nodes callee ON callee.id = r.to_id
-WHERE r.type = 'CALLS'
-  AND callee.name = 'validateUser'
-  AND callee.kind = 'Function'
+```cypher
+MATCH (caller:CodeNode)-[r:CALLS]->(callee:CodeNode)
+WHERE callee.name = 'validateUser' AND callee.kind = 'Function'
+RETURN caller.name AS name, caller.file_path AS file, caller.start_line AS line,
+       r.confidence AS confidence, r.reason AS reason
 ORDER BY r.confidence DESC
-LIMIT 50;
+LIMIT 50
 ```
 
 Top communities by cohesion:
 
-```sql
-SELECT name, inferred_label, cohesion, symbol_count, keywords
-FROM nodes
-WHERE kind = 'Community'
-ORDER BY cohesion DESC
-LIMIT 20;
+```cypher
+MATCH (n:CodeNode)
+WHERE n.kind = 'Community'
+RETURN n.name AS name, n.inferred_label AS label, n.cohesion AS cohesion,
+       n.symbol_count AS symbols
+ORDER BY n.cohesion DESC
+LIMIT 20
 ```
 
 Process entry points:
 
-```sql
-SELECT n.name, n.inferred_label, n.step_count, entry.name AS entry_point
-FROM nodes n
-LEFT JOIN nodes entry ON entry.id = n.entry_point_id
+```cypher
+MATCH (n:CodeNode)
 WHERE n.kind = 'Process'
-ORDER BY n.step_count DESC;
+RETURN n.name AS name, n.inferred_label AS label, n.step_count AS steps,
+       n.entry_point_id AS entry_point
+ORDER BY n.step_count DESC
+```
+
+SCIP-confirmed CALLS edges only (strict impact):
+
+```cypher
+MATCH ()-[r:CALLS]->()
+WHERE r.confidence >= 0.95 AND r.reason STARTS WITH 'scip:'
+RETURN r
 ```
 
-SCIP-confirmed edges only (for strict impact queries):
+### Temporal SQL cheat-sheet (MCP `sql` tool, `sql` arg)
+
+Tightest co-change pairs (DuckDB SQL — temporal store):
 
 ```sql
-SELECT from_id, to_id, type, reason
-FROM relations
-WHERE confidence >= 0.95
-  AND reason LIKE 'scip:%';
+SELECT source_file, target_file, lift, cocommit_count
+FROM cochanges
+ORDER BY lift DESC
+LIMIT 20;
 ```
 
 ## Invariants agents must respect