fix: purge dataflow rows keyed by call_edge_id before edge deletion#1662
Conversation
Impact: 4 functions changed, 8 affected
Impact: 1 functions changed, 5 affected
The dataflow table has a call_edge_id column that REFERENCES edges(id). When a file is deleted during an incremental rebuild, purgeFilesData deleted dataflow rows matched by source_id/target_id and via dataflow_vertices, but missed rows whose call_edge_id pointed to a calls edge touching the deleted file — where the dataflow row's own source_id/target_id belonged to other files (cross-file inter-procedural stitching). With better-sqlite3's default PRAGMA foreign_keys = ON, the subsequent edge deletion failed silently, leaving stale nodes behind and causing a node count mismatch after file deletion in incremental builds. Fix: add a dataflowByCallEdge purge step in preparePurgeStmts that deletes dataflow rows referencing edges that touch the deleted file, executed immediately before the edges delete. The call_edge_id column is optional (added in schema v18), so the statement is wrapped in tryPrepare and invoked with ?. on the result. Closes #1645 Impact: 3 functions changed, 8 affected
Greptile SummaryThis PR fixes a FK-constraint gap in the WASM/JS incremental file-deletion path:
Confidence Score: 4/5The core database fix is correct and well-ordered, but the simplified printRoles function introduces a definite display regression in the CLI overview output. countRoles() in module-map.ts always adds a synthetic dead aggregate key on top of the individual dead-callable/dead-leaf sub-role keys. The removed hasDeadSubRoles guard was the only thing preventing printRoles from summing all of them together. On any database that has dead-role nodes, the classified symbols total shown by codegraph overview will be inflated. src/presentation/queries-cli/overview.ts - printRoles should either restore the hasDeadSubRoles exclusion or countRoles should stop injecting the synthetic dead aggregate key when sub-role keys are present. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[purgeFileData / purgeFilesData] --> B[embeddings]
B --> C[cfgEdges / cfgBlocks]
C --> D["dataflow (source_id / target_id)"]
D --> E["dataflowByVertex (source_vertex / target_vertex)"]
E --> F["dataflowByCallEdge NEW - call_edge_id -> edges touching file"]
F --> G[dataflowSummary / dataflowVertices]
G --> H[complexity / nodeMetrics / astNodes]
H --> I["edges - FK constraint satisfied"]
I --> J[nodes]
J --> K[fileHashes]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
A[purgeFileData / purgeFilesData] --> B[embeddings]
B --> C[cfgEdges / cfgBlocks]
C --> D["dataflow (source_id / target_id)"]
D --> E["dataflowByVertex (source_vertex / target_vertex)"]
E --> F["dataflowByCallEdge NEW - call_edge_id -> edges touching file"]
F --> G[dataflowSummary / dataflowVertices]
G --> H[complexity / nodeMetrics / astNodes]
H --> I["edges - FK constraint satisfied"]
I --> J[nodes]
J --> K[fileHashes]
Reviews (6): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile |
| try { | ||
| ctx.nativeDb.exec('PRAGMA foreign_keys = OFF'); | ||
| } catch { | ||
| // exec may not exist on very old addon versions — safe to ignore | ||
| } |
There was a problem hiding this comment.
PRAGMA foreign_keys = OFF applies to the entire ctx.nativeDb connection for the rest of its lifetime, not just the buildGraph() call. If any JS-side code later writes to ctx.nativeDb (e.g., post-build gap-repair steps that share this connection), those writes will also run without FK enforcement. The comment "restored automatically when this connection is closed" is accurate for SQLite's session-scoped PRAGMAs, but could be misread as an explicit teardown. Worth verifying that nothing after buildGraph() writes to nativeDb expecting FK protection, or adding ctx.nativeDb.exec('PRAGMA foreign_keys = ON') right after buildGraph() returns to narrow the scope.
There was a problem hiding this comment.
Fixed — added ctx.nativeDb.exec('PRAGMA foreign_keys = ON') immediately after buildGraph() returns (before JSON.parse), so FK enforcement is restored as soon as the Rust-side purge+build is done. The comment was also updated to reflect this explicit restore rather than the previous wording that implied it was only restored on connection close.
Codegraph Impact Analysis17 functions changed → 37 callers affected across 24 files
|
…le JSDoc (#1662) - Narrow FK-off scope in native-orchestrator: restore PRAGMA foreign_keys = ON immediately after nativeDb.buildGraph() returns instead of relying on connection close, so post-build writes retain FK protection - Remove orphaned JSDoc block before runInterproceduralStitch that described the old buildInterproceduralStitch wrapper semantics; the correct JSDoc for the internal helper was already present Impact: 1 functions changed, 5 affected
|
Addressed both Greptile findings:
|
| try { | ||
| ctx.nativeDb.exec('PRAGMA foreign_keys = OFF'); | ||
| } catch { | ||
| // exec may not exist on very old addon versions — safe to ignore | ||
| } | ||
|
|
||
| const resultJson = ctx.nativeDb.buildGraph( | ||
| ctx.rootDir, | ||
| JSON.stringify(ctx.config), | ||
| JSON.stringify(ctx.aliases), | ||
| JSON.stringify(ctx.opts), | ||
| ); | ||
|
|
||
| // Restore FK enforcement immediately after buildGraph() so any subsequent | ||
| // writes to this connection (gap-repair, structure patch) retain FK protection. | ||
| try { | ||
| ctx.nativeDb.exec('PRAGMA foreign_keys = ON'); | ||
| } catch { | ||
| // safe to ignore on very old addon versions | ||
| } |
There was a problem hiding this comment.
If
buildGraph() throws (native addon OOM, missing file, internal Rust panic, etc.), the PRAGMA foreign_keys = ON restore never executes. ctx.nativeDb lives in the shared PipelineContext and — given the comment explicitly calls out "gap-repair, structure patch" as subsequent writers — any outer catch block or pipeline teardown that reuses the connection would do so with FK enforcement still off. A try/finally bracket eliminates the gap.
| try { | |
| ctx.nativeDb.exec('PRAGMA foreign_keys = OFF'); | |
| } catch { | |
| // exec may not exist on very old addon versions — safe to ignore | |
| } | |
| const resultJson = ctx.nativeDb.buildGraph( | |
| ctx.rootDir, | |
| JSON.stringify(ctx.config), | |
| JSON.stringify(ctx.aliases), | |
| JSON.stringify(ctx.opts), | |
| ); | |
| // Restore FK enforcement immediately after buildGraph() so any subsequent | |
| // writes to this connection (gap-repair, structure patch) retain FK protection. | |
| try { | |
| ctx.nativeDb.exec('PRAGMA foreign_keys = ON'); | |
| } catch { | |
| // safe to ignore on very old addon versions | |
| } | |
| try { | |
| ctx.nativeDb.exec('PRAGMA foreign_keys = OFF'); | |
| } catch { | |
| // exec may not exist on very old addon versions — safe to ignore | |
| } | |
| let resultJson: string; | |
| try { | |
| resultJson = ctx.nativeDb.buildGraph( | |
| ctx.rootDir, | |
| JSON.stringify(ctx.config), | |
| JSON.stringify(ctx.aliases), | |
| JSON.stringify(ctx.opts), | |
| ); | |
| } finally { | |
| // Restore FK enforcement so any subsequent writes to this connection | |
| // (gap-repair, structure patch) retain FK protection — even if buildGraph() | |
| // throws. | |
| try { | |
| ctx.nativeDb.exec('PRAGMA foreign_keys = ON'); | |
| } catch { | |
| // safe to ignore on very old addon versions | |
| } | |
| } |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Fixed — wrapped buildGraph() in try/finally so FK enforcement is always restored, even if the native call throws (OOM, Rust panic, etc.). The PRAGMA foreign_keys = ON restore is now in the finally block, mirroring the pattern you suggested.
#1662) Impact: 1 functions changed, 5 affected
docs check acknowledged
Summary
purgeFilesData/purgeFileData(the WASM/JS incremental file-deletion path) had a gap:dataflowrows whosecall_edge_idreferenced acallsedge touching the deleted file — but whose ownsource_id/target_idbelonged to other files (inter-procedural stitching) — survived the existing source/target and vertex-based deletions.better-sqlite3's defaultPRAGMA foreign_keys = ON, the subsequentDELETE FROM edgesstep then failed silently because those survivingdataflowrows held a FK reference oncall_edge_id. Stale nodes remained, causing a node count mismatch after file deletion in incremental builds.dataflowByCallEdgepurge step inpreparePurgeStmts(src/db/repository/build-stmts.ts) that removes anydataflowrow whosecall_edge_idreferences an edge touching the deleted file. The step runs just before theedgesdelete so the FK constraint is always satisfied.PRAGMA foreign_keysbeforebuildGraph) was already merged as prerequisite fix: issue-1174 incremental parity test fails — imports edge count mismatch (11 vs 6) #1644.build-parity.test.tswas discovered and filed as issue fix: build-parity test fails — native classifies main/square as dead-unresolved but WASM says leaf #1661 (unrelated to this change).Test plan
npx vitest run tests/integration/incremental-edge-parity.test.ts— all 10 tests pass, includingfile deletion > node count matchesnpm test— 3271 passed, 1 pre-existing failure (build-parity.test.ts > produces identical roles, tracked as fix: build-parity test fails — native classifies main/square as dead-unresolved but WASM says leaf #1661)npm run lint— no issuesCloses #1645