feat(dynamic-calls): phase 5 — Go MethodByName + C/C++ function pointer/dlsym #1655

Merged
carlos-alm merged 6 commits into main from feat/dynamic-call-phase5 on Jun 21, 2026

Conversation

@carlos-alm

Contributor

Summary

Stacked on PRs #1629-#1654. Phase 5 adds Go and C/C++ dynamic dispatch detection.

| Language | Pattern | Kind | Result |
| --- | --- | --- | --- |
| Go | v.MethodByName("Greet") | reflection | ✅ 100% P/R (literal) |
| Go | v.MethodByName(name) | computed-key | Sink edge |
| C/C++ | (*fp)(args) | unresolved-dynamic | Sink edge |
| C/C++ | dlsym(handle, "symbol") | reflection | ✅ 100% P/R (literal) |
| C/C++ | dlsym(handle, var) | unresolved-dynamic | Sink edge |

All mirrored in Rust extractors (go.rs, c.rs).
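The two Go rows of the table can be sketched as a small classifier. This is a minimal illustration, not the code in go.rs: `DynamicCall` and `classify_method_by_name` are hypothetical names, the function works on raw argument text rather than syntax nodes, and it only recognises double-quoted literals (raw backtick strings are ignored for brevity). Treating an empty literal as computed-key follows the flowchart in the review further down.

```rust
/// Dynamic-call kinds used for MethodByName in the summary table.
#[derive(Debug, PartialEq)]
enum DynamicCall {
    /// Literal method name: can be resolved to a concrete target.
    Reflection { name: String },
    /// Non-literal (or empty) argument: recorded as a sink edge keyed by the argument text.
    ComputedKey { key_expr: String },
}

/// Classify the source text of the first argument to `MethodByName`.
/// Hypothetical helper: a real extractor would inspect syntax nodes instead.
fn classify_method_by_name(arg_text: &str) -> DynamicCall {
    let t = arg_text.trim();
    // Strip one pair of surrounding double quotes, if both are present.
    let literal = t
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
        .filter(|name| !name.is_empty());
    match literal {
        Some(name) => DynamicCall::Reflection { name: name.to_string() },
        None => DynamicCall::ComputedKey { key_expr: t.to_string() },
    }
}
```

Under these assumptions, the text `"Greet"` (with quotes) yields a reflection call named Greet, while `name` yields a computed-key call whose key expression is the argument text.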

Benchmark results

  • dynamic-go: 100% P/R. MethodByName("Greet") resolves (Go functions are unqualified in the DB)
  • dynamic-c: 100% P/R. dlsym(handle, "greet") resolves (C functions are unqualified)

Test plan

  • 30/30 FFI tests pass (4 new for Go MethodByName + C function-pointer/dlsym)
  • dynamic-go benchmark: 100% P/R, 1 flagged (computed-key)
  • dynamic-c benchmark: 100% P/R, 1 flagged (unresolved-dynamic from *fp)
  • cargo check clean

Same pattern as Kotlin/Python/Ruby

Go and C use unqualified function names in the DB → literal MethodByName/dlsym resolve correctly without type-aware lookup.
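To illustrate why unqualified names make literal resolution a plain lookup, here is a rough sketch. The `HashMap` is a hypothetical stand-in for the DB, and `build_symbols` and `resolve_literal` are illustrative names that do not appear in the PR.

```rust
use std::collections::HashMap;

/// Build a name -> node id table from unqualified function names.
/// Hypothetical stand-in for the symbol rows stored in the DB.
fn build_symbols(names: &[&str]) -> HashMap<String, u32> {
    names
        .iter()
        .enumerate()
        .map(|(i, name)| (name.to_string(), i as u32 + 1))
        .collect()
}

/// Resolve a string literal taken from MethodByName/dlsym by exact name match.
/// No receiver type is consulted, which only works because names are unqualified.
fn resolve_literal(symbols: &HashMap<String, u32>, literal: &str) -> Option<u32> {
    symbols.get(literal).copied()
}
```

With names stored as `Greet` and `Farewell`, the literal `Greet` matches directly; a qualified spelling such as `Greeter.Greet` would miss, which is the case a type-aware lookup would otherwise have to handle.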

…on pointer detection

Go (go.ts + go.rs):
- v.MethodByName("name") — extracts string literal as reflection kind, resolves
  to top-level function (100% recall for literal names)
- v.MethodByName(variable) — computed-key kind, sink edge

C/C++ (c.ts + cpp.ts + c.rs):
- (*fp)(args) function pointer dereference — unresolved-dynamic sink edge
- dlsym(handle, "symbol") with literal — reflection kind, resolves if same unit
- dlsym(handle, variable) — unresolved-dynamic sink edge

Fixtures:
- dynamic-go: MethodByName("Greet") → 100% P/R; variable → 1 flagged
- dynamic-c: dlsym("greet") → 100% P/R; (*fp)() → 1 flagged

FFI tests: 30 total (added 4 for Go/C MethodByName/dlsym/function-pointer)
docs check acknowledged

Impact: 16 functions changed, 10 affected
@greptile-apps

greptile-apps Bot commented Jun 21, 2026

Contributor

Greptile Summary

Phase 5 extends dynamic dispatch detection to Go (reflect.Value.MethodByName) and C/C++ ((*fp)() function pointers, dlsym/dlvsym) in both the TypeScript and Rust extractors, following the same pattern established in phases 2–4.

  • Go: v.MethodByName("Greet") is extracted as dynamicKind: 'reflection' resolving to Greet; a variable argument falls back to computed-key with a sink edge. Benchmarks pass at 100% P/R.
  • C/C++: (*fp)(args) function-pointer dereference calls are tagged unresolved-dynamic; dlsym(handle, "greet") with a string literal resolves as reflection. The C++ extractor correctly notes that extern "C" symbols are unmangled, making literal-arg resolution valid. Benchmarks pass at 100% P/R.
  • Tests: 4 new FFI unit tests and 2 new benchmark fixtures (dynamic-go, dynamic-c) are added; 30/30 FFI tests pass.

Confidence Score: 5/5

Safe to merge; new extraction logic is well-tested, benchmarks at 100% P/R, and the two minor observations are edge-case inconsistencies that do not affect real-world Go or C/C++ codebases.

The changes follow the established Phase 2–4 pattern exactly. Both TypeScript and Rust extractors are updated in lockstep, 30/30 FFI tests pass, and two new benchmark fixtures confirm end-to-end resolution. The only findings are a defensive-coding inconsistency in get_go_arg and a degenerate-input divergence for MethodByName with an empty string, neither of which is reachable with valid Go or C source.

crates/codegraph-core/src/extractors/go.rs — minor inconsistency in the new get_go_arg helper worth a one-line fix before the next phase lands on top of it.
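The difference between the two child-lookup styles can be sketched without a parser. In the illustration below, a slice of `Option<&str>` stands in for an argument list's children, with `None` modelling a failed child lookup; the function names are hypothetical and this is not the code in go.rs or c.rs.

```rust
fn is_punct(kind: &str) -> bool {
    matches!(kind, "(" | ")" | ",")
}

/// `?` style: the first failed lookup ends the whole search with `None`.
fn first_arg_question_mark<'a>(children: &[Option<&'a str>]) -> Option<&'a str> {
    for child in children {
        let kind = (*child)?;
        if !is_punct(kind) {
            return Some(kind);
        }
    }
    None
}

/// `if let Some` style: a failed lookup is skipped and the scan continues.
fn first_arg_if_let<'a>(children: &[Option<&'a str>]) -> Option<&'a str> {
    for child in children {
        if let Some(kind) = *child {
            if !is_punct(kind) {
                return Some(kind);
            }
        }
    }
    None
}
```

The two only diverge when a lookup fails before the first real argument, which matches the review's point that the inconsistency is defensive rather than reachable with valid source.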

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/codegraph-core/src/extractors/go.rs | Adds get_go_arg helper and MethodByName detection; ? on child lookup is inconsistent with the C/C++ extractors, and the empty-string literal edge case causes a Rust/TS divergence. |
| crates/codegraph-core/src/extractors/c.rs | Adds dlsym/dlvsym reflection detection and (*fp)() function-pointer tagging; uses correct if let Some(child) pattern throughout, mirrors TypeScript extractor faithfully. |
| crates/codegraph-core/src/extractors/cpp.rs | Mirrors the C extractor with dlsym string-literal resolution and (*fp)() detection; comment correctly explains that extern C symbols are unmangled so literal-arg resolution is valid. |
| src/extractors/go.ts | TypeScript Go extractor cleanly refactored to add MethodByName detection with correct reflection/computed-key distinction; uses null-safe child skipping throughout. |
| src/extractors/c.ts | TypeScript C extractor refactored with getCArg helper, dlsym string-literal resolution, and (*fp)() tagging; logic matches the Rust extractor correctly. |
| src/extractors/cpp.ts | TypeScript C++ extractor updated symmetrically with C: dlsym string-literal resolution and (*fp)() detection; getCppArg duplicates getCArg body but is in a separate file. |
| tests/engines/dynamic-call-ffi.test.ts | Adds 4 focused unit tests covering Go MethodByName (literal + variable) and C (*fp)() + dlsym patterns; all assertions are well-targeted. |
| tests/benchmarks/resolution/resolution-benchmark.test.ts | Registers dynamic-go and dynamic-c at 100% precision/recall thresholds, consistent with the established Phase 4 benchmark pattern. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[call_expression node] --> B{fn_node kind}
    B -->|identifier| C{dlsym or dlvsym?}
    C -->|yes| D{second arg is string literal?}
    D -->|yes non-empty| E[emit reflection - name = literal value]
    D -->|no or empty| F[emit unresolved-dynamic]
    C -->|no| G[emit static call - name = fn_name]
    B -->|parenthesized_expression or pointer_expression| H[emit unresolved-dynamic - fp pattern]
    B -->|selector_expression Go only| I{field == MethodByName?}
    I -->|yes| J{arg 0 is string literal?}
    J -->|yes non-empty| K[emit reflection - name = stripped literal]
    J -->|variable or empty| L[emit computed-key - keyExpr = arg text]
    I -->|no| M[emit static method call]
    B -->|field_expression C struct| N[emit static call with receiver]

Reviews (7): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..."

Comment on lines +1 to +7
/* Fixture: C dynamic dispatch patterns
 * (*fp)(args) → flagged as unresolved-dynamic
 * dlsym(handle, "symbol") → flagged as unresolved-dynamic (cross-lib)
 */
#include <stdio.h>
#include <dlfcn.h>

Contributor

P2 Fixture header comment contradicts actual behavior

The file header says dlsym(handle, "symbol") → flagged as unresolved-dynamic (cross-lib), but both src/extractors/c.ts and crates/codegraph-core/src/extractors/c.rs resolve a string-literal dlsym call as dynamicKind: 'reflection'. The expected-edges.json correctly documents this as resolving to greet(), so the comment is the odd one out and will mislead future readers.

Contributor Author

Fixed — corrected the header comment in dispatch.c. The comment now accurately reflects that dlsym with a string literal resolves as reflection (matching the expected-edges.json), not as unresolved-dynamic.

Comment thread src/extractors/cpp.ts Outdated
Comment on lines 282 to 294
    // For simplicity, flag as unresolved (symbol might not be in the codebase)
    ctx.calls.push({
      name: '<dynamic:unresolved>',
      line: callLine,
      dynamic: true,
      dynamicKind: 'unresolved-dynamic',
    });
    return;
  }
  if (call.name) ctx.calls.push(call);

  if (fnName) ctx.calls.push({ name: fnName, line: callLine });
}

Contributor

P2 dlsym with a string literal silently downgrades to unresolved-dynamic in C++ while C resolves it

src/extractors/c.ts extracts dlsym(handle, "greet") as dynamicKind: 'reflection' and resolves it to the named function. The C++ extractor always emits unresolved-dynamic, losing resolution for string-literal arguments. Mixed-language projects using dlsym in .cpp files (the common case on Linux) will get no edge where the C extractor would produce one. If this is intentional (name mangling concerns), a note in the comment explaining the reasoning would avoid confusion.

Contributor Author

Fixed — added full dlsym string-literal resolution to the C++ extractor (both src/extractors/cpp.ts and crates/codegraph-core/src/extractors/cpp.rs), mirroring the C extractor. A string-literal dlsym argument now resolves as reflection; a variable argument falls back to unresolved-dynamic. Also added (*fp)(args) function-pointer detection to the Rust C++ extractor. The comment explains that extern "C" symbols are not mangled so string-literal resolution is valid.
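The behaviour described in this thread (a string-literal second argument resolves as reflection, anything else falls back to unresolved-dynamic) can be sketched as follows. This is a simplified model working on argument text; `DlsymCall` and `classify_dlsym` are hypothetical names, not the helpers in cpp.ts or cpp.rs.

```rust
#[derive(Debug, PartialEq)]
enum DlsymCall {
    /// Second argument is a non-empty string literal naming the symbol.
    Reflection { name: String },
    /// Variable, missing, or empty argument: only a sink edge can be recorded.
    UnresolvedDynamic,
}

/// Classify a call by callee name and the source text of its second argument.
/// Returns `None` for callees other than `dlsym`/`dlvsym` (ordinary static calls).
fn classify_dlsym(callee: &str, second_arg: Option<&str>) -> Option<DlsymCall> {
    if callee != "dlsym" && callee != "dlvsym" {
        return None;
    }
    let literal = second_arg
        .map(str::trim)
        .and_then(|t| t.strip_prefix('"'))
        .and_then(|t| t.strip_suffix('"'))
        .filter(|name| !name.is_empty());
    Some(match literal {
        Some(name) => DlsymCall::Reflection { name: name.to_string() },
        None => DlsymCall::UnresolvedDynamic,
    })
}
```

Using the literal text as the target name is only sound because, as noted above, extern "C" symbols are not mangled; a mangled C++ symbol would not match the declared function name.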

Comment on lines +761 to 775
                        edges.push(ComputedEdge {
                            source_id: caller_id,
                            target_id: fc.file_node_id,
                            kind: "calls".to_string(),
                            confidence: 0.0,
                            dynamic: 1,
                            dynamic_kind: Some(dk.clone()),
                        });
                    }
                }
            }
        }
    }

    emit_hierarchy_edges(ctx, file_input, fc.rel_path, edges);

Contributor

P2 First-byte dedup key is fragile — a future flag-only kind sharing a first letter would silently skip sink edges

dk.as_bytes()[0] is used as the third element of seen_sink_edges, meaning the set treats two kinds with the same leading byte as the same discriminator. Currently "eval" (e), "computed-key" (c), and "unresolved-dynamic" (u) are all distinct. However, "computed-literal" also starts with 'c'. If it were ever promoted to a flag-only kind — or a new kind like "call-indirect" was added — the dedup would incorrectly suppress a sink edge that should be emitted. Using the full string (as the JS engine already does via `${caller.id}:${fc.file_node_id}:${dk}`) would be safer.

Contributor Author

Already fixed — the merge from main resolved this. build_edges.rs now uses HashSet<(u32, u32, String)> with dk.clone() as the full string key instead of the first byte, matching the JS engine's approach.
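The collision the reviewer describes is easy to reproduce in isolation. The sketch below compares the two key shapes on plain tuples; the function names are hypothetical, and "call-indirect" is the reviewer's made-up example of a future kind, not one that exists in the codebase.

```rust
use std::collections::HashSet;

/// Count sink edges kept when the dedup key uses only the first byte of the kind.
fn dedup_by_first_byte(edges: &[(u32, u32, &str)]) -> usize {
    let mut seen: HashSet<(u32, u32, u8)> = HashSet::new();
    let mut kept = 0;
    for &(src, dst, kind) in edges {
        // An empty kind maps to 0 here so the sketch never panics.
        let first = kind.as_bytes().first().copied().unwrap_or(0);
        if seen.insert((src, dst, first)) {
            kept += 1;
        }
    }
    kept
}

/// Count sink edges kept when the dedup key uses the full kind string.
fn dedup_by_full_kind(edges: &[(u32, u32, &str)]) -> usize {
    let mut seen: HashSet<(u32, u32, String)> = HashSet::new();
    let mut kept = 0;
    for &(src, dst, kind) in edges {
        if seen.insert((src, dst, kind.to_string())) {
            kept += 1;
        }
    }
    kept
}
```

For a caller with both a "computed-key" and a "call-indirect" sink edge to the same file node, the first-byte key keeps one edge and the full-string key keeps two, at the cost of one `String` allocation per insert attempt.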

carlos-alm added a commit that referenced this pull request Jun 21, 2026
…on to C++ extractor (#1655)

Impact: 2 functions changed, 3 affected
@carlos-alm

Contributor Author

@greptileai

Comment on lines +1671 to +1679
// Back-fill dynamic_kind for flagged sink edges emitted by the native engine.
// Include dynamic_kind in the WHERE clause so two sink edges from the same caller
// to the same file node with different kinds don't clobber each other.
if (dynamicKindRows.length > 0) {
  const stmt = db.prepare(
    "UPDATE edges SET dynamic_kind = ? WHERE kind = 'calls' AND source_id = ? AND target_id = ? AND (dynamic_kind IS NULL OR dynamic_kind = ?)",
  );
  for (const r of dynamicKindRows) stmt.run(r[6], r[0], r[1], r[6]);
}

Contributor

P1 New UPDATE block is missing confidence = 0.0 AND dynamic = 1 guard

The pre-existing block at line 1665 correctly scopes to sink edges with AND confidence = 0.0 AND dynamic = 1. The new block added in this PR omits those predicates, so it will match any calls edge between the same (source_id, target_id) where dynamic_kind IS NULL — including ordinary high-confidence static call edges. In a codebase where runFoo calls greet() both directly and via dlsym(handle, "greet"), the static call edge (dynamic=NULL, confidence=0.8) would be incorrectly tagged with dynamic_kind = 'reflection' after this UPDATE runs. This taints the static-call metadata for all such pairs.

Contributor Author

Already addressed — the UPDATE block at this location includes AND confidence = 0.0 AND dynamic = 1 (visible in the current code at line 1667). The fix was applied via the 'fix: correct dispatch.c comment and add dlsym string-literal resolution to C++ extractor' commit. Static call edges sharing a (source_id, target_id) pair with a dynamic-dispatch row are not affected.
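To make the effect of the guard concrete, here is an in-memory analogue of the guarded UPDATE. It is a sketch under assumptions: the `Edge` struct and `backfill_dynamic_kind` are hypothetical, the fields simply mirror the columns named in the thread, and no database is involved.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    kind: &'static str,
    source_id: u32,
    target_id: u32,
    confidence: f64,
    dynamic: Option<u8>,
    dynamic_kind: Option<String>,
}

/// Tag matching edges with `dk` and return how many rows were updated.
/// Mirrors the WHERE clause plus the `confidence = 0.0 AND dynamic = 1` guard.
fn backfill_dynamic_kind(edges: &mut [Edge], source_id: u32, target_id: u32, dk: &str) -> usize {
    let mut updated = 0;
    for e in edges.iter_mut() {
        let same_pair = e.kind == "calls" && e.source_id == source_id && e.target_id == target_id;
        // The guard: only flagged sink edges qualify.
        let is_sink = e.confidence == 0.0 && e.dynamic == Some(1);
        // Do not overwrite an edge already tagged with a different kind.
        let kind_free = e.dynamic_kind.is_none() || e.dynamic_kind.as_deref() == Some(dk);
        if same_pair && is_sink && kind_free {
            e.dynamic_kind = Some(dk.to_string());
            updated += 1;
        }
    }
    updated
}
```

Dropping the `is_sink` condition reproduces the reviewer's scenario: a static edge with confidence 0.8 and a NULL dynamic flag between the same pair would also be tagged.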

@github-actions

github-actions Bot commented Jun 21, 2026

Contributor

Codegraph Impact Analysis

14 functions changed, 9 callers affected across 3 files

  • getCArg in src/extractors/c.ts:140 (3 transitive callers)
  • handleCCallExpression in src/extractors/c.ts:155 (2 transitive callers)
  • getCppArg in src/extractors/cpp.ts:197 (3 transitive callers)
  • handleCppCallExpression in src/extractors/cpp.ts:260 (2 transitive callers)
  • getGoArg in src/extractors/go.ts:230 (3 transitive callers)
  • handleGoCallExpr in src/extractors/go.ts:245 (2 transitive callers)
  • greet in tests/benchmarks/resolution/fixtures/dynamic-c/dispatch.c:8 (0 transitive callers)
  • farewell in tests/benchmarks/resolution/fixtures/dynamic-c/dispatch.c:12 (0 transitive callers)
  • runFunctionPointer in tests/benchmarks/resolution/fixtures/dynamic-c/dispatch.c:17 (0 transitive callers)
  • runDlsym in tests/benchmarks/resolution/fixtures/dynamic-c/dispatch.c:22 (0 transitive callers)
  • Greet in tests/benchmarks/resolution/fixtures/dynamic-go/reflect.go:8 (0 transitive callers)
  • Farewell in tests/benchmarks/resolution/fixtures/dynamic-go/reflect.go:12 (0 transitive callers)
  • runMethodByNameLiteral in tests/benchmarks/resolution/fixtures/dynamic-go/reflect.go:17 (0 transitive callers)
  • runMethodByNameVariable in tests/benchmarks/resolution/fixtures/dynamic-go/reflect.go:22 (0 transitive callers)

@carlos-alm carlos-alm force-pushed the feat/dynamic-call-phase5 branch from ea88009 to c278409 Compare June 21, 2026 04:49
Impact: 3 functions changed, 8 affected
@carlos-alm

Contributor Author

@greptileai

Comment on lines +139 to +150
/// Get the first non-punctuation argument from a Ruby call node's argument list.
fn get_first_ruby_arg<'a>(node: &Node<'a>) -> Option<Node<'a>> {
    let args = node.child_by_field_name("arguments")?;
    for i in 0..args.child_count() {
        let child = args.child(i)?;
        match child.kind() {
            "(" | ")" | "," => continue,
            _ => return Some(child),
        }
    }
    None
}

Contributor

P1 get_first_ruby_arg returns splat_argument nodes while the TypeScript mirror in ruby.ts skips them. For obj.send(*args, ...), the Rust extractor sees the splat_argument as the first argument, falls into the _ arm, and emits computed-key with keyExpr: "*args". The TypeScript extractor skips it, finds no remaining argument, and emits unresolved-dynamic. The two engines produce a different call kind for the same Ruby source.

Suggested change
 /// Get the first non-punctuation argument from a Ruby call node's argument list.
 fn get_first_ruby_arg<'a>(node: &Node<'a>) -> Option<Node<'a>> {
     let args = node.child_by_field_name("arguments")?;
     for i in 0..args.child_count() {
         let child = args.child(i)?;
         match child.kind() {
-            "(" | ")" | "," => continue,
+            "(" | ")" | "," | "splat_argument" => continue,
             _ => return Some(child),
         }
     }
     None
 }
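The divergence can be reproduced without a parser by treating the argument list as a slice of node-kind strings. This is only an illustration: `first_arg_kind` and its `skip_splat` flag are hypothetical, and the kind names other than "splat_argument" are placeholders.

```rust
/// Return the first argument kind, skipping punctuation and, when `skip_splat`
/// is set, `splat_argument` nodes. `kinds` stands in for the children of the
/// call's argument list.
fn first_arg_kind<'a>(kinds: &[&'a str], skip_splat: bool) -> Option<&'a str> {
    kinds.iter().copied().find(|kind| match *kind {
        "(" | ")" | "," => false,
        "splat_argument" => !skip_splat,
        _ => true,
    })
}
```

For obj.send(*args), the non-skipping variant returns the splat node (which the Rust extractor then reports as computed-key), while the skipping variant returns nothing (the path on which the TypeScript extractor emits unresolved-dynamic).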

@carlos-alm carlos-alm merged commit 1a3217c into main Jun 21, 2026
29 checks passed
@carlos-alm carlos-alm deleted the feat/dynamic-call-phase5 branch June 21, 2026 06:48
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 21, 2026