feat(dynamic-calls): phase 6 — C#/Swift/Elixir/Lua long-tail dynamic dispatch#1657
Conversation
…dispatch
C# (csharp.ts):
- type.GetMethod("name") / GetRuntimeMethod / GetDeclaredMethod → reflection, keyExpr captured
- method.Invoke(target, args) → unresolved-dynamic sink edge
Swift (swift.ts):
- NSSelectorFromString("greet") → reflection kind, 100% P/R (top-level fns unqualified)
- performSelector → unresolved-dynamic sink edge
Elixir (elixir.ts):
- apply(module, :function, args) with atom literal → reflection kind
- apply(module, variable, args) → computed-key sink edge
Lua (lua.ts):
- load(code) / loadstring(code) / dofile(code) → eval kind, sink edge
- t["key"]() string-literal bracket call → resolved (computed-literal, 100% P/R)
- t[k]() variable bracket call → computed-key sink edge
Fixtures:
- dynamic-csharp: 0%/0% (class-qualified names, tracked for RES-3 in #1650)
- dynamic-swift: 100%/100% (NSSelectorFromString → top-level fn resolves)
- dynamic-elixir: 0%/0% (module-qualified names, tracked for RES-3)
- dynamic-lua: 100%/100% (t["greet"]() → local greet() resolves)
Rust parity gap filed as #1656 (C#, Swift, Elixir, Lua Rust extractors not updated).
docs check acknowledged
Impact: 21 functions changed, 11 affected
Greptile SummaryPhase 6 extends dynamic-dispatch detection to C#, Swift, Elixir, and Lua by adding extractors for
Confidence Score: 4/5Safe to merge with one known defect: the Elixir extractor misidentifies 2-arity apply/2 calls, producing incorrect computed-key edges for anonymous function applications in any Elixir codebase. The Elixir apply handler reads the second argument unconditionally as a function name selector. For the 3-arity apply(module, :atom, args) form this is correct, but for the 2-arity apply(fun, args) form the second argument is an argument list — not a function name. Every apply/2 call site emits a spurious computed-key edge with keyExpr set to the argument list text, which is incorrect graph data for any Elixir project using anonymous function application. src/extractors/elixir.ts — the apply case block needs arity detection to avoid mishandling the 2-arity form. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Extractor sees call node] --> B{Language}
B --> CS[C#]
CS --> CS1{member_access_expression method name?}
CS1 -->|Invoke| CS2[unresolved-dynamic sink edge]
CS1 -->|GetMethod / GetRuntimeMethod / GetDeclaredMethod| CS3{string literal arg?}
CS3 -->|yes| CS4[reflection edge — name = literal]
CS3 -->|no| CS5[computed-key sink edge]
CS1 -->|other| CS6[normal call edge]
B --> SW[Swift]
SW --> SW1{call.name?}
SW1 -->|performSelector| SW2[unresolved-dynamic sink edge]
SW1 -->|NSSelectorFromString| SW3{direct first arg is string literal?}
SW3 -->|yes| SW4[reflection edge — name = literal]
SW3 -->|no| SW5[unresolved-dynamic sink edge]
SW1 -->|other| SW6[normal call edge]
B --> EX[Elixir]
EX --> EX1{keyword = apply?}
EX1 -->|3-arity: apply M F args| EX2{2nd arg type?}
EX2 -->|atom / atom_literal| EX3[reflection edge — name = stripped atom]
EX2 -->|identifier / other| EX4[computed-key sink edge]
EX1 -->|2-arity: apply fun args| EX5[computed-key sink edge — keyExpr = args list WRONG]
EX1 -->|no| EX6[normal call edge]
B --> LU[Lua]
LU --> LU1{nameNode type?}
LU1 -->|identifier: load / loadstring / dofile| LU2[eval sink edge]
LU1 -->|bracket_index_expression| LU3{key type?}
LU3 -->|string / string_literal| LU4[normal call — name = stripped literal]
LU3 -->|other| LU5[computed-key sink edge]
LU1 -->|other| LU6[normal call edge]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
A[Extractor sees call node] --> B{Language}
B --> CS[C#]
CS --> CS1{member_access_expression method name?}
CS1 -->|Invoke| CS2[unresolved-dynamic sink edge]
CS1 -->|GetMethod / GetRuntimeMethod / GetDeclaredMethod| CS3{string literal arg?}
CS3 -->|yes| CS4[reflection edge — name = literal]
CS3 -->|no| CS5[computed-key sink edge]
CS1 -->|other| CS6[normal call edge]
B --> SW[Swift]
SW --> SW1{call.name?}
SW1 -->|performSelector| SW2[unresolved-dynamic sink edge]
SW1 -->|NSSelectorFromString| SW3{direct first arg is string literal?}
SW3 -->|yes| SW4[reflection edge — name = literal]
SW3 -->|no| SW5[unresolved-dynamic sink edge]
SW1 -->|other| SW6[normal call edge]
B --> EX[Elixir]
EX --> EX1{keyword = apply?}
EX1 -->|3-arity: apply M F args| EX2{2nd arg type?}
EX2 -->|atom / atom_literal| EX3[reflection edge — name = stripped atom]
EX2 -->|identifier / other| EX4[computed-key sink edge]
EX1 -->|2-arity: apply fun args| EX5[computed-key sink edge — keyExpr = args list WRONG]
EX1 -->|no| EX6[normal call edge]
B --> LU[Lua]
LU --> LU1{nameNode type?}
LU1 -->|identifier: load / loadstring / dofile| LU2[eval sink edge]
LU1 -->|bracket_index_expression| LU3{key type?}
LU3 -->|string / string_literal| LU4[normal call — name = stripped literal]
LU3 -->|other| LU5[computed-key sink edge]
LU1 -->|other| LU6[normal call edge]
Reviews (7): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile |
| if (call.name === 'NSSelectorFromString') { | ||
| // Search the call_expression subtree for a string literal (BFS) | ||
| const queue: TreeSitterNode[] = [node]; | ||
| let literal: string | null = null; | ||
| while (queue.length > 0 && !literal) { | ||
| const curr = queue.shift()!; | ||
| if (curr.type === 'line_string_literal' || curr.type === 'string_literal') { | ||
| // Look for inner string_content node | ||
| for (let i = 0; i < curr.childCount; i++) { | ||
| const ch = curr.child(i); | ||
| if (ch?.type === 'string_content') { | ||
| literal = ch.text; | ||
| break; | ||
| } | ||
| } | ||
| if (!literal) literal = curr.text.replace(/^["']|["']$/g, ''); | ||
| break; | ||
| } | ||
| for (let i = 0; i < curr.childCount; i++) { | ||
| const ch = curr.child(i); | ||
| if (ch) queue.push(ch); | ||
| } | ||
| } | ||
| ctx.calls.push({ | ||
| name: literal ?? '<dynamic:unresolved>', | ||
| line: call.line, | ||
| dynamic: true, | ||
| dynamicKind: literal ? 'reflection' : 'unresolved-dynamic', | ||
| keyExpr: literal ?? undefined, | ||
| }); | ||
| return; | ||
| } | ||
|
|
||
| ctx.calls.push(call); | ||
| } | ||
|
|
There was a problem hiding this comment.
NSSelectorFromString BFS matches nested string literals
The BFS starts from the entire call_expression node and accepts the first line_string_literal or string_literal it finds anywhere in the subtree. For code like NSSelectorFromString(generateSelector("greet")) or NSSelectorFromString(dict["greet"]), the BFS descends into the nested call/subscript and extracts "greet" as if it were the directly-passed selector. This creates a false-positive reflection edge pointing to greet when the actual runtime selector is not statically knowable. The fix is to look only at the direct arguments of the call rather than BFS-ing the entire subtree.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Fixed — replaced the BFS over the entire call_expression subtree with a direct argument inspection. The new implementation walks only the direct children of the argument_list (or call_suffix/value_arguments), stopping at the first real argument. Nested string literals inside sub-expressions (e.g. NSSelectorFromString(generateSelector("greet"))) are no longer extracted.
| const argIter = iterPyArgs(node); | ||
| const firstArg = argIter.next().value as TreeSitterNode | undefined; | ||
| const secondArg = argIter.next().value as TreeSitterNode | undefined; | ||
| const receiver = firstArg?.type === 'identifier' ? firstArg.text : firstArg?.text; |
There was a problem hiding this comment.
No-op ternary — receiver is always
firstArg?.text
The condition firstArg?.type === 'identifier' ? firstArg.text : firstArg?.text evaluates to firstArg?.text in both branches. If the intent was to restrict receiver to simple identifiers (e.g. avoid setting a complex expression like obj.attr as the receiver), the else branch should be undefined; as written, complex expressions are silently accepted as receiver strings, which can cause spurious lookup mismatches downstream.
| const receiver = firstArg?.type === 'identifier' ? firstArg.text : firstArg?.text; | |
| const receiver = firstArg?.type === 'identifier' ? firstArg.text : undefined; |
There was a problem hiding this comment.
Fixed — changed the else branch from firstArg?.text to undefined. Now only simple identifiers are used as the receiver string; complex expressions (e.g. obj.attr, self.registry) are excluded, avoiding spurious lookup mismatches downstream.
| if targets.is_empty() { | ||
| if let Some(ref dk) = call.dynamic_kind { | ||
| if dk == "eval" || dk == "computed-key" || dk == "unresolved-dynamic" { | ||
| let sink_key = (caller_id, fc.file_node_id, dk.as_bytes()[0]); |
There was a problem hiding this comment.
First-byte dedup key is fragile for future
dynamic_kind values
dk.as_bytes()[0] discriminates the three current kinds by their first ASCII byte (e/c/u). Any future kind whose name starts with the same letter as an existing one (e.g. "computed-call" vs "computed-key") would silently be treated as a duplicate and the second sink edge would be suppressed. Using the full string as the discriminator is cheap (all values are short) and eliminates the hazard.
| let sink_key = (caller_id, fc.file_node_id, dk.as_bytes()[0]); | |
| let sink_key = (caller_id, fc.file_node_id, dk.as_str()); |
There was a problem hiding this comment.
Already fixed in the merge conflict resolution — the seen_sink_edges type was changed from HashSet<(u32, u32, u8)> (first byte) to HashSet<(u32, u32, String)> (full kind string), and the key construction was updated to use dk.clone(). The full kind string discriminator eliminates the first-byte collision hazard you identified.
| export type DynamicKind = | ||
| | 'computed-literal' // obj["foo"]() — resolvable; already emitted as normal edge | ||
| | 'computed-key' // obj[k]() — potentially resolvable via pts; else flagged | ||
| | 'reflection' // .call/.apply/.bind — usually resolved; flagged if target unknown |
There was a problem hiding this comment.
Misleading comment on
reflection — it is never flagged when unresolved
The inline comment says "flagged if target unknown" but reflection is intentionally excluded from FLAG_ONLY_KINDS in build-edges.ts; an unresolved reflection call is silently dropped, not emitted as a sink edge. The comment should say "resolved if target reachable; silently dropped if not (tracked by RES-3)".
| | 'reflection' // .call/.apply/.bind — usually resolved; flagged if target unknown | |
| | 'reflection' // .call/.apply/.bind — resolved if target reachable; silently dropped if not (RES-3) |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Updated — the comment now reads: 'resolved if target reachable; silently dropped if not (RES-3)', matching the actual behavior where reflection calls are excluded from FLAG_ONLY_KINDS and silently dropped when unresolved.
| try { | ||
| return db | ||
| .prepare( | ||
| `SELECT dynamic_kind, COUNT(*) AS count | ||
| FROM edges | ||
| WHERE dynamic_kind IS NOT NULL | ||
| GROUP BY dynamic_kind | ||
| ORDER BY count DESC`, | ||
| ) | ||
| .all() as DynamicCallCount[]; | ||
| } finally { |
There was a problem hiding this comment.
dynamicCallsData does not filter by confidence = 0
The query counts all edges where dynamic_kind IS NOT NULL, while the mirror query in resolution-benchmark.test.ts (extractFlaggedDynamicCalls) also requires confidence = 0. Today these are equivalent because dynamic_kind is only written to sink edges (confidence=0), but the invariant is implicit and not enforced here. If a future change sets dynamic_kind on a resolved edge the two counters will silently diverge. Adding AND confidence = 0 makes the intent explicit and future-proof.
There was a problem hiding this comment.
Fixed — added AND confidence = 0 to the WHERE clause in dynamicCallsData. This makes the intent explicit and ensures the query stays aligned with the benchmark's extractFlaggedDynamicCalls if dynamic_kind is ever written to a non-sink edge in the future.
Impact: 29 functions changed, 165 affected
…, roles query (#1657) Impact: 3 functions changed, 6 affected
…, roles query (#1657) Impact: 3 functions changed, 6 affected
Codegraph Impact Analysis26 functions changed → 16 callers affected across 8 files
|
…icate UPDATE (#1657) - swift.ts: fix NSSelectorFromString argument parsing — resolve through call_suffix/value_arguments/value_argument tree, look for line_str_text (not string_content). Fixes dynamic-swift benchmark: was 0/0 recall. - lua.ts: replace fragile positional child(childCount-2) with an id-based scan, eliminating the Greptile-flagged off-by-one hazard. - build-edges.ts: remove duplicate dynamic_kind UPDATE block that lacked the confidence=0.0 AND dynamic=1 guards. - c.ts, cpp.ts, csharp.ts: remove unused Call import. Impact: 3 functions changed, 8 affected
1692faa to
e8f8df5
Compare
|
Addressed all Greptile feedback and fixed CI failures: CI fixes:
Greptile feedback addressed (from latest review):
Inline comments addressed (in previous commit):
|
Resolves conflicts after phase 3 merged to main as #1653: - python.ts: keep identifier-only receiver fix - overview.ts: incorporate dead sub-role metadata from main - types.ts, resolution-benchmark.test.ts, dynamic-call-ffi.test.ts: auto-resolved to retain our phase 4/5/6 additions
docs check acknowledged Impact: 4 functions changed, 7 affected
Summary
Stacked on PRs #1629-#1655. Phase 6 completes the language-family detection sweep for the long-tail languages.
type.GetMethod("name")reflectionmethod.Invoke(target, args)unresolved-dynamicNSSelectorFromString("greet")reflectionperformSelectorunresolved-dynamicapply(module, :function, args)reflectionapply(module, variable, args)computed-keyload(code)/loadstringevalt["greet"]()t[k]()computed-keyRust parity gap filed as issue #1656 — Rust extractors for C#, Swift, Elixir, Lua not yet updated (WASM only).
Benchmark results
dynamic-csharp: 0%/0% recall (class-qualified method names → RES-3)dynamic-swift: 100%/100% — top-level Swift fns unqualified, same as Kotlin/Python/Ruby/Godynamic-elixir: 0%/0% recall (module-qualified function names → RES-3)dynamic-lua: 100%/100% — string-literal bracket calls resolve;load(code)flaggedPattern summary across all phases
100% recall languages (unqualified DB names): JS, TS, Python, Ruby, PHP, Go, C, Kotlin, Swift, Lua top-level
0% recall languages (class/module-qualified DB names): Java, Scala, Groovy, C#, Elixir → need RES-3 type-aware lookup
Issues filed