Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughAdds import extraction and resolution to the analysis pipeline: base analyzer gains import hooks, File entities track import nodes and resolved targets, Python analyzer implements full import handling, other language analyzers get stubs, and SourceAnalyzer wires extraction and IMPORTS edges into the two-pass analysis. Changes
Sequence Diagram(s)sequenceDiagram
participant SA as "SourceAnalyzer"
participant A as "Analyzer (language)"
participant F as "File"
participant G as "Graph"
SA->>A: First pass: add_file_imports(file)
A->>F: extract import AST nodes
F->>F: add_import(import_node)
SA->>A: Second pass: resolve_import(files, lsp, file_path, path, import_node)
A->>A: parse import AST (from/import, dotted_name, alias)
A->>A: _resolve_import_name -> resolve_type / resolve_method
A-->>F: return list[Entity]
F->>F: add_resolved_import(entity)
F->>G: create IMPORTS edge (file -> entity)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/analyzers/source_analyzer.py (1)
29-29:⚠️ Potential issue | 🔴 CriticalCritical:
CSharpAnalyzerfails to instantiate due to missing abstract method implementations.Verification confirms that
CSharpAnalyzercannot be instantiated because it doesn't implement the newly abstract methodsadd_file_imports()andresolve_import(). Since these are@abstractmethodinAbstractAnalyzer, Python raisesTypeErrorat module load time.JavaAnalyzerhas both methods correctly implemented;CSharpAnalyzermust be updated to match.Add these methods to
CSharpAnalyzer:def add_file_imports(self, file: File) -> None: """C# import tracking not yet implemented.""" pass def resolve_import(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, import_node: Node) -> list[Entity]: """C# import resolution not yet implemented.""" return []🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/source_analyzer.py` at line 29, CSharpAnalyzer currently cannot be instantiated because it omits the abstract methods defined in AbstractAnalyzer; implement add_file_imports(self, file: File) -> None and resolve_import(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, import_node: Node) -> list[Entity] on the CSharpAnalyzer class, with add_file_imports returning None (no-op or a brief docstring) and resolve_import returning an empty list, matching the signatures used by JavaAnalyzer so the class satisfies the abstract interface and no TypeError is raised at import time.
🧹 Nitpick comments (2)
tests/test_py_imports.py (2)
42-57: Consider using a public API instead of_query().The test accesses the internal
_query()method (prefixed with underscore) to verify relationships. While this works, it couples the test to implementation details. If a public query method exists or could be added, that would make the tests more maintainable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_py_imports.py` around lines 42 - 57, Tests are calling the internal g._query(...) which couples tests to implementation; replace uses of _query in tests (the two occurrences checking IMPORTS to Class and Function) with the public query API (e.g., g.query(...) or g.run_query(...)) if available, or add a small public wrapper method on the graph object (name it query or run_query) that delegates to _query and use that in tests; update the assertions to call this public method (referencing g._query in the diff to locate the spots to change) so tests rely only on the public API.
59-59: Remove debug print statement.Print statements in tests can clutter test output. Consider removing this or using
logging.debug()if the message is needed for debugging.🧹 Suggested fix
- print("✓ Import tracking test passed")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_py_imports.py` at line 59, Remove the debug print statement that emits "✓ Import tracking test passed" from the test (tests/test_py_imports.py); either delete that line entirely or replace it with a logging.debug call and ensure the module imports the logging library and uses logging.getLogger(__name__).debug(...) so test output remains clean.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/java/analyzer.py`:
- Around line 131-145: The annotations in add_file_imports and resolve_import
reference File and Entity but those names are only available via a wildcard
import; add explicit imports for File and Entity at the top of the module
(alongside existing imports like Node, SyncLanguageServer, Path) so the type
annotations no longer rely on the wildcard; update the module import list to
include File and Entity (used by add_file_imports and resolve_import) and run
linters to confirm F405 is resolved.
In `@api/analyzers/python/analyzer.py`:
- Around line 160-216: The resolve_import function currently only resolves class
imports by calling resolve_type and doesn't handle wildcard imports or function
imports; update resolve_import to (1) detect wildcard imports in
import_from_statement (e.g., the '*' or wildcard_import child) and, when found,
resolve the entire module by calling the module-resolution path used elsewhere
(use the same file/module lookup used by resolve_type) and add all exported
entities to res; and (2) when resolving individual imported names (both in
import_statement/aliased_import and
import_from_statement/aliased_import/dotted_name cases) try both
resolve_type(...) and resolve_method(...) (or the function-resolution helper
used in this analyzer) and extend res with results from both so functions and
classes are captured. Ensure these changes are made inside resolve_import so
existing control flow (import_statement vs import_from_statement) remains intact
and exceptions are still caught.
In `@test-project/c.java`:
- Line 1: The package declaration "package test-project;" in c.java is invalid
because hyphens are not allowed; update the package statement (the package
declaration at the top of c.java) to a valid Java package identifier such as
"package test.project;" or "package test_project;" and move the file to the
matching directory structure (e.g., test/project/c.java for test.project) so the
package name and filesystem layout align.
In `@tests/source_files/py_imports/module_b.py`:
- Around line 3-10: The resolver currently only walks up to class_definition and
therefore resolves ClassA but drops function_a; update the Python import target
resolution in api/analyzers/python/analyzer.py to also recognize and resolve
function definitions (and plain name bindings) in addition to class_definition
so that imports like ClassA and function_a from module_b.py are both linked;
modify the resolver that walks AST nodes (the logic that checks for
class_definition) to also check for function_definition and simple name
assignments so IMPORTS edges are created for function_a as well as ClassA.
---
Outside diff comments:
In `@api/analyzers/source_analyzer.py`:
- Line 29: CSharpAnalyzer currently cannot be instantiated because it omits the
abstract methods defined in AbstractAnalyzer; implement add_file_imports(self,
file: File) -> None and resolve_import(self, files: dict[Path, File], lsp:
SyncLanguageServer, file_path: Path, path: Path, import_node: Node) ->
list[Entity] on the CSharpAnalyzer class, with add_file_imports returning None
(no-op or a brief docstring) and resolve_import returning an empty list,
matching the signatures used by JavaAnalyzer so the class satisfies the abstract
interface and no TypeError is raised at import time.
---
Nitpick comments:
In `@tests/test_py_imports.py`:
- Around line 42-57: Tests are calling the internal g._query(...) which couples
tests to implementation; replace uses of _query in tests (the two occurrences
checking IMPORTS to Class and Function) with the public query API (e.g.,
g.query(...) or g.run_query(...)) if available, or add a small public wrapper
method on the graph object (name it query or run_query) that delegates to _query
and use that in tests; update the assertions to call this public method
(referencing g._query in the diff to locate the spots to change) so tests rely
only on the public API.
- Line 59: Remove the debug print statement that emits "✓ Import tracking test
passed" from the test (tests/test_py_imports.py); either delete that line
entirely or replace it with a logging.debug call and ensure the module imports
the logging library and uses logging.getLogger(__name__).debug(...) so test
output remains clean.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4d159efb-8f1f-417d-871d-0b2ef50aea9c
📒 Files selected for processing (12)
CI_OPTIMIZATION.mdapi/analyzers/analyzer.pyapi/analyzers/java/analyzer.pyapi/analyzers/python/analyzer.pyapi/analyzers/source_analyzer.pyapi/entities/file.pytest-project/a.ctest-project/b.py___test-project/c.javatests/source_files/py_imports/module_a.pytests/source_files/py_imports/module_b.pytests/test_py_imports.py
58861d2 to
d5e506f
Compare
gkorland
left a comment
There was a problem hiding this comment.
Rebased on staging and addressed review comments:
-
Unused imports in test: Removed unused
PathandFileimports fromtest_py_imports.py. -
Explicit imports in Java analyzer: Added
from ...entities.entity import Entityandfrom ...entities.file import Fileexplicit imports to avoid F405 linting errors from star imports. -
Function import resolution: Updated
resolve_type()in the Python analyzer to also walk up tofunction_definition(not justclass_definition). Added_resolve_import_name()helper that tries bothresolve_type()andresolve_method()so function imports (likefrom module_a import function_a) are correctly resolved. -
Invalid Java package name: Fixed
test-project/c.java— changedpackage test-project;topackage test_project;(hyphens are not valid in Java package identifiers).
Note: Wildcard import handling (from module import *) is not addressed in this PR as it would require a broader change to enumerate and resolve all exports from a module.
There was a problem hiding this comment.
Pull request overview
Adds Python import tracking to the analysis pipeline so Python files can produce (:File)-[:IMPORTS]->(:Class|:Function) relationships in the graph.
Changes:
- Extend
Fileto store import AST nodes and resolved import entities. - Add import extraction + import resolution hooks to
AbstractAnalyzerand wire them intoSourceAnalyzer’s two-pass pipeline. - Implement Python import capture/resolution and add a focused unit test + fixtures.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
api/entities/file.py |
Adds per-file storage for import nodes and resolved imports. |
api/analyzers/analyzer.py |
Extends the analyzer interface with add_file_imports and resolve_import. |
api/analyzers/source_analyzer.py |
Integrates import extraction in first pass and import resolution/edge creation in second pass. |
api/analyzers/python/analyzer.py |
Implements Python import capture + resolution logic and adjusts symbol/type resolution. |
api/analyzers/java/analyzer.py |
Adds stub methods to satisfy the new abstract interface. |
api/analyzers/csharp/analyzer.py |
Adds stub methods to satisfy the new abstract interface. |
tests/test_py_imports.py |
New test validating IMPORTS edges for from ... import ... in Python fixtures. |
tests/source_files/py_imports/module_a.py |
Fixture defining a class and function to be imported. |
tests/source_files/py_imports/module_b.py |
Fixture importing from module_a and using the imported symbols. |
CI_OPTIMIZATION.md |
Adds CI optimization documentation (appears unrelated to import tracking). |
test-project/a.c |
Adds a sample C file (appears unrelated to import tracking). |
test-project/b.py___ |
Adds a sample Python-like file with unusual extension (appears unrelated to import tracking). |
test-project/c.java |
Adds a sample Java file (appears unrelated to import tracking). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if decl.type not in ['class_definition', 'function_definition']: | ||
| decl = self.find_parent(resolved_node, ['class_definition', 'function_definition']) | ||
| if decl in file.entities: |
There was a problem hiding this comment.
resolve_type() now treats function_definition as a valid declaration and can return Function entities. Since this method is also used for return_type / parameters symbol resolution, it can create RETURNS/PARAMETERS edges to functions (not types) when the LSP resolves an annotation to a function symbol. Consider keeping resolve_type() limited to class/type declarations and handling function imports via resolve_method() (or a dedicated resolver) to avoid mixing semantics.
| if decl.type not in ['class_definition', 'function_definition']: | |
| decl = self.find_parent(resolved_node, ['class_definition', 'function_definition']) | |
| if decl in file.entities: | |
| if decl.type != 'class_definition': | |
| decl = self.find_parent(resolved_node, ['class_definition']) | |
| if decl and decl.type == 'class_definition' and decl in file.entities: |
There was a problem hiding this comment.
I considered this, but I believe the current approach is acceptable here. The purpose of this PR is to track imports, and resolve_type() expanding to include function_definition is what enables _resolve_import_name() to resolve function imports via a single call path before falling back to resolve_method().
The risk of resolve_type() returning Function entities for base_class/parameters/return_type resolution is low in practice — the LSP resolves annotations to their actual definition, and a type annotation like x: some_function would be unusual Python. That said, splitting resolve_type into type-only and definition-generic variants is a valid improvement but better suited for a dedicated refactor PR to avoid scope creep here.
Also added explicit Entity/File imports to the Python analyzer in 1b1ef5c to match the Java analyzer pattern and resolve F405 ruff warnings.
gkorland
left a comment
There was a problem hiding this comment.
Re: resolve_type returning Function entities: This is intentional behavior for this PR. The goal of import support is to track where functions and classes are imported from. When Python code does from module import some_function, resolve_type correctly resolves the import to the Function entity — this is by design, not a bug. The name resolve_type is admittedly a bit misleading (it now resolves both types and functions), but renaming it is outside the scope of this PR.
All other review comments have been addressed in the latest commit.
gkorland
left a comment
There was a problem hiding this comment.
Addressing remaining unresolved threads:
python/analyzer.py:215 (coderabbitai — wildcard imports + function resolution): The _resolve_import_name helper already tries both resolve_type() and resolve_method(), so function imports ARE handled. Wildcard imports (from module import *) are intentionally not handled — resolving them requires enumerating a module's __all__ or all exports, which is complex and out of scope for this PR.
python/analyzer.py:102 (copilot — resolve_type returning Functions): This is intentional. The purpose of this PR is to track imports, including function imports. When from module import some_function appears, resolve_type correctly resolves it to the Function entity. The name is admittedly a bit misleading but renaming is out of scope.
Agent Review SummaryComments Fixed (agreed & resolved)
Comments Declined (disagreed — threads left open)
Tests
CI StatusAll checks passing ✅ (Build, Docker, Playwright shards 1+2, CodeQL, CodeRabbit) |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
api/analyzers/python/analyzer.py (2)
155-156: Broad exception catch is acceptable here but could be narrowed.The
except Exceptioncatches all errors including unexpected ones, which is flagged by Ruff (BLE001). Since import extraction is non-critical and failures are logged, this defensive approach is reasonable. However, if specific failure modes are known (e.g., tree-sitter query errors), catching those explicitly would be more precise.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/python/analyzer.py` around lines 155 - 156, Narrow the blanket "except Exception as e" around the import-extraction block (the try/except where logger.debug(f"Failed to extract imports from {file.path}: {e}") is called) to catch only the expected failure types (for example ValueError, RuntimeError, and any tree-sitter specific exceptions your code uses such as TreeSitterError or QueryError), then optionally keep a final broad fallback to log truly unexpected exceptions; update the except blocks accordingly around the import extraction logic so the known errors are handled explicitly and the existing logger.debug message remains for the fallback.
101-105: Consider explicitNonecheck for consistency withresolve_method.When
find_parentreturnsNone, theif decl in file.entitiescheck works correctly (returnsFalse). However,resolve_method(lines 116-117) explicitly guards against this withif not method_dec: continue. Adding a similar guard here would make the intent clearer and keep the two resolution methods consistent.♻️ Suggested refactor for explicit None handling
decl = resolved_node if decl.type not in ['class_definition', 'function_definition']: decl = self.find_parent(resolved_node, ['class_definition', 'function_definition']) + if not decl: + continue if decl in file.entities: res.append(file.entities[decl])🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/python/analyzer.py` around lines 101 - 105, The current resolution logic uses decl = resolved_node or self.find_parent(...) and then checks membership in file.entities without an explicit None guard; to match resolve_method's pattern, add an explicit None check (e.g., if not decl: continue) before the membership test so that when find_parent returns None the intent is clear; update the block around decl, resolved_node, and file.entities accordingly.test-project/a.c (1)
1-11: Remove this C fixture file if not required by the Python-import PR.Verification shows
test-project/a.cand its exportedintsetNew()function are not referenced anywhere else in the codebase. If this file is not intentionally part of the PR scope, it should be removed to keep the changeset focused.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test-project/a.c` around lines 1 - 11, The file introduces an unused C test fixture exposing intsetNew(), which is not referenced by the PR; either remove this fixture file entirely or, if it is required, add a clear reference and usage (or a test) that justifies keeping the exported function intsetNew() and ensure build/import entries reference it; update the PR description to explain why the C fixture is included if you keep it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@api/analyzers/python/analyzer.py`:
- Around line 155-156: Narrow the blanket "except Exception as e" around the
import-extraction block (the try/except where logger.debug(f"Failed to extract
imports from {file.path}: {e}") is called) to catch only the expected failure
types (for example ValueError, RuntimeError, and any tree-sitter specific
exceptions your code uses such as TreeSitterError or QueryError), then
optionally keep a final broad fallback to log truly unexpected exceptions;
update the except blocks accordingly around the import extraction logic so the
known errors are handled explicitly and the existing logger.debug message
remains for the fallback.
- Around line 101-105: The current resolution logic uses decl = resolved_node or
self.find_parent(...) and then checks membership in file.entities without an
explicit None guard; to match resolve_method's pattern, add an explicit None
check (e.g., if not decl: continue) before the membership test so that when
find_parent returns None the intent is clear; update the block around decl,
resolved_node, and file.entities accordingly.
In `@test-project/a.c`:
- Around line 1-11: The file introduces an unused C test fixture exposing
intsetNew(), which is not referenced by the PR; either remove this fixture file
entirely or, if it is required, add a clear reference and usage (or a test) that
justifies keeping the exported function intsetNew() and ensure build/import
entries reference it; update the PR description to explain why the C fixture is
included if you keep it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0240b636-3407-47e1-a999-fd21190e4624
📒 Files selected for processing (11)
api/analyzers/analyzer.pyapi/analyzers/csharp/analyzer.pyapi/analyzers/java/analyzer.pyapi/analyzers/python/analyzer.pyapi/analyzers/source_analyzer.pyapi/entities/file.pytest-project/a.ctest-project/c.javatests/source_files/py_imports/module_a.pytests/source_files/py_imports/module_b.pytests/test_py_imports.py
✅ Files skipped from review due to trivial changes (3)
- tests/source_files/py_imports/module_a.py
- test-project/c.java
- api/analyzers/java/analyzer.py
🚧 Files skipped from review as they are similar to previous changes (5)
- tests/source_files/py_imports/module_b.py
- api/analyzers/source_analyzer.py
- tests/test_py_imports.py
- api/analyzers/analyzer.py
- api/entities/file.py
Migrated from FalkorDB/code-graph-backend PR #97. Original issue: FalkorDB/code-graph-backend#61 Resolves #535 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove unused Path and File imports from test_py_imports.py - Add explicit Entity and File imports to Java analyzer - Fix resolve_type to also resolve function_definition (not just classes) - Add _resolve_import_name helper to try both type and method resolution - Fix invalid Java package name in test-project/c.java (hyphens not allowed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CSharpAnalyzer was missing implementations for the new abstract methods added by this PR, causing TypeError on import. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix CSharpAnalyzer.resolve_import signature to match abstract (add Node type) - Switch add_file_imports to use self._captures() instead of query.captures() - Remove stale CI_OPTIMIZATION.md and test-project/b.py___ - Remove debug print from test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds explicit imports for Entity and File alongside the star import, matching the pattern used in the Java analyzer. Fixes F405 ruff warnings. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yzers Adds stub add_file_imports() and resolve_import() implementations to JavaScriptAnalyzer and KotlinAnalyzer to satisfy the abstract interface added in the import tracking feature. Without these stubs, the analyzers cannot be instantiated, breaking the seed_test_data step in CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ecf7473 to
4343b59
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test-project/a.c`:
- Line 2: The absolute include in the file (the line containing `#include`
"/src/ff.h") can break portability; change it to a relative or project-root safe
include (for example "ff.h" or a relative path like "../src/ff.h" or a
placeholder header) so the build/tests don't depend on an absolute /src path and
the preprocessor can locate the header reliably.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 66b60695-d519-43b1-92ee-4e908f0a833a
📒 Files selected for processing (13)
api/analyzers/analyzer.pyapi/analyzers/csharp/analyzer.pyapi/analyzers/java/analyzer.pyapi/analyzers/javascript/analyzer.pyapi/analyzers/kotlin/analyzer.pyapi/analyzers/python/analyzer.pyapi/analyzers/source_analyzer.pyapi/entities/file.pytest-project/a.ctest-project/c.javatests/source_files/py_imports/module_a.pytests/source_files/py_imports/module_b.pytests/test_py_imports.py
✅ Files skipped from review due to trivial changes (3)
- tests/source_files/py_imports/module_b.py
- tests/source_files/py_imports/module_a.py
- test-project/c.java
🚧 Files skipped from review as they are similar to previous changes (4)
- api/entities/file.py
- tests/test_py_imports.py
- api/analyzers/analyzer.py
- api/analyzers/java/analyzer.py
| @@ -0,0 +1,11 @@ | |||
| #include <stdio.h> | |||
| #include "/src/ff.h" | |||
There was a problem hiding this comment.
Absolute include path may cause portability issues.
The include path /src/ff.h is absolute and will fail unless the exact directory structure exists at the filesystem root. If this is test data for the analyzer, consider using a relative path or a placeholder that doesn't require the file to exist.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test-project/a.c` at line 2, The absolute include in the file (the line
containing `#include` "/src/ff.h") can break portability; change it to a relative
or project-root safe include (for example "ff.h" or a relative path like
"../src/ff.h" or a placeholder header) so the build/tests don't depend on an
absolute /src path and the preprocessor can locate the header reliably.
Migrated from falkordb/code-graph-backend#97
Summary
Add Python import tracking to code graph. Extends the analyzer to detect and track Python import statements, creating IMPORTS relationships in the graph.
Changes:
Resolves #535
Originally authored by @Copilot in falkordb/code-graph-backend#97
Summary by CodeRabbit
New Features
Tests