Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughReplaced stubbed C analyzer with a fully functional Tree-sitter-based implementation that extracts function definitions, struct definitions, include directives, and parameter details. The analyzer uses two-pass processing: first pass identifies definitions and includes, second pass resolves function call relationships. Changes
Sequence DiagramsequenceDiagram
participant User
participant CAnalyzer
participant Parser as Tree-sitter Parser
participant Graph
participant FileSystem
User->>CAnalyzer: analyze(path, graph)
CAnalyzer->>FileSystem: Read .c/.h files
FileSystem-->>CAnalyzer: File contents
rect rgba(100, 150, 200, 0.5)
Note over CAnalyzer,Graph: First Pass: Definitions & Includes
CAnalyzer->>Parser: Parse C code
Parser-->>CAnalyzer: AST
CAnalyzer->>CAnalyzer: Extract functions, structs, includes
CAnalyzer->>Graph: Add Function/Struct/File entities
CAnalyzer->>Graph: Create DEFINES edges
CAnalyzer->>Graph: Create INCLUDES edges
end
rect rgba(200, 150, 100, 0.5)
Note over CAnalyzer,Graph: Second Pass: Call Relationships
CAnalyzer->>Parser: Re-parse for function calls
Parser-->>CAnalyzer: AST
CAnalyzer->>CAnalyzer: Identify call sites
CAnalyzer->>Graph: Create CALLS edges
end
CAnalyzer-->>User: Graph populated
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Adds C #include directive handling intended to create INCLUDES edges between files in the graph, along with a unit test update to validate the relationship.
Changes:
- Replaced the (previously commented) C analyzer implementation with a concrete analyzer that attempts to parse C AST and create graph nodes/edges.
- Added
process_include_directiveand wired it into the C analyzer “first pass”. - Updated
tests/test_c_analyzer.pyto assertINCLUDESedges for an included header.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
api/analyzers/c/analyzer.py |
Introduces a C analyzer implementation that attempts to create File/Function/Struct nodes and INCLUDES edges from #include. |
tests/test_c_analyzer.py |
Adds assertions for INCLUDES neighbors and updates the test to call an analyze(...) method. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
api/analyzers/c/analyzer.py
Outdated
| def process_function_definition_node(self, node: Node, path: Path, | ||
| source_code: str) -> Optional[Function]: | ||
| """ | ||
| Processes a function definition node to extract function details. | ||
|
|
||
| Args: | ||
| node (Node): The AST node representing a function definition. | ||
| path (Path): The file path where the function is defined. | ||
|
|
||
| Returns: | ||
| Optional[Function]: A Function object containing details about the function, or None if the function name cannot be determined. | ||
| """ | ||
|
|
||
| # Extract function name | ||
| res = find_child_of_type(node, 'function_declarator') | ||
| if res is None: | ||
| return None | ||
|
|
||
| function_declarator = res[0] | ||
|
|
||
| res = find_child_of_type(function_declarator, 'identifier') | ||
| if res is None: | ||
| return None | ||
|
|
||
| identifier = res[0] | ||
| function_name = identifier.text.decode('utf-8') | ||
| logger.info(f"Function declaration: {function_name}") | ||
|
|
||
| # Extract function return type | ||
| res = find_child_of_type(node, 'primitive_type') | ||
| ret_type = 'Unknown' | ||
| if res is not None: | ||
| ret_type = res[0] | ||
| ret_type = ret_type.text.decode('utf-8') | ||
|
|
||
| # Extract function parameters | ||
| args = [] | ||
| res = find_child_of_type(function_declarator, 'parameter_list') | ||
| if res is not None: | ||
| parameters = res[0] | ||
|
|
||
| # Extract arguments and their types | ||
| for child in parameters.children: | ||
| if child.type == 'parameter_declaration': | ||
| arg = self.process_parameter_declaration(child) | ||
| args.append(arg) | ||
|
|
||
| # Extract function definition line numbers | ||
| start_line = node.start_point[0] | ||
| end_line = node.end_point[0] | ||
|
|
||
| # Create Function object | ||
| docs = '' | ||
| src = source_code[node.start_byte:node.end_byte] | ||
| f = Function(str(path), function_name, docs, ret_type, src, start_line, end_line) | ||
|
|
There was a problem hiding this comment.
Fixed in f4bbb1f. The rewrite removed all references to Function/Struct types. The analyzer now uses the generic Entity class consistent with other analyzers.
api/analyzers/c/analyzer.py
Outdated
| # Create file entity | ||
| file = File(os.path.dirname(path), path.name, path.suffix) | ||
| graph.add_file(file) | ||
|
|
||
| # Parse file | ||
| source_code = f.read() | ||
| tree = self.parser.parse(source_code) |
There was a problem hiding this comment.
Fixed in f4bbb1f. CAnalyzer no longer creates File objects — file creation is handled by SourceAnalyzer.first_pass() using File(path, tree).
tests/test_c_analyzer.py
Outdated
| # Test for include_directive edge creation | ||
| included_file = g.get_file('', 'myheader.h', '.h') | ||
| self.assertIsNotNone(included_file) | ||
|
|
||
| includes = g.get_neighbors([f.id], rel='INCLUDES') | ||
| self.assertEqual(len(includes), 3) | ||
| included_files = [node['properties']['name'] for node in includes['nodes']] | ||
| self.assertIn('myheader.h', included_files) |
There was a problem hiding this comment.
Fixed in f4bbb1f. Tests were completely rewritten as standalone unit tests that don't use Graph or FalkorDB. Include extraction is tested via get_include_paths() which returns a simple list of strings.
api/analyzers/c/analyzer.py
Outdated
| if entity is not None: | ||
| # Add Function object to the graph | ||
| try: | ||
| graph.add_function(entity) |
There was a problem hiding this comment.
Fixed in f4bbb1f. The rewrite removed all add_function and add_struct calls. Entity creation is now handled by SourceAnalyzer using graph.add_entity() + connect_entities('DEFINES', ...), consistent with other analyzers.
api/analyzers/c/analyzer.py
Outdated
| # Create missing function | ||
| # Assuming this is a call to a native function e.g. 'printf' | ||
| callee_f = Function('/', callee_name, None, None, None, 0, 0) | ||
| graph.add_function(callee_f) |
There was a problem hiding this comment.
Fixed in f4bbb1f. All custom process_* methods and fallback Function creation were removed in the rewrite.
api/analyzers/c/analyzer.py
Outdated
| source_code = f.read() | ||
| tree = self.parser.parse(source_code) | ||
| try: | ||
| source_code = source_code.decode('utf-8') | ||
| except Exception as e: | ||
| logger.error(f"Failed decoding source code: {e}") | ||
| source_code = '' | ||
|
|
There was a problem hiding this comment.
Fixed in f4bbb1f. The rewrite delegates file parsing to SourceAnalyzer.first_pass() which uses file_path.read_bytes() → analyzer.parser.parse(source_code), consistent with all other analyzers.
api/analyzers/c/analyzer.py
Outdated
| if len(splitted) < 2: | ||
| logger.warning("Include path has no extension: %s", included_file_path) | ||
| return | ||
|
|
||
| # Create file entity for the included file | ||
| path = os.path.dirname(normalized_path) | ||
| name = os.path.basename(normalized_path) | ||
| ext = splitted[1] | ||
| included_file = File(path, name, ext) |
There was a problem hiding this comment.
Fixed in f4bbb1f. The old process_include_directive with os.path.splitext logic was removed. Include extraction is now handled by get_include_paths() using tree-sitter queries via self._captures().
api/analyzers/c/analyzer.py
Outdated
| # Create file entity for the included file | ||
| path = os.path.dirname(normalized_path) | ||
| name = os.path.basename(normalized_path) | ||
| ext = splitted[1] | ||
| included_file = File(path, name, ext) | ||
| graph.add_file(included_file) | ||
|
|
||
| # Connect the parent file to the included file | ||
| graph.connect_entities('INCLUDES', parent.id, included_file.id) |
There was a problem hiding this comment.
Fixed in f4bbb1f. The rewrite no longer creates File objects for included files. get_include_paths() returns raw path strings from #include directives.
tests/test_c_analyzer.py
Outdated
|
|
||
| g = Graph("c") | ||
| analyzer.analyze_local_folder(path, g) | ||
| analyzer.analyze(path, g) |
There was a problem hiding this comment.
Fixed in f4bbb1f. Tests were rewritten as standalone unit tests using CAnalyzer directly. No SourceAnalyzer or Graph dependency needed.
api/analyzers/c/analyzer.py
Outdated
| def __init__(self) -> None: | ||
| self.parser = Parser(C_LANGUAGE) | ||
|
|
There was a problem hiding this comment.
Fixed in f4bbb1f. CAnalyzer now calls super().__init__(Language(tsc.language())) and implements all required abstract methods: get_entity_label, get_entity_name, get_entity_docstring, get_entity_types, add_symbols, is_dependency, resolve_path, resolve_type, resolve_method, resolve_symbol, add_dependencies.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
api/analyzers/c/analyzer.py (3)
19-20:⚠️ Potential issue | 🟠 MajorCall the base initializer.
Skipping
AbstractAnalyzer.__init__leavesself.languageunset and bypasses the shared analyzer setup.♻️ Proposed fix
class CAnalyzer(AbstractAnalyzer): def __init__(self) -> None: - self.parser = Parser(C_LANGUAGE) + super().__init__(C_LANGUAGE)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/c/analyzer.py` around lines 19 - 20, The C analyzer's __init__ is not calling the base class initializer, so shared setup (like self.language) is skipped; update the C analyzer __init__ to call AbstractAnalyzer.__init__ (or super().__init__()) before setting self.parser = Parser(C_LANGUAGE), or explicitly set self.language = C_LANGUAGE if the base init requires parameters—ensure you invoke the base initializer (super().__init__()) in the __init__ method where Parser(C_LANGUAGE) is currently assigned.
394-405:⚠️ Potential issue | 🔴 CriticalParse first, then construct
File(path, tree)from bytes.
Fileis created before the tree exists, andfis typed asio.TextIOWrapper, sof.read()returnsstr. Line 402 then always falls into the exception path becausestr.decode()does not exist, which wipes every extracted function body.♻️ Proposed fix
- # Create file entity - file = File(os.path.dirname(path), path.name, path.suffix) - graph.add_file(file) - # Parse file - source_code = f.read() - tree = self.parser.parse(source_code) - try: - source_code = source_code.decode('utf-8') - except Exception as e: - logger.error(f"Failed decoding source code: {e}") - source_code = '' + source_bytes = f.buffer.read() + tree = self.parser.parse(source_bytes) + source_code = source_bytes.decode('utf-8', errors='replace') + + # Create file entity + file = File(path, tree) + graph.add_file(file)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/c/analyzer.py` around lines 394 - 405, The code creates a File object before parsing and incorrectly calls str.decode(), which always raises for io.TextIOWrapper reads and clears source_code; fix by first reading and parsing the source (use self.parser.parse on the string returned by f.read()), then construct and add the File entity (File(...)) afterwards using the original path and parsed tree as needed, and remove the decode() call — handle bytes only if f is opened in binary mode (decode then), otherwise treat f.read() as str and pass it directly to parser.parse and downstream logic (refer to File, graph.add_file, self.parser.parse, and source_code variables).
3-5:⚠️ Potential issue | 🟠 MajorReplace the wildcard imports.
This file now trips Ruff
F403/F405, and the*imports hide the actual symbols the analyzer depends on.♻️ Proposed fix
-from ..utils import * +from ..utils import find_child_of_type from pathlib import Path -from ...entities import * +from ...entities import File, Function, Struct from ...graph import Graph🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/c/analyzer.py` around lines 3 - 5, Replace the wildcard imports in analyzer.py: remove "from ..utils import *" and "from ...entities import *" and instead import only the specific functions/classes the module uses (inspect usages of symbols in analyzer.py such as any utility helpers or entity classes referenced) and list them explicitly (e.g., from ..utils import func_a, helper_b; from ...entities import EntityX, EntityY). Ensure any renamed or aliased imports match existing references in functions like the analyzer class or helper functions, and update any __all__ or re-exports if present.
🧹 Nitpick comments (1)
tests/test_c_analyzer.py (1)
63-70: Please add the new coverage in pytest style.This extends a
unittest.TestCasesuite, but backend tests undertests/are expected to use pytest.As per coding guidelines, "Backend tests should use pytest and be organized in tests/ directory".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_c_analyzer.py` around lines 63 - 70, Convert the unittest-style assertions in tests/test_c_analyzer.py into a pytest-style test function: create a standalone function (e.g., test_include_directive_edge_creation) that uses the existing fixture/object g, replace self.assertIsNotNone(included_file) with assert included_file is not None, replace self.assertEqual(len(includes), 3) with assert len(includes) == 3, and replace self.assertIn('myheader.h', included_files) with assert 'myheader.h' in included_files; remove any unittest.TestCase class wrapper and self usage so the test runs under pytest.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/c/analyzer.py`:
- Around line 341-357: The code creates a synthetic File node from the raw
include text (using os.path.dirname(normalized_path)) instead of resolving the
include relative to the including file; change the logic to first resolve the
include against the including file's directory (e.g. resolved =
Path(parent_of_including_file) / normalized_path), then check the graph for an
existing File entity for that resolved path and reuse it (lookup via whatever
graph method exists, e.g. graph.get_file_by_path/find_file), and only create a
new File and call graph.add_file(File(...)) if no real entity is found; update
references to use included_file = existing_or_new_file so INCLUDES point to the
actual analyzed header rather than a top-level synthetic node.
- Around line 408-419: first_pass() and second_pass() incorrectly call
query.captures() on a Query (which doesn't exist) and treat the result as a
tuple; replace those calls with the existing helper or a QueryCursor: use
self._captures(tree, query) (the inherited helper that returns a dict keyed by
capture name) when invoking C_LANGUAGE.query("(function_definition) `@function`")
and iterate captures['function'] to call process_function_definition(file, node,
path, graph, source_code); in second_pass() also remove any indexing like [0] on
capture results and instead access the dict entries returned by _captures() or
use a QueryCursor created from the Query and call
cursor.captures(tree.root_node) before processing nodes.
In `@tests/test_c_analyzer.py`:
- Line 24: The test calls analyzer.analyze(path, g) but SourceAnalyzer does not
implement analyze(); either add a public analyze(self, path, g) wrapper on
SourceAnalyzer that delegates to the appropriate existing method (for example
call analyze_local_folder(path, g) or analyze_sources(path, g) based on input)
or update the test to call an existing method such as analyze_local_folder(path,
g); additionally ensure the analyzers dispatch map enables CAnalyzer by
uncommenting the ".c" and ".h" entries in the analyzers dict so C files are
actually handled when the wrapper or chosen method delegates to CAnalyzer.
---
Duplicate comments:
In `@api/analyzers/c/analyzer.py`:
- Around line 19-20: The C analyzer's __init__ is not calling the base class
initializer, so shared setup (like self.language) is skipped; update the C
analyzer __init__ to call AbstractAnalyzer.__init__ (or super().__init__())
before setting self.parser = Parser(C_LANGUAGE), or explicitly set self.language
= C_LANGUAGE if the base init requires parameters—ensure you invoke the base
initializer (super().__init__()) in the __init__ method where Parser(C_LANGUAGE)
is currently assigned.
- Around line 394-405: The code creates a File object before parsing and
incorrectly calls str.decode(), which always raises for io.TextIOWrapper reads
and clears source_code; fix by first reading and parsing the source (use
self.parser.parse on the string returned by f.read()), then construct and add
the File entity (File(...)) afterwards using the original path and parsed tree
as needed, and remove the decode() call — handle bytes only if f is opened in
binary mode (decode then), otherwise treat f.read() as str and pass it directly
to parser.parse and downstream logic (refer to File, graph.add_file,
self.parser.parse, and source_code variables).
- Around line 3-5: Replace the wildcard imports in analyzer.py: remove "from
..utils import *" and "from ...entities import *" and instead import only the
specific functions/classes the module uses (inspect usages of symbols in
analyzer.py such as any utility helpers or entity classes referenced) and list
them explicitly (e.g., from ..utils import func_a, helper_b; from ...entities
import EntityX, EntityY). Ensure any renamed or aliased imports match existing
references in functions like the analyzer class or helper functions, and update
any __all__ or re-exports if present.
---
Nitpick comments:
In `@tests/test_c_analyzer.py`:
- Around line 63-70: Convert the unittest-style assertions in
tests/test_c_analyzer.py into a pytest-style test function: create a standalone
function (e.g., test_include_directive_edge_creation) that uses the existing
fixture/object g, replace self.assertIsNotNone(included_file) with assert
included_file is not None, replace self.assertEqual(len(includes), 3) with
assert len(includes) == 3, and replace self.assertIn('myheader.h',
included_files) with assert 'myheader.h' in included_files; remove any
unittest.TestCase class wrapper and self usage so the test runs under pytest.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: db7fbc7c-059d-4357-be05-71bacb011725
📒 Files selected for processing (2)
api/analyzers/c/analyzer.pytests/test_c_analyzer.py
996c299 to
b8cdeb2
Compare
gkorland
left a comment
There was a problem hiding this comment.
This PR has been substantially rewritten to address all 17 review comments. The original implementation bypassed the AbstractAnalyzer interface entirely (used non-existent Function/Struct types, missing Graph.add_function/add_struct methods, wrong File constructor, text/bytes confusion, etc.).
What changed:
-
Complete rewrite of
CAnalyzer: Now properly extendsAbstractAnalyzerwithsuper().__init__()and implements all required abstract methods (get_entity_label,get_entity_name,get_entity_docstring,get_entity_types,add_symbols,is_dependency,resolve_path,resolve_type,resolve_method,resolve_symbol,add_dependencies). -
Include directive support: Added
get_include_paths()helper method for extracting#includepaths from parsed C files using tree-sitter queries viaself._captures(). -
source_analyzer.pyintegration: Uncommented the C analyzer import, added.c/.hto the analyzers dict, addedNullLanguageServerfor C/H extensions in the second pass, and added.c/.hto the file globbing inanalyze_sources(). -
Test rewrite: Rewrote
test_c_analyzer.pyas standalone unit tests that don't require FalkorDB. Tests cover entity extraction (functions, structs), labels, call symbols, include extraction, andis_dependency. Added#includedirectives to the test fixturesrc.c. -
Removed dead code: Removed all the old custom
process_*methods, theutilsimport, andFunction/Structtype usage.
All 8 tests pass, lint is clean.
Migrated from FalkorDB/code-graph-backend PR #57. Original issue: FalkorDB/code-graph-backend#46 Resolves #544 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous implementation bypassed the AbstractAnalyzer interface entirely, using non-existent types (Function, Struct), missing methods (graph.add_function, graph.add_struct), wrong File constructor args, and text/bytes confusion. This rewrite: - Properly extends AbstractAnalyzer with super().__init__() - Implements all required abstract methods (get_entity_label, get_entity_name, get_entity_docstring, get_entity_types, add_symbols, etc.) - Uses self._captures() for correct QueryCursor-based tree-sitter queries - Adds get_include_paths() helper for #include directive extraction - Enables C/H support in source_analyzer.py (uncommented import, added to analyzers dict, LSP setup, and file globbing) - Rewrites test_c_analyzer.py as standalone unit tests (no DB required) - Adds #include directives to test fixture (src.c) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
b8cdeb2 to
f4bbb1f
Compare
Migrated from falkordb/code-graph-backend#57
Summary
Add support for processing
include_directivein C files, creating edges between files.Changes:
process_include_directivemethod to C analyzerResolves #544
Originally authored by @gkorland in falkordb/code-graph-backend#57
Summary by CodeRabbit
New Features
Tests