Skip to content

Conversation

@2bndy5
Copy link
Collaborator

@2bndy5 2bndy5 commented Dec 7, 2025

addresses #192

This coverts functions that run and capture clang tools' suggestions from blocking to async API.

Other changes

  • mark env changes in tests as unsafe (satisfies newer lint rules)
  • properly parse xml with no replacements
  • update uv.lock file

Summary by CodeRabbit

  • Refactor

    • Core clang tooling converted to asynchronous execution for improved concurrency and responsiveness.
  • Dependencies

    • Runtime features expanded to enable async process and filesystem handling.
  • User-facing Behavior

    • Logs and diagnostics now reference file names more consistently and include richer context.
  • Testing & Reliability

    • Tests and test harness updated for async flows; improved error messages, logging, and empty-result handling.

✏️ Tip: You can customize this high-level summary in your review settings.

addresses #192

This coverts functions that
run and capture clang tools' suggestions
from blocking to async API.
satisfies newer lint rules
@2bndy5 2bndy5 added the enhancement New feature or request label Dec 7, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 7, 2025

Walkthrough

The PR async-ifies clang tool execution and orchestration (run_clang_format, run_clang_tidy, capture_version, analyze_single_file, capture_clang_tools_output), switches command execution to tokio::process, changes file access to Arc<Mutex<FileObj>>, updates tests to async/await, scopes env var mutations in unsafe blocks, and adds a tokio process feature in Cargo.toml.

Changes

Cohort / File(s) Summary
Tokio dependency update
cpp-linter/Cargo.toml
Added tokio feature "process" to the existing ["macros", "rt-multi-thread"] feature set.
Async clang-format refactor
cpp-linter/src/clang_tools/clang_format.rs
Converted run_clang_format to pub async fn accepting &Arc<Mutex<FileObj>>, switched to tokio::process::Command, precompute args and file name before awaiting, use FormatAdvice::default(), add serde default/rename on replacements, update logging, and add parse_xml_no_replacements test.
Async clang-tidy refactor
cpp-linter/src/clang_tools/clang_tidy.rs
Converted run_clang_tidy to pub async fn accepting &Arc<Mutex<FileObj>>, switched to tokio::process::Command, build args prior to awaits, adjust patch caching/restoration to async pattern, and convert tests to async.
Clang tooling orchestration
cpp-linter/src/clang_tools/mod.rs
Made capture_version and analyze_single_file async, changed capture_clang_tools_output to accept &[Arc<Mutex<FileObj>>], await tool runners, capture file_name from the locked FileObj before awaits, and derive Default for ClangVersions.
Run/test harness updates
cpp-linter/src/run.rs
Updated call sites to new capture_clang_tools_output signature, adapted tests to use .await.unwrap(), and moved some env var manipulations into unsafe blocks.
Test safety scoping & review tweaks
cpp-linter/tests/*.rs
cpp-linter/tests/comments.rs, cpp-linter/tests/paginated_changed_files.rs, cpp-linter/tests/reviews.rs
Wrapped multiple env::set_var/env::remove_var calls inside unsafe { ... }; in reviews.rs split tool_ignore into format_ignore and tidy_ignore and adjusted CLI arg construction accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Verify no MutexGuard is held across .await points (clang_format.rs, clang_tidy.rs, mod.rs).
  • Confirm command/argument construction is semantically unchanged after precomputing.
  • Validate serde changes for FormatAdvice.replacements and FormatAdvice::default() behavior.
  • Check all call sites updated for async signatures and Arc<Mutex<FileObj>> usage.
  • Inspect tests for correct async conversion and the introduced unsafe blocks around env mutations.

Possibly related issues

Possibly related PRs

Suggested labels

dependencies

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: converting to async APIs using tokio::Command for capturing clang tool suggestions.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch tokio-command

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cpp-linter/src/clang_tools/mod.rs (1)

245-260: Consider simplifying the spawn pattern to avoid the double-await.

The current code spawns a future that returns another future, requiring output?.await?. This works but could be simplified.

Apply this diff to call .await inside the spawned task:

     for file in files {
         let arc_params = Arc::new(clang_params.clone());
         let arc_file = Arc::clone(file);
-        executors.spawn(async move { analyze_single_file(arc_file, arc_params) });
+        executors.spawn(async move { analyze_single_file(arc_file, arc_params).await });
     }
 
     while let Some(output) = executors.join_next().await {
-        let (file_name, logs) = output?.await?;
+        let (file_name, logs) = output??;

Alternatively, since analyze_single_file is already an async fn, you could spawn it directly:

executors.spawn(analyze_single_file(arc_file, arc_params));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6523512 and 6e8ba1c.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • cpp-linter/Cargo.toml (1 hunks)
  • cpp-linter/src/clang_tools/clang_format.rs (6 hunks)
  • cpp-linter/src/clang_tools/clang_tidy.rs (6 hunks)
  • cpp-linter/src/clang_tools/mod.rs (4 hunks)
  • cpp-linter/src/run.rs (2 hunks)
  • cpp-linter/tests/comments.rs (1 hunks)
  • cpp-linter/tests/paginated_changed_files.rs (2 hunks)
  • cpp-linter/tests/reviews.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-04T06:50:10.870Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 208
File: cpp-linter/src/clang_tools/mod.rs:60-115
Timestamp: 2025-11-04T06:50:10.870Z
Learning: In the cpp-linter-rs project, path validation (such as checking whether a path is a file or directory) should be performed in CLI parsing (cpp-linter/src/cli/structs.rs) rather than in the tool lookup logic (cpp-linter/src/clang_tools/mod.rs). This maintains proper separation of concerns.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/tests/paginated_changed_files.rs
  • cpp-linter/src/run.rs
  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
  • cpp-linter/src/clang_tools/clang_format.rs
📚 Learning: 2024-10-02T07:55:08.948Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 52
File: cpp-linter/src/clang_tools/mod.rs:146-146
Timestamp: 2024-10-02T07:55:08.948Z
Learning: In the `cpp-linter` project, the function `capture_clang_tools_output` is only called in `cpp-linter/src/run.rs`, and there are no other calls to this function elsewhere in the codebase.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/tests/paginated_changed_files.rs
  • cpp-linter/src/run.rs
  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T06:20:11.698Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/rest_api/mod.rs:253-254
Timestamp: 2024-11-23T06:20:11.698Z
Learning: In `cpp-linter/src/rest_api/mod.rs`, it is safe to use `unwrap()` on `clang_versions.tidy_version` and `clang_versions.format_version` because they are guaranteed to be `Some` at that point in the code.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/tests/paginated_changed_files.rs
  • cpp-linter/src/run.rs
  • cpp-linter/tests/comments.rs
  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T14:10:31.760Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/clang_tools/mod.rs:149-159
Timestamp: 2024-11-23T14:10:31.760Z
Learning: Clang tools' `--version` output is consistent across all released versions, so adding extra error handling or unit tests for version extraction may be unnecessary. Tests are run after the clang tools are installed.

Applied to files:

  • cpp-linter/src/run.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2025-01-21T09:56:32.771Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 101
File: cpp-linter/src/clang_tools/clang_format.rs:155-161
Timestamp: 2025-01-21T09:56:32.771Z
Learning: In cpp-linter-rs, the XML output being parsed is generated programmatically by clang-format tool. The only failure case for XML parsing is when clang-format produces a blank XML document, in which case falling back to empty results (using unwrap_or) is the desired behavior.

Applied to files:

  • cpp-linter/src/run.rs
  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
  • cpp-linter/src/clang_tools/clang_format.rs
🧬 Code graph analysis (5)
cpp-linter/tests/reviews.rs (1)
cpp-linter/tests/common/mod.rs (1)
  • mock_server (82-84)
cpp-linter/tests/paginated_changed_files.rs (1)
cpp-linter/src/rest_api/github/specific_api.rs (1)
  • new (33-80)
cpp-linter/tests/comments.rs (2)
cpp-linter/src/main.rs (1)
  • env (11-11)
cpp-linter/tests/common/mod.rs (1)
  • mock_server (82-84)
cpp-linter/src/clang_tools/clang_tidy.rs (2)
cpp-linter/src/common_fs/file_filter.rs (1)
  • new (17-24)
cpp-linter/src/common_fs/mod.rs (1)
  • new (44-53)
cpp-linter/src/clang_tools/clang_format.rs (1)
cpp-linter/src/common_fs/mod.rs (1)
  • new (44-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: build-rustdoc
  • GitHub Check: x86_64-apple-darwin
  • GitHub Check: aarch64-apple-darwin
  • GitHub Check: x86_64-unknown-linux-musl
  • GitHub Check: x86_64-unknown-linux-gnu
  • GitHub Check: arm-unknown-linux-gnueabi
  • GitHub Check: test (windows-latest)
  • GitHub Check: aarch64-unknown-linux-musl
  • GitHub Check: Build aarch64-unknown-linux-musl
  • GitHub Check: Build armv7-unknown-linux-gnueabihf
  • GitHub Check: Build aarch64-apple-darwin
  • GitHub Check: Build aarch64-pc-windows-msvc
  • GitHub Check: Build x86_64-unknown-linux-gnu
  • GitHub Check: Build aarch64-unknown-linux-gnu
  • GitHub Check: Build x86_64-unknown-linux-musl
  • GitHub Check: Build x86_64-apple-darwin
  • GitHub Check: Build x86_64-pc-windows-msvc
  • GitHub Check: Build FreeBSD
  • GitHub Check: build-py-binding
  • GitHub Check: Build current binary
🔇 Additional comments (36)
cpp-linter/Cargo.toml (1)

31-31: LGTM!

The addition of process and fs features to tokio is appropriate for the async refactor. These enable tokio::process::Command and tokio::fs operations used in clang_tidy.rs and clang_format.rs.

cpp-linter/tests/paginated_changed_files.rs (3)

40-53: LGTM!

The unsafe blocks for environment variable mutations align with newer Rust lint rules that flag env::set_var as potentially unsafe due to thread safety concerns in multi-threaded test environments. This is a consistent pattern applied across the test suite.


57-66: LGTM!

Consistent with the pattern for wrapping env::set_var in unsafe blocks.


80-82: LGTM!

Consistent unsafe wrapping for environment variable setup.

cpp-linter/tests/reviews.rs (3)

75-85: LGTM!

Consistent unsafe block usage for environment variable mutations in test setup.


90-92: LGTM!

Consistent pattern for GITHUB_EVENT_PATH setup.


98-100: LGTM!

Consistent pattern for GITHUB_API_URL setup.

cpp-linter/tests/comments.rs (3)

65-75: LGTM!

Consistent unsafe block usage for environment variable setup in test code.


81-83: LGTM!

Consistent pattern for conditional GITHUB_EVENT_PATH setup.


90-92: LGTM!

Consistent pattern for GITHUB_API_URL setup.

cpp-linter/src/run.rs (7)

165-177: LGTM!

The unsafe block for env::remove_var and the switch to .unwrap() for successful test cases is appropriate for test code.


182-187: LGTM!

Consistent pattern applied to version_command test.


192-203: LGTM!

Consistent pattern applied to force_debug_output test.


208-218: LGTM!

Consistent pattern applied to no_version_input test.


222-234: LGTM!

The pre_commit_env test correctly places both env::remove_var and env::set_var inside the unsafe block, and appropriately uses assert!(result.is_err()) since this test expects failure.


240-253: LGTM!

Consistent pattern applied to no_analysis test.


257-267: LGTM!

The bad_repo_root test correctly uses assert!(result.is_err()) for expected failure scenario.

cpp-linter/src/clang_tools/clang_tidy.rs (5)

251-286: LGTM!

The async signature change and the pattern of extracting file_name and building args while holding the lock briefly, then releasing it before async I/O operations, is the correct approach. This avoids holding MutexGuard across .await points which would prevent Send requirements.


287-300: LGTM!

Using tokio::fs::read_to_string for async file reading is appropriate for the async context.


301-317: LGTM!

The command construction and async execution using tokio::process::Command with .output().await is correctly implemented.


334-352: LGTM!

The pattern of reading patched content, restoring original content using async I/O, and then acquiring the lock again to update tidy_advice is well-structured. The lock is held only for the brief mutation of file.tidy_advice.


435-483: LGTM!

Test properly updated to async with Arc<Mutex<FileObj>> pattern matching the new function signature.

cpp-linter/src/clang_tools/mod.rs (4)

138-148: LGTM!

The pattern of acquiring the lock briefly to extract file_name, then releasing it before async operations, correctly avoids holding MutexGuard across await points.


149-187: LGTM!

Using the extracted file_name for filter checks and logging avoids repeated lock acquisitions during the async operations.


191-191: LGTM!

Deriving Default for ClangVersions is a clean way to initialize the struct with None values.


204-205: LGTM!

Taking an immutable reference &Vec<Arc<Mutex<FileObj>>> is appropriate since the function only needs to clone Arcs for spawning tasks.

cpp-linter/src/clang_tools/clang_format.rs (10)

18-21: LGTM! Default handling correctly addresses XML parsing edge cases.

The Default derive and #[serde(default)] attribute work together to handle two scenarios:

  • When clang-format produces empty stdout, FormatAdvice::default() is used (line 154)
  • When XML has no <replacement> elements, the default attribute provides an empty Vec instead of failing deserialization

This aligns with the learning that valid XML with no replacements should parse successfully.


81-84: LGTM! Function signature correctly adapted for async execution.

The signature change from &mut FileObj to &Arc<Mutex<FileObj>> enables thread-safe shared access required by the async runtime, and the async fn return type supports non-blocking command execution.


87-99: LGTM! Excellent mutex usage pattern.

The code locks the mutex only to extract necessary data (file_name, ranges), then releases it before awaiting I/O operations. This minimizes lock contention and prevents holding the lock across async suspension points.


100-121: LGTM! Async command execution properly implemented.

The optional format fixes retrieval correctly uses tokio::process::Command with .await, includes clear logging, and provides helpful error context.


122-136: LGTM! Second command execution correctly structured.

The code properly constructs a separate command instance with modified arguments to retrieve XML replacements, and includes appropriate error context.


146-156: LGTM! Robust default handling for empty output.

The explicit check for empty stdout before parsing avoids errors when clang-format produces no output, falling back to FormatAdvice::default() as intended. Based on learnings, this correctly handles the case where clang-format produces blank output.


158-162: LGTM! Async file reading correctly implemented.

The tokio::fs::read call properly awaits file I/O, and the file path (derived from file.name.to_string_lossy()) is consistent with what's passed to clang-format commands.


181-186: LGTM! Second mutex lock properly scoped.

The function locks the mutex a second time only to write the final result, after all async I/O has completed. The two lock scopes are separate and non-overlapping, avoiding deadlocks while maintaining thread safety.


201-215: LGTM! Test validates the serde default behavior.

This test confirms that XML with a valid structure but no <replacement> elements parses successfully, which is essential for the #[serde(default)] attribute added on line 21. This complements the existing parse_blank_xml test and addresses the PR objective of properly parsing XML with no replacements.


4-9: Imports are correctly configured for async conversion.

The imports properly support the async refactoring: Arc<Mutex<>> for thread-safe sharing, anyhow for error context, and tokio's fs and process::Command for async I/O operations. The required tokio features (process and fs) are already enabled in cpp-linter/Cargo.toml.

@2bndy5 2bndy5 changed the title Tokio command feat: use tokio::Command to capture suggestions Dec 7, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
cpp-linter/src/clang_tools/mod.rs (3)

138-188: Async per-file analysis looks sound; minor cleanups possible

The async refactor here looks correct: the mutex is only held while copying FileObj.name, so no lock is held across await, and poisoned locks are handled explicitly.

Two small polish suggestions:

  • You can avoid the extra allocation at the end by moving file_name instead of cloning it:
-    Ok((file_name.clone(), logs))
+    Ok((file_name, logs))
  • The filter checks can be simplified to avoid the duplicated is_some_and(..) || is_none() pattern, e.g.:
-        if clang_params
-            .format_filter
-            .as_ref()
-            .is_some_and(|f| f.is_source_or_ignored(file_name.as_path()))
-            || clang_params.format_filter.is_none()
-        {
+        if clang_params
+            .format_filter
+            .as_ref()
+            .map_or(true, |f| f.is_source_or_ignored(file_name.as_path()))
+        {

and similarly for tidy_filter. Behavior is unchanged; it’s just more idiomatic.


191-198: Consider also deriving Debug for ClangVersions

Now that ClangVersions derives Default, you might also want Debug for easier logging and diagnostics, especially around tool version discovery:

-#[derive(Default)]
+#[derive(Default, Debug)]
 pub struct ClangVersions {

If Debug was intentionally dropped (e.g., to reduce public API surface), feel free to ignore this.


18-18: JoinSet-based concurrency works, but reduce cloning and consider bounding parallelism

The async fan-out via JoinSet is a good fit here, but there are a few refinements worth considering:

  1. Avoid cloning ClangParams per file

Right now a full ClangParams clone is created for every file:

for file in files {
    let arc_params = Arc::new(clang_params.clone());
    let arc_file = Arc::clone(file);
    executors.spawn(analyze_single_file(arc_file, arc_params));
}

If clang_params contains a large database_json, this can get expensive. You can clone it once and just clone the Arc per task:

-    let mut executors = JoinSet::new();
-    // iterate over the discovered files and run the clang tools
-    for file in files {
-        let arc_params = Arc::new(clang_params.clone());
-        let arc_file = Arc::clone(file);
-        executors.spawn(analyze_single_file(arc_file, arc_params));
-    }
+    let shared_params = Arc::new(clang_params.clone());
+    let mut executors = JoinSet::new();
+    // iterate over the discovered files and run the clang tools
+    for file in files {
+        let arc_file = Arc::clone(file);
+        let arc_params = Arc::clone(&shared_params);
+        executors.spawn(analyze_single_file(arc_file, arc_params));
+    }
  1. Bound concurrency to avoid spawning too many clang jobs at once

JoinSet will happily run one task per file; on very large PRs this could translate into a large number of concurrent clang-tidy/clang-format processes and potentially thrash the machine. If that’s a concern, consider enforcing a max concurrency (e.g., with a tokio::sync::Semaphore or by using a for_each_concurrent-style pattern) so you only process N files in parallel.

  1. More generic parameter type for files

Since capture_clang_tools_output doesn’t mutate files, taking a slice instead of &Vec would make the API more flexible:

-pub async fn capture_clang_tools_output(
-    files: &Vec<Arc<Mutex<FileObj>>>,
+pub async fn capture_clang_tools_output(
+    files: &[Arc<Mutex<FileObj>>],

Given the earlier learning that this function is only called from run.rs, adjusting the call site should be straightforward. Based on learnings, this keeps the API cleaner while preserving behavior.

  1. Error handling behavior with output??

Using output?? means the first failing task (either join error or inner Result::Err) will cause the function to return early and drop the remaining tasks in the JoinSet. If that’s intended (fail-fast behavior), it’s fine. If you’d rather attempt to collect diagnostics for all files even when one fails, you’d need to accumulate per-task errors instead of bailing on the first one.

Also applies to: 205-262

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e8ba1c and 30a6bd6.

📒 Files selected for processing (2)
  • cpp-linter/src/clang_tools/mod.rs (4 hunks)
  • cpp-linter/src/run.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp-linter/src/run.rs
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-04T06:50:10.870Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 208
File: cpp-linter/src/clang_tools/mod.rs:60-115
Timestamp: 2025-11-04T06:50:10.870Z
Learning: In the cpp-linter-rs project, path validation (such as checking whether a path is a file or directory) should be performed in CLI parsing (cpp-linter/src/cli/structs.rs) rather than in the tool lookup logic (cpp-linter/src/clang_tools/mod.rs). This maintains proper separation of concerns.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-10-02T07:55:08.948Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 52
File: cpp-linter/src/clang_tools/mod.rs:146-146
Timestamp: 2024-10-02T07:55:08.948Z
Learning: In the `cpp-linter` project, the function `capture_clang_tools_output` is only called in `cpp-linter/src/run.rs`, and there are no other calls to this function elsewhere in the codebase.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T06:20:11.698Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/rest_api/mod.rs:253-254
Timestamp: 2024-11-23T06:20:11.698Z
Learning: In `cpp-linter/src/rest_api/mod.rs`, it is safe to use `unwrap()` on `clang_versions.tidy_version` and `clang_versions.format_version` because they are guaranteed to be `Some` at that point in the code.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2025-01-21T09:56:32.771Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 101
File: cpp-linter/src/clang_tools/clang_format.rs:155-161
Timestamp: 2025-01-21T09:56:32.771Z
Learning: In cpp-linter-rs, the XML output being parsed is generated programmatically by clang-format tool. The only failure case for XML parsing is when clang-format produces a blank XML document, in which case falling back to empty results (using unwrap_or) is the desired behavior.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T14:10:31.760Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/clang_tools/mod.rs:149-159
Timestamp: 2024-11-23T14:10:31.760Z
Learning: Clang tools' `--version` output is consistent across all released versions, so adding extra error handling or unit tests for version extraction may be unnecessary. Tests are run after the clang tools are installed.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: test (windows-latest)
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: arm-unknown-linux-gnueabi
  • GitHub Check: powerpc64-unknown-linux-gnu
  • GitHub Check: aarch64-pc-windows-msvc
  • GitHub Check: powerpc-unknown-linux-gnu
  • GitHub Check: armv7-unknown-linux-gnueabihf
  • GitHub Check: powerpc64le-unknown-linux-gnu
  • GitHub Check: x86_64-unknown-linux-musl
  • GitHub Check: x86_64-unknown-linux-gnu
  • GitHub Check: aarch64-unknown-linux-gnu
  • GitHub Check: aarch64-unknown-linux-musl
  • GitHub Check: Build x86_64-apple-darwin
  • GitHub Check: Build armv7-unknown-linux-gnueabihf
  • GitHub Check: Build aarch64-unknown-linux-musl
  • GitHub Check: Build aarch64-apple-darwin
  • GitHub Check: Build x86_64-unknown-linux-musl
  • GitHub Check: Build x86_64-pc-windows-msvc
  • GitHub Check: Build x86_64-unknown-linux-gnu
  • GitHub Check: Build FreeBSD

Apparently, there was a bug previously that somehow filtered out demo.hpp for some PR review tests.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
cpp-linter/src/clang_tools/mod.rs (1)

185-185: Remove redundant clone

Since file_name is already owned and not used after this return statement, the .clone() is unnecessary.

Apply this diff:

-    Ok((file_name.clone(), logs))
+    Ok((file_name, logs))
cpp-linter/tests/reviews.rs (2)

75-85: Env mutations marked unsafe are consistent with new lint expectations but still globally shared

Wrapping the env::set_var / env::remove_var calls in a single unsafe block makes the global state side‑effects explicit and keeps behavior identical to before. Given these tests rely heavily on process‑wide env, this is reasonable.

One thing to keep in mind: if the test suite is ever run with multiple test threads, these shared env mutations can still race with other tests that read/write the same vars. If you hit intermittent issues in the future, consider either:

  • serializing tests that touch env (e.g., via a serial test attribute or a single integration test driver), or
  • introducing a small helper that snapshots and restores relevant env vars around setup().

For now, the changes here look good and align with the pattern used in the other test modules.

Also applies to: 90-92, 98-100


216-221: Verify the intentional divergence between tidy_ignore and format_ignore under force_lgtm

Previously both tools shared a single ignore pattern string; now, when force_lgtm is true, the sets differ:

  • format_ignore: **/*.c|**/*.cpp|**/*.h
  • tidy_ignore: **/*.c|**/*.hpp

That means, for example, .cpp files are no longer ignored for tidy, and .hpp files are no longer ignored for format, which is a behavioral change from a single shared ignore list.

If this asymmetry is intentional to match how the test assets are laid out per tool, it would be good to add a short comment here explaining why the ignore sets differ to avoid future “cleanup” regressions.

If instead the goal was to preserve the previous semantics (same ignore patterns for both tools, just passed to two different flags), you may want to push the same extensions to both format_ignore and tidy_ignore.

Also applies to: 227-228

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30a6bd6 and b9b4ec8.

📒 Files selected for processing (2)
  • cpp-linter/src/clang_tools/mod.rs (9 hunks)
  • cpp-linter/tests/reviews.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-04T06:50:10.870Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 208
File: cpp-linter/src/clang_tools/mod.rs:60-115
Timestamp: 2025-11-04T06:50:10.870Z
Learning: In the cpp-linter-rs project, path validation (such as checking whether a path is a file or directory) should be performed in CLI parsing (cpp-linter/src/cli/structs.rs) rather than in the tool lookup logic (cpp-linter/src/clang_tools/mod.rs). This maintains proper separation of concerns.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-10-02T07:55:08.948Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 52
File: cpp-linter/src/clang_tools/mod.rs:146-146
Timestamp: 2024-10-02T07:55:08.948Z
Learning: In the `cpp-linter` project, the function `capture_clang_tools_output` is only called in `cpp-linter/src/run.rs`, and there are no other calls to this function elsewhere in the codebase.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2025-01-21T09:56:32.771Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 101
File: cpp-linter/src/clang_tools/clang_format.rs:155-161
Timestamp: 2025-01-21T09:56:32.771Z
Learning: In cpp-linter-rs, the XML output being parsed is generated programmatically by clang-format tool. The only failure case for XML parsing is when clang-format produces a blank XML document, in which case falling back to empty results (using unwrap_or) is the desired behavior.

Applied to files:

  • cpp-linter/tests/reviews.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T06:20:11.698Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/rest_api/mod.rs:253-254
Timestamp: 2024-11-23T06:20:11.698Z
Learning: In `cpp-linter/src/rest_api/mod.rs`, it is safe to use `unwrap()` on `clang_versions.tidy_version` and `clang_versions.format_version` because they are guaranteed to be `Some` at that point in the code.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T14:10:31.760Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/clang_tools/mod.rs:149-159
Timestamp: 2024-11-23T14:10:31.760Z
Learning: Clang tools' `--version` output is consistent across all released versions, so adding extra error handling or unit tests for version extraction may be unnecessary. Tests are run after the clang tools are installed.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
🧬 Code graph analysis (1)
cpp-linter/tests/reviews.rs (1)
cpp-linter/tests/common/mod.rs (1)
  • mock_server (82-84)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: build-mkdocs
  • GitHub Check: Build x86_64-pc-windows-msvc
  • GitHub Check: Build FreeBSD
  • GitHub Check: x86_64-pc-windows-msvc
  • GitHub Check: x86_64-apple-darwin
  • GitHub Check: x86_64-unknown-linux-gnu
  • GitHub Check: powerpc-unknown-linux-gnu
  • GitHub Check: s390x-unknown-linux-gnu
  • GitHub Check: aarch64-pc-windows-msvc
  • GitHub Check: arm-unknown-linux-gnueabi
  • GitHub Check: arm-unknown-linux-gnueabihf
  • GitHub Check: aarch64-apple-darwin
  • GitHub Check: aarch64-unknown-linux-gnu
  • GitHub Check: aarch64-unknown-linux-musl
  • GitHub Check: powerpc64le-unknown-linux-gnu
  • GitHub Check: powerpc64-unknown-linux-gnu
  • GitHub Check: armv7-unknown-linux-gnueabihf
  • GitHub Check: x86_64-unknown-linux-musl
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test (windows-latest)
🔇 Additional comments (11)
cpp-linter/src/clang_tools/mod.rs (11)

16-16: LGTM: Tokio imports for async operations

The added imports (fs, process::Command, task::JoinSet) are appropriate for the async conversion of clang tool execution and file I/O.


116-125: LGTM: Async version capture

The conversion to async using tokio::Command is correct. The function properly awaits the command output and error handling remains intact.


141-146: LGTM: Safe file name extraction pattern

The mutex locking pattern is correct: lock the FileObj, clone the name, and immediately drop the lock. This avoids holding the lock during expensive async operations.


154-154: LGTM: Async clang-format execution

The function properly awaits the async run_clang_format call.


173-173: LGTM: Async clang-tidy execution

The function properly awaits the async run_clang_tidy call.


213-213: LGTM: Awaiting version captures

Both capture_version calls are properly awaited and error handling is preserved.

Also applies to: 223-223


234-234: LGTM: Async file read

The conversion to tokio::fs::read with .await is correct for async I/O operations.


248-248: LGTM: Async task spawning

The task spawning pattern correctly uses JoinSet to manage concurrent file analysis, with proper Arc cloning for shared ownership.


251-258: LGTM: Task result processing

The task joining and result processing is correct. The output?? pattern on line 252 properly handles both the JoinHandle result and the task's Result type. The log grouping per file is well-structured.


202-207: Call site in run.rs correctly updated for signature change

The signature change from &mut Vec<FileObj> to &Vec<Arc<Mutex<FileObj>>> is properly reflected at the call site in cpp-linter/src/run.rs:140-145. The arc_files variable (constructed from Arc::new(Mutex::new(file)) at line 133) is passed as &arc_files, providing the correct type match for the new signature.


189-189: Debug derive removal is safe; no code relies on it.

Verification confirms that ClangVersions is not used with the Debug trait anywhere in the codebase. All usages involve field access (format_version, tidy_version) or Default trait initialization. The removal does not break existing code.

@codecov
Copy link

codecov bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 99.35897% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.84%. Comparing base (6523512) to head (ff739ad).

Files with missing lines Patch % Lines
cpp-linter/src/clang_tools/clang_format.rs 97.95% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #224      +/-   ##
==========================================
+ Coverage   96.53%   96.84%   +0.30%     
==========================================
  Files          14       14              
  Lines        2999     3040      +41     
==========================================
+ Hits         2895     2944      +49     
+ Misses        104       96       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

there was a noticeable performance increase in the benchmarks

Under the hood `tokio::fs` uses more overhead than `std::fs`.
And, the usage here are in synchronous contexts anyway.
ranges is empty if lines_changed_only is Off
@2bndy5 2bndy5 linked an issue Dec 10, 2025 that may be closed by this pull request
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
cpp-linter/src/clang_tools/clang_tidy.rs (2)

235-249: Arc<Mutex> adaptation in tally_tidy_advice is fine; consider non-panicking locks

Using &[Arc<Mutex<FileObj>>] aligns this helper with the rest of the async pipeline. The only nit is file.lock().unwrap(), which can still panic on a poisoned mutex; you might eventually want to mirror the error-mapping pattern you use in run_clang_tidy to keep this path infallible and report an error instead of panicking.


251-351: Async run_clang_tidy looks correct; a couple of optional refinements

  • Good: you only hold the Mutex<FileObj> long enough to compute file_name, ranges, and --line-filter, and the lock is released before any .await, avoiding blocking the async scheduler on a sync mutex.
  • Good: the tidy_review path cleanly caches the patched content and restores the original file contents afterward, and the advice is written back under a short lock scope at the end.
  • Optional: fs::read_to_string, fs::read, and fs::write are still blocking; if this ever becomes a throughput bottleneck on large repos, consider migrating these specific calls to tokio::fs to avoid tying up worker threads.
  • Optional: args.extend(["--extra-arg".to_string(), format!("\"{}\"", arg)]); means the quotes are part of the actual argument passed to clang-tidy, not just for display. If the intent is only to show shell-style quoting in the logs, you could keep the raw arg in the Command and apply quoting only when building the "Running ..." log string.
cpp-linter/src/clang_tools/clang_format.rs (2)

69-81: tally_format_advice Arc<Mutex> change is fine; lock unwrap is a minor nit

Switching to &[Arc<Mutex<FileObj>>] keeps tallying aligned with the new shared-ownership model. As with the tidy tally, the use of file.lock().unwrap() can panic on a poisoned mutex; if you ever want this path to be fully robust, you could propagate or log the lock error instead of panicking.


83-191: Async run_clang_format is structured well; consider a couple of small cleanups

  • Good: you gather file_name, style, and line ranges under a short-lived lock, then drop the lock before any .await, so no sync Mutex is held across async boundaries.
  • Good: the format_review path obtains a “patched” buffer in one clang-format invocation, while a second invocation with --output-replacements-xml is used for structured replacement data, both via tokio::process::Command.
  • Good: when stdout is empty you fall back to FormatAdvice::default(), and otherwise you surface XML/UTF‑8 issues via with_context, which matches the expectation that tool-generated XML should generally parse cleanly.
  • Optional: the doc comment above this function still says “Run clang-tidy...”; you might want to update it to “Run clang-format...” for clarity.
  • Optional: as in run_clang_tidy, the fs::read used when translating offsets to line numbers is blocking; if format review becomes a hot path, you could eventually switch this to tokio::fs::read.
cpp-linter/src/clang_tools/mod.rs (1)

116-126: Async capture_version is straightforward; consider adding context on failure

Using tokio::process::Command to run <tool> --version and a regex to extract the version keeps behavior consistent with the prior sync implementation. If you want slightly better diagnosability, you could wrap the .output().await? call in with_context to surface which tool path failed when the process execution itself errors.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12a5070 and ff739ad.

📒 Files selected for processing (4)
  • cpp-linter/Cargo.toml (1 hunks)
  • cpp-linter/src/clang_tools/clang_format.rs (6 hunks)
  • cpp-linter/src/clang_tools/clang_tidy.rs (7 hunks)
  • cpp-linter/src/clang_tools/mod.rs (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp-linter/Cargo.toml
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-04T06:50:10.870Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 208
File: cpp-linter/src/clang_tools/mod.rs:60-115
Timestamp: 2025-11-04T06:50:10.870Z
Learning: In the cpp-linter-rs project, path validation (such as checking whether a path is a file or directory) should be performed in CLI parsing (cpp-linter/src/cli/structs.rs) rather than in the tool lookup logic (cpp-linter/src/clang_tools/mod.rs). This maintains proper separation of concerns.

Applied to files:

  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/clang_format.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-10-02T07:55:08.948Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 52
File: cpp-linter/src/clang_tools/mod.rs:146-146
Timestamp: 2024-10-02T07:55:08.948Z
Learning: In the `cpp-linter` project, the function `capture_clang_tools_output` is only called in `cpp-linter/src/run.rs`, and there are no other calls to this function elsewhere in the codebase.

Applied to files:

  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T06:20:11.698Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/rest_api/mod.rs:253-254
Timestamp: 2024-11-23T06:20:11.698Z
Learning: In `cpp-linter/src/rest_api/mod.rs`, it is safe to use `unwrap()` on `clang_versions.tidy_version` and `clang_versions.format_version` because they are guaranteed to be `Some` at that point in the code.

Applied to files:

  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2025-01-21T09:56:32.771Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 101
File: cpp-linter/src/clang_tools/clang_format.rs:155-161
Timestamp: 2025-01-21T09:56:32.771Z
Learning: In cpp-linter-rs, the XML output being parsed is generated programmatically by clang-format tool. The only failure case for XML parsing is when clang-format produces a blank XML document, in which case falling back to empty results (using unwrap_or) is the desired behavior.

Applied to files:

  • cpp-linter/src/clang_tools/clang_tidy.rs
  • cpp-linter/src/clang_tools/clang_format.rs
  • cpp-linter/src/clang_tools/mod.rs
📚 Learning: 2024-11-23T14:10:31.760Z
Learnt from: 2bndy5
Repo: cpp-linter/cpp-linter-rs PR: 54
File: cpp-linter/src/clang_tools/mod.rs:149-159
Timestamp: 2024-11-23T14:10:31.760Z
Learning: Clang tools' `--version` output is consistent across all released versions, so adding extra error handling or unit tests for version extraction may be unnecessary. Tests are run after the clang tools are installed.

Applied to files:

  • cpp-linter/src/clang_tools/mod.rs
🧬 Code graph analysis (2)
cpp-linter/src/clang_tools/clang_tidy.rs (2)
cpp-linter/src/common_fs/file_filter.rs (1)
  • new (17-24)
cpp-linter/src/common_fs/mod.rs (1)
  • new (44-53)
cpp-linter/src/clang_tools/clang_format.rs (1)
cpp-linter/src/common_fs/mod.rs (1)
  • new (44-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: build-mkdocs
  • GitHub Check: aarch64-pc-windows-msvc
  • GitHub Check: x86_64-pc-windows-msvc
  • GitHub Check: aarch64-apple-darwin
  • GitHub Check: Build x86_64-pc-windows-msvc
  • GitHub Check: Build armv7-unknown-linux-gnueabihf
  • GitHub Check: Build i686-pc-windows-msvc
  • GitHub Check: s390x-unknown-linux-gnu
  • GitHub Check: Build aarch64-apple-darwin
  • GitHub Check: Build aarch64-pc-windows-msvc
  • GitHub Check: Build aarch64-unknown-linux-musl
  • GitHub Check: Build aarch64-unknown-linux-gnu
  • GitHub Check: Build x86_64-unknown-linux-musl
  • GitHub Check: Build x86_64-unknown-linux-gnu
  • GitHub Check: x86_64-unknown-linux-musl
  • GitHub Check: powerpc64-unknown-linux-gnu
  • GitHub Check: Build x86_64-apple-darwin
  • GitHub Check: aarch64-unknown-linux-gnu
  • GitHub Check: Build FreeBSD
🔇 Additional comments (5)
cpp-linter/src/clang_tools/clang_tidy.rs (1)

432-480: Tokio-based use_extra_args test matches the new async API

The conversion to #[tokio::test] and Arc<Mutex<FileObj>> matches the updated run_clang_tidy signature, and the test still verifies that extra_args show up in the logged command string, which is the behavior this path is responsible for.

cpp-linter/src/clang_tools/clang_format.rs (1)

21-28: FormatAdvice defaults + XML no-replacement test look solid

Deriving Default for FormatAdvice and adding #[serde(rename(deserialize = "replacement"), default)] to replacements pairs nicely with the new parse_xml_no_replacements test: <replacements> elements with no children now deserialize to FormatAdvice { replacements: vec![], patched: None }, which matches the intended “no fixes” semantics without treating it as a parse error. This is consistent with the prior design of only treating truly blank XML as the exceptional case. Based on learnings.

Also applies to: 204-218

cpp-linter/src/clang_tools/mod.rs (3)

137-185: analyze_single_file async flow and locking strategy look good

  • You grab file_name under a short lock, then drop the lock before any calls to run_clang_format / run_clang_tidy, so no synchronous mutex is held across awaits.
  • The filter checks use is_none_or(|f| f.is_source_or_ignored(file_name.as_path())), which keeps the logic clear: only run a tool when it’s enabled and not explicitly ignored for this file.
  • Logs from each tool invocation are simply appended to the local logs vec and returned alongside the file_name, which meshes cleanly with the grouping done in capture_clang_tools_output.

188-195: ClangVersions now deriving Default simplifies callers

Deriving Default on ClangVersions lets capture_clang_tools_output and summarization code construct an “all tools unused” state without boilerplate, while still allowing each field to be filled only when its corresponding tool actually runs.


201-262: capture_clang_tools_output’s async orchestration and fast-fail behavior look solid

  • Version discovery now awaits the async capture_version for each enabled tool, with versions stored in ClangVersions and the resolved executables wired back into clang_params before any file analysis.
  • Cloning clang_params into an Arc and spawning one analyze_single_file task per file via JoinSet gives you per-file concurrency without sharing mutable state across tasks.
  • The while let Some(output) = executors.join_next().await { let (file_name, logs) = output??; ... } pattern correctly fast-fails on both spawn (JoinError) and per-file analysis errors (inner anyhow::Error), aborting remaining tasks when an error is encountered, as described in your comment.
  • Grouping each file’s logs between start_log_group("Analyzing <path>") and end_log_group() keeps the REST API output structured even when files complete out of order under concurrency.

@2bndy5
Copy link
Collaborator Author

2bndy5 commented Dec 11, 2025

There is roughly about a 10 second increase in the benchmark with this patch. I'm not sure if this is worth merging.

Command Mean [s] Min [s] Max [s] Relative
python-pure 171.075 ± 0.493 170.517 171.450 1.00 ± 0.01
python-rust 188.281 ± 0.466 187.760 188.659 1.10 ± 0.01
rust 186.069 ± 3.193 184.018 189.749 1.09 ± 0.02
rust-previous 170.442 ± 1.830 169.202 172.544 1.00

Command named "rust-previous" is the base branch for this PR. And "rust" is the head commit for this PR.

I'm going to abstract certain changes into a separate branch and try to isolate what causes the performance regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

switch to tokio::process::Command

2 participants