Skip to content

Expose raft::memory_tracking_resources to C/Python/Java/Rust#2073

Open
achirkin wants to merge 5 commits into
rapidsai:release/26.06from
achirkin:fea-memory-tracking-resources-api
Open

Expose raft::memory_tracking_resources to C/Python/Java/Rust#2073
achirkin wants to merge 5 commits into
rapidsai:release/26.06from
achirkin:fea-memory-tracking-resources-api

Conversation

@achirkin
Copy link
Copy Markdown
Contributor

@achirkin achirkin commented May 12, 2026

Expose the new raft::memory_tracking_resources in the public API of all supported languages and add documentation.

See Raft #2973 for more information about memory tracking resources.

Resolves rapidsai/raft#2982

@achirkin achirkin self-assigned this May 12, 2026
@achirkin achirkin requested review from a team as code owners May 12, 2026 10:47
@achirkin achirkin added the feature request New feature or request label May 12, 2026
@achirkin achirkin added the non-breaking Introduces a non-breaking change label May 12, 2026
@achirkin achirkin moved this to In Progress in Unstructured Data Processing May 12, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fa8b5de9-ea37-4f31-a886-33c0f05a4ace

📥 Commits

Reviewing files that changed from the base of the PR and between ffb3921 and 3c88ea3.

📒 Files selected for processing (1)
  • rust/cuvs/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • rust/cuvs/Cargo.toml

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Resources now support optional memory allocation tracking with background sampling to CSV files at configurable intervals, automatically restoring global memory resources when the tracked handle is destroyed. Support added across C, Java, Python, and Rust APIs.
  • Tests

    • Added integration and unit tests to verify memory tracking CSV output generation.

Walkthrough

This PR adds background CSV memory-allocation tracking to CuVS resources across C, Python, Java, and Rust. The core cuvsResourcesCreateWithMemoryTracking C API validates inputs, constructs a raft memory-tracking resource, and returns an opaque handle. Language bindings—Python Cython wrapper, Java SPI with JDK implementation, and Rust safe wrapper—all call the C foundation with appropriate path encoding and interval conversion. Each language includes tests verifying non-empty CSV output.

Changes

Memory Tracking Resources Feature

Layer / File(s) Summary
C API Foundation
c/include/cuvs/core/c_api.h, c/src/core/c_api.cpp
Declare and implement cuvsResourcesCreateWithMemoryTracking with input validation for CSV path and sample interval; wrap raft::memory_tracking_resources initialization and error translation.
Python Wrapper and Tests
python/cuvs/cuvs/common/c_api.pxd, python/cuvs/cuvs/common/resources.pyx, python/cuvs/cuvs/tests/test_memory_tracking.py
Cython declarations and Resources.__cinit__ accept optional memory-tracking CSV path and interval; call C API when enabled; docstring and example added; test verifies non-empty CSV output.
Java API Contract and Public Factories
java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java, java/cuvs-java/src/main/java/com/nvidia/cuvs/CuVSResources.java, java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/UnsupportedProvider.java
Extend CuVSProvider SPI with newCuVSResources(tempDirectory, csvPath, interval) default method; add public factory on CuVSResources; update UnsupportedProvider stub.
Java Implementation and Tests
java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CuVSResourcesImpl.java, java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java, java/cuvs-java/src/test/java/com/nvidia/cuvs/MemoryTrackingResourcesIT.java
CuVSResourcesImpl new constructor encodes CSV path to UTF-8 native memory and invokes the C API; JDKProvider overload validates and forwards arguments; integration test allocates matrices, waits for CSV reporter, verifies non-empty output on Linux AMD64.
Rust FFI and Wrapper
rust/cuvs-sys/src/bindings.rs, rust/cuvs/src/resources.rs, rust/cuvs/Cargo.toml
Add FFI binding for cuvsResourcesCreateWithMemoryTracking; safe Resources::with_memory_tracking constructor validates UTF-8 path and NUL, converts interval to milliseconds with 10ms default, calls C API; unit test uses tempfile dependency to verify CSV generation.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 79.31% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the main change: exposing raft::memory_tracking_resources to multiple language APIs.
Description check ✅ Passed The PR description clearly relates to the changeset by explaining that it exposes raft::memory_tracking_resources to C/Python/Java/Rust APIs.
Linked Issues check ✅ Passed The PR successfully exposes memory tracking resources in Python, Java, Rust, and C APIs with consistent interfaces matching normal resource handles, fulfilling the requirement from issue #2982.
Out of Scope Changes check ✅ Passed All changes are directly related to exposing memory tracking resources; minor refactoring (method signature reformatting) is incidental to the main feature.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@c/src/core/c_api.cpp`:
- Around line 42-54: The function cuvsResourcesCreateWithMemoryTracking must
validate the output handle and interval: before dereferencing res check that res
is not null and throw std::invalid_argument if it is, and validate
sample_interval_ms > 0 (throw std::invalid_argument for non-positive values) to
avoid unstable sampling; then only allocate the raft::memory_tracking_resources
(res_ptr) and assign *res = reinterpret_cast<uintptr_t>(res_ptr) after these
checks so the translate_exceptions wrapper still returns errors consistently.

In `@java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java`:
- Around line 50-53: Change the new abstract method in the CuVSProvider
interface into a default method to preserve SPI compatibility: declare default
CuVSResources newCuVSResources(Path tempDirectory, Path memoryTrackingCsvPath,
Duration memoryTrackingSampleInterval) throws Throwable, and implement it to
delegate to the existing older overload (e.g., call the existing
newCuVSResources(Path tempDirectory) overload) so existing providers continue to
work; if no older overload exists, have the default implementation throw an
UnsupportedOperationException with a clear message so providers can opt-in to
override the new behavior.

In `@python/cuvs/cuvs/common/resources.pyx`:
- Around line 75-85: The __cinit__ currently treats an empty string
memory_tracking_csv_path as falsy and silently skips memory tracking; change the
validation to explicitly check for None versus empty string: if
memory_tracking_csv_path is None call cuvsResourcesCreate, otherwise verify the
provided path is a non-empty string/bytes (e.g., check length > 0) and then
encode it and call cuvsResourcesCreateWithMemoryTracking with csv_bytes and
memory_tracking_sample_interval_ms, and if the path is empty raise a Python
ValueError; keep using check_cuvs around the C calls (referencing __cinit__,
memory_tracking_csv_path, cuvsResourcesCreateWithMemoryTracking,
cuvsResourcesCreate, check_cuvs).

In `@rust/cuvs/src/resources.rs`:
- Around line 49-50: The current cast from Duration::as_millis() (u128) to i64
for sample_interval_ms is lossy; replace the direct as_millis() as i64 cast with
a fallible conversion using .try_into() on the u128 result (e.g., let
sample_interval_ms: i64 = ...as_millis().try_into()?), and propagate or return a
clear error if conversion fails (do not silently wrap/truncate before passing to
the C FFI). Update the surrounding function (the code that defines
sample_interval_ms and uses sample_interval) to return or propagate an error (or
otherwise handle the failure) so invalid large durations aren’t passed into the
FFI.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f8434388-318b-4816-a423-a13a3977db91

📥 Commits

Reviewing files that changed from the base of the PR and between 90c18a8 and 48cd2de.

📒 Files selected for processing (15)
  • c/include/cuvs/core/c_api.h
  • c/src/core/c_api.cpp
  • docs/source/api_basics.rst
  • java/cuvs-java/src/main/java/com/nvidia/cuvs/CuVSResources.java
  • java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java
  • java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/UnsupportedProvider.java
  • java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CuVSResourcesImpl.java
  • java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java
  • java/cuvs-java/src/test/java/com/nvidia/cuvs/MemoryTrackingResourcesIT.java
  • python/cuvs/cuvs/common/c_api.pxd
  • python/cuvs/cuvs/common/resources.pyx
  • python/cuvs/cuvs/tests/test_memory_tracking.py
  • rust/cuvs-sys/src/bindings.rs
  • rust/cuvs/Cargo.toml
  • rust/cuvs/src/resources.rs

Comment thread c/src/core/c_api.cpp
Comment thread java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java Outdated
Comment thread python/cuvs/cuvs/common/resources.pyx
Comment thread rust/cuvs/src/resources.rs
Comment thread docs/source/api_basics.rst Outdated
///
/// `sample_interval` controls the minimum time between successive CSV
/// samples; when `None`, the C++ default of 10 ms is used.
pub fn with_memory_tracking<P: AsRef<Path>>(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yan-zaretskiy can we get your review on these parts?

@achirkin achirkin requested a review from cjnolet May 13, 2026 05:05
@achirkin achirkin requested a review from yan-zaretskiy May 13, 2026 14:06
Copy link
Copy Markdown
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C and Python changes LGTM

@achirkin achirkin changed the base branch from main to release/26.06 May 15, 2026 14:47
@achirkin achirkin moved this from In Progress to Done in Unstructured Data Processing May 15, 2026
@achirkin achirkin moved this from Done to In Progress in Unstructured Data Processing May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request non-breaking Introduces a non-breaking change

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

[FEA] Python API for memory tracking resources

3 participants