feat(ruvector-diskann): add RaBitQ backend via new Quantizer trait (Phase 1 item #1) by ruvnet · Pull Request #383 · ruvnet/RuVector

ruvnet · 2026-04-26T01:13:03Z

Summary

First implementation step from the RaBitQ integration research roadmap (PR #382). ADR-154 named DiskANN as a target consumer for RaBitQ; this PR makes it real via Pattern 1 (direct embed).

What changes

DiskANN had no quantizer abstraction — DiskAnnIndex held an Option<ProductQuantizer> directly. This PR:

Introduces Quantizer trait at crates/ruvector-diskann/src/quantize/mod.rs — minimal surface (train / encode / prepare_query / distance) with associated Query type for per-impl handles.
Wraps existing PQ as impl Quantizer for ProductQuantizer (back-compat re-exports preserved).
Adds RabitqQuantizer — new backend using packed binary codes for the filter pass and exact L2² for rerank.
Adds a rabitq cargo feature, default-on. --no-default-features builds stay green (PQ-only).
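A minimal sketch of what the trait surface described above could look like. The method names and the associated `Query` type come from this PR; the exact argument types, `SignBitQuantizer`, and the `nearest` helper are assumptions made for illustration and are not the code in `quantize/mod.rs`. The toy backend is a plain sign-bit quantizer, not RaBitQ.

```rust
/// Hypothetical rendering of the trait surface; real signatures may differ.
pub trait Quantizer {
    /// Per-query handle (for example a distance LUT for PQ, an encoded query for RaBitQ).
    type Query;

    fn train(&mut self, vectors: &[Vec<f32>]);
    fn encode(&self, vector: &[f32]) -> Vec<u8>;
    fn prepare_query(&self, query: &[f32]) -> Self::Query;
    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32;
}

/// Toy backend (assumption, NOT RaBitQ): one bit per dimension, set when the
/// component exceeds the training mean; codes are compared by Hamming distance.
pub struct SignBitQuantizer {
    mean: Vec<f32>,
}

impl SignBitQuantizer {
    pub fn new(dim: usize) -> Self {
        Self { mean: vec![0.0; dim] }
    }
}

impl Quantizer for SignBitQuantizer {
    type Query = Vec<u8>;

    fn train(&mut self, vectors: &[Vec<f32>]) {
        if vectors.is_empty() {
            return;
        }
        let n = vectors.len() as f32;
        self.mean.fill(0.0);
        for v in vectors {
            for (m, x) in self.mean.iter_mut().zip(v) {
                *m += x / n;
            }
        }
    }

    fn encode(&self, vector: &[f32]) -> Vec<u8> {
        // Pack one bit per dimension, least significant bit first.
        let mut code = vec![0u8; (self.mean.len() + 7) / 8];
        for (i, (x, m)) in vector.iter().zip(&self.mean).enumerate() {
            if x > m {
                code[i / 8] |= 1 << (i % 8);
            }
        }
        code
    }

    fn prepare_query(&self, query: &[f32]) -> Self::Query {
        self.encode(query)
    }

    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32 {
        query.iter().zip(code).map(|(a, b)| (a ^ b).count_ones()).sum::<u32>() as f32
    }
}

/// Generic caller: works with any backend that implements the trait.
pub fn nearest<Q: Quantizer>(q: &Q, query: &[f32], codes: &[Vec<u8>]) -> Option<usize> {
    let handle = q.prepare_query(query);
    codes
        .iter()
        .enumerate()
        .map(|(i, c)| (i, q.distance(&handle, c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

fn main() {
    let data: Vec<Vec<f32>> = vec![vec![1.0; 4], vec![-1.0; 4]];
    let mut q = SignBitQuantizer::new(4);
    q.train(&data);
    let codes: Vec<Vec<u8>> = data.iter().map(|v| q.encode(v)).collect();
    println!("nearest code index: {:?}", nearest(&q, &[0.5, 0.5, 0.5, -0.5], &codes));
}
```

The associated `Query` type is what lets each backend prepare its own per-query state once and reuse it across many `distance` calls.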

Verification

cargo build -p ruvector-diskann (default features) → OK
cargo build -p ruvector-diskann --no-default-features → OK
cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings (both feature states) → exit 0
cargo fmt --all --check → exit 0
cargo test -p ruvector-diskann --features rabitq → 26 / 26 passed (21 unit + 5 integration)

Notable test results:

rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32 confirms ratio ≤ 1/16 + 1/D at D ∈ {128, 256, 512, 768, 1024}
deterministic_codes_for_same_seed verifies ADR-154's bit-identical guarantee
rabitq_recall_not_drastically_worse_than_pq at 1k×128: PQ ≈ 0.9, RaBitQ no-rerank ≈ 0.18 (matches research baseline)
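The size bound in the first test can be checked by hand. The sketch below assumes a code layout of one packed bit per dimension plus a 4-byte per-vector scalar; that layout is an assumption for illustration, not the actual on-disk format, and `code_bytes` is a hypothetical helper.

```rust
/// Bytes for one raw f32 vector of the given dimension.
fn f32_bytes(dim: usize) -> usize {
    dim * 4
}

/// ASSUMED layout: 1 bit per dimension, packed into bytes, plus a 4-byte scalar.
fn code_bytes(dim: usize) -> usize {
    (dim + 7) / 8 + 4
}

/// The bound asserted by the test: 1/16 + 1/D.
fn ratio_bound(dim: usize) -> f64 {
    1.0 / 16.0 + 1.0 / (dim as f64)
}

fn main() {
    for dim in [128usize, 256, 512, 768, 1024] {
        let ratio = code_bytes(dim) as f64 / f32_bytes(dim) as f64;
        let bound = ratio_bound(dim);
        println!("D={dim}: code={}B raw={}B ratio={ratio:.4} bound={bound:.4}",
                 code_bytes(dim), f32_bytes(dim));
        assert!(ratio <= bound);
    }
}
```

Under this assumed layout a 128-dimensional vector takes 20 bytes against 512 bytes of raw f32, which sits inside the bound at every tested dimension.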

⚠️ Important finding the research doc didn't capture

pq_codes is dead storage in DiskANN's current search path. search() at crates/ruvector-diskann/src/index.rs:169-200 calls graph.greedy_search (which uses FlatVectors, the originals) and then does an exact l2_squared rerank. Neither step reads self.pq_codes. Today's PQ savings are purely on-disk; in memory the index still holds full f32 vectors.

RaBitQ inherits this until the search loop is rewritten to consult quantizer codes during graph traversal. That rewrite is a separate PR and the prerequisite for realizing the 17.5× memory compression the research projects.

This PR delivers: the abstraction + RaBitQ backend + storage layer.
This PR does NOT deliver: the search-path rewrite that would make either backend's codes load-bearing.
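To make "load-bearing codes" concrete, here is a hedged sketch of a two-pass search in which the codes actually drive candidate selection: a Hamming filter over packed bits, then an exact L2² rerank of the shortlist. It is a brute-force scan, not DiskANN's graph traversal, and `sign_code`, `hamming`, and `search` are hypothetical helpers rather than functions from this crate.

```rust
/// Exact squared Euclidean distance.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Toy 1-bit code (assumption): bit i is set when component i is positive.
fn sign_code(v: &[f32]) -> Vec<u8> {
    let mut code = vec![0u8; (v.len() + 7) / 8];
    for (i, x) in v.iter().enumerate() {
        if *x > 0.0 {
            code[i / 8] |= 1 << (i % 8);
        }
    }
    code
}

/// Hamming distance over packed bits.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Pass 1: keep the `shortlist` closest codes. Pass 2: rerank them by exact L2².
fn search(
    query: &[f32],
    originals: &[Vec<f32>],
    codes: &[Vec<u8>],
    k: usize,
    shortlist: usize,
) -> Vec<usize> {
    let qcode = sign_code(query);
    let mut cand: Vec<usize> = (0..codes.len()).collect();
    cand.sort_by_key(|&i| hamming(&qcode, &codes[i]));
    cand.truncate(shortlist);
    cand.sort_by(|&a, &b| {
        l2_squared(query, &originals[a]).total_cmp(&l2_squared(query, &originals[b]))
    });
    cand.truncate(k);
    cand
}

fn main() {
    let originals: Vec<Vec<f32>> = vec![
        vec![1.0, 1.0],
        vec![1.0, -1.0],
        vec![-1.0, -1.0],
        vec![0.9, 0.2],
    ];
    let codes: Vec<Vec<u8>> = originals.iter().map(|v| sign_code(v)).collect();
    let top = search(&[1.0, 0.3], &originals, &codes, 1, 2);
    println!("top-1 after rerank: {:?}", top);
}
```

In this toy data, vectors 0 and 3 share the query's code, so only the rerank pass can tell them apart; with a shortlist of 1 the filter alone would return vector 0 even though vector 3 is closer.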

Scope cuts (deferred)

DiskAnnIndex::build still uses ProductQuantizer concretely rather than Box<dyn Quantizer>. The call-site switch is a small follow-up.
The full 100k × 768d acceptance run lives in benches/rabitq_recall.rs (override with RABITQ_BENCH_N=100000); it is not exercised here and is left as a CI follow-up.
RabitqQuantizer keeps originals in DRAM for exact L2² rerank. Streaming originals from disk is M2/M3.

Stacked on PR #380

Branched from main after PR #380 merged at 7a599b7cf. Independent of PR #381 (Python SDK) and PR #382 (research doc).

Test plan

Reviewer runs cargo test -p ruvector-diskann --features rabitq locally — should be green
CI exercises both default and --no-default-features build paths
Follow-up PR rewrites DiskAnnIndex::search to consult quantizer codes during graph traversal — that's where the memory savings actually land

🤖 Generated with claude-flow

Phase 1 item #1 from the research roadmap at `docs/research/rabitq-integration/05-roadmap.md`. ADR-154 named DiskANN as a target consumer for RaBitQ; this PR makes the integration real via Pattern 1 (direct embed): DiskANN now path-deps on `ruvector-rabitq` and ships RaBitQ as a peer to the existing ProductQuantizer.

## What changes

DiskANN previously had no quantizer abstraction: `DiskAnnIndex` held an `Option<ProductQuantizer>` directly, and `pq_codes` was populated from concrete PQ methods. This PR:

1. **Introduces `Quantizer` trait** at `crates/ruvector-diskann/src/quantize/mod.rs`. Minimal surface DiskANN actually needs:

   - `train(vectors)`
   - `encode(vec) -> bytes`
   - `prepare_query(query) -> Self::Query`
   - `distance(query, code) -> f32`

   Each impl ships its own per-query handle (PQ: flat distance LUT; RaBitQ: encoded `BinaryCode`).

2. **Wraps existing PQ** as `impl Quantizer for ProductQuantizer`. Source moved from `src/pq.rs` → `src/quantize/pq.rs`. Back-compat re-exports preserved at the crate root so existing call sites keep compiling unchanged.

3. **Adds `RabitqQuantizer`** at `src/quantize/rabitq.rs`. Backed by `ruvector_rabitq::quantize::{Rotation, BinaryCode}` plus a stored originals matrix for L2² rerank. Hamming distance over packed bits for the filter pass, exact L2² for the rerank pass.

4. **Cargo features.** New `rabitq` cargo feature, default-on. `--no-default-features` builds remain green (PQ-only).

## Determinism

ADR-154's `(seed, dim, vectors) → bit-identical codes` guarantee is honored on the new backend. Verified by `tests/rabitq_quantizer.rs::deterministic_codes_for_same_seed`: two `RabitqQuantizer::train` calls with the same seed produce byte-identical codes. (Existing PQ uses `rand::thread_rng()` and is non-deterministic across runs; closing that gap is out of scope.)
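The determinism property can be illustrated with a toy encoder whose only source of randomness is an explicit seed. This is a sketch under stated assumptions: `SplitMix64`, `sign_flips`, and `encode_all` are hypothetical stand-ins, and the per-dimension sign flip is a simplification, not the rotation used by `ruvector-rabitq`.

```rust
/// SplitMix64: a small deterministic PRNG, used here as a stand-in for a seeded RNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Toy transform (assumption): one seeded random sign per dimension.
fn sign_flips(seed: u64, dim: usize) -> Vec<f32> {
    let mut rng = SplitMix64(seed);
    (0..dim)
        .map(|_| if rng.next_u64() & 1 == 0 { 1.0 } else { -1.0 })
        .collect()
}

/// Encode every vector to packed sign bits after applying the seeded flips.
/// The output depends only on (seed, dim, vectors).
fn encode_all(seed: u64, dim: usize, vectors: &[Vec<f32>]) -> Vec<Vec<u8>> {
    let flips = sign_flips(seed, dim);
    vectors
        .iter()
        .map(|v| {
            let mut code = vec![0u8; (dim + 7) / 8];
            for (i, (x, s)) in v.iter().zip(&flips).enumerate() {
                if x * s > 0.0 {
                    code[i / 8] |= 1 << (i % 8);
                }
            }
            code
        })
        .collect()
}

fn main() {
    let vectors: Vec<Vec<f32>> = vec![vec![0.3, -1.2, 0.7, 2.0], vec![-0.5, 0.1, -0.9, 0.4]];
    let first = encode_all(42, 4, &vectors);
    let second = encode_all(42, 4, &vectors);
    // Same (seed, dim, vectors) must give byte-identical codes.
    assert_eq!(first, second);
    println!("codes are byte-identical across runs with seed 42");
}
```

A `thread_rng`-style generator in place of the explicit seed would break this equality across runs, which is the gap the PR notes for the existing PQ path.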
## Verification

- `cargo build -p ruvector-diskann` → OK
- `cargo build -p ruvector-diskann --no-default-features` → OK
- `cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings` → exit 0
- `cargo clippy -p ruvector-diskann --no-default-features ...` → exit 0
- `cargo fmt --all --check` → exit 0
- `cargo test -p ruvector-diskann --features rabitq` → 26/26 passed (21 unit + 5 integration)

Notable test results:

- `rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32`: confirms ratio ≤ `1/16 + 1/D` at D ∈ {128, 256, 512, 768, 1024}.
- `rabitq_recall_not_drastically_worse_than_pq` at 1k×128: PQ ≈ 0.9, RaBitQ no-rerank ≈ 0.18. Matches the research's no-rerank baseline.
- `deterministic_codes_for_same_seed`: ADR-154 guarantee verified.

## Bench (acceptance shape)

`benches/rabitq_recall.rs` is shaped to the research's acceptance test (100k × 768d, recall@10 ≥ 0.95, on-disk ≤ 1/16 f32). Defaults to n=10k for CI speed; override with `RABITQ_BENCH_N=100000` for the full configuration. Not run in this PR; that is a CI follow-up.

## Important finding the research doc didn't capture

**`pq_codes` is dead storage in the current DiskANN search path.** `search()` at `crates/ruvector-diskann/src/index.rs:169-200` calls `graph.greedy_search` (which uses `FlatVectors`, the originals) and then does an exact `l2_squared` rerank. `self.pq_codes` is read by NEITHER. Today's PQ savings are purely on-disk; in memory the index still holds full f32 vectors.

RaBitQ inherits the same situation until the search loop is rewritten to consult quantizer codes during graph traversal. That rewrite is a real follow-up, likely a separate PR, and is the prerequisite for actually realizing the 17.5× memory compression the research projects.

This PR delivers: the **abstraction** + **RaBitQ backend** + **storage layer**.
It does NOT deliver: the search-path rewrite that would make either backend's codes load-bearing.
## Scope cuts

- `DiskAnnIndex::build` still uses `ProductQuantizer` concretely rather than `Box<dyn Quantizer>`. Switching the call site is a small follow-up; kept the PR scope tight to "introduce trait + RaBitQ impl, don't refactor index construction yet".
- 100k × 768d full acceptance run deferred to bench/CI.
- `RabitqQuantizer` keeps the originals in DRAM for exact L2² rerank. Streaming originals from disk is M2/M3 work.

Refs: ADR-154 (RaBitQ), research doc 05-roadmap.md Phase 1 item #1

Co-Authored-By: claude-flow <ruv@ruv.net>

Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green for the first time in days. Two issues fixed:

## Failure 1: Security audit (was: 8 vulnerabilities)

`cargo audit` is now exit 0. 4 of the 5 critical advisories were fixed by version bumps; only the unfixable one is ignored.

**Dep-bumped:**

- `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via `cargo update -p rustls-webpki@0.103.10`. Patches:
  - RUSTSEC-2026-0098 (URI name constraints)
  - RUSTSEC-2026-0099 (wildcard name constraints)
  - RUSTSEC-2026-0104 (CRL parsing panic)
- `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance).
- Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`) and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` + `ruvllm-cli`). Removes the entire legacy `rustls 0.21` / `rustls-webpki 0.101.7` subtree from the lockfile.

**Ignored** (single advisory, with rationale):

- `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel): no upstream fix available; we don't expose RSA decryption services. Documented in `.cargo/audit.toml`.

**Unmaintained warnings** (16 total: proc-macro-error, derivative, instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1, rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand): each given a one-line justification in `.cargo/audit.toml` so CI stays green on them while the team decides whether to chase upstream replacements.

## Failure 2: Tests timeout (was: 30-min job timeout cancellation)

`.github/workflows/ci.yml` `test` job is now a `matrix` with `fail-fast: false` and `timeout-minutes: 45`. Six parallel shards under `cargo nextest run` (installed via `taiki-e/install-action@v2`) plus a separate `cargo test --doc` step (nextest doesn't run doctests):

| Shard | Crates |
|---------------|---------------------------------------------|
| vector-index | rabitq, rulake, diskann, graph, gnn, cnn |
| rvagent | 10 rvagent-* crates |
| ruvix | 16 ruvix-* crates |
| ruqu-quantum | 5 ruqu* crates |
| ml-research | attention, mincut, scipix, fpga-transformer, sparse-inference, sparsifier, solver, graph-transformer, domain-expansion, robotics |
| core-and-rest | --workspace minus the above |

`Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to `taiki-e/install-action` for `cargo-audit` (faster than `cargo install --locked`).

## Verification

- `cargo audit` → exit 0
- `cargo build --workspace --exclude ruvector-postgres` → clean
- `cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings` → exit 0
- `cargo fmt --all --check` → exit 0

## Cargo.lock churn

166-line diff, net ~120 lines removed (more deletions than additions).

- Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`, `validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`.
- Added: `rustls-webpki 0.103.13`, `validator 0.20`, `proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`.

No suspicious crates.

## Recommended merge order

1. **This PR first.** It unblocks every other PR's CI.
2. After this lands and main is green, rebase the 7 open PRs (#381-#387) one at a time. The DiskANN stack (#383→#384→#385→#386) must merge in numeric order. #381 (Python SDK), #382 (research), #387 (graph property index) are independent and can merge in any order after their CI goes green on the rebase.

Co-Authored-By: claude-flow <ruv@ruv.net>

ruvnet wants to merge 1 commit into main from feature/diskann-rabitq-backend
