feat(ruvector-diskann): add RaBitQ backend via new Quantizer trait (Phase 1 item #1) #383
Open
Phase 1 item #1 from the research roadmap at `docs/research/rabitq-integration/05-roadmap.md`. ADR-154 named DiskANN as a target consumer for RaBitQ; this PR makes the integration real via Pattern 1 (direct embed): DiskANN now path-deps on `ruvector-rabitq` and ships RaBitQ as a peer to the existing ProductQuantizer.

## What changes

DiskANN previously had no quantizer abstraction: `DiskAnnIndex` held an `Option<ProductQuantizer>` directly, and `pq_codes` was populated from concrete PQ methods. This PR:

1. **Introduces `Quantizer` trait** at `crates/ruvector-diskann/src/quantize/mod.rs`. Minimal surface DiskANN actually needs:

       train(vectors)
       encode(vec) -> bytes
       prepare_query(query) -> Self::Query
       distance(query, code) -> f32

   Each impl ships its own per-query handle (PQ: flat distance LUT; RaBitQ: encoded `BinaryCode`).
2. **Wraps existing PQ** as `impl Quantizer for ProductQuantizer`. Source moved from `src/pq.rs` → `src/quantize/pq.rs`. Back-compat re-exports preserved at the crate root so existing call sites keep compiling unchanged.
3. **Adds `RabitqQuantizer`** at `src/quantize/rabitq.rs`. Backed by `ruvector_rabitq::quantize::{Rotation, BinaryCode}` plus a stored originals matrix for L2² rerank. Hamming distance over packed bits for the filter pass, exact L2² for the rerank pass.
4. **Cargo features.** New `rabitq` cargo feature, default-on. `--no-default-features` builds remain green (PQ-only).

## Determinism

ADR-154's `(seed, dim, vectors) → bit-identical codes` guarantee is honored on the new backend. Verified by `tests/rabitq_quantizer.rs::deterministic_codes_for_same_seed`: two `RabitqQuantizer::train` calls with the same seed produce byte-identical codes. (Existing PQ uses `rand::thread_rng()` and is non-deterministic across runs; closing that gap is out of scope.)
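To make the four-method surface and the same-seed guarantee concrete, here is a minimal sketch. It is not the code in `src/quantize/mod.rs`: the exact signatures, the `SignBitQuantizer` type, and the `splitmix64` helper are assumptions made for illustration, and seeded sign flips stand in for the real `Rotation`.

```rust
/// Hypothetical mirror of the surface listed above; the real trait may
/// differ in signatures and error handling.
pub trait Quantizer {
    /// Per-query handle (the PR uses a distance LUT for PQ, a `BinaryCode` for RaBitQ).
    type Query;
    fn train(&mut self, vectors: &[Vec<f32>]);
    fn encode(&self, v: &[f32]) -> Vec<u8>;
    fn prepare_query(&self, q: &[f32]) -> Self::Query;
    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32;
}

/// Toy 1-bit quantizer (illustrative only): seeded random sign flips
/// followed by one sign bit per dimension.
pub struct SignBitQuantizer {
    seed: u64,
    signs: Vec<f32>,
}

impl SignBitQuantizer {
    pub fn new(seed: u64) -> Self {
        Self { seed, signs: Vec::new() }
    }
}

/// SplitMix64 step: a tiny deterministic PRNG, so equal seeds give equal signs.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

impl Quantizer for SignBitQuantizer {
    type Query = Vec<u8>;

    /// "Training" only derives one sign per dimension from the seed.
    fn train(&mut self, vectors: &[Vec<f32>]) {
        let dim = vectors.first().map_or(0, |v| v.len());
        let mut state = self.seed;
        self.signs = (0..dim)
            .map(|_| if splitmix64(&mut state) & 1 == 0 { 1.0 } else { -1.0 })
            .collect();
    }

    /// Packs one bit per dimension: set when the sign-flipped component is >= 0.
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        let mut code = vec![0u8; (v.len() + 7) / 8];
        for (i, (x, s)) in v.iter().zip(&self.signs).enumerate() {
            if x * s >= 0.0 {
                code[i / 8] |= 1 << (i % 8);
            }
        }
        code
    }

    fn prepare_query(&self, q: &[f32]) -> Self::Query {
        self.encode(q)
    }

    /// Hamming distance between the query code and a stored code.
    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32 {
        query.iter().zip(code).map(|(a, b)| (a ^ b).count_ones()).sum::<u32>() as f32
    }
}
```

The associated `Query` type is what lets each backend keep its own per-query state behind one trait. In this sketch, two quantizers built with the same seed and trained on the same data encode any vector to identical bytes, which is the shape of the property `deterministic_codes_for_same_seed` checks.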
## Verification

    cargo build -p ruvector-diskann → OK
    cargo build -p ruvector-diskann --no-default-features → OK
    cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings → exit 0
    cargo clippy -p ruvector-diskann --no-default-features ... → exit 0
    cargo fmt --all --check → exit 0
    cargo test -p ruvector-diskann --features rabitq → 26/26 passed (21 unit + 5 integration)

Notable test results:

- `rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32`: confirms ratio ≤ `1/16 + 1/D` at D ∈ {128, 256, 512, 768, 1024}.
- `rabitq_recall_not_drastically_worse_than_pq` at 1k×128: PQ ≈ 0.9, RaBitQ no-rerank ≈ 0.18. Matches the research's no-rerank baseline.
- `deterministic_codes_for_same_seed`: ADR-154 guarantee verified.

## Bench (acceptance shape)

`benches/rabitq_recall.rs` is shaped to the research's acceptance test (100k × 768d, recall@10 ≥ 0.95, on-disk ≤ 1/16 f32). Defaults to n=10k for CI speed; override with `RABITQ_BENCH_N=100000` for the full configuration. Not run in this PR; that's CI follow-up.

## Important finding the research doc didn't capture

**`pq_codes` is dead storage in the current DiskANN search path.** `crates/ruvector-diskann/src/index.rs:169-200`'s `search()` calls `graph.greedy_search` (which uses `FlatVectors`, the originals) and then does exact `l2_squared` rerank. `self.pq_codes` is read by NEITHER. Today's PQ savings are purely on-disk; in-memory the index still holds full f32 vectors.

RaBitQ inherits the same situation until the search loop is rewritten to consult quantizer codes during graph traversal. That rewrite is a real follow-up, likely a separate PR, and is the prerequisite for actually realizing the 17.5× memory compression the research projects.

This PR delivers: the **abstraction** + **RaBitQ backend** + **storage layer**.

It does NOT deliver: the search-path rewrite that would make either backend's codes load-bearing.
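For intuition on the on-disk bound that `rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32` checks, the arithmetic can be worked through under an assumed layout. The layout below (D sign bits packed into bytes plus a single f32 scalar per vector) is a hypothetical chosen for illustration; this PR does not describe the actual `ruvector-rabitq` on-disk format, and all function names are invented for the sketch.

```rust
/// Bytes per vector when stored as raw f32.
fn f32_bytes(dim: usize) -> usize {
    4 * dim
}

/// ASSUMED layout, not taken from the PR: `dim` sign bits packed into
/// bytes, plus one f32 scalar (4 bytes) of per-vector metadata.
fn rabitq_bytes_assumed(dim: usize) -> usize {
    (dim + 7) / 8 + 4
}

/// Size ratio of the assumed code layout to raw f32 storage.
fn ratio(dim: usize) -> f64 {
    rabitq_bytes_assumed(dim) as f64 / f32_bytes(dim) as f64
}

/// The bound quoted in the test results: 1/16 + 1/D.
fn bound(dim: usize) -> f64 {
    1.0 / 16.0 + 1.0 / dim as f64
}

/// True when the assumed layout satisfies the bound at this dimension.
fn within_bound(dim: usize) -> bool {
    ratio(dim) <= bound(dim)
}
```

Under this assumption, and for D a multiple of 8, the ratio is (D/8 + 4) / (4D) = 1/32 + 1/D, which sits inside the `1/16 + 1/D` bound at every dimension the test lists. At D = 128 that is 20 bytes against 512.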
## Scope cuts

- `DiskAnnIndex::build` still uses `ProductQuantizer` concretely rather than `Box<dyn Quantizer>`. Switching the call site is a small follow-up; kept the PR scope tight to "introduce trait + RaBitQ impl, don't refactor index construction yet".
- 100k × 768d full acceptance run deferred to bench/CI.
- `RabitqQuantizer` keeps the originals in DRAM for exact L2² rerank. Streaming originals from disk is M2/M3 work.

Refs: ADR-154 (RaBitQ), research doc 05-roadmap.md Phase 1 item #1

Co-Authored-By: claude-flow <ruv@ruv.net>
This was referenced Apr 26, 2026
ruvnet added a commit that referenced this pull request on Apr 26, 2026
Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green for the first time in days. Two issues fixed:

## Failure 1: Security audit (was: 8 vulnerabilities)

`cargo audit` is now exit 0. 4 of the 5 critical advisories were fixed by version bumps; only the unfixable one is ignored.

**Dep-bumped:**

- `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via `cargo update -p rustls-webpki@0.103.10`. Patches:
  - RUSTSEC-2026-0098 (URI name constraints)
  - RUSTSEC-2026-0099 (wildcard name constraints)
  - RUSTSEC-2026-0104 (CRL parsing panic)
- `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance).
- Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`) and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` + `ruvllm-cli`). Removes the entire legacy `rustls 0.21` / `rustls-webpki 0.101.7` subtree from the lockfile.

**Ignored** (single advisory, with rationale):

- `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel): no upstream fix available; we don't expose RSA decryption services. Documented in `.cargo/audit.toml`.

**Unmaintained warnings** (16 total: proc-macro-error, derivative, instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1, rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand): each given a one-line justification in `.cargo/audit.toml` so CI stays green on them while the team decides whether to chase upstream replacements.

## Failure 2: Tests timeout (was: 30-min job timeout cancellation)

`.github/workflows/ci.yml` `test` job is now a `matrix` with `fail-fast: false` and `timeout-minutes: 45`. Six parallel shards under `cargo nextest run` (installed via `taiki-e/install-action@v2`) plus a separate `cargo test --doc` step (nextest doesn't run doctests):

| Shard | Crates |
|---------------|--------|
| vector-index | rabitq, rulake, diskann, graph, gnn, cnn |
| rvagent | 10 rvagent-* crates |
| ruvix | 16 ruvix-* crates |
| ruqu-quantum | 5 ruqu* crates |
| ml-research | attention, mincut, scipix, fpga-transformer, sparse-inference, sparsifier, solver, graph-transformer, domain-expansion, robotics |
| core-and-rest | --workspace minus the above |

`Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to `taiki-e/install-action` for `cargo-audit` (faster than `cargo install --locked`).

## Verification

    cargo audit → exit 0
    cargo build --workspace --exclude ruvector-postgres → clean
    cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0
    cargo fmt --all --check → exit 0

## Cargo.lock churn

166-line diff, net ~120 lines removed (more deletions than additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`, `validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`. Added: `rustls-webpki 0.103.13`, `validator 0.20`, `proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No suspicious crates.

## Recommended merge order

1. **This PR first**: it unblocks every other PR's CI.
2. After this lands and main is green, rebase the 7 open PRs (#381-#387) one at a time. The DiskANN stack (#383→#384→#385→#386) must merge in numeric order. #381 (Python SDK), #382 (research), #387 (graph property index) are independent and can merge in any order after their CI goes green on the rebase.

Co-Authored-By: claude-flow <ruv@ruv.net>
## Summary

First implementation step from the RaBitQ integration research roadmap (PR #382). ADR-154 named DiskANN as a target consumer for RaBitQ; this PR makes it real via Pattern 1 (direct embed).

## What changes

DiskANN had no quantizer abstraction: `DiskAnnIndex` held an `Option<ProductQuantizer>` directly. This PR:

1. `Quantizer` trait at `crates/ruvector-diskann/src/quantize/mod.rs`: minimal surface (`train` / `encode` / `prepare_query` / `distance`) with associated `Query` type for per-impl handles.
2. `impl Quantizer for ProductQuantizer` (back-compat re-exports preserved).
3. `RabitqQuantizer`: new backend using packed binary codes for the filter pass and exact L2² for rerank.
4. `rabitq` cargo feature, default-on. `--no-default-features` builds stay green (PQ-only).

## Verification

- `cargo build -p ruvector-diskann` (default features) → OK
- `cargo build -p ruvector-diskann --no-default-features` → OK
- `cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings` (both feature states) → exit 0
- `cargo fmt --all --check` → exit 0
- `cargo test -p ruvector-diskann --features rabitq` → 26 / 26 passed (21 unit + 5 integration)

Notable test results:

- `rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32` confirms ratio ≤ `1/16 + 1/D` at D ∈ {128, 256, 512, 768, 1024}
- `deterministic_codes_for_same_seed` verifies ADR-154's bit-identical guarantee
- `rabitq_recall_not_drastically_worse_than_pq` at 1k×128: PQ ≈ 0.9, RaBitQ no-rerank ≈ 0.18 (matches research baseline)

**`pq_codes` is dead storage in DiskANN's current search path.** `crates/ruvector-diskann/src/index.rs:169-200`'s `search()` calls `graph.greedy_search` (uses `FlatVectors`, the originals) and then does exact `l2_squared` rerank. `self.pq_codes` is read by neither. Today's PQ savings are purely on-disk; in-memory the index still holds full f32 vectors.

RaBitQ inherits this until the search loop is rewritten to consult quantizer codes during graph traversal. That rewrite is a separate PR and the prerequisite for realizing the 17.5× memory compression the research projects.

This PR delivers: the abstraction + RaBitQ backend + storage layer.

This PR does NOT deliver: the search-path rewrite that would make either backend's codes load-bearing.
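As a rough picture of what "load-bearing" codes would mean, the sketch below runs the filter-then-rerank pattern over a flat list: a cheap Hamming pass over packed sign bits picks a shortlist, and exact L2² over the originals orders it. It is a flat scan, not DiskANN graph traversal, and every name in it (`sign_code`, `hamming`, `filter_then_rerank`, the `shortlist` parameter) is hypothetical rather than taken from this PR.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Toy stand-in for a binary code: one bit per dimension, set when the
/// component is non-negative. The real RaBitQ encoding rotates first.
fn sign_code(v: &[f32]) -> Vec<u8> {
    let mut code = vec![0u8; (v.len() + 7) / 8];
    for (i, x) in v.iter().enumerate() {
        if *x >= 0.0 {
            code[i / 8] |= 1 << (i % 8);
        }
    }
    code
}

/// Hamming distance over packed bits.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Pass 1: keep the `shortlist` ids with the smallest Hamming distance.
/// Pass 2: order the shortlist by exact L2² and return the top `k` ids.
fn filter_then_rerank(
    query: &[f32],
    codes: &[Vec<u8>],
    originals: &[Vec<f32>],
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    let qcode = sign_code(query);
    let mut ids: Vec<usize> = (0..codes.len()).collect();
    // Stable sort, so ties in Hamming distance keep index order.
    ids.sort_by_key(|&i| hamming(&qcode, &codes[i]));
    ids.truncate(shortlist);
    ids.sort_by(|&a, &b| {
        l2_squared(query, &originals[a]).total_cmp(&l2_squared(query, &originals[b]))
    });
    ids.truncate(k);
    ids
}
```

Only the shortlist touches full f32 vectors in this pattern, which is why reading codes during traversal is the step that would turn the storage savings into memory savings. The coarse filter also admits false positives (a far-away vector with the same sign pattern passes pass 1), consistent with the low no-rerank recall reported above.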
## Scope cuts (deferred)

- `DiskAnnIndex::build` still uses `ProductQuantizer` concretely rather than `Box<dyn Quantizer>`. Call-site switch is a small follow-up.
- `benches/rabitq_recall.rs` (override with `RABITQ_BENCH_N=100000`); not exercised here, that's CI follow-up.
- `RabitqQuantizer` keeps originals in DRAM for exact L2² rerank. Streaming originals from disk is M2/M3.

## Stacked on PR #380

Branched from `main` after PR #380 merged at `7a599b7cf`. Independent of PR #381 (Python SDK) and PR #382 (research doc).

## Test plan

- `cargo test -p ruvector-diskann --features rabitq` locally: should be green
- `--no-default-features` build paths
- Follow-up: rewrite `DiskAnnIndex::search` to consult quantizer codes during graph traversal; that's where the memory savings actually land

🤖 Generated with claude-flow