feat(ruvector-diskann): add RaBitQ backend via new Quantizer trait (Phase 1 item #1) #383

Open
ruvnet wants to merge 1 commit into main from feature/diskann-rabitq-backend

ruvnet (Owner) commented Apr 26, 2026

Summary

First implementation step from the RaBitQ integration research roadmap (PR #382). ADR-154 named DiskANN as a target consumer for RaBitQ; this PR makes it real via Pattern 1 (direct embed).

What changes

DiskANN had no quantizer abstraction — DiskAnnIndex held an Option<ProductQuantizer> directly. This PR:

  1. Introduces Quantizer trait at crates/ruvector-diskann/src/quantize/mod.rs — minimal surface (train / encode / prepare_query / distance) with associated Query type for per-impl handles.
  2. Wraps existing PQ as impl Quantizer for ProductQuantizer (back-compat re-exports preserved).
  3. Adds RabitqQuantizer — new backend using packed binary codes for the filter pass and exact L2² for rerank.
  4. rabitq cargo feature, default-on. --no-default-features builds stay green (PQ-only).
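
For reviewers who want a concrete picture of the four-method surface, here is a minimal sketch. The signatures are assumptions (the description names the methods but not their exact types), and `SignQuantizer` is a hypothetical toy backend written only for this illustration; it is not the PR's `RabitqQuantizer`.

```rust
/// Hypothetical shape of the trait; the real signatures in
/// `crates/ruvector-diskann/src/quantize/mod.rs` may differ.
pub trait Quantizer {
    /// Per-implementation query handle (a LUT for PQ, a binary code for RaBitQ).
    type Query;

    fn train(&mut self, vectors: &[Vec<f32>]);
    fn encode(&self, vector: &[f32]) -> Vec<u8>;
    fn prepare_query(&self, query: &[f32]) -> Self::Query;
    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32;
}

/// Toy backend (illustration only): one sign bit per dimension,
/// taken relative to the per-dimension mean of the training set.
#[derive(Default)]
pub struct SignQuantizer {
    mean: Vec<f32>,
}

impl Quantizer for SignQuantizer {
    type Query = Vec<u8>;

    fn train(&mut self, vectors: &[Vec<f32>]) {
        let dim = vectors.first().map_or(0, |v| v.len());
        let mut mean = vec![0.0f32; dim];
        for v in vectors {
            for (m, x) in mean.iter_mut().zip(v) {
                *m += *x;
            }
        }
        let n = vectors.len().max(1) as f32;
        mean.iter_mut().for_each(|m| *m /= n);
        self.mean = mean;
    }

    fn encode(&self, vector: &[f32]) -> Vec<u8> {
        // Pack bits little-endian within each byte: dimension i -> bit i % 8.
        let mut code = vec![0u8; (vector.len() + 7) / 8];
        for (i, (x, m)) in vector.iter().zip(&self.mean).enumerate() {
            if x > m {
                code[i / 8] |= 1 << (i % 8);
            }
        }
        code
    }

    fn prepare_query(&self, query: &[f32]) -> Self::Query {
        // The query handle is simply the query's own code.
        self.encode(query)
    }

    fn distance(&self, query: &Self::Query, code: &[u8]) -> f32 {
        // Hamming distance between the two packed codes.
        query
            .iter()
            .zip(code)
            .map(|(a, b)| (a ^ b).count_ones())
            .sum::<u32>() as f32
    }
}
```

The associated `Query` type is what lets each backend precompute whatever it needs once per query without the trait having to know about it.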

Verification

  • cargo build -p ruvector-diskann (default features) → OK
  • cargo build -p ruvector-diskann --no-default-features → OK
  • cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings (both feature states) → exit 0
  • cargo fmt --all --check → exit 0
  • cargo test -p ruvector-diskann --features rabitq → 26 / 26 passed (21 unit + 5 integration)

Notable test results:

  • rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32 confirms ratio ≤ 1/16 + 1/D at D ∈ {128, 256, 512, 768, 1024}
  • deterministic_codes_for_same_seed verifies ADR-154's bit-identical guarantee
  • rabitq_recall_not_drastically_worse_than_pq at 1k×128: PQ ≈ 0.9, RaBitQ no-rerank ≈ 0.18 (matches research baseline)
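
The size bound can be sanity-checked with plain arithmetic. The sketch below assumes one bit per dimension plus a fixed per-vector header; the 4-byte header is an assumption made for illustration, not the crate's documented layout. Note that against a 4·D-byte f32 vector, the `1/D` term in the bound corresponds to exactly 4 bytes per vector.

```rust
/// Bytes for one vector stored as raw f32.
fn f32_bytes(dim: usize) -> usize {
    dim * 4
}

/// Bytes for one packed binary code: one bit per dimension, rounded up to
/// whole bytes, plus an ASSUMED fixed header (for example a stored norm).
fn packed_code_bytes(dim: usize, header_bytes: usize) -> usize {
    (dim + 7) / 8 + header_bytes
}

/// True when (code size / f32 size) <= 1/16 + 1/D.
fn within_bound(dim: usize, header_bytes: usize) -> bool {
    let ratio = packed_code_bytes(dim, header_bytes) as f64 / f32_bytes(dim) as f64;
    ratio <= 1.0 / 16.0 + 1.0 / dim as f64
}
```

At D = 128 this gives 16 + 4 = 20 bytes against 512 bytes of f32, a ratio of about 0.039, comfortably under the bound of about 0.070.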

⚠️ Important finding the research doc didn't capture

pq_codes is dead storage in DiskANN's current search path. crates/ruvector-diskann/src/index.rs:169-200's search() calls graph.greedy_search (uses FlatVectors — the originals) and then does exact l2_squared rerank. self.pq_codes is read by neither. Today's PQ savings are purely on-disk; in-memory the index still holds full f32 vectors.

RaBitQ inherits this until the search loop is rewritten to consult quantizer codes during graph traversal. That rewrite is a separate PR and the prerequisite for realizing the 17.5× memory compression the research projects.

This PR delivers: the abstraction + RaBitQ backend + storage layer.
This PR does NOT deliver: the search-path rewrite that would make either backend's codes load-bearing.

Scope cuts (deferred)

  • DiskAnnIndex::build still uses ProductQuantizer concretely rather than Box<dyn Quantizer>. The call-site switch is a small follow-up.
  • 100k × 768d full acceptance run lives in benches/rabitq_recall.rs (override with RABITQ_BENCH_N=100000); not exercised here, that is a CI follow-up.
  • RabitqQuantizer keeps originals in DRAM for exact L2² rerank. Streaming originals from disk is M2/M3.

Stacked on PR #380

Branched from main after PR #380 merged at 7a599b7cf. Independent of PR #381 (Python SDK) and PR #382 (research doc).

Test plan

  • Reviewer runs cargo test -p ruvector-diskann --features rabitq locally — should be green
  • CI exercises both default and --no-default-features build paths
  • Follow-up PR rewrites DiskAnnIndex::search to consult quantizer codes during graph traversal — that's where the memory savings actually land

🤖 Generated with claude-flow

Phase 1 item #1 from the research roadmap at
`docs/research/rabitq-integration/05-roadmap.md`. ADR-154 named
DiskANN as a target consumer for RaBitQ; this PR makes the integration
real via Pattern 1 (direct embed) — DiskANN now path-deps on
`ruvector-rabitq` and ships RaBitQ as a peer to the existing
ProductQuantizer.

## What changes

DiskANN previously had no quantizer abstraction — `DiskAnnIndex` held
an `Option<ProductQuantizer>` directly, and `pq_codes` was populated
from concrete PQ methods. This PR:

1. **Introduces `Quantizer` trait** at `crates/ruvector-diskann/src/quantize/mod.rs`.
   Minimal surface DiskANN actually needs:
       train(vectors)
       encode(vec) -> bytes
       prepare_query(query) -> Self::Query
       distance(query, code) -> f32
   Each impl ships its own per-query handle (PQ: flat distance LUT;
   RaBitQ: encoded `BinaryCode`).

2. **Wraps existing PQ** as `impl Quantizer for ProductQuantizer`.
   Source moved from `src/pq.rs` → `src/quantize/pq.rs`. Back-compat
   re-exports preserved at the crate root so existing call sites keep
   compiling unchanged.

3. **Adds `RabitqQuantizer`** at `src/quantize/rabitq.rs`. Backed by
   `ruvector_rabitq::quantize::{Rotation, BinaryCode}` plus a stored
   originals matrix for L2² rerank. Hamming distance over packed bits
   for the filter pass, exact L2² for the rerank pass.

4. **Cargo features.** New `rabitq` cargo feature, default-on.
   `--no-default-features` builds remain green (PQ-only).
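
The two-pass scheme in item 3 (Hamming filter over packed bits, then exact L2² rerank over stored originals) can be sketched as follows. This is a brute-force illustration under assumed names (`filter_then_rerank`, `shortlist`); it is not the code in `src/quantize/rabitq.rs` and does not involve the graph search.

```rust
/// Hamming distance between two packed bit codes of equal length.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact squared L2 distance.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Filter pass: keep the `shortlist` ids closest by Hamming distance.
/// Rerank pass: order that shortlist by exact L2² and return the top `k`.
fn filter_then_rerank(
    query: &[f32],
    query_code: &[u8],
    codes: &[Vec<u8>],
    originals: &[Vec<f32>],
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..codes.len()).collect();
    // Stable sort, so ties keep their original index order.
    ids.sort_by_key(|&i| hamming(query_code, &codes[i]));
    ids.truncate(shortlist);
    ids.sort_by(|&a, &b| {
        l2_squared(query, &originals[a]).total_cmp(&l2_squared(query, &originals[b]))
    });
    ids.truncate(k);
    ids
}
```

A shortlist that is too small can drop the true nearest neighbour before the rerank ever sees it, which is one way to read the low no-rerank recall reported below.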

## Determinism

ADR-154's `(seed, dim, vectors) → bit-identical codes` guarantee is
honored on the new backend. Verified by
`tests/rabitq_quantizer.rs::deterministic_codes_for_same_seed` —
two `RabitqQuantizer::train` calls with the same seed produce
byte-identical codes. (Existing PQ uses `rand::thread_rng()` and is
non-deterministic across runs; closing that gap is out of scope.)
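
To illustrate why a fixed seed yields byte-identical codes, here is a minimal sketch. It uses SplitMix64 as a stand-in generator and random per-dimension sign flips as a stand-in for the rotation; both are assumptions for illustration and say nothing about how `ruvector_rabitq::quantize::Rotation` is actually built.

```rust
/// SplitMix64: a tiny deterministic PRNG, used here as a stand-in for
/// whatever seeded generator the real crate uses (an assumption).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Derive one pseudo-random sign per dimension from `seed`, apply it to the
/// vector, then pack the resulting sign bits (dimension i -> bit i % 8).
fn encode_with_seed(seed: u64, vector: &[f32]) -> Vec<u8> {
    let mut rng = SplitMix64(seed);
    let mut code = vec![0u8; (vector.len() + 7) / 8];
    for (i, x) in vector.iter().enumerate() {
        let flip: f32 = if rng.next_u64() & 1 == 1 { -1.0 } else { 1.0 };
        if x * flip > 0.0 {
            code[i / 8] |= 1 << (i % 8);
        }
    }
    code
}
```

Because every random decision is drawn from a generator whose only state is the seed, the same `(seed, vector)` pair always walks the same sequence. A generator seeded from the OS, as `rand::thread_rng()` is, gives no such guarantee.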

## Verification

  cargo build -p ruvector-diskann                                       → OK
  cargo build -p ruvector-diskann --no-default-features                 → OK
  cargo clippy -p ruvector-diskann --all-targets --no-deps -- -D warnings → exit 0
  cargo clippy -p ruvector-diskann --no-default-features ...            → exit 0
  cargo fmt --all --check                                               → exit 0
  cargo test -p ruvector-diskann --features rabitq                      → 26/26 passed
                                                                          (21 unit + 5 integration)

Notable test results:
- `rabitq_on_disk_size_is_at_most_one_sixteenth_of_f32`: confirms
  ratio ≤ `1/16 + 1/D` at D ∈ {128, 256, 512, 768, 1024}.
- `rabitq_recall_not_drastically_worse_than_pq` at 1k×128: PQ ≈ 0.9,
  RaBitQ no-rerank ≈ 0.18. Matches the research's no-rerank baseline.
- `deterministic_codes_for_same_seed`: ADR-154 guarantee verified.
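
For readers unfamiliar with the metric, recall@k is the fraction of the exact top-k neighbours that the approximate search also returns. A minimal sketch using that standard definition follows; `recall_at_k` is a hypothetical helper, not a function from this crate.

```rust
use std::collections::HashSet;

/// recall@k = |approx top-k ∩ exact top-k| / k
fn recall_at_k(approx: &[usize], exact: &[usize], k: usize) -> f32 {
    if k == 0 {
        return 0.0;
    }
    let truth: HashSet<usize> = exact.iter().take(k).copied().collect();
    let hits = approx
        .iter()
        .take(k)
        .filter(|&&id| truth.contains(&id))
        .count();
    hits as f32 / k as f32
}
```

Under this definition, a recall of 0.18 at k = 10 means that on average fewer than two of the ten true neighbours survive the binary-code ranking when no rerank is applied.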

## Bench (acceptance shape)

`benches/rabitq_recall.rs` is shaped to the research's acceptance
test (100k × 768d, recall@10 ≥ 0.95, on-disk ≤ 1/16 f32). Defaults to
n=10k for CI speed; override with `RABITQ_BENCH_N=100000` for the
full configuration. Not run in this PR — that's CI follow-up.
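
An environment override of this kind is typically read as below. This is a sketch only: the helper names are hypothetical and the fallback behaviour on unparsable input is an assumption, since the bench source is not shown here.

```rust
use std::env;

/// Default dataset size used when no override is given (n = 10k).
const DEFAULT_BENCH_N: usize = 10_000;

/// Parse an optional override, falling back to the default when the value
/// is missing or is not a valid unsigned integer.
fn parse_bench_n(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse().ok())
        .unwrap_or(DEFAULT_BENCH_N)
}

/// Read `RABITQ_BENCH_N` from the process environment.
fn bench_n() -> usize {
    parse_bench_n(env::var("RABITQ_BENCH_N").ok().as_deref())
}
```

Splitting the parse from the environment read keeps the parsing logic testable without mutating process-wide state.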

## Important finding the research doc didn't capture

**`pq_codes` is dead storage in the current DiskANN search path.**
`crates/ruvector-diskann/src/index.rs:169-200`'s `search()` calls
`graph.greedy_search` (which uses `FlatVectors` — the originals)
and then does exact `l2_squared` rerank. `self.pq_codes` is read
by NEITHER. Today's PQ savings are purely on-disk; in-memory the
index still holds full f32 vectors.

RaBitQ inherits the same situation until the search loop is
rewritten to consult quantizer codes during graph traversal.
That rewrite is a real follow-up — likely a separate PR — and is
the prerequisite for actually realizing the 17.5× memory
compression the research projects.

This PR delivers: the **abstraction** + **RaBitQ backend** + **storage
layer**. It does NOT deliver: the search-path rewrite that would make
either backend's codes load-bearing.

## Scope cuts

- `DiskAnnIndex::build` still uses `ProductQuantizer` concretely
  rather than `Box<dyn Quantizer>`. Switching the call site is a
  small follow-up; kept the PR scope tight to "introduce trait +
  RaBitQ impl, don't refactor index construction yet".
- 100k × 768d full acceptance run deferred to bench/CI.
- `RabitqQuantizer` keeps the originals in DRAM for exact L2² rerank.
  Streaming originals from disk is M2/M3 work.

Refs: ADR-154 (RaBitQ), research doc 05-roadmap.md Phase 1 item #1

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet added a commit that referenced this pull request Apr 26, 2026
Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green
for the first time in days. Two issues fixed:

## Failure 1 — Security audit (was: 8 vulnerabilities)

`cargo audit` is now exit 0. 4 of the 5 critical advisories were
fixed by version bumps; only the unfixable one is ignored.

**Dep-bumped:**
- `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via
  `cargo update -p rustls-webpki@0.103.10`. Patches:
    RUSTSEC-2026-0098 (URI name constraints)
    RUSTSEC-2026-0099 (wildcard name constraints)
    RUSTSEC-2026-0104 (CRL parsing panic)
- `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in
  `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance).
- Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`)
  and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` +
  `ruvllm-cli`). Removes the entire legacy `rustls 0.21` /
  `rustls-webpki 0.101.7` subtree from the lockfile.

**Ignored** (single advisory, with rationale):
- `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel) — no upstream
  fix available; we don't expose RSA decryption services. Documented
  in `.cargo/audit.toml`.

**Unmaintained warnings** (16 total — proc-macro-error, derivative,
instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1,
rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand) —
each given a one-line justification in `.cargo/audit.toml` so CI stays
green on them while the team decides whether to chase upstream
replacements.

## Failure 2 — Tests timeout (was: 30-min job timeout cancellation)

`.github/workflows/ci.yml` `test` job is now a `matrix` with
`fail-fast: false` and `timeout-minutes: 45`. Six parallel shards
under `cargo nextest run` (installed via `taiki-e/install-action@v2`)
plus a separate `cargo test --doc` step (nextest doesn't run
doctests):

  | Shard            | Crates                                      |
  |------------------|---------------------------------------------|
  | vector-index     | rabitq, rulake, diskann, graph, gnn, cnn    |
  | rvagent          | 10 rvagent-* crates                         |
  | ruvix            | 16 ruvix-* crates                           |
  | ruqu-quantum     | 5 ruqu* crates                              |
  | ml-research      | attention, mincut, scipix, fpga-transformer,|
  |                  | sparse-inference, sparsifier, solver,       |
  |                  | graph-transformer, domain-expansion,        |
  |                  | robotics                                    |
  | core-and-rest    | --workspace minus the above                 |

`Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to
`taiki-e/install-action` for `cargo-audit` (faster than
`cargo install --locked`).

## Verification

  cargo audit                                                   → exit 0
  cargo build --workspace --exclude ruvector-postgres           → clean
  cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0
  cargo fmt --all --check                                       → exit 0

## Cargo.lock churn

166-line diff, net ~120 lines removed (more deletions than
additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`,
`validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`.
Added: `rustls-webpki 0.103.13`, `validator 0.20`,
`proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No
suspicious crates.

## Recommended merge order

1. **This PR first** — unblocks every other PR's CI.
2. After this lands and main is green, rebase the 7 open PRs
   (#381-#387) one at a time. The DiskANN stack (#383, #384, #385, #386)
   must merge in numeric order. #381 (Python SDK), #382 (research),
   #387 (graph property index) are independent and can merge in
   any order after their CI goes green on the rebase.

Co-Authored-By: claude-flow <ruv@ruv.net>