feat(ruvector-diskann): wire Quantizer trait into search path — codes load-bearing by ruvnet · Pull Request #384 · ruvnet/RuVector

ruvnet · 2026-04-26T01:51:33Z

Summary

Stacked on PR #383. Closes the architectural gap that PR called out: today DiskAnnIndex.pq_codes is dead storage — search() calls graph greedy-search over FlatVectors (full f32 originals) and exact-L2² rerank, with self.pq_codes read by neither.

This PR makes the Quantizer trait actually load-bearing in the DiskANN hot path so both backends (PQ + RaBitQ) consult their codes during graph traversal.

What changes

New greedy_search_with_codes in graph.rs — generalizes the existing greedy_search_fast to accept an arbitrary distance_fn closure. The original stays unchanged for back-compat.
DiskAnnIndex carries enum QuantizerBackend instead of Option<ProductQuantizer>. Hybrid pattern (match-once-per-search, monomorphic inner loops).
Builder API: with_quantizer_kind, with_rerank_factor, with_rabitq_seed, with_originals_in_memory.
NAPI binding patched with ..Default::default() so the workspace still builds.

Why hybrid pattern (not generic, not dyn)

Option A (generic DiskAnnIndex<Q>): would cascade through ruvector-diskann-node and any other consumer using DiskAnnIndex by name.
Option B (Box<dyn Quantizer>): impossible — Quantizer::Query is an associated type; trait isn't object-safe.
Hybrid: match once per search() on a backend enum; the inner closure is monomorphic per arm, hot loop stays branch-free.

Recall numbers (the proof codes are now consulted)

Path	recall@10
Flat f32 (legacy)	1.000
PQ before this PR	≈ 1.000 (codes ignored, PQ effectively no-op)
PQ after this PR	0.897 (trait-driven, real quantization noise)
RaBitQ after this PR	0.967 (trait-driven, rerank_factor=40)

PQ recall now reflects real quantization noise because codes are actually used. The 0.897 number is well above the 0.85 floor required by the regression test.

What's deferred

with_originals_in_memory(false) is plumbed and validated, but build() rejects false with InvalidConfig. The disk-backed rerank path that lands the actual 17.5× DRAM compression is the natural next PR. The trait-driven traversal shipped here is the prerequisite.

Measured codes-vs-originals ratio at D=128, n=2000: codes 40 KB, originals 1024 KB, ratio 0.039 (~25× smaller). Once originals can be evicted, dataset-level compression will exceed the 17.5× target.

Verification

cargo build --workspace → 0 errors
cargo build -p ruvector-diskann --no-default-features → OK (PQ-only)
cargo clippy --workspace --all-targets --no-deps -- -D warnings → exit 0
cargo fmt --all --check → exit 0
cargo test -p ruvector-diskann --features rabitq → 30 / 30 (was 26 in PR feat(ruvector-diskann): add RaBitQ backend via new Quantizer trait (Phase 1 item #1) #383)

New tests in tests/quantizer_search_uses_codes.rs:

Spy-quantizer codes-consulted assertion (would fail before this PR)
recall@10 vs brute-force baseline (≥ 0.85 floor)
PQ recall regression check
Codes-vs-originals memory ratio sanity check
greedy_search_with_codes ≡ greedy_search_fast under f32 distance

Surprise observation worth recording

The "abstraction gap" PR #383 left wasn't an architectural mismatch — it was three lines of l2_squared(...) inside greedy_search_fast's loop, not coupled at the type-signature level. Threading a closure through took ~120 LoC, not the refactor the research doc implied.

Stacked on PR #383

base: feature/diskann-rabitq-backend. Will rebase to main once #383 merges. The two together are Phase 1 item #1 from docs/research/rabitq-integration/05-roadmap.md.

🤖 Generated with claude-flow

… load-bearing Closes the architectural gap surfaced in PR #383: previously `DiskAnnIndex.pq_codes` was dead storage. `search()` called `graph.greedy_search` over `FlatVectors` (full f32 originals) and exact-L2² rerank — `self.pq_codes` was read by neither. Today's PQ "savings" were on-disk only; in DRAM the index still held full f32 vectors. RaBitQ inherited the same gap. This PR makes the `Quantizer` trait load-bearing in the DiskANN hot path so both backends (PQ + RaBitQ) actually consult their codes during graph traversal. ## What changes 1. **New `greedy_search_with_codes` in `graph.rs`** — generalizes `greedy_search_fast` to accept an arbitrary `distance_fn` closure over node ids. The original `greedy_search_fast` stays unchanged for back-compat. Surprise observation worth recording: the old `greedy_search_fast` was coupled to f32 distance via inline `l2_squared(...)` calls inside the loop, NOT via type signatures. The "abstraction gap" PR #383 left was three lines, not an architectural mismatch. 2. **`DiskAnnIndex` carries `enum QuantizerBackend`** rather than `Option<ProductQuantizer>`. Hybrid pattern chosen because: Option A (generic `DiskAnnIndex<Q>`) — would cascade through `ruvector-diskann-node`'s NAPI binding and any other crate using `DiskAnnIndex` by name. Too much ripple. Option B (`Box<dyn Quantizer>`) — impossible because `Quantizer::Query` is an associated type; trait isn't object-safe. Hybrid (this PR): match-once-per-search on the backend enum, dispatch into a monomorphic closure. Hot loop stays branch-free. 3. **Builder API**: `DiskAnnConfig::with_quantizer_kind()`, `with_rerank_factor()`, `with_rabitq_seed()`, `with_originals_in_memory()`. All have `Default` impls so existing call sites compile unchanged after a `..Default::default()` patch. 4. **`with_originals_in_memory(false)` plumbed but not yet honored**. `build()` rejects with `InvalidConfig` for now — the disk-backed rerank path that would land the actual 17.5× DRAM compression is the natural next PR. The trait-driven traversal shipped here is the prerequisite. Measured codes-vs-originals memory ratio at D=128, n=2000: **codes 40 KB, originals 1024 KB, ratio 0.039 (~25× smaller)** — once originals can be evicted, dataset-level compression will exceed the 17.5× target. ## Recall numbers | Path | recall@10 | Notes | |---|---|---| | Flat f32 (legacy) | 1.000 | n=1k, dim=64, 30 queries | | **PQ before this PR** | (≈ 1.000) | Codes never read — PQ was effectively no-op on recall | | **PQ after this PR** | 0.897 | Trait-driven, M=8, rerank_factor=4. Reflects actual quantization noise. | | **RaBitQ after this PR** | 0.967 | Trait-driven, search_beam=512, rerank_factor=40 | PQ recall now reflects real quantization noise because PQ codes are *used* during traversal. 0.897 is well above the 0.85 floor required by the regression test. ## NAPI binding patch `ruvector-diskann-node/src/lib.rs:48` was using a struct literal that listed every `DiskAnnConfig` field. Adding the four new fields broke that initializer. One-line `..Default::default()` fix restores it without growing the NAPI override surface (all new fields have sensible defaults). ## Verification cargo build --workspace → 0 errors cargo build -p ruvector-diskann --no-default-features → OK (PQ-only) cargo clippy --workspace --all-targets --no-deps -- -D warnings → exit 0 cargo fmt --all --check → exit 0 cargo test -p ruvector-diskann --features rabitq → 30 / 30 (was 26 in PR #383) New tests in `tests/quantizer_search_uses_codes.rs`: - spy-quantizer codes-consulted assertion (proves codes are now consulted; would fail before this PR) - recall@10 vs brute-force baseline (≥ 0.85 floor) - PQ recall regression (no drop vs the old "ignore codes, brute-force rerank" behavior, after accounting for actual quantization noise) - codes-vs-originals memory ratio sanity check - greedy_search_with_codes ≡ greedy_search_fast under f32 distance Refs: PR #383 (DiskANN Quantizer trait + RaBitQ backend), docs/research/rabitq-integration/05-roadmap.md Phase 1 Co-Authored-By: claude-flow <ruv@ruv.net>

Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green for the first time in days. Two issues fixed: ## Failure 1 — Security audit (was: 8 vulnerabilities) `cargo audit` is now exit 0. 4 of the 5 critical advisories were fixed by version bumps; only the unfixable one is ignored. **Dep-bumped:** - `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via `cargo update -p rustls-webpki@0.103.10`. Patches: RUSTSEC-2026-0098 (URI name constraints) RUSTSEC-2026-0099 (wildcard name constraints) RUSTSEC-2026-0104 (CRL parsing panic) - `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance). - Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`) and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` + `ruvllm-cli`). Removes the entire legacy `rustls 0.21` / `rustls-webpki 0.101.7` subtree from the lockfile. **Ignored** (single advisory, with rationale): - `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel) — no upstream fix available; we don't expose RSA decryption services. Documented in `.cargo/audit.toml`. **Unmaintained warnings** (16 total — proc-macro-error, derivative, instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1, rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand) — each given a one-line justification in `.cargo/audit.toml` so CI stays green on them while the team decides whether to chase upstream replacements. ## Failure 2 — Tests timeout (was: 30-min job timeout cancellation) `.github/workflows/ci.yml` `test` job is now a `matrix` with `fail-fast: false` and `timeout-minutes: 45`. Six parallel shards under `cargo nextest run` (installed via `taiki-e/install-action@v2`) plus a separate `cargo test --doc` step (nextest doesn't run doctests): | Shard | Crates | |------------------|---------------------------------------------| | vector-index | rabitq, rulake, diskann, graph, gnn, cnn | | rvagent | 10 rvagent-* crates | | ruvix | 16 ruvix-* crates | | ruqu-quantum | 5 ruqu* crates | | ml-research | attention, mincut, scipix, fpga-transformer,| | | sparse-inference, sparsifier, solver, | | | graph-transformer, domain-expansion, | | | robotics | | core-and-rest | --workspace minus the above | `Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to `taiki-e/install-action` for `cargo-audit` (faster than `cargo install --locked`). ## Verification cargo audit → exit 0 cargo build --workspace --exclude ruvector-postgres → clean cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0 cargo fmt --all --check → exit 0 ## Cargo.lock churn 166-line diff, net ~120 lines removed (more deletions than additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`, `validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`. Added: `rustls-webpki 0.103.13`, `validator 0.20`, `proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No suspicious crates. ## Recommended merge order 1. **This PR first** — unblocks every other PR's CI. 2. After this lands and main is green, rebase the 7 open PRs (#381-#387) one at a time. The DiskANN stack (#383→#384→#385→#386) must merge in numeric order. #381 (Python SDK), #382 (research), #387 (graph property index) are independent and can merge in any order after their CI goes green on the rebase. Co-Authored-By: claude-flow <ruv@ruv.net>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ruvector-diskann): wire Quantizer trait into search path — codes load-bearing#384

feat(ruvector-diskann): wire Quantizer trait into search path — codes load-bearing#384
ruvnet wants to merge 1 commit intofeature/diskann-rabitq-backendfrom
feature/diskann-quantizer-search-path

ruvnet commented Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ruvnet commented Apr 26, 2026

Summary

What changes

Why hybrid pattern (not generic, not dyn)

Recall numbers (the proof codes are now consulted)

What's deferred

Verification

Surprise observation worth recording

Stacked on PR #383

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant