feat(ruvector-diskann): wire Quantizer trait into search path — codes load-bearing#384
Open
ruvnet wants to merge 1 commit intofeature/diskann-rabitq-backendfrom
Open
Conversation
… load-bearing Closes the architectural gap surfaced in PR #383: previously `DiskAnnIndex.pq_codes` was dead storage. `search()` called `graph.greedy_search` over `FlatVectors` (full f32 originals) and exact-L2² rerank — `self.pq_codes` was read by neither. Today's PQ "savings" were on-disk only; in DRAM the index still held full f32 vectors. RaBitQ inherited the same gap. This PR makes the `Quantizer` trait load-bearing in the DiskANN hot path so both backends (PQ + RaBitQ) actually consult their codes during graph traversal. ## What changes 1. **New `greedy_search_with_codes` in `graph.rs`** — generalizes `greedy_search_fast` to accept an arbitrary `distance_fn` closure over node ids. The original `greedy_search_fast` stays unchanged for back-compat. Surprise observation worth recording: the old `greedy_search_fast` was coupled to f32 distance via inline `l2_squared(...)` calls inside the loop, NOT via type signatures. The "abstraction gap" PR #383 left was three lines, not an architectural mismatch. 2. **`DiskAnnIndex` carries `enum QuantizerBackend`** rather than `Option<ProductQuantizer>`. Hybrid pattern chosen because: Option A (generic `DiskAnnIndex<Q>`) — would cascade through `ruvector-diskann-node`'s NAPI binding and any other crate using `DiskAnnIndex` by name. Too much ripple. Option B (`Box<dyn Quantizer>`) — impossible because `Quantizer::Query` is an associated type; trait isn't object-safe. Hybrid (this PR): match-once-per-search on the backend enum, dispatch into a monomorphic closure. Hot loop stays branch-free. 3. **Builder API**: `DiskAnnConfig::with_quantizer_kind()`, `with_rerank_factor()`, `with_rabitq_seed()`, `with_originals_in_memory()`. All have `Default` impls so existing call sites compile unchanged after a `..Default::default()` patch. 4. **`with_originals_in_memory(false)` plumbed but not yet honored**. `build()` rejects with `InvalidConfig` for now — the disk-backed rerank path that would land the actual 17.5× DRAM compression is the natural next PR. The trait-driven traversal shipped here is the prerequisite. Measured codes-vs-originals memory ratio at D=128, n=2000: **codes 40 KB, originals 1024 KB, ratio 0.039 (~25× smaller)** — once originals can be evicted, dataset-level compression will exceed the 17.5× target. ## Recall numbers | Path | recall@10 | Notes | |---|---|---| | Flat f32 (legacy) | 1.000 | n=1k, dim=64, 30 queries | | **PQ before this PR** | (≈ 1.000) | Codes never read — PQ was effectively no-op on recall | | **PQ after this PR** | 0.897 | Trait-driven, M=8, rerank_factor=4. Reflects actual quantization noise. | | **RaBitQ after this PR** | 0.967 | Trait-driven, search_beam=512, rerank_factor=40 | PQ recall now reflects real quantization noise because PQ codes are *used* during traversal. 0.897 is well above the 0.85 floor required by the regression test. ## NAPI binding patch `ruvector-diskann-node/src/lib.rs:48` was using a struct literal that listed every `DiskAnnConfig` field. Adding the four new fields broke that initializer. One-line `..Default::default()` fix restores it without growing the NAPI override surface (all new fields have sensible defaults). ## Verification cargo build --workspace → 0 errors cargo build -p ruvector-diskann --no-default-features → OK (PQ-only) cargo clippy --workspace --all-targets --no-deps -- -D warnings → exit 0 cargo fmt --all --check → exit 0 cargo test -p ruvector-diskann --features rabitq → 30 / 30 (was 26 in PR #383) New tests in `tests/quantizer_search_uses_codes.rs`: - spy-quantizer codes-consulted assertion (proves codes are now consulted; would fail before this PR) - recall@10 vs brute-force baseline (≥ 0.85 floor) - PQ recall regression (no drop vs the old "ignore codes, brute-force rerank" behavior, after accounting for actual quantization noise) - codes-vs-originals memory ratio sanity check - greedy_search_with_codes ≡ greedy_search_fast under f32 distance Refs: PR #383 (DiskANN Quantizer trait + RaBitQ backend), docs/research/rabitq-integration/05-roadmap.md Phase 1 Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet
added a commit
that referenced
this pull request
Apr 26, 2026
Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green for the first time in days. Two issues fixed: ## Failure 1 — Security audit (was: 8 vulnerabilities) `cargo audit` is now exit 0. 4 of the 5 critical advisories were fixed by version bumps; only the unfixable one is ignored. **Dep-bumped:** - `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via `cargo update -p rustls-webpki@0.103.10`. Patches: RUSTSEC-2026-0098 (URI name constraints) RUSTSEC-2026-0099 (wildcard name constraints) RUSTSEC-2026-0104 (CRL parsing panic) - `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance). - Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`) and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` + `ruvllm-cli`). Removes the entire legacy `rustls 0.21` / `rustls-webpki 0.101.7` subtree from the lockfile. **Ignored** (single advisory, with rationale): - `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel) — no upstream fix available; we don't expose RSA decryption services. Documented in `.cargo/audit.toml`. **Unmaintained warnings** (16 total — proc-macro-error, derivative, instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1, rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand) — each given a one-line justification in `.cargo/audit.toml` so CI stays green on them while the team decides whether to chase upstream replacements. ## Failure 2 — Tests timeout (was: 30-min job timeout cancellation) `.github/workflows/ci.yml` `test` job is now a `matrix` with `fail-fast: false` and `timeout-minutes: 45`. Six parallel shards under `cargo nextest run` (installed via `taiki-e/install-action@v2`) plus a separate `cargo test --doc` step (nextest doesn't run doctests): | Shard | Crates | |------------------|---------------------------------------------| | vector-index | rabitq, rulake, diskann, graph, gnn, cnn | | rvagent | 10 rvagent-* crates | | ruvix | 16 ruvix-* crates | | ruqu-quantum | 5 ruqu* crates | | ml-research | attention, mincut, scipix, fpga-transformer,| | | sparse-inference, sparsifier, solver, | | | graph-transformer, domain-expansion, | | | robotics | | core-and-rest | --workspace minus the above | `Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to `taiki-e/install-action` for `cargo-audit` (faster than `cargo install --locked`). ## Verification cargo audit → exit 0 cargo build --workspace --exclude ruvector-postgres → clean cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0 cargo fmt --all --check → exit 0 ## Cargo.lock churn 166-line diff, net ~120 lines removed (more deletions than additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`, `validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`. Added: `rustls-webpki 0.103.13`, `validator 0.20`, `proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No suspicious crates. ## Recommended merge order 1. **This PR first** — unblocks every other PR's CI. 2. After this lands and main is green, rebase the 7 open PRs (#381-#387) one at a time. The DiskANN stack (#383→#384→#385→#386) must merge in numeric order. #381 (Python SDK), #382 (research), #387 (graph property index) are independent and can merge in any order after their CI goes green on the rebase. Co-Authored-By: claude-flow <ruv@ruv.net>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Stacked on PR #383. Closes the architectural gap that PR called out: today
DiskAnnIndex.pq_codesis dead storage —search()calls graph greedy-search overFlatVectors(full f32 originals) and exact-L2² rerank, withself.pq_codesread by neither.This PR makes the
Quantizertrait actually load-bearing in the DiskANN hot path so both backends (PQ + RaBitQ) consult their codes during graph traversal.What changes
greedy_search_with_codesingraph.rs— generalizes the existinggreedy_search_fastto accept an arbitrarydistance_fnclosure. The original stays unchanged for back-compat.DiskAnnIndexcarriesenum QuantizerBackendinstead ofOption<ProductQuantizer>. Hybrid pattern (match-once-per-search, monomorphic inner loops).with_quantizer_kind,with_rerank_factor,with_rabitq_seed,with_originals_in_memory...Default::default()so the workspace still builds.Why hybrid pattern (not generic, not dyn)
DiskAnnIndex<Q>): would cascade throughruvector-diskann-nodeand any other consumer usingDiskAnnIndexby name.Box<dyn Quantizer>): impossible —Quantizer::Queryis an associated type; trait isn't object-safe.search()on a backend enum; the inner closure is monomorphic per arm, hot loop stays branch-free.Recall numbers (the proof codes are now consulted)
PQ recall now reflects real quantization noise because codes are actually used. The 0.897 number is well above the 0.85 floor required by the regression test.
What's deferred
with_originals_in_memory(false)is plumbed and validated, butbuild()rejectsfalsewithInvalidConfig. The disk-backed rerank path that lands the actual 17.5× DRAM compression is the natural next PR. The trait-driven traversal shipped here is the prerequisite.Measured codes-vs-originals ratio at D=128, n=2000: codes 40 KB, originals 1024 KB, ratio 0.039 (~25× smaller). Once originals can be evicted, dataset-level compression will exceed the 17.5× target.
Verification
cargo build --workspace→ 0 errorscargo build -p ruvector-diskann --no-default-features→ OK (PQ-only)cargo clippy --workspace --all-targets --no-deps -- -D warnings→ exit 0cargo fmt --all --check→ exit 0cargo test -p ruvector-diskann --features rabitq→ 30 / 30 (was 26 in PR feat(ruvector-diskann): add RaBitQ backend via new Quantizer trait (Phase 1 item #1) #383)New tests in
tests/quantizer_search_uses_codes.rs:greedy_search_with_codes ≡ greedy_search_fastunder f32 distanceSurprise observation worth recording
The "abstraction gap" PR #383 left wasn't an architectural mismatch — it was three lines of
l2_squared(...)insidegreedy_search_fast's loop, not coupled at the type-signature level. Threading a closure through took ~120 LoC, not the refactor the research doc implied.Stacked on PR #383
base: feature/diskann-rabitq-backend. Will rebase tomainonce #383 merges. The two together are Phase 1 item #1 fromdocs/research/rabitq-integration/05-roadmap.md.🤖 Generated with claude-flow