feat(ruvector-diskann): persist RaBitQ codes — reloads keep codes-driven traversal#386

Open
ruvnet wants to merge 1 commit into feature/diskann-disk-backed-rerank from feature/diskann-rabitq-persistence
ruvnet commented Apr 26, 2026

Summary

Stacked on PR #385. Closes the RaBitQ persistence limitation that the round-trip test in #385 had to work around with PQ.

After this PR: a saved → dropped → reloaded RaBitQ-built DiskANN index keeps the codes-driven traversal path. Search results are bit-identical (verified via f32::to_bits() u32 comparison across 8 queries on a 300-vector D=64 RaBitQ index).
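The bit-identical check described above can be sketched as follows. This is a minimal illustration, not the test code from this PR: the `(id, distance)` result shape and the helper name `results_bit_identical` are assumptions made for the example.

```rust
/// Hypothetical helper: compares two result lists for bit-identical equality.
/// Each result is assumed to be an (id, distance) pair. Distances are compared
/// through their raw IEEE-754 bit patterns instead of with `==`.
fn results_bit_identical(before: &[(u32, f32)], after: &[(u32, f32)]) -> bool {
    before.len() == after.len()
        && before
            .iter()
            .zip(after)
            .all(|(a, b)| a.0 == b.0 && a.1.to_bits() == b.1.to_bits())
}
```

Comparing `to_bits()` values is stricter than float equality: `0.0` and `-0.0` compare equal under `==` but have different bit patterns, and two NaNs with the same payload compare as identical bits even though `NaN != NaN`.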

Sidecar layout

<storage_path>/rabitq.bin — sibling to PR #385's originals.bin and the existing pq.bin. 32-byte header [magic="DARQ0001"][version=1][dim][seed][code_bytes_total][n_codes] + raw code bytes.

Dispatch via a new quantizer_kind JSON tag in config.json ("none"|"pq"|"rabitq"), with a v1 fallback that probes pq.bin presence.

Why not reuse ruvector_rabitq::persist::save_index/load_index?

Those helpers re-persist the originals to support a rebuild path. DiskANN already owns the originals via vectors.bin / originals.bin (PR #385). Reusing the helpers would double-store the f32 payload.

The determinism contract those helpers rely on (ADR-154's (dim, seed) → bit-identical rotation) is reused — that's why the sidecar persists only (dim, seed) rather than the dense rotation matrix.

Surprise finding worth recording

The pre-PR load() silently swallowed RaBitQ failures. With quantizer_kind = Rabitq set in code but no rabitq.bin sidecar on disk, load() happily returned a QuantizerBackend::None index. Search still worked — just via the f32 fallback path with no warning. The new explicit InvalidConfig error converts what was a silent recall regression into a loud configuration error.

Verification

  • cargo build --workspace → clean
  • cargo build -p ruvector-diskann --no-default-features → clean
  • cargo clippy --workspace --all-targets --no-deps -- -D warnings → clean
  • cargo fmt --all --check → clean
  • cargo test -p ruvector-diskann --features rabitq → 38 / 38 (was 35 in PR #385)
  • cargo test -p ruvector-diskann --no-default-features → 19 / 19

Three new tests in tests/disk_backed_rerank.rs:

  • disk_backed_save_load_round_trip_preserves_results_rabitq — bit-identical bytes post-reload
  • rabitq_load_rejects_missing_sidecar — explicit error path
  • v1_pq_index_without_quantizer_kind_tag_still_loads — back-compat fallback

DiskANN stack

base: feature/diskann-disk-backed-rerank (PR #385)
base of base: feature/diskann-quantizer-search-path (PR #384)
base of base of base: feature/diskann-rabitq-backend (PR #383)
base of all: main

The four PRs (#383, #384, #385, #386) together complete Phase 1 item #1 from the research roadmap and close every limitation surfaced along the way.

🤖 Generated with claude-flow

…ven traversal

Closes the limitation flagged in PR #385 (and carried from PR #383/#384):
a saved → dropped → reloaded RaBitQ-built DiskANN index used to fall
back to the f32 traversal path because the rotation matrix + binary
codes weren't persisted. The save/load round-trip test in PR #385 had
to use PQ to avoid hitting this gap.

After this PR:
  - RaBitQ codes survive save/load
  - Reloaded index keeps codes-driven traversal
  - Search results are bit-identical (verified via `f32::to_bits()`
    u32 comparison across 8 queries)

## Sidecar layout

`<storage_path>/rabitq.bin` — sibling to PR #385's `originals.bin`
and the existing `pq.bin`.

  Header (32 bytes):
    [8] magic    = b"DARQ0001"   (matches existing DARO0001 convention)
    [4] version  = u32 LE (1)
    [4] dim      = u32 LE
    [8] seed     = u64 LE         ← rotation reconstructed from seed
    [4] code_bytes_total = u32 LE
    [4] n_codes  = u32 LE

  Body: `n_codes * code_bytes_total` raw code bytes.
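
For readers who want to see the header layout as code, here is a minimal sketch of an encoder and decoder for the 32-byte header listed above. The struct name `RabitqHeader`, its methods, and the `String` error type are hypothetical and are not taken from `src/index.rs`. Rejecting any version other than 1 is also an assumption.

```rust
// Sketch only: field order and widths follow the layout above; all names
// here are hypothetical, not the identifiers used in the crate.
const MAGIC: &[u8; 8] = b"DARQ0001";
const HEADER_LEN: usize = 32;

#[derive(Debug, Clone, PartialEq, Eq)]
struct RabitqHeader {
    version: u32,
    dim: u32,
    seed: u64,
    code_bytes_total: u32,
    n_codes: u32,
}

impl RabitqHeader {
    /// Serialize to the fixed 32-byte little-endian layout.
    fn encode(&self) -> [u8; HEADER_LEN] {
        let mut buf = [0u8; HEADER_LEN];
        buf[0..8].copy_from_slice(MAGIC);
        buf[8..12].copy_from_slice(&self.version.to_le_bytes());
        buf[12..16].copy_from_slice(&self.dim.to_le_bytes());
        buf[16..24].copy_from_slice(&self.seed.to_le_bytes());
        buf[24..28].copy_from_slice(&self.code_bytes_total.to_le_bytes());
        buf[28..32].copy_from_slice(&self.n_codes.to_le_bytes());
        buf
    }

    /// Parse and validate a header from the start of `buf`.
    fn decode(buf: &[u8]) -> Result<Self, String> {
        if buf.len() < HEADER_LEN {
            return Err(format!("header needs {} bytes, got {}", HEADER_LEN, buf.len()));
        }
        if !buf.starts_with(MAGIC) {
            return Err("bad magic, expected DARQ0001".to_string());
        }
        // Read a little-endian u32 at a byte offset.
        let u32_at =
            |off: usize| u32::from_le_bytes([buf[off], buf[off + 1], buf[off + 2], buf[off + 3]]);
        let mut seed = [0u8; 8];
        seed.copy_from_slice(&buf[16..24]);
        let header = RabitqHeader {
            version: u32_at(8),
            dim: u32_at(12),
            seed: u64::from_le_bytes(seed),
            code_bytes_total: u32_at(24),
            n_codes: u32_at(28),
        };
        // Assumption: only version 1 is accepted.
        if header.version != 1 {
            return Err(format!("unsupported version {}", header.version));
        }
        Ok(header)
    }
}
```

The field widths sum to 8 + 4 + 4 + 8 + 4 + 4 = 32 bytes, which matches the stated header size.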

Dispatch is done via a new `quantizer_kind` JSON tag in `config.json`
(`"none" | "pq" | "rabitq"`), with a v1 fallback that probes
`pq.bin` presence for indexes saved before this tag existed.
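
The dispatch rule can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `src/index.rs`: the enum, the function name `resolve_quantizer_kind`, and the `String` error are hypothetical. Treating an untagged index with no `pq.bin` as unquantized is also an assumption.

```rust
// Sketch only: hypothetical names, not the crate's actual types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuantizerKind {
    None,
    Pq,
    Rabitq,
}

/// `tag` is the `quantizer_kind` value from config.json, if present.
/// `pq_bin_present` says whether `<storage_path>/pq.bin` exists on disk.
fn resolve_quantizer_kind(tag: Option<&str>, pq_bin_present: bool) -> Result<QuantizerKind, String> {
    match tag {
        Some("none") => Ok(QuantizerKind::None),
        Some("pq") => Ok(QuantizerKind::Pq),
        Some("rabitq") => Ok(QuantizerKind::Rabitq),
        Some(other) => Err(format!("unknown quantizer_kind: {}", other)),
        // v1 fallback: the tag is absent, so probe for the PQ sidecar.
        None if pq_bin_present => Ok(QuantizerKind::Pq),
        // Assumption: an untagged index without pq.bin is unquantized.
        None => Ok(QuantizerKind::None),
    }
}
```

A caller would typically compute the second argument with something like `storage_path.join("pq.bin").exists()`. Keeping the filesystem probe outside the function makes the rule easy to unit-test.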

## Why not reuse `ruvector_rabitq::persist::save_index/load_index`?

Those helpers re-persist the originals so the rebuild path can
re-encode them. DiskANN already owns the originals via
`vectors.bin` / `originals.bin` (PR #385's sidecar). Reusing the
helpers would double-store the f32 payload.

We DO reuse the determinism contract those helpers rely on:
ADR-154's `(dim, seed) → bit-identical rotation`. That's why the
sidecar persists only `(dim, seed)` rather than the dense
rotation matrix — the rotation is reconstructed at load time.
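
To illustrate why persisting only `(dim, seed)` can be enough, here is a toy sketch of seed-deterministic regeneration. It does not reproduce ADR-154 or the rotation used by `ruvector_rabitq`: the SplitMix64 generator and the plain random matrix are stand-ins, and the matrix is not orthogonal. The property on display is only that the same `(dim, seed)` yields the same bits on every call.

```rust
/// SplitMix64 step: a small, fully deterministic 64-bit generator.
/// Stand-in only; the real rotation generator may differ.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: regenerates a dim x dim matrix (row-major) from a seed.
/// Entries lie in [-1, 1). This is NOT an orthogonal rotation.
fn regenerate_matrix(dim: usize, seed: u64) -> Vec<f32> {
    let mut state = seed;
    (0..dim * dim)
        .map(|_| {
            // Use the top 24 bits so the value converts to f32 exactly.
            let unit = (splitmix64(&mut state) >> 40) as f32 / (1u32 << 24) as f32;
            unit * 2.0 - 1.0
        })
        .collect()
}
```

Because the output depends only on integer arithmetic and exact float conversions, two calls with the same arguments produce bit-identical matrices. A dense D×D `f32` matrix would otherwise cost `4 * D * D` bytes on disk, compared with the 12 bytes that `dim` and `seed` occupy in the header.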

## v1 back-compat

`v1_pq_index_without_quantizer_kind_tag_still_loads` strips the
`quantizer_kind` tag from `config.json` post-save and asserts
byte-identical search results after reload via the file-presence
fallback. Existing PR #385 indexes load and search correctly.

## Surprise observation worth recording

The pre-PR `load()` path silently swallowed RaBitQ failures. With
`quantizer_kind = Rabitq` set in code but no `rabitq.bin` sidecar
on disk, `load()` happily returned a `QuantizerBackend::None`
index. Search still worked — just via the f32 fallback with no
warning. There was no `quantizer_kind` tag in `config.json` at
all, so the runtime config also lost track of what the index was
*supposed to be*.

Adding the tag and turning the missing-sidecar case into an
explicit `InvalidConfig` (tested by
`rabitq_load_rejects_missing_sidecar`) converts what was a silent
recall regression into a loud configuration error.
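
A minimal sketch of the fail-loud behaviour described above, assuming a hypothetical `LoadError` enum and helper name. Only the `InvalidConfig` variant name and the `rabitq.bin` file name come from this PR.

```rust
use std::path::{Path, PathBuf};

// Sketch only: `LoadError` and `require_rabitq_sidecar` are hypothetical names.
#[derive(Debug, PartialEq, Eq)]
enum LoadError {
    InvalidConfig(String),
}

/// Called when config.json says the index was built with RaBitQ.
/// Returns the sidecar path, or an explicit error instead of silently
/// falling back to the f32 traversal path.
fn require_rabitq_sidecar(storage_dir: &Path) -> Result<PathBuf, LoadError> {
    let sidecar = storage_dir.join("rabitq.bin");
    if sidecar.is_file() {
        Ok(sidecar)
    } else {
        Err(LoadError::InvalidConfig(format!(
            "quantizer_kind is \"rabitq\" but {} is missing",
            sidecar.display()
        )))
    }
}
```

Returning an error here means a misconfigured deployment fails at load time rather than serving queries with unexpectedly different memory and recall characteristics.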

## Verification

  cargo build --workspace                                              → clean
  cargo build -p ruvector-diskann --no-default-features                → clean
  cargo clippy --workspace --all-targets --no-deps -- -D warnings      → clean
  cargo fmt --all --check                                              → clean
  cargo test -p ruvector-diskann --features rabitq                     → 38 / 38
                                                                          (was 35 in PR #385)
  cargo test -p ruvector-diskann --no-default-features                 → 19 / 19

Three new tests in `tests/disk_backed_rerank.rs`:
- `disk_backed_save_load_round_trip_preserves_results_rabitq` —
  bit-identical results post-reload
- `rabitq_load_rejects_missing_sidecar` — explicit error path
- `v1_pq_index_without_quantizer_kind_tag_still_loads` —
  back-compat fallback

## Files

- `src/index.rs` (+201/-25): magic constants, save section for
  RaBitQ, load dispatch, mirror loaded seed into runtime config.
- `src/quantize/rabitq.rs` (+26): `seed: u64` field, `seed()`
  accessor, `pub(crate) fn new_trained` for the load path.
- `tests/disk_backed_rerank.rs` (+216): 3 new tests.

Refs: PR #383 (Quantizer trait + RaBitQ backend), PR #384
(search-path rewrite), PR #385 (disk-backed rerank +
`with_originals_in_memory(false)`),
docs/research/rabitq-integration/05-roadmap.md Phase 1.

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet added a commit that referenced this pull request Apr 26, 2026
Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green
for the first time in days. Two issues fixed:

## Failure 1 — Security audit (was: 8 vulnerabilities)

`cargo audit` is now exit 0. 4 of the 5 critical advisories were
fixed by version bumps; only the unfixable one is ignored.

**Dep-bumped:**
- `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via
  `cargo update -p rustls-webpki@0.103.10`. Patches:
    RUSTSEC-2026-0098 (URI name constraints)
    RUSTSEC-2026-0099 (wildcard name constraints)
    RUSTSEC-2026-0104 (CRL parsing panic)
- `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in
  `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance).
- Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`)
  and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` +
  `ruvllm-cli`). Removes the entire legacy `rustls 0.21` /
  `rustls-webpki 0.101.7` subtree from the lockfile.

**Ignored** (single advisory, with rationale):
- `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel) — no upstream
  fix available; we don't expose RSA decryption services. Documented
  in `.cargo/audit.toml`.

**Unmaintained warnings** (16 total — proc-macro-error, derivative,
instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1,
rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand) —
each given a one-line justification in `.cargo/audit.toml` so CI stays
green on them while the team decides whether to chase upstream
replacements.

## Failure 2 — Tests timeout (was: 30-min job timeout cancellation)

`.github/workflows/ci.yml` `test` job is now a `matrix` with
`fail-fast: false` and `timeout-minutes: 45`. Six parallel shards
under `cargo nextest run` (installed via `taiki-e/install-action@v2`)
plus a separate `cargo test --doc` step (nextest doesn't run
doctests):

  | Shard            | Crates                                      |
  |------------------|---------------------------------------------|
  | vector-index     | rabitq, rulake, diskann, graph, gnn, cnn    |
  | rvagent          | 10 rvagent-* crates                         |
  | ruvix            | 16 ruvix-* crates                           |
  | ruqu-quantum     | 5 ruqu* crates                              |
  | ml-research      | attention, mincut, scipix, fpga-transformer,|
  |                  | sparse-inference, sparsifier, solver,       |
  |                  | graph-transformer, domain-expansion,        |
  |                  | robotics                                    |
  | core-and-rest    | --workspace minus the above                 |

`Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to
`taiki-e/install-action` for `cargo-audit` (faster than
`cargo install --locked`).

## Verification

  cargo audit                                                   → exit 0
  cargo build --workspace --exclude ruvector-postgres           → clean
  cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0
  cargo fmt --all --check                                       → exit 0

## Cargo.lock churn

166-line diff, net ~120 lines removed (more deletions than
additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`,
`validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`.
Added: `rustls-webpki 0.103.13`, `validator 0.20`,
`proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No
suspicious crates.

## Recommended merge order

1. **This PR first** — unblocks every other PR's CI.
2. After this lands and main is green, rebase the 7 open PRs
   (#381-#387) one at a time. The DiskANN stack (#383, #384, #385, #386)
   must merge in numeric order. #381 (Python SDK), #382 (research), and
   #387 (graph property index) are independent and can merge in
   any order after their CI goes green on the rebase.

Co-Authored-By: claude-flow <ruv@ruv.net>