research: deep review of RaBitQ integration paths into ruvector#382

Open
ruvnet wants to merge 1 commit into main from research/rabitq-integration

ruvnet (Owner) commented Apr 26, 2026

Summary

Research-only PR. Seven markdown files at docs/research/rabitq-integration/ surveying how to extend RaBitQ usage beyond its current call sites (ruLake + Python SDK M1).

No code changes. This is the research artifact that precedes any new ADR or implementation work.

Top 3 recommended integrations

  1. ruvector-diskann RaBitQ backend (≤500 LoC) — ADR-154 already named DiskANN as a target consumer; the spot is open.
  2. ruvector-graph VectorPropertyIndex (≤600 LoC) — vector-keyed property lookup for graph nodes.
  3. ruvector-gnn differentiable_search rewrite (≤300 LoC) — replace cosine fan-out with RabitqPlusIndex::search_with_rerank, keep gradient path, collapse memory by 32×.
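
The 32× figure is consistent with replacing each 32-bit float component by a single bit. The sketch below works through that arithmetic. The vector count, the dimension, and the 8 bytes of per-vector side data are illustrative assumptions, not numbers taken from the research files. Whether the full-precision vectors needed for reranking stay resident is not stated here, so the ratio covers the codes only.

```rust
/// Bytes needed to store `n` vectors of `dim` f32 components.
fn f32_bytes(n: usize, dim: usize) -> usize {
    n * dim * std::mem::size_of::<f32>()
}

/// Bytes needed for `n` one-bit-per-dimension codes, padded to whole bytes.
/// `side_bytes_per_vec` models per-vector scalars (norms, correction terms);
/// its size is an assumption, not something the PR specifies.
fn code_bytes(n: usize, dim: usize, side_bytes_per_vec: usize) -> usize {
    n * (dim.div_ceil(8) + side_bytes_per_vec)
}

fn main() {
    // Illustrative corpus: 1M vectors at 768 dimensions (assumed values).
    let (n, dim) = (1_000_000, 768);
    let raw = f32_bytes(n, dim);
    let codes_only = code_bytes(n, dim, 0);
    let with_side = code_bytes(n, dim, 8); // hypothetical: two f32 scalars per vector

    println!("raw f32:         {} MB", raw / 1_000_000);
    println!("1-bit codes:     {} MB ({}x smaller)", codes_only / 1_000_000, raw / codes_only);
    println!(
        "codes + scalars: {} MB ({:.1}x smaller)",
        with_side / 1_000_000,
        raw as f64 / with_side as f64
    );
}
```

With these assumed inputs the codes alone come to 96 MB against 3072 MB of raw floats, and any per-vector side data pulls the ratio below 32×.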

Key nuance the research surfaced

The VectorKernel trait and CpuKernel ship at crates/ruvector-rabitq/src/kernel.rs:78, and ADR-157's dispatch policy is fully specified, but no caller wires it up. The only reference is a doc comment in ruLake. Any new consumer picking the "trait dispatch" pattern (Pattern 2) would be the first non-test caller and would have to implement dispatch from scratch.

Forced ordering decision: ruLake must implement register_kernel before any other consumer adopts the trait. Phase 1 below stays Pattern 1 (direct embed) only.

Phased roadmap (~10–13 engineer-weeks)

  • Phase 1 (4–5 wk): the 3 high-value Pattern-1 integrations above. All direct-embed.
  • Phase 2 (4–6 wk): ruLake wires register_kernel; CpuKernel + at least one new kernel (CPU-SIMD or WASM) become real; ≥2 consumers route through the trait.
  • Phase 3 (~2 wk): propose new ADR — "RaBitQ as ruvector's canonical vector compression substrate" — and catalog what ruvector-graph / -gnn / -attention need to share one compression layer.

File breakdown

| File | LoC | What it covers |
|------|-----|----------------|
| INDEX.md | 51 | Top-level pointers |
| 01-current-integration.md | 134 | Where RaBitQ is consumed today (call site map) |
| 02-integration-opportunities.md | 300 | 15 candidate consumers surveyed with effort/value scoring |
| 03-architectural-patterns.md | 289 | Direct embed / VectorKernel trait / through ruLake (+ anti-patterns) |
| 04-cross-cutting-concerns.md | 230 | Determinism, witness format, memory ownership, perf, cross-language |
| 05-roadmap.md | 238 | 3 phases with milestones + acceptance gates |
| 06-decision-record.md | 107 | 1-page call to action with open questions |

Test plan

  • Each file ≤300 lines (per scope guidance)
  • Cites real crate:file:line references throughout
  • No code changes — cargo build --workspace unaffected

Stacked on PR #380

Branched from main after PR #380 merged at 7a599b7cf. No conflicts with PR #381.

🤖 Generated with claude-flow

Seven-file research at docs/research/rabitq-integration/ surveying
where RaBitQ (ADR-154, crates.io v2.2.0) is consumed today, where
else it could go, and what architectural pattern each candidate
should use.

## Top 3 integration recommendations

1. **ruvector-diskann RaBitQ backend** — replace/augment the PQ
   quantizer with `RabitqPlusIndex` (≤500 LoC). ADR-154 already
   named DiskANN as a target consumer; the spot is open.

2. **ruvector-graph `VectorPropertyIndex`** — vector-keyed property
   lookup for graph nodes via RaBitQ codes alongside the property
   table (≤600 LoC). Unlocks "find nodes whose embedding is closest
   to query" without a separate index crate.

3. **ruvector-gnn `differentiable_search` rewrite** — replace the
   cosine fan-out at `differentiable_search.rs` with
   `RabitqPlusIndex::search_with_rerank` (≤300 LoC). Keeps the
   gradient path; collapses memory by 32×.
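
To make the "search then rerank" shape in item 3 concrete, here is a minimal two-stage sketch. It is not RaBitQ and not the real `RabitqPlusIndex` API: it substitutes plain sign-bit codes with Hamming distance for the coarse stage, and every function name in it is hypothetical. Only the overall structure (cheap scan over compact codes, exact rescoring of a short candidate list) is meant to carry over.

```rust
/// Pack the sign of each component into bits (1 = non-negative).
fn sign_code(v: &[f32]) -> Vec<u64> {
    let mut code = vec![0u64; v.len().div_ceil(64)];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            code[i / 64] |= 1u64 << (i % 64);
        }
    }
    code
}

/// Number of differing bits between two codes of equal length.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Stage 1: keep the `candidates` closest codes by Hamming distance.
/// Stage 2: rerank those by exact cosine and return the top `k` ids.
fn search_with_rerank_sketch(
    data: &[Vec<f32>],
    codes: &[Vec<u64>],
    query: &[f32],
    k: usize,
    candidates: usize,
) -> Vec<usize> {
    let q = sign_code(query);
    let mut coarse: Vec<(u32, usize)> = codes
        .iter()
        .enumerate()
        .map(|(i, c)| (hamming(c, &q), i))
        .collect();
    coarse.sort_unstable();
    coarse.truncate(candidates.max(k));

    let mut exact: Vec<(f32, usize)> = coarse
        .iter()
        .map(|&(_, i)| (cosine(&data[i], query), i))
        .collect();
    exact.sort_by(|a, b| b.0.total_cmp(&a.0)); // highest similarity first
    exact.into_iter().take(k).map(|(_, i)| i).collect()
}

fn main() {
    let data: Vec<Vec<f32>> = vec![
        vec![1.0, 0.9, -0.2, 0.1],
        vec![-1.0, -0.8, 0.3, -0.1],
        vec![1.0, 1.0, -0.1, 0.1],
        vec![0.1, -0.9, 1.0, -0.5],
    ];
    let codes: Vec<Vec<u64>> = data.iter().map(|v| sign_code(v)).collect();
    let top = search_with_rerank_sketch(&data, &codes, &[1.0, 1.0, -0.1, 0.1], 2, 3);
    println!("top-2 ids after rerank: {:?}", top);
}
```

The rerank stage reads full-precision vectors only for the shortlisted candidates, which is why the coarse codes can dominate resident memory while the exact scores (and any gradient computed from them) are unaffected.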

## Key nuance discovered

The `VectorKernel` trait + `CpuKernel` shipped at
`crates/ruvector-rabitq/src/kernel.rs:78` and ADR-157's dispatch
policy is fully specified — but **no caller wires it up**. The
only reference is a doc comment at `crates/ruvector-rulake/src/lake.rs:595`.

Any new consumer choosing Pattern 2 (the trait dispatch route)
would be the first non-test caller and would have to implement
dispatch from scratch — almost certainly diverging from
ADR-157's determinism gate. This forced an ordering decision:
**ruLake must implement `register_kernel` first**; Phase 1 below
stays Pattern 1 (direct embed) only.
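
The ordering concern is easier to see with a toy registry. The sketch below is purely illustrative: `KernelSketch`, `Registry`, and `dispatch` are hypothetical names, and the real `VectorKernel` signature in `kernel.rs` and the real ADR-157 policy are not reproduced here. It shows one registry owning the "skip non-deterministic kernels when determinism is required" rule, which is the logic each consumer would otherwise have to re-implement on its own.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real `VectorKernel` trait.
trait KernelSketch {
    fn id(&self) -> &'static str;
    /// Assumed flag: true if results are bit-identical across runs.
    fn deterministic(&self) -> bool;
    fn dot(&self, a: &[f32], b: &[f32]) -> f32;
}

/// Plain scalar CPU kernel with a fixed reduction order.
struct ScalarCpu;
impl KernelSketch for ScalarCpu {
    fn id(&self) -> &'static str { "cpu-scalar" }
    fn deterministic(&self) -> bool { true }
    fn dot(&self, a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

/// Pretend accelerator kernel whose reduction order is not guaranteed.
struct FastApprox;
impl KernelSketch for FastApprox {
    fn id(&self) -> &'static str { "fast-approx" }
    fn deterministic(&self) -> bool { false }
    fn dot(&self, a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

#[derive(Default)]
struct Registry {
    kernels: BTreeMap<&'static str, Box<dyn KernelSketch>>,
}

impl Registry {
    /// Register a kernel under its own id, replacing any previous entry.
    fn register_kernel(&mut self, k: Box<dyn KernelSketch>) {
        self.kernels.insert(k.id(), k);
    }

    /// Walk the preference list and return the first registered kernel
    /// that satisfies the determinism requirement.
    fn dispatch(&self, prefer: &[&str], require_deterministic: bool) -> Option<&dyn KernelSketch> {
        for id in prefer {
            if let Some(k) = self.kernels.get(*id) {
                if !require_deterministic || k.deterministic() {
                    return Some(k.as_ref());
                }
            }
        }
        None
    }
}

fn main() {
    let mut reg = Registry::default();
    reg.register_kernel(Box::new(ScalarCpu));
    reg.register_kernel(Box::new(FastApprox));

    let prefer = ["fast-approx", "cpu-scalar"];
    let k = reg.dispatch(&prefer, true).expect("no kernel registered");
    println!("{} -> {}", k.id(), k.dot(&[1.0, 2.0], &[3.0, 4.0]));
}
```

If two consumers each wrote their own version of `dispatch`, nothing would force them to agree on the determinism check, which is the divergence risk behind putting ruLake's `register_kernel` first.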

## Phased roadmap

- **Phase 1 (4–5 wk):** the 3 high-value Pattern-1 integrations
  above. All direct-embed; no trait dispatch yet.
- **Phase 2 (4–6 wk):** ruLake wires `register_kernel`; CpuKernel
  + at least one new kernel (CPU-SIMD or WASM) become real;
  ≥2 consumers route through the trait.
- **Phase 3 (~2 wk):** propose new ADR ("RaBitQ as ruvector's
  canonical vector compression substrate") and catalog what
  ruvector-graph / -gnn / -attention need to share one
  compression layer.

Total: ~10–13 engineer-weeks.

## What this is NOT

- Not implementation. No Rust code in this PR — just markdown.
- Not an ADR. Phase 3 may produce one; this is the research that
  precedes it.
- Not a binding decision. Each integration in §02 is annotated
  with effort + value so the team can re-prioritize.

## File breakdown

  INDEX.md                         51 LoC
  01-current-integration.md       134 LoC  (call sites today)
  02-integration-opportunities.md 300 LoC  (15 candidates surveyed)
  03-architectural-patterns.md    289 LoC  (3 patterns + anti-patterns)
  04-cross-cutting-concerns.md    230 LoC  (determinism, witness, perf)
  05-roadmap.md                   238 LoC  (3 phases, milestones)
  06-decision-record.md           107 LoC  (1-page call to action)

Refs: ADR-154 (RaBitQ), ADR-155 (ruLake), ADR-157 (accelerator
plane), PR #380 (ADR-159 + workspace cleanup), PR #381 (Python
SDK M1).

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet added a commit that referenced this pull request Apr 26, 2026
Unblocks the 7 stacked PRs (#381-#387) and turns `main`'s CI green
for the first time in days. Two issues fixed:

## Failure 1 — Security audit (was: 8 vulnerabilities)

`cargo audit` is now exit 0. 4 of the 5 critical advisories were
fixed by version bumps; only the unfixable one is ignored.

**Dep-bumped:**
- `rustls-webpki 0.101.7` + `0.103.10` → `0.103.13` via
  `cargo update -p rustls-webpki@0.103.10`. Patches:
    RUSTSEC-2026-0098 (URI name constraints)
    RUSTSEC-2026-0099 (wildcard name constraints)
    RUSTSEC-2026-0104 (CRL parsing panic)
- `idna 0.5.0` → `1.1.0` via `validator 0.18 → 0.20` in
  `examples/scipix`. Patches RUSTSEC-2024-0421 (Punycode acceptance).
- Bonus: `reqwest 0.11 → 0.12` (in `ruvector-core` + `examples/benchmarks`)
  and `hf-hub 0.3 → 0.4` (in `ruvector-core` + `ruvllm` +
  `ruvllm-cli`). Removes the entire legacy `rustls 0.21` /
  `rustls-webpki 0.101.7` subtree from the lockfile.

**Ignored** (single advisory, with rationale):
- `RUSTSEC-2023-0071` (rsa Marvin timing sidechannel) — no upstream
  fix available; we don't expose RSA decryption services. Documented
  in `.cargo/audit.toml`.
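
For readers unfamiliar with the file, cargo-audit reads its ignore list from the `[advisories]` table. A minimal sketch of what the entry described above could look like follows; the comment wording is illustrative, and the advisory IDs for the unmaintained-crate warnings are not given in this message, so they are left out.

```toml
# .cargo/audit.toml (sketch; actual file contents may differ)
[advisories]
ignore = [
    # rsa Marvin timing sidechannel: no upstream fix available,
    # and no RSA decryption service is exposed.
    "RUSTSEC-2023-0071",
]
```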

**Unmaintained warnings** (16 total — proc-macro-error, derivative,
instant, paste, bincode 1, pqcrypto-{kyber,dilithium}, rustls-pemfile 1,
rusttype, wee_alloc, number_prefix, rand_os, core2, lru, pprof, rand) —
each given a one-line justification in `.cargo/audit.toml` so CI stays
green on them while the team decides whether to chase upstream
replacements.

## Failure 2 — Tests timeout (was: 30-min job timeout cancellation)

`.github/workflows/ci.yml` `test` job is now a `matrix` with
`fail-fast: false` and `timeout-minutes: 45`. Six parallel shards
under `cargo nextest run` (installed via `taiki-e/install-action@v2`)
plus a separate `cargo test --doc` step (nextest doesn't run
doctests):

  | Shard            | Crates                                      |
  |------------------|---------------------------------------------|
  | vector-index     | rabitq, rulake, diskann, graph, gnn, cnn    |
  | rvagent          | 10 rvagent-* crates                         |
  | ruvix            | 16 ruvix-* crates                           |
  | ruqu-quantum     | 5 ruqu* crates                              |
  | ml-research      | attention, mincut, scipix, fpga-transformer,|
  |                  | sparse-inference, sparsifier, solver,       |
  |                  | graph-transformer, domain-expansion,        |
  |                  | robotics                                    |
  | core-and-rest    | --workspace minus the above                 |

`Swatinem/rust-cache@v2` is keyed per shard. Audit job switched to
`taiki-e/install-action` for `cargo-audit` (faster than
`cargo install --locked`).

## Verification

  cargo audit                                                   → exit 0
  cargo build --workspace --exclude ruvector-postgres           → clean
  cargo clippy --workspace --exclude ruvector-postgres --no-deps -- -D warnings → exit 0
  cargo fmt --all --check                                       → exit 0

## Cargo.lock churn

166-line diff, net ~120 lines removed (more deletions than
additions). Removed: `idna 0.5.0`, `rustls-webpki 0.101.7`,
`validator 0.18`, `validator_derive 0.18`, `proc-macro-error 1.0.4`.
Added: `rustls-webpki 0.103.13`, `validator 0.20`,
`proc-macro-error2`, `hf-hub 0.4.3`, `reqwest 0.12.28`. No
suspicious crates.

## Recommended merge order

1. **This PR first** — unblocks every other PR's CI.
2. After this lands and main is green, rebase the 7 open PRs
   (#381-#387) one at a time. The DiskANN stack (#383, #384, #385,
   #386) must merge in numeric order. #381 (Python SDK), #382
   (research), and #387 (graph property index) are independent and
   can merge in any order after their CI goes green on the rebase.

Co-Authored-By: claude-flow <ruv@ruv.net>