111 changes: 110 additions & 1 deletion Cargo.lock

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -98,6 +98,8 @@ members = [
"examples/OSpipe",
"crates/ruvector-coherence",
"crates/ruvector-profiler",
# Python SDK — M1 (RaBitQ-only). See docs/sdk/04-milestones.md.
"crates/ruvector-py",
"crates/ruvector-attn-mincut",
"crates/ruvector-cognitive-container",
"crates/ruvector-verified",
26 changes: 26 additions & 0 deletions crates/ruvector-py/Cargo.toml
@@ -0,0 +1,26 @@
[package]
name = "ruvector-py"
version = "0.1.0"
edition = "2021"
rust-version.workspace = true
license = "MIT OR Apache-2.0"
authors.workspace = true
repository.workspace = true
description = "Python bindings for ruvector — vector similarity search via RaBitQ 1-bit quantization"

[lib]
# maturin places the produced cdylib according to `module-name` in
# `pyproject.toml` (`ruvector._native`), so it lands as
# `ruvector/_native.<abi>.so`. See M1 in `docs/sdk/04-milestones.md`.
name = "ruvector_py"
crate-type = ["cdylib"]

[dependencies]
# Pinned to 0.22 across both pyo3 and rust-numpy: the two crates are
# version-locked and a mismatch produces cryptic linker errors. abi3-py39
# means one wheel covers Python 3.9..3.13 — see `docs/sdk/02-strategy.md`
# § "Wheel distribution matrix".
pyo3 = { version = "0.22", features = ["extension-module", "abi3-py39"] }
numpy = "0.22"
ruvector-rabitq = { path = "../ruvector-rabitq" }
thiserror = { workspace = true }
91 changes: 91 additions & 0 deletions crates/ruvector-py/README.md
@@ -0,0 +1,91 @@
# ruvector — Python SDK (M1)

Vector similarity search via RaBitQ 1-bit quantization, implemented in Rust
with native NumPy interop. M1 ships exactly one index class —
`RabitqIndex` — backed by `ruvector_rabitq::RabitqPlusIndex` (symmetric
1-bit scan + exact f32 rerank).

This crate is the Python wheel half of the ruvector workspace; the
underlying algorithms live in `crates/ruvector-rabitq/` and are unchanged
by this binding. The full SDK plan (M1 → M4) is in
[`docs/sdk/`](../../docs/sdk/).

## Install

Once published to PyPI:

```sh
pip install ruvector
```

For local development from a checkout:

```sh
cd crates/ruvector-py
maturin develop --release
pytest tests/
```

`maturin develop` builds the Rust cdylib in-place and links it as
`ruvector._native` so `import ruvector` works from any Python interpreter
in the active virtualenv. The `--release` flag matters: a debug build is
~30× slower on the search loop and will fail the latency acceptance test.

## 30-second example

```python
import numpy as np
import ruvector

# Build an index over 100k random D=128 vectors.
rng = np.random.default_rng(42)
vectors = rng.standard_normal((100_000, 128), dtype=np.float32)
idx = ruvector.RabitqIndex.build(vectors, rerank_factor=20)

# Search the 10 nearest neighbours of a query.
query = vectors[0]
hits = idx.search(query, k=10)
for vid, score in hits:
    print(vid, score)
# 0 0.0
# 12345 0.0023
# ...

# Persist and reload.
idx.save("idx.rbpx")
idx2 = ruvector.RabitqIndex.load("idx.rbpx")
assert idx2.search(query, k=10) == hits
```
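
When sanity-checking results, an exact brute-force reference can be useful. The following is a minimal sketch, assuming the scores returned by `search` are squared L2 distances (suggested by the `score²` notation in the API summary, not confirmed here). `brute_force_topk` is a hypothetical helper and not part of `ruvector`.

```python
import numpy as np


def brute_force_topk(vectors, query, k):
    """Exact top-k by squared L2 distance, ascending (hypothetical helper)."""
    diffs = vectors - query
    # Row-wise squared norm of (vector - query).
    d2 = np.einsum("ij,ij->i", diffs, diffs)
    # Stable sort keeps the lower id first when distances tie.
    order = np.argsort(d2, kind="stable")[:k]
    return [(int(i), float(d2[i])) for i in order]


demo_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]], dtype=np.float32)
demo_query = np.array([0.0, 0.0], dtype=np.float32)
demo_hits = brute_force_topk(demo_vectors, demo_query, k=3)
print(demo_hits)
```

Comparing the ids from this reference against `idx.search(query, k)` gives a rough recall estimate; because the index is approximate, the two lists are not guaranteed to match exactly.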

## API summary

| Call | Returns | Notes |
|---|---|---|
| `RabitqIndex.build(vectors, *, rerank_factor=20, seed=42)` | `RabitqIndex` | `vectors`: `(n, dim)` C-contig `float32` |
| `idx.search(query, k, *, rerank_factor=None)` | `list[(int, float)]` | `(id, score²)` ascending; `rerank_factor=None` uses the build value |
| `idx.save(path)` / `RabitqIndex.load(path)` | `None` / `RabitqIndex` | `.rbpx` v1 format |
| `len(idx)` / `idx.dim` / `idx.memory_bytes` / `idx.rerank_factor` | `int` | diagnostics |
| `ruvector.RuVectorError` | exception | base of the (future) error tree |
| `ruvector.__version__` | `str` | mirrors `Cargo.toml` |

Non-contiguous or wrong-dtype inputs raise `TypeError` at the boundary
rather than being silently copied: predictable beats convenient.
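
Since the binding does not convert inputs for you, the conversion has to happen on the caller's side. A minimal sketch using only NumPy; `as_index_input` is a hypothetical helper name, not part of `ruvector`.

```python
import numpy as np


def as_index_input(a):
    """Return a C-contiguous float32 (n, dim) array, copying only when needed."""
    out = np.ascontiguousarray(a, dtype=np.float32)
    if out.ndim != 2:
        raise ValueError(f"expected shape (n, dim), got {out.shape}")
    return out


base = np.arange(12, dtype=np.float64).reshape(3, 4)
view = base[:, ::2]  # strided float64 view: non-contiguous and wrong dtype
fixed = as_index_input(view)
print(fixed.flags["C_CONTIGUOUS"], fixed.dtype, fixed.shape)
```

`np.ascontiguousarray` returns the input unchanged when it already has the requested dtype and layout, so the explicit call costs nothing for well-formed arrays.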

## Acceptance gates (M1)

Per `docs/sdk/04-milestones.md`:

1. `pip install ruvector` (or `maturin develop`) succeeds in <10 s
2. 100k-vector D=128 search returns in <10 ms (p99 over 100 queries)
3. Type stubs validate with `mypy --strict`
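
One way to measure gate 2 locally is sketched below, using only the standard library. It assumes a nearest-rank definition of p99, which the milestone document may or may not share; `nearest_rank_p99` and `p99_latency_ms` are hypothetical helper names.

```python
import time


def nearest_rank_p99(samples):
    """Nearest-rank 99th percentile: the smallest sample with >= 99% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return ordered[rank - 1]


def p99_latency_ms(search, queries, k=10):
    """Time search(query, k) once per query and return the p99 latency in milliseconds."""
    samples = []
    for q in queries:
        start = time.perf_counter()
        search(q, k)
        samples.append((time.perf_counter() - start) * 1000.0)
    return nearest_rank_p99(samples)


# Stand-in search callable so the sketch runs without a built index.
stub_latency = p99_latency_ms(lambda q, k: None, range(100))
print(f"p99 = {stub_latency:.4f} ms")
```

With a real index this would be called as `p99_latency_ms(idx.search, queries)`. Running a few warm-up queries first avoids counting one-time setup cost in the tail.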

## Links

- [SDK plan and milestones](../../docs/sdk/) — M1 through M4 roadmap
- [Binding strategy](../../docs/sdk/02-strategy.md) — why PyO3 + maturin
- [API surface sketch](../../docs/sdk/03-api-surface.md) — full Python surface
- [`ruvector-rabitq`](../ruvector-rabitq/) — the Rust crate this wraps

## License

Dual MIT / Apache-2.0, matching the rest of the ruvector workspace.
56 changes: 56 additions & 0 deletions crates/ruvector-py/pyproject.toml
@@ -0,0 +1,56 @@
[build-system]
requires = ["maturin>=1.7,<2.0"]
build-backend = "maturin"

[project]
name = "ruvector"
version = "0.1.0"
description = "Vector similarity search via RaBitQ 1-bit quantization"
readme = "README.md"
license = { text = "MIT OR Apache-2.0" }
requires-python = ">=3.9"
authors = [{ name = "Ruvector Team" }]
keywords = ["vector-search", "ann", "rabitq", "rust", "embeddings"]
classifiers = [
"Development Status :: 3 - Alpha",
"Programming Language :: Rust",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: Implementation :: CPython",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
"Topic :: Database :: Database Engines/Servers",
"License :: OSI Approved :: MIT License",
"License :: OSI Approved :: Apache Software License",
"Operating System :: POSIX :: Linux",
"Operating System :: MacOS",
"Operating System :: Microsoft :: Windows",
]
dependencies = ["numpy>=1.21"]

[project.optional-dependencies]
test = ["pytest>=7", "numpy>=1.21"]

[project.urls]
Repository = "https://github.com/ruvnet/ruvector"
Issues = "https://github.com/ruvnet/ruvector/issues"
"SDK Plan" = "https://github.com/ruvnet/ruvector/tree/main/docs/sdk"

[tool.maturin]
features = ["pyo3/extension-module"]
python-source = "python"
module-name = "ruvector._native"
# Hand-written stubs live alongside the Python source so they ship in the
# wheel. `python/ruvector/__init__.pyi` is the canonical surface; the
# `stubs/` tree carries the same file for tooling that reads PEP 561 stub
# packages directly. See `docs/sdk/02-strategy.md` § "Type stubs".
include = [
{ path = "python/ruvector/py.typed", format = "wheel" },
{ path = "python/ruvector/__init__.pyi", format = "wheel" },
]

[tool.pytest.ini_options]
testpaths = ["tests"]
10 changes: 10 additions & 0 deletions crates/ruvector-py/python/ruvector/__init__.py
@@ -0,0 +1,10 @@
"""ruvector — vector similarity search via RaBitQ 1-bit quantization.

M1 surface only: a single ``RabitqIndex`` class plus the ``RuVectorError``
base exception. See ``docs/sdk/04-milestones.md`` for what M2/M3/M4 add
(RuLake, Embedder, A2aClient).
"""

from ruvector._native import RabitqIndex, RuVectorError, __version__

__all__ = ["RabitqIndex", "RuVectorError", "__version__"]