
Native INT8 (byte vector) HNSW build + search API #665

@lvca

Description

Hi! Filing this from the ArcadeDB project, which uses JVector as the HNSW backend for its LSM_VECTOR index.

Background

We're adding pre-quantized int8 ingest to ArcadeDB (ArcadeData/arcadedb#4132) so callers using providers that emit int8 directly (Cohere embed-english-v3.0, OpenAI text-embedding-3-large reduced precision, Sentence Transformers with int8 quantization) can skip a precision-losing client-side int8 → float32 round-trip.

We dug into JVector 4.0.0-rc.8 to wire the path through and found that the HNSW graph API operates on VectorFloat<?> end-to-end:

  • RandomAccessVectorValues.getVector(int ordinal) returns VectorFloat<?>.
  • GraphIndexBuilder constructors take RandomAccessVectorValues (float-only).
  • VectorSimilarityFunction's abstract method signature is compare(VectorFloat<?>, VectorFloat<?>).

So a caller with int8 input must dequantize to float32 for graph build and for every query, even when the application semantics are int8 throughout. ByteSequence<?> exists in the type system but is used only for PQ/BQ codes (a sidecar to the float-vector graph), not as a primary HNSW vector type.
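For concreteness, this is roughly the shim we have to interpose today: a RandomAccessVectorValues that dequantizes on every getVector() call. It's a sketch against our reading of rc.8; the class name, byte[][] layout, and single symmetric scale factor are our assumptions, not JVector API.

```java
import io.github.jbellis.jvector.graph.RandomAccessVectorValues;
import io.github.jbellis.jvector.vector.VectorizationProvider;
import io.github.jbellis.jvector.vector.types.VectorFloat;
import io.github.jbellis.jvector.vector.types.VectorTypeSupport;

/** Sketch: wraps int8 source vectors, dequantizing to float32 on every read. */
public class DequantizingVectorValues implements RandomAccessVectorValues {
    private static final VectorTypeSupport VTS =
            VectorizationProvider.getInstance().getVectorTypeSupport();

    private final byte[][] int8Vectors; // one int8 vector per ordinal (our layout)
    private final int dimension;
    private final float scale;          // per-index calibration constant (assumed symmetric)

    public DequantizingVectorValues(byte[][] int8Vectors, int dimension, float scale) {
        this.int8Vectors = int8Vectors;
        this.dimension = dimension;
        this.scale = scale;
    }

    @Override
    public int size() { return int8Vectors.length; }

    @Override
    public int dimension() { return dimension; }

    @Override
    public VectorFloat<?> getVector(int ordinal) {
        // The transient float32 allocation this issue is about:
        VectorFloat<?> v = VTS.createFloatVector(dimension);
        byte[] src = int8Vectors[ordinal];
        for (int i = 0; i < dimension; i++) {
            v.set(i, src[i] * scale);
        }
        return v;
    }

    @Override
    public boolean isValueShared() { return false; }

    @Override
    public RandomAccessVectorValues copy() { return this; }
}
```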

Ask

A native byte (int8) vector path (a strawman sketch follows the list):

  • A RandomAccessByteVectorValues (or a generalised RandomAccessVectorValues<T>).
  • VectorSimilarityFunction overloads / variants for byte[] (cosine + dot product on byte vectors with per-block min/max calibration; euclidean on bytes is also straightforward).
  • GraphIndexBuilder constructor(s) that accept the byte-vector RAVV + byte similarity function.
  • Search-side equivalent in GraphSearcher.
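To make the ask precise, the shapes could look something like the following. Every name here is hypothetical, mirroring the existing float API; it's a strawman, not a design proposal.

```java
import io.github.jbellis.jvector.vector.types.ByteSequence;

// Strawman only - all names below are hypothetical.
interface RandomAccessByteVectorValues {
    int size();
    int dimension();
    ByteSequence<?> getVector(int ordinal); // reuses the existing ByteSequence type
}

interface ByteVectorSimilarityFunction {
    float compare(ByteSequence<?> a, ByteSequence<?> b);
}

// GraphIndexBuilder would gain a mirror constructor, e.g.:
//   new GraphIndexBuilder(byteRavv, byteSimilarity, M, beamWidth, neighborOverflow, alpha)
// and GraphSearcher a search(...) overload taking a ByteSequence<?> query.
```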

Why it matters

Modern embedding providers default to int8/binary outputs at scale: Cohere binary embeddings are 1/32 the size of float32, and Cohere int8 is 1/4. Forcing dequantize-on-build/search means:

  • Build cost: O(N * dim * 4) bytes of transient float32 even though the source is bytes.
  • Search cost: every query vector is dequantized before comparison, so JVector's SIMD intrinsics for the byte-similarity case never get exercised (scalar reference sketched after this list).
  • Storage cost: applications keep bytes in their primary store but JVector wants floats, so RAM and on-disk size grow 4× beyond what the application needs (e.g., 10M 1024-dim vectors are ~10 GB as int8 but ~40 GB as float32).
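For reference, the scalar form of the int8 dot-product kernel that SIMD would specialize is tiny. This is a sketch: the accumulator choice is ours, and mapping the raw dot product onto a bounded similarity score is a separate policy decision.

```java
/** Scalar sketch of an int8 dot-product similarity kernel. */
static float dotProductInt8(byte[] a, byte[] b) {
    int acc = 0; // |a[i] * b[i]| <= 128 * 128 = 16384, so an int accumulator is safe up to ~131k dims
    for (int i = 0; i < a.length; i++) {
        acc += a[i] * b[i];
    }
    // Normalizing to a bounded score (e.g., by the dimension-dependent maximum,
    // as Lucene does for its byte dot product) is left as a policy choice.
    return acc;
}
```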

Lucene 9.x added VectorEncoding.BYTE for similar reasons; we'd love the same in JVector to close the precision/size loop.
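For comparison, the Lucene byte path looks roughly like this (class names as of Lucene 9.6; the field name and k are illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.KnnByteVectorField;
import org.apache.lucene.index.VectorSimilarityFunction;
import org.apache.lucene.search.KnnByteVectorQuery;

class LuceneByteVectorExample {
    // Indexing side: the byte vector is stored natively, no float32 transient.
    static Document indexDoc(byte[] int8Vector) {
        Document doc = new Document();
        doc.add(new KnnByteVectorField("embedding", int8Vector,
                VectorSimilarityFunction.DOT_PRODUCT));
        return doc;
    }

    // Query side: the query vector is also bytes, end-to-end.
    static KnnByteVectorQuery query(byte[] int8Query) {
        return new KnnByteVectorQuery("embedding", int8Query, 10);
    }
}
```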

ArcadeDB context

For reference, our comparison matrix against Qdrant and Milvus 2.5 (docs/arcadedb-vs-leading-vector-dbms.md) flags pre-quantized ingest as a P2 gap. We can ship an MVP that dequantizes int8 → float32 server-side (covered in ArcadeData/arcadedb#4132), which closes the API ergonomics gap, but the full "end-to-end int8 with no float32 transient" win requires JVector-side support.

Happy to contribute if there's a design direction the maintainers are considering. Otherwise, this is a tracking request.

Thanks!
