Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
140 changes: 107 additions & 33 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,16 +14,20 @@ pathrex/
├── Cargo.toml # Crate manifest (edition 2024)
├── build.rs # Links LAGraph + LAGraphX; optionally regenerates FFI bindings
├── src/
│ ├── lib.rs # Public modules: formats, graph, sparql, lagraph_sys, utils
│ ├── lib.rs # Modules: formats, graph, rpq, sparql, utils, lagraph_sys
│ ├── main.rs # Binary entry point (placeholder)
│ ├── lagraph_sys.rs # FFI module — includes generated bindings
│ ├── lagraph_sys_generated.rs# Bindgen output (checked in, regenerated in CI)
│ ├── utils.rs # Internal helpers: CountingBuilder, CountOutput, VecSource,
│ │ # grb_ok! and la_ok! macros
│ ├── utils.rs # Public helpers: CountingBuilder, CountOutput, VecSource,
│ │ # grb_ok! and la_ok! macros, build_graph
│ ├── graph/
│ │ ├── mod.rs # Core traits (GraphBuilder, GraphDecomposition, GraphSource,
│ │ │ # Backend, Graph<B>), error types, RAII wrappers, GrB init
│ │ └── inmemory.rs # InMemory marker, InMemoryBuilder, InMemoryGraph
│ ├── rpq/
│ │ ├── mod.rs # RpqEvaluator (assoc. Result), RpqQuery, Endpoint, PathExpr, RpqError
│ │ ├── nfarpq.rs # NfaRpqEvaluator (LAGraph_RegularPathQuery)
│ │ └── rpqmatrix.rs # Matrix-plan RPQ evaluator
│ ├── sparql/
│ │ └── mod.rs # parse_rpq / extract_rpq → RpqQuery (spargebra)
│ └── formats/
Expand All @@ -32,7 +36,9 @@ pathrex/
│ └── mm.rs # MatrixMarket directory loader (vertices.txt, edges.txt, *.txt)
├── tests/
│ ├── inmemory_tests.rs # Integration tests for InMemoryBuilder / InMemoryGraph
│ └── mm_tests.rs # Integration tests for MatrixMarket format
│ ├── mm_tests.rs # Integration tests for MatrixMarket format
│ ├── nfarpq_tests.rs # Integration tests for NfaRpqEvaluator
│ └── rpqmatrix_tests.rs # Integration tests for matrix-plan RPQ evaluator
├── deps/
│ └── LAGraph/ # Git submodule (SparseLinearAlgebra/LAGraph)
└── .github/workflows/ci.yml # CI: build GraphBLAS + LAGraph, cargo build & test
Expand Down Expand Up @@ -121,40 +127,40 @@ regenerates it with `--features regenerate-bindings`. **Do not hand-edit this fi

### Edge

[`Edge`](src/graph/mod.rs:154) is the universal currency between format parsers and graph
[`Edge`](src/graph/mod.rs:158) is the universal currency between format parsers and graph
builders: `{ source: String, target: String, label: String }`.

### GraphSource trait

[`GraphSource<B>`](src/graph/mod.rs:164) is implemented by any data source that knows how to
[`GraphSource<B>`](src/graph/mod.rs:168) is implemented by any data source that knows how to
feed itself into a specific [`GraphBuilder`]:

- [`apply_to(self, builder: B) -> Result<B, B::Error>`](src/graph/mod.rs:165) — consumes the
- [`apply_to(self, builder: B) -> Result<B, B::Error>`](src/graph/mod.rs:169) — consumes the
source and returns the populated builder.

[`Csv<R>`](src/formats/csv.rs:52) implements `GraphSource<InMemoryBuilder>` directly, so it
can be passed to [`GraphBuilder::load`].

### GraphBuilder trait

[`GraphBuilder`](src/graph/mod.rs:169) accumulates edges and produces a
[`GraphDecomposition`](src/graph/mod.rs:188):
[`GraphBuilder`](src/graph/mod.rs:173) accumulates edges and produces a
[`GraphDecomposition`](src/graph/mod.rs:193):

- [`load<S: GraphSource<Self>>(self, source: S)`](src/graph/mod.rs:179) — primary entry point;
- [`load<S: GraphSource<Self>>(self, source: S)`](src/graph/mod.rs:183) — primary entry point;
delegates to `GraphSource::apply_to`.
- [`build(self)`](src/graph/mod.rs:184) — finalise into an immutable graph.
- [`build(self)`](src/graph/mod.rs:188) — finalise into an immutable graph.

`InMemoryBuilder` also exposes lower-level helpers outside the trait:

- [`push_edge(&mut self, edge: Edge)`](src/graph/inmemory.rs:62) — ingest one edge.
- [`with_stream<I, E>(self, stream: I)`](src/graph/inmemory.rs:72) — consume an
- [`push_edge(&mut self, edge: Edge)`](src/graph/inmemory.rs:83) — ingest one edge.
- [`with_stream<I, E>(self, stream: I)`](src/graph/inmemory.rs:93) — consume an
`IntoIterator<Item = Result<Edge, E>>`.
- [`push_grb_matrix(&mut self, label, matrix: GrB_Matrix)`](src/graph/inmemory.rs:85) — accept
- [`push_grb_matrix(&mut self, label, matrix: GrB_Matrix)`](src/graph/inmemory.rs:106) — accept
a pre-built `GrB_Matrix` for a label, wrapping it in an `LAGraph_Graph` immediately.

### Backend trait & Graph\<B\> handle

[`Backend`](src/graph/mod.rs:217) associates a marker type with a concrete builder/graph pair:
[`Backend`](src/graph/mod.rs:221) associates a marker type with a concrete builder/graph pair:

```rust
pub trait Backend {
Expand All @@ -163,28 +169,28 @@ pub trait Backend {
}
```

[`Graph<B>`](src/graph/mod.rs:229) is a zero-sized handle parameterised by a `Backend`:
[`Graph<B>`](src/graph/mod.rs:233) is a zero-sized handle parameterised by a `Backend`:

- [`Graph::<InMemory>::builder()`](src/graph/mod.rs:234) — returns a fresh `InMemoryBuilder`.
- [`Graph::<InMemory>::try_from(source)`](src/graph/mod.rs:238) — builds a graph from a single
- [`Graph::<InMemory>::builder()`](src/graph/mod.rs:238) — returns a fresh `InMemoryBuilder`.
- [`Graph::<InMemory>::try_from(source)`](src/graph/mod.rs:242) — builds a graph from a single
source in one call.

[`InMemory`](src/graph/inmemory.rs:26) is the concrete backend marker type.
[`InMemory`](src/graph/inmemory.rs:27) is the concrete backend marker type.

### GraphDecomposition trait

[`GraphDecomposition`](src/graph/mod.rs:188) is the read-only query interface:
[`GraphDecomposition`](src/graph/mod.rs:193) is the read-only query interface:

- [`get_graph(label)`](src/graph/mod.rs:192) — returns `Arc<LagraphGraph>` for a given edge label.
- [`get_node_id(string_id)`](src/graph/mod.rs:195) / [`get_node_name(mapped_id)`](src/graph/mod.rs:198) — bidirectional string ↔ integer dictionary.
- [`num_nodes()`](src/graph/mod.rs:199) — total unique nodes.
- [`get_graph(label)`](src/graph/mod.rs:197) — returns `Arc<LagraphGraph>` for a given edge label.
- [`get_node_id(string_id)`](src/graph/mod.rs:200) / [`get_node_name(mapped_id)`](src/graph/mod.rs:203) — bidirectional string ↔ integer dictionary.
- [`num_nodes()`](src/graph/mod.rs:204) — total unique nodes.

### InMemoryBuilder / InMemoryGraph

[`InMemoryBuilder`](src/graph/inmemory.rs:35) is the primary `GraphBuilder` implementation.
[`InMemoryBuilder`](src/graph/inmemory.rs:36) is the primary `GraphBuilder` implementation.
It collects edges in RAM, then [`build()`](src/graph/inmemory.rs:131) calls
GraphBLAS to create one `GrB_Matrix` per label via COO format, wraps each in an
`LAGraph_Graph`, and returns an [`InMemoryGraph`](src/graph/inmemory.rs:173).
`LAGraph_Graph`, and returns an [`InMemoryGraph`](src/graph/inmemory.rs:174).

Multiple CSV sources can be chained with repeated `.load()` calls; all edges are merged
into a single graph.
Expand All @@ -196,7 +202,7 @@ which is used by the MatrixMarket loader.

### Format parsers

Two built-in parsers are available:
CSV and MatrixMarket edge loaders are available:

#### CSV format

Expand All @@ -220,7 +226,7 @@ Name-based lookup requires `has_header: true`.

[`MatrixMarket`](src/formats/mm.rs) loads an edge-labeled graph from a directory with:

- `vertices.txt` — one line per node: `<node_name> <1-based-index>` on disk; [`get_node_id`](src/graph/mod.rs:199) returns the matching **0-based** matrix index
- `vertices.txt` — one line per node: `<node_name> <1-based-index>` on disk; [`get_node_id`](src/graph/mod.rs:200) returns the matching **0-based** matrix index
- `edges.txt` — one line per label: `<label_name> <1-based-index>` (selects `n.txt`)
- `<n>.txt` — MatrixMarket adjacency matrix for label with index `n`

Expand Down Expand Up @@ -274,6 +280,64 @@ and the parsed query contains full IRIs sharing a common prefix.
The module handles spargebra's desugaring of sequence paths (`?x <a>/<b>/<c> ?y`)
from a chain of BGP triples back into a single path expression.

### RPQ evaluation (`src/rpq/`)

The [`rpq`](src/rpq/mod.rs) module provides an abstraction for evaluating
Regular Path Queries (RPQs) over edge-labeled graphs using GraphBLAS/LAGraph.

Key public items:

- [`Endpoint`](src/rpq/mod.rs) — `Variable(String)` or `Named(String)` (IRI string).
- [`PathExpr`](src/rpq/mod.rs) — `Label`, `Sequence`, `Alternative`, `ZeroOrMore`,
`OneOrMore`, `ZeroOrOne`.
- [`RpqQuery`](src/rpq/mod.rs) — `{ subject, path, object }` using the types above;
[`strip_base(&mut self, base)`](src/rpq/mod.rs) removes a shared IRI prefix from
named endpoints and labels.
- [`RpqEvaluator`](src/rpq/mod.rs) — trait with associated type `Result` and
[`evaluate(query, graph)`](src/rpq/mod.rs) taking `&RpqQuery` and
[`GraphDecomposition`], returning `Result<Self::Result, RpqError>`.
Each concrete evaluator exposes its own output type (see below).
- [`RpqError`](src/rpq/mod.rs) — unified error type for RPQ parsing and evaluation:
`Parse` (SPARQL syntax), `Extract` (query extraction), `UnsupportedPath`,
`VertexNotFound`, and `Graph` (wraps [`GraphError`](src/graph/mod.rs) for
label-not-found and GraphBLAS/LAGraph failures).

#### `NfaRpqEvaluator` (`src/rpq/nfarpq.rs`)

[`NfaRpqEvaluator`](src/rpq/nfarpq.rs) implements [`RpqEvaluator`] by:

1. Converting a [`PathExpr`] into an [`Nfa`](src/rpq/nfarpq.rs) via Thompson's
construction ([`Nfa::from_path_expr()`](src/rpq/nfarpq.rs)).
2. Eliminating ε-transitions via epsilon closure ([`NfaBuilder::epsilon_closure()`](src/rpq/nfarpq.rs)).
3. Building one `LAGraph_Graph` per NFA label transition
([`Nfa::build_lagraph_matrices()`](src/rpq/nfarpq.rs)).
4. Calling [`LAGraph_RegularPathQuery`] with the NFA matrices, data-graph
matrices, start/final states, and source vertices.

`type Result = NfaRpqResult` ([`GraphblasVector`] of reachable targets).

Supported path operators match [`PathExpr`] variants above. `Reverse` and
`NegatedPropertySet` from SPARQL map to [`RpqError::UnsupportedPath`] when they
appear in extracted paths.

Subject/object resolution: [`Endpoint::Variable`] means "all vertices";
[`Endpoint::Named`] resolves to a single vertex via
[`GraphDecomposition::get_node_id()`](src/graph/mod.rs:200).

[`NfaRpqResult`](src/rpq/nfarpq.rs) wraps a [`GraphblasVector`] of reachable **target**
vertices. When the subject is a variable, every vertex is used as a source and
`LAGraph_RegularPathQuery` returns the union of targets — individual `(source, target)`
pairs are not preserved.

#### `RpqMatrixEvaluator` (`src/rpq/rpqmatrix.rs`)

[`RpqMatrixEvaluator`](src/rpq/rpqmatrix.rs) compiles [`PathExpr`] into a Boolean matrix plan
over label adjacency matrices and runs [`LAGraph_RPQMatrix`]. It returns
[`RpqMatrixResult`](src/rpq/rpqmatrix.rs): the path-relation `nnz` plus a
[`GraphblasMatrix`] duplicate of the result matrix (full reachability relation for the path).
Subject/object do not filter the matrix; a named subject is only validated to exist.
Bound objects are not supported yet ([`RpqError::UnsupportedPath`]).

### FFI layer

[`lagraph_sys`](src/lagraph_sys.rs) exposes raw C bindings for GraphBLAS and
Expand All @@ -287,7 +351,7 @@ LAGraph. Safe Rust wrappers live in [`graph::mod`](src/graph/mod.rs):
- [`GraphblasMatrix`](src/graph/mod.rs) — RAII wrapper around `GrB_Matrix` (`dup` + `free` on drop).
- [`ensure_grb_init()`](src/graph/mod.rs:39) — one-time `LAGraph_Init` via `std::sync::Once`.

### Macros (`src/utils.rs`)
### Macros & helpers (`src/utils.rs`)

Two `#[macro_export]` macros handle FFI error mapping:

Expand All @@ -297,6 +361,12 @@ Two `#[macro_export]` macros handle FFI error mapping:
appending the required `*mut i8` message buffer, and maps failure to
`GraphError::LAGraph(info, msg)`.

A convenience function is also provided:

- [`build_graph(edges)`](src/utils.rs:184) — builds an `InMemoryGraph` from a
slice of `(&str, &str, &str)` triples (source, target, label). Used by
integration tests.

## Coding Conventions

- **Rust edition 2024**.
Expand All @@ -308,14 +378,13 @@ Two `#[macro_export]` macros handle FFI error mapping:
- `GraphError` converts into `RpqError` via `#[from] GraphError` on the
`RpqError::Graph` variant, enabling `?` propagation in evaluators.
- Unsafe FFI calls are confined to `lagraph_sys`, `graph/mod.rs`,
`graph/inmemory.rs`, `rpq/nfarpq.rs`, and `rpq/rpqmatrix.rs`. All raw pointers are wrapped in
`graph/inmemory.rs`, `rpq/nfarpq.rs`. All raw pointers are wrapped in
RAII types that free resources on drop.
- `unsafe impl Send + Sync` is provided for `LagraphGraph`,
`GraphblasVector`, and `GraphblasMatrix` because GraphBLAS handles are thread-safe after init.
- Unit tests live in `#[cfg(test)] mod tests` blocks inside each module.
Integration tests that need GraphBLAS live in [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs),
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs),
and [`tests/rpqmatrix_tests.rs`](tests/rpqmatrix_tests.rs).
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs).

## Testing

Expand All @@ -335,8 +404,13 @@ Tests in `src/formats/csv.rs` are pure Rust and need no native dependencies.

Tests in `src/sparql/mod.rs` are pure Rust and need no native dependencies.

Tests in `src/graph/inmemory.rs` and [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs)
call real GraphBLAS/LAGraph and require the native libraries to be present.
Tests in `src/rpq/nfarpq.rs` (NFA construction unit tests) are pure Rust and need no
native dependencies.

Tests in `src/graph/inmemory.rs`, [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs),
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs),
and [`tests/rpqmatrix_tests.rs`](tests/rpqmatrix_tests.rs) call real GraphBLAS/LAGraph and
require the native libraries to be present.

## CI

Expand Down
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ csv = "1.4.0"
libc = "0.2"
oxrdf = "0.3.3"
oxttl = "0.2.3"
rustfst = "1.2"
spargebra = "0.4.6"
thiserror = "1.0"

Expand Down
2 changes: 2 additions & 0 deletions build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ fn regenerate_bindings() {
.allowlist_function("GrB_Matrix_new")
.allowlist_function("GrB_Matrix_nvals")
.allowlist_function("GrB_Matrix_free")
.allowlist_function("GrB_Matrix_extractElement_BOOL")
.allowlist_function("GrB_Matrix_build_BOOL")
.allowlist_function("GrB_Vector_new")
.allowlist_function("GrB_Vector_free")
Expand All @@ -83,6 +84,7 @@ fn regenerate_bindings() {
.allowlist_function("LAGraph_Delete")
.allowlist_function("LAGraph_Cached_AT")
.allowlist_function("LAGraph_MMRead")
.allowlist_function("LAGraph_RegularPathQuery")
.default_enum_style(bindgen::EnumVariation::Rust {
non_exhaustive: false,
})
Expand Down
1 change: 1 addition & 0 deletions src/graph/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,7 @@ impl Drop for LagraphGraph {
unsafe impl Send for LagraphGraph {}
unsafe impl Sync for LagraphGraph {}

#[derive(Debug)]
pub struct GraphblasVector {
pub inner: GrB_Vector,
}
Expand Down
45 changes: 45 additions & 0 deletions src/lagraph_sys_generated.rs
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,9 @@ unsafe extern "C" {
ncols: GrB_Index,
) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_Matrix_dup(C: *mut GrB_Matrix, A: GrB_Matrix) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_Matrix_nvals(nvals: *mut GrB_Index, A: GrB_Matrix) -> GrB_Info;
}
Expand All @@ -168,6 +171,14 @@ unsafe extern "C" {
dup: GrB_BinaryOp,
) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_Matrix_extractElement_BOOL(
x: *mut bool,
A: GrB_Matrix,
i: GrB_Index,
j: GrB_Index,
) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_vxm(
w: GrB_Vector,
Expand Down Expand Up @@ -261,3 +272,37 @@ unsafe extern "C" {
msg: *mut ::std::os::raw::c_char,
) -> ::std::os::raw::c_int;
}
unsafe extern "C" {
pub fn LAGraph_RegularPathQuery(
reachable: *mut GrB_Vector,
R: *mut LAGraph_Graph,
nl: usize,
QS: *const GrB_Index,
nqs: usize,
QF: *const GrB_Index,
nqf: usize,
G: *mut LAGraph_Graph,
S: *const GrB_Index,
ns: usize,
msg: *mut ::std::os::raw::c_char,
) -> ::std::os::raw::c_int;
}
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum RPQMatrixOp {
RPQ_MATRIX_OP_LABEL = 0,
RPQ_MATRIX_OP_LOR = 1,
RPQ_MATRIX_OP_CONCAT = 2,
RPQ_MATRIX_OP_KLEENE = 3,
RPQ_MATRIX_OP_KLEENE_L = 4,
RPQ_MATRIX_OP_KLEENE_R = 5,
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct RPQMatrixPlan {
pub op: RPQMatrixOp,
pub lhs: *mut RPQMatrixPlan,
pub rhs: *mut RPQMatrixPlan,
pub mat: GrB_Matrix,
pub res_mat: GrB_Matrix,
}
2 changes: 2 additions & 0 deletions src/rpq/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@
//! let result: NfaRpqResult = NfaRpqEvaluator.evaluate(&query, &graph)?;
//! ```

pub mod nfarpq;

use crate::graph::{GraphDecomposition, GraphError};
use crate::sparql::ExtractError;
use spargebra::SparqlSyntaxError;
Expand Down
Loading
Loading