A hierarchical key-value database with MVCC concurrency, WAL durability, and schema layer access patterns. Inspired by MUMPS multi-dimensional subscripts and ForestDB's copy-on-write HB+trie architecture.
- Hierarchical key paths — Store and retrieve values at paths like
users/alice/name - MVCC — Lock-free concurrent reads with snapshot isolation
- WAL durability — Thread-local write-ahead log with configurable sync modes
- Schema Layers — GraphQL and Graph access today, SQL roadmap
- Graph Layer — Triple-store with Gremlin-style traversal, schema-driven indexing, and cost-based query optimization
- Language bindings — C, Node.js, Dart
- High throughput — Sharded LRU cache with memory pooling, inline key comparison, MVCC fast-path
#include "Database/database.h"
#include "Database/database_config.h"
int main() {
int error = 0;
// Create with default config (in-memory, ASYNC WAL)
database_config_t* config = database_config_default();
database_t* db = database_create_with_config("/tmp/mydb", config, &error);
database_config_destroy(config);
if (!db) { fprintf(stderr, "Error: %d\n", error); return 1; }
// Write
path_t* key = path_create_from_string("/", "users/alice/name", 0);
identifier_t* val = identifier_create_from_string("Alice", 0);
database_put_sync(db, key, val);
path_destroy(key);
identifier_destroy(val);
// Read
path_t* key2 = path_create_from_string("/", "users/alice/name", 0);
identifier_t* result = NULL;
int rc = database_get_sync(db, key2, &result);
if (rc == 0 && result) {
// use result...
identifier_destroy(result);
}
path_destroy(key2);
// Batch
batch_t* batch = batch_create(0);
path_t* p1 = path_create_from_string("/", "users/bob/age", 0);
identifier_t* v1 = identifier_create_from_string("30", 0);
batch_add_put(batch, p1, v1); // ownership transfers on success
database_write_batch_sync(db, batch);
batch_destroy(batch);
// Iterate
path_t* start = path_create_from_string("/", "users/alice", 0);
path_t* end = path_create_from_string("/", "users/alice/~", 0);
database_iterator_t* it = database_scan_start(db, start, end);
path_t* out_path = NULL;
identifier_t* out_val = NULL;
while (database_scan_next(it, &out_path, &out_val) == 0) {
// process out_path, out_val
path_destroy(out_path);
identifier_destroy(out_val);
}
database_scan_end(it);
path_destroy(start);
path_destroy(end);
database_destroy(db);
return 0;
}WaveDB stores values at hierarchical key paths using an HBTrie — a B+tree where each level is itself a B+tree. A path like ["users", "alice", "name"] traverses three levels of B+trees, one per path component.
root bnode
/ | \
"users" "posts" ...
/ | \ |
bnode ... bnode ...
/ \ |
"alice" "bob" "title"
| | |
bnode bnode value
/ | \ |
"name" "age" ...
| |
value value
Concurrency: MVCC (Multi-Version Concurrency Control) provides snapshot isolation — readers never block writers and never see partial updates. Each write creates a versioned entry; reads see the latest committed version at their transaction ID.
Durability: Thread-local WAL files eliminate write contention. Three sync modes trade durability for throughput: IMMEDIATE (fsync every write), DEBOUNCED (batched fsync every 250ms, recommended), ASYNC (buffer flushed to kernel on 250ms idle timer, survives process crash but not power failure).
Schema Layers: Access the same data through different query paradigms. The GraphQL layer maps type definitions to hierarchical paths and resolves queries with scan plans. The Graph layer provides a triple-store with Gremlin-style traversal and cost-based optimization. Multiple layers can coexist on the same database using subtrees for namespace isolation.
database_config_t* config = database_config_default();
// Mutable — can change when reopening
config->lru_memory_mb = 100; // LRU cache size (default: 50 MB)
config->lru_shards = 64; // LRU shards (default: 64, 0 = auto-scale)
config->wal_config.sync_mode = WAL_SYNC_DEBOUNCED; // WAL mode
config->wal_config.debounce_ms = 250; // fsync window (default: 250ms)
config->worker_threads = 4; // Worker pool size (default: 4)
// Immutable — set at creation, persisted
config->chunk_size = 4; // HBTrie chunk size (default: 4)
config->btree_node_size = 4096; // B+tree node size (default: 4096)
config->enable_persist = 1; // 0 = in-memory, 1 = persistent (default: 1)
database_t* db = database_create_with_config("/path/to/db", config, &error);
database_config_destroy(config);Config is persisted as CBOR at <db_path>/.config and automatically loaded on reopen. database_config_merge reconciles saved and passed configs — immutable settings always use the on-disk values.
| Mode | Behavior | Durability | Performance |
|---|---|---|---|
WAL_SYNC_IMMEDIATE |
fsync after every write | Highest | ~1K ops/sec |
WAL_SYNC_DEBOUNCED |
batched fsync (default 250ms) | High | ~300K ops/sec |
WAL_SYNC_ASYNC |
buffered write, idle drain every 250ms | Process crash only | ~400K ops/sec |
| Operation | Async (promise) | Sync |
|---|---|---|
| Put | database_put(db, path, val, promise) |
database_put_sync(db, path, val) |
| Get | database_get(db, path, promise) |
database_get_sync(db, path, &result) |
| Delete | database_delete(db, path, promise) |
database_delete_sync(db, path) |
| Batch | database_write_batch(db, batch, promise) |
database_write_batch_sync(db, batch) |
| Increment | — | database_increment_sync(db, path, delta) |
| Snapshot | — | database_snapshot(db) |
WaveDB supports AES-256-GCM encryption at rest for both WAL files and persisted pages. Two key modes are available:
| Mode | Key | Use case |
|---|---|---|
ENCRYPTION_SYMMETRIC |
32-byte user-supplied key | Simple setups, embedded devices |
ENCRYPTION_ASYMMETRIC |
RSA key pair (DER-encoded) | Key rotation, write-only nodes |
Encryption is set at database creation and cannot be changed on an existing database. Key material is never persisted — only a salt and verification check are stored, allowing key verification on re-open.
#include "Database/database.h"
#include "Database/database_config.h"
#include "Storage/encryption.h"
uint8_t key[32] = { /* your 32-byte AES-256 key */ };
encrypted_database_config_t* config = encrypted_database_config_default();
encrypted_database_config_set_type(config, ENCRYPTION_SYMMETRIC);
encrypted_database_config_set_symmetric_key(config, key, sizeof(key));
database_config_set_enable_persist(&config->config, 1);
int error = 0;
database_t* db = database_create_encrypted("/path/to/db", config, &error);
if (db == NULL) {
// error == DATABASE_ERR_ENCRYPTION_KEY_INVALID if key is wrong
// error == DATABASE_ERR_ENCRYPTION_REQUIRED if DB exists but is unencrypted
fprintf(stderr, "Error: %d\n", error);
return 1;
}
encrypted_database_config_destroy(config);
// Use normally — encryption is transparent
database_put_sync_raw(db, "users/alice/name", 16, '\037',
(const uint8_t*)"Alice", 5);
// Reopen with the same key
encrypted_database_config_t* config2 = encrypted_database_config_default();
encrypted_database_config_set_type(config2, ENCRYPTION_SYMMETRIC);
encrypted_database_config_set_symmetric_key(config2, key, sizeof(key));
database_config_set_enable_persist(&config2->config, 1);
database_t* db2 = database_create_encrypted("/path/to/db", config2, &error);
encrypted_database_config_destroy(config2);
database_destroy(db);// Load DER-encoded keys (generated externally, e.g. OpenSSL)
uint8_t private_key_der[1217] = { /* DER RSA private key */ };
uint8_t public_key_der[294] = { /* DER RSA public key */ };
encrypted_database_config_t* config = encrypted_database_config_default();
encrypted_database_config_set_type(config, ENCRYPTION_ASYMMETRIC);
encrypted_database_config_set_asymmetric_private_key(
config, private_key_der, sizeof(private_key_der));
encrypted_database_config_set_asymmetric_public_key(
config, public_key_der, sizeof(public_key_der));
database_config_set_enable_persist(&config->config, 1);
int error = 0;
database_t* db = database_create_encrypted("/path/to/db", config, &error);
encrypted_database_config_destroy(config);In asymmetric mode, a random data encryption key (DEK) is generated per session and wrapped with the RSA public key. The private key is required for decryption; a write-only node can omit it.
| Code | Meaning |
|---|---|
DATABASE_ERR_ENCRYPTION_REQUIRED |
Database is encrypted but no key was provided |
DATABASE_ERR_ENCRYPTION_KEY_INVALID |
Wrong key or key verification failed |
DATABASE_ERR_ENCRYPTION_UNSUPPORTED |
Cannot add encryption to an existing unencrypted DB |
The GraphQL layer maps SDL type definitions to hierarchical paths and resolves queries with scan plans.
#include "Layers/graphql/graphql.h"
graphql_layer_t* layer = graphql_layer_create("/path/to/db", NULL);
// Register schema
const char* sdl =
"type User { name: String age: Int }"
"type Query { user(id: ID!): User }";
graphql_schema_parse(layer, sdl, NULL);
// Query
graphql_result_t* result = graphql_query_sync(layer, "{ user(id: \"alice\") { name } }");
// result contains { "data": { "user": { "name": "Alice" } } }
graphql_result_destroy(result);
graphql_layer_destroy(layer);Query plans are compiled from SDL + query and executed as scan, get, or batch-get operations against the database. Custom resolvers can be registered for computed fields.
The Graph layer provides a triple-store (subject-predicate-object) with Gremlin-inspired traversal, schema-driven automatic indexing, and cost-based query optimization.
#include "Layers/graph/graph.h"
graph_layer_t* layer = graph_layer_create("/path/to/db", NULL);
// Define schema with index hints
graph_schema_parse(layer,
"type Clip @index(spo, pos) {"
" tagged_with: [Tag];"
" name: String @index(pos);"
"}", NULL);
// Insert triples
graph_insert_sync(layer, "clip_abc", "tagged_with", "gaming");
graph_insert_sync(layer, "clip_abc", "name", "My Clip");
graph_insert_sync(layer, "clip_xyz", "tagged_with", "gaming");
// Query with Gremlin-style DSL
graph_result_t* r1 = graph_parse_execute(
"g.V(\"clip_abc\").Out(\"tagged_with\")", layer, NULL);
// r1 → ["gaming"]
// Intersection
graph_result_t* r2 = graph_parse_execute(
"g.V(\"gaming\").In(\"tagged_with\")"
".And(g.V(\"clip_abc\").Out(\"tagged_with\"))", layer, NULL);
// Range predicates
graph_result_t* r3 = graph_parse_execute(
"g.Has(\"age\", >=, \"25\").Out(\"follows\")", layer, NULL);
// Morphisms (reusable query fragments)
graph_morphism_define(layer, "friends_content",
"g.Morphism(\"friends_content\").Out(\"follows\").Out(\"likes\")", NULL);
graph_result_t* r4 = graph_parse_execute(
"g.V(\"alice\").Follow(\"friends_content\")", layer, NULL);
graph_result_destroy(r1);
graph_result_destroy(r2);
graph_result_destroy(r3);
graph_result_destroy(r4);
graph_layer_destroy(layer);The optimizer automatically reorders Has filters by selectivity and sorts Intersect children by estimated cardinality, using statistics computed from PSO index scans. See docs/graph-db-schema-layer.md for the full design.
Subtrees provide scoped views into a database with automatic prefix isolation. This enables multiple schema layers to coexist in different namespaces on the same database.
#include "Database/database_subtree.h"
// Open subtrees with different prefixes
database_subtree_t* gql_st = database_subtree_open(db, "graphql", '/');
database_subtree_t* graph_st = database_subtree_open(db, "graph", '/');
// Operations are scoped — keys are automatically prefixed
database_subtree_put_sync_raw(gql_st, "Users/1/name", 12, '/',
(const uint8_t*)"Alice", 5);
database_subtree_put_sync_raw(graph_st, "triples/s/p/o", 14, '/',
(const uint8_t*)"data", 4);
// Each subtree only sees its own data
size_t gql_count = database_subtree_count(gql_st); // 1
size_t graph_count = database_subtree_count(graph_st); // 1
// Delete all data under a prefix
database_subtree_delete_prefix(db, "graphql", '/');
// Close subtrees (does not destroy the database)
database_subtree_close(gql_st);
database_subtree_close(graph_st);Create multiple layers on the same database by passing a subtree:
database_subtree_t* gql_sub = database_subtree_open(db, "layer/graphql", '/');
database_subtree_t* graph_sub = database_subtree_open(db, "layer/graph", '/');
int error_code = 0;
graphql_layer_t* gql = graphql_layer_create(db_path, gql_config, gql_sub, &error_code);
graph_layer_t* graph = graph_layer_create(db_path, graph_config, graph_sub, &error_code);
// Both layers operate independently on the same databaseWithout a subtree, layer creation fails with error_code = -3 if a different layer type already owns the database.
npm install wavedbconst { WaveDB } = require('wavedb');
const db = new WaveDB('/path/to/db', {
delimiter: '/',
lruMemoryMb: 50,
wal: { syncMode: 'debounced' }
});
// Async (Promise)
await db.put('users/alice/name', 'Alice');
const name = await db.get('users/alice/name');
// Sync (blocking)
db.putSync('users/bob/name', 'Bob');
const name2 = db.getSync('users/bob/name');
// Callback style
db.putCb('users/charlie/name', 'Charlie', (err) => { /* ... */ });
db.getCb('users/charlie/name', (err, val) => { /* ... */ });
// Object operations (nested data → flattened paths)
await db.putObject('users/alice', { name: 'Alice', age: 30 });
const user = await db.getObject('users/alice'); // { name: 'Alice', age: 30 }
// Batch
await db.batch([
{ type: 'put', path: 'users/alice/name', value: 'Alice' },
{ type: 'del', path: 'users/bob/name' }
]);
// Iterator
const iter = new Iterator(db, { start: 'users/alice', end: 'users/alice~' });
let entry;
while ((entry = iter.read())) {
console.log(entry.key, entry.value);
}
iter.end();
// Graph Layer
const { GraphLayer, g } = require('wavedb');
const graph = new GraphLayer();
graph.parseSchema('type Clip @index(spo, pos) { tagged_with: [Tag]; }');
graph.insertSync('clip_abc', 'tagged_with', 'gaming');
const tags = g.V('clip_abc').Out('tagged_with').All(); // ['gaming']
graph.close();
// Subtrees — namespace isolation for multiple layers on one database
const gqlSub = db.openSubtree('graphql');
const graphSub = db.openSubtree('graph');
gqlSub.putSync('Users/1/name', 'Alice');
graphSub.putSync('triples/s/p/o', 'data');
console.log(gqlSub.count()); // 1
console.log(graphSub.count()); // 1
gqlSub.close();
graphSub.close();
// Create layers with subtrees on the same database
const { GraphQLLayer } = require('wavedb');
const gqlLayer = new GraphQLLayer(dbPath, { subtree: gqlSub });
const graphLayer = new GraphLayer(dbPath, { subtree: graphSub });See bindings/nodejs/README.md for full API reference.
# pubspec.yaml
dependencies:
wavedb: ^0.1.0import 'package:wavedb/wavedb.dart';
final db = WaveDB('/path/to/db', delimiter: '/');
// Sync
db.putSync('users/alice/name', 'Alice');
final name = db.getSync('users/alice/name');
// Async
await db.put('users/bob/name', 'Bob');
final name2 = await db.get('users/bob/name');
// Object operations
await db.putObject('users/alice', { 'name': 'Alice', 'age': 30 });
final user = await db.getObject('users/alice');
// Streaming
db.createReadStream(start: 'users/alice', end: 'users/alice~').listen((kv) {
print('${kv.key}: ${kv.value}');
});
// GraphQL
final layer = GraphQLLayer();
layer.create(GraphQLLayerConfig(path: '/path/to/db'));
layer.parseSchema('type User { name: String } type Query { user(id: ID!): User }');
final result = layer.querySync('{ user(id: "alice") { name } }');
// Graph Layer
final graph = GraphLayer();
graph.parseSchema('type Clip @index(spo, pos) { tagged_with: [Tag]; }');
graph.insertSync('clip_abc', 'tagged_with', 'gaming');
final clips = g().V('clip_abc').Out('tagged_with').All(); // ['gaming']
// Subtrees — namespace isolation for multiple layers on one database
final gqlSub = db.openSubtree('graphql');
final graphSub = db.openSubtree('graph');
gqlSub.putSync('Users/1/name', 'Alice');
graphSub.putSync('triples/s/p/o', 'data');
print(gqlSub.count()); // 1
print(graphSub.count()); // 1
gqlSub.close();
graphSub.close();
// Create layers with subtrees on the same database
final gqlLayer = GraphQLLayer()..create(GraphQLLayerConfig(path: dbPath, subtree: gqlSub));
final graphLayer = GraphLayer(subtree: graphSub);
db.close();See bindings/dart/README.md for full API reference.
Benchmarks on AMD Ryzen 9700X (8-core Zen 5), Windows 11, 50MB LRU cache, ASYNC WAL mode, GCC 15.1 -O3. 5-run averages, CV <6% unless noted.
| Operation | Throughput | P50 Latency | P99 Latency |
|---|---|---|---|
| Get | 2.61M ops/sec | 474 ns | 490 ns |
| Put | 498K ops/sec | 2.02 µs | 4.80 µs |
| Delete | 183K ops/sec | 3.52 µs | 7.33 µs |
| Mixed (70% read) | 2.94M ops/sec | 459 ns | 490 ns |
| Threads | Write | Read | Mixed (70R/20W/10D) |
|---|---|---|---|
| 1 | 275K ops/sec | 1.48M ops/sec | 312K ops/sec |
| 2 | 484K ops/sec | 2.48M ops/sec | 535K ops/sec |
| 4 | 647K ops/sec | 4.59M ops/sec | 709K ops/sec |
| 8 | 811K ops/sec | 4.47M ops/sec | 873K ops/sec |
| 16 | 949K ops/sec | 2.85M ops/sec | 982K ops/sec |
| 32 | 1.02M ops/sec | 2.88M ops/sec | 1.06M ops/sec |
git clone --recursive https://github.com/vijayee/WaveDB.git
cd WaveDB
mkdir build && cd build
cmake ..
make -j$(nproc)
# Run tests
ctest --output-on-failure
# Build shared library for Dart/FFI bindings
cmake -DBUILD_DART_BINDINGS=ON ..
make -j$(nproc)cmake -DCMAKE_INSTALL_PREFIX=/usr/local ..
make
sudo make installInstalls headers, static library (libwavedb.a), and CMake config.
# After install
find_package(WaveDB REQUIRED)
target_link_libraries(myapp WaveDB::wavedb)
# Or embedded
add_subdirectory(path/to/WaveDB)
target_link_libraries(myapp WaveDB::wavedb)Bundled — no external dependencies required:
- xxhash — fast hashing
- hashmap — hash map implementation
- libcbor — CBOR serialization
- OpenSSL — AES-256-GCM encryption (required for encrypted databases)
- P2P Database Replication — Distributed multi-master replication with conflict resolution
- SQL Schema Layer — SQL query interface over hierarchical data
MIT