Merged
8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -10,9 +10,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- `BStack::process_gen` (Rust) / `bstack_process_gen` (C) (`set` + `atomic`): generator/callback-driven primitive that acquires the write lock once and holds it across a sequence of dependent reads ending in at most one mutating operation (`Write`, `Swap`, `Push`, or `Pop`), which always ends the sequence. Closes the ABA window that a `get_batched_gen` (read, release lock) + `cas` (re-acquire, compare, write) pairing would otherwise leave open for allocator-mutex-free pop-style algorithms — see `examples/atomic_linked_list.rs` / `examples/atomic_linked_list.c` for a worked free-list push/pop demonstration.
- `BStackGenOp<'a>` (Rust) / `bstack_gen_op_t` (C) (`set` + `atomic`): non-exhaustive enum (Rust) / tagged union (C) of operations yielded by `process_gen`'s closure/callback — `Read { offset, buf }`, `Len { out }`, `Write { offset, data }`, `Swap { a_offset, b_offset, len }`, `Push { data }`, and `Pop { buf }`. `Write`, `Swap`, `Push`, and `Pop` are the only mutating variants — exactly one is permitted per call, and any one of them ends the sequence immediately; `Read`/`Len` do not end the sequence. The Rust enum derives `Debug` (intentionally not `PartialEq`/`Eq`/`Hash` — see the type's doc comment).
- `BStackGenOp<'a>` (Rust) / `bstack_gen_op_t` (C) (`set` + `atomic`): non-exhaustive enum (Rust) / tagged union (C) of operations yielded by `process_gen`'s closure/callback — `Read { offset, buf }`, `Len { out }`, `Write { offset, data }`, `Swap { a_offset, b_offset, len }`, `Push { data }`, `Pop { buf }`, and `Discard { len }` (Rust; in C, a `Pop` with a `NULL` destination buffer). `Write`, `Swap`, `Push`, `Pop`, and `Discard` are the only mutating variants — exactly one is permitted per call, and any one of them ends the sequence immediately; `Read`/`Len` do not end the sequence. The Rust enum derives `Debug` (intentionally not `PartialEq`/`Eq`/`Hash` — see the type's doc comment).
- `BStackGenOp::Push { data }` / `BSTACK_GEN_PUSH` and `BStackGenOp::Pop { buf }` / `BSTACK_GEN_POP` (Rust + C, `set` + `atomic`): in-sequence equivalents of `push`/`pop` for `process_gen` — `Push` appends `data` and `Pop` removes the last `buf.len()` bytes into `buf`, growing/shrinking the payload. Like `Write` and `Swap`, exactly one of `Write`/`Swap`/`Push`/`Pop` is permitted per call and any one of them ends the sequence immediately. `Pop` errors if it would remove more than the current payload or shrink it below the locked length.
- `BStackGenOp::Len { out }` / `BSTACK_GEN_LEN` (Rust + C, `set` + `atomic`): writes the current logical payload size into `out` and, unlike the mutating variants, does not end the sequence — the in-sequence equivalent of `len`, useful when a later step's offset or length depends on the current payload size.
- `BStackGenOp::Discard { len }` (Rust) / `BSTACK_GEN_POP` with a `NULL` `u.pop.buf` (C) (`set` + `atomic`): removes the last `len` bytes from the end of the file without reading them back, shrinking the payload and ending the sequence — the in-sequence, buffer-free equivalent of `discard` and the counterpart of `Pop`. Useful for truncating a tail whose size is only known once earlier `Read`s/`Len` have resolved, without allocating a throwaway buffer. In Rust this is a dedicated variant (slices cannot be null); in C it is expressed idiomatically as a `Pop` whose destination pointer is `NULL`. Errors on the same conditions as `Pop`.

### Changed

- **`SlabBStackAllocator` and `CheckedSlabBStackAllocator` — `alloc` / `dealloc` / `realloc` are now lock-free under the `atomic` feature** (`alloc` + `set` features): The allocator-level `Mutex` that previously serialised free-list push/pop is gone from these paths. Free-list pop now drives a single `BStack::process_gen` sequence (read `free_head`, read the popped block's `next`, advance `free_head` — all under one held `BStack` write lock, closing the ABA window a `get`/`cas` pair would leave open); free-list push splices a single block or a whole freed run onto the head with one `BStack::cross_exchange`; tail grow/shrink use `BStack::try_extend_zeros` / `BStack::try_discard` (atomic check-and-act under `BStack`'s own write lock). `SlabBStackAllocator` drops its allocator-level `Mutex` entirely and is `Sync` purely through `BStack`'s interior mutability. `CheckedSlabBStackAllocator` retains a `Mutex` solely for `recover` (see below); none of `alloc` / `dealloc` / `realloc` take it. The on-disk format is unchanged — no magic-number bump.
- **`CheckedSlabBStackAllocator::recover` runs under its own mutex** (`alloc` + `set` features, `atomic`): the `Mutex` is held for the full call solely to keep recovery single-flight, preventing two concurrent runs from reclaiming the same leaked block twice. The scan itself (free-list walk, arena classification, and its one optional tail discard) runs as a single `BStack::process_gen` sequence, so the `BStack` write lock — not the `Mutex` — serialises it against the lock-free `alloc` / `dealloc` / `realloc`. Ordinary `alloc` / `dealloc` / `realloc` never take the `Mutex`.

## [0.2.4] - 2026-06-07

60 changes: 0 additions & 60 deletions PLANNED.md
@@ -180,66 +180,6 @@ The same change applies to the corresponding methods on `BStackGuardedSlice`, an

---

## Lock-free free list in `SlabBStackAllocator` and `CheckedSlabBStackAllocator`

**Feature flag:** `atomic`
**Breaking change:** No (internal implementation change only)

### Motivation

Under the `atomic` feature, `SlabBStackAllocator` guards every free-list mutation with an internal `Mutex<()>`. This mutex serialises `push_free_block`, `push_free_blocks`, and `pop_free_block` across threads, preventing concurrent alloc/dealloc from racing on `free_head`. While correct, the mutex is a point of contention: all threads allocating or deallocating single-block regions must queue behind it, even though the underlying `BStack` already provides atomic compound operations — `cross_exchange` for a lock-free push, and `get_batched_gen` + `cas` for a compare-and-swap pop — that could serve the same role without an allocator-level lock.

The goal is to remove the `Mutex<()>` entirely and replace every free-list path with sequences of BStack primitives that are safe under concurrent `&self` access.

`CheckedSlabBStackAllocator` carries the same `Mutex<()>` and the same free-list pop/push structure as `SlabBStackAllocator`, so whatever solution is adopted here applies to it by extension with no additional design work.

### Design

#### Push: lock-free prepend via `cross_exchange`

To push block `b` (at payload offset `b_addr`) onto the free list:

1. **Plant a self-pointer placeholder.** Write `b_addr` as a little-endian `u64` into the first eight bytes of `b` — i.e., call `stack.set(b_addr, b_addr.to_le_bytes())`. This seeds the slot that will become `b->next` with a safe, in-bounds value.

2. **Atomically splice `b` in as the new head.** Call `stack.cross_exchange(b_addr, FREE_HEAD_OFFSET, 8)`. `cross_exchange` swaps the eight bytes at `b_addr` with the eight bytes at `FREE_HEAD_OFFSET` under a single write lock. Before the call, the slot at `b_addr` holds `b_addr` and the slot at `FREE_HEAD_OFFSET` holds the current head `H`. After the call, `FREE_HEAD_OFFSET` holds `b_addr` (`b` is now the head) and the slot at `b_addr` holds `H` (`b`'s next pointer is the old head). The self-pointer written in step 1 is never observed by any reader that reaches `b` through the list: `cross_exchange` replaces it with `H` in the same critical section that publishes `b` as the new head.

The call to `set` in step 1 and the call to `cross_exchange` in step 2 are not jointly atomic — a crash between them leaves the self-pointer sitting in `b->next` with `free_head` still pointing to the old head. This leaks `b` rather than corrupting the list, matching the crash-safety class already documented for `push_free_block`.

Push is inherently race-free without any allocator-level mutex: even if two threads push concurrently, each `cross_exchange` is atomic with respect to the other, and each thread's block ends up correctly linked into the list (though their relative order at the head is not deterministic).
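The two push steps can be sketched against an in-memory stand-in. This is a minimal model, not the crate's implementation: `ModelStack`, its single `Mutex`, the choice of `FREE_HEAD_OFFSET = 0`, and `SENTINEL = u64::MAX` are all assumptions made for illustration, and only the names `set` and `cross_exchange` are taken from the text above.

```rust
use std::sync::Mutex;

/// Assumed payload offset of the `free_head` slot (hypothetical value).
const FREE_HEAD_OFFSET: usize = 0;
/// Assumed empty-list marker (hypothetical value).
const SENTINEL: u64 = u64::MAX;

/// In-memory stand-in for a BStack payload: one lock guards every byte.
struct ModelStack {
    bytes: Mutex<Vec<u8>>,
}

impl ModelStack {
    /// Creates a zeroed payload whose free list starts out empty.
    fn new(len: usize) -> Self {
        let s = ModelStack { bytes: Mutex::new(vec![0u8; len]) };
        s.set(FREE_HEAD_OFFSET, &SENTINEL.to_le_bytes());
        s
    }

    /// Overwrites `data.len()` bytes at `offset` under the lock.
    fn set(&self, offset: usize, data: &[u8]) {
        let mut b = self.bytes.lock().unwrap();
        b[offset..offset + data.len()].copy_from_slice(data);
    }

    /// Swaps `len` bytes at `a` with `len` bytes at `b_off` under one lock hold.
    fn cross_exchange(&self, a: usize, b_off: usize, len: usize) {
        let mut b = self.bytes.lock().unwrap();
        for i in 0..len {
            b.swap(a + i, b_off + i);
        }
    }

    /// Reads a little-endian `u64` at `offset`.
    fn read_u64(&self, offset: usize) -> u64 {
        let b = self.bytes.lock().unwrap();
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&b[offset..offset + 8]);
        u64::from_le_bytes(buf)
    }
}

/// Pushes the block at `b_addr` onto the free list in the two steps above.
fn push_free_block(stack: &ModelStack, b_addr: usize) {
    // Step 1: plant the self-pointer placeholder in the block's `next` slot.
    stack.set(b_addr, &(b_addr as u64).to_le_bytes());
    // Step 2: one swap publishes the block as head and stores the old head
    // in the block's `next` slot.
    stack.cross_exchange(b_addr, FREE_HEAD_OFFSET, 8);
}
```

Pushing the block at offset 16 and then the block at offset 32 onto an empty `ModelStack::new(64)` leaves `free_head` at 32, with block 32 pointing to block 16 and block 16 holding the sentinel.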

#### Pop: single-lock dependent sequence via `process_gen`

To pop the head block from the free list:

1. **Run the whole pop as one `process_gen` sequence.** Drive `stack.process_gen` through a small state machine:
   - *Step 0*: issue `Read { offset: FREE_HEAD_OFFSET, buf: head_buf }` to read the current head pointer.
   - *Step 1*: once `head_buf` is populated, parse `head_val`. If `head_val == SENTINEL`, the list is empty: return `None`, which ends the sequence with nothing popped, and let the caller fall through to the tail-extension branch. Otherwise remember `head_val` and issue `Read { offset: head_val, buf: next_buf }` to read that block's `next` pointer.
   - *Step 2*: issue `Write { offset: FREE_HEAD_OFFSET, data: next_buf }`, replacing the head with the next block and ending the sequence. The caller now owns `head_val`.

The crucial property is that `process_gen` acquires the BStack write lock *before* the first read and holds it, unreleased, across every subsequent read and the terminating write. The read of `free_head`, the read of its `next` pointer, and the write that advances `free_head` all happen as one indivisible critical section — not as separate lock acquisitions that another thread's operations could interleave between.
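The three-step state machine can be sketched with a simplified model of the driver. Everything here is an assumption for illustration: `ModelStack`, the owned-data `GenOp` enum (the real `BStackGenOp<'a>` borrows its buffers), the closure signature of this `process_gen`, and the constant values. The point it shows is only that the lock is taken once and held across both reads and the final write.

```rust
use std::sync::Mutex;

const FREE_HEAD_OFFSET: usize = 0; // assumed location of `free_head`
const SENTINEL: u64 = u64::MAX; // assumed empty-list marker

/// Simplified operation set: owned data instead of borrowed buffers.
enum GenOp {
    Read { offset: usize, len: usize },
    Write { offset: usize, data: Vec<u8> },
}

/// Decodes a little-endian `u64` from the first eight bytes of `b`.
fn le_u64(b: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&b[..8]);
    u64::from_le_bytes(buf)
}

struct ModelStack {
    bytes: Mutex<Vec<u8>>,
}

impl ModelStack {
    fn new(len: usize) -> Self {
        let s = ModelStack { bytes: Mutex::new(vec![0u8; len]) };
        s.write_u64(FREE_HEAD_OFFSET, SENTINEL);
        s
    }

    fn write_u64(&self, offset: usize, value: u64) {
        let mut b = self.bytes.lock().unwrap();
        b[offset..offset + 8].copy_from_slice(&value.to_le_bytes());
    }

    fn read_u64(&self, offset: usize) -> u64 {
        let b = self.bytes.lock().unwrap();
        le_u64(&b[offset..offset + 8])
    }

    /// Takes the lock once, then feeds each read result back into `next`
    /// until it returns `None` or yields a `Write`, which ends the sequence.
    fn process_gen<F>(&self, mut next: F)
    where
        F: FnMut(Option<&[u8]>) -> Option<GenOp>,
    {
        let mut bytes = self.bytes.lock().unwrap();
        let mut prev: Option<Vec<u8>> = None;
        loop {
            let op = next(prev.as_deref());
            match op {
                None => return,
                Some(GenOp::Read { offset, len }) => {
                    prev = Some(bytes[offset..offset + len].to_vec());
                }
                Some(GenOp::Write { offset, data }) => {
                    bytes[offset..offset + data.len()].copy_from_slice(&data);
                    return;
                }
            }
        }
    }
}

/// Pops the head block, or returns `None` when the list is empty.
fn pop_free_block(stack: &ModelStack) -> Option<u64> {
    let mut step = 0u8;
    let mut head_val = SENTINEL;
    let mut popped = None;
    stack.process_gen(|prev| match step {
        // Step 0: read the current head pointer.
        0 => {
            step = 1;
            Some(GenOp::Read { offset: FREE_HEAD_OFFSET, len: 8 })
        }
        // Step 1: stop on an empty list, otherwise read the head's `next`.
        1 => {
            head_val = le_u64(prev.expect("step 0 issued a read"));
            if head_val == SENTINEL {
                return None;
            }
            step = 2;
            Some(GenOp::Read { offset: head_val as usize, len: 8 })
        }
        // Step 2: advance `free_head` to `next`; the write ends the sequence.
        _ => {
            let next_ptr = prev.expect("step 1 issued a read").to_vec();
            popped = Some(head_val);
            Some(GenOp::Write { offset: FREE_HEAD_OFFSET, data: next_ptr })
        }
    });
    popped
}
```

Passing the previous read result into the closure is a simplification chosen to keep the sketch free of borrowed-buffer lifetimes; it does not reflect how the real callback receives its data.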

#### Why a CAS-based design would be unsafe, and how `process_gen` avoids it

A more "obvious" design would pair `get_batched_gen` (read `head` and `head->next` under a read lock, then release it) with `cas` (re-acquire the write lock, compare, and conditionally write `free_head`). That pairing leaves a race window between releasing the read lock and acquiring the write lock — and the ABA problem exploits exactly that window: `free_head` can return to the same byte value it held at read time even though the list structure underneath has completely changed.

**Concrete example.** Suppose the free list is `head → H0 → H1 → H2 → …`:

1. **Thread A** reads `head = H0` and `H0->next = H1`, then releases its read lock.
2. **Thread B** pops H0 (`free_head`: H0 → H1). H0 is now live.
3. **Thread B** pops H1 (`free_head`: H1 → H2). H1 is now live.
4. **Thread B** deallocates H0 — push: writes `H0->next = H2` (the current head), then sets `head = H0`. The free list is now `H0 → H2 → …`.
5. **Thread A** fires its CAS: re-reads `FREE_HEAD_OFFSET`, sees `H0` — matching what it read in step 1 — and writes `H1`. The CAS "succeeds".

`free_head` is now `H1`, but H1 is still live (allocated to Thread B in step 3): the next `alloc` hands it out a second time — a silent double allocation that no error reports. A no-retry-on-failure policy would not help here; the corruption comes from a *successful* CAS, one that cannot distinguish "head is still H0 because nothing changed" from "head is H0 again because it cycled back".

`process_gen` makes this scenario structurally impossible rather than merely improbable. Because Thread A would hold the *same* write lock continuously from its first read of `free_head` through its final write, none of Thread B's steps — pop H0, pop H1, push H0 back — could execute in between; every one of them needs that same write lock and so simply blocks until Thread A's whole sequence, including the terminating write, has completed. There is no window in which `free_head` can change value and cycle back, so the byte-value re-comparison that CAS relies on — and that ABA defeats — is unnecessary. No generation counter, no on-disk format change, and no retry policy are needed; the allocator only has to drive the read-read-write sequence through `process_gen`, which already owns the single lock acquisition that makes it atomic.
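The five-step interleaving from the concrete example can be replayed deterministically on a toy list. This is a single-threaded sketch under stated assumptions: `FreeList`, its index-based `next` table, `cas_head`, and `replay_aba` are hypothetical names invented here, and the thread interleaving is simulated by ordering the calls by hand rather than by running real threads.

```rust
const H0: usize = 0;
const H1: usize = 1;
const H2: usize = 2;
/// Hypothetical end-of-list marker for this toy model.
const NIL: usize = usize::MAX;

/// Toy free list: `head` plays the role of `free_head`, and `next[i]` is
/// the `next` pointer stored in block `i`.
struct FreeList {
    head: usize,
    next: Vec<usize>,
}

impl FreeList {
    /// Removes and returns the head block, if any.
    fn pop(&mut self) -> Option<usize> {
        if self.head == NIL {
            return None;
        }
        let h = self.head;
        self.head = self.next[h];
        Some(h)
    }

    /// Links `block` in front of the current head.
    fn push(&mut self, block: usize) {
        self.next[block] = self.head;
        self.head = block;
    }

    /// Value-only compare-and-swap on the head, as a `cas` on the
    /// `free_head` bytes would behave.
    fn cas_head(&mut self, expected: usize, new: usize) -> bool {
        if self.head == expected {
            self.head = new;
            true
        } else {
            false
        }
    }
}

/// Replays steps 1 to 5 and returns
/// `(cas_succeeded, head_after_cas, block_still_live)`.
fn replay_aba() -> (bool, usize, usize) {
    // Free list: head -> H0 -> H1 -> H2 -> NIL.
    let mut list = FreeList { head: H0, next: vec![H1, H2, NIL] };

    // 1. Thread A snapshots `head` and `head->next`, then "releases" its lock.
    let a_head = list.head;
    let a_next = list.next[a_head];

    // 2 and 3. Thread B pops H0, then H1; both are now live allocations.
    let b_first = list.pop().expect("list has H0");
    let b_second = list.pop().expect("list has H1");

    // 4. Thread B frees H0 again; the list becomes H0 -> H2 -> NIL.
    list.push(b_first);

    // 5. Thread A's CAS compares only the head value, so it succeeds.
    let cas_ok = list.cas_head(a_head, a_next);

    (cas_ok, list.head, b_second)
}
```

In this replay the CAS reports success and the head ends up equal to the block Thread B still holds, which is the silent double allocation described above. Under a single held write lock, steps 2 to 4 could not run between steps 1 and 5.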

### Open questions

- **Batch push under concurrency.** `push_free_blocks` currently reads `free_head`, builds the entire linked chain in a buffer, then writes the chain and updates `free_head` in two separate calls — not atomic with respect to concurrent pushes. A lock-free batch push requires either (a) holding the allocator mutex for batch operations only, (b) building the chain tail-to-head and using the same `cross_exchange` trick, or (c) a new BStack primitive. The single-block push via `cross_exchange` is already safe; the batch case needs its own solution before the mutex can be removed entirely.
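One possible shape for option (b) is sketched below; it is an illustration of the idea, not a statement of how the allocator resolves this question. The run is linked internally first, the last block's `next` slot is seeded with the address of the first block, and a single swap against `free_head` then publishes the whole run. `ModelStack`, `push_free_run`, and the constant values are assumptions of this sketch.

```rust
use std::sync::Mutex;

const FREE_HEAD_OFFSET: usize = 0; // assumed location of `free_head`
const SENTINEL: u64 = u64::MAX; // assumed empty-list marker

/// In-memory stand-in for a BStack payload: one lock guards every byte.
struct ModelStack {
    bytes: Mutex<Vec<u8>>,
}

impl ModelStack {
    fn new(len: usize) -> Self {
        let s = ModelStack { bytes: Mutex::new(vec![0u8; len]) };
        s.set(FREE_HEAD_OFFSET, &SENTINEL.to_le_bytes());
        s
    }

    fn set(&self, offset: usize, data: &[u8]) {
        let mut b = self.bytes.lock().unwrap();
        b[offset..offset + data.len()].copy_from_slice(data);
    }

    /// Swaps `len` bytes at `a` with `len` bytes at `b_off` under one lock hold.
    fn cross_exchange(&self, a: usize, b_off: usize, len: usize) {
        let mut b = self.bytes.lock().unwrap();
        for i in 0..len {
            b.swap(a + i, b_off + i);
        }
    }

    fn read_u64(&self, offset: usize) -> u64 {
        let b = self.bytes.lock().unwrap();
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&b[offset..offset + 8]);
        u64::from_le_bytes(buf)
    }
}

/// Splices a run of blocks onto the free list with one `cross_exchange`.
/// `blocks` lists the payload offsets in the order they should appear.
fn push_free_run(stack: &ModelStack, blocks: &[usize]) {
    let (first, last) = match (blocks.first(), blocks.last()) {
        (Some(&f), Some(&l)) => (f, l),
        _ => return, // empty run: nothing to push
    };
    // Link the run internally; none of these blocks is reachable yet.
    for pair in blocks.windows(2) {
        stack.set(pair[0], &(pair[1] as u64).to_le_bytes());
    }
    // Seed the last block's `next` slot with the address of the first block.
    stack.set(last, &(first as u64).to_le_bytes());
    // One swap: `free_head` receives `first`, and the last block's `next`
    // slot receives the old head.
    stack.cross_exchange(last, FREE_HEAD_OFFSET, 8);
}
```

For a run of length one this reduces to the single-block push: the placeholder is the block's own address and the swap behaves identically. As with the single-block case, a crash before the swap would leak the run rather than corrupt the list in this model.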

---

## Typed region and I/O parameter types

**Feature flag:** None (additive API surface)