Feat/w8 zero alloc consume by lxsaah · Pull Request #149 · aimdb-dev/aimdb

lxsaah · 2026-06-21T21:05:17Z

Summary

After 036 W1 removed the per-message Box<dyn Any> from the connector spine, exactly one AimDB-added per-message heap allocation remained on the in-process consume path: the Pin<Box<dyn Future>> that the object-erased BufferReader::recv() constructed on every call. An async fn on an erased trait isn't object-safe without boxing the future — so the box was pure deadweight imposed by the trait signature, not part of the dyn trade-off.

This PR removes it by converting the reader SPI from boxed-future async to an object-safe poll interface, restoring async fn recv().await ergonomics through a thin, allocation-free handle.

Result: 0 AimDB-added heap allocations per message on the in-process consume path (was 1), verified by the new aimdb-bench B0 suite across all three buffer profiles.

The core change

aimdb-core SPI (buffer/traits.rs):

// before — heap-allocates a future box on every call:
fn recv(&mut self) -> Pin<Box<dyn Future<Output = Result<T, DbError>> + Send + '_>>;
// after — object-safe, allocation-free:
fn poll_recv(&mut self, cx: &mut Context<'_>) -> Poll<Result<T, DbError>>;

try_recv is unchanged. The same swap applies to the remote-access JsonBufferReader (recv_json → poll_recv_json).

New consumer-facing handles (buffer/reader.rs) restore the ergonomic surface:

- buffer::Reader<T> (and buffer::JsonReader under remote-access) wrap the erased reader and expose async fn recv(), implemented once via core::future::poll_fn: core-only, no_std-clean, zero-allocation, no unsafe.
- Consumer::subscribe, TypedRecord::subscribe, and AimDb::subscribe now return Reader<T> instead of Box<dyn BufferReader<T> + Send>.
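
A minimal sketch of how such a handle can sit on top of the poll-based SPI, using only the standard library. The `QueueReader` adapter, the single-variant `DbError`, and the `block_on` helper are stand-ins invented for illustration; the real types in aimdb-core will differ.

```rust
use std::collections::VecDeque;
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for AimDB's error type (assumed shape, not the real enum).
#[derive(Debug, PartialEq)]
pub enum DbError {
    Closed,
}

/// Object-safe reader SPI: a plain poll method, so no future is boxed per call.
pub trait BufferReader<T> {
    fn poll_recv(&mut self, cx: &mut Context<'_>) -> Poll<Result<T, DbError>>;
}

/// Consumer-facing handle. The erased reader is boxed once, when the handle is built.
pub struct Reader<T> {
    inner: Box<dyn BufferReader<T> + Send>,
}

impl<T> Reader<T> {
    pub fn new(inner: Box<dyn BufferReader<T> + Send>) -> Self {
        Self { inner }
    }

    /// `poll_fn` builds the future in place, so awaiting it does not allocate.
    pub async fn recv(&mut self) -> Result<T, DbError> {
        poll_fn(|cx| self.inner.poll_recv(cx)).await
    }
}

/// Hypothetical adapter reader backed by a pre-filled queue.
pub struct QueueReader {
    pub queue: VecDeque<u32>,
}

impl BufferReader<u32> for QueueReader {
    fn poll_recv(&mut self, _cx: &mut Context<'_>) -> Poll<Result<u32, DbError>> {
        match self.queue.pop_front() {
            Some(value) => Poll::Ready(Ok(value)),
            None => Poll::Ready(Err(DbError::Closed)),
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny executor for the sketch. It spins on `Pending`, which is acceptable
/// here only because `QueueReader` never returns `Pending`.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Wraps a concrete reader once, then calls `recv().await` as a consumer would.
pub fn demo() -> (Result<u32, DbError>, Result<u32, DbError>, Result<u32, DbError>) {
    let queue = VecDeque::from([1, 2]);
    let mut reader: Reader<u32> = Reader::new(Box::new(QueueReader { queue }));
    block_on(async { (reader.recv().await, reader.recv().await, reader.recv().await) })
}
```

In this sketch the only `Box` is created in `Reader::new`, once per subscription; each `recv().await` goes through the vtable to `poll_recv` without touching the heap.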

Adapter implementations

| Adapter | Mechanism | Allocation |
| --- | --- | --- |
| Tokio (broadcast / watch) | `broadcast`/`watch` expose no public poll API, so the reader round-trips its receiver through a single reused `ReusableBoxFuture`, re-armed after each `Ready` | 1 box per subscriber lifetime, 0 per message |
| Tokio (Mailbox) | `Notify` replaced by an explicit, deduplicated waker list beside the slot; `push` drains and wakes | 0 per message |
| Embassy | Drives embassy-sync's public poll methods directly: `Subscriber::poll_next_message`, `watch::Receiver::poll_changed`, `Channel::poll_receive` | 0 per message, no new `unsafe` |
| WASM | The old `WasmRecvFuture::poll` body moves verbatim into `poll_recv` (the box existed solely to satisfy the old trait signature) | 0 per message |
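
The Mailbox row can be illustrated with a small standard-library sketch: a single slot guarded by a mutex, with a waker list stored next to it. The `Mailbox`, `State`, and `CountingWake` types below are hypothetical and are not the adapter's actual code. `Waker::will_wake` is a best-effort comparison, so the deduplication only avoids redundant entries and is not a guarantee.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

struct State<T> {
    slot: Option<T>,
    wakers: Vec<Waker>,
}

/// Hypothetical single-slot mailbox: the newest value overwrites the old one.
pub struct Mailbox<T> {
    state: Mutex<State<T>>,
}

impl<T> Mailbox<T> {
    pub fn new() -> Self {
        Self { state: Mutex::new(State { slot: None, wakers: Vec::new() }) }
    }

    /// Store the value, then drain and wake every registered waker.
    /// `drain(..)` keeps the Vec's capacity, so later registrations can reuse it.
    /// Waking under the lock keeps the sketch short; a production version
    /// might prefer to wake after releasing it.
    pub fn push(&self, value: T) {
        let mut st = self.state.lock().unwrap();
        st.slot = Some(value);
        for waker in st.wakers.drain(..) {
            waker.wake();
        }
    }

    /// Take the value if present; otherwise register the caller's waker once.
    pub fn poll_recv(&self, cx: &mut Context<'_>) -> Poll<T> {
        let mut st = self.state.lock().unwrap();
        if let Some(value) = st.slot.take() {
            return Poll::Ready(value);
        }
        // Dedup: skip the push if an equivalent waker is already queued.
        if !st.wakers.iter().any(|w| w.will_wake(cx.waker())) {
            st.wakers.push(cx.waker().clone());
        }
        Poll::Pending
    }

    /// Number of wakers currently queued (exposed for the demo only).
    pub fn waiting(&self) -> usize {
        self.state.lock().unwrap().wakers.len()
    }
}

/// Waker that counts how many times it was woken.
pub struct CountingWake(pub AtomicUsize);
impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns (first poll was pending, wake count after push, wakers left, final poll).
pub fn demo() -> (bool, usize, usize, Poll<u32>) {
    let mailbox = Mailbox::<u32>::new();
    let counter = Arc::new(CountingWake(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);

    let was_pending = mailbox.poll_recv(&mut cx).is_pending();
    let _ = mailbox.poll_recv(&mut cx); // re-poll with the same waker
    mailbox.push(7);
    let wakes = counter.0.load(Ordering::SeqCst);
    let waiting = mailbox.waiting();
    (was_pending, wakes, waiting, mailbox.poll_recv(&mut cx))
}
```

Because the waker list lives beside the slot, registering interest in this sketch costs at most one `Waker` clone per pending poll, and nothing is allocated once the list's capacity has been established.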

try_recv on the tokio broadcast/watch readers polls the reused future with Waker::noop() — Ready means a value/error is available now, Pending means empty — preserving the prior semantics. The profiling reader memoizes pending_since so a Pending wait on the producer is not counted as consumer processing time.
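
The Ready/Pending mapping described above can be sketched against a plain poll method. This is an illustration under assumptions: `QueueReader` and the `DbError` variants (`BufferEmpty`, `Closed`) are hypothetical names, and the real readers poll a reused future instead of a queue. `Waker::noop()` requires Rust 1.85 or later.

```rust
use std::collections::VecDeque;
use std::task::{Context, Poll, Waker};

/// Hypothetical error type; the variant names are assumptions.
#[derive(Debug, PartialEq)]
pub enum DbError {
    BufferEmpty,
    Closed,
}

/// Hypothetical reader: yields queued values, then stays pending until closed.
pub struct QueueReader {
    pub queue: VecDeque<u32>,
    pub closed: bool,
}

impl QueueReader {
    pub fn poll_recv(&mut self, _cx: &mut Context<'_>) -> Poll<Result<u32, DbError>> {
        match self.queue.pop_front() {
            Some(value) => Poll::Ready(Ok(value)),
            None if self.closed => Poll::Ready(Err(DbError::Closed)),
            None => Poll::Pending,
        }
    }

    /// Non-blocking receive built on `poll_recv`: poll exactly once with a
    /// waker that does nothing, since nobody needs to be woken later.
    pub fn try_recv(&mut self) -> Result<u32, DbError> {
        let mut cx = Context::from_waker(Waker::noop());
        match self.poll_recv(&mut cx) {
            // Ready: a value or an error is available right now.
            Poll::Ready(result) => result,
            // Pending: nothing buffered, reported as "empty".
            Poll::Pending => Err(DbError::BufferEmpty),
        }
    }
}

/// Drains one value, observes the empty state, then observes the closed state.
pub fn demo() -> Vec<Result<u32, DbError>> {
    let mut reader = QueueReader { queue: VecDeque::from([5]), closed: false };
    let first = reader.try_recv();
    let second = reader.try_recv();
    reader.closed = true;
    let third = reader.try_recv();
    vec![first, second, third]
}
```

`Waker::noop()` returns a `&'static Waker`, so building the `Context` in this sketch does not allocate.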

The embassy poll methods are small, additive public wrappers added to the vendored embassy-sync (Channel::poll_receive is already public; this gives pubsub/watch the matching method). This replaces the initial W8 cut's hand-rolled no_std ReusableBoxFuture, deleting ~80 lines of raw-pointer unsafe.

Benchmarking infrastructure (design 038)

New host-only aimdb-bench crate (excluded from default-members):

- B0: allocation count (headline + gate). A counting #[global_allocator] in dedicated bench binaries measures allocs/message. Committed baselines show 0.0 allocs/msg across all three tokio profiles. CI wiring is advisory/report-only for now; a hard gate is a documented follow-up (038 §6).
- B1/B2: latency & throughput for the tokio and embassy-host adapters (trend-only on shared CI runners).
- B3: on-target cycles via the new examples/embassy-bench-stm32h5 (STM32H5, embassy runtime).
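
For readers unfamiliar with the B0 technique, here is a rough sketch of a counting global allocator. The name `CountingAllocator` appears in the commit messages, but everything else here (the thread-local counter, `count_allocs`, `allocs_per_message`) is an assumption made for illustration and not the aimdb-bench implementation.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::hint::black_box;

thread_local! {
    // Per-thread counter, so allocations on other threads do not pollute a measurement.
    static ALLOCS: Cell<usize> = const { Cell::new(0) };
}

/// Forwards to the system allocator and counts every allocation request.
pub struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // `try_with` avoids a panic if the thread-local is already torn down.
        let _ = ALLOCS.try_with(|count| count.set(count.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Runs `f` and returns its result plus the allocations it made on this thread.
pub fn count_allocs<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCS.with(|count| count.get());
    let out = f();
    let after = ALLOCS.with(|count| count.get());
    (out, after - before)
}

/// Average allocations per message over `messages` calls to `consume`.
pub fn allocs_per_message(messages: usize, mut consume: impl FnMut(usize)) -> f64 {
    let ((), total) = count_allocs(|| {
        for i in 0..messages {
            consume(i);
        }
    });
    total as f64 / messages as f64
}

/// Returns (allocations for one Vec, allocs/message for an allocation-free loop).
pub fn demo() -> (usize, f64) {
    let (_buf, one_vec) = count_allocs(|| black_box(vec![0u8; 64]));
    let mut sum = 0usize;
    let per_msg = allocs_per_message(1_000, |i| sum = black_box(sum.wrapping_add(i)));
    black_box(sum);
    (one_vec, per_msg)
}
```

A harness along these lines would warm up the path under test first, so that one-time setup (such as the per-subscriber box) is excluded from the per-message figure.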

Breaking changes

- SPI break (adapter authors only): BufferReader::recv → poll_recv; JsonBufferReader::recv_json → poll_recv_json. Object safety is preserved.
- Subscribe return type: Box<dyn BufferReader<T> + Send> → Reader<T>.
- Source-compatible for consumers: subscribe().recv().await is unchanged at every call site, so examples and aimdb-pro compile without edits. Holders of a concrete adapter reader wrap it once: Reader::new(Box::new(reader)).
- Connector SPI unchanged (BYOC-stable): SerializedReader::recv keeps its boxed RecvSerializedFuture; only the inner per-message box is eliminated.

Full inventory in aimdb-core/CHANGELOG.md.

Verification

| Check | Result |
| --- | --- |
| `make all`: full matrix of build + tests + clippy `-D warnings` + fmt, incl. wasm32 & thumbv7em cross-compiles | ✅ pass |
| aimdb-core (std+metrics, std+profiling, no_std+alloc+remote-access, `remote::`) | ✅ pass |
| tokio adapter buffer unit tests (poll_recv/try_recv × broadcast/watch/mailbox, lag, interleaved drain, multi-reader) | ✅ 51/51 |
| embassy adapter (host unit + doctests) | ✅ 13/13 + doctests |
| aimdb-bench `--benches` build · fmt-check (5 changed crates) | ✅ pass · clean |
| B0 allocations/message (committed baseline) | ✅ 0.0 ×3 profiles |

⚠️ Merge gate — embassy submodule

The _external/embassy submodule is pinned to a fork commit that adds the public poll_next_message / poll_changed wrappers, which the embassy adapter depends on (the workspace compiles embassy-sync from the submodule path). The corresponding upstream embassy PR is pending.

Until it merges, the submodule points at a fork branch rather than upstream, so CI on this PR will not pass the submodule checkout. Merge sequence:

1. Land the upstream embassy-sync poll-methods PR.
2. Repoint _external/embassy to the merged upstream SHA (and .gitmodules back to embassy-rs/embassy).
3. Merge this PR.

This is the single hard gate on merge; all code, tests, docs, and downstream compatibility are otherwise ready.

- Introduced `aimdb-bench` crate for benchmarking AimDB with various profiles.
- Implemented allocation counting benchmarks (B0) using `CountingAllocator`.
- Added latency benchmarks (B1) to measure push-to-receive latency.
- Developed throughput benchmarks (B2) for steady-state performance.
- Created pipeline benchmarks for both allocation (B0-Pipeline) and runner-driven throughput (B-Runner-Pipeline).
- Established workload profiles for telemetry, state, and command messages.
- Results are serialized to JSON for easy analysis and comparison.


… SPI
- Replace async recv method with poll_recv for object safety and zero allocations.
- Remove WasmRecvFuture and its associated heap allocation, simplifying the reader implementation.
- Update documentation to reflect the new design and performance metrics.
- Ensure compatibility with existing consumer-facing API by maintaining async recv method.
- Introduce measurement program to validate allocation and performance improvements across different buffer profiles.


- Implemented `b2_throughput_embassy.rs` to measure steady-state throughput using the Embassy buffer backend.
- Added baseline data for allocation metrics in `b0_alloc_embassy.json`.
- Created cycle profiling baseline for STM32H563ZI in `b3_cycles_stm32h5.json`.
- Updated `lib.rs` to include new Embassy benchmarks and profiles.
- Introduced `profiles_embassy.rs` for buffer constructors tailored for the Embassy adapter.
- Set up STM32H563ZI example with necessary configurations and dependencies in `Cargo.toml`.
- Added README documentation for the STM32H563ZI example, detailing usage and results.
- Implemented a build script for linking and memory configuration in `build.rs`.
- Created a flash script for easy deployment to the STM32H563ZI board.
- Established a Rust toolchain file for consistent development environment.
- Developed main benchmarking logic in `main.rs` to measure cycles and allocations for various buffer profiles.


- Updated comments and documentation in `b_alloc_pipeline.rs` to clarify the measurement scope and execution details.
- Improved clarity in `b_runner_pipeline.rs` regarding the benchmarking process and its scope.
- Simplified and clarified the `alloc.rs` documentation, emphasizing the isolation of the counting allocator from production code.
- Revised `lib.rs` documentation to highlight the non-production nature of the benchmarking infrastructure.
- Enhanced comments in `profiles_embassy.rs` to better explain the behavior of Embassy buffers and the importance of lazy subscriber registration.
- Added a new design document `038-aimdb-bench-crate-design.md` outlining the structured benchmarking infrastructure for AimDB.
- Introduced `039-proof-artifact-and-story-roadmap.md` to detail the sequencing of proof artifacts and story publication.


lxsaah added 14 commits June 16, 2026 20:13

feat(bench): update output directory for benchmark results and add baseline data for b0_alloc_tokio

aab1cd6

feat(bench): add README for benchmarking infrastructure and usage instructions

5b73b01

style: format async calls for better readability in benchmark functions

631b7e7

fix: add missing license field in Cargo.toml

de94dd6

feat(bench): enhance Makefile and README for benchmarking infrastructure, update output paths for results

319ef3f

feat(buffer): refactor EmbassyBufferReader to eliminate per-message allocations

ce10d91

feat(profiling): add pending_since to track consumer processing time in ProfilingBufferReader

f53e979

feat(bench): update baseline references in benchmark scripts to 'main'

34b8b5c

feat(bench): consolidate latency and throughput benchmarks for Tokio and Embassy adapters

5d9630c

feat(buffer): add lazy subscriber creation explanation for SpmcRing

f82e81c

lxsaah self-assigned this Jun 21, 2026

lxsaah added 3 commits June 21, 2026 21:13

chore: remove outdated design documents for aimdb-bench and proof artifacts

871e0e3

docs: update comments for clarity on buffer types and Tokio primitives

2b54974

Merge branch 'main' into feat/w8-zero-alloc-consume

f22344f

lxsaah wants to merge 17 commits into main from feat/w8-zero-alloc-consume
