Feat/w8 zero alloc consume #149
Draft
lxsaah wants to merge 17 commits into
- Introduced `aimdb-bench` crate for benchmarking AimDB with various profiles.
- Implemented allocation counting benchmarks (B0) using `CountingAllocator`.
- Added latency benchmarks (B1) to measure push-to-receive latency.
- Developed throughput benchmarks (B2) for steady-state performance.
- Created pipeline benchmarks for both allocation (B0-Pipeline) and runner-driven throughput (B-Runner-Pipeline).
- Established workload profiles for telemetry, state, and command messages.
- Results are serialized to JSON for easy analysis and comparison.
…seline data for b0_alloc_tokio
…ure, update output paths for results
… SPI
- Replace async recv method with poll_recv for object safety and zero allocations.
- Remove WasmRecvFuture and its associated heap allocation, simplifying the reader implementation.
- Update documentation to reflect the new design and performance metrics.
- Ensure compatibility with existing consumer-facing API by maintaining async recv method.
- Introduce measurement program to validate allocation and performance improvements across different buffer profiles.
…in ProfilingBufferReader
- Implemented `b2_throughput_embassy.rs` to measure steady-state throughput using the Embassy buffer backend.
- Added baseline data for allocation metrics in `b0_alloc_embassy.json`.
- Created cycle profiling baseline for STM32H563ZI in `b3_cycles_stm32h5.json`.
- Updated `lib.rs` to include new Embassy benchmarks and profiles.
- Introduced `profiles_embassy.rs` for buffer constructors tailored for the Embassy adapter.
- Set up STM32H563ZI example with necessary configurations and dependencies in `Cargo.toml`.
- Added README documentation for the STM32H563ZI example, detailing usage and results.
- Implemented a build script for linking and memory configuration in `build.rs`.
- Created a flash script for easy deployment to the STM32H563ZI board.
- Established a Rust toolchain file for consistent development environment.
- Developed main benchmarking logic in `main.rs` to measure cycles and allocations for various buffer profiles.
…and Embassy adapters
- Updated comments and documentation in `b_alloc_pipeline.rs` to clarify the measurement scope and execution details.
- Improved clarity in `b_runner_pipeline.rs` regarding the benchmarking process and its scope.
- Simplified and clarified the `alloc.rs` documentation, emphasizing the isolation of the counting allocator from production code.
- Revised `lib.rs` documentation to highlight the non-production nature of the benchmarking infrastructure.
- Enhanced comments in `profiles_embassy.rs` to better explain the behavior of Embassy buffers and the importance of lazy subscriber registration.
- Added a new design document `038-aimdb-bench-crate-design.md` outlining the structured benchmarking infrastructure for AimDB.
- Introduced `039-proof-artifact-and-story-roadmap.md` to detail the sequencing of proof artifacts and story publication.
Summary
After 036 W1 removed the per-message `Box<dyn Any>` from the connector spine, exactly one AimDB-added per-message heap allocation remained on the in-process consume path: the `Pin<Box<dyn Future>>` that the object-erased `BufferReader::recv()` constructed on every call. An `async fn` on an erased trait isn't object-safe without boxing the future, so the box was pure deadweight imposed by the trait signature, not part of the `dyn` trade-off.

This PR removes it by converting the reader SPI from boxed-future `async` to an object-safe poll interface, restoring `async fn recv().await` ergonomics through a thin, allocation-free handle.

Result: 0 AimDB-added heap allocations per message on the in-process consume path (was 1), verified by the new `aimdb-bench` B0 suite across all three buffer profiles.

The core change
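As a rough illustration of the shape of this change, here is a minimal, std-only sketch. It is not the code from `aimdb-core`: `PollReader`, `QueueReader`, and `block_on` are hypothetical stand-ins invented for the example, and the `Option<T>` return type is a simplification. It shows an object-safe trait with a poll method, plus a thin handle that recovers `recv().await` through `poll_fn` without boxing a future per call.

```rust
use std::collections::VecDeque;
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the erased reader SPI: a poll method instead of
/// an `async fn`, so the trait is object-safe without a boxed future.
trait PollReader<T> {
    fn poll_recv(&mut self, cx: &mut Context<'_>) -> Poll<Option<T>>;
}

/// Thin handle that restores `recv().await` on top of the poll method.
/// The only `Box` is created once, when the handle is constructed.
struct Reader<T> {
    inner: Box<dyn PollReader<T> + Send>,
}

impl<T> Reader<T> {
    fn new(inner: Box<dyn PollReader<T> + Send>) -> Self {
        Self { inner }
    }

    /// The `poll_fn` future lives inline in this async fn's state.
    async fn recv(&mut self) -> Option<T> {
        poll_fn(|cx| self.inner.poll_recv(cx)).await
    }
}

/// Toy source (assumption): reports `Pending` once before each item so the
/// example exercises the re-poll path, then yields the next queued value.
struct QueueReader {
    items: VecDeque<u32>,
    pending_next: bool,
}

impl PollReader<u32> for QueueReader {
    fn poll_recv(&mut self, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.pending_next {
            self.pending_next = false;
            cx.waker().wake_by_ref(); // ask to be polled again
            return Poll::Pending;
        }
        self.pending_next = true;
        Poll::Ready(self.items.pop_front())
    }
}

/// Waker that does nothing; enough for the busy-polling executor below.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

/// Wrap a concrete reader once, then drain it through the async surface.
fn drain_demo() -> Vec<u32> {
    let source = QueueReader { items: VecDeque::from([1, 2, 3]), pending_next: true };
    let mut reader: Reader<u32> = Reader::new(Box::new(source));
    block_on(async move {
        let mut out = Vec::new();
        while let Some(v) = reader.recv().await {
            out.push(v);
        }
        out
    })
}

fn main() {
    println!("{:?}", drain_demo());
}
```

In this sketch the call site keeps the familiar `reader.recv().await` form, while the trait itself never returns a future.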
aimdb-benchB0 suite across all three buffer profiles.The core change
aimdb-coreSPI (buffer/traits.rs):try_recvis unchanged. The same swap applies to theremote-accessJsonBufferReader(recv_json→poll_recv_json).New consumer-facing handles (
buffer/reader.rs) restore the ergonomic surface:buffer::Reader<T>(andbuffer::JsonReaderunderremote-access) wrap the erased reader and exposeasync fn recv()implemented once viacore::future::poll_fn—core-only,no_std-clean, zero-allocation, nounsafe.Consumer::subscribe,TypedRecord::subscribe, andAimDb::subscribenow returnReader<T>instead ofBox<dyn BufferReader<T> + Send>.Adapter implementations
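The non-blocking `try_recv` path on the tokio readers relies on polling a future exactly once with a waker that does nothing. A minimal sketch of that idea follows, under stated assumptions: `poll_once` and `NoopWake` are hypothetical names, and the waker here is built from an `Arc` so the example compiles on older toolchains. That construction allocates, unlike the `Waker::noop()` the PR uses.

```rust
use std::future::{pending, ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that ignores wake-ups. (Assumption: stands in for `Waker::noop()`.)
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll `fut` exactly once.
/// `Some(v)`: the future was ready right now. `None`: it would have waited.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match Pin::new(fut).poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn main() {
    // A future that is immediately ready maps to `Some`.
    println!("{:?}", poll_once(&mut ready(7u32)));
    // A future that never completes maps to `None` (nothing available now).
    println!("{:?}", poll_once(&mut pending::<u32>()));
}
```

A future that has returned `Ready` must not be polled again, which is why a reused future has to be re-armed before the next call.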
- tokio: `broadcast`/`watch` expose no public poll API, so the reader round-trips its receiver through a single reused `ReusableBoxFuture`, re-armed after each `Ready`.
- `Notify` is replaced by an explicit, deduplicated waker list beside the slot; `push` drains and wakes.
- embassy: `Subscriber::poll_next_message`, `watch::Receiver::poll_changed`, `Channel::poll_receive`; no `unsafe`.
- wasm: the `WasmRecvFuture::poll` body moves verbatim into `poll_recv` (the box existed solely to satisfy the old trait signature).

`try_recv` on the tokio broadcast/watch readers polls the reused future with `Waker::noop()`: `Ready` means a value/error is available now, `Pending` means empty, preserving the prior semantics. The profiling reader memoizes `pending_since` so a `Pending` wait on the producer is not counted as consumer processing time.

The embassy poll methods are small, additive public wrappers added to the vendored `embassy-sync` (`Channel::poll_receive` is already public; this gives pubsub/watch the matching method). This replaces the initial W8 cut's hand-rolled `no_std` `ReusableBoxFuture`, deleting ~80 lines of raw-pointer `unsafe`.

Benchmarking infrastructure (design 038)
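The allocation numbers in this section come from a counting global allocator. The PR's `CountingAllocator` is not reproduced here; the following is a hypothetical sketch of the general technique. `CountingAlloc` and `count_allocs` are invented names, and counting per thread is an assumption made so that allocations on other threads do not disturb the result.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::hint::black_box;

thread_local! {
    // Per-thread counter; const-initialized so touching it never allocates.
    static ALLOCS: Cell<usize> = const { Cell::new(0) };
}

/// Forwards to the system allocator and counts each `alloc` call.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // `try_with` avoids a panic if the thread is already shutting down.
        let _ = ALLOCS.try_with(|c| c.set(c.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return how many allocations this thread made while it ran.
fn count_allocs<R>(f: impl FnOnce() -> R) -> (usize, R) {
    let before = ALLOCS.with(|c| c.get());
    let out = f();
    let after = ALLOCS.with(|c| c.get());
    (after - before, out)
}

fn main() {
    // `black_box` keeps the optimizer from eliding the allocation.
    let (boxed, _b) = count_allocs(|| black_box(Box::new(42u64)));
    let (plain, _s) = count_allocs(|| black_box(40u64 + 2));
    println!("boxed: {boxed} alloc(s), plain: {plain} alloc(s)");
}
```

Dividing such a count by the number of messages processed inside the closure gives an allocs/message figure of the kind reported by the B0 suite.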
New host-only `aimdb-bench` crate (excluded from `default-members`):

- `CountingAllocator` installed as `#[global_allocator]` in dedicated bench binaries measures allocs/message. Committed baselines show 0.0 allocs/msg across all three tokio profiles. CI wiring is advisory/report-only for now; a hard gate is a documented follow-up (038 §6).
- `examples/embassy-bench-stm32h5` (STM32H5, embassy runtime).

Breaking changes
- `BufferReader::recv` → `poll_recv`; `JsonBufferReader::recv_json` → `poll_recv_json`. Object safety is preserved.
- `Box<dyn BufferReader<T> + Send>` → `Reader<T>`. `subscribe().recv().await` is unchanged at every call site: examples and `aimdb-pro` compile without edits. Holders of a concrete adapter reader wrap it once: `Reader::new(Box::new(reader))`.
- `SerializedReader::recv` keeps its boxed `RecvSerializedFuture`; only the inner per-message box is eliminated.

Full inventory in `aimdb-core/CHANGELOG.md`.

Verification
- `make all`: full matrix of build + tests + clippy `-D warnings` + fmt, incl. wasm32 & thumbv7em cross-compiles
- … `remote::`)
- `--benches` build · fmt-check (5 changed crates)

The `_external/embassy` submodule is pinned to a fork commit that adds the public `poll_next_message`/`poll_changed` wrappers, which the embassy adapter depends on (the workspace compiles `embassy-sync` from the submodule path). The corresponding upstream embassy PR is pending.

Until it merges, the submodule points at a fork branch rather than upstream, so CI on this PR will not pass the submodule checkout. Merge sequence:

1. Merge the upstream `embassy-sync` poll-methods PR.
2. Re-pin `_external/embassy` to the merged upstream SHA (and point `.gitmodules` back to `embassy-rs/embassy`).

This is the single hard gate on merge; all code, tests, docs, and downstream compatibility are otherwise ready.