feat(eth-sparse-mpt): add ~3x faster v3 state-root implementation and harness tools (#887)
bertmiller wants to merge 5 commits into develop from
Conversation
Claude finished @bertmiller's task in 3m 26s — PR Review: v3 state-root implementation
Summary
The v3 implementation is structurally a copy of v2 with the same overall architecture (arena-based trie, proof store, incremental revert/apply pattern). The 3x speedup likely comes from the optimized data layout (SoA with …).

Key Issues
Production stability concerns:
Correctness/concurrency:
Design/maintenance:
Minor:

Other Observations
```rust
if self.proofs.contains_key(&key) {
    return Ok(());
}
```
TOCTOU race: contains_key + insert are not atomic on the DashMap. Two threads with the same key can both pass the contains_key check, both decode the proof, and both insert — doing redundant work and appending duplicate entries to keys/values/rlp_ptrs. Consider using self.proofs.entry(key).or_try_insert_with(|| { ... }) instead, or at a minimum make the check-and-insert a single atomic entry operation. This same pattern exists in v2, but it is worth fixing here since this is new code.
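As a sketch of the atomic alternative: the example below uses std's `HashMap` entry API under a `Mutex` as a stand-in for the DashMap (whose `entry(key).or_try_insert_with(...)` has the same shape), with a hypothetical, simplified `ProofStore` — names and types here are illustrative, not the PR's actual code.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical, simplified stand-in for the real ProofStore.
struct ProofStore {
    proofs: Mutex<HashMap<u64, Vec<u8>>>,
}

impl ProofStore {
    fn add_proof(
        &self,
        key: u64,
        decode: impl FnOnce() -> Result<Vec<u8>, String>,
    ) -> Result<(), String> {
        let mut proofs = self.proofs.lock().unwrap();
        // Check and insert happen through one entry lookup, so two callers
        // with the same key cannot both decode and both insert.
        if let Entry::Vacant(e) = proofs.entry(key) {
            e.insert(decode()?);
        }
        Ok(())
    }

    fn len(&self) -> usize {
        self.proofs.lock().unwrap().len()
    }
}

fn main() {
    let store = ProofStore { proofs: Mutex::new(HashMap::new()) };
    store.add_proof(1, || Ok(vec![0xAA])).unwrap();
    // Second insert with the same key is a no-op: the decode closure never runs.
    store.add_proof(1, || Err("should not decode twice".to_string())).unwrap();
    assert_eq!(store.len(), 1);
}
```

The same shape avoids the duplicate appends to keys/values/rlp_ptrs, because the decode-and-append work only runs when the entry is vacant.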
```rust
fn calculate_rlp_pointer_node(
    &self,
    node_idx: usize,
    rlp: &mut Vec<u8>,
    proof_store: &ProofStore,
) {
    self.rlp_encode_node(node_idx, rlp, proof_store);
    let result =
        unsafe { &mut *(self.rlp_ptrs_local.as_ptr().add(node_idx) as *mut ArrayVec<u8, 33>) };
    result.clear();
    if rlp.len() < 32 {
        result.try_extend_from_slice(rlp).unwrap();
    } else {
        let hash = keccak256(rlp);
        result.push(EMPTY_STRING_CODE + 32);
        result.try_extend_from_slice(hash.as_slice()).unwrap();
    }

    let hashed_node = unsafe { &mut *(self.hashed_nodes.as_ptr().add(node_idx) as *mut bool) };
    *hashed_node = true;
}
```
unsafe block creates mutable references from shared &self to update rlp_ptrs_local and hashed_nodes. The comment says this is safe because root_hash takes &mut self, but root_hash actually takes &self (line 811), and root_hash_node spawns rayon tasks that can mutate concurrently. This is undefined behavior when two rayon tasks attempt to hash different children of the same branch — they both call calculate_rlp_pointer_node through &self, creating aliasing &mut references to different indices.
The tree structure guarantees that different spawned tasks hash disjoint subtrees (so different indices), which makes this safe in practice, but the Rust aliasing model is not index-aware — two &mut from the same Vec are UB even if they target different indices. This should use UnsafeCell or Cell wrappers to be sound according to Rust's aliasing rules, or at minimum add // SAFETY: comments documenting why concurrent access to distinct indices is safe.
(I note this pattern is inherited from v2, but worth flagging.)
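A minimal sketch of the `UnsafeCell` approach, with hypothetical names (`SharedSlots`, `write`) that are not from the PR: wrapping each slot in `UnsafeCell` tells the compiler the data may be mutated through a shared reference, so no aliasing `&mut` references to the `Vec` are ever created.

```rust
use std::cell::UnsafeCell;

// Hypothetical sketch: per-slot interior mutability for disjoint-index writes.
struct SharedSlots {
    slots: Vec<UnsafeCell<u64>>,
}

// SAFETY: callers must guarantee that no two threads write the same index
// concurrently (in the trie this would hold because rayon tasks hash
// disjoint subtrees, hence disjoint index sets).
unsafe impl Sync for SharedSlots {}

impl SharedSlots {
    fn new(len: usize) -> Self {
        Self {
            slots: (0..len).map(|_| UnsafeCell::new(0)).collect(),
        }
    }

    /// SAFETY: `idx` must not be read or written by any other thread at the
    /// same time.
    unsafe fn write(&self, idx: usize, value: u64) {
        *self.slots[idx].get() = value;
    }

    fn read(&mut self, idx: usize) -> u64 {
        // `&mut self` guarantees exclusive access, so a plain read is fine.
        unsafe { *self.slots[idx].get() }
    }
}

fn main() {
    let mut s = SharedSlots::new(4);
    // Two "tasks" touching disjoint indices through a shared reference.
    unsafe {
        s.write(0, 10);
        s.write(3, 30);
    }
    assert_eq!(s.read(0), 10);
    assert_eq!(s.read(3), 30);
}
```

The raw-pointer casts in the PR express the same intent, but only the `UnsafeCell` form is sound under Rust's aliasing rules.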
```rust
fn insert_nibble_slice(
    &mut self,
    ins_key: &[u8],
    insert_value: InsertValue<'_>,
) -> Result<Option<Range<usize>>, NodeNotFound> {
    let mut current_node = 0;
    let mut path_walked = 0;

    let mut old_value = None;

    loop {
        let node = self
            .nodes
            .get(current_node)
            .ok_or_else(|| NodeNotFound::new(ins_key))?;
        self.hashed_nodes[current_node] = false;
        match node {
            DiffTrieNode::Branch { children } => {
                let children = *children;

                let n = ins_key[path_walked] as usize;
```
Potential panic: ins_key[path_walked] on line 246 will panic with index-out-of-bounds if path_walked == ins_key.len() when we reach a Branch node. This can happen when the key has been fully consumed but we hit a branch (i.e., the key is a prefix of existing keys). Consider adding a bounds check or returning an error instead of panicking.
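A hedged sketch of the bounds-check fix; `branch_child_index` and `KeyExhausted` are hypothetical names for illustration, not identifiers from the PR.

```rust
// Returned instead of panicking when the inserted key is fully consumed
// before the walk reaches a leaf (i.e. the key is a prefix of existing keys).
#[derive(Debug, PartialEq)]
struct KeyExhausted;

fn branch_child_index(ins_key: &[u8], path_walked: usize) -> Result<usize, KeyExhausted> {
    // `get` returns None instead of panicking with index-out-of-bounds.
    match ins_key.get(path_walked) {
        Some(nibble) => Ok(*nibble as usize),
        None => Err(KeyExhausted),
    }
}

fn main() {
    assert_eq!(branch_child_index(&[0x1, 0x2], 1), Ok(2));
    assert_eq!(branch_child_index(&[0x1, 0x2], 2), Err(KeyExhausted));
}
```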
crates/eth-sparse-mpt/src/v3/mod.rs
Outdated
```rust
for _ in 0..10 {
    stats.start();
    let ok = self.process_storage_tries_updates()?;
    stats.measure_insert(true);
    if !ok {
        stats.start();
        self.fetch_missing_storage_nodes(&consistent_db_view, &mut stats)?;
        stats.measure_proof_fetch(true);
        continue;
    }
    stats.start();
    self.hash_storage_tries();
    stats.measure_hash(true);
    loop_break = true;
    break;
}
assert!(loop_break, "storage trie are not processed after 10 iters");
```
The magic number 10 for the retry loop upper bound is used in multiple places (storage tries, account tries, account trie proofs) with assert!(loop_break, ...) that will panic in production if the loop doesn't converge. Consider:
- Making this a named constant (e.g., `const MAX_FETCH_ITERATIONS: usize = 10;`)
- Returning an error instead of panicking on exceeding max iterations — a panic in a hot path will bring down the builder
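A sketch combining both suggestions, assuming a hypothetical `step` closure and error variant (`RetriesExhausted` is illustrative, not a variant that exists in the codebase):

```rust
const MAX_FETCH_ITERATIONS: usize = 10;

#[derive(Debug, PartialEq)]
enum SparseTrieError {
    // Hypothetical variant: surfaces non-convergence as a recoverable error.
    RetriesExhausted { iterations: usize },
}

// `step` stands in for one process/fetch/hash round; it returns true once
// processing has converged.
fn run_with_retries(mut step: impl FnMut() -> bool) -> Result<(), SparseTrieError> {
    for _ in 0..MAX_FETCH_ITERATIONS {
        if step() {
            return Ok(()); // converged; no assert! on the happy path
        }
    }
    // Instead of assert!-panicking the builder, let the caller decide.
    Err(SparseTrieError::RetriesExhausted {
        iterations: MAX_FETCH_ITERATIONS,
    })
}

fn main() {
    let mut calls = 0;
    assert_eq!(run_with_retries(|| { calls += 1; calls == 3 }), Ok(()));
    assert_eq!(
        run_with_retries(|| false),
        Err(SparseTrieError::RetriesExhausted { iterations: 10 })
    );
}
```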
```rust
pub type SharedCacheV3 = SharedCacheV2;
pub type RootHashCalculatorV3 = RootHashCalculator;
```
These type aliases (SharedCacheV3 = SharedCacheV2, RootHashCalculatorV3 = RootHashCalculator) are confusing. The V3 module defines its own SharedCacheV2 and RootHashCalculator structs locally (not re-exported from v2), so these aliases just rename the local types. This makes it look like V3 shares V2's cache at the integration level (in lib.rs), when in fact each module has independent copies of identically-structured types. Consider renaming the local structs to SharedCacheV3 / RootHashCalculatorV3 directly to avoid the confusing indirection.
```rust
#[derive(Debug, Clone, Default)]
pub struct Trie {
    // 3 arrays below are of the same length
    hashed_nodes: Vec<bool>,
    rlp_ptrs_local: Vec<ArrayVec<u8, 33>>,
    nodes: Vec<DiffTrieNode>,

    values: Vec<u8>,
    keys: Vec<u8>,
    branch_node_children: Vec<[Option<NodePtr>; 16]>,

    // scratchpad
    walk_path: Vec<(usize, u8)>, // node index, nibble
}
```
The Trie struct uses an append-only arena pattern for values, keys, and branch_node_children. As nodes are inserted, updated, and deleted, old data in these vectors becomes unreachable but is never freed (only clear() resets everything). Over many blocks, this means unbounded memory growth within a single Trie instance.
Given this is on a hot path that persists across blocks (the StorageCalculator.trie and AccountTrieCalculator.trie survive between calculate_root_hash_with_sparse_trie calls), this could be a slow memory leak for long-running builders. Consider periodically compacting or tracking waste ratio. (This is also inherited from v2.)
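One possible shape for the waste-ratio tracking suggested above; `Arena` and the `live_bytes` bookkeeping are hypothetical, not part of the PR.

```rust
// Hypothetical sketch: track how much of an append-only arena is dead so a
// long-lived Trie can decide when to compact instead of growing forever.
struct Arena {
    data: Vec<u8>,
    live_bytes: usize, // bytes still reachable from current nodes
}

impl Arena {
    fn waste_ratio(&self) -> f64 {
        if self.data.is_empty() {
            return 0.0;
        }
        1.0 - self.live_bytes as f64 / self.data.len() as f64
    }

    fn should_compact(&self) -> bool {
        // e.g. compact once more than half the arena is unreachable data;
        // the 0.5 threshold is an arbitrary example value.
        self.waste_ratio() > 0.5
    }
}

fn main() {
    let stale = Arena { data: vec![0; 100], live_bytes: 30 };
    assert!(stale.should_compact()); // 70% waste

    let fresh = Arena { data: vec![0; 100], live_bytes: 90 };
    assert!(!fresh.should_compact()); // 10% waste
}
```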
```rust
fn process_storage_tries_updates(&mut self) -> eyre::Result<bool> {
    let all_changed_processed = Arc::new(Mutex::new(true));
    self.changed_account
        .write()
        .par_iter_mut()
        .map(|(address, status)| -> eyre::Result<()> {
            // if account is done, just return
            if !status.needs_processing() {
                return Ok(());
            }
            if self.process_storage_tries_update(*address)? {
                *status = StorageTrieStatus::InsertsProcessed;
            } else {
                *all_changed_processed.lock() = false;
            }
            Ok(())
        })
        .collect::<Result<(), _>>()?;
    let res = *all_changed_processed.lock();
    Ok(res)
}
```
process_storage_tries_updates takes a write lock on changed_account and then calls par_iter_mut(). This holds a write lock for the duration of the parallel iteration — no other method can read changed_account until all parallel tasks complete. While this works functionally (no other thread accesses it concurrently at this point), it means the locking granularity is coarser than necessary. If the design changes to allow concurrent phases, this could become a bottleneck. Consider whether a lock-free approach (e.g., collecting the results and writing back) would be cleaner.
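A sketch of that collect-then-write alternative, using std's `RwLock` and `HashMap` as stand-ins for the real types; `process_all` and the `bool` status map are hypothetical simplifications.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical sketch: split the single long write-lock section into a short
// read (snapshot), a lock-free compute phase, and a short write-back.
fn process_all(statuses: &RwLock<HashMap<u64, bool>>, process: impl Fn(u64) -> bool) -> bool {
    // Phase 1: snapshot keys under a brief read lock.
    let keys: Vec<u64> = statuses.read().unwrap().keys().copied().collect();

    // Phase 2: compute without holding any lock (this map could be a rayon
    // par_iter in the real code).
    let results: Vec<(u64, bool)> = keys.into_iter().map(|k| (k, process(k))).collect();

    // Phase 3: write results back under a brief write lock.
    let mut guard = statuses.write().unwrap();
    let mut all_done = true;
    for (k, done) in results {
        if done {
            guard.insert(k, true);
        } else {
            all_done = false;
        }
    }
    all_done
}

fn main() {
    let statuses = RwLock::new(HashMap::from([(1u64, false), (2u64, false)]));
    // Only even keys "finish" on this pass.
    let all = process_all(&statuses, |k| k % 2 == 0);
    assert!(!all);
    assert!(statuses.read().unwrap()[&2]);
}
```

The trade-off is an extra allocation for the results vector in exchange for holding the write lock only during the cheap write-back.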
@dvush if you would like me to address Claude's input I'll pass it to Codex, let me know either way.
Claude finished @bertmiller's task in 2m 41s — PR Review: v3 state-root implementation (updated)
What changed since last review
Two commits address feedback from the previous review:
Remaining issues
- Production stability — remaining panics on hot path
- Potential index-out-of-bounds panic (unchanged)
- Unsafe aliasing in parallel hashing (unchanged)
- TOCTOU race in ProofStore::add_proof (unchanged)
- Confusing type aliases (unchanged)
- V2 behavior change not documented
- Minor typos

Summary
The two new commits address the most critical prior feedback (panic on retry exhaustion, compile-time bloat from dev binaries). The remaining issues are lower severity but still worth addressing before merging to production, particularly the remaining panics/expects on the hot path and the unsafe aliasing pattern.
```rust
    let ok = storage_calc.trie.try_add_proof_from_proof_store(&missing_node, &storage_calc.proof_store).expect("should be able to insert proofs from proof store when they are found (storage trie)");
    assert!(ok, "proof is not added (storage trie)");
} else {
    panic!("Missing node that was just fetched is not there (storage trie)");
```
Several panic! / assert! / .expect() calls remain on the hot path after the retry-loop fix. Specifically:
- Lines 643–644, 669–670: `expect()` + `assert!` in `fetch_missing_storage_nodes`
- Line 672: `panic!("Missing node that was just fetched is not there (storage trie)")`
- Lines 925–926, 944–945: same pattern in `fetch_missing_account_trie_nodes`
- Line 947: `panic!("Missing node that was just fetched is not there (account trie)")`
- Line 698: `.expect("missing node while hashing storage trie")`
- Line 959: `.expect("failed to hash account trie")`
These will crash the builder on any unexpected state. Consider converting them to return SparseTrieError instead.
```rust
DiffTrieNode::Branch { children } => {
    let children = *children;

    let n = ins_key[path_walked] as usize;
```
Potential panic: ins_key[path_walked] will panic with index-out-of-bounds if path_walked == ins_key.len() when encountering a Branch node (i.e., the key has been fully consumed but we hit a branch). This was flagged in the previous review and still applies. Consider adding a bounds check or returning an error.
```rust
) {
    self.rlp_encode_node(node_idx, rlp, proof_store);
    let result =
        unsafe { &mut *(self.rlp_ptrs_local.as_ptr().add(node_idx) as *mut ArrayVec<u8, 33>) };
```
The unsafe aliasing concern from the previous review still applies. calculate_rlp_pointer_node creates &mut references from &self via raw pointer casts (lines 794, 804). While the tree structure ensures disjoint index access in practice, this violates Rust's aliasing rules. Consider using UnsafeCell wrappers, or at minimum add // SAFETY: comments documenting the disjointness invariant.
```rust
    key: Nibbles,
    proof: Vec<(Nibbles, P)>,
) -> Result<(), alloy_rlp::Error> {
    if self.proofs.contains_key(&key) {
```
TOCTOU race still present: contains_key (line 59) and insert (line 97) are not atomic on the DashMap. Two threads with the same key can both pass the check and both decode/insert, appending duplicate entries to keys/values/rlp_ptrs. Consider using self.proofs.entry(key).or_try_insert_with(|| { ... }) to make this atomic.
```rust
const PARALLEL_HASHING_STORAGE_NODES: bool = true;
const MAX_PROCESS_ITERS: usize = 10;

pub type SharedCacheV3 = SharedCacheV2;
```
These aliases are still confusing. SharedCacheV3 = SharedCacheV2 where SharedCacheV2 is a local struct in the v3 module (not the v2 module's type). Consider renaming the local struct directly to SharedCacheV3 / RootHashCalculatorV3 to avoid the indirection.
```rust
let hash = storage_calc
    .trie
    .root_hash(PARALLEL_HASHING_STORAGE_NODES, &storage_calc.proof_store)
    .expect("missing node while hahsing storage trie");
```
Typo: "hahsing" → "hashing"
Same as v2, but happy to fix if we want
```rust
for proof_target in proof_targets {
    if !proofs.contains_key(proof_target) {
        return Err(SparseTrieError::Other(eyre::eyre!(
            "Proof was not fethed correctly"
```
Typo: "fethed" → "fetched"
Same as v2, but happy to fix if we want
I think the remaining Claude comments are things that were copied over from v2 (except the small changes in v2, which I can revert if you want — not sure why it did that).
📝 Summary
I will state upfront that this code is almost entirely written with Codex. However, the results seem promising, and it overlaps heavily with the existing v2 state root anyway.
I asked Codex to set up a harness that measured performance across different depths of cache warmth and compared state-root implementations against each other and against Reth's for correctness. Then I asked it to optimize v2 and implement a v3 state root. This was effective because the agent could simply iterate if it noticed it had the wrong root or if it was slower. Finally, I ran the harness for 1000+ blocks and the new v3 always produced the same root as v2 and reth, so I have some confidence in the implementation. You can find the testing scripts in this PR, specifically
correct_and_bench.rs. Note that these require unwinding a reth node block by block (thank you Vitaly for this trick).

The results are a significant speedup relative to v2. Over 1,000 blocks:
Here's a moving average:

The one notable regression is that in ~2.5% of blocks, v3's p99 latency exceeded v2's. On average these regressions added 1.89 ms, and the worst observed was 23.9 ms for a single block.
💡 Motivation and Context
Calculating the state root is in the hot path for a new block so making it faster lowers latency for all our users.
✅ I have completed the following steps:
- `make lint`
- `make test`