Skip to content

feat(flashblocks): State root computation via PayloadProcessor sparse trie task#2247

Draft
mw2000 wants to merge 3 commits intomainfrom
mw2000/state-root-fast
Draft

feat(flashblocks): State root computation via PayloadProcessor sparse trie task#2247
mw2000 wants to merge 3 commits intomainfrom
mw2000/state-root-fast

Conversation

@mw2000
Copy link
Copy Markdown
Contributor

@mw2000 mw2000 commented Apr 16, 2026

  • Replace inline synchronous state root computation with reth's PayloadProcessor sparse trie task, computing state roots in parallel during block building
  • PayloadProcessor, ChangesetCache, and TreeConfig are stored on OpPayloadBuilder and reused across blocks so the sparse trie stays warm
  • Falls back to synchronous computation on task failure
  • Rewrite state_root benchmark to measure residual wait time (finish → root) with busy-wait simulation of 200ms inter-flashblock delays

mw2000 added 3 commits April 16, 2026 16:42
…trics

Expand ClientBounds with DatabaseProviderFactory, StateReader, and
'static to support OverlayStateProviderFactory and StateProviderBuilder
needed by the upcoming PayloadProcessor integration.

Add state_root_task_* metrics (started, completed, error counts and
duration histogram) for observability.
…rocessor

Spawn reth's PayloadProcessor to compute state roots in parallel with
transaction execution via the sparse trie task. Per-tx state diffs are
fed through OnStateHook and the final root is awaited at finalization,
with a synchronous fallback on task failure.

PayloadProcessor, ChangesetCache, and TreeConfig are stored on
OpPayloadBuilder and reused across blocks so the sparse trie stays warm
when consecutive blocks share the same parent state root.
…wait

Rewrite the state root benchmark to measure residual wait time — the
duration from dropping the state hook to receiving the computed root.
Uses busy-wait spin loops (not thread::sleep) to simulate 200ms
inter-flashblock execution delays without OS scheduling artifacts.

Includes warm sparse trie (cold-start warmup block), 50k base-state
accounts, and per-flashblock intermediate root measurements.
@cb-heimdall
Copy link
Copy Markdown
Collaborator

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 0/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 1
Sum 2

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
base Ignored Ignored Apr 16, 2026 11:53pm

Request Review

Comment on lines +137 to +143
let start_time = std::time::Instant::now();

// Signal that all state updates are done.
state_hook.finish();

let state_provider = state.database.as_ref();
let hashed_state = state_provider.hashed_post_state(&state.bundle_state);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential correctness issue: hashed_state computed before merge_transitions

hashed_post_state(&state.bundle_state) is called here, before build_block calls state.merge_transitions(BundleRetention::Reverts). In reth's State model, individual commit() calls push changes into transition_state, and merge_transitions() merges them into bundle_state. If there are un-merged transitions at this point, the hashed_state won't reflect all state changes.

This hashed_state is then passed to build_block and stored in BuiltPayloadExecutedBlock, which the engine uses for state persistence. If it's stale (pre-merge), the engine could persist an inconsistent view.

Consider calling state.merge_transitions(BundleRetention::Reverts) before computing the hashed state, or moving this computation to after build_block merges transitions. Alternatively, verify that all transitions are already merged into bundle_state by this point (i.e., no pending transitions exist).

Comment on lines +145 to +146
if let Some(rx) = state_root_handle {
match rx.recv() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking recv() on a tokio worker thread

rx.recv() is std::sync::mpsc::Receiver::recv(), which blocks the current thread. This is called from finalize_payload, which is invoked from async fn build_payload. Blocking a tokio worker thread while waiting for the sparse trie task can stall other tasks on the same runtime, especially under load.

Consider using tokio::sync::oneshot instead of std::sync::mpsc for StateRootHandle, or wrapping this in tokio::task::spawn_blocking / tokio::task::block_in_place.

Comment on lines +98 to +105
let env = reth_engine_tree::tree::ExecutionEnv {
evm_env: Default::default(),
hash: B256::ZERO,
parent_hash,
parent_state_root,
transaction_count: 0,
withdrawals: None,
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evm_env: Default::default() and hash: B256::ZERO — these are placeholder values for the ExecutionEnv since the builder manages its own execution loop. This works because the empty tx list means the processor won't actually execute anything with these fields. However, this coupling to internal PayloadProcessor behavior is fragile — if PayloadProcessor::spawn starts validating or using these fields (e.g., for logging, caching, or consistency checks), this will silently produce wrong results. A brief // SAFETY: or // INVARIANT: comment documenting why the defaults are correct would help future readers.

pub config: BuilderConfig,
/// Reusable payload processor — warm sparse trie persisted across blocks.
#[debug(skip)]
payload_processor: Arc<std::sync::Mutex<PayloadProcessor<BaseEvmConfig>>>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arc<std::sync::Mutex<PayloadProcessor>> — the lock is correctly scoped (dropped before any .await), but since OpPayloadBuilder is Clone (and payload_processor is Arc), concurrent build_payload calls would contend on this lock. If the service guarantees single-active-builder at a time, that's fine, but worth a comment noting the exclusivity assumption.


// PayloadProcessor is reused across blocks for warm sparse trie.
let evm_config = BaseEvmConfig::optimism(ctx.chain_spec());
let tree_config = TreeConfig::default();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TreeConfig::default() — the benchmark uses TreeConfig::default().with_disable_sparse_trie_cache_pruning(true), but production here uses the plain default (which has pruning enabled). If the benchmark is meant to reflect production, one of these should change. If cache pruning is intentionally enabled in production but disabled in benchmarks to keep the trie warm, that's fine but worth documenting the discrepancy.

@github-actions
Copy link
Copy Markdown
Contributor

Review Summary

This PR integrates reth's PayloadProcessor sparse trie task to compute state roots in parallel during block building. The overall approach is sound — pipelining trie multiproof computation alongside transaction execution is a meaningful latency optimization. The fallback to synchronous computation on task failure is a good safety net. Here are the findings:

Correctness concern (high)

hashed_state computed before merge_transitions (state_root_task.rs:137-143): finalize_state_root reads state.bundle_state via hashed_post_state() before build_block calls merge_transitions(). In reth's State, individual commit() calls push into transition_state, and merge_transitions merges them into bundle_state. If transitions haven't been merged yet, the hashed_state stored in BuiltPayloadExecutedBlock will be stale. This could cause the engine to persist an inconsistent state view. Please verify that all transitions are already in bundle_state by this point, or move the hashed state computation to after the merge.

Concurrency concerns (medium)

  1. Blocking recv() on tokio worker (state_root_task.rs:146): std::sync::mpsc::Receiver::recv() blocks the current thread. Since this is called from an async fn, it can stall other tokio tasks. Consider tokio::sync::oneshot or wrapping in block_in_place.

  2. Arc<std::sync::Mutex<PayloadProcessor>> (payload.rs:112): Lock is correctly scoped before any .await, but since OpPayloadBuilder is Clone, concurrent build_payload calls would contend. Worth documenting the single-active-builder assumption if that holds.

Robustness / maintainability (low)

  • Placeholder ExecutionEnv fields (state_root_task.rs:98-105): evm_env: Default::default() and hash: B256::ZERO rely on the empty tx list preventing the processor from using these values. This coupling to PayloadProcessor internals is fragile; an invariant comment would help.

  • TreeConfig discrepancy (service.rs:55): Production uses TreeConfig::default() (pruning enabled) while benchmarks use with_disable_sparse_trie_cache_pruning(true). If the intent is to match production behavior in benchmarks, one should change.

Trait bound expansion (note)

ClientBounds now requires DatabaseProviderFactory<Provider: ...>, StateReader, and 'static. This is a significant widening of the trait surface for callers. The bounds appear necessary for OverlayStateProviderFactory and StateProviderBuilder, but this tightens the coupling between the builder crate and reth internals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants