
Revamp Message Pool Nonce calculation#6788

Merged
sudo-shashank merged 60 commits into main from shashank/revamp-mpool-nonce-calc
Apr 9, 2026

Conversation

@sudo-shashank
Contributor

@sudo-shashank sudo-shashank commented Mar 25, 2026

Summary of changes

Changes introduced in this pull request:

  • Nonce calculation:

    • get_state_sequence scans the messages included in the current tipset to find the true highest nonce, rather than relying solely on actor.sequence, which can lag.
    • Add a (TipsetKey, Address)-keyed state nonce cache to avoid redundant tipset scans.
  • Nonce serialization:

    • Add per-sender MpoolLocker preventing concurrent MpoolPushMessage requests for the same sender from reading stale gas/balance/nonce state. Different senders proceed in parallel.
    • Add NonceTracker with a global mutex that serializes the get_sequence + sign + push triplet, ensuring sequential nonce assignment across all senders.
  • Nonce gap enforcement:

    • MsgSet::add gains a strict flag that rejects messages exceeding MAX_NONCE_GAP (4 for trusted, 0 for untrusted) and blocks replace-by-fee when a gap exists. Reorg re-additions use strict=false.
    • Early balance check in MpoolPushMessage to reject before consuming a nonce.
  • CLI:

    • forest-wallet list now shows a Nonce column with table formatting.
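The nonce calculation described above can be sketched roughly as follows. All types here are simplified stand-ins (the sender is a plain integer and the cache is keyed per sender only, whereas the real code keys its cache on (TipsetKey, Address) and reads actor state):

```rust
use std::collections::HashMap;

// Simplified stand-in for a message; the real code uses Address and
// ChainMessage.
struct Message {
    from: u64,     // sender id
    sequence: u64, // message nonce
}

/// Next nonce = max(actor state sequence, highest nonce already
/// included in the current tipset + 1). Cached to avoid rescanning.
fn get_state_sequence(
    actor_sequence: u64,
    sender: u64,
    tipset_messages: &[Message],
    cache: &mut HashMap<u64, u64>,
) -> u64 {
    if let Some(&cached) = cache.get(&sender) {
        return cached;
    }
    let mut next_nonce = actor_sequence;
    for msg in tipset_messages {
        if msg.from == sender {
            // actor.sequence may lag behind messages already included.
            next_nonce = next_nonce.max(msg.sequence + 1);
        }
    }
    cache.insert(sender, next_nonce);
    next_nonce
}

fn main() {
    let msgs = [
        Message { from: 1, sequence: 5 },
        Message { from: 2, sequence: 9 },
        Message { from: 1, sequence: 6 },
    ];
    let mut cache = HashMap::new();
    // actor.sequence lags at 5, but nonces 5 and 6 are already included.
    assert_eq!(get_state_sequence(5, 1, &msgs, &mut cache), 7);
    // A sender with no included messages just gets actor.sequence.
    assert_eq!(get_state_sequence(3, 3, &msgs, &mut cache), 3);
    println!("ok");
}
```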

Reference issue to close (if applicable)

Closes #4899
Closes #3628
Closes #2927

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed message-pool nonce calculation to align with Lotus.
  • New Features

    • Persistent global nonce tracking to serialize nonce assignment.
    • Per-sender locking to serialize concurrent sends.
    • Delegated signing and Filecoin→Ethereum address send support.
    • Wallet balance validation before sending transactions.
  • Improvements

    • Stricter nonce-gap and replace-by-fee behavior; pending messages keyed by canonical addresses.
    • Wallet list shows a formatted table with balances and nonces.
  • Tests

    • Expanded unit/integration tests; added timeouts, remote-wallet flows, and concurrency validations.
  • Chores

    • CI workflow: re-enabled and sequenced delegated wallet checks; aggregated status updated.
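The stricter nonce-gap rule summarized above (MAX_NONCE_GAP of 4 for trusted messages, 0 for untrusted; reorg re-additions bypass the check via strict=false) can be sketched as follows. MsgSet and the method signature are simplified stand-ins, not the real Forest types:

```rust
// Gap limits as stated in the PR summary.
const MAX_NONCE_GAP_TRUSTED: u64 = 4;
const MAX_NONCE_GAP_UNTRUSTED: u64 = 0;

// Stand-in for the per-sender pending message set.
struct MsgSet {
    next_sequence: u64, // next expected nonce for this sender
}

impl MsgSet {
    /// In strict mode, reject a message whose nonce exceeds the next
    /// expected nonce by more than the allowed gap. Reorg re-additions
    /// pass strict = false and skip the check entirely.
    fn add(&self, sequence: u64, strict: bool, trusted: bool) -> Result<(), String> {
        if strict {
            let max_gap = if trusted {
                MAX_NONCE_GAP_TRUSTED
            } else {
                MAX_NONCE_GAP_UNTRUSTED
            };
            if sequence > self.next_sequence + max_gap {
                return Err(format!(
                    "nonce gap: got {sequence}, expected at most {}",
                    self.next_sequence + max_gap
                ));
            }
        }
        Ok(())
    }
}

fn main() {
    let set = MsgSet { next_sequence: 10 };
    assert!(set.add(14, true, true).is_ok());   // gap of 4: allowed for trusted
    assert!(set.add(15, true, true).is_err());  // gap of 5: rejected
    assert!(set.add(11, true, false).is_err()); // any gap rejected when untrusted
    assert!(set.add(25, false, false).is_ok()); // reorg re-add: strict = false
    println!("ok");
}
```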

@sudo-shashank sudo-shashank added the RPC label (requires calibnet RPC checks to run on CI) Mar 25, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Walkthrough

Adds per-sender async locking and a global nonce serializer, threads ID→key resolution and nonce/state caches through the message pool, changes nonce-gap/RBF semantics, delegates signing+push (with balance checks) to a NonceTracker, and wires new components into RPC, daemon, tools, tests, and CI.

Changes

  • CI & Changelog (.github/workflows/forest.yml, CHANGELOG.md): Re-enable calibnet-delegated-wallet-check, add shared concurrency group and dependencies; add changelog entry.
  • Wallet Test Scripts (scripts/tests/calibnet_wallet_check.sh, scripts/tests/calibnet_delegated_wallet_check.sh): Fix wallet-list parsing, re-enable ETH-address flows, adjust funding amounts, add remote self-send test, and add balance-poll retry loops with timeouts and explicit failures.
  • RPC State & Init — daemon/tools/tests (src/rpc/mod.rs, src/daemon/mod.rs, src/rpc/methods/sync.rs, src/tool/.../generate_test_snapshot.rs, src/tool/.../test_snapshot.rs, src/tool/offline_server/server.rs): Add mpool_locker and nonce_tracker to RPCState and initialize them in daemon, offline server, tools, and test contexts.
  • Key Management & Wallet CLI (src/key_management/wallet_helpers.rs, src/wallet/subcommands/wallet_cmd.rs, src/rpc/methods/wallet.rs): Introduce sign_message handling delegated (EIP‑1559) signing and tests; switch RPC/CLI to use helper signing; update wallet list rendering and error handling.
  • Message Pool Core & Selection (src/message_pool/msgpool/msg_pool.rs, src/message_pool/msgpool/mod.rs, src/message_pool/msgpool/selection.rs, src/message_pool/msgpool/*): Thread key-resolution and state-nonce caches, index pending by resolved key-address, add gap-filling and StrictnessPolicy-based nonce-gap/RBF rules, refactor head-change/republish and tests.
  • Provider Trait & Test Provider (src/message_pool/msgpool/provider.rs, src/message_pool/msgpool/test_provider.rs): Extend Provider with resolve_to_key and messages_for_tipset; test provider implements key mapping and returns signed messages for tipsets.
  • Mpool Locker & Nonce Tracker (src/message_pool/mpool_locker.rs, src/message_pool/nonce_tracker.rs, src/message_pool/mod.rs, src/message_pool/errors.rs): Add MpoolLocker (per-address async mutex map) and NonceTracker (global nonce serialization/sign_and_push), re-export them, and add Error::NonceGap.
  • RPC Mpool Handler (src/rpc/methods/mpool.rs): MpoolPushMessage now acquires a per-sender lock, checks wallet balance, and delegates nonce selection, signing, and push to NonceTracker::sign_and_push (uses eth_chain_id).
  • Tests & Harnesses (src/message_pool/msgpool/*, src/rpc/methods/sync.rs): Refactor test harnesses to use MessagePool wrappers (apply_head_change); add ID→key resolution tests, nonce-cache behavior, nonce-gap/RBF semantics, and concurrency tests for sign_and_push and mpool locking.
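The per-address mutex map behind MpoolLocker can be sketched as follows. This is a simplified, synchronous stand-in: the real implementation hands out tokio OwnedMutexGuards asynchronously, and addresses are Address values rather than strings:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Stand-in for the per-sender lock map.
struct MpoolLocker {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl MpoolLocker {
    fn new() -> Self {
        Self { locks: Mutex::new(HashMap::new()) }
    }

    /// Returns the lock for `sender`, creating it on first use.
    /// Two pushes from the same sender contend on the same mutex,
    /// while different senders get independent mutexes and proceed
    /// in parallel.
    fn take_lock(&self, sender: &str) -> Arc<Mutex<()>> {
        self.locks
            .lock()
            .unwrap()
            .entry(sender.to_string())
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}

fn main() {
    let locker = MpoolLocker::new();
    let a1 = locker.take_lock("f1alice");
    let a2 = locker.take_lock("f1alice");
    let b = locker.take_lock("f1bob");
    assert!(Arc::ptr_eq(&a1, &a2)); // same sender shares one lock
    assert!(!Arc::ptr_eq(&a1, &b)); // different senders do not
    println!("ok");
}
```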

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant RPC as RPC_Handler
    participant Locker as MpoolLocker
    participant Tracker as NonceTracker
    participant Mpool as MessagePool
    participant Provider as Provider

    Client->>RPC: MpoolPushMessage(message)
    RPC->>Locker: take_lock(sender_address)
    Locker-->>RPC: OwnedMutexGuard (locked)
    RPC->>Tracker: sign_and_push(message, key, eth_chain_id)
    Note right of Tracker: acquire global mutex (serialize nonces)
    Tracker->>Mpool: get_sequence(resolved_key, cur_ts)
    Mpool->>Provider: resolve_to_key(sender, ts)
    Provider-->>Mpool: canonical key address
    Tracker->>Tracker: sign message with assigned nonce
    Tracker->>Mpool: push(signed_message)
    Mpool-->>Tracker: push OK
    Tracker-->>RPC: SignedMessage
    RPC-->>Client: result
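The serialization shown in the diagram (a single global mutex covering the get_sequence + sign + push triplet) can be sketched as follows, using std::sync as a stand-in for the async original, with signing and address resolution omitted:

```rust
use std::sync::Mutex;

// Minimal stand-in for the message pool's nonce-relevant state.
struct MessagePool {
    state_nonce: Mutex<u64>,  // stand-in for get_sequence state
    pushed: Mutex<Vec<u64>>,  // nonces of pushed messages
}

impl MessagePool {
    fn get_sequence(&self) -> u64 {
        *self.state_nonce.lock().unwrap()
    }
    fn push(&self, nonce: u64) {
        self.pushed.lock().unwrap().push(nonce);
        *self.state_nonce.lock().unwrap() = nonce + 1;
    }
}

struct NonceTracker {
    global: Mutex<()>,
}

impl NonceTracker {
    /// The guard covers the whole triplet, so concurrent callers
    /// observe each other's pushes and never reuse a nonce.
    fn sign_and_push(&self, mpool: &MessagePool) -> u64 {
        let _guard = self.global.lock().unwrap();
        let nonce = mpool.get_sequence();
        // (signing with the assigned nonce would happen here)
        mpool.push(nonce);
        nonce
    }
}

fn main() {
    let mpool = MessagePool { state_nonce: Mutex::new(0), pushed: Mutex::new(vec![]) };
    let tracker = NonceTracker { global: Mutex::new(()) };
    std::thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                tracker.sign_and_push(&mpool);
            });
        }
    });
    // Strictly sequential nonces, no duplicates, regardless of which
    // thread wins each lock acquisition.
    assert_eq!(*mpool.pushed.lock().unwrap(), vec![0, 1, 2, 3]);
    println!("ok");
}
```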

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • akaladarshi
  • LesnyRumcajs
  • hanabi1224
🚥 Pre-merge checks: ✅ 5 passed
  • Description Check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed. The title 'Revamp Message Pool Nonce calculation' clearly and specifically summarizes the main change in the changeset.
  • Linked Issues check — ✅ Passed. The PR comprehensively addresses all requirements from linked issues: nonce calculation alignment via tipset scanning, per-sender/global nonce serialization, nonce-gap enforcement, early balance checks, and improved CLI visibility.
  • Out of Scope Changes check — ✅ Passed. All changes are within scope: GitHub Actions workflow updates, test script improvements, message pool refactoring, nonce tracking/locking, RPC handler updates, and CLI enhancements align directly with the stated PR objectives.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@sudo-shashank sudo-shashank marked this pull request as ready for review March 25, 2026 06:56
@sudo-shashank sudo-shashank requested a review from a team as a code owner March 25, 2026 06:56
@sudo-shashank sudo-shashank requested review from LesnyRumcajs and hanabi1224 and removed request for a team March 25, 2026 06:56
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)

146-160: ⚠️ Potential issue | 🟠 Major

Don't bind NonceTracker to the shared snapshot-generation DB.

load_db() gives every snapshot run the same Arc<ReadOpsTrackingStore<ManyCar<ParityDb>>>, and this wrapper writes settings through to inner, not tracker. With nonce_tracker pointed at state_manager.blockstore_owned(), persisted nonce-cache updates will bleed into later snapshot generations and may still be absent from the exported minimal snapshot unless those keys get read again. This will make stateful fixtures order-dependent once MpoolPushMessage / MpoolGetNonce snapshots are added.

Based on learnings, "when using ReadOpsTrackingStore to generate minimal snapshots, HEAD_KEY should be written to db.tracker (not db itself) before calling export_forest_car(), because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs` around lines 146 -
160, NonceTracker is being constructed against state_manager.blockstore_owned()
which points at the full DB; instead bind the nonce cache to the
ReadOpsTrackingStore tracker used for snapshot generation so persisted nonce
writes go into the ephemeral tracker not the underlying DB. Change the
NonceTracker creation to use the tracking store (the db.tracker returned by
load_db() / the ReadOpsTrackingStore wrapper) rather than
state_manager.blockstore_owned(), and ensure HEAD_KEY is written to db.tracker
(tracker) before calling export_forest_car() so the export reads the tracked
(accessed) keys only.
src/message_pool/msgpool/mod.rs (1)

324-342: ⚠️ Potential issue | 🟠 Major

Canonicalize the rmsgs key as well.

After this refactor pending is keyed by resolved key address, but Lines 333-342 still probe rmsgs by raw from. On a revert/apply where the same actor shows up as f0… on one side and key-address form on the other, the apply path misses the reverted entry, removes the pending entry instead, and then re-adds the reverted message at the end.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/mod.rs` around lines 324 - 342, The function
remove_from_selected_msgs is checking rmsgs by the raw from address while
pending is keyed by resolved key addresses; call resolve_to_key(api, key_cache,
from, cur_ts) once at the start of the lookup and then use the resolved Address
for both rmsgs and pending accesses (i.e., replace rmsgs.get_mut(from) with
rmsgs.get_mut(&resolved) and use resolved in the remove(...) calls), ensuring
you still propagate the Result error from resolve_to_key and keep the existing
logic for removing the sequence from the temp map when present.
🧹 Nitpick comments (3)
scripts/tests/calibnet_wallet_check.sh (1)

76-76: Avoid scraping the human list table for ADDR_ONE.

This script already had to change once because the list layout moved. Pulling the address out of tail | cut keeps the test coupled to presentation-only output and will break again on the next formatting tweak. Prefer a command or output mode that returns just the address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/calibnet_wallet_check.sh` at line 76, The current extraction of
ADDR_ONE by piping "$FOREST_WALLET_PATH list | tail -1 | cut -d ' ' -f2" scrapes
the human-formatted table and is fragile; change the ADDR_ONE assignment to use
a machine-readable/listing option or output mode from $FOREST_WALLET_PATH (e.g.,
a flag that emits just addresses or JSON) and parse that output (or use a
--quiet/--format option) to reliably select the desired address, so the script
uses the wallet CLI's non-presentational output instead of scraping the printed
table.
src/message_pool/mpool_locker.rs (1)

17-22: Document MpoolLocker::new().

The public constructor is the only exposed MpoolLocker API here without rustdoc.

As per coding guidelines, "Document public functions and structs with doc comments".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/mpool_locker.rs` around lines 17 - 22, Add a doc comment for
the public constructor MpoolLocker::new() describing what a MpoolLocker
represents and what the new() instance contains/initializes (e.g., an empty
inner Mutex<HashMap> used to track per-message-pool locks), include any
thread-safety or ownership notes relevant to callers, and place the triple-slash
/// doc immediately above the impl block or the new() function so rustdoc will
document it alongside the MpoolLocker API.
src/message_pool/msgpool/test_provider.rs (1)

229-244: Consider adding BLS message support for completeness.

The messages_for_tipset implementation only collects signed messages. While this is consistent with TestApi's internal storage (which only stores SignedMessage), the production ChainStore::messages_for_tipset collects both unsigned BLS and signed SECP messages via BlockMessages::for_tipset.

This discrepancy is acceptable for current tests since they primarily use SECP messages, but if future tests require BLS message handling during head changes, this may need enhancement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/test_provider.rs` around lines 229 - 244,
messages_for_tipset currently only returns signed messages (using inner.bmsgs
and ChainMessage::Signed), so add support for BLS/unsigned messages by extending
TestApi's storage and collecting them in the same loop: add a container for
BLS/unsigned messages (e.g., inner.bls_msgs or similar) and in
messages_for_tipset iterate blocks' CIDs to append both signed messages
(ChainMessage::Signed(Arc::new(...))) and unsigned/BLS messages
(ChainMessage::Unsigned or the appropriate ChainMessage variant for BLS) into
msgs before returning; update any TestApi methods that populate messages so
tests can insert BLS messages into the new storage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 40: The changelog entry currently references the PR number; update the
entry in CHANGELOG.md so it links to the issue `#4899` instead of PR `#6788`
(replace the PR link "[`#6788`](...)" with the issue link "[`#4899`](...)" and keep
the description "Fixed message pool nonce calculation to align with Lotus.") to
follow the repo convention of preferring issue references when both issue and PR
exist.

In `@scripts/tests/calibnet_delegated_wallet_check.sh`:
- Around line 87-100: The loop that polls DELEGATE_ADDR_REMOTE_THREE_BALANCE can
false-pass because it only compares to the prior observed balance and doesn’t
ensure MSG_DELEGATE_FOUR actually landed; change the logic in the block using
MSG_DELEGATE_FOUR, DELEGATE_ADDR_REMOTE_THREE_BALANCE and
DELEGATE_ADDR_THREE_BALANCE to either (A) wait for the message CID in
MSG_DELEGATE_FOUR to be included/confirmed before returning (preferred), or (B)
if keeping balance-polling, fail the script when the loop reaches the 20 retry
timeout instead of continuing silently (exit non-zero and log an error), and
keep updating DELEGATE_ADDR_REMOTE_THREE_BALANCE via $FOREST_WALLET_PATH
--remote-wallet balance each iteration; ensure the chosen approach references
MSG_DELEGATE_FOUR for waiting or uses an explicit exit on timeout.

In `@scripts/tests/calibnet_wallet_check.sh`:
- Around line 157-181: The loops currently only compare balances
(ETH_ADDR_TWO_BALANCE / ETH_ADDR_THREE_BALANCE) and will silently succeed after
retries even if the send commands (MSG_ETH, MSG_ETH_REMOTE) never produced a
message CID; change the logic to wait on the actual returned message CIDs from
MSG_ETH and MSG_ETH_REMOTE (extract the CID from MSG_ETH / MSG_ETH_REMOTE) and
poll the node/wallet for that CID confirmation, or at minimum make the 20-retry
timeout fatal by exiting non-zero if the CID was not observed; update the two
loops that reference ETH_ADDR_TWO_BALANCE and ETH_ADDR_THREE_BALANCE to use the
extracted CID variables and fail with exit 1 on timeout so the script cannot
pass silently.

In `@src/message_pool/mpool_locker.rs`:
- Around line 52-74: Tests use timing sleeps which makes them flaky; replace the
sleep-based handshakes in the two test blocks (the tasks spawned as t1/t2 that
call locker.take_lock and use first_entered/first_released/second_saw_first)
with a deterministic sync primitive such as tokio::sync::Notify or
tokio::sync::Barrier: have the first task signal (notify.notify_one() or
barrier.wait().await) immediately after it acquires the lock (where it currently
sets entered.store) and have the second task await that signal before attempting
its assertion (instead of sleeping), and likewise use a notify or barrier to
signal release instead of the 20ms/100ms sleeps; apply the same change to the
other test block referenced (the one around lines 94-116) so both tests no
longer rely on timing.

In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 307-315: The code masks resolve_to_key failures by using
unwrap_or(msg.from()), which can undercount nonces; change the scan to propagate
resolution errors instead: call resolve_to_key(api, key_cache, &msg.from(),
cur_ts)? (or otherwise handle the Result and return Err) instead of unwrap_or,
and adjust the surrounding function's signature/return path to propagate the
error; refer to resolve_to_key, messages_for_tipset, msg.from(), and next_nonce
when making this change so the failure during the current tipset scan is not
silently converted to the original address.

In `@src/message_pool/nonce_tracker.rs`:
- Around line 81-87: The code currently pushes the signed message with
mpool.push(smsg.clone()).await? and then calls self.save_nonce(&message.from,
nonce)? which can return an error and cause MpoolPushMessage to report failure
despite the message already being in the mempool; change this to make
persistence best-effort: call self.save_nonce(&message.from, nonce) but do not
propagate its error — instead catch/log the error (e.g. with error!/warn!) and
still return Ok(smsg). Keep the same call order (sign_message, mpool.push,
save_nonce) but ensure save_nonce failures do not convert the whole operation
into an error.

In `@src/rpc/methods/mpool.rs`:
- Around line 301-306: MpoolGetNonce is reading the nonce directly from mpool
via mpool.get_sequence while WalletSignMessage/MpoolPush use
ctx.nonce_tracker.sign_and_push, causing mismatch after restarts; change
MpoolGetNonce to query the same NonceTracker (e.g., call the nonce-tracking
getter on ctx.nonce_tracker instead of mpool.get_sequence) so both flows use the
same source, and ensure any other code paths that return sequence numbers
(mpool.get_sequence, MpoolPushMessage) are routed through NonceTracker APIs to
keep nonce state consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9c53ddd6-e120-4024-a80f-227d8c503f0b

📥 Commits

Reviewing files that changed from the base of the PR and between 7868455 and e25fabd.

📒 Files selected for processing (24)
  • .github/workflows/forest.yml
  • CHANGELOG.md
  • docs/docs/developers/guides/nonce_handling.md
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • scripts/tests/calibnet_wallet_check.sh
  • src/daemon/mod.rs
  • src/key_management/wallet_helpers.rs
  • src/message_pool/errors.rs
  • src/message_pool/mod.rs
  • src/message_pool/mpool_locker.rs
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/provider.rs
  • src/message_pool/msgpool/selection.rs
  • src/message_pool/msgpool/test_provider.rs
  • src/message_pool/nonce_tracker.rs
  • src/rpc/methods/mpool.rs
  • src/rpc/methods/sync.rs
  • src/rpc/methods/wallet.rs
  • src/rpc/mod.rs
  • src/tool/offline_server/server.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
  • src/tool/subcommands/api_cmd/test_snapshot.rs
  • src/wallet/subcommands/wallet_cmd.rs
💤 Files with no reviewable changes (1)
  • .github/workflows/forest.yml

@sudo-shashank sudo-shashank marked this pull request as draft March 25, 2026 07:15
Member

@LesnyRumcajs LesnyRumcajs left a comment


Apart from existing comments, no other big concerns. I'll let @hanabi1224 give a final sign-off.

hanabi1224
hanabi1224 previously approved these changes Apr 8, 2026
Contributor

@hanabi1224 hanabi1224 left a comment


LGTM with a few NITs

Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
src/message_pool/msgpool/msg_pool.rs (1)

341-358: ⚠️ Potential issue | 🟠 Major

Propagate tipset-scan failures instead of falling back to actor.sequence.

If messages_for_tipset or either resolve_to_key call fails here, the function still caches and returns the actor-state nonce. That undercounts messages already included in the current tipset and can hand out a stale nonce again—the exact failure this refactor is meant to eliminate. Please return the error instead of warn-and-continue.

Suggested direction
-    if let (Ok(resolved), Ok(messages)) = (
-        resolve_to_key(api, key_cache, addr, cur_ts)
-            .inspect_err(|e| tracing::warn!(%addr, "failed to resolve address to key: {e:#}")),
-        api.messages_for_tipset(cur_ts)
-            .inspect_err(|e| tracing::warn!("failed to get messages for tipset: {e:#}")),
-    ) {
-        for msg in messages.iter() {
-            if let Ok(from) = resolve_to_key(api, key_cache, &msg.from(), cur_ts).inspect_err(
-                |e| tracing::warn!(from = %msg.from(), "failed to resolve message sender: {e:#}"),
-            ) && from == resolved
-            {
-                let n = msg.sequence() + 1;
-                if n > next_nonce {
-                    next_nonce = n;
-                }
-            }
-        }
-    }
+    let resolved = resolve_to_key(api, key_cache, addr, cur_ts)?;
+    let messages = api.messages_for_tipset(cur_ts)?;
+    for msg in messages.iter() {
+        let from = resolve_to_key(api, key_cache, &msg.from(), cur_ts)?;
+        if from == resolved {
+            let n = msg.sequence() + 1;
+            if n > next_nonce {
+                next_nonce = n;
+            }
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/msg_pool.rs` around lines 341 - 358, The tipset-scan
currently swallows failures by using inspect_err and falling back to
actor.sequence; change it to propagate errors instead: replace the
inspect_err-warn usages on resolve_to_key(...) and
api.messages_for_tipset(cur_ts) with the `?` operator so that failures in
resolve_to_key or messages_for_tipset (and the inner resolve_to_key when
checking msg.from()) return Err from the surrounding function rather than
logging and continuing; ensure the surrounding function's signature returns
Result so the `?` compiles and update any call sites if needed.
🧹 Nitpick comments (3)
.github/workflows/forest.yml (1)

221-223: Shared concurrency group causes pending job replacement across runs.

Both calibnet-wallet-check and calibnet-delegated-wallet-check use group: calibnet-wallet-tests. GitHub Actions concurrency allows only one running and one pending job per group; when a new job queues while one is pending, the older pending job is canceled and replaced. This can disrupt wallet test scheduling across PR activity.

Consider consolidating the wallet and delegated wallet checks into a single job, or narrow the concurrency group scope if cross-run serialization is not necessary.

Also applies to: 250-252

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/forest.yml around lines 221 - 223, The shared concurrency
group "calibnet-wallet-tests" causes pending job replacement for the two jobs
calibnet-wallet-check and calibnet-delegated-wallet-check; fix by either merging
those two jobs into a single composite job (combine their steps under one job
name) or give each job a narrower/dedicated concurrency.group (e.g., unique
group names per job or include a scope like branch/PR id) so GitHub Actions will
not cancel pending runs across runs; update the concurrency.group value(s)
accordingly for the jobs referenced (calibnet-wallet-check,
calibnet-delegated-wallet-check).
src/rpc/methods/mpool.rs (1)

285-287: Avoid routing the balance preflight through another RPC handler.

Calling WalletBalance::handle(...) here pulls request-scoped RPC plumbing into core mpool logic and duplicates the balance/required-funds path inside MpoolPushMessage. A small shared helper for the balance lookup would keep this path transport-agnostic and reduce drift with the regular mpool validation flow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rpc/methods/mpool.rs` around lines 285 - 287, The mpool code is calling
WalletBalance::handle(...) inside MpoolPushMessage which pulls request-scoped
RPC plumbing into core mpool logic; extract the balance lookup into a small
transport-agnostic helper (e.g., wallet::lookup_balance or
mpool::get_wallet_balance) that accepts the minimal inputs (context/state and
Address) and returns the balance, then replace the WalletBalance::handle call in
MpoolPushMessage with that helper and compute required_funds = message.value +
message.gas_fee_cap * message.gas_limit so the mpool logic no longer depends on
RPC handler plumbing.
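The transport-agnostic helper suggested above might look like the following sketch. The function name is hypothetical, and token amounts are simplified to integers (the real code would use TokenAmount and typed errors); the required-funds formula is the one given in the comment:

```rust
/// Hypothetical balance preflight, decoupled from any RPC handler.
/// Rejects a message before a nonce is consumed if the sender cannot
/// cover value plus the worst-case gas charge.
fn check_balance(
    balance: u128,
    value: u128,
    gas_fee_cap: u128,
    gas_limit: u128,
) -> Result<(), String> {
    // required_funds = value + gas_fee_cap * gas_limit
    let required = value + gas_fee_cap * gas_limit;
    if balance < required {
        return Err(format!(
            "insufficient funds: have {balance}, need {required}"
        ));
    }
    Ok(())
}

fn main() {
    // 100 value + (1 fee cap * 100 gas) = 200 required.
    assert!(check_balance(1_000, 100, 1, 100).is_ok());
    assert!(check_balance(150, 100, 1, 100).is_err());
    println!("ok");
}
```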
src/wallet/subcommands/wallet_cmd.rs (1)

558-564: Consider adding error context for chain ID parsing.

If the hex parsing fails, the error won't clearly indicate it was during eth_chain_id parsing. Per coding guidelines, adding .context() improves error diagnostics.

♻️ Suggested improvement
                     let eth_chain_id = u64::from_str_radix(
                         EthChainId::call(&backend.remote, ())
                             .await?
                             .trim_start_matches("0x"),
                         16,
-                    )?;
+                    )
+                    .context("failed to parse eth chain id")?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wallet/subcommands/wallet_cmd.rs` around lines 558 - 564, The
eth_chain_id hex parsing currently uses u64::from_str_radix(...) which will
produce an opaque parse error if it fails; update the parsing call to attach
contextual information (e.g., using anyhow::Context) so failures indicate they
occurred while parsing eth_chain_id from EthChainId::call(...) -- locate the
EthChainId::call(&backend.remote, ()) sequence and the subsequent
u64::from_str_radix(...) invocation and add .context("parsing eth_chain_id from
hex string returned by EthChainId::call") (or equivalent) to the Result before
the ? so that downstream callers (and the crate::key_management::sign_message
call) get a clear error message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 341-358: The tipset-scan currently swallows failures by using
inspect_err and falling back to actor.sequence; change it to propagate errors
instead: replace the inspect_err-warn usages on resolve_to_key(...) and
api.messages_for_tipset(cur_ts) with the `?` operator so that failures in
resolve_to_key or messages_for_tipset (and the inner resolve_to_key when
checking msg.from()) return Err from the surrounding function rather than
logging and continuing; ensure the surrounding function's signature returns
Result so the `?` compiles and update any call sites if needed.

---

Nitpick comments:
In @.github/workflows/forest.yml:
- Around line 221-223: The shared concurrency group "calibnet-wallet-tests"
causes pending job replacement for the two jobs calibnet-wallet-check and
calibnet-delegated-wallet-check; fix by either merging those two jobs into a
single composite job (combine their steps under one job name) or give each job a
narrower/dedicated concurrency.group (e.g., unique group names per job or
include a scope like branch/PR id) so GitHub Actions will not cancel pending
runs across runs; update the concurrency.group value(s) accordingly for the jobs
referenced (calibnet-wallet-check, calibnet-delegated-wallet-check).

In `@src/rpc/methods/mpool.rs`:
- Around line 285-287: The mpool code is calling WalletBalance::handle(...)
inside MpoolPushMessage which pulls request-scoped RPC plumbing into core mpool
logic; extract the balance lookup into a small transport-agnostic helper (e.g.,
wallet::lookup_balance or mpool::get_wallet_balance) that accepts the minimal
inputs (context/state and Address) and returns the balance, then replace the
WalletBalance::handle call in MpoolPushMessage with that helper and compute
required_funds = message.value + message.gas_fee_cap * message.gas_limit so the
mpool logic no longer depends on RPC handler plumbing.
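The `required_funds` computation the comment refers to is plain arithmetic over the message fields. A minimal sketch of the early balance check, with `u128` standing in for the real TokenAmount type and all names hypothetical:

```rust
// Minimal sketch of the early balance check suggested above. u128 stands in
// for the real TokenAmount type; field and function names are hypothetical.
struct Message {
    value: u128,
    gas_fee_cap: u128,
    gas_limit: u128,
}

// Worst-case funds the message can consume: its value plus the maximum gas spend.
fn required_funds(msg: &Message) -> u128 {
    msg.value + msg.gas_fee_cap * msg.gas_limit
}

// Reject before a nonce is consumed if the sender cannot cover the message.
fn check_balance(balance: u128, msg: &Message) -> Result<(), String> {
    let needed = required_funds(msg);
    if balance < needed {
        Err(format!("insufficient funds: have {balance}, need {needed}"))
    } else {
        Ok(())
    }
}

fn main() {
    let msg = Message { value: 100, gas_fee_cap: 2, gas_limit: 10 };
    assert_eq!(required_funds(&msg), 120);
    assert!(check_balance(119, &msg).is_err());
    assert!(check_balance(120, &msg).is_ok());
}
```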

In `@src/wallet/subcommands/wallet_cmd.rs`:
- Around line 558-564: The eth_chain_id hex parsing currently uses
u64::from_str_radix(...) which will produce an opaque parse error if it fails;
update the parsing call to attach contextual information (e.g., using
anyhow::Context) so failures indicate they occurred while parsing eth_chain_id
from EthChainId::call(...) -- locate the EthChainId::call(&backend.remote, ())
sequence and the subsequent u64::from_str_radix(...) invocation and add
.context("parsing eth_chain_id from hex string returned by EthChainId::call")
(or equivalent) to the Result before the ? so that downstream callers (and the
crate::key_management::sign_message call) get a clear error message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 85cfbc13-17f4-4508-af1c-3cec93556849

📥 Commits

Reviewing files that changed from the base of the PR and between 28a2068 and 0e9e629.

📒 Files selected for processing (10)
  • .github/workflows/forest.yml
  • CHANGELOG.md
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/selection.rs
  • src/rpc/methods/mpool.rs
  • src/rpc/methods/wallet.rs
  • src/rpc/mod.rs
  • src/wallet/subcommands/wallet_cmd.rs
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/rpc/methods/wallet.rs
  • src/rpc/mod.rs
  • scripts/tests/calibnet_delegated_wallet_check.sh

@sudo-shashank sudo-shashank enabled auto-merge April 9, 2026 09:35
@sudo-shashank sudo-shashank added this pull request to the merge queue Apr 9, 2026

        if let (Ok(resolved), Ok(messages)) = (
            resolve_to_key(api, key_cache, addr, cur_ts)
                .inspect_err(|e| tracing::warn!(%addr, "failed to resolve address to key: {e:#}")),
Should we propagate these errors, since the messages here feed directly into the nonce calculation?

        message.sequence = nonce;

        let balance =
            super::wallet::WalletBalance::handle(ctx.clone(), (message.from,), extensions).await?;
Can we directly use the following instead?

StateTree::new_from_root(ctx.store_owned(), cid)?
    .get_actor(&address)?
    .map(|it| it.balance.clone().into())
    .unwrap_or_default()

        } else {
            StrictnessPolicy::Strict
        };
        self.add_helper(msg, trust_policy, strictness)?;
Can we pass the already-calculated sequence here so we don't have to recalculate it in add_helper?
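One way to read this suggestion: thread the computed sequence through so the helper only recomputes when the caller didn't already resolve one. A hypothetical sketch (names and types are illustrative, not the real mpool API):

```rust
// Hypothetical sketch of the suggestion: accept an optional precomputed
// sequence so the helper only recomputes when the caller didn't already
// resolve one. Names are illustrative, not the real mpool API.
fn compute_sequence() -> u64 {
    // Stand-in for the (comparatively expensive) tipset scan.
    42
}

fn add_helper(precomputed: Option<u64>) -> u64 {
    // Reuse the caller's value when present; otherwise fall back to the scan.
    precomputed.unwrap_or_else(compute_sequence)
}

fn main() {
    assert_eq!(add_helper(Some(7)), 7); // no recomputation
    assert_eq!(add_helper(None), 42);   // falls back to the scan
}
```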

pub start_time: chrono::DateTime<chrono::Utc>,
pub snapshot_progress_tracker: SnapshotProgressTracker,
pub shutdown: mpsc::Sender<()>,
pub mpool_locker: crate::message_pool::MpoolLocker,
Should we store both the MpoolLocker and the NonceTracker in the mpool itself, since the RPC layer shouldn't contain locks and trackers tied to a specific API?

What do you guys think @sudo-shashank @LesnyRumcajs ?
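The per-sender locking described in the PR summary can be sketched as a map of per-address mutexes, which would work the same whether it lives in the RPC state or in the mpool itself. This is an illustration of the idea, not Forest's actual implementation; `String` stands in for the real Address type:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Illustrative sketch of a per-sender locker like the MpoolLocker described
// in the PR summary: one mutex per address, so concurrent pushes from the
// same sender serialize while different senders proceed in parallel.
// String stands in for Address; this is not the real Forest implementation.
#[derive(Default)]
struct SenderLocker {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl SenderLocker {
    // Return the (lazily created) lock dedicated to this sender.
    fn lock_for(&self, sender: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(sender.to_string())
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}

fn main() {
    let locker = SenderLocker::default();
    let a1 = locker.lock_for("f1alice");
    let a2 = locker.lock_for("f1alice");
    let b = locker.lock_for("f1bob");
    // Same sender shares one lock; different senders get independent locks.
    assert!(Arc::ptr_eq(&a1, &a2));
    assert!(!Arc::ptr_eq(&a1, &b));
}
```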

Merged via the queue into main with commit 8877124 Apr 9, 2026
57 checks passed
@sudo-shashank sudo-shashank deleted the shashank/revamp-mpool-nonce-calc branch April 9, 2026 10:59

Labels

RPC, requires calibnet (RPC checks to run on CI)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Revamp Message Pool Nonce calculation
  • Forest mpool should handle chain reorganisation
  • Support a better "insufficient funds" error in send command

5 participants