
Revamp Message Pool Nonce calculation #6788

Open
sudo-shashank wants to merge 30 commits into main from shashank/revamp-mpool-nonce-calc

Conversation

@sudo-shashank
Contributor

@sudo-shashank commented Mar 25, 2026

Summary of changes

Changes introduced in this pull request:

  • Nonce calculation:

    • get_state_sequence now scans messages included in the current tipset to find the true highest nonce, rather than relying solely on actor.sequence, which can lag.
    • Add a (TipsetKey, Address)-keyed state nonce cache to avoid redundant tipset scans.
  • Nonce serialization:

    • Add per-sender MpoolLocker preventing concurrent requests from reading stale state.
    • Add NonceTracker that assigns nonces sequentially and persists them to the datastore. This bridges the gap between "messages broadcast to the network but not yet on-chain" and "node restart wiping in-memory state": without it, any restart while messages are in-flight would cause nonce collisions.
  • Nonce gap enforcement:

    • MsgSet::add gains a strict flag that rejects messages exceeding MAX_NONCE_GAP (4 for trusted, 0 for untrusted) and blocks replace-by-fee when a gap exists. Reorg re-additions use strict=false.
  • Early balance check in MpoolPushMessage to reject before consuming a nonce.

  • forest-wallet list cmd now shows a nonce column with table formatting.

Reference issue to close (if applicable)

Closes #4899
Closes #3628

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed message pool nonce calculation to align with Lotus
  • New Features

    • Added nonce tracking and persistence for message pool operations
    • Implemented nonce gap detection in message pool submissions
    • Wallet balance validation now occurs before message push operations
    • Enhanced wallet list display with formatted table showing address balances and nonces
    • Improved delegated address message signing support

@sudo-shashank added the RPC (requires calibnet RPC checks to run on CI) label Mar 25, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Walkthrough

This PR implements message pool nonce calculation alignment with Lotus by introducing a NonceTracker component that persists nonces across restarts, adds address resolution with nonce gap detection, and implements per-address async locking via MpoolLocker for RPC serialization.

Changes

Cohort / File(s) Summary
CI & Changelog
.github/workflows/forest.yml, CHANGELOG.md
Added concurrency policies to wallet check jobs and documented nonce calculation fix for message pool alignment with Lotus.
Wallet Test Scripts
scripts/tests/calibnet_wallet_check.sh, scripts/tests/calibnet_delegated_wallet_check.sh
Updated wallet address field extraction, added Filecoin-to-Ethereum address conversion with balance verification and remote wallet testing, and introduced timeout handling for balance polling.
Nonce Tracking & Concurrency
src/message_pool/nonce_tracker.rs, src/message_pool/mpool_locker.rs
Introduced NonceTracker for persisting per-address nonce state in SettingsStore with sign_and_push workflow, and MpoolLocker for per-address async locking to serialize RPC operations.
Message Pool Re-exports
src/message_pool/mod.rs, src/message_pool/errors.rs
Added public exports for NonceTracker and MpoolLocker; added NonceGap error variant for nonce gap detection.
Message Pool Core Logic
src/message_pool/msgpool/mod.rs, src/message_pool/msgpool/msg_pool.rs
Implemented address resolution to canonical key form, added key_cache and state_nonce_cache LRU caches, introduced nonce gap detection with strict/untrusted modes, threaded caches through head_change and message selection, and updated pending map keying to use resolved addresses.
Message Pool Provider
src/message_pool/msgpool/provider.rs, src/message_pool/msgpool/test_provider.rs
Extended Provider trait with resolve_to_key (ID-to-key resolution) and messages_for_tipset methods; updated test implementation with address mapping and message retrieval.
Message Pool Selection
src/message_pool/msgpool/selection.rs
Updated head-change simulation to accept key_cache parameter and refactored test setup to use apply_head_change wrapper instead of direct head_change calls.
Key Management Signing
src/key_management/wallet_helpers.rs
Added public sign_message helper for delegated and standard key types with EIP-1559 transaction support and comprehensive unit tests.
RPC State & Daemon
src/rpc/mod.rs, src/daemon/mod.rs, src/rpc/methods/sync.rs
Added mpool_locker and nonce_tracker fields to RPCState struct; initialized both components in daemon, offline server, and test contexts.
RPC Methods
src/rpc/methods/mpool.rs, src/rpc/methods/wallet.rs
Updated MpoolPushMessage to acquire per-address locks, delegate nonce handling to NonceTracker::sign_and_push, and validate wallet balance; refactored WalletSignMessage to use the new sign_message helper.
Tool Commands
src/tool/offline_server/server.rs, src/tool/subcommands/api_cmd/generate_test_snapshot.rs, src/tool/subcommands/api_cmd/test_snapshot.rs
Extended RPC state initialization in offline server and test snapshot tools with NonceTracker and MpoolLocker components.
Wallet CLI
src/wallet/subcommands/wallet_cmd.rs
Simplified message signing logic to use the new sign_message helper; refactored WalletCommands::List to fetch actor state concurrently, derive nonce from state, and render output as a formatted table.

Sequence Diagram

sequenceDiagram
    actor Client
    participant RPC Handler
    participant MpoolLocker
    participant NonceTracker
    participant MessagePool
    participant Provider
    participant SettingsStore
    
    Client->>RPC Handler: MpoolPushMessage(message)
    RPC Handler->>MpoolLocker: take_lock(sender_address)
    MpoolLocker->>MpoolLocker: insert or clone per-address Mutex
    MpoolLocker-->>RPC Handler: OwnedMutexGuard (locked)
    
    RPC Handler->>NonceTracker: sign_and_push(message, key)
    NonceTracker->>NonceTracker: acquire global Mutex
    NonceTracker->>MessagePool: get current sequence
    MessagePool->>Provider: resolve_to_key(address)
    Provider-->>MessagePool: canonical key address
    NonceTracker->>SettingsStore: retrieve persisted nonce
    SettingsStore-->>NonceTracker: nonce or default
    NonceTracker->>NonceTracker: compute next_nonce = max(mpool, persisted)
    NonceTracker->>NonceTracker: sign message with nonce
    NonceTracker->>MessagePool: push SignedMessage
    MessagePool-->>NonceTracker: success
    NonceTracker->>SettingsStore: save_nonce(nonce + 1)
    NonceTracker-->>RPC Handler: SignedMessage
    RPC Handler-->>Client: result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • LesnyRumcajs
  • hanabi1224
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Revamp Message Pool Nonce calculation' directly corresponds to the main objective of the PR and accurately summarizes the primary changes across nonce handling, serialization, gap enforcement, and persistence.
Linked Issues check ✅ Passed The PR comprehensively addresses both linked issues: #4899 (nonce calculation refactoring with state sequence scanning, caching, strict gap enforcement) and #3628 (reorg handling via nonce persistence and mpool resend logic) with supporting code changes, tests, and CLI updates.
Out of Scope Changes check ✅ Passed All changes are directly aligned with stated objectives: nonce calculation improvements, mpool/nonce synchronization primitives, delegated wallet signing support, and CI/test infrastructure updates to support the new functionality.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



@sudo-shashank marked this pull request as ready for review March 25, 2026 06:56
@sudo-shashank requested a review from a team as a code owner March 25, 2026 06:56
@sudo-shashank requested reviews from LesnyRumcajs and hanabi1224 and removed request for a team March 25, 2026 06:56
Contributor

@coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)

146-160: ⚠️ Potential issue | 🟠 Major

Don't bind NonceTracker to the shared snapshot-generation DB.

load_db() gives every snapshot run the same Arc<ReadOpsTrackingStore<ManyCar<ParityDb>>>, and this wrapper writes settings through to inner, not tracker. With nonce_tracker pointed at state_manager.blockstore_owned(), persisted nonce-cache updates will bleed into later snapshot generations and may still be absent from the exported minimal snapshot unless those keys get read again. This will make stateful fixtures order-dependent once MpoolPushMessage / MpoolGetNonce snapshots are added.

Based on learnings, "when using ReadOpsTrackingStore to generate minimal snapshots, HEAD_KEY should be written to db.tracker (not db itself) before calling export_forest_car(), because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs` around lines 146 -
160, NonceTracker is being constructed against state_manager.blockstore_owned()
which points at the full DB; instead bind the nonce cache to the
ReadOpsTrackingStore tracker used for snapshot generation so persisted nonce
writes go into the ephemeral tracker not the underlying DB. Change the
NonceTracker creation to use the tracking store (the db.tracker returned by
load_db() / the ReadOpsTrackingStore wrapper) rather than
state_manager.blockstore_owned(), and ensure HEAD_KEY is written to db.tracker
(tracker) before calling export_forest_car() so the export reads the tracked
(accessed) keys only.
src/message_pool/msgpool/mod.rs (1)

324-342: ⚠️ Potential issue | 🟠 Major

Canonicalize the rmsgs key as well.

After this refactor pending is keyed by resolved key address, but Lines 333-342 still probe rmsgs by raw from. On a revert/apply where the same actor shows up as f0… on one side and key-address form on the other, the apply path misses the reverted entry, removes the pending entry instead, and then re-adds the reverted message at the end.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/mod.rs` around lines 324 - 342, The function
remove_from_selected_msgs is checking rmsgs by the raw from address while
pending is keyed by resolved key addresses; call resolve_to_key(api, key_cache,
from, cur_ts) once at the start of the lookup and then use the resolved Address
for both rmsgs and pending accesses (i.e., replace rmsgs.get_mut(from) with
rmsgs.get_mut(&resolved) and use resolved in the remove(...) calls), ensuring
you still propagate the Result error from resolve_to_key and keep the existing
logic for removing the sequence from the temp map when present.
🧹 Nitpick comments (3)
scripts/tests/calibnet_wallet_check.sh (1)

76-76: Avoid scraping the human list table for ADDR_ONE.

This script already had to change once because the list layout moved. Pulling the address out of tail | cut keeps the test coupled to presentation-only output and will break again on the next formatting tweak. Prefer a command or output mode that returns just the address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/calibnet_wallet_check.sh` at line 76, The current extraction of
ADDR_ONE by piping "$FOREST_WALLET_PATH list | tail -1 | cut -d ' ' -f2" scrapes
the human-formatted table and is fragile; change the ADDR_ONE assignment to use
a machine-readable/listing option or output mode from $FOREST_WALLET_PATH (e.g.,
a flag that emits just addresses or JSON) and parse that output (or use a
--quiet/--format option) to reliably select the desired address, so the script
uses the wallet CLI's non-presentational output instead of scraping the printed
table.
src/message_pool/mpool_locker.rs (1)

17-22: Document MpoolLocker::new().

The public constructor is the only exposed MpoolLocker API here without rustdoc.

As per coding guidelines, "Document public functions and structs with doc comments".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/mpool_locker.rs` around lines 17 - 22, Add a doc comment for
the public constructor MpoolLocker::new() describing what a MpoolLocker
represents and what the new() instance contains/initializes (e.g., an empty
inner Mutex<HashMap> used to track per-message-pool locks), include any
thread-safety or ownership notes relevant to callers, and place the triple-slash
/// doc immediately above the impl block or the new() function so rustdoc will
document it alongside the MpoolLocker API.
src/message_pool/msgpool/test_provider.rs (1)

229-244: Consider adding BLS message support for completeness.

The messages_for_tipset implementation only collects signed messages. While this is consistent with TestApi's internal storage (which only stores SignedMessage), the production ChainStore::messages_for_tipset collects both unsigned BLS and signed SECP messages via BlockMessages::for_tipset.

This discrepancy is acceptable for current tests since they primarily use SECP messages, but if future tests require BLS message handling during head changes, this may need enhancement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/test_provider.rs` around lines 229 - 244,
messages_for_tipset currently only returns signed messages (using inner.bmsgs
and ChainMessage::Signed), so add support for BLS/unsigned messages by extending
TestApi's storage and collecting them in the same loop: add a container for
BLS/unsigned messages (e.g., inner.bls_msgs or similar) and in
messages_for_tipset iterate blocks' CIDs to append both signed messages
(ChainMessage::Signed(Arc::new(...))) and unsigned/BLS messages
(ChainMessage::Unsigned or the appropriate ChainMessage variant for BLS) into
msgs before returning; update any TestApi methods that populate messages so
tests can insert BLS messages into the new storage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 40: The changelog entry currently references the PR number; update the
entry in CHANGELOG.md so it links to the issue `#4899` instead of PR `#6788`
(replace the PR link "[`#6788`](...)" with the issue link "[`#4899`](...)" and keep
the description "Fixed message pool nonce calculation to align with Lotus.") to
follow the repo convention of preferring issue references when both issue and PR
exist.

In `@scripts/tests/calibnet_delegated_wallet_check.sh`:
- Around line 87-100: The loop that polls DELEGATE_ADDR_REMOTE_THREE_BALANCE can
false-pass because it only compares to the prior observed balance and doesn’t
ensure MSG_DELEGATE_FOUR actually landed; change the logic in the block using
MSG_DELEGATE_FOUR, DELEGATE_ADDR_REMOTE_THREE_BALANCE and
DELEGATE_ADDR_THREE_BALANCE to either (A) wait for the message CID in
MSG_DELEGATE_FOUR to be included/confirmed before returning (preferred), or (B)
if keeping balance-polling, fail the script when the loop reaches the 20 retry
timeout instead of continuing silently (exit non-zero and log an error), and
keep updating DELEGATE_ADDR_REMOTE_THREE_BALANCE via $FOREST_WALLET_PATH
--remote-wallet balance each iteration; ensure the chosen approach references
MSG_DELEGATE_FOUR for waiting or uses an explicit exit on timeout.

In `@scripts/tests/calibnet_wallet_check.sh`:
- Around line 157-181: The loops currently only compare balances
(ETH_ADDR_TWO_BALANCE / ETH_ADDR_THREE_BALANCE) and will silently succeed after
retries even if the send commands (MSG_ETH, MSG_ETH_REMOTE) never produced a
message CID; change the logic to wait on the actual returned message CIDs from
MSG_ETH and MSG_ETH_REMOTE (extract the CID from MSG_ETH / MSG_ETH_REMOTE) and
poll the node/wallet for that CID confirmation, or at minimum make the 20-retry
timeout fatal by exiting non-zero if the CID was not observed; update the two
loops that reference ETH_ADDR_TWO_BALANCE and ETH_ADDR_THREE_BALANCE to use the
extracted CID variables and fail with exit 1 on timeout so the script cannot
pass silently.

In `@src/message_pool/mpool_locker.rs`:
- Around line 52-74: Tests use timing sleeps which makes them flaky; replace the
sleep-based handshakes in the two test blocks (the tasks spawned as t1/t2 that
call locker.take_lock and use first_entered/first_released/second_saw_first)
with a deterministic sync primitive such as tokio::sync::Notify or
tokio::sync::Barrier: have the first task signal (notify.notify_one() or
barrier.wait().await) immediately after it acquires the lock (where it currently
sets entered.store) and have the second task await that signal before attempting
its assertion (instead of sleeping), and likewise use a notify or barrier to
signal release instead of the 20ms/100ms sleeps; apply the same change to the
other test block referenced (the one around lines 94-116) so both tests no
longer rely on timing.

In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 307-315: The code masks resolve_to_key failures by using
unwrap_or(msg.from()), which can undercount nonces; change the scan to propagate
resolution errors instead: call resolve_to_key(api, key_cache, &msg.from(),
cur_ts)? (or otherwise handle the Result and return Err) instead of unwrap_or,
and adjust the surrounding function's signature/return path to propagate the
error; refer to resolve_to_key, messages_for_tipset, msg.from(), and next_nonce
when making this change so the failure during the current tipset scan is not
silently converted to the original address.

In `@src/message_pool/nonce_tracker.rs`:
- Around line 81-87: The code currently pushes the signed message with
mpool.push(smsg.clone()).await? and then calls self.save_nonce(&message.from,
nonce)? which can return an error and cause MpoolPushMessage to report failure
despite the message already being in the mempool; change this to make
persistence best-effort: call self.save_nonce(&message.from, nonce) but do not
propagate its error — instead catch/log the error (e.g. with error!/warn!) and
still return Ok(smsg). Keep the same call order (sign_message, mpool.push,
save_nonce) but ensure save_nonce failures do not convert the whole operation
into an error.

In `@src/rpc/methods/mpool.rs`:
- Around line 301-306: MpoolGetNonce is reading the nonce directly from mpool
via mpool.get_sequence while WalletSignMessage/MpoolPush use
ctx.nonce_tracker.sign_and_push, causing mismatch after restarts; change
MpoolGetNonce to query the same NonceTracker (e.g., call the nonce-tracking
getter on ctx.nonce_tracker instead of mpool.get_sequence) so both flows use the
same source, and ensure any other code paths that return sequence numbers
(mpool.get_sequence, MpoolPushMessage) are routed through NonceTracker APIs to
keep nonce state consistent.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9c53ddd6-e120-4024-a80f-227d8c503f0b

📥 Commits

Reviewing files that changed from the base of the PR and between 7868455 and e25fabd.

📒 Files selected for processing (24)
  • .github/workflows/forest.yml
  • CHANGELOG.md
  • docs/docs/developers/guides/nonce_handling.md
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • scripts/tests/calibnet_wallet_check.sh
  • src/daemon/mod.rs
  • src/key_management/wallet_helpers.rs
  • src/message_pool/errors.rs
  • src/message_pool/mod.rs
  • src/message_pool/mpool_locker.rs
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/provider.rs
  • src/message_pool/msgpool/selection.rs
  • src/message_pool/msgpool/test_provider.rs
  • src/message_pool/nonce_tracker.rs
  • src/rpc/methods/mpool.rs
  • src/rpc/methods/sync.rs
  • src/rpc/methods/wallet.rs
  • src/rpc/mod.rs
  • src/tool/offline_server/server.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
  • src/tool/subcommands/api_cmd/test_snapshot.rs
  • src/wallet/subcommands/wallet_cmd.rs
💤 Files with no reviewable changes (1)
  • .github/workflows/forest.yml

@sudo-shashank marked this pull request as draft March 25, 2026 07:15
Contributor

@coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/message_pool/nonce_tracker.rs (1)

193-215: Good test coverage for sequential nonce assignment.

The test verifies that consecutive sign_and_push calls assign sequential nonces (0, 1). Consider adding a concurrent test that spawns multiple tasks calling sign_and_push simultaneously to verify the global mutex correctly prevents nonce collisions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/nonce_tracker.rs` around lines 193 - 215, Add a new async
test alongside test_sign_and_push_assigns_sequential_nonces that spawns multiple
concurrent tasks which call NonceTracker::sign_and_push using the same tracker,
mpool, wallet key (from make_test_pool_and_wallet) and sender to ensure no two
returned SignedMessage instances share the same sequence; use
make_test_nonce_store() to create the tracker and await all tasks (e.g., via
join_all) then assert that the set of message().sequence values is contiguous
and unique (0..n-1) to verify the global mutex prevents nonce collisions under
concurrency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce841c06-e607-4d24-9a46-75e1d2eef067

📥 Commits

Reviewing files that changed from the base of the PR and between e25fabd and de77141.

📒 Files selected for processing (6)
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • scripts/tests/calibnet_wallet_check.sh
  • src/daemon/mod.rs
  • src/message_pool/mpool_locker.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/nonce_tracker.rs
✅ Files skipped from review due to trivial changes (1)
  • src/message_pool/msgpool/msg_pool.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • src/message_pool/mpool_locker.rs
  • scripts/tests/calibnet_wallet_check.sh

@codecov

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 89.12249% with 119 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.24%. Comparing base (5146687) to head (a5f884a).
⚠️ Report is 1 commit behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/wallet/subcommands/wallet_cmd.rs 0.00% 43 Missing ⚠️
src/rpc/methods/mpool.rs 0.00% 18 Missing ⚠️
src/message_pool/msgpool/mod.rs 91.20% 10 Missing and 6 partials ⚠️
src/message_pool/msgpool/msg_pool.rs 97.86% 5 Missing and 4 partials ⚠️
src/message_pool/msgpool/selection.rs 81.25% 9 Missing ⚠️
src/message_pool/nonce_tracker.rs 94.66% 3 Missing and 5 partials ⚠️
src/key_management/wallet_helpers.rs 93.57% 2 Missing and 5 partials ⚠️
src/daemon/mod.rs 0.00% 3 Missing ⚠️
...tool/subcommands/api_cmd/generate_test_snapshot.rs 0.00% 3 Missing ⚠️
src/rpc/methods/wallet.rs 0.00% 2 Missing ⚠️
... and 1 more
Additional details and impacted files
Files with missing lines Coverage Δ
src/message_pool/errors.rs 66.66% <ø> (ø)
src/message_pool/mpool_locker.rs 100.00% <100.00%> (ø)
src/message_pool/msgpool/test_provider.rs 99.02% <100.00%> (+0.71%) ⬆️
src/rpc/methods/sync.rs 72.39% <100.00%> (+0.69%) ⬆️
src/rpc/mod.rs 89.20% <ø> (+0.71%) ⬆️
src/tool/offline_server/server.rs 31.27% <100.00%> (+0.92%) ⬆️
src/tool/subcommands/api_cmd/test_snapshot.rs 85.51% <100.00%> (+0.30%) ⬆️
src/message_pool/msgpool/provider.rs 57.14% <90.00%> (+16.11%) ⬆️
src/rpc/methods/wallet.rs 33.15% <0.00%> (+0.87%) ⬆️
src/daemon/mod.rs 27.82% <0.00%> (-0.15%) ⬇️
... and 8 more

... and 10 files with indirect coverage changes



Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5146687...a5f884a.


@LesnyRumcajs
Member

@sudo-shashank Can you please summarise what are the effective logic changes?

> proceed in parallel for different senders while the nonce-critical section
> remains serialized.
>
> ## Nonce persistence
Member


Why do we need to persist them?

Contributor Author

@sudo-shashank Mar 31, 2026


Not a regular case, but if the node restarts while messages are broadcast but not yet on-chain, the mpool would reassign their nonces, causing collisions. Persisting the last-assigned nonce to disk ensures we never reuse an in-flight nonce after a node restart.

@sudo-shashank
Contributor Author

@sudo-shashank Can you please summarise what are the effective logic changes?

updated PR description

Contributor

@coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
src/message_pool/mpool_locker.rs (1)

26-32: Consider deriving Default alongside new().

clippy::new_without_default suggests implementing Default when a type has a new() constructor taking no arguments.

♻️ Proposed fix
+impl Default for MpoolLocker {
+    fn default() -> Self {
+        Self::new()
+    }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/mpool_locker.rs` around lines 26 - 32, Implement Default for
MpoolLocker to satisfy clippy::new_without_default: add an impl Default for
MpoolLocker { fn default() -> Self { Self::new() } } (or derive Default on the
struct if you prefer and the inner types support it) so that
MpoolLocker::default() is equivalent to MpoolLocker::new(); reference symbols:
MpoolLocker and its new() constructor and the Default trait.
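For context, the per-sender locker being reviewed here can be sketched roughly as below. Field and method names are hypothetical, and `std` sync primitives stand in for whatever async types Forest actually uses; the point is only the shape: one lock per sender address, created on demand.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Sketch of a per-sender lock map: each address gets its own mutex,
// so pushes from different senders proceed in parallel while the
// nonce-critical section for a single sender stays serialized.
pub struct MpoolLocker {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl MpoolLocker {
    pub fn new() -> Self {
        Self {
            locks: Mutex::new(HashMap::new()),
        }
    }

    // Fetch (or lazily create) the lock guarding this sender's
    // nonce assignment.
    pub fn lock_for(&self, addr: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(addr.to_string())
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}

fn main() {
    let locker = MpoolLocker::new();
    let a = locker.lock_for("t1abc");
    let b = locker.lock_for("t1abc");
    // Same sender -> same underlying lock object.
    assert!(Arc::ptr_eq(&a, &b));
    let c = locker.lock_for("t1xyz");
    // Different sender -> an independent lock.
    assert!(!Arc::ptr_eq(&a, &c));
    // Hold the guard while reading state and assigning the nonce.
    let _guard = a.lock().unwrap();
    println!("ok");
}
```

A `Default` impl delegating to `new()`, as the comment above suggests, would bolt onto this shape without any other change.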
src/message_pool/msgpool/mod.rs (2)

267-292: Consider hoisting the tipset read outside the inner loop.

cur_tipset.read().clone() is called for every message in the loop. While not incorrect, this acquires and releases the read lock repeatedly. Hoisting the read outside the block iteration would be slightly more efficient.

Proposed improvement
         for b in ts.block_headers() {
             let Ok((msgs, smsgs)) = api.messages_for_block(b) else {
                 tracing::error!("error retrieving messages for block");
                 continue;
             };
+            let cur_ts = cur_tipset.read().clone();

             for msg in smsgs {
-                let cur_ts = cur_tipset.read().clone();
                 remove_from_selected_msgs(
                     api,
                     key_cache,
                     &cur_ts,
                     // ... rest unchanged
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/mod.rs` around lines 267 - 292, The loops call
cur_tipset.read().clone() for every message, causing repeated read-lock
acquires; hoist the tipset snapshot outside the inner loops by reading and
cloning cur_tipset once (e.g., let cur_ts = cur_tipset.read().clone();) before
iterating over smsgs and msgs, then pass that cur_ts into
remove_from_selected_msgs; update both loops that call remove_from_selected_msgs
(the one iterating smsgs and the one iterating msgs) to use the single
precomputed cur_ts and keep existing logic around repub/republished and rmsgs
unchanged.

361-368: Consider using entry API to avoid the unwrap().

While the unwrap() on line 366 cannot actually fail (since the key was just inserted on line 365), it would be cleaner to use the entry API to avoid the pattern entirely. As per coding guidelines, unwrap() should be avoided in production code.

Cleaner entry-based approach
 pub(in crate::message_pool) fn add_to_selected_msgs(
     m: SignedMessage,
     rmsgs: &mut HashMap<Address, HashMap<u64, SignedMessage>>,
 ) {
-    let s = rmsgs.get_mut(&m.from());
-    if let Some(s) = s {
-        s.insert(m.sequence(), m);
-    } else {
-        rmsgs.insert(m.from(), HashMap::new());
-        rmsgs.get_mut(&m.from()).unwrap().insert(m.sequence(), m);
-    }
+    rmsgs
+        .entry(m.from())
+        .or_insert_with(HashMap::new)
+        .insert(m.sequence(), m);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/mod.rs` around lines 361 - 368, Replace the manual
get_mut/insert/unwrap pattern on rmsgs with the HashMap entry API: use
rmsgs.entry(m.from()).or_insert_with(HashMap::new) (or similar) to obtain a
mutable inner map and then insert m using m.sequence() as the key; update the
block around rmsgs, m.from(), and m.sequence() to avoid any unwrap() calls and
make the insertion atomic via entry().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 99615cba-8153-4242-a701-3f306c25e967

📥 Commits

Reviewing files that changed from the base of the PR and between 856be21 and 8294ad5.

📒 Files selected for processing (9)
  • .github/workflows/forest.yml
  • CHANGELOG.md
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • src/message_pool/mpool_locker.rs
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/selection.rs
  • src/message_pool/nonce_tracker.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
  • scripts/tests/calibnet_delegated_wallet_check.sh
  • src/message_pool/msgpool/selection.rs
  • src/message_pool/nonce_tracker.rs

Comment on lines +280 to +285
concurrency:
group: calibnet-wallet-tests
cancel-in-progress: false
needs:
- build-ubuntu
- calibnet-wallet-check
Contributor


⚠️ Potential issue | 🟠 Major

Wire this re-enabled job into integration-tests-status.

calibnet-delegated-wallet-check now runs, but the umbrella job at Lines 672-693 still does not list it in needs. integration-tests-status can therefore pass while delegated wallet coverage is failing or skipped.

Patch for the umbrella job
   integration-tests-status:
     needs:
       - build-macos
       - build-ubuntu
       - cargo-publish-dry-run
       - forest-cli-check
       - calibnet-check
       - calibnet-stateless-mode-check
       - calibnet-stateless-rpc-check
       - state-migrations-check
       - calibnet-wallet-check
+      - calibnet-delegated-wallet-check
       - calibnet-export-check-v2
       - calibnet-no-discovery-checks
       - calibnet-kademlia-checks
       - calibnet-eth-mapping-check
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/forest.yml around lines 280 - 285, The umbrella workflow
job integration-tests-status is missing the newly re-enabled
calibnet-delegated-wallet-check in its needs list; update the
integration-tests-status job definition to include
"calibnet-delegated-wallet-check" in its needs array so the umbrella job waits
for and reflects the delegated wallet check result (ensure the job name matches
exactly "calibnet-delegated-wallet-check" when adding it to
integration-tests-status's needs).

Comment on lines +323 to +337
if let (Ok(resolved), Ok(messages)) = (
resolve_to_key(api, key_cache, addr, cur_ts),
api.messages_for_tipset(cur_ts),
) {
for msg in &messages {
if let Ok(from) = resolve_to_key(api, key_cache, &msg.from(), cur_ts)
&& from == resolved
{
let n = msg.sequence() + 1;
if n > next_nonce {
next_nonce = n;
}
}
}
}
Contributor


⚠️ Potential issue | 🟠 Major

Don't silently skip the tipset message scan when operations fail.

The if let (Ok(...), Ok(...)) pattern silently ignores errors from resolve_to_key and messages_for_tipset. If either fails, the scan is skipped entirely, potentially undercounting nonces and leading to nonce reuse — the exact failure this refactor aims to prevent.

This is similar to the concern raised in a previous review about masking sender-resolution failures.

Propagate errors instead of silently skipping
-    if let (Ok(resolved), Ok(messages)) = (
-        resolve_to_key(api, key_cache, addr, cur_ts),
-        api.messages_for_tipset(cur_ts),
-    ) {
-        for msg in &messages {
-            if let Ok(from) = resolve_to_key(api, key_cache, &msg.from(), cur_ts)
-                && from == resolved
-            {
-                let n = msg.sequence() + 1;
-                if n > next_nonce {
-                    next_nonce = n;
-                }
-            }
-        }
-    }
+    let resolved = resolve_to_key(api, key_cache, addr, cur_ts)?;
+    let messages = api.messages_for_tipset(cur_ts)?;
+    for msg in &messages {
+        let from = resolve_to_key(api, key_cache, &msg.from(), cur_ts)?;
+        if from == resolved {
+            let n = msg.sequence() + 1;
+            if n > next_nonce {
+                next_nonce = n;
+            }
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/message_pool/msgpool/msg_pool.rs` around lines 323 - 337, The current
combined if-let on resolve_to_key and api.messages_for_tipset silently skips the
tipset scan on any error; change this to explicitly handle errors from
resolve_to_key(api, key_cache, addr, cur_ts) and api.messages_for_tipset(cur_ts)
(e.g. use match or the ? operator) so failures are returned/propagated instead
of ignored; then perform the loop over messages (examining msg.from(),
resolve_to_key for each sender, comparing to the resolved sender, and updating
next_nonce from msg.sequence()+1) only when both calls succeed, ensuring caller
sees the error rather than silently undercounting nonces.


Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Revamp Message Pool Nonce calculation
Forest mpool should handle chain reorganisation

2 participants