Skip to content

fix(consensus): Respect EL Sync on Startup#1657

Draft
refcell wants to merge 1 commit intomainfrom
fix/el-sync-not-respected
Draft

fix(consensus): Respect EL Sync on Startup#1657
refcell wants to merge 1 commit intomainfrom
fix/el-sync-not-respected

Conversation

@refcell
Copy link
Contributor

@refcell refcell commented Mar 24, 2026

Summary

Fixes the base-consensus sequencer to respect el sync on startup

@cb-heimdall
Copy link
Collaborator

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 0/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 0
Sum 1

@vercel
Copy link

vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
base Ignored Ignored Mar 24, 2026 11:35pm

Request Review

Comment on lines +283 to 293
} else if let (Ok(Some(head)), Some(unsafe_head_tx)) =
(reth_head, self.unsafe_head_tx.as_ref())
{
unsafe_head_tx
.send_if_modified(|val| (*val != head).then(|| *val = head).is_some());
info!(
target: "engine",
head = %head.block_info.number,
"Bootstrap: skipped reset (beyond genesis), seeded unsafe head from reth"
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else if arm silently does nothing when unsafe_head_tx is None (validator mode) and reth is beyond genesis. In that case:

  • engine.reset() is skipped (correct — avoids reorging below the state pivot)
  • el_sync_finished stays false (its Default)
  • No watch channel is seeded

The validator path doesn't call schedule_initial_reset, so this isn't a sequencer-style deadlock. But consider logging at info! level when we're beyond genesis in validator mode so operators can distinguish "skipped because validator" from "skipped because watch channel missing":

Suggested change
} else if let (Ok(Some(head)), Some(unsafe_head_tx)) =
(reth_head, self.unsafe_head_tx.as_ref())
{
unsafe_head_tx
.send_if_modified(|val| (*val != head).then(|| *val = head).is_some());
info!(
target: "engine",
head = %head.block_info.number,
"Bootstrap: skipped reset (beyond genesis), seeded unsafe head from reth"
);
}
} else if let Ok(Some(head)) = reth_head {
if let Some(unsafe_head_tx) = self.unsafe_head_tx.as_ref() {
unsafe_head_tx
.send_if_modified(|val| (*val != head).then(|| *val = head).is_some());
}
info!(
target: "engine",
head = %head.block_info.number,
"Bootstrap: skipped reset (beyond genesis), seeded unsafe head from reth"
);
}

This way the log line fires for validators too, confirming the bootstrap took the "beyond genesis" branch.

Comment on lines +153 to +162
let confirmed =
self.check_forkchoice_updated_status(state, &valid_response.payload_status.status)?;

// Apply the new sync state to the engine state.
state.sync_state = new_sync_state;
// Only apply the sync-state update when the EL confirmed the forkchoice
// (`Valid`). When the EL returns `Syncing` the block is merely stored —
// advancing `sync_state` here would move `unsafe_head` beyond what the EL
// can serve, creating a gap that breaks derivation consolidation.
if confirmed {
state.sync_state = new_sync_state;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the interaction with the early-return at line 112: when the EL returns Syncing, sync_state is not updated (confirmed = false). If the same state_update is retried, new_sync_state equals state.sync_state (both still at the old value), the early return at line 112 fires, no FCU is sent, and el_sync_finished never flips to true.

In practice this may not trigger because callers typically advance to a new block. But it may be worth guarding the early return on el_sync_finished being true, so that a retry of the same update while the EL is still syncing re-sends the FCU rather than no-oping:

if state.el_sync_finished && state.sync_state != Default::default() && state.sync_state == new_sync_state {

@github-actions
Copy link
Contributor

Review Summary

The PR correctly addresses the core problem: when the EL returns Syncing from engine_forkchoiceUpdated, the node should not advance sync_state (and thus unsafe_head) beyond what the EL can actually serve. The approach -- returning a bool from check_forkchoice_updated_status and gating the state update -- is clean and well-tested.

Findings

1. Early-return short-circuit vs Syncing retry (task.rs line 112) -- When the EL returns Syncing, sync_state is intentionally not updated. If the same state_update is retried, new_sync_state == state.sync_state triggers the early return, skipping the FCU entirely. This means the EL is never re-queried and el_sync_finished stays false. Guarding the early return on state.el_sync_finished would allow the FCU to be re-sent on retry while the EL is still syncing.
2. Beyond-genesis bootstrap in validator mode (engine_request_processor.rs lines 283-293) -- The else-if arm requires both Ok(Some(head)) and Some(unsafe_head_tx). In validator mode (unsafe_head_tx = None), the entire branch is skipped silently -- no log line, no state seeding. The info log should fire regardless of whether the watch channel exists, so operators can confirm the node took the beyond-genesis bootstrap path.

Looks Good

  • The SynchronizeTask change from Result unit to Result bool is well-designed and preserves the Output = () trait contract.
  • The ELSyncing error variant and the sequencer retry loop with cancellation support are correct.
  • The reset deferral in the engine processor (el_sync_finished gate) prevents aborting in-progress snap sync.
  • Test coverage for Valid, Syncing, and Syncing-then-Valid sequences is thorough.

@github-actions github-actions bot added the M-prevent-stale Meta: Not Stale label Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

M-prevent-stale Meta: Not Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants