feat: validator authority override for replication and replay#1043
feat: validator authority override for replication and replay#1043
Conversation
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019cd734-802b-77f0-8350-6e7e485d0920 Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019cd734-802b-77f0-8350-6e7e485d0920 Co-authored-by: Amp <amp@ampcode.com>
…in test-kit Amp-Thread-ID: https://ampcode.com/threads/T-019cd734-802b-77f0-8350-6e7e485d0920 Co-authored-by: Amp <amp@ampcode.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a validator authority override flow: LedgerConfig gains an optional Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-api/src/magic_validator.rs (1)
556-582:⚠️ Potential issue | 🔴 CriticalOverride state lost after ledger replay in replication-mode validators.
When a validator is configured in
StandByorReplicatOnlymode AND hasreplay_authority_overrideconfigured, the replication-mode authority override is permanently lost after ledger replay:
- Initialization (line 266): Replication mode sets override to the primary validator's pubkey
- Ledger replay (line 559): Override is temporarily replaced with
replay_authority_override- Post-replay (line 581): Override is unconditionally cleared
- Mode transition (line 166): Validator transitions to Replica mode without the proper authority override
This breaks authority signature verification for all subsequent transactions processed in Replica mode, since they lack the required replication-mode authority override that should have been preserved.
Restore the replication-mode override after replay completes, either by:
- Saving and restoring the previous override value
- Re-applying the replication mode override before transitioning to Replica mode
- Only clearing the override if it matches the replay override, preserving any replication-mode override
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-api/src/magic_validator.rs` around lines 556 - 582, When setting the temporary replay override (self.config.ledger.replay_authority_override) before process_ledger, first read and save the current validator override via validator::set_validator_authority_override/validator::unset_validator_authority_override pair (e.g., capture the existing override value), then set the replay override, run process_ledger, and after replay restore the saved override instead of unconditionally calling validator::unset_validator_authority_override(); alternatively only unset if the current override equals the replay override. Update the block around process_ledger (where validator::set_validator_authority_override, process_ledger, and validator::unset_validator_authority_override are used) to preserve and restore the prior override so replication-mode authority remains active after replay.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 556-582: When setting the temporary replay override
(self.config.ledger.replay_authority_override) before process_ledger, first read
and save the current validator override via
validator::set_validator_authority_override/validator::unset_validator_authority_override
pair (e.g., capture the existing override value), then set the replay override,
run process_ledger, and after replay restore the saved override instead of
unconditionally calling validator::unset_validator_authority_override();
alternatively only unset if the current override equals the replay override.
Update the block around process_ledger (where
validator::set_validator_authority_override, process_ledger, and
validator::unset_validator_authority_override are used) to preserve and restore
the prior override so replication-mode authority remains active after replay.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ef3d5968-8efc-48c0-8ca0-c9e27ad2af17
📒 Files selected for processing (11)
magicblock-api/src/magic_validator.rsmagicblock-config/src/config/ledger.rsmagicblock-config/src/config/validator.rsprograms/magicblock/src/clone_account/common.rsprograms/magicblock/src/magic_scheduled_base_intent.rsprograms/magicblock/src/mutate_accounts/process_mutate_accounts.rsprograms/magicblock/src/schedule_task/process_schedule_task.rsprograms/magicblock/src/schedule_transactions/process_accept_scheduled_commits.rsprograms/magicblock/src/schedule_transactions/process_scheduled_commit_sent.rsprograms/magicblock/src/toggle_executable_check.rsprograms/magicblock/src/validator.rs
Amp-Thread-ID: https://ampcode.com/threads/T-019ce0e2-0bcc-77ec-8f73-7febb61c8392 Co-authored-by: Amp <amp@ampcode.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-api/src/magic_validator.rs (1)
560-588:⚠️ Potential issue | 🟠 MajorRestore the replay override on the error path too.
Line 578 can return through
?before the cleanup at Line 582 runs. If replay fails, the temporary replay authority stays installed in global state and later permission checks use the wrong effective validator id.Suggested fix
let prior_override = validator::validator_authority_override(); - if let Some(ref pk) = self.config.ledger.replay_authority_override { - validator::set_validator_authority_override(pk.0); - } + let replay_override = self + .config + .ledger + .replay_authority_override + .as_ref() + .map(|pk| pk.0); + if let Some(pk) = replay_override { + validator::set_validator_authority_override(pk); + } // SOLANA only allows blockhash to be valid for 150 slot back in time, // considering that the average slot time on solana is 400ms, then: const SOLANA_VALID_BLOCKHASH_AGE: u64 = 150 * 400; // we have this number for our max blockhash age in slots, which correspond to 60 seconds let max_block_age = SOLANA_VALID_BLOCKHASH_AGE / self.config.ledger.block_time_ms(); let step_start = Instant::now(); - let slot_to_continue_at = process_ledger( + let replay_result = process_ledger( &self.ledger, accountsdb_slot, self.transaction_scheduler.clone(), max_block_age, ) - .await?; + .await; + + if replay_override.is_some() { + match prior_override { + Some(pk) => validator::set_validator_authority_override(pk), + None => validator::unset_validator_authority_override(), + } + } + + let slot_to_continue_at = replay_result?; log_timing("startup", "ledger_replay", step_start); self.accountsdb.set_slot(slot_to_continue_at); - - // Restore the prior authority override now that replay is done. - if self.config.ledger.replay_authority_override.is_some() { - match prior_override { - Some(pk) => validator::set_validator_authority_override(pk), - None => validator::unset_validator_authority_override(), - } - }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-api/src/magic_validator.rs` around lines 560 - 588, The temporary replay authority override is only restored on the success path; if process_ledger(...) returns an error (the `?`), the prior override remains set. Change the flow so you capture the result of process_ledger into a variable (e.g., let result = process_ledger(...).await;) or match on its Result, then always restore the prior override using validator::set_validator_authority_override or validator::unset_validator_authority_override based on prior_override before propagating the result (i.e., return result? after restoration). Ensure the restoration block references prior_override and self.config.ledger.replay_authority_override exactly as in the existing code so the cleanup runs on both Ok and Err paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 263-269: Before applying the replication-mode override, ensure any
previous process-global override is cleared: call
validator::clear_validator_authority_override() (or the equivalent clear
function) before the match on config.validator.replication_mode and also in the
ReplicationMode::Standalone arm; then only call
validator::set_validator_authority_override(pk.0) for the StandBy/ReplicatOnly
arms so a second MagicValidator in the same process cannot inherit a stale
override.
---
Outside diff comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 560-588: The temporary replay authority override is only restored
on the success path; if process_ledger(...) returns an error (the `?`), the
prior override remains set. Change the flow so you capture the result of
process_ledger into a variable (e.g., let result = process_ledger(...).await;)
or match on its Result, then always restore the prior override using
validator::set_validator_authority_override or
validator::unset_validator_authority_override based on prior_override before
propagating the result (i.e., return result? after restoration). Ensure the
restoration block references prior_override and
self.config.ledger.replay_authority_override exactly as in the existing code so
the cleanup runs on both Ok and Err paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6907c714-a2b3-4aee-9f95-0c5a4fbaafdc
📒 Files selected for processing (1)
magicblock-api/src/magic_validator.rs
…hlorenz/schedulter_dual-mode+validator-keypair+override
…hlorenz/schedulter_dual-mode+validator-keypair+override
… mode transitions Amp-Thread-ID: https://ampcode.com/threads/T-019ce626-d458-711b-a4f8-81350e6b1a45 Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019ce62a-8f42-7138-8a0a-4c5f43c1e2df Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019ce62f-907c-719f-885d-43cf944b98c3 Co-authored-by: Amp <amp@ampcode.com>
…hlorenz/schedulter_dual-mode+validator-keypair+override
bmuddha
left a comment
There was a problem hiding this comment.
Please address the initial comments and rebase (a lot of merged changes still showing up in this PR). I'll give it another pass afterwards.
…ypair+override * master: (34 commits) fix: allow not existing accounts in actions (#1120) fix: published npm wrapper (#1115) release: 0.8.5 (#1113) fix: actions dropping on eata cold cloning (#1112) fix (CloneAccount): enforce incoming_slot > current_slot when cloning an account (#1110) fix: flaky restore_ledger integration test (#1107) Fix: remove conditions that allow to overrides undelegating ata (trough mapped eata) (#1105) feat: add logs for startup time on chainlink and committer service (#1106) feat: action callbacks (#1061) fix: error serialization (#1098) chore: upgrade SVM crate (#1097) feat: Add ScheduleCommitFinalize along with CommitFinalizeTask (#847) fix: Dedup actions execution (#1095) Update `magicblock-api` README to match current async API (#1090) fix: return tokio-console support back (#1089) hotfix: toml integration tests (#1087) chore: fix libsodium lib for arm (#1085) hotfix: lock file and deps integration tests (#1084) hotfix: lockfile (#1083) release: 0.8.4 (#1082) ...
Manual Deploy AvailableYou can trigger a manual deploy of this PR branch to testnet: Alternative: Comment
Comment updated automatically when the PR is synchronized. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-api/src/magic_validator.rs (1)
582-610:⚠️ Potential issue | 🟠 MajorAlways restore the prior override when replay fails.
Line 600 returns early on
process_ledger(...)errors, so the cleanup at Lines 604-610 never runs. This is process-global state, so a failed replay can leave subsequent authority checks pinned toreplay_authority_override.🛠️ Suggested fix
- let slot_to_continue_at = process_ledger( - &self.ledger, - accountsdb_slot, - self.transaction_scheduler.clone(), - max_block_age, - ) - .await?; + let slot_to_continue_at = match process_ledger( + &self.ledger, + accountsdb_slot, + self.transaction_scheduler.clone(), + max_block_age, + ) + .await + { + Ok(slot) => slot, + Err(err) => { + if self.config.ledger.replay_authority_override.is_some() { + match prior_override { + Some(pk) => validator::set_validator_authority_override(pk), + None => validator::unset_validator_authority_override(), + } + } + return Err(err.into()); + } + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-api/src/magic_validator.rs` around lines 582 - 610, The playback override (self.config.ledger.replay_authority_override) is only restored after a successful process_ledger call, so on error the prior_override is never reinstated; change the logic around process_ledger (the call to process_ledger(...) and the prior_override handling) so the prior override is always restored regardless of process_ledger's outcome — e.g., call process_ledger and store its Result, then always run the restore logic that uses prior_override and validator::set_validator_authority_override / validator::unset_validator_authority_override, and finally return or propagate the stored Result (or use a scope guard that restores prior_override on drop) so that process-global state is never left pinned to replay_authority_override after a failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test-integration/test-ledger-restore/src/lib.rs`:
- Around line 290-303: The local-remote override helper
start_magicblock_validator_with_config_struct currently only airdrops lamports
and does not perform fee-vault bootstrap like setup_validator_with_local_remote;
update start_magicblock_validator_with_config_struct (or the calling sequence
around start_magicblock_validator_with_config_struct in
test-ledger-restore/src/lib.rs) to call init_validator_fees_vault(...) with the
same arguments used by setup_validator_with_local_remote after the airdrop and
before validator startup so the validator fees vault is initialized the same way
as the original helper; ensure you reference the same
loaded_accounts.validator_authority() and any config/port/validator handles
required by init_validator_fees_vault to match behavior.
---
Outside diff comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 582-610: The playback override
(self.config.ledger.replay_authority_override) is only restored after a
successful process_ledger call, so on error the prior_override is never
reinstated; change the logic around process_ledger (the call to
process_ledger(...) and the prior_override handling) so the prior override is
always restored regardless of process_ledger's outcome — e.g., call
process_ledger and store its Result, then always run the restore logic that uses
prior_override and validator::set_validator_authority_override /
validator::unset_validator_authority_override, and finally return or propagate
the stored Result (or use a scope guard that restores prior_override on drop) so
that process-global state is never left pinned to replay_authority_override
after a failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 87c07f7d-d8fd-4fbb-8aa8-f11141871938
📒 Files selected for processing (5)
magicblock-api/src/magic_validator.rsprograms/magicblock/src/clone_account/common.rsprograms/magicblock/src/magic_scheduled_base_intent.rsprograms/magicblock/src/schedule_transactions/process_scheduled_commit_sent.rstest-integration/test-ledger-restore/src/lib.rs
- technically this was not an issue since we abort the startup process if ledger replay fails - therefore this is a defensive measure and merely for correctness
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
magicblock-api/src/magic_validator.rs (1)
277-283:⚠️ Potential issue | 🟠 MajorGuard the replication-mode override until construction succeeds.
This mutates process-global authority state while
try_from_config()still has later?exits. If one of those init steps fails, the override remains active even though noMagicValidatorinstance exists to clean it up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-api/src/magic_validator.rs` around lines 277 - 283, The code calls validator::set_validator_authority_override based on config.validator.replication_mode before try_from_config() completes, leaving global authority state mutated if later initialization fails; change this by deferring the override until after MagicValidator construction succeeds or by installing a temporary RAII guard that sets the override only while try_from_config() runs and automatically reverts on error—specifically move the call out of the early match on ReplicationMode (or wrap it with a guard) so that set_validator_authority_override and its reversal are tied to the successful completion of try_from_config() and the lifetime of the created MagicValidator.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 578-585: The replay-authority override is not restored if
process_ledger() returns early, leaving the process-global override stuck;
capture the result of process_ledger (or use an RAII guard) so you can call
validator::set_validator_authority_override(prior_override) before propagating
errors, e.g. call let res = self.process_ledger(...); restore via
validator::set_validator_authority_override(prior_override); then return res?;
apply the same pattern to the other override block around process_ledger at the
604-610 region to ensure the prior_override (from
validator::validator_authority_override()) is always restored.
---
Duplicate comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 277-283: The code calls
validator::set_validator_authority_override based on
config.validator.replication_mode before try_from_config() completes, leaving
global authority state mutated if later initialization fails; change this by
deferring the override until after MagicValidator construction succeeds or by
installing a temporary RAII guard that sets the override only while
try_from_config() runs and automatically reverts on error—specifically move the
call out of the early match on ReplicationMode (or wrap it with a guard) so that
set_validator_authority_override and its reversal are tied to the successful
completion of try_from_config() and the lifetime of the created MagicValidator.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 54e3103a-9638-4bc7-8008-7bbaab14354b
📒 Files selected for processing (2)
magicblock-api/src/magic_validator.rsmagicblock-config/src/config/validator.rs
Amp-Thread-ID: https://ampcode.com/threads/T-019d423b-0554-7209-9766-6a0de8926cb1 Co-authored-by: Amp <amp@ampcode.com>
Summary
Add validator keypair override functionality to support dual-mode validation and ledger replay
scenarios. Enables replication modes (StandBy and ReplicatOnly) to specify a primary validator
pubkey for signature verification, and allows temporary override of validator authority during
ledger replay for transaction verification against a different authority.
Partially addresses: #1026
Details
This PR introduces validator authority override capabilities across the system to support two key scenarios:
Config Updates
Extended
ReplicationModeenum variants (StandBy,ReplicatOnly) to carry aSerdePubkeyrepresenting the primary validator's public key. Addedreplay_authority_overridefield toLedgerConfigto allow specifying an alternative validator authority during ledger replay operations.Validator Authority Override Mechanism
Added static override state (
VALIDATOR_AUTHORITY_OVERRIDE) to the validator module with three new functions:set_validator_authority_override(): Set the override pubkeyunset_validator_authority_override(): Clear the overrideeffective_validator_authority_id(): Return override if set, otherwise the actual validator authorityMagicValidator Initialization
Updated validator initialization to set the authority override for replication modes (StandBy
and ReplicatOnly) before program initialization, ensuring authority checks use the primary
validator's pubkey.
Ledger Replay Override
During ledger replay in StartingUp mode, if
replay_authority_overrideis configured, it'stemporarily set before replaying transactions and unset after replay completes. This allows
replayed transactions to be verified against a different authority than the current validator.
Program Updates
Updated all program instruction handlers to use
effective_validator_authority_id()instead ofvalidator_authority_id(), affecting:This ensures all authority verification respects any active override.
Summary by CodeRabbit
New Features
Configuration Changes
Behavioral Changes
Tests