
feat: batch cold account loads in light client #2341

Open
sergeytimoshin wants to merge 4 commits into main from sergey/load-accounts-batching

Conversation


@sergeytimoshin sergeytimoshin commented Mar 14, 2026

Summary by CodeRabbit

  • Refactor

    • Stronger validation and clearer error messages across batch and proof flows
    • Snapshot-based batch handling for more predictable, memory-efficient processing
    • Proof reconstruction now uses fixed-height arrays for determinism
  • Public API

    • Safer borrow-based inputs and adjusted return types for batch/proof interfaces
    • New explicit error variant for integer conversion failures
  • Tests

    • Updated tests to match new signatures and stricter validations
  • Bug Fixes

    • Health-check/network behavior made more robust; server address normalized to 127.0.0.1


coderabbitai bot commented Mar 14, 2026

📝 Walkthrough

Replaces runtime Vec-based proof handling with const-generic HEIGHT fixed-size proofs and a new AddressBatchSnapshot API; batch retrieval/processing now returns snapshots, proof reconstruction is HEIGHT-generic, and error propagation is tightened across forester, SDK indexer, and prover client code.
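The central change described above — moving from runtime `Vec`-based proofs to const-generic fixed-height arrays — can be sketched in isolation. This is an illustrative sketch only; the type and function names are hypothetical, not the PR's actual API:

```rust
/// A Merkle proof whose height is fixed at compile time.
/// (Hypothetical stand-in for the PR's fixed-size proof types.)
struct FixedProof<const HEIGHT: usize> {
    siblings: [[u8; 32]; HEIGHT],
}

/// Fallible conversion from a runtime Vec: the length must match HEIGHT
/// exactly, otherwise an error is surfaced instead of truncating or panicking.
fn to_fixed<const HEIGHT: usize>(
    proof: Vec<[u8; 32]>,
) -> Result<FixedProof<HEIGHT>, String> {
    let len = proof.len();
    let siblings: [[u8; 32]; HEIGHT] = proof
        .try_into()
        .map_err(|_| format!("invalid proof length: got {len}, expected {HEIGHT}"))?;
    Ok(FixedProof { siblings })
}

fn main() {
    // A height-3 proof converts cleanly...
    assert!(to_fixed::<3>(vec![[0u8; 32]; 3]).is_ok());
    // ...while a wrong-length proof is rejected rather than truncated.
    assert!(to_fixed::<3>(vec![[0u8; 32]; 2]).is_err());
}
```

The compile-time height removes per-call `Vec` allocations and makes height mismatches a single fallible conversion at the boundary, which is what the tightened error propagation in the forester, SDK indexer, and prover client builds on.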

Changes

Cohort / File(s) Summary
Forester — batch snapshot API
forester/src/processor/v2/helpers.rs
Adds pub struct AddressBatchSnapshot<const HEIGHT: usize>; replaces get_batch_data() with get_batch_snapshot<const HEIGHT>(...) -> Result<Option<AddressBatchSnapshot<HEIGHT>>>; validates empty batches, derives missing leaves_hashchain when needed, reconstructs low_element_proofs into the snapshot, and removes BatchDataSlice.
Forester — batch processing
forester/src/processor/v2/strategy/address.rs
Switches to get_batch_snapshot(...), destructures snapshot fields (addresses, low_element_*, proofs, leaves_hashchain), computes hashchain_idx, and delegates processing to staging_tree.process_batch(...), mapping errors via map_address_staging_error. Removes inline proof/hashchain reconstruction.
SDK indexer — proof types & reconstruction
sdk-libs/client/src/indexer/types/queue.rs
Introduces const-generic HEIGHT for proofs, changes reconstruct_proof() to return [[u8; 32]; HEIGHT], adds reconstruct_proofs()/reconstruct_all_proofs() and node-lookup helpers, validates proof height, and replaces per-call Vec allocations with fixed-size arrays.
SDK indexer — photon & minor fixes
sdk-libs/client/src/indexer/photon_indexer.rs
Avoids cloning proof vectors by slicing x.proof[..proof_len], maps Base58 decode errors to IndexerError::Base58DecodeError { field: "proof" }, and replaces unused _merkle_tree_pubkey param with merkle_tree_pubkey returning a NotImplemented error including the pubkey.
SDK client — load_accounts batching
sdk-libs/client/src/interface/load_accounts.rs
Adds LoadAccountsError::MissingPdaCompressedData { pubkey }; groups PDA specs by program id and uses batched fetch_proof_batches per group (plus separate ATA/mint fetch paths); introduces group_pda_specs and fetch_individual_proofs helpers and adjusts iteration to operate on groups.
Prover client — batch address append API
prover/client/src/proof_types/batch_address_append/proof_inputs.rs, prover/client/src/errors.rs
Changes get_batch_address_append_circuit_inputs() to accept borrowed slices (&[[u8;32]], &[[[u8;32]; HEIGHT]], index slices) instead of owned Vecs; updates BatchAddressAppendInputs shape; adds ProverClientError::IntegerConversion(String) and stronger index conversion/error propagation.
Prover client — helper error propagation
prover/client/src/helpers.rs, prover/client/src/proof_types/*/proof_inputs.rs
Makes compute_root_from_merkle_proof fallible: returns Result<([u8;32], ChangelogEntry<HEIGHT>), ProverClientError> and updates callers (batch_append, batch_update, etc.) to propagate with ? instead of unwrap.
Tests / harness
prover/client/tests/batch_address_append.rs, sdk-libs/program-test/src/indexer/test_indexer.rs
Adapted tests to new slice-based signatures and fixed-size proof arrays; updated proof construction to as_slice().try_into().unwrap() for fixed-size arrays; adjusted start_index logic in indexer test to use bundle right_most_index.
Misc — constants & client
prover/client/src/constants.rs, prover/client/src/prover.rs
Changed SERVER_ADDRESS to http://127.0.0.1:3001; health_check now builds reqwest client with .no_proxy() and handles client build errors.
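The load_accounts batching summarized in the table — group PDA specs by owning program, then fetch each group with one batched proof request — can be sketched roughly as follows. The `PdaSpec` shape and the `group_pda_specs` signature here are assumptions for illustration, not the SDK's real types:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the SDK's PDA spec type; the real fields differ.
#[derive(Clone, Debug, PartialEq)]
struct PdaSpec {
    program_id: [u8; 32],
    seed: u8,
}

/// Group specs by program id, preserving insertion order within each group,
/// so every group can be resolved with a single batched fetch instead of
/// one round-trip per account.
fn group_pda_specs(specs: &[PdaSpec]) -> HashMap<[u8; 32], Vec<PdaSpec>> {
    let mut groups: HashMap<[u8; 32], Vec<PdaSpec>> = HashMap::new();
    for spec in specs {
        groups.entry(spec.program_id).or_default().push(spec.clone());
    }
    groups
}

fn main() {
    let a = [1u8; 32];
    let b = [2u8; 32];
    let specs = vec![
        PdaSpec { program_id: a, seed: 0 },
        PdaSpec { program_id: b, seed: 1 },
        PdaSpec { program_id: a, seed: 2 },
    ];
    let groups = group_pda_specs(&specs);
    // Three specs collapse into two batched fetches, one per program.
    assert_eq!(groups.len(), 2);
    assert_eq!(groups[&a].len(), 2);
    assert_eq!(groups[&b].len(), 1);
}
```

Grouping before fetching is what turns N individual proof lookups into one `fetch_proof_batches`-style call per program, which is the cold-load efficiency win the PR title refers to.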

Sequence Diagram(s)

sequenceDiagram
    participant Stream as StreamingQueue
    participant Forester as Forester::ProcessorV2
    participant Staging as StagingTree
    participant Prover as ProverClient
    participant Circuit as Circuit

    Stream->>Forester: request batch (start,end,hashchain_idx)
    Forester->>Stream: get_batch_snapshot<const HEIGHT>(start,end,hashchain_idx)
    Stream-->>Forester: AddressBatchSnapshot{addresses, low_values, indices, proofs, leaves_hashchain}
    Forester->>Staging: staging_tree.process_batch(snapshot, metadata)
    Staging-->>Forester: ProcessResult{proof_input_data, new_root}
    Forester->>Prover: prepare circuit inputs (refs to proofs/slices)
    Prover->>Circuit: get_batch_address_append_circuit_inputs(&slices...)
    Circuit-->>Prover: circuit inputs / proof

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested labels

ai-review

Suggested reviewers

  • SwenSchaeferjohann
  • ananas-block

Poem

🧩 Snapshots forged where proofs align,
Heights fixed, slices passing by design.
From stream to staging, errors now clear,
Batches advance — small steps, no fear. 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the PR title accurately describes the main objective, implementing batch cold account loading in the light client to improve efficiency.
  • Docstring Coverage — ✅ Passed: docstring coverage is 73.68%, above the required threshold of 70.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk-libs/program-test/src/indexer/test_indexer.rs (1)

2498-2534: ⚠️ Potential issue | 🔴 Critical

Non-2xx prover responses can hang this loop indefinitely.

When send() succeeds but the prover returns a 4xx/5xx, the loop neither decrements retries nor returns an error, so it can spin forever. The same block also still unwraps text() and deserialize_gnark_proof_json(), which reintroduces panics on malformed responses. Please route both status failures and decode failures through the same IndexerError path.

💡 Proposed fix
             let mut retries = 3;
             while retries > 0 {
                 let response_result = client
                     .post(format!("{}{}", SERVER_ADDRESS, PROVE_PATH))
                     .header("Content-Type", "text/plain; charset=utf-8")
                     .body(json_payload.clone())
                     .send()
                     .await;
-                if let Ok(response_result) = response_result {
-                    if response_result.status().is_success() {
-                        let body = response_result.text().await.unwrap();
-                        let proof_json = deserialize_gnark_proof_json(&body).unwrap();
+                match response_result {
+                    Ok(response_result) if response_result.status().is_success() => {
+                        let body = response_result
+                            .text()
+                            .await
+                            .map_err(|e| IndexerError::CustomError(e.to_string()))?;
+                        let proof_json = deserialize_gnark_proof_json(&body)
+                            .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         let (proof_a, proof_b, proof_c) = proof_from_json_struct(proof_json)
                             .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         let (proof_a, proof_b, proof_c) =
                             compress_proof(&proof_a, &proof_b, &proof_c)
                                 .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         return Ok(ValidityProofWithContext {
                             accounts: account_proof_inputs,
                             addresses: address_proof_inputs,
                             proof: CompressedProof {
                                 a: proof_a,
                                 b: proof_b,
                                 c: proof_c,
                             }
                             .into(),
                         });
                     }
-                } else {
-                    println!("Error: {:#?}", response_result);
-                    tokio::time::sleep(Duration::from_secs(5)).await;
-                    retries -= 1;
+                    Ok(response_result) => {
+                        let status = response_result.status();
+                        let body = response_result.text().await.unwrap_or_default();
+                        retries -= 1;
+                        if retries == 0 {
+                            return Err(IndexerError::CustomError(format!(
+                                "Prover returned {}: {}",
+                                status, body
+                            )));
+                        }
+                    }
+                    Err(err) => {
+                        retries -= 1;
+                        if retries == 0 {
+                            return Err(IndexerError::CustomError(err.to_string()));
+                        }
+                    }
                 }
+                tokio::time::sleep(Duration::from_secs(5)).await;
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2498 - 2534,
The loop currently only decrements retries on transport errors and unwraps
response body and deserialization, causing hangs or panics for non-2xx or
malformed responses; change the logic in the while retries loop that starts with
`let response_result = client.post(...).send().await` to treat any non-success
status as a failure path that decrements `retries`, sleeps, and eventually
returns an `IndexerError::CustomError` if exhausted. Specifically: when
`response_result` is Ok but `response_result.status().is_success()` is false,
read the body safely (await and map errors) and convert the status+body into an
`IndexerError::CustomError`, decrement `retries` and sleep before retrying;
replace all `.unwrap()` calls on `response_result.text().await` and
`deserialize_gnark_proof_json(&body)` with error-mapped returns into
`IndexerError::CustomError`; keep the successful branch that calls
`proof_from_json_struct` and `compress_proof` but also map their errors (they
already use map_err) and ensure any failure there returns
`Err(IndexerError::CustomError(...))` rather than panicking. This ensures
`retries` is decremented for both transport and non-2xx responses and avoids
unwrap-induced panics while preserving use of `deserialize_gnark_proof_json`,
`proof_from_json_struct`, `compress_proof`, `ValidityProofWithContext`, and
`CompressedProof`.
prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)

556-559: 🧹 Nitpick | 🔵 Trivial

Redundant slice: [0..] is unnecessary.

new_element_values[0..] is equivalent to just new_element_values. This appears to be a leftover from a refactor.

✨ Suggested simplification
-        new_element_values: new_element_values[0..]
-            .iter()
+        new_element_values: new_element_values
+            .iter()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around
lines 556 - 559, The slice new_element_values[0..] is redundant; update the
mapping to iterate directly over new_element_values (i.e., replace
new_element_values[0..].iter() with new_element_values.iter()) so the BigUint
conversion via BigUint::from_bytes_be(v) and the collect() behavior remain
unchanged; touch the expression in proof_inputs.rs where new_element_values is
used in the map to remove the unnecessary slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 246-257: Add an IntegerConversion variant to the ProverClientError
enum so the try_into() error mappings in the batch address append code compile;
specifically add a variant like IntegerConversion(String) with the
#[error("Integer conversion error: {0}")] attribute to the ProverClientError
enum in errors.rs so calls that return ProverClientError::IntegerConversion from
the low_element_index / low_element_next_index conversion sites
(low_element_indices, low_element_next_indices) are valid.

In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: The const-generic height is unchecked and can produce
truncated or missing proofs; update the proof-producing functions (e.g.,
reconstruct_proof and the other proof-returning methods in this file) to
validate the supplied const HEIGHT against the fixed address-tree height (40)
before constructing/returning a proof and return an appropriate IndexerError if
HEIGHT != 40 (alternatively persist the queue height on the struct and compare
it), ensuring any code paths that previously assumed a different HEIGHT fail
early with a clear error.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9c2af6fd-b65c-41f4-8efc-056487d57845

📥 Commits

Reviewing files that changed from the base of the PR and between 2644529 and 67a29a4.

⛔ Files ignored due to path filters (5)
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (8)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch 2 times, most recently from 238cb7f to 226df79 on March 14, 2026 at 18:43
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
prover/client/src/helpers.rs (1)

49-54: ⚠️ Potential issue | 🟠 Major

Widen path_index before address trees cross u32::MAX.

compute_root_from_merkle_proof is now the shared fallible helper for both state and address trees, but it still encodes the path index as u32. The address-append callers already hold usize/u64 indices for 40-level trees and currently downcast into this API; once next_index exceeds 4_294_967_295, those casts will silently truncate the upper bits and compute the wrong root and changelog. Use usize or u64 here and narrow only at a boundary that truly requires it.

Based on learnings: Address trees must initialize with ADDRESS_TREE_INIT_ROOT_40 (hardcoded for height 40) and start with next_index = 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/helpers.rs` around lines 49 - 54, The function
compute_root_from_merkle_proof currently accepts path_index: u32 which causes
truncation for address trees deeper than u32::MAX; change the parameter to a
wider integer (prefer usize or u64) in the signature and all internal uses
(e.g., ChangelogEntry::default_with_index call and any arithmetic) and only
downcast to u32 at the true boundary if an API you call requires u32; also
update any callers that downcast from usize/u64 to pass the wider type.
Additionally ensure address-tree initialization uses ADDRESS_TREE_INIT_ROOT_40
and that the address-tree next_index starts at 1 (update
constructors/initializers that set next_index) so address trees for height 40
behave correctly.
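The truncation hazard this comment describes is easy to reproduce in isolation: a 64-bit index just past `u32::MAX` silently loses its upper bits under an `as u32` cast, which is why the fix prefers a wider parameter or a checked conversion. A minimal sketch (the real helper's signature lives in `helpers.rs`):

```rust
fn main() {
    // An index just past u32::MAX, as a 40-level tree (capacity 2^40 leaves)
    // can eventually produce.
    let next_index: u64 = u32::MAX as u64 + 1; // 4_294_967_296

    // `as u32` silently truncates to 0 -- the wrong leaf position,
    // hence the wrong root and changelog entry.
    assert_eq!(next_index as u32, 0);

    // A checked conversion surfaces the overflow instead of hiding it.
    assert!(u32::try_from(next_index).is_err());

    // Widening the parameter to u64/usize avoids the cast entirely.
    let widened: u64 = next_index;
    assert_eq!(widened, 4_294_967_296);
}
```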
prover/client/src/proof_types/batch_append/proof_inputs.rs (1)

180-194: ⚠️ Potential issue | 🟠 Major

Remove the remaining proof-length panic in this path.

The new ? only covers compute_root_from_merkle_proof; merkle_proof.try_into().unwrap() still aborts the process if the indexer returns a proof with the wrong height. This function already returns ProverClientError, so surface the mismatch instead of panicking.

Suggested fix
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_append/proof_inputs.rs` around lines 180
- 194, The code still panics on merkle_proof.try_into().unwrap(); replace the
unwrap with fallible conversion that returns a ProverClientError instead of
aborting: call merkle_proof.try_into().map_err(|e|
ProverClientError::InvalidProofLength { source: e.into(), context:
format!("expected height for index {}", start_index + i as u32) })? (or perform
a match and return an appropriate ProverClientError), then continue using
merkle_proof_array with compute_root_from_merkle_proof(final_leaf,
&merkle_proof_array, start_index + i as u32)? so the function surfaces
proof-length mismatches via its existing ProverClientError return type rather
than panicking on unwrap.
prover/client/src/proof_types/batch_update/proof_inputs.rs (1)

171-178: ⚠️ Potential issue | 🟠 Major

Finish the fallible-path cleanup in get_batch_update_inputs.

This call site now propagates compute_root_from_merkle_proof, but the proof conversion and Poseidon::hashv immediately above still unwrap(). A malformed proof or invalid hash input will still panic the process instead of returning ProverClientError.

Suggested fix
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
@@
-        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]]).unwrap();
+        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]])?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs` around lines 171
- 178, The code in get_batch_update_inputs currently uses
merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can
panic; change both to propagate errors as ProverClientError instead: replace
merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e|
ProverClientError::from(...))? (or an appropriate ProverClientError variant) and
replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e|
ProverClientError::from(...))? so that malformed proofs or hash failures return
an Err from get_batch_update_inputs and remain consistent with the
compute_root_from_merkle_proof error propagation.
prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)

369-489: ⚠️ Potential issue | 🟠 Major

Keep the staging mutations transactional on Err.

This function now returns recoverable errors after it has already pushed into changelog / indexed_changelog and mutated sparse_merkle_tree (for example the later compute_root_from_merkle_proof(...) ? calls and the final ordering check). If any of those fail, the caller gets Err but the local staging state is left partially advanced, so the next retry or batch runs against corrupted state. Build into temporary state and commit only once the whole batch succeeds, or add explicit rollback.

Also applies to: 500-518

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around
lines 369 - 489, The code mutates staging state (changelog, indexed_changelog,
sparse_merkle_tree, and pushes via patcher.push_changelog_entry::<HEIGHT>,
sparse_merkle_tree.append, and indexed_changelog.push) before fallible
operations complete, so on Err the local state is partially advanced; fix by
performing all fallible computations on temporary/clone state (clone
sparse_merkle_tree, collect changelog entries and merkle proof arrays into local
Vecs like new_element_circuit_merkle_proofs_tmp / indexed_changelog_tmp) and
only call patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append (or
replace the real tree), and indexed_changelog.push once the entire batch loop
finishes successfully; alternatively implement an explicit rollback path that
reverts mutations (undo append and pop changelog) on any early Err — apply same
change for the other block around lines 500-518 where similar pushes occur.
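One way to implement the build-then-commit approach this comment asks for is to stage every fallible mutation on a clone and swap it in only after the whole batch succeeds. A generic sketch, with placeholder fields standing in for the real changelog / sparse-merkle-tree state:

```rust
/// Hypothetical staging state; stands in for the changelog,
/// indexed_changelog, and sparse_merkle_tree fields in the real code.
#[derive(Clone)]
struct Staging {
    changelog: Vec<u64>,
    root: u64,
}

/// Process a batch transactionally: all mutations land on a clone, and the
/// clone replaces the live state only if every element succeeds, so an Err
/// can never leave the live state partially advanced.
fn process_batch(state: &mut Staging, batch: &[u64]) -> Result<(), String> {
    let mut tmp = state.clone();
    for &item in batch {
        if item == 0 {
            // Fallible step fails: the live `state` is untouched on this path.
            return Err("invalid element".into());
        }
        tmp.changelog.push(item);
        tmp.root ^= item; // placeholder for the real root update
    }
    *state = tmp; // commit only on full success
    Ok(())
}

fn main() {
    let mut state = Staging { changelog: vec![], root: 0 };
    // A failing batch leaves the state exactly as it was.
    assert!(process_batch(&mut state, &[1, 0, 3]).is_err());
    assert!(state.changelog.is_empty());
    // A successful batch commits atomically.
    assert!(process_batch(&mut state, &[1, 2, 3]).is_ok());
    assert_eq!(state.changelog, vec![1, 2, 3]);
}
```

The clone-and-commit form costs one copy per batch; the alternative the comment mentions — explicit rollback of each push/append on Err — avoids the copy but is easier to get wrong as the number of mutated structures grows.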
sdk-libs/program-test/src/indexer/test_indexer.rs (1)

2500-2514: ⚠️ Potential issue | 🔴 Critical

Non-2xx prover responses hot-loop forever here.

When send() succeeds but the prover returns 4xx/5xx, retries is never decremented and the loop immediately issues another request. That turns a transient server error into an unbounded tight retry, and the adjacent unwrap() calls on text() / deserialize_gnark_proof_json() still panic on malformed responses instead of surfacing IndexerError.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2500 - 2514,
The loop currently treats any Ok(response) the same and only decrements retries
on send() errors, causing tight infinite retries for 4xx/5xx and panics from
unwrap() on malformed bodies; update the block that handles Ok(response_result)
so that non-success HTTP statuses decrement retries (and optionally sleep)
instead of immediately retrying, and convert the body/text and JSON parsing
unwraps into fallible operations that map errors into IndexerError (e.g. replace
response_result.text().await.unwrap() and
deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to
IndexerError), and ensure that any failure in proof_from_json_struct or
compress_proof is returned as Err(IndexerError::CustomError(...)) rather than
panicking; reference the response handling around deserialize_gnark_proof_json,
proof_from_json_struct, compress_proof and IndexerError to locate the changes.
forester/src/processor/v2/strategy/address.rs (1)

269-308: ⚠️ Potential issue | 🟠 Major

Require a full ZKP batch before calling process_batch.

StreamingAddressQueue::wait_for_batch() can return early after its timeout, so get_batch_snapshot() may hand back fewer than zkp_batch_size_usize addresses. Feeding addresses.len() downstream as zkp_batch_size_actual turns that short snapshot into a real address-append job instead of treating it as "not ready yet", which means a slow indexer can produce inputs for an unsupported circuit size.

Suggested guard
-        let zkp_batch_size_actual = addresses.len();
+        if addresses.len() != zkp_batch_size_usize {
+            return Ok(None);
+        }
         let result = staging_tree
             .process_batch(
                 &addresses,
                 &low_element_values,
                 &low_element_next_values,
                 &low_element_indices,
                 &low_element_next_indices,
                 &low_element_proofs,
                 leaves_hashchain,
-                zkp_batch_size_actual,
+                zkp_batch_size_usize,
                 epoch,
                 tree,
             )

Based on learnings: ZKP batch sizes for address trees must be 10 or 250 in src/initialize_address_tree.rs. Only these circuit sizes are supported for address trees.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@forester/src/processor/v2/strategy/address.rs` around lines 269 - 308, The
snapshot returned by StreamingAddressQueue::get_batch_snapshot may be shorter
than the ZKP batch size; before calling staging_tree.process_batch, verify that
addresses.len() equals zkp_batch_size_usize (the expected batch size) and, if it
is smaller, treat the batch as "not ready" (do not call process_batch) — e.g.,
return/propagate a no-op/NotReady result or retry later. Update the logic around
AddressBatchSnapshot retrieval (get_batch_snapshot::<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual
against zkp_batch_size_usize and only pass a full batch into
staging_tree.process_batch; ensure the code path preserves the existing error
mapping via map_address_staging_error when a full batch is processed.
♻️ Duplicate comments (1)
sdk-libs/client/src/indexer/types/queue.rs (1)

70-73: ⚠️ Potential issue | 🟠 Major

Reject non-40 proof heights before reconstruction.

These methods now accept any const HEIGHT, but address-queue proofs are only valid at height 40. HEIGHT < 40 silently truncates the witness, HEIGHT > 40 degrades into a late MissingResult, and encode_node_index() only reserves 8 bits for level, so oversized values can alias node ids. Add one shared height guard before any proof reconstruction path and fail early with a clear IndexerError.

Based on learnings: "Tree height must be fixed at 40 for address trees in src/initialize_address_tree.rs. Validate that height equals 40."

Also applies to: 125-143, 155-205, 209-210

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/indexer/types/queue.rs` around lines 70 - 73, Reject any
const HEIGHT != 40 up-front in all proof-reconstruction entry points (e.g.,
reconstruct_proof and the other methods referenced) by adding a single shared
guard that checks HEIGHT == 40 and returns a clear IndexerError when it is not;
update any call sites that route into encode_node_index or witness/truncation
logic so they never run for non-40 heights, and ensure the error message/variant
clearly states expected=40 and the provided height to make debugging explicit.
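The shared guard this comment asks for could look like the following minimal sketch; the constant and error type are placeholders for the SDK's real `IndexerError` and tree-height constant:

```rust
/// Placeholder for the fixed address-tree height the comment references.
const ADDRESS_TREE_HEIGHT: usize = 40;

/// Shared up-front guard: fail early with a descriptive error instead of
/// truncating the witness (HEIGHT < 40) or failing late (HEIGHT > 40).
fn check_proof_height<const HEIGHT: usize>() -> Result<(), String> {
    if HEIGHT != ADDRESS_TREE_HEIGHT {
        return Err(format!(
            "invalid proof height: expected {ADDRESS_TREE_HEIGHT}, got {HEIGHT}"
        ));
    }
    Ok(())
}

/// Every proof-producing entry point calls the guard before doing any work.
fn reconstruct_proof<const HEIGHT: usize>() -> Result<[[u8; 32]; HEIGHT], String> {
    check_proof_height::<HEIGHT>()?;
    // ...actual node-lookup and reconstruction would follow; elided here.
    Ok([[0u8; 32]; HEIGHT])
}

fn main() {
    assert!(reconstruct_proof::<40>().is_ok());
    assert!(reconstruct_proof::<26>().is_err());
}
```

Routing every reconstruction path through one guard keeps the error message consistent ("expected 40, got N") and means no caller can reach `encode_node_index()` with an out-of-range level.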
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 146-152: build_node_lookup currently maps node ids to the last
index because collect() overwrites duplicates, which breaks first-match
semantics used by reconstruct_proof (which uses position()); update
build_node_lookup to either detect and reject duplicate node ids or preserve the
first index by using HashMap::entry().or_insert(...) so the first occurrence
remains in the map (refer to function build_node_lookup and callers like
reconstruct_proof to ensure consistent behavior).

In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 213-216: The debug-only check for max_per_group (> 0) is
inconsistent with the runtime branch that returns an empty Vec when
max_per_group == 0; make the behavior consistent by failing in all builds:
replace the debug_assert!(max_per_group > 0, ...) with a plain
assert!(max_per_group > 0, ...) (or otherwise return an Err if this function is
fallible) and remove the special-case path that returns Vec::new() when
max_per_group == 0 so that invalid zero input always triggers the same failure;
ensure references to max_per_group and specs remain and remove the silent
Vec::new() return for the zero case.
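The first-match fix for `build_node_lookup` suggested above boils down to `HashMap::entry().or_insert(...)`; this std-only sketch (names illustrative, node ids simplified to `u64`) shows why it preserves the semantics that `Iterator::position()` would give:

```rust
use std::collections::HashMap;

// Sketch: map node ids to the index of their FIRST occurrence.
fn build_node_lookup(node_ids: &[u64]) -> HashMap<u64, usize> {
    let mut lookup = HashMap::with_capacity(node_ids.len());
    for (idx, id) in node_ids.iter().enumerate() {
        // entry().or_insert keeps the first index for duplicate ids; a plain
        // collect() over (id, idx) pairs would silently keep the LAST one.
        lookup.entry(*id).or_insert(idx);
    }
    lookup
}
```

With input `[7, 3, 7, 9]`, id `7` maps to index `0`, not `2`, matching what a linear `position()` scan would return.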

---

Outside diff comments:
In `@forester/src/processor/v2/strategy/address.rs`:
- Around line 269-308: The snapshot returned by
StreamingAddressQueue::get_batch_snapshot may be shorter than the ZKP batch
size; before calling staging_tree.process_batch, verify that addresses.len()
equals zkp_batch_size_usize (the expected batch size) and, if it is smaller,
treat the batch as "not ready" (do not call process_batch) — e.g.,
return/propagate a no-op/NotReady result or retry later. Update the logic around
AddressBatchSnapshot retrieval (get_batch_snapshot::<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual
against zkp_batch_size_usize and only pass a full batch into
staging_tree.process_batch; ensure the code path preserves the existing error
mapping via map_address_staging_error when a full batch is processed.

In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently
accepts path_index: u32 which causes truncation for address trees deeper than
u32::MAX; change the parameter to a wider integer (prefer usize or u64) in the
signature and all internal uses (e.g., ChangelogEntry::default_with_index call
and any arithmetic) and only downcast to u32 at the true boundary if an API you
call requires u32; also update any callers that downcast from usize/u64 to pass
the wider type. Additionally ensure address-tree initialization uses
ADDRESS_TREE_INIT_ROOT_40 and that the address-tree next_index starts at 1
(update constructors/initializers that set next_index) so address trees for
height 40 behave correctly.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 369-489: The code mutates staging state (changelog,
indexed_changelog, sparse_merkle_tree, and pushes via
patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append, and
indexed_changelog.push) before fallible operations complete, so on Err the local
state is partially advanced; fix by performing all fallible computations on
temporary/clone state (clone sparse_merkle_tree, collect changelog entries and
merkle proof arrays into local Vecs like new_element_circuit_merkle_proofs_tmp /
indexed_changelog_tmp) and only call patcher.push_changelog_entry::<HEIGHT>,
sparse_merkle_tree.append (or replace the real tree), and indexed_changelog.push
once the entire batch loop finishes successfully; alternatively implement an
explicit rollback path that reverts mutations (undo append and pop changelog) on
any early Err — apply same change for the other block around lines 500-518 where
similar pushes occur.

In `@prover/client/src/proof_types/batch_append/proof_inputs.rs`:
- Around line 180-194: The code still panics on
merkle_proof.try_into().unwrap(); replace the unwrap with fallible conversion
that returns a ProverClientError instead of aborting: call
merkle_proof.try_into().map_err(|e| ProverClientError::InvalidProofLength {
source: e.into(), context: format!("expected height for index {}", start_index +
i as u32) })? (or perform a match and return an appropriate ProverClientError),
then continue using merkle_proof_array with
compute_root_from_merkle_proof(final_leaf, &merkle_proof_array, start_index + i
as u32)? so the function surfaces proof-length mismatches via its existing
ProverClientError return type rather than panicking on unwrap.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs`:
- Around line 171-178: The code in get_batch_update_inputs currently uses
merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can
panic; change both to propagate errors as ProverClientError instead: replace
merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e|
ProverClientError::from(...))? (or an appropriate ProverClientError variant) and
replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e|
ProverClientError::from(...))? so that malformed proofs or hash failures return
an Err from get_batch_update_inputs and remain consistent with the
compute_root_from_merkle_proof error propagation.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2500-2514: The loop currently treats any Ok(response) the same and
only decrements retries on send() errors, causing tight infinite retries for
4xx/5xx and panics from unwrap() on malformed bodies; update the block that
handles Ok(response_result) so that non-success HTTP statuses decrement retries
(and optionally sleep) instead of immediately retrying, and convert the
body/text and JSON parsing unwraps into fallible operations that map errors into
IndexerError (e.g. replace response_result.text().await.unwrap() and
deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to
IndexerError), and ensure that any failure in proof_from_json_struct or
compress_proof is returned as Err(IndexerError::CustomError(...)) rather than
panicking; reference the response handling around deserialize_gnark_proof_json,
proof_from_json_struct, compress_proof and IndexerError to locate the changes.
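The unwrap-to-Result conversions called for in the batch_append and batch_update comments above reduce to one pattern: a fallible slice-to-fixed-array conversion via `TryFrom<Vec<T>> for [T; N]`, with the error mapped into the client's error type. A minimal sketch, with `ProverClientError::GenericError` standing in for whichever variant the crate exposes:

```rust
// Hypothetical error type standing in for ProverClientError.
#[derive(Debug)]
enum ProverClientError {
    GenericError(String),
}

/// Replaces `merkle_proof.try_into().unwrap()`: a proof of the wrong length
/// becomes an Err instead of a panic.
fn to_fixed_proof<const HEIGHT: usize>(
    merkle_proof: Vec<[u8; 32]>,
    index: u32,
) -> Result<[[u8; 32]; HEIGHT], ProverClientError> {
    let len = merkle_proof.len();
    merkle_proof.try_into().map_err(|_| {
        ProverClientError::GenericError(format!(
            "proof for index {} has {} siblings, expected {}",
            index, len, HEIGHT
        ))
    })
}
```

The same `map_err(...)?` shape applies to the `Poseidon::hashv(...).unwrap()` call: map the hash error into a `ProverClientError` variant and propagate with `?` so it stays consistent with `compute_root_from_merkle_proof`.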

---

Duplicate comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: Reject any const HEIGHT != 40 up-front in all
proof-reconstruction entry points (e.g., reconstruct_proof and the other methods
referenced) by adding a single shared guard that checks HEIGHT == 40 and returns
a clear IndexerError when it is not; update any call sites that route into
encode_node_index or witness/truncation logic so they never run for non-40
heights, and ensure the error message/variant clearly states expected=40 and the
provided height to make debugging explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f30f9f95-bb7c-4dcf-b433-899ee6f79ada

📥 Commits

Reviewing files that changed from the base of the PR and between 67a29a4 and 238cb7f.

⛔ Files ignored due to path filters (5)
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (12)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/errors.rs
  • prover/client/src/helpers.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch 2 times, most recently from bfa6672 to e70763e on March 14, 2026 19:12
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from e70763e to 10ea034 on March 14, 2026 19:21
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from 19530a6 to a39fe5d on March 14, 2026 21:34
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from a39fe5d to ed44b77 on March 15, 2026 00:36
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
prover/client/src/helpers.rs (1)

49-54: ⚠️ Potential issue | 🟠 Major

Make path_index wide enough for address-tree indices.

The helper is now HEIGHT-generic, but path_index is still u32. The new address-append callers already hold indices as usize and have to downcast them with as u32; once an address tree passes u32::MAX, root reconstruction will wrap and produce the wrong path/changelog long before the tree is full.

Proposed fix
 pub fn compute_root_from_merkle_proof<const HEIGHT: usize>(
     leaf: [u8; 32],
     path_elements: &[[u8; 32]; HEIGHT],
-    path_index: u32,
+    path_index: usize,
 ) -> Result<([u8; 32], ChangelogEntry<HEIGHT>), ProverClientError> {
-    let mut changelog_entry = ChangelogEntry::default_with_index(path_index as usize);
+    let mut changelog_entry = ChangelogEntry::default_with_index(path_index);
 
     let mut current_hash = leaf;
     let mut current_index = path_index;

Then update the address-append call sites to pass low_element.index, new_low_element.index, and current_index directly, without lossy casts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/helpers.rs` around lines 49 - 54, The function
compute_root_from_merkle_proof currently takes path_index: u32 which can
overflow for large address trees; change its signature to accept path_index:
usize and update the internal use of
ChangelogEntry::default_with_index(path_index as usize) to pass the usize
directly (ChangelogEntry::default_with_index(path_index)), then update all
callers (the address-append sites that currently cast indices with as u32) to
pass their usize indices directly (e.g., low_element.index,
new_low_element.index, current_index) so there is no lossy cast or potential
wrap-around.
prover/client/tests/batch_address_append.rs (1)

87-109: 🧹 Nitpick | 🔵 Trivial

Add a non-empty changelog regression here.

This test still starts with empty changelog and indexed_changelog, so it only exercises the first-batch branch. The new proof-patching logic is only used once those vectors already contain prior entries, which is the riskier path introduced in this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/tests/batch_address_append.rs` around lines 87 - 109, The test
currently initializes changelog and indexed_changelog as empty vectors so
get_batch_address_append_circuit_inputs only covers the first-batch path;
pre-populate both changelog: Vec<ChangelogEntry<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and indexed_changelog:
Vec<IndexedChangelogEntry<usize, { DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize
}>> with one or more realistic entries (matching the ChangelogEntry and
IndexedChangelogEntry shapes used elsewhere in tests) before calling
get_batch_address_append_circuit_inputs so the function exercises the
proof-patching logic for non-empty prior entries; ensure the synthetic entries
use valid roots/indices/values consistent with the sparse_merkle_tree and
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.
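The widened `path_index: usize` signature proposed for `compute_root_from_merkle_proof` can be sketched end to end; a toy XOR "hash" stands in for Poseidon purely so the walk is runnable here, and the bit-walk structure is the point, not the hash:

```rust
// Toy stand-in for the real Poseidon pair hash (illustrative only).
fn toy_hash(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = left[i] ^ right[i].rotate_left(1);
    }
    out
}

fn compute_root_from_merkle_proof<const HEIGHT: usize>(
    leaf: [u8; 32],
    path_elements: &[[u8; 32]; HEIGHT],
    path_index: usize, // usize end to end: no lossy `as u32` at call sites
) -> [u8; 32] {
    let mut current_hash = leaf;
    let mut current_index = path_index;
    for sibling in path_elements.iter() {
        // The low bit decides whether the running hash is left or right child.
        current_hash = if current_index % 2 == 0 {
            toy_hash(&current_hash, sibling)
        } else {
            toy_hash(sibling, &current_hash)
        };
        current_index /= 2;
    }
    current_hash
}
```

With `usize` indices there is no wrap-around once a tree's insertion index passes `u32::MAX`, and callers pass `low_element.index` etc. directly.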
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@forester/src/processor/v2/helpers.rs`:
- Around line 489-534: In get_batch_snapshot: reject truncated snapshots early
by changing the available check to return Ok(None) unless available >= end
(i.e., if self.wait_for_batch(end) < end -> Ok(None)) so we never construct
proofs for partial batches; before slicing validate that
data.low_element_values, low_element_next_values, low_element_indices,
low_element_next_indices and data.addresses have length >= actual_end and that
data.reconstruct_proofs::<HEIGHT>(start..actual_end) can be produced (otherwise
return an error like "incomplete batch data"); keep the existing
leaves_hash_chains fallback logic but do not proceed to build
AddressBatchSnapshot when any of these sibling/vector length checks fail.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 202-209: Before slicing new_element_values and allocating
per-element vectors, validate that new_element_values.len() >= zkp_batch_size
and that each corresponding input slice (the low_element_* and new_element_*
sources used later) has at least zkp_batch_size elements; if any check fails
return a ProverClientError instead of panicking. Move the check to just before
the line creating the slice (currently using let new_element_values =
&new_element_values[..zkp_batch_size];), only perform the slice after
validation, and use a single safe variable (e.g., batch_len = zkp_batch_size)
when creating capacities and indexing in subsequent loops so all
Vec::with_capacity(...) and later indexing use the validated length. Ensure the
error returned is descriptive (e.g., "truncated batch from indexer") and that
all low_element_* and new_element_* sources are validated the same way to avoid
any later panics.

In `@prover/client/src/prover.rs`:
- Around line 54-57: health_check currently builds a reqwest client with
no_proxy() while ProofClient::local(), ProofClient::with_config(), and
ProofClient::with_full_config() use Client::new(), causing divergent proxy
behavior and a silent ignore of builder errors; create a shared helper (e.g.,
build_http_client() -> Result<reqwest::Client, anyhow::Error>) that constructs
the client with the desired proxy policy (no_proxy() or configurable as
decided), return and propagate the Result so callers can log errors instead of
silently returning, then refactor health_check and all ProofClient constructors
to call this helper (replace direct Client::new() and the current builder call),
and ensure any builder Err is logged via the existing logger before returning an
error.

In `@sdk-libs/client/src/indexer/photon_indexer.rs`:
- Around line 1145-1154: The code currently only checks x.proof.len() <
STATE_MERKLE_TREE_CANOPY_DEPTH which allows truncated proofs; update the
validation before building the MerkleProof (around x.proof, proof_len logic) to
require the post-canopy proof length equals the expected sibling count (i.e.
expected_siblings = STATE_MERKLE_TREE_HEIGHT - STATE_MERKLE_TREE_CANOPY_DEPTH or
the known constant 16 for height 26/canopy 10). If x.proof.len() does not equal
STATE_MERKLE_TREE_CANOPY_DEPTH + expected_siblings (or proof_len !=
expected_siblings) return Err(IndexerError::InvalidParameters(...)) instead of
silently slicing and constructing the MerkleProof.

In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 128-137: The method reconstruct_proofs currently reserves capacity
using address_range.len() before checking that the requested range is within the
available proofs; validate the range bounds up front (before Vec::with_capacity)
by comparing address_range.start and address_range.end against the available
proof count (e.g., self.proof_count() or the relevant collection length on
self), and return an IndexerError for out-of-range requests; only then call
Vec::with_capacity and proceed to call build_node_lookup and
reconstruct_proof_with_lookup::<HEIGHT>.

In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 120-124: The failure index reported by MissingPdaCompressed is
non-unique because collect_pda_hashes(group) restarts enumerate() per batch;
change the code so the global cold-PDA offset is carried into each group (or
remove the index from MissingPdaCompressed). Concretely: modify
collect_pda_hashes to accept a start_index/offset (or create a new
collect_pda_hashes_with_offset) and when mapping pda_groups pass the accumulated
offset (e.g., outer_idx * MAX_PDAS_PER_IX or running total) so that the inner
enumerate adds that offset when constructing MissingPdaCompressed.index;
alternatively, remove the index field from MissingPdaCompressed to match
MissingPdaCompressedData for consistency.
- Around line 244-259: fetch_individual_proofs spawns all RPCs at once which can
overwhelm the indexer; replace the try_join_all approach with a bounded
concurrency stream so requests are limited but results keep original order. Use
futures::stream::iter over the hashes, map each hash to the async
indexer.get_validity_proof call, then apply StreamExt::buffered(CONCURRENCY) (or
a named constant) and collect the responses into a Vec to preserve positional
order; apply the same pattern to fetch_proof_batches (use the same CONCURRENCY
constant and the same stream->buffered->collect flow) so both functions limit
concurrent RPCs while keeping result ordering for subsequent zip operations.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 729-731: The response currently sets start_index to
address_tree_bundle.right_most_index(), which breaks pagination semantics
because the returned addresses slice is still paginated by
options.address_queue_start_index; revert start_index to reflect the pagination
offset (use options.address_queue_start_index) and add a new field (e.g.,
tree_next_insertion_index or insertion_index) to the response struct to carry
address_tree_bundle.right_most_index() as u64; update the response construction
where start_index and addresses are populated and adjust the response
type/struct (and any builders) to include the new field so callers can read the
sparse-tree next insertion index without overloading start_index.
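The global-offset fix suggested for `load_accounts.rs` above (carrying a running offset into each group's `enumerate()`) can be sketched with simplified types; `MissingPda` and `collect_missing_with_offset` are illustrative names, not the crate's:

```rust
// Simplified stand-in for MissingPdaCompressed.
#[derive(Debug, PartialEq)]
struct MissingPda {
    index: usize, // global index across ALL groups, not per-group
    hash: [u8; 32],
}

// Per-group collection, seeded with the group's global start index.
fn collect_missing_with_offset(
    group: &[([u8; 32], bool)], // (hash, found)
    start_index: usize,
) -> Vec<MissingPda> {
    let mut out = Vec::new();
    for (i, (hash, found)) in group.iter().enumerate() {
        if !*found {
            out.push(MissingPda { index: start_index + i, hash: *hash });
        }
    }
    out
}

fn collect_all_missing(groups: &[Vec<([u8; 32], bool)>]) -> Vec<MissingPda> {
    let mut offset = 0;
    let mut out = Vec::new();
    for group in groups {
        // Running total keeps indices unique across group boundaries.
        out.extend(collect_missing_with_offset(group, offset));
        offset += group.len();
    }
    out
}
```

A running total also handles ragged final groups correctly, which a fixed `outer_idx * MAX_PDAS_PER_IX` multiplier would only match when every group is full.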

---

Outside diff comments:
In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently takes
path_index: u32 which can overflow for large address trees; change its signature
to accept path_index: usize and update the internal use of
ChangelogEntry::default_with_index(path_index as usize) to pass the usize
directly (ChangelogEntry::default_with_index(path_index)), then update all
callers (the address-append sites that currently cast indices with as u32) to
pass their usize indices directly (e.g., low_element.index,
new_low_element.index, current_index) so there is no lossy cast or potential
wrap-around.

In `@prover/client/tests/batch_address_append.rs`:
- Around line 87-109: The test currently initializes changelog and
indexed_changelog as empty vectors so get_batch_address_append_circuit_inputs
only covers the first-batch path; pre-populate both changelog:
Vec<ChangelogEntry<{ DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and
indexed_changelog: Vec<IndexedChangelogEntry<usize, {
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> with one or more realistic
entries (matching the ChangelogEntry and IndexedChangelogEntry shapes used
elsewhere in tests) before calling get_batch_address_append_circuit_inputs so
the function exercises the proof-patching logic for non-empty prior entries;
ensure the synthetic entries use valid roots/indices/values consistent with the
sparse_merkle_tree and DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 276f6819-fcb6-48df-a7d4-de182c3af1f6

📥 Commits

Reviewing files that changed from the base of the PR and between 238cb7f and ed44b77.

⛔ Files ignored due to path filters (7)
  • Cargo.lock is excluded by !**/*.lock and included by none
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/mock_batched_forester.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (14)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/constants.rs
  • prover/client/src/errors.rs
  • prover/client/src/helpers.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • prover/client/src/prover.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

Comment on lines +489 to +534
pub fn get_batch_snapshot<const HEIGHT: usize>(
    &self,
    start: usize,
    end: usize,
    hashchain_idx: usize,
) -> crate::Result<Option<AddressBatchSnapshot<HEIGHT>>> {
    let available = self.wait_for_batch(end);
    if start >= available {
        return Ok(None);
    }
    let actual_end = end.min(available);
    let data = lock_recover(&self.data, "streaming_address_queue.data");

    let addresses = data.addresses[start..actual_end].to_vec();
    if addresses.is_empty() {
        return Err(anyhow!("Empty batch at start={}", start));
    }

    let leaves_hashchain = match data.leaves_hash_chains.get(hashchain_idx).copied() {
        Some(hashchain) => hashchain,
        None => {
            tracing::debug!(
                "Missing leaves_hash_chain for batch {} (available: {}), deriving from addresses",
                hashchain_idx,
                data.leaves_hash_chains.len()
            );
            create_hash_chain_from_slice(&addresses).map_err(|error| {
                anyhow!(
                    "Failed to derive leaves_hash_chain for batch {} from {} addresses: {}",
                    hashchain_idx,
                    addresses.len(),
                    error
                )
            })?
        }
    };

    Ok(Some(AddressBatchSnapshot {
        low_element_values: data.low_element_values[start..actual_end].to_vec(),
        low_element_next_values: data.low_element_next_values[start..actual_end].to_vec(),
        low_element_indices: data.low_element_indices[start..actual_end].to_vec(),
        low_element_next_indices: data.low_element_next_indices[start..actual_end].to_vec(),
        low_element_proofs: data.reconstruct_proofs::<HEIGHT>(start..actual_end)?,
        addresses,
        leaves_hashchain,
    }))

⚠️ Potential issue | 🟠 Major

Reject truncated snapshots before constructing proofs.

Line 499 truncates to actual_end = end.min(available), so this helper can return a short batch when wait_for_batch() times out or the fetch completes with fewer than end elements. The caller then feeds addresses.len() into process_batch, which turns a slow fetch or partial tail into an unsupported address-append proof size instead of a clean "not ready yet" result. The same path also slices low_element_* vectors without checking they reach actual_end, so malformed indexer data would panic the worker.

Return Ok(None) (or a specific incomplete-batch error) unless available >= end, and validate the sibling vector lengths before slicing. Based on learnings: ZKP batch sizes for address trees must be 10 or 250, and tree updates require batch proofs with the batch hash chain as a public input.

🩹 Suggested guardrails
     let available = self.wait_for_batch(end);
-    if start >= available {
+    if start >= available || available < end {
         return Ok(None);
     }
-    let actual_end = end.min(available);
+    let actual_end = end;
     let data = lock_recover(&self.data, "streaming_address_queue.data");
+
+    for (name, len) in [
+        ("low_element_values", data.low_element_values.len()),
+        ("low_element_next_values", data.low_element_next_values.len()),
+        ("low_element_indices", data.low_element_indices.len()),
+        ("low_element_next_indices", data.low_element_next_indices.len()),
+    ] {
+        if len < actual_end {
+            return Err(anyhow!(
+                "Incomplete address batch metadata: {} has {}, need {}",
+                name,
+                len,
+                actual_end
+            ));
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@forester/src/processor/v2/helpers.rs` around lines 489 - 534, In
get_batch_snapshot: reject truncated snapshots early by changing the available
check to return Ok(None) unless available >= end (i.e., if
self.wait_for_batch(end) < end -> Ok(None)) so we never construct proofs for
partial batches; before slicing validate that data.low_element_values,
low_element_next_values, low_element_indices, low_element_next_indices and
data.addresses have length >= actual_end and that
data.reconstruct_proofs::<HEIGHT>(start..actual_end) can be produced (otherwise
return an error like "incomplete batch data"); keep the existing
leaves_hash_chains fallback logic but do not proceed to build
AddressBatchSnapshot when any of these sibling/vector length checks fail.
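The sibling-vector length validation described above can be sketched independently of the queue types; the error type is simplified to `String` here, where the real code would use its `anyhow!` path:

```rust
// Check every parallel batch vector before slicing, so truncated indexer
// data becomes an Err instead of a slice-index panic.
fn validate_batch_lens(actual_end: usize, lens: &[(&str, usize)]) -> Result<(), String> {
    for (name, len) in lens {
        if *len < actual_end {
            return Err(format!(
                "Incomplete address batch metadata: {} has {}, need {}",
                name, len, actual_end
            ));
        }
    }
    Ok(())
}
```

A caller would pass `("low_element_values", data.low_element_values.len())` and the other three named vectors, and bail out with `?` before constructing the snapshot.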

Comment on lines +202 to +209
let new_element_values = &new_element_values[..zkp_batch_size];
let mut new_root = [0u8; 32];
let mut low_element_circuit_merkle_proofs = Vec::with_capacity(new_element_values.len());
let mut new_element_circuit_merkle_proofs = Vec::with_capacity(new_element_values.len());
let mut patched_low_element_next_values = Vec::with_capacity(new_element_values.len());
let mut patched_low_element_next_indices = Vec::with_capacity(new_element_values.len());
let mut patched_low_element_values = Vec::with_capacity(new_element_values.len());
let mut patched_low_element_indices = Vec::with_capacity(new_element_values.len());

⚠️ Potential issue | 🟠 Major

Validate all batch slices before slicing/indexing them.

Line 202 can panic when zkp_batch_size > new_element_values.len(), and the loop later indexes every low_element_* slice with the same count. A truncated or inconsistent batch from the indexer will crash the client instead of returning ProverClientError.

Proposed fix
 ) -> Result<BatchAddressAppendInputs, ProverClientError> {
+    for (name, len) in [
+        ("new_element_values", new_element_values.len()),
+        ("low_element_values", low_element_values.len()),
+        ("low_element_next_values", low_element_next_values.len()),
+        ("low_element_indices", low_element_indices.len()),
+        ("low_element_next_indices", low_element_next_indices.len()),
+        ("low_element_proofs", low_element_proofs.len()),
+    ] {
+        if len < zkp_batch_size {
+            return Err(ProverClientError::GenericError(format!(
+                "{name} length {} is smaller than zkp_batch_size {}",
+                len, zkp_batch_size
+            )));
+        }
+    }
+
     let new_element_values = &new_element_values[..zkp_batch_size];
+    let low_element_values = &low_element_values[..zkp_batch_size];
+    let low_element_next_values = &low_element_next_values[..zkp_batch_size];
+    let low_element_indices = &low_element_indices[..zkp_batch_size];
+    let low_element_next_indices = &low_element_next_indices[..zkp_batch_size];
+    let low_element_proofs = &low_element_proofs[..zkp_batch_size];
     let mut new_root = [0u8; 32];
     let mut low_element_circuit_merkle_proofs = Vec::with_capacity(new_element_values.len());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around
lines 202 - 209, Before slicing new_element_values and allocating per-element
vectors, validate that new_element_values.len() >= zkp_batch_size and that each
corresponding input slice (the low_element_* and new_element_* sources used
later) has at least zkp_batch_size elements; if any check fails return a
ProverClientError instead of panicking. Move the check to just before the line
creating the slice (currently using let new_element_values =
&new_element_values[..zkp_batch_size];), only perform the slice after
validation, and use a single safe variable (e.g., batch_len = zkp_batch_size)
when creating capacities and indexing in subsequent loops so all
Vec::with_capacity(...) and later indexing use the validated length. Ensure the
error returned is descriptive (e.g., "truncated batch from indexer") and that
all low_element_* and new_element_* sources are validated the same way to avoid
any later panics.

Comment on lines +54 to +57
let client = match reqwest::Client::builder().no_proxy().build() {
    Ok(client) => client,
    Err(_) => return false,
};

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify all reqwest client construction sites in prover client code
rg -nP --type=rust 'Client::new\(|Client::builder\(\)\.no_proxy\(' prover/client/src -C2

# Inspect ProofClient constructors for proxy-policy consistency
rg -nP --type=rust 'pub fn (local|with_config|with_full_config)\(' prover/client/src/proof_client.rs -A30

Repository: Lightprotocol/light-protocol

Length of output: 3753


🏁 Script executed:

# Check for any client construction helper functions
rg -nP --type=rust 'fn.*client.*Client' prover/client/src

# Search for HTTP_PROXY or proxy-related environment usage
rg -nP --type=rust 'HTTP_PROXY|http_proxy|proxy' prover/client/src -i

# Check if there's guidance in docs or CLAUDE.md
find . -maxdepth 2 -name "CLAUDE.md" -o -name "README.md" | head -5

Repository: Lightprotocol/light-protocol

Length of output: 259


🏁 Script executed:

# Check CLAUDE.md for guidance
head -100 ./CLAUDE.md

# Check README for proxy/client context
grep -i -A5 -B5 "proxy\|client\|health" ./README.md | head -50

Repository: Lightprotocol/light-protocol

Length of output: 6205


🏁 Script executed:

# Check for any tests related to health_check or ProofClient proxy behavior
rg -nP --type=rust 'health_check|ProofClient' prover/client/src --type-list | grep -i test

# Check for any comments about proxy or client construction
rg -nP --type=rust 'no_proxy|Client::new|Client::builder' prover/client/src -B2 | grep -E '//|/\*'

# Examine the full context around health_check to see if there are other related issues
cat -n prover/client/src/prover.rs | sed -n '50,70p'

Repository: Lightprotocol/light-protocol

Length of output: 1206


🏁 Script executed:

# Check the complete ProofClient implementation to understand all client usages
cat -n prover/client/src/proof_client.rs | head -130

# Search for any HTTP_PROXY environment variable usage in the entire prover crate
rg -i 'http_proxy' prover/ --type=rust

# Check if there are any integration tests for health_check or ProofClient
rg -nP --type=rust 'health_check|ProofClient' prover/client/src/lib.rs

Repository: Lightprotocol/light-protocol

Length of output: 4536


Unify HTTP client proxy behavior between health checks and prove calls.

The health_check function (line 54) explicitly disables proxies with Client::builder().no_proxy(), but all three ProofClient constructors—local(), with_config(), and with_full_config()—use Client::new(), which respects the HTTP_PROXY environment variable. This mismatch creates a false readiness condition: health checks pass via direct connection, but proof submissions fail if an HTTP proxy is configured.

Additionally, line 56 silently drops the client builder error, making troubleshooting harder. Add error logging and consolidate client construction into a shared helper so the two code paths cannot diverge.

Suggested direction
     let client = match reqwest::Client::builder().no_proxy().build() {
         Ok(client) => client,
-        Err(_) => return false,
+        Err(err) => {
+            tracing::warn!(error = %err, "failed to build health-check client");
+            return false;
+        }
     };

Then refactor ProofClient constructors to use the same client policy (shared helper preferred) so health and prove paths cannot drift under proxy settings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/prover.rs` around lines 54 - 57, health_check currently
builds a reqwest client with no_proxy() while ProofClient::local(),
ProofClient::with_config(), and ProofClient::with_full_config() use
Client::new(), causing divergent proxy behavior and a silent ignore of builder
errors; create a shared helper (e.g., build_http_client() ->
Result<reqwest::Client, anyhow::Error>) that constructs the client with the
desired proxy policy (no_proxy() or configurable as decided), return and
propagate the Result so callers can log errors instead of silently returning,
then refactor health_check and all ProofClient constructors to call this helper
(replace direct Client::new() and the current builder call), and ensure any
builder Err is logged via the existing logger before returning an error.

Comment on lines +1145 to +1154
                     if x.proof.len() < STATE_MERKLE_TREE_CANOPY_DEPTH {
                         return Err(IndexerError::InvalidParameters(format!(
                             "Merkle proof length ({}) is less than canopy depth ({})",
-                            proof_vec.len(),
+                            x.proof.len(),
                             STATE_MERKLE_TREE_CANOPY_DEPTH,
                         )));
                     }
-                    proof_vec.truncate(proof_vec.len() - STATE_MERKLE_TREE_CANOPY_DEPTH);
+                    let proof_len = x.proof.len() - STATE_MERKLE_TREE_CANOPY_DEPTH;

-                    let proof = proof_vec
+                    let proof = x.proof[..proof_len]

⚠️ Potential issue | 🟠 Major

Reject truncated state proofs before building MerkleProof.

Line 1145 only checks x.proof.len() < STATE_MERKLE_TREE_CANOPY_DEPTH. That means a 10-element payload turns into an empty proof, and any 11-25-element payload is silently returned short after Line 1152. sdk-libs/sdk-types/src/constants.rs fixes the state tree at height 26 with canopy depth 10, so this path should reject anything that does not leave exactly 16 siblings after canopy removal.

🛠️ Proposed fix
-                    if x.proof.len() < STATE_MERKLE_TREE_CANOPY_DEPTH {
-                        return Err(IndexerError::InvalidParameters(format!(
-                            "Merkle proof length ({}) is less than canopy depth ({})",
-                            x.proof.len(),
-                            STATE_MERKLE_TREE_CANOPY_DEPTH,
-                        )));
-                    }
-                    let proof_len = x.proof.len() - STATE_MERKLE_TREE_CANOPY_DEPTH;
+                    let proof_len = x
+                        .proof
+                        .len()
+                        .checked_sub(STATE_MERKLE_TREE_CANOPY_DEPTH)
+                        .ok_or_else(|| {
+                            IndexerError::InvalidParameters(format!(
+                                "Merkle proof length ({}) is less than canopy depth ({})",
+                                x.proof.len(),
+                                STATE_MERKLE_TREE_CANOPY_DEPTH,
+                            ))
+                        })?;
+                    let expected_proof_len =
+                        light_sdk_types::constants::STATE_MERKLE_TREE_HEIGHT
+                            - STATE_MERKLE_TREE_CANOPY_DEPTH;
+                    if proof_len != expected_proof_len {
+                        return Err(IndexerError::InvalidParameters(format!(
+                            "Merkle proof length ({}) leaves {} siblings after canopy removal; expected {}",
+                            x.proof.len(),
+                            proof_len,
+                            expected_proof_len,
+                        )));
+                    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/indexer/photon_indexer.rs` around lines 1145 - 1154, The
code currently only checks x.proof.len() < STATE_MERKLE_TREE_CANOPY_DEPTH which
allows truncated proofs; update the validation before building the MerkleProof
(around x.proof, proof_len logic) to require the post-canopy proof length equals
the expected sibling count (i.e. expected_siblings = STATE_MERKLE_TREE_HEIGHT -
STATE_MERKLE_TREE_CANOPY_DEPTH or the known constant 16 for height 26/canopy
10). If x.proof.len() does not equal STATE_MERKLE_TREE_CANOPY_DEPTH +
expected_siblings (or proof_len != expected_siblings) return
Err(IndexerError::InvalidParameters(...)) instead of silently slicing and
constructing the MerkleProof.
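The strict check can be sketched in isolation. This is a hypothetical standalone version of the validation (not the repository's actual function); the constants mirror the values cited in the review (tree height 26, canopy depth 10, so exactly 16 siblings must remain after canopy removal):

```rust
// Constants assumed from the review; the real values live in
// sdk-libs/sdk-types/src/constants.rs.
const STATE_MERKLE_TREE_HEIGHT: usize = 26;
const STATE_MERKLE_TREE_CANOPY_DEPTH: usize = 10;

// Hypothetical helper: returns the post-canopy sibling count or a
// descriptive error for any proof length that is too short or too long.
fn validate_proof_len(proof_len: usize) -> Result<usize, String> {
    // checked_sub rejects proofs shorter than the canopy in one step.
    let siblings = proof_len
        .checked_sub(STATE_MERKLE_TREE_CANOPY_DEPTH)
        .ok_or_else(|| {
            format!(
                "Merkle proof length ({}) is less than canopy depth ({})",
                proof_len, STATE_MERKLE_TREE_CANOPY_DEPTH
            )
        })?;
    let expected = STATE_MERKLE_TREE_HEIGHT - STATE_MERKLE_TREE_CANOPY_DEPTH;
    if siblings != expected {
        return Err(format!(
            "proof leaves {} siblings after canopy removal; expected {}",
            siblings, expected
        ));
    }
    Ok(siblings)
}

fn main() {
    assert_eq!(validate_proof_len(26), Ok(16));
    assert!(validate_proof_len(10).is_err()); // would have become an empty proof
    assert!(validate_proof_len(20).is_err()); // the silently-short case
    println!("ok");
}
```

The length-equality check is what closes the gap: `checked_sub` alone still lets 11-to-25-element payloads through.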

Comment on lines +128 to +137
    pub fn reconstruct_proofs<const HEIGHT: usize>(
        &self,
        address_range: std::ops::Range<usize>,
    ) -> Result<Vec<[[u8; 32]; HEIGHT]>, IndexerError> {
        self.validate_proof_height::<HEIGHT>()?;
        let node_lookup = self.build_node_lookup();
        let mut proofs = Vec::with_capacity(address_range.len());

        for address_idx in address_range {
            proofs.push(self.reconstruct_proof_with_lookup::<HEIGHT>(address_idx, &node_lookup)?);

⚠️ Potential issue | 🟠 Major

Validate address_range before reserving batch capacity.

Line 134 allocates address_range.len() proofs before any bounds check. A caller can pass an oversized range and force a huge reservation attempt long before Line 137 reports the first bad index. Reject ranges outside the available proof count up front.

🛠️ Proposed fix
     pub fn reconstruct_proofs<const HEIGHT: usize>(
         &self,
         address_range: std::ops::Range<usize>,
     ) -> Result<Vec<[[u8; 32]; HEIGHT]>, IndexerError> {
         self.validate_proof_height::<HEIGHT>()?;
+        if address_range.start > address_range.end {
+            return Err(IndexerError::InvalidParameters(format!(
+                "address_range start {} is greater than end {}",
+                address_range.start, address_range.end
+            )));
+        }
+        let available = self.low_element_indices.len();
+        if address_range.end > available {
+            return Err(IndexerError::InvalidParameters(format!(
+                "address_range {}..{} exceeds available proofs {}",
+                address_range.start, address_range.end, available
+            )));
+        }
         let node_lookup = self.build_node_lookup();
         let mut proofs = Vec::with_capacity(address_range.len());
 
         for address_idx in address_range {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/indexer/types/queue.rs` around lines 128 - 137, The
method reconstruct_proofs currently reserves capacity using address_range.len()
before checking that the requested range is within the available proofs;
validate the range bounds up front (before Vec::with_capacity) by comparing
address_range.start and address_range.end against the available proof count
(e.g., self.proof_count() or the relevant collection length on self), and return
an IndexerError for out-of-range requests; only then call Vec::with_capacity and
proceed to call build_node_lookup and reconstruct_proof_with_lookup::<HEIGHT>.
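The validate-before-reserve pattern can be shown with a minimal sketch (hypothetical names, not the repository's code): bounds are checked against the backing collection before `Vec::with_capacity` is ever called, so a hostile range is rejected instead of triggering a huge reservation.

```rust
// Hypothetical helper demonstrating range validation before allocation.
// `available` stands in for the collection length on self
// (e.g. self.low_element_indices.len()).
fn checked_slice(range: std::ops::Range<usize>, available: usize) -> Result<Vec<usize>, String> {
    if range.start > range.end {
        return Err(format!(
            "range start {} is greater than end {}",
            range.start, range.end
        ));
    }
    if range.end > available {
        return Err(format!(
            "range {}..{} exceeds available proofs {}",
            range.start, range.end, available
        ));
    }
    // Only now is the reservation bounded by a validated length.
    let mut out = Vec::with_capacity(range.len());
    for idx in range {
        out.push(idx);
    }
    Ok(out)
}

fn main() {
    assert_eq!(checked_slice(1..3, 5), Ok(vec![1, 2]));
    // Rejected up front, before any allocation is attempted.
    assert!(checked_slice(0..usize::MAX, 5).is_err());
    println!("ok");
}
```

With the checks reversed, `0..usize::MAX` would hit the allocator before the first out-of-bounds index is ever reported.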

Comment on lines +120 to +124
    let pda_groups = group_pda_specs(&cold_pdas, MAX_PDAS_PER_IX);
    let pda_hashes = pda_groups
        .iter()
        .map(|group| collect_pda_hashes(group))
        .collect::<Result<Vec<_>, _>>()?;

⚠️ Potential issue | 🟡 Minor

Keep MissingPdaCompressed.index unique across the whole load.

collect_pda_hashes(group) restarts enumerate() for every batch, so a failure in the second or third group can report index: 0 again. That makes the index ambiguous in logs/tests. Carry the cold-PDA offset into each group, or drop the index from this variant for consistency with MissingPdaCompressedData.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/interface/load_accounts.rs` around lines 120 - 124, The
failure index reported by MissingPdaCompressed is non-unique because
collect_pda_hashes(group) restarts enumerate() per batch; change the code so the
global cold-PDA offset is carried into each group (or remove the index from
MissingPdaCompressed). Concretely: modify collect_pda_hashes to accept a
start_index/offset (or create a new collect_pda_hashes_with_offset) and when
mapping pda_groups pass the accumulated offset (e.g., outer_idx *
MAX_PDAS_PER_IX or running total) so that the inner enumerate adds that offset
when constructing MissingPdaCompressed.index; alternatively, remove the index
field from MissingPdaCompressed to match MissingPdaCompressedData for
consistency.
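The offset-carrying variant can be sketched as follows. This is an illustrative stand-in, not the repository's `collect_pda_hashes`: each chunk's enumeration is shifted by `chunk_idx * chunk_size`, so the reported index is unique across the whole load (a small chunk size is used here just to keep the example readable).

```rust
// Hypothetical sketch: carry a global offset into each chunk so per-item
// indices stay unique across batches. The real code would use
// MAX_PDAS_PER_IX (8) as the chunk size.
const CHUNK_SIZE: usize = 3;

fn global_indices(items: &[u8]) -> Vec<(usize, u8)> {
    items
        .chunks(CHUNK_SIZE)
        .enumerate()
        .flat_map(|(chunk_idx, chunk)| {
            let offset = chunk_idx * CHUNK_SIZE;
            chunk
                .iter()
                .enumerate()
                // offset + i is unique across the whole load, not per batch.
                .map(move |(i, item)| (offset + i, *item))
        })
        .collect()
}

fn main() {
    let items = [10u8, 11, 12, 13, 14];
    let indexed = global_indices(&items);
    // The second chunk starts at global index 3, not 0.
    assert_eq!(indexed[3], (3, 13));
    assert_eq!(indexed[4], (4, 14));
    println!("ok");
}
```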

Comment on lines +244 to +259
 async fn fetch_individual_proofs<I: Indexer>(
     hashes: &[[u8; 32]],
     indexer: &I,
 ) -> Result<Vec<ValidityProofWithContext>, IndexerError> {
     if hashes.is_empty() {
         return Ok(vec![]);
     }
-    let mut proofs = Vec::with_capacity(hashes.len());
-    for hash in hashes {
-        proofs.push(
-            indexer
-                .get_validity_proof(vec![*hash], vec![], None)
-                .await?
-                .value,
-        );
 
+    futures::future::try_join_all(hashes.iter().map(|hash| async move {
+        indexer
+            .get_validity_proof(vec![*hash], vec![], None)
+            .await
+            .map(|response| response.value)
+    }))
+    .await
 }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and the specific functions
wc -l sdk-libs/client/src/interface/load_accounts.rs

Repository: Lightprotocol/light-protocol

Length of output: 122


🏁 Script executed:

# Read the file around the specified line ranges
sed -n '240,280p' sdk-libs/client/src/interface/load_accounts.rs

Repository: Lightprotocol/light-protocol

Length of output: 1081


🏁 Script executed:

# Search for fetch_proof_batches to see the full function
rg -A 20 "fn fetch_proof_batches" sdk-libs/client/src/interface/load_accounts.rs

Repository: Lightprotocol/light-protocol

Length of output: 643


🏁 Script executed:

# Check imports at the top of the file to see what futures utilities are already used
head -50 sdk-libs/client/src/interface/load_accounts.rs

Repository: Lightprotocol/light-protocol

Length of output: 1704


🏁 Script executed:

# Search for any documentation about concurrency or request limits in the codebase
rg -i "concurrent|buffer" sdk-libs/client/README.md program-libs/batched-merkle-tree/README.md

Repository: Lightprotocol/light-protocol

Length of output: 301


🏁 Script executed:

# Check if there are rate limiting or concurrent request patterns elsewhere in the codebase
rg -i "buffered|buffer_unordered|rate|limit" sdk-libs/client/src/ --type rs | head -20

Repository: Lightprotocol/light-protocol

Length of output: 98


🏁 Script executed:

# Check the Indexer trait definition to understand the API better
fd "indexer.rs" sdk-libs/client/src/

Repository: Lightprotocol/light-protocol

Length of output: 152


🏁 Script executed:

# Look at the Indexer trait definition
rg -A 30 "trait Indexer" sdk-libs/client/src/indexer/

Repository: Lightprotocol/light-protocol

Length of output: 2653


🏁 Script executed:

# Check if StreamExt is already imported elsewhere in the project
rg "StreamExt|TryStreamExt" sdk-libs/client/src/ --type rs

Repository: Lightprotocol/light-protocol

Length of output: 98


🏁 Script executed:

# Look for any documentation about concurrent requests or connection limits in indexer docs
rg -i "concurrent|connection|limit|rate" sdk-libs/client/ --type md

Repository: Lightprotocol/light-protocol

Length of output: 130


🏁 Script executed:

# Find the get_validity_proof method signature
rg "get_validity_proof" sdk-libs/client/src/indexer/ -A 10

Repository: Lightprotocol/light-protocol

Length of output: 6433


🏁 Script executed:

# Check if there are any tests or examples showing concurrent request issues
rg -i "try_join_all|concurrent" sdk-libs/client/src/

Repository: Lightprotocol/light-protocol

Length of output: 844


🏁 Script executed:

# Look at how the load_accounts functions are called to understand concurrency impact
rg -B 5 "fetch_individual_proofs\|fetch_proof_batches" sdk-libs/client/src/

Repository: Lightprotocol/light-protocol

Length of output: 54


🏁 Script executed:

# Check the README.md for sdk-libs/client to understand best practices
cat sdk-libs/client/README.md

Repository: Lightprotocol/light-protocol

Length of output: 3145


Cap the concurrent proof RPCs.

Both fetch_individual_proofs and fetch_proof_batches spawn all requests concurrently via try_join_all, which can exceed indexer/provider limits and cause flakiness with large cold-account sets. Limit concurrency while preserving order (the later zip operations depend on positional matching).

💡 One way to bound concurrency while preserving order
+use futures::{StreamExt, TryStreamExt};
+
 const MAX_ATAS_PER_IX: usize = 8;
 const MAX_PDAS_PER_IX: usize = 8;
+const MAX_CONCURRENT_PROOF_REQUESTS: usize = 4;
 
 async fn fetch_individual_proofs<I: Indexer>(
     hashes: &[[u8; 32]],
     indexer: &I,
 ) -> Result<Vec<ValidityProofWithContext>, IndexerError> {
     if hashes.is_empty() {
         return Ok(vec![]);
     }
 
-    futures::future::try_join_all(hashes.iter().map(|hash| async move {
-        indexer
-            .get_validity_proof(vec![*hash], vec![], None)
-            .await
-            .map(|response| response.value)
-    }))
-    .await
+    futures::stream::iter(hashes.iter().copied())
+        .map(|hash| async move {
+            indexer
+                .get_validity_proof(vec![hash], vec![], None)
+                .await
+                .map(|response| response.value)
+        })
+        .buffered(MAX_CONCURRENT_PROOF_REQUESTS)
+        .try_collect()
+        .await
 }

Apply the same pattern to fetch_proof_batches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/interface/load_accounts.rs` around lines 244 - 259,
fetch_individual_proofs spawns all RPCs at once which can overwhelm the indexer;
replace the try_join_all approach with a bounded concurrency stream so requests
are limited but results keep original order. Use futures::stream::iter over the
hashes, map each hash to the async indexer.get_validity_proof call, then apply
StreamExt::buffered(CONCURRENCY) (or a named constant) and collect the responses
into a Vec to preserve positional order; apply the same pattern to
fetch_proof_batches (use the same CONCURRENCY constant and the same
stream->buffered->collect flow) so both functions limit concurrent RPCs while
keeping result ordering for subsequent zip operations.

Comment on lines +729 to +731
// Consumers use start_index as the sparse tree's next insertion index,
// not the pagination offset used for queue slicing.
start_index: address_tree_bundle.right_most_index() as u64,

⚠️ Potential issue | 🟠 Major

Preserve start_index pagination semantics in address queue response

Line 731 repurposes start_index to tree insertion index, but the returned addresses slice is still paginated by options.address_queue_start_index (Lines 680-686). This makes response metadata inconsistent with the page and diverges from the indexer contract used by the photon implementation.

Suggested fix
-                        // Consumers use start_index as the sparse tree's next insertion index,
-                        // not the pagination offset used for queue slicing.
-                        start_index: address_tree_bundle.right_most_index() as u64,
+                        // Keep start_index aligned with the paginated queue slice.
+                        start_index: start as u64,

If you need sparse-tree next insertion index, expose it via a separate field instead of overloading start_index.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 729 - 731,
The response currently sets start_index to
address_tree_bundle.right_most_index(), which breaks pagination semantics
because the returned addresses slice is still paginated by
options.address_queue_start_index; revert start_index to reflect the pagination
offset (use options.address_queue_start_index) and add a new field (e.g.,
tree_next_insertion_index or insertion_index) to the response struct to carry
address_tree_bundle.right_most_index() as u64; update the response construction
where start_index and addresses are populated and adjust the response
type/struct (and any builders) to include the new field so callers can read the
sparse-tree next insertion index without overloading start_index.
