Skip to content

Add optional traversal limit for commit offset lookup#4007

Open
NssGourav wants to merge 2 commits intoapache:mainfrom
NssGourav:fix/commit-offset-traversal-limit
Open

Add optional traversal limit for commit offset lookup#4007
NssGourav wants to merge 2 commits intoapache:mainfrom
NssGourav:fix/commit-offset-traversal-limit

Conversation

@NssGourav
Copy link
Copy Markdown

@NssGourav NssGourav commented Mar 17, 2026

Summary

  • Add a new polaris.persistence.commit-offset-lookup-max-traversal parameter (default 0 = unlimited) to optionally bound commit history traversal while resolving offsets.
  • Enforce the limit in CommitsImpl.commitLog() and CommitsImpl.commitLogReversed() during offset lookup, failing fast with a clear exception when exceeded.
  • Add tests covering the fail-fast behavior for both variants.

Rationale

CommitsImpl currently walks commit history unbounded when resolving offsets. In pathological cases (very large histories or missing/far-back offsets) this can lead to unbounded work.

The safeguard is opt-in (default unlimited) to avoid breaking background/maintenance flows that may legitimately traverse the full log.

Test plan

  • ./gradlew :persistence:nosql:persistence-impl:test

Fixes #4000

@github-project-automation github-project-automation Bot moved this to PRs In Progress in Basic Kanban Board Mar 17, 2026
@NssGourav NssGourav closed this Mar 17, 2026
@NssGourav NssGourav deleted the fix/commit-offset-traversal-limit branch March 17, 2026 08:44
@github-project-automation github-project-automation Bot moved this from PRs In Progress to Done in Basic Kanban Board Mar 17, 2026
@NssGourav NssGourav restored the fix/commit-offset-traversal-limit branch March 17, 2026 08:46
@NssGourav NssGourav reopened this Mar 17, 2026
@github-project-automation github-project-automation Bot moved this from Done to PRs In Progress in Basic Kanban Board Mar 17, 2026
@NssGourav NssGourav marked this pull request as draft March 17, 2026 08:47
@NssGourav NssGourav marked this pull request as ready for review March 17, 2026 08:58
@Subham-KRLX
Copy link
Copy Markdown
Contributor

Subham-KRLX commented Mar 17, 2026

A few things worth addressing like the limit is still inside CommitsImpl, but @snazy mentioned the TODO was meant for the call sites (REST/service layer), not the implementation itself worth clarifying with him before going further. There's also a counter bug where commitLog and commitLogReversed count traversals differently since traversed starts at 0 in one and 1 in the other, so the same maxTraversal value behaves inconsistently. On the exception side, IllegalStateException feels off for an expected operational limit a dedicated exception would be easier to catch specifically. For the tests, the .hasNext() in the commitLog test is misleading since the exception fires during the commitLog() call itself, not lazily, and there's also no test for the success case where the offset is found within the limit. Overall the approach is reasonable, just needs a quick sync with the maintainer on the architectural question first.

@NssGourav
Copy link
Copy Markdown
Author

Thanks agreed on the feedback.

I understand the TODOs may be intended for call-site guardrails; I implemented an opt-in safeguard in CommitsImpl (commit-offset-lookup-max-traversal, default 0 = unlimited) to avoid impacting maintenance flows unless configured.

Fixed traversal counting so commitLog and commitLogReversed behave consistently.
Replaced IllegalStateException with CommitOffsetTraversalLimitExceededException for easier catching.

Updated tests to assert at the right point and added success path coverage.

@dimas-b dimas-b requested a review from snazy March 17, 2026 14:47
@NssGourav NssGourav marked this pull request as draft March 21, 2026 05:05
@Subham-KRLX
Copy link
Copy Markdown
Contributor

@dimas-b can u please review this pr.

@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented Apr 16, 2026

The idea looks reasonable to me at first glance... Should the PR be non-draft for a wide review, though 😉

Please also rebase (just to have a recent base commit) and fix CI :)

if (tailId == offset) {
break outer;
}
if (maxTraversal != UNLIMITED_TRAVERSAL && ++traversed > maxTraversal) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please make the ++ operation more prominent (for ease of following the logic), e.g. do it on a separate line.

throw new CommitOffsetTraversalLimitExceededException(
refName, offset, traversed, maxTraversal);
}
visited.add(tailId);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't traversed equivalent to visited.size()?

* <p>A value of {@code 0} disables the safeguard (unlimited traversal).
*/
@WithDefault("0")
long commitOffsetLookupMaxTraversal();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe OptionalLong?

var type = head.type().id();

// TODO add safeguard to limit the work done when finding the commit with ID 'offset'
var maxTraversal = persistence.params().commitOffsetLookupMaxTraversal();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guardrail in this case is different in nature from commitLogReversed. Here we do not accumulate memory, AFAIK. It might be preferable to use different config settings for these cases.

Also, commitLogReversed is not called in "prod" code ATM.


// Only walk, if the most recent commit ID is != offset
if (head.id() == off) {
return singletonList(head).iterator();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snazy : is this correct? 🤔

@Subham-KRLX
Copy link
Copy Markdown
Contributor

@NssGourav Could you please take a look at the recent review comments from @dimas-b and apply them to the current code? Also it looks like there are a couple of CI checks failing right now that will need to be fixed. Let us know if you need any help!

@NssGourav NssGourav force-pushed the fix/commit-offset-traversal-limit branch 3 times, most recently from 6dc6db0 to 3166814 Compare April 17, 2026 14:32
@NssGourav NssGourav closed this Apr 17, 2026
@NssGourav NssGourav force-pushed the fix/commit-offset-traversal-limit branch from 3166814 to fdb5ce2 Compare April 17, 2026 14:32
@github-project-automation github-project-automation Bot moved this from PRs In Progress to Done in Basic Kanban Board Apr 17, 2026
@NssGourav NssGourav reopened this Apr 17, 2026
@github-project-automation github-project-automation Bot moved this from Done to PRs In Progress in Basic Kanban Board Apr 17, 2026
@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented Apr 21, 2026

@NssGourav : Please rebase this PR to get recent CI job changes.

Introduce a PersistenceParams knob to bound offset resolution in CommitsImpl and fail fast when exceeded, with coverage verifying the safeguard for both commitLog variants.

Made-with: Cursor
Use a dedicated exception for limit exceedance, make traversal counting consistent across log variants, and expand tests to cover both fail-fast and success paths.

Made-with: Cursor
@NssGourav NssGourav force-pushed the fix/commit-offset-traversal-limit branch from 6dc6db0 to b626841 Compare April 21, 2026 11:47
@NssGourav NssGourav marked this pull request as ready for review April 21, 2026 11:48
@snazy
Copy link
Copy Markdown
Member

snazy commented Apr 21, 2026

I'm still a bit concerned about adding a global setting that trickles down to all call sites (currently only maintenance).
Maintenance has its own limit and if configured with a higher value than a global one would be confusing, technically even wrong.

@NssGourav
Copy link
Copy Markdown
Author

Thanks, that makes sense, a global traversal limit in persistence can definitely become confusing if maintenance already has a separate limit, and I agree that would be the wrong layering.

I’ll revise the PR to avoid introducing a global setting here and keep the safeguard scoped to the specific caller/use case instead, then I’ll rerun CI and update the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add safeguard to limit commit history traversal in CommitsImpl when using offset

4 participants