fix: review-agent duplicate bot responses due to missing API pagination #74690
The review-agent periodic job (every 3 hours) posted 8 nearly identical responses to the same question on PR openshift/hypershift#7561 between Feb 9-10.

Root cause: comment_analyzer.py fetches issue comments without --paginate, so GitHub returns only the first 30 (default page size). Bot replies land at the end of the comment list, making them invisible to the analyzer on PRs with 30+ comments.

Empirical evidence from PR openshift#7561:
- Total comments: 38 (spans 2 pages)
- Without --paginate: 30 returned, 7 bot comments visible
- With --paginate: 38 returned, all 15 bot comments visible
- The 8 bot comments from Feb 9-10 are ALL on page 2
- Analyzer saw last bot reply as Feb 6 instead of Feb 10
- Every human comment since Feb 6 was re-flagged each run

The cascade: page 1 misses recent bot replies → last_bot_time is stale → all newer human comments flagged → Claude re-responds → more comments push bot replies further onto page 2 → problem worsens.

Within-run duplicates (30-70s apart) are caused by two additional issues: (1) check_replied.py path resolution relies on CLAUDE_PLUGIN_ROOT env var which may not be inherited by Claude's Bash tool in CI, and (2) GitHub API propagation delay means a just-posted reply may not be visible when checking for duplicates before the next reply.

Changes:
- Add --paginate to both gh api calls in comment_analyzer.py (analyze_review_bodies and analyze_issue_comments)
- Add explicit check_replied.py path in REVIEW_CONTEXT so Claude doesn't depend on CLAUDE_PLUGIN_ROOT inheritance
- Add within-session file-based dedup tracking to guard against API propagation delays

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
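To illustrate the root cause, here is a minimal sketch (not the actual comment_analyzer.py code) of how a truncated first page produces a stale last bot reply time. It assumes paginated output may arrive either as one JSON array or as several arrays printed back to back, which can depend on the gh version and flags. The helper names `parse_paginated_json` and `last_bot_time`, the bot login `review-bot[bot]`, and all timestamps other than the two quoted in this PR are hypothetical placeholders.

```python
import json


def parse_paginated_json(raw: str) -> list:
    """Flatten paginated JSON output into one list.

    Accepts a single JSON array or several JSON documents printed
    back to back (one per page).
    """
    decoder = json.JSONDecoder()
    items, pos = [], 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(raw) and raw[pos].isspace():
            pos += 1
        if pos >= len(raw):
            return items
        page, pos = decoder.raw_decode(raw, pos)
        items.extend(page if isinstance(page, list) else [page])


def last_bot_time(comments, bot_login):
    """Return the newest created_at among the bot's comments, or None."""
    times = [c["created_at"] for c in comments if c["user"]["login"] == bot_login]
    return max(times, default=None)


def _comment(login, created_at):
    return {"user": {"login": login}, "created_at": created_at}


# Synthetic data shaped like the counts reported for PR #7561:
# 30 comments on page 1 (7 from the bot), 8 on page 2 (all from the bot).
BOT = "review-bot[bot]"  # hypothetical login
page1 = (
    [_comment("human", "2026-02-01T00:00:00Z")] * 23
    + [_comment(BOT, "2026-02-02T00:00:00Z")] * 6
    + [_comment(BOT, "2026-02-06T08:24:15Z")]
)
page2 = (
    [_comment(BOT, "2026-02-09T20:00:00Z")] * 7
    + [_comment(BOT, "2026-02-10T14:09:33Z")]
)

raw = json.dumps(page1) + "\n" + json.dumps(page2)
all_comments = parse_paginated_json(raw)

print(last_bot_time(page1, BOT))         # first page only: stale
print(last_bot_time(all_comments, BOT))  # all pages: current
```

With only the first page, every human comment newer than Feb 6 looks unanswered, which matches the re-flagging described above.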
[REHEARSALNOTIFIER]
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-review-agent

@bryan-cox: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@bryan-cox: all tests passed!
/lgtm

/pj-rehearse ack

@bryan-cox: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bryan-cox, muraee
Summary
The review-agent periodic CI job (runs every 3 hours) is posting duplicate responses to PR comments. On openshift/hypershift#7561, the bot posted 8 nearly identical responses to the same question between Feb 9-10. This PR fixes three root causes.
Problem
Empirical evidence from PR #7561
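Total comments on the PR: 38 (spans 2 pages).
- Without --paginate: 30 comments returned, 7 bot comments visible, last bot reply seen as Feb 6 (2026-02-06T08:24:15Z)
- With --paginate: 38 comments returned, all 15 bot comments visible, last bot reply seen as Feb 10 (2026-02-10T14:09:33Z)
One human comment on Feb 9 at 18:23 UTC triggered 8 duplicate bot responses across 5 periodic runs:
- 2026-02-09T20:10:57Z
- 2026-02-09T23:07:36Z
- 2026-02-09T23:09:12Z
- 2026-02-10T08:07:05Z
- 2026-02-10T08:07:37Z
- 2026-02-10T11:06:45Z
- 2026-02-10T11:07:56Z
- 2026-02-10T14:09:33Z
Two distinct patterns are visible: cross-run duplicates (~3h apart, matching the periodic schedule) and within-run duplicates (30-96s apart, within a single session).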
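The two patterns can be checked directly from the eight timestamps. The sketch below groups replies into runs and computes the gaps inside each run. The 10-minute grouping threshold and the function names are assumptions made for illustration; they are not taken from the review-agent code.

```python
from datetime import datetime

# Bot reply timestamps copied from the evidence above.
BOT_REPLIES = [
    "2026-02-09T20:10:57Z",
    "2026-02-09T23:07:36Z",
    "2026-02-09T23:09:12Z",
    "2026-02-10T08:07:05Z",
    "2026-02-10T08:07:37Z",
    "2026-02-10T11:06:45Z",
    "2026-02-10T11:07:56Z",
    "2026-02-10T14:09:33Z",
]


def parse(ts):
    """Parse a GitHub-style UTC timestamp such as 2026-02-09T20:10:57Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def group_into_runs(timestamps, max_gap_seconds=600):
    """Group replies into runs; replies within max_gap_seconds share a run.

    The 600-second threshold is an assumption: far longer than the
    within-run gaps, far shorter than the 3-hour schedule.
    """
    runs = []
    for ts in sorted(parse(t) for t in timestamps):
        if runs and (ts - runs[-1][-1]).total_seconds() <= max_gap_seconds:
            runs[-1].append(ts)
        else:
            runs.append([ts])
    return runs


def within_run_gaps(runs):
    """Seconds between consecutive replies inside the same run."""
    return [
        int((later - earlier).total_seconds())
        for run in runs
        for earlier, later in zip(run, run[1:])
    ]


runs = group_into_runs(BOT_REPLIES)
print(len(runs))              # 5
print(within_run_gaps(runs))  # [96, 32, 71]
```

Three of the five runs posted twice, with gaps of 96, 32, and 71 seconds, which matches the 30-96s range quoted for within-run duplicates.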
Root cause 1: Missing --paginate (cross-run duplicates)
comment_analyzer.py fetches issue comments without --paginate in two places (analyze_review_bodies and analyze_issue_comments). GitHub's REST API returns 30 results per page by default. On PRs with more than 30 comments, only page 1 (the oldest comments) is returned. Bot replies accumulate at the end of the comment list, landing on page 2+.
The cascade: Analyzer misses bot replies on page 2 → last_bot_time is stale/None → all recent human comments flagged as needing attention → Claude re-responds → more comments push bot replies further onto page 2 → problem worsens each run.
Note: check_replied.py (the dedup script used by the address-reviews command) already uses --paginate correctly (line 158). The analyzer just missed it.
Root cause 2: check_replied.py path not reliably found in CI (within-run duplicates)
CLAUDE_PLUGIN_ROOT is exported in the shell script (line 12), but Claude Code's Bash tool initializes a new shell from the user's profile, not from the parent process environment. If the env var isn't inherited, the path resolution falls back to searching ~/.claude/plugins (which doesn't exist in CI), and Claude may proceed without the dedup check.
Root cause 3: API propagation delay (within-run duplicates)
When Claude posts reply A and immediately checks for existing replies before posting reply B, GitHub's API may not have propagated reply A yet. This explains the 30-96 second paired duplicates within a single run.
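A local record of handled comments avoids depending on the API having caught up. The sketch below shows one way a within-session, file-based guard could work. The tracking-file layout (one comment ID per line) and all function names are hypothetical; the PR describes this guard as instructions in REVIEW_CONTEXT and does not show its format.

```python
import tempfile
from pathlib import Path


def already_handled(tracking_file, comment_id):
    """Return True if this session already recorded a reply to comment_id."""
    if not tracking_file.exists():
        return False
    recorded = tracking_file.read_text(encoding="utf-8").split()
    return str(comment_id) in recorded


def record_handled(tracking_file, comment_id):
    """Append comment_id to the tracking file right after posting a reply."""
    with tracking_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{comment_id}\n")


def should_reply(tracking_file, comment_id, api_shows_reply):
    """Reply only if neither the API nor the local record shows a reply."""
    return not api_shows_reply and not already_handled(tracking_file, comment_id)


with tempfile.TemporaryDirectory() as tmp:
    tracking = Path(tmp) / "replied_ids.txt"  # hypothetical file name
    print(should_reply(tracking, 101, api_shows_reply=False))  # True
    record_handled(tracking, 101)
    # The API has not propagated the reply yet, but the local record has it.
    print(should_reply(tracking, 101, api_shows_reply=False))  # False
```

The local file is consulted in addition to the API check, not instead of it, so replies posted by earlier runs are still detected through the API.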
Changes
Single file: hypershift-review-agent-process-commands.sh
- Add --paginate to both gh api calls in comment_analyzer.py (analyze_review_bodies and analyze_issue_comments) so all comments across all pages are fetched
- Add explicit check_replied.py path in REVIEW_CONTEXT so Claude uses the script directly instead of relying on CLAUDE_PLUGIN_ROOT env var inheritance
- Add within-session file-based dedup tracking to REVIEW_CONTEXT to guard against API propagation delays between replies
Test plan
- --paginate now returns all 38 comments for PR #7561 (vs 30 without it)
- last_bot_time correctly shows Feb 10 timestamp instead of Feb 6
- /tmp/ai-helpers/plugins/utils/scripts/check_replied.py exists after the clone step in CI
🤖 Generated with Claude Code
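The first two test-plan checks could be reproduced with a sketch like the one below. It is not the code in comment_analyzer.py: the function names and the bot_login parameter are assumptions. The gh api --paginate flag and the repos/{owner}/{repo}/issues/{number}/comments endpoint are real. The parser accepts either a single JSON array or several concatenated arrays, so it does not rely on how the pages are joined in the output.

```python
import json
import subprocess


def gh_api_command(endpoint, paginate=True):
    """Build the gh CLI invocation; --paginate follows every result page."""
    cmd = ["gh", "api", endpoint]
    if paginate:
        cmd.append("--paginate")
    return cmd


def parse_pages(raw):
    """Flatten one or more concatenated JSON arrays into a single list."""
    decoder = json.JSONDecoder()
    items = []
    idx = 0
    while idx < len(raw):
        # raw_decode does not skip leading whitespace, so skip it here.
        while idx < len(raw) and raw[idx].isspace():
            idx += 1
        if idx >= len(raw):
            break
        value, idx = decoder.raw_decode(raw, idx)
        items.extend(value if isinstance(value, list) else [value])
    return items


def fetch_issue_comments(repo, pr_number, paginate=True):
    """Fetch issue comments for a PR (requires an authenticated gh CLI)."""
    endpoint = f"repos/{repo}/issues/{pr_number}/comments"
    result = subprocess.run(
        gh_api_command(endpoint, paginate),
        capture_output=True, text=True, check=True,
    )
    return parse_pages(result.stdout)


def last_bot_time(comments, bot_login):
    """Latest created_at among the bot's comments, or None if it has none.

    Timestamps in the fixed YYYY-MM-DDTHH:MM:SSZ form sort correctly as
    plain strings, so max() is sufficient here.
    """
    times = [c["created_at"] for c in comments if c["user"]["login"] == bot_login]
    return max(times) if times else None
```

Calling fetch_issue_comments("openshift/hypershift", 7561, paginate=False) and again with paginate=True, then comparing len() and last_bot_time() for the two results, would correspond to the 30 vs 38 and Feb 6 vs Feb 10 figures reported above.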