WIP: Add retry logic for WellKnownAvailable endpoint checks by sdodson · Pull Request #779 · openshift/cluster-authentication-operator

sdodson · 2025-08-11T19:02:00Z

The operator now retries the well-known endpoint check every second for up to 15 seconds before setting itself unavailable. This prevents premature "unavailable" status during temporary network issues or API server startup scenarios.

Changes:

- Add retry loop with 15-second timeout and 1-second intervals
- Preserve existing ControllerProgressingError handling logic
- Set WellKnownAvailable=False with reason "NotReady" after retries exhausted
- Maintain proper progressing status during retry attempts

This change improves operator reliability by giving well-known endpoints time to become ready while maintaining the same final error handling behavior.

Assisted-by: Cursor, Claude Sonnet 4

Related to https://issues.redhat.com/browse/OCPBUGS-20056. Because this change addresses only one of the failure paths, I'm not linking the PR to that bug until I can show whether it actually helps.

sdodson · 2025-08-11T19:02:35Z

/test ?

openshift-ci · 2025-08-11T19:02:35Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdodson
Once this PR has been reviewed and has the lgtm label, please assign liouk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-08-11T19:02:39Z

@sdodson: The following commands are available to trigger required jobs:

/test e2e-agnostic

/test e2e-agnostic-upgrade

/test e2e-console-login

/test e2e-gcp-operator-encryption-perf

/test e2e-gcp-operator-encryption-rotation

/test e2e-oidc-techpreview

/test e2e-operator

/test e2e-operator-encryption

/test images

/test okd-scos-images

/test unit

/test verify

/test verify-bindata

/test verify-deps

The following commands are available to trigger optional jobs:

/test e2e-agnostic-ipv6

/test e2e-aws-external-oidc

/test e2e-aws-single-node

/test e2e-azure-external-oidc

/test e2e-gcp-external-oidc

/test okd-scos-e2e-aws-ovn

/test test-operator-integration

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-cluster-authentication-operator-master-e2e-agnostic

pull-ci-openshift-cluster-authentication-operator-master-e2e-agnostic-ipv6

pull-ci-openshift-cluster-authentication-operator-master-e2e-agnostic-upgrade

pull-ci-openshift-cluster-authentication-operator-master-e2e-aws-single-node

pull-ci-openshift-cluster-authentication-operator-master-e2e-console-login

pull-ci-openshift-cluster-authentication-operator-master-e2e-operator

pull-ci-openshift-cluster-authentication-operator-master-images

pull-ci-openshift-cluster-authentication-operator-master-okd-scos-e2e-aws-ovn

pull-ci-openshift-cluster-authentication-operator-master-okd-scos-images

pull-ci-openshift-cluster-authentication-operator-master-test-operator-integration

pull-ci-openshift-cluster-authentication-operator-master-unit

pull-ci-openshift-cluster-authentication-operator-master-verify

pull-ci-openshift-cluster-authentication-operator-master-verify-bindata

pull-ci-openshift-cluster-authentication-operator-master-verify-deps

Details

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

sdodson · 2025-08-11T19:06:59Z

/payload-test periodic-ci-openshift-release-master-ci-4.20-e2e-aws-upgrade-ovn-single-node

openshift-ci · 2025-08-11T19:07:01Z

@sdodson: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

sdodson · 2025-08-11T19:07:30Z

/payload-job periodic-ci-openshift-release-master-ci-4.20-e2e-aws-upgrade-ovn-single-node

openshift-ci · 2025-08-11T19:07:33Z

@sdodson: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.20-e2e-aws-upgrade-ovn-single-node

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6e4f93f0-76e6-11f0-86f8-a1b16a4176a6-0

openshift-ci · 2025-10-15T12:15:36Z

@sdodson: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-bot · 2026-01-14T09:00:21Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

coderabbitai · 2026-01-14T09:00:32Z

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)

do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

openshift-bot · 2026-02-14T00:30:09Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Aug 11, 2025

openshift-ci bot requested review from ibihim and liouk August 11, 2025 19:02

sdodson marked this pull request as draft August 11, 2025 19:11

openshift-ci bot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Jan 14, 2026

openshift-ci bot added the lifecycle/rotten label (denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label Feb 14, 2026

sdodson wants to merge 1 commit into openshift:master from sdodson:OCPBUGS-20056

2 participants
