feat: Add infractl wait command#1768

Open
msugakov wants to merge 14 commits into master from misha/add-wait-command

Conversation


@msugakov commented Mar 5, 2026

Per title.
I think it's going to be beneficial for reducing time in CI: you trigger the cluster creation without waiting, do something else, then come back and wait until the cluster is ready, then start tests.

This PR can be reviewed by commits.

Tested manually.

@msugakov requested review from a team and rhacs-bot as code owners March 5, 2026 12:59
@rhacs-bot
Contributor

A single node development cluster (infra-pr-1768) was allocated in production infra for this PR.

CI will attempt to deploy quay.io/rhacs-eng/infra-server: to it.

🔌 You can connect to this cluster with:

gcloud container clusters get-credentials infra-pr-1768 --zone us-central1-a --project acs-team-temp-dev

🛠️ And pull infractl from the deployed dev infra-server with:

nohup kubectl -n infra port-forward svc/infra-server-service 8443:8443 &
make pull-infractl-from-dev-server

🚲 You can then use the dev infra instance e.g.:

bin/infractl -k -e localhost:8443 whoami

⚠️ Any clusters that you start using your dev infra instance should have a lifespan shorter than the development cluster instance. Otherwise they will not be destroyed, because the dev infra instance ceases to exist when the development cluster is deleted. ⚠️

Further Development

☕ If you make changes, you can commit and push and CI will take care of updating the development cluster.

🚀 If you only modify configuration (chart/infra-server/configuration) or templates (chart/infra-server/{static,templates}), you can get a faster update with:

make helm-deploy

Logs

Logs for the development infra depend on your @redhat.com authuser. Alternatively:

kubectl -n infra logs -l app=infra-server --tail=1 -f

@msugakov mentioned this pull request Mar 5, 2026
@msugakov force-pushed the misha/add-wait-command branch from 15b1901 to bf94f66 on March 5, 2026 18:44
msugakov added 2 commits March 5, 2026 19:53
```
  Running [/home/runner/golangci-lint-2.10.1-linux-amd64/golangci-lint config path] in [/home/runner/work/infra/infra] ...
  Running [/home/runner/golangci-lint-2.10.1-linux-amd64/golangci-lint config verify] in [/home/runner/work/infra/infra] ...
  Running [/home/runner/golangci-lint-2.10.1-linux-amd64/golangci-lint run] in [/home/runner/work/infra/infra] ...
  Error: cmd/infractl/flavor/list/fancy.go:1:9: var-naming: avoid package names that conflict with Go standard library package names (revive)
  package list
          ^
  Error: pkg/buildinfo/buildinfo.go:3:9: var-naming: avoid package names that conflict with Go standard library package names (revive)
  package buildinfo
          ^
  Error: pkg/service/metrics/metrics.go:2:9: var-naming: avoid package names that conflict with Go standard library package names (revive)
  package metrics
          ^
  3 issues:
  * revive: 3

  Error: issues found
```

https://github.com/stackrox/infra/actions/runs/22721785912/job/65885650995?pr=1768#step:4:40
I did not break those; I did not even touch them.

Confirmed they started failing on a no-op PR, see
#1769 and
https://github.com/stackrox/infra/actions/runs/22730204288/job/65916882489?pr=1769
Not sure why `master` wasn't affected.

Suppressing.

Used these docs
https://golangci-lint.run/docs/linters/false-positives/#exclude-issues-by-path

I tried `package blah //nolint:var-naming` but that did not work.
I did not do
```
//nolint:var-naming
package blah
```
because that would turn off the rule for the entire file.
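For reference, a path-based exclusion following those docs might look roughly like this in `.golangci.yml` (a sketch of the approach only — the path shown and the `text` matcher are illustrative assumptions, not necessarily this PR's exact change):

```yaml
version: "2"
linters:
  exclusions:
    rules:
      # Silence only revive's var-naming complaint about package names
      # that shadow stdlib packages, and only in this specific path.
      - path: pkg/buildinfo/buildinfo\.go
        linters:
          - revive
        text: "var-naming"
```

Unlike a file-level `//nolint:...` directive above the `package` clause, a scoped rule like this can be narrowed by `text`, so other revive findings in the same file still surface.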
I think the value of this test is nearly zero, but for consistency...
The `infractl artifacts` command goes without a test, though.
@msugakov changed the title from *WIP* feat: Add infractl wait command to feat: Add infractl wait command Mar 5, 2026
@msugakov force-pushed the misha/add-wait-command branch from bf94f66 to 33243d6 on March 5, 2026 20:00
Contributor

@davdhacs left a comment


+1 setting a retry limit will be useful.


```go
const (
	maxWaitErrorsFlagName           = "wait-max-errors"
	defaultMaxConsecutiveWaitErrors = 10
)
```
Contributor


That's a default of 10 retries × 30 s, right? 5 minutes may be too short for the default?

Contributor


Should it default to infinite (or functionally longer than any creation will allow) to match the existing behavior? Maybe it isn't used by anyone though and this is a good point to change the default to a sane time limit.

```go
cmd.Flags().String("description", "", "description for this cluster")
cmd.Flags().Duration("lifespan", 3*time.Hour, "initial lifespan of the cluster")
cmd.Flags().Bool("wait", false, "wait for cluster to be ready")
common.AddMaxWaitErrorsFlag(cmd)
```
Contributor


Nice. I don't use `--wait`; I think that's because it is unlimited and the retry interval isn't configurable.
