Probes configuration and override support#610

Open
fmount wants to merge 2 commits into openstack-k8s-operators:main from fmount:probes

@fmount
Contributor

@fmount fmount commented Feb 18, 2026

Add ProbeOverrides interface and CreateProbeSet() function from lib-common for unified probe management across Cinder services. Enable probe customization through CRD overrides and remove code duplication. Updates all services (API, Scheduler, Volume, Backup) to use the new pattern with proper scheme handling and consistent defaults.
In addition, webhook validation for probes has been introduced.

Depends-On: openstack-k8s-operators/lib-common#673
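
A minimal sketch of the override pattern described above, for readers who have not seen the lib-common change yet. It does not use the real `ProbeOverrides` interface or `CreateProbeSet()`; the `ProbeConf` type and the `mergeProbe` and `validateProbe` helpers are hypothetical stand-ins, and the numeric values are illustrative. The idea is only that per-service defaults are computed first, any non-zero override field from the CR replaces the matching default, and a webhook-style check rejects negative values.

```go
package main

import "fmt"

// ProbeConf is a hypothetical, simplified stand-in for a probe override spec.
// A zero field means "not set, keep the default".
type ProbeConf struct {
	TimeoutSeconds      int32
	PeriodSeconds       int32
	InitialDelaySeconds int32
}

// mergeProbe returns the defaults with every non-zero override field applied.
func mergeProbe(defaults ProbeConf, override *ProbeConf) ProbeConf {
	out := defaults
	if override == nil {
		return out
	}
	if override.TimeoutSeconds != 0 {
		out.TimeoutSeconds = override.TimeoutSeconds
	}
	if override.PeriodSeconds != 0 {
		out.PeriodSeconds = override.PeriodSeconds
	}
	if override.InitialDelaySeconds != 0 {
		out.InitialDelaySeconds = override.InitialDelaySeconds
	}
	return out
}

// validateProbe mimics the kind of check a validating webhook could run:
// it rejects negative timings and accepts everything else.
func validateProbe(p ProbeConf) error {
	if p.TimeoutSeconds < 0 || p.PeriodSeconds < 0 || p.InitialDelaySeconds < 0 {
		return fmt.Errorf("probe values must not be negative: %+v", p)
	}
	return nil
}

func main() {
	// Illustrative defaults; not the values used by the operator.
	defaults := ProbeConf{TimeoutSeconds: 30, PeriodSeconds: 30, InitialDelaySeconds: 5}
	// The user overrides only the timeout.
	override := &ProbeConf{TimeoutSeconds: 60}

	effective := mergeProbe(defaults, override)
	if err := validateProbe(effective); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Printf("effective probe: %+v\n", effective)
}
```

Treating zero as "not set" keeps the override struct small, at the cost of not being able to override a field to zero explicitly; pointer fields would be the usual alternative.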

@fmount fmount requested a review from stuggi February 18, 2026 20:21
@openshift-ci openshift-ci bot requested a review from eharney February 18, 2026 20:21
@openshift-ci
Contributor

openshift-ci bot commented Feb 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fmount

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fmount fmount changed the title from "Probe configuration with override support" to "Probes configuration and override support" Feb 18, 2026
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2813b48c394243ce9a2e8ef372592913

openstack-k8s-operators-content-provider FAILURE in 9m 19s
⚠️ cinder-operator-kuttl SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cinder-operator-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2896e982a2c54980ac9db89bd06ae45a

openstack-k8s-operators-content-provider FAILURE in 11m 52s
⚠️ cinder-operator-kuttl SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cinder-operator-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/fcdd543dbfb848aab77b2a7d913569b8

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 03m 39s
cinder-operator-kuttl FAILURE in 34m 31s
✔️ cinder-operator-tempest SUCCESS in 1h 44m 57s

Comment thread internal/cinder/funcs.go
}

// GetDefaultProbesAPI -
func GetDefaultProbesAPI(apiTimeout int) probes.OverrideSpec {
Contributor Author

@stuggi I'd like to gather your feedback before moving forward. The main point here is computing ProbeDefaults for the services from a meaningful parameter.
apiTimeout makes sense for cinderAPI, and because it is configurable, it can help tune the default values. This way we can reduce use of the probes override interface, which can still be evaluated case by case.
The other function, which is nearly identical in logic, is based on a const parameter that is not exposed.
The idea behind the const parameter is that we provide DefaultProbes based on default parameters [1]. In this case it relaxes the probes compared to what we had in the past, but it does not allow tuning them without the probes interface we introduced in lib-common for this purpose.

[1] https://opendev.org/openstack/cinder/src/branch/master/cinder/common/config.py#L121
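
To make the two options concrete, here is a rough sketch, not the code in this PR. The `ProbeDefaults` type, the `probeDefaultsFromTimeout` helper, the `defaultServiceTimeout` constant and its value of 60, and the way the timings are derived are all assumptions made for illustration. The point is that the same derivation can be fed either by the configurable apiTimeout or by an unexposed constant.

```go
package main

import "fmt"

// ProbeDefaults is a hypothetical, simplified holder for computed probe timings.
type ProbeDefaults struct {
	TimeoutSeconds      int32
	PeriodSeconds       int32
	InitialDelaySeconds int32
}

// defaultServiceTimeout is a placeholder for the unexposed constant used by
// services that have no configurable timeout. The value is illustrative only.
const defaultServiceTimeout = 60

// probeDefaultsFromTimeout derives probe timings from a service timeout in
// seconds. The ratios below are assumptions, not the operator's actual ones.
func probeDefaultsFromTimeout(timeout int) ProbeDefaults {
	if timeout < 1 {
		// Never emit a probe timeout below one second.
		timeout = 1
	}
	return ProbeDefaults{
		TimeoutSeconds:      int32(timeout),
		PeriodSeconds:       int32(timeout),
		InitialDelaySeconds: 5,
	}
}

func main() {
	// cinderAPI: the value comes from the user-configurable apiTimeout.
	apiTimeout := 120
	fmt.Printf("api:   %+v\n", probeDefaultsFromTimeout(apiTimeout))

	// Other services: the value comes from the unexposed constant, so the only
	// way to tune the result is the probes override interface.
	fmt.Printf("other: %+v\n", probeDefaultsFromTimeout(defaultServiceTimeout))
}
```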

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b2cd3b0a467e469493dfd89037ba89ad

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 01m 18s
cinder-operator-kuttl FAILURE in 36m 02s
✔️ cinder-operator-tempest SUCCESS in 1h 42m 23s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/28e584974ac74471995b2d48929d570d

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 14m 03s
cinder-operator-kuttl FAILURE in 43m 02s
✔️ cinder-operator-tempest SUCCESS in 1h 47m 53s

@fmount
Contributor Author

fmount commented Apr 10, 2026

recheck

fmount added 2 commits April 16, 2026 11:45
Add ProbeOverrides interface and CreateProbeSet() function from lib-common
for unified probe management across Cinder services. Enable probe
customization through CRD overrides and remove code duplication.
Updates all services (API, Scheduler, Volume, Backup) to use the new
pattern with proper scheme handling and consistent defaults.
In addition, webhook validation for probes has been introduced.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
Replace static probe timeouts with dynamic scaling based on APITimeout parameter.
Creates separate probe configurations for API services (HTTP endpoints) and RPC
workers (internal services) with appropriate scaling factors. API services use
full APITimeout scaling while RPC workers get proportional timeouts, preventing
premature pod kills during high load scenarios.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
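
The scaling described in the second commit could look roughly like the following. This is a sketch under stated assumptions: the commit message says API services use the full APITimeout and RPC workers get a proportional value, but it does not state the factor, so `rpcScalePercent = 50`, the one-second floor, and the `scaledProbeTimeout` helper are hypothetical.

```go
package main

import "fmt"

const (
	// apiScalePercent: API services use the full APITimeout.
	apiScalePercent = 100
	// rpcScalePercent: assumed proportion for RPC workers; the real factor
	// is not given in the commit message.
	rpcScalePercent = 50
	// minProbeTimeout keeps the result usable for very small inputs.
	minProbeTimeout = 1
)

// scaledProbeTimeout returns apiTimeout scaled by percent, in seconds,
// clamped to minProbeTimeout. Integer math avoids float rounding surprises.
func scaledProbeTimeout(apiTimeout, percent int) int32 {
	scaled := apiTimeout * percent / 100
	if scaled < minProbeTimeout {
		scaled = minProbeTimeout
	}
	return int32(scaled)
}

func main() {
	apiTimeout := 60
	fmt.Println("API probe timeout:", scaledProbeTimeout(apiTimeout, apiScalePercent))
	fmt.Println("RPC probe timeout:", scaledProbeTimeout(apiTimeout, rpcScalePercent))
}
```

Because both values derive from the same parameter, raising APITimeout for a heavily loaded deployment relaxes the probes of every service at once, which is the behavior the commit message describes as preventing premature pod kills.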