STOR-2770: Stop generating self-signed certificates #662

Merged
openshift-merge-bot[bot] merged 4 commits into openshift:main from mpatlasov:STOR-2770-Stop-generating-self-signed-certificates
Mar 12, 2026
@mpatlasov
Contributor

@mpatlasov mpatlasov commented Feb 2, 2026

Before this commit the following operators generated self-signed certificates:

  • aws-ebs-csi-driver-operator
  • azure-disk-csi-driver-operator
  • azure-file-csi-driver-operator
  • gcp-pd-csi-driver-operator
  • openstack-cinder-csi-driver-operator
  • manila-csi-driver-operator
  • ibm-vpc-block
  • powervs-block

This commit adds a new Service object for each of them, with a special annotation that asks service-ca-operator to create the certificates for us. It also ensures that each operator's Pods mount those certificates at a well-known path (/var/run/secrets/serving-cert).
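
For illustration, a Service of the kind described here might look like the minimal sketch below. The annotation key and the Secret name are the ones quoted for the gcp-pd manifest in the review discussion later in this thread. The Service name (chosen by analogy with the generated file names listed in the review), the namespace, and the port name and numbers are assumptions and may differ from the actual manifests.

```yaml
# Hypothetical sketch, not the manifest from this PR.
apiVersion: v1
kind: Service
metadata:
  name: gcp-pd-csi-driver-operator-metrics   # assumed name
  namespace: openshift-cluster-csi-drivers   # assumed namespace
  annotations:
    # Asks service-ca-operator to issue a serving certificate
    # and store it in the named Secret.
    service.beta.openshift.io/serving-cert-secret-name: gcp-pd-csi-driver-operator-serving-cert
spec:
  type: ClusterIP
  selector:
    name: gcp-pd-csi-driver-operator   # must match the pod template label
  ports:
    - name: https        # assumed port name
      port: 443          # assumed
      targetPort: 8443   # assumed
      protocol: TCP
```

Once the Secret exists, the operator Pod can mount it and serve metrics over HTTPS without generating its own certificate.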

There is also a cosmetic change for vmware-vsphere-csi-driver-operator: `optional: true` is removed for the vmware-vsphere-csi-driver-operator-metrics-serving-cert Secret. This should ensure that the operator always waits for the Secret containing the certificates.
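
The effect of dropping `optional: true` can be sketched with a pod template fragment. The Secret name is the one given in the description. The container name, the volume name (borrowed from the error message quoted later in this thread), and the use of the same mount path for vSphere are illustrative assumptions.

```yaml
# Hypothetical pod template fragment, not copied from the PR.
spec:
  template:
    spec:
      containers:
        - name: vmware-vsphere-csi-driver-operator   # assumed container name
          volumeMounts:
            - name: serving-cert                     # assumed volume name
              mountPath: /var/run/secrets/serving-cert
              readOnly: true
      volumes:
        - name: serving-cert
          secret:
            secretName: vmware-vsphere-csi-driver-operator-metrics-serving-cert
            # With "optional: true" the Pod starts even if the Secret is missing.
            # Without it (the default is false), the kubelet retries the mount
            # and the Pod stays in ContainerCreating until the Secret exists.
```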

Summary by CodeRabbit

  • New Features

    • Added HTTPS metrics services and serving-certificate mounts for CSI driver operators to enable secure metrics endpoints.
    • Applies to AWS EBS, Azure Disk/File, GCP PD, IBM VPC Block, OpenStack Cinder/Manila, and PowerVS.
  • Behavior Change

    • vSphere serving-cert secret is now required (pod will fail to start if the secret is missing).

@openshift-ci-robot
Contributor

openshift-ci-robot commented Feb 2, 2026

@mpatlasov: This pull request references STOR-2770 which is a valid jira issue.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 2, 2026
@openshift-ci openshift-ci Bot requested review from dfajmon and tsmetana February 2, 2026 04:22
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2026
@mpatlasov mpatlasov force-pushed the STOR-2770-Stop-generating-self-signed-certificates branch from 8482eec to 9ab6e2c Compare February 2, 2026 04:43
@mpatlasov mpatlasov force-pushed the STOR-2770-Stop-generating-self-signed-certificates branch from 9ab6e2c to bf9b502 Compare February 2, 2026 05:10
@jsafrane
Contributor

jsafrane commented Feb 6, 2026

/lgtm
/approve

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Feb 6, 2026
@mpatlasov
Contributor Author

/retest-required

@mpatlasov
Contributor Author

/retest-required

@radeore

radeore commented Feb 18, 2026

/retest-required

@radeore

radeore commented Feb 18, 2026

/retest-required

@radeore

radeore commented Feb 19, 2026

/retest-required

@mpatlasov
Contributor Author

/retest-required

@mpatlasov
Contributor Author

/retest-required

@radeore

radeore commented Feb 24, 2026

/retest-required

@radeore

radeore commented Feb 24, 2026

/retest-required

@mpatlasov
Contributor Author

/test hypershift-e2e-aks

@mpatlasov
Contributor Author

/test hypershift-e2e-aks

@duanwei33

duanwei33 commented Feb 25, 2026

@mpatlasov @radeore
From the log1,

    message: |-
      deployment azure-disk-csi-driver-operator is not ready
      failed to get deployment azure-disk-csi-driver-controller: Deployment.apps "azure-disk-csi-driver-controller" not found
      deployment azure-file-csi-driver-operator is not ready
      failed to get deployment azure-file-csi-driver-controller: Deployment.apps "azure-file-csi-driver-controller" not found

and log2:

message: 'MountVolume.SetUp failed for volume "serving-cert" : secret "azure-disk-csi-driver-operator-serving-cert"

So it looks like this was introduced by our PR: the serving-cert Secret setup doesn't account for the Azure HyperShift case.
(Is there no corresponding Service in the control-plane namespace to trigger service-ca-operator?)

@duanwei33

Or is it because the managed cluster is an AKS cluster?

@duanwei33

/payload-job periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-azure-ipi-ovn-hypershift-guest-f7

@openshift-ci
Contributor

openshift-ci Bot commented Feb 25, 2026

@duanwei33: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-azure-ipi-ovn-hypershift-guest-f7

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7a79d8e0-1252-11f1-8a2d-013dc1aeada9-0

Use `name` in the Service selector. It matches the `name` label in the Deployment pod template section.
@mpatlasov
Contributor Author

Lines 16-17 select app: gcp-pd-csi-driver-operator, but assets/csidriveroperators/gcp-pd/07_deployment.yaml labels the pod template with name: gcp-pd-csi-driver-operator on Lines 19-24. This Service will never get endpoints, so the metrics endpoint will not be reachable.

This is actually a very good catch that I could easily have missed.

@jsafrane, I fixed this issue in all Service YAMLs for uniformity. PTAL
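
The mismatch from the review finding can be sketched as follows. The label keys and values come from the finding itself; both manifests are trimmed (no containers, no ports), and the object names are assumptions for illustration.

```yaml
# Deployment pod template (trimmed): per the finding, Pods carry a "name" label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-pd-csi-driver-operator   # assumed name
spec:
  selector:
    matchLabels:
      name: gcp-pd-csi-driver-operator
  template:
    metadata:
      labels:
        name: gcp-pd-csi-driver-operator
---
# Service (trimmed). Before the fix, the selector used a different key:
#   selector:
#     app: gcp-pd-csi-driver-operator
# which, per the finding, matches no Pods, so the Service gets no endpoints.
# After the fix, the selector uses the same key and value as the pod label.
apiVersion: v1
kind: Service
metadata:
  name: gcp-pd-csi-driver-operator-metrics   # assumed name
spec:
  selector:
    name: gcp-pd-csi-driver-operator
```

A Service selects Pods only when every key and value in its selector appears in the Pod's labels, which is why the key (`app` versus `name`) matters even when the value is identical.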

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
assets/csidriveroperators/gcp-pd/01_service.yaml (1)

1-19: Operational note: Hypershift control-plane may not receive the serving-cert Secret.

Per PR discussion, in Hypershift/managed (e.g., AKS) environments, the service-ca-operator runs in the management cluster and may not create the Secret in the control-plane namespace where the operator Pod is deployed. This can cause MountVolume.SetUp failed for volume "serving-cert" errors and prevent the controller deployment from becoming ready.

Consider whether a fallback mechanism or conditional logic is needed for Hypershift scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@assets/csidriveroperators/gcp-pd/01_service.yaml` around lines 1 - 19, The
Service manifest currently unconditionally sets the service-ca annotation
(service.beta.openshift.io/serving-cert-secret-name:
gcp-pd-csi-driver-operator-serving-cert) which leads the operator Deployment to
expect a Secret named gcp-pd-csi-driver-operator-serving-cert and mount the
volume "serving-cert"; add a Hypershift-aware fallback by making the annotation
conditional (do not set it in hosted control-plane installs) and update the
operator bootstrap/reconcile logic to detect the absence of the serving-cert
Secret and either (a) create a self-signed or projected fallback Secret or (b)
skip/avoid mounting "serving-cert" so the Pod does not fail to start; target the
code that applies this Service and the Deployment manifest changes that
reference the "serving-cert" volume/secret to implement the conditional
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d1ba6f8f-77ec-48ea-9915-0f343f7c22c1

📥 Commits

Reviewing files that changed from the base of the PR and between 8e85102 and 43fca9d.

⛔ Files ignored due to path filters (10)
  • assets/csidriveroperators/aws-ebs/hypershift/mgmt/generated/v1_service_aws-ebs-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/aws-ebs/standalone/generated/v1_service_aws-ebs-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/azure-disk/hypershift/mgmt/generated/v1_service_azure-disk-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/azure-disk/standalone/generated/v1_service_azure-disk-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/azure-file/hypershift/mgmt/generated/v1_service_azure-file-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/azure-file/standalone/generated/v1_service_azure-file-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/openstack-cinder/hypershift/mgmt/generated/v1_service_openstack-cinder-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/openstack-cinder/standalone/generated/v1_service_openstack-cinder-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/openstack-manila/hypershift/mgmt/generated/v1_service_manila-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
  • assets/csidriveroperators/openstack-manila/standalone/generated/openshift-cluster-csi-drivers_v1_service_manila-csi-driver-operator-metrics.yaml is excluded by !**/generated/**
📒 Files selected for processing (9)
  • assets/csidriveroperators/aws-ebs/base/01_service.yaml
  • assets/csidriveroperators/azure-disk/base/01_service.yaml
  • assets/csidriveroperators/azure-file/base/01_service.yaml
  • assets/csidriveroperators/gcp-pd/01_service.yaml
  • assets/csidriveroperators/ibm-vpc-block/01_service.yaml
  • assets/csidriveroperators/openstack-cinder/base/01_service.yaml
  • assets/csidriveroperators/openstack-manila/base/09_service.yaml
  • assets/csidriveroperators/powervs-block/hypershift/mgmt/08_service.yaml
  • assets/csidriveroperators/powervs-block/standalone/08_service.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • assets/csidriveroperators/aws-ebs/base/01_service.yaml
  • assets/csidriveroperators/openstack-manila/base/09_service.yaml
  • assets/csidriveroperators/openstack-cinder/base/01_service.yaml
  • assets/csidriveroperators/azure-disk/base/01_service.yaml
  • assets/csidriveroperators/ibm-vpc-block/01_service.yaml
  • assets/csidriveroperators/azure-file/base/01_service.yaml

@jsafrane
Contributor

/retest-required
gather-extra + a similar step on hypershift failed.

@jsafrane
Contributor

/label tide/merge-method-squash
does it work here?

/lgtm
/approve

I checked that all CSI driver operators in the e2e test above (+ cinder+manila) have:

I0310 12:27:47.655334       1 cmd.go:253] Using service-serving-cert provided certificates

@openshift-ci openshift-ci Bot added tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. lgtm Indicates that a PR is ready to be merged. labels Mar 11, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Mar 11, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, mpatlasov

@jsafrane
Contributor

/remove-label tide/merge-method-squash

While the comment titles are not very useful, the commit message has information that's worth preserving.

@openshift-ci openshift-ci Bot removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 11, 2026
@mpatlasov
Contributor Author

/test hypershift-e2e-aks

@mpatlasov
Contributor Author

/label tide/merge-method-squash

@openshift-ci openshift-ci Bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 11, 2026
@jsafrane
Contributor

/retest-required

@mpatlasov
Contributor Author

/test hypershift-e2e-aks

@mpatlasov
Contributor Author

/override ci/prow/hypershift-e2e-aks

The test doesn't give us any useful signal now; it is perma-failing these days. See also how it failed in:

@openshift-ci
Contributor

openshift-ci Bot commented Mar 12, 2026

@mpatlasov: Overrode contexts on behalf of mpatlasov: ci/prow/hypershift-e2e-aks

@radeore

radeore commented Mar 12, 2026

Pre-merge verification results look good. Test results are added here.

@radeore

radeore commented Mar 12, 2026

/verified by @radeore and CI

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 12, 2026
@openshift-ci-robot
Contributor

@radeore: This PR has been marked as verified by @radeore and CI.

@mpatlasov
Contributor Author

/test hypershift-aws-e2e-external

@openshift-ci
Contributor

openshift-ci Bot commented Mar 12, 2026

@mpatlasov: all tests passed!

@openshift-merge-bot openshift-merge-bot Bot merged commit e31c296 into openshift:main Mar 12, 2026
17 checks passed