OCPBUGS-77263: create self-signed crt if failing to read default#740
OCPBUGS-77263: create self-signed crt if failing to read default#740jcmoraisjr wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
@jcmoraisjr: This pull request references Jira Issue OCPBUGS-77263, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@jcmoraisjr: This pull request references Jira Issue OCPBUGS-77263, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Tested using clusterbot The |
|
@melvinjoseph86: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| return newDefaultTemplatePlugin(router, cfg.IncludeUDP, lookupSvc), err | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| return newDefaultTemplatePlugin(router, cfg.IncludeUDP, lookupSvc), nil |
There was a problem hiding this comment.
Is this change purely stylistic?
There was a problem hiding this comment.
It's skipping the call to the later in the case the former fails.
There was a problem hiding this comment.
Ah, right, the old logic ignores err from newTemplateRouter. Was that causing a problem, or is it something you just happened to notice while fixing the certificate issue?
There was a problem hiding this comment.
Just found the pattern and preferred to see it fixed, since we are handling a non properly initialized router var in case of an error. It wasn't causing issues - and didn't look for any though =)
| if err := generateSelfSignCert(outPath); err != nil { | ||
| // Last option, we give up. This should only happen on misconfigured envs, | ||
| // like missing directory or permission. | ||
| return err | ||
| } |
There was a problem hiding this comment.
As discussed on the call, we really should fall back to the self-signed certificate only if no certificate was provided. If a certificate was provided but we fail to write it to disk, we should not fall back to a self-signed certificate.
There was a problem hiding this comment.
Just updated as we talked: 1 commit removing the hardcoded crt+key and using a self-siged crt created on the fly, another one only using the self-signed crt in case one is not provided, failing the router start otherwise.
3386c51 to
aeaf362
Compare
|
/test e2e-aws-serial-2of2 |
|
/assign |
|
/test e2e-aws-fips |
|
/unassign @alebedev87 |
|
/assign |
|
@coderabbitai rise and shine, validate this PR, check for consistency, be an adversarial reviewer of Joao and tell me major and critical findings |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 9710 🏁 Script executed: Length of output: 4748
🔴 Critical:
|
| Severity | Finding |
|---|---|
| 🔴 Critical | 2^63 is XOR in Go → serial in [0, 60] only |
| 🔴 Critical | reloadFn deletes cert then calls secretToPem("") in self-signed case — cert gone forever |
| 🟠 Major | Private key file written with 0644 (world-readable) instead of 0600 |
| 🟡 Moderate | No KeyUsage/ExtKeyUsage/BasicConstraints in cert template |
| 🟡 Minor | Function name typo: generateSelfSignCert → generateSelfSignedCert |
| } else { | ||
| r.defaultCertificatePath = outPath | ||
| // generate one on the fly only in case we don't have neither a path nor a directory configured | ||
| if err := generateSelfSignCert(outPath); err != nil { |
There was a problem hiding this comment.
I think there are two things here that can be disrupting but maybe they are already disrupting:
- If an error happens, previously we would use a default certificate, but now we will error. What does this imply? That our proxy will be out? Is this "acceptable"?
- From the 'user' perspective, the worst case fallback was accepting an insecure self signed certificate once (meh) and then never caring about it, while here we will always generate a new certificate
I understand this is the edge of the edge case, just some food for thought, although I agree that it is better to have a "safe" (not much, but still better) fallback than having someone being able to do MITM in this case because the private key is there and common
There was a problem hiding this comment.
This was previously raised and we ended up agreeing that it should fail in case there is an attempt to configure, so better to have a failing/crashing router instead of incorrectly configuring it with a distinct (the fake one) certificate.
Anyway this is an ingress operator task, so failures here would only happen in case of a bug in the operator. Users could not (should not) cause a disruption in the router due to this change in the code.
aeaf362 to
87548de
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@coderabbitai good catch. Note that I used a simplified approach in the max key, all the others are addressed as suggested. Can you confirm all is good now? |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
WalkthroughAdds self-signed certificate generation and atomic default-cert updates, tightens error handling in template router initialization and secret handling, changes default PEM file permissions to 0600, and updates the HAProxy template to use Changes
Sequence DiagramsequenceDiagram
participant Plugin as TemplatePlugin
participant Router as TemplateRouter
participant CertGen as CertGenerator
participant FS as FileSystem
participant Watcher as FileWatcher
participant HAProxy as HAProxy
Plugin->>Router: Initialize (NewTemplatePlugin)
Router->>Router: writeDefaultCert()
alt no defaultCertificate configured
Router->>CertGen: generateSelfSignedCert(outPath)
CertGen->>CertGen: determine SAN (ROUTER_DOMAIN/localhost) and create cert+key
CertGen->>FS: write PEM (cert + key) to outPath with mode 0600
FS-->>Router: write success
else defaultCertificateDir provided
Router->>FS: secretToPem(defaultCertificateDir -> outPath)
FS-->>Router: write success or error (fatal)
Router->>Watcher: start watching defaultCertificateDir
Watcher->>Watcher: on change -> write to outPath.tmp
Watcher->>FS: os.Rename(outPath.tmp -> outPath)
FS-->>Router: rename success
Router->>HAProxy: trigger rate-limited reload
end
Router-->>Plugin: return initialized plugin (or error)
Plugin->>HAProxy: render config using .DefaultCertificate
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 8 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (8 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me fetch the current state of the files to verify all fixes. 🏁 Script executed: Length of output: 5541
The simplified serial bound |
|
@jcmoraisjr: This pull request references Jira Issue OCPBUGS-77263, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/router/template/router.go (1)
432-443:⚠️ Potential issue | 🟠 MajorKeep the last good PEM in place during watched updates.
Lines 436-438 remove
outPathbefore the replacement is known-good. IfsecretToPem()fails,r.defaultCertificatePathstill points to a file that no longer exists, so later config writes/reloads will reference a missing default certificate.🩹 Proposed fix
// watching for changes r.defaultCertificatePath = outPath reloadFn := func() { log.V(0).Info("updating default certificate", "path", outPath) - os.Remove(outPath) - if err := secretToPem(r.defaultCertificateDir, outPath); err != nil { + tmpPath := outPath + ".tmp" + if err := secretToPem(r.defaultCertificateDir, tmpPath); err != nil { + _ = os.Remove(tmpPath) log.Error(err, "failed to update default certificate", "path", outPath) return } + if err := os.Rename(tmpPath, outPath); err != nil { + _ = os.Remove(tmpPath) + log.Error(err, "failed to replace default certificate", "path", outPath) + return + } log.V(0).Info("reloading to get updated default certificate") r.rateLimitedCommitFunction.RegisterChange() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/router/template/router.go` around lines 432 - 443, The reloadFn currently removes outPath before generating the new PEM, which can leave r.defaultCertificatePath pointing at a missing file if secretToPem fails; change reloadFn to write the new PEM to a temporary file (use r.defaultCertificateDir and outPath to derive a temp name), call secretToPem to that temp file, and only on success atomically replace/rename the temp to outPath (or update r.defaultCertificatePath) and then call r.rateLimitedCommitFunction.RegisterChange(); if secretToPem returns an error, do not remove or modify the existing outPath and keep r.defaultCertificatePath unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/router/template/router.go`:
- Around line 419-423: When falling back to the self-signed branch, populate
r.defaultCertificatePath with the path to the generated PEM before returning so
writeConfig()/templateData.DefaultCertificate can reference it; specifically,
after calling generateSelfSignedCert(outPath) (inside the branch that checks
r.defaultCertificateDir == "") set r.defaultCertificatePath = outPath (or the
actual path returned/used by generateSelfSignedCert) and then return the
error/result.
---
Outside diff comments:
In `@pkg/router/template/router.go`:
- Around line 432-443: The reloadFn currently removes outPath before generating
the new PEM, which can leave r.defaultCertificatePath pointing at a missing file
if secretToPem fails; change reloadFn to write the new PEM to a temporary file
(use r.defaultCertificateDir and outPath to derive a temp name), call
secretToPem to that temp file, and only on success atomically replace/rename the
temp to outPath (or update r.defaultCertificatePath) and then call
r.rateLimitedCommitFunction.RegisterChange(); if secretToPem returns an error,
do not remove or modify the existing outPath and keep r.defaultCertificatePath
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: deade3fc-bdb7-4c51-8e3b-a4e8df080cfa
⛔ Files ignored due to path filters (1)
images/router/haproxy/conf/default_pub_keys.pemis excluded by!**/*.pem
📒 Files selected for processing (4)
images/router/haproxy/conf/haproxy-config.templatepkg/router/template/plugin.gopkg/router/template/router.gopkg/router/template/ssl.go
87548de to
5cc70f2
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (2)
pkg/router/template/router.go (2)
429-433:⚠️ Potential issue | 🟠 MajorTighten the default PEM permissions.
This branch now relies on
secretToPem(), butsecretToPem()still writes the combined cert+key PEM with mode0444at Line 336. That leaves the default private key world-readable inside the container, unlike the self-signed path which already uses0600.🔐 Proposed fix
- return os.WriteFile(outName, pemBlock, 0444) + return os.WriteFile(outName, pemBlock, 0600)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/router/template/router.go` around lines 429 - 433, The code currently calls secretToPem(...) to write the combined cert+key PEM for defaultCertificateDir, but secretToPem writes the file with mode 0444 making the private key world-readable; update secretToPem to write the combined PEM file with mode 0600 (owner read/write only) instead of 0444 so the default private key is not world-readable, and ensure any file creation/open logic in secretToPem uses the tighter permission mode when creating the PEM.
436-445:⚠️ Potential issue | 🔴 CriticalDon’t delete the live PEM before the replacement is ready.
Line 439 removes
outPathbefore the new PEM is rendered. If the updated secret is malformed or unreadable, the current process keeps serving with the old cert in memory, but the file disappears and the next unrelated reload can fail becauseDefaultCertificatestill points at a missing path.🩹 Proposed fix
reloadFn := func() { log.V(0).Info("updating default certificate", "path", outPath) - os.Remove(outPath) - if err := secretToPem(r.defaultCertificateDir, outPath); err != nil { + tmpPath := outPath + ".tmp" + if err := secretToPem(r.defaultCertificateDir, tmpPath); err != nil { + _ = os.Remove(tmpPath) log.Error(err, "failed to update default certificate", "path", outPath) return } + if err := os.Rename(tmpPath, outPath); err != nil { + _ = os.Remove(tmpPath) + log.Error(err, "failed to replace default certificate", "path", outPath) + return + } log.V(0).Info("reloading to get updated default certificate") r.rateLimitedCommitFunction.RegisterChange() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/router/template/router.go` around lines 436 - 445, The reloadFn currently deletes outPath before rendering the new PEM which can leave DefaultCertificate pointing at a missing file; instead, render the new PEM to a temporary file in r.defaultCertificateDir (use secretToPem targetting a temp path), verify success, then atomically replace the existing outPath with os.Rename (or equivalent) so the file is only removed on success; keep the existing error logging on secretToPem failure and only call r.rateLimitedCommitFunction.RegisterChange() after the atomic rename completes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/router/template/router.go`:
- Around line 429-433: The code currently calls secretToPem(...) to write the
combined cert+key PEM for defaultCertificateDir, but secretToPem writes the file
with mode 0444 making the private key world-readable; update secretToPem to
write the combined PEM file with mode 0600 (owner read/write only) instead of
0444 so the default private key is not world-readable, and ensure any file
creation/open logic in secretToPem uses the tighter permission mode when
creating the PEM.
- Around line 436-445: The reloadFn currently deletes outPath before rendering
the new PEM which can leave DefaultCertificate pointing at a missing file;
instead, render the new PEM to a temporary file in r.defaultCertificateDir (use
secretToPem targetting a temp path), verify success, then atomically replace the
existing outPath with os.Rename (or equivalent) so the file is only removed on
success; keep the existing error logging on secretToPem failure and only call
r.rateLimitedCommitFunction.RegisterChange() after the atomic rename completes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 5e7ac4c6-4d67-4329-8752-366aec0e5a22
⛔ Files ignored due to path filters (1)
images/router/haproxy/conf/default_pub_keys.pemis excluded by!**/*.pem
📒 Files selected for processing (4)
images/router/haproxy/conf/haproxy-config.templatepkg/router/template/plugin.gopkg/router/template/router.gopkg/router/template/ssl.go
✅ Files skipped from review due to trivial changes (1)
- pkg/router/template/ssl.go
🚧 Files skipped from review as they are similar to previous changes (2)
- images/router/haproxy/conf/haproxy-config.template
- pkg/router/template/plugin.go
Ingress operator manages and exposes a default certificate for Router. A default certificate is used if the route does not provide one, or a request is made for a non matching route. Current approach uses a hardcoded certificate in the router image, which should lead to image scanners reporting the storage of its private key. This implementation removes the hardcoded certificate, as well as the template code that ensures a default certificate is provided. In the Go code side, we are ensuring that the default certificate is provided somehow, either by serving the certificate provided by the operator, or in the case of a failure, using a self-signed certificate created on the fly. https://issues.redhat.com/browse/OCPBUGS-77263
5cc70f2 to
0500b49
Compare
|
@jcmoraisjr: This pull request references Jira Issue OCPBUGS-77263, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jcmoraisjr: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Ingress operator manages and exposes a default certificate for Router. A default certificate is used if the route does not provide one, or a request is made for a non matching route.
Current approach uses a hardcoded certificate in the router image, which should lead to image scanners reporting the storage of its private key.
This implementation removes the hardcoded certificate, as well as the template code that ensures a default certificate is provided. In the Go code side, we are ensuring that the default certificate is always provided by generating one on the fly. If this last resort option fails due to permission, the whole router process fail to start.
Summary by CodeRabbit
New Features
Bug Fixes