🐛 OCPBUGS-78455: fix(Boxcutter-Collision): Fix collision detection bypass after ClusterExtension upgrade#2578
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Fixes a Boxcutter collision-detection bypass that could allow a second ClusterExtension to install the same package after the first ClusterExtension upgraded and bumped managed-object revision numbers.
Changes:
- Treat
ActionProgressedresults as collisions when the reconciled object is controller-owned by a differentClusterExtensionRevision(foreign ownerRef). - Add an E2E scenario covering “upgrade then duplicate install” and a new step to ensure both
ClusterExtensionresources are cleaned up. - Add unit tests validating the new “foreign revision ownerRef” collision behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
internal/operator-controller/controllers/clusterextensionrevision_controller.go |
Adds foreign-ownerRef collision detection for ActionProgressed objects and tracks sibling CER names during reconcile. |
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go |
Adds unit coverage for foreign/sibling/non-CER controller ownerRef cases. |
test/e2e/steps/steps.go |
Adds a step to track the currently-applied ClusterExtension for cleanup when a scenario applies a second CE. |
test/e2e/features/update.feature |
Adds an E2E scenario asserting collisions are detected after an upgrade when installing a duplicate CE. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
a73cf6d to
8b504c8
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2578 +/- ##
==========================================
+ Coverage 67.74% 67.89% +0.15%
==========================================
Files 137 137
Lines 9560 9586 +26
==========================================
+ Hits 6476 6508 +32
+ Misses 2585 2581 -4
+ Partials 499 497 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // nil when the controller is a sibling revision (same ClusterExtension) or not a CER at all. | ||
| func getForeignRevisionController(obj metav1.Object, siblingRevisionNames sets.Set[string]) *metav1.OwnerReference { | ||
| for i := range obj.GetOwnerReferences() { | ||
| ref := obj.GetOwnerReferences()[i] |
There was a problem hiding this comment.
why are we avoiding setting this to a local and iterating over it? If we're concerned that the contents can change over the course of this call then couldn't we end up in a situation where the first call determines a bounds which exceeds the returned results later in the loop?
Like if L551 says i := 12, but then obj.GetOwnerReferences() is updated to hold fewer than 12 items.
Why not
refs := obj.GetOwnerReferences()
for i := range refs {
ref := refs[i]
There was a problem hiding this comment.
I think we can. Goog Catcher 🥇
8b504c8 to
36f661c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go
Outdated
Show resolved
Hide resolved
…rExtension upgrade When a ClusterExtension was upgraded (e.g., v1.0.0 to v1.0.1), the upgrade set a higher revision number on managed objects. A second ClusterExtension installing the same package was then allowed because boxcutter's revision linearity check returned ActionProgressed instead of ActionCollision, silently skipping the conflict. Fix: after boxcutter reconcile, check ActionProgressed results for objects whose controller ownerReference belongs to a revision from a different ClusterExtension, and treat those as collisions. Generated-by: Claude/Cursor
36f661c to
2b87748
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| if ores.Action() == machinery.ActionProgressed { | ||
| if controllerRef := getForeignRevisionController(ores.Object(), siblingRevisionNames); controllerRef != nil { | ||
| collidingObjs = append(collidingObjs, ores.String()+fmt.Sprintf("Conflicting Owner: %s\n", controllerRef.String())) |
There was a problem hiding this comment.
The collision message is built by concatenating ores.String() directly with "Conflicting Owner: ..." without any separator. Unless ores.String() always ends with a newline, this will produce a hard-to-read merged string. Consider adding an explicit delimiter (e.g., "\n" or " ") between the two parts and avoid embedding an extra trailing newline in the appended fragment, since the final message already uses strings.Join(..., "\n\n").
| collidingObjs = append(collidingObjs, ores.String()+fmt.Sprintf("Conflicting Owner: %s\n", controllerRef.String())) | |
| collidingObjs = append(collidingObjs, ores.String()+fmt.Sprintf("\nConflicting Owner: %s", controllerRef.String())) |
| ref := refs[i] | ||
| if ref.Controller != nil && *ref.Controller && | ||
| ref.Kind == ocv1.ClusterExtensionRevisionKind && | ||
| ref.APIVersion == ocv1.GroupVersion.String() { | ||
| if !siblingRevisionNames.Has(ref.Name) { | ||
| return &ref |
There was a problem hiding this comment.
getForeignRevisionController returns &ref where ref is a copy of the slice element. This forces an escape allocation and can be confusing to readers expecting a pointer into the original ownerReferences slice. Returning &refs[i] (or copying the fields you need into a new struct explicitly) would be clearer and avoid the extra allocation.
| ref := refs[i] | |
| if ref.Controller != nil && *ref.Controller && | |
| ref.Kind == ocv1.ClusterExtensionRevisionKind && | |
| ref.APIVersion == ocv1.GroupVersion.String() { | |
| if !siblingRevisionNames.Has(ref.Name) { | |
| return &ref | |
| if refs[i].Controller != nil && *refs[i].Controller && | |
| refs[i].Kind == ocv1.ClusterExtensionRevisionKind && | |
| refs[i].APIVersion == ocv1.GroupVersion.String() { | |
| if !siblingRevisionNames.Has(refs[i].Name) { | |
| return &refs[i] |
|
|
||
| rev := &ocv1.ClusterExtensionRevision{} | ||
| require.NoError(t, testClient.Get(t.Context(), client.ObjectKey{Name: tc.reconcilingRevisionName}, rev)) | ||
| cond := meta.FindStatusCondition(rev.Status.Conditions, ocv1.TypeProgressing) |
There was a problem hiding this comment.
This test looks up the ClusterExtensionRevision Progressing condition using ocv1.TypeProgressing (the generic/common constant). Using the CER-specific constant ocv1.ClusterExtensionRevisionTypeProgressing would make the intent clearer and would catch any future divergence between common and CER condition-type constants.
| cond := meta.FindStatusCondition(rev.Status.Conditions, ocv1.TypeProgressing) | |
| cond := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterExtensionRevisionTypeProgressing) |
When a ClusterExtension was upgraded (e.g., v1.0.0 to v1.0.1), the upgrade set a higher revision number on managed objects. A second
ClusterExtension installing the same package was then allowed because boxcutter's revision linearity check returned ActionProgressed instead of ActionCollision, silently skipping the conflict.
Fix: after boxcutter reconcile, check ActionProgressed results for objects whose controller ownerReference belongs to a revision from a different ClusterExtension, and treat those as collisions.
Motivated by: https://redhat.atlassian.net/browse/OCPBUGS-78455