Skip to content

Conversation

@nunnatsa
Copy link
Collaborator

@nunnatsa nunnatsa commented Dec 8, 2025

What this PR does / why we need it:

HCO webhook forces creating the HyperConverged CR only in the installation namespace, and rejects any request to create it in another namespace. However, when installing with OLM and with the allNamespaces install mode, the webhook only applied to the current namespace, so the webhook actually only checks for the right namespace, missing the creation request in any other namespace. To solve it, the webhook removes the namespace selector from the ValidatingWebhookConfiguration, making it work for all namespaces.

Apparently, there is some kind of race condition with OLM, so eventually, OLM set the namespace selector back, after the webhook removes it.

The ValidatingAdmissionPolicy became stable at kubernetes v1.30, offers better way to validate the HyperConverged namespace.

This PR adds the new admission policy controller, to reconcile new ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding resources, to verify the HyperConverged namespace.

Note: we need a new controller, and can't use the main (hyperconverged) controller, because the hyperconverged controller only reconciles resources if they the HyperConverged CR exists. We need the validation for the HyperConverged creation, so it must be in place before that.

Jira Ticket:

https://issues.redhat.com/browse/CNV-61721

Release note:

None

The current implementation of preventing the creation of the
HyperConverged CR in non-allowed namespace, is not working in Openshift,
where becasue of a race condition, the webhook's namespace selector is
removed by OLM.

This commit adds a new controller, to create and reconcile a
ValidatingAdmissionPolicy and the related
ValidatingAdmissionPolicyBinding, to perform the same validation.

The reason we're doing it in a new controller, is because we need the
ValidatingAdmissionPolicy to be set, even if the HyperConverged CR is
not deployed, while our main controller only reconciles resources if
the HyperConverged CR is deployed.

Signed-off-by: Nahshon Unna Tsameret <[email protected]>
Remove the existing validation of the HyperConverged CR namespace from
the validation webhook, as it is now done by the policy, created by the
admission policy controller.

Signed-off-by: Nahshon Unna Tsameret <[email protected]>
OLM adds a namespace selection on the validation webhook CR, causing the
namespace validation to be not relevant.

The webhook setup logic removes this selector, but actually this is
reconciled by OLM, and eventually, user can still create the
HyperConverged CR in any namespace.

The issue is now handled by a ValidationgAdmissionPolicy, and so we
don't need this logic anymore, and so this commit removes it.

Signed-off-by: Nahshon Unna Tsameret <[email protected]>
@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Dec 8, 2025
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tiraboschi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nunnatsa nunnatsa requested review from orenc1 and removed request for avlitman and tiraboschi December 8, 2025 08:22
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 8, 2025

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In reconcilePolicy/reconcileBinding you detect label changes with CompareLabels, but then call MergeLabels on the desired object (policy/binding) instead of the live object (foundPolicy/foundBinding), so the updated labels are never persisted; swap the arguments so you actually mutate the object being updated.
  • The logger acquisition in Reconcile uses logger, err := logr.FromContext(ctx), but logr.FromContext does not return an error; this code as written will not compile and should instead use logr.FromContextOrDiscard(ctx) or a single-value FromContext call.
  • The startup reconciliation event is sent on startupEvent before add(mgr, r) wires the channel into the controller and the channel is then closed via defer, so the startup request is likely dropped; consider moving the send after the controller is watching the channel and avoiding closing it prematurely so the initial reconcile is guaranteed to run.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `reconcilePolicy`/`reconcileBinding` you detect label changes with `CompareLabels`, but then call `MergeLabels` on the desired object (`policy`/`binding`) instead of the live object (`foundPolicy`/`foundBinding`), so the updated labels are never persisted; swap the arguments so you actually mutate the object being updated.
- The logger acquisition in `Reconcile` uses `logger, err := logr.FromContext(ctx)`, but `logr.FromContext` does not return an error; this code as written will not compile and should instead use `logr.FromContextOrDiscard(ctx)` or a single-value `FromContext` call.
- The startup reconciliation event is sent on `startupEvent` before `add(mgr, r)` wires the channel into the controller and the channel is then closed via `defer`, so the startup request is likely dropped; consider moving the send after the controller is watching the channel and avoiding closing it prematurely so the initial reconcile is guaranteed to run.

## Individual Comments

### Comment 1
<location> `controllers/admissionpolicy/resources.go:41-42` </location>
<code_context>
+func getRequiredPolicy() *admissionv1.ValidatingAdmissionPolicy {
+	policyOnce.Do(func() {
+		namespace := hcoutil.GetOperatorNamespaceFromEnv()
+		requiredPolicy = &admissionv1.ValidatingAdmissionPolicy{
+			ObjectMeta: metav1.ObjectMeta{
+				Name:            policyName,
+				Labels:          hcoutil.GetLabels(hcov1beta1.HyperConvergedName, hcoutil.AppComponentDeployment),
</code_context>

<issue_to_address>
**issue (bug_risk):** Using a namespaced Deployment as ownerReference for cluster-scoped admission resources is invalid

`ValidatingAdmissionPolicy` and `ValidatingAdmissionPolicyBinding` are cluster-scoped, but their `ownerReferences` point to `ownresources.GetDeploymentRef()`, which is namespaced. Kubernetes forbids namespaced→cluster-scoped owner references, so these resources will fail validation and may not be created/updated. To avoid this, either use a cluster-scoped owner, switch to a different cleanup mechanism (e.g., finalizers), or remove the `ownerReference` entirely.
</issue_to_address>

### Comment 2
<location> `controllers/admissionpolicy/resources.go:67` </location>
<code_context>
+						},
+					},
+				},
+				Validations: []admissionv1.Validation{
+					{
+						Expression: fmt.Sprintf(`request.namespace == '%s'`, namespace),
</code_context>

<issue_to_address>
**issue (bug_risk):** CEL expression uses single-quoted string literals, which are invalid in CEL and will break the policy

The expression is currently generated as `request.namespace == '%s'`, but CEL for Kubernetes only accepts double-quoted string literals (e.g., `"ns"`); single quotes are invalid and will cause the policy to fail compilation, so the namespace check will never be enforced. Please change the format string to use double quotes, e.g. `fmt.Sprintf(`request.namespace == "%s"`, namespace).
</issue_to_address>

### Comment 3
<location> `pkg/webhooks/validator/validator_test.go:175-184` </location>
<code_context>
 			Expect(res.Result.Message).To(Equal("unknown operation request \"MALFORMED\""))
 		})

-		It("should correctly handle operation errors", func() {
-			cr.Namespace = ResourceInvalidNamespace
-			req := newRequest(admissionv1.Create, cr, v1beta1Codec, false)
-
-			res := wh.Handle(ctx, req)
-			Expect(res.Allowed).To(BeFalse())
-			Expect(res.Result.Code).To(Equal(int32(403)))
-			Expect(res.Result.Reason).To(Equal(metav1.StatusReasonForbidden))
-			Expect(res.Result.Message).To(Equal("invalid namespace for v1beta1.HyperConverged - please use the kubevirt-hyperconverged namespace"))
-		})
-
 		It("should accept creation of a resource with a valid namespace", func() {
 			Expect(wh.ValidateCreate(ctx, dryRun, cr)).To(Succeed())
 		})

-		It("should reject creation of a resource with an arbitrary namespace", func() {
-			cr.Namespace = ResourceInvalidNamespace
-			Expect(wh.ValidateCreate(ctx, dryRun, cr)).ToNot(Succeed())
</code_context>

<issue_to_address>
**suggestion (testing):** Namespace validation behavior is no longer covered by the webhook tests

With these tests removed, this suite no longer verifies that creating a HyperConverged CR in an invalid namespace is rejected. Even though the logic moved from the webhook to the ValidatingAdmissionPolicy, we should still test that the new policy enforces the namespace constraint. Please either add a focused unit test that checks the policy’s validation expression and error message, or an integration/e2e test that creates a CR in the wrong namespace and confirms the API server rejects it via the policy, so the regression this PR fixes remains covered by automated tests.

Suggested implementation:

```golang
		It("should accept creation of a resource with a valid namespace", func() {
			Expect(wh.ValidateCreate(ctx, dryRun, cr)).To(Succeed())
		})

		It("should reject creation of a resource with an arbitrary namespace", func() {
			cr.Namespace = ResourceInvalidNamespace

			err := wh.ValidateCreate(ctx, dryRun, cr)
			Expect(err).ToNot(Succeed())
			Expect(err.Error()).To(ContainSubstring("invalid namespace"))
			Expect(err.Error()).To(ContainSubstring("kubevirt-hyperconverged"))
		})

		DescribeTable("Validate annotations", func(annotations map[string]string, assertion types.GomegaMatcher) {

```

1. If `wh.ValidateCreate` already returns a `error`, the `Expect(err).ToNot(Succeed())` line assumes the use of Gomega's `Succeed()` matcher on an `error`. If your tests typically use `Expect(err).To(HaveOccurred())` instead, replace that line with `Expect(err).To(HaveOccurred())` to match existing style.
2. Ensure that the error message produced by the ValidatingAdmissionPolicy for an invalid namespace actually contains both `"invalid namespace"` and `"kubevirt-hyperconverged"`. If the message differs, adjust the `ContainSubstring(...)` expectations to match the exact message emitted by the policy.
3. If `ResourceInvalidNamespace` is not already set to a clearly wrong namespace in the shared test setup, update its value so that this test meaningfully exercises the policy's namespace constraint.
</issue_to_address>

### Comment 4
<location> `controllers/admissionpolicy/resources.go:26` </location>
<code_context>
+	requiredPolicy  *admissionv1.ValidatingAdmissionPolicy
+	requiredBinding *admissionv1.ValidatingAdmissionPolicyBinding
+
+	policyOnce  = &sync.Once{}
+	bindingOnce = &sync.Once{}
+
</code_context>

<issue_to_address>
**issue (complexity):** Consider replacing the singleton-style globals and sync.Once with simple per-call constructors that directly build and return the policy and binding objects.

You can drop the singleton pattern and globals entirely and keep the same behavior with simpler, easier-to-test code.

### 1. Remove `sync.Once` + global instances

Instead of:

```go
var (
    requiredPolicy  *admissionv1.ValidatingAdmissionPolicy
    requiredBinding *admissionv1.ValidatingAdmissionPolicyBinding

    policyOnce  = &sync.Once{}
    bindingOnce = &sync.Once{}
)

func getRequiredPolicy() *admissionv1.ValidatingAdmissionPolicy {
    policyOnce.Do(func() {
        namespace := hcoutil.GetOperatorNamespaceFromEnv()
        requiredPolicy = &admissionv1.ValidatingAdmissionPolicy{
            // ...
        }
    })

    return requiredPolicy.DeepCopy()
}

func getRequiredBinding() *admissionv1.ValidatingAdmissionPolicyBinding {
    bindingOnce.Do(func() {
        requiredBinding = &admissionv1.ValidatingAdmissionPolicyBinding{
            // ...
        }
    })

    return requiredBinding.DeepCopy()
}
```

Build a fresh struct on each call and return it directly:

```go
func getRequiredPolicy() *admissionv1.ValidatingAdmissionPolicy {
    namespace := hcoutil.GetOperatorNamespaceFromEnv()

    return &admissionv1.ValidatingAdmissionPolicy{
        ObjectMeta: metav1.ObjectMeta{
            Name:            policyName,
            Labels:          hcoutil.GetLabels(hcov1beta1.HyperConvergedName, hcoutil.AppComponentDeployment),
            OwnerReferences: []metav1.OwnerReference{ownresources.GetDeploymentRef()},
        },
        Spec: admissionv1.ValidatingAdmissionPolicySpec{
            FailurePolicy: ptr.To(admissionv1.Fail),
            MatchConstraints: &admissionv1.MatchResources{
                MatchPolicy:       ptr.To(admissionv1.Equivalent),
                NamespaceSelector: &metav1.LabelSelector{},
                ObjectSelector:    &metav1.LabelSelector{},
                ResourceRules: []admissionv1.NamedRuleWithOperations{{
                    RuleWithOperations: admissionv1.RuleWithOperations{
                        Rule: admissionv1.Rule{
                            APIGroups:   []string{hcov1beta1.APIVersionGroup},
                            APIVersions: []string{hcov1beta1.APIVersionBeta},
                            Resources:   []string{"hyperconvergeds"},
                            Scope:       ptr.To(admissionv1.NamespacedScope),
                        },
                        Operations: []admissionv1.OperationType{admissionv1.Create},
                    },
                }},
            },
            Validations: []admissionv1.Validation{{
                Expression: fmt.Sprintf(`request.namespace == '%s'`, namespace),
                Message:    fmt.Sprintf(`HyperConverged CR can only be created in the '%s' namespace.`, namespace),
            }},
        },
    }
}

func getRequiredBinding() *admissionv1.ValidatingAdmissionPolicyBinding {
    return &admissionv1.ValidatingAdmissionPolicyBinding{
        ObjectMeta: metav1.ObjectMeta{
            Name:            policyBindingName,
            Labels:          hcoutil.GetLabels(hcov1beta1.HyperConvergedName, hcoutil.AppComponentDeployment),
            OwnerReferences: []metav1.OwnerReference{ownresources.GetDeploymentRef()},
        },
        Spec: admissionv1.ValidatingAdmissionPolicyBindingSpec{
            PolicyName:        policyName,
            ValidationActions: []admissionv1.ValidationAction{admissionv1.Deny},
        },
    }
}
```

Benefits:

- No global mutable state, no `sync.Once`, no `DeepCopy` calls.
- Construction is local and obvious; env and owner refs are evaluated per call, which is usually fine for such small objects.
- Concurrency is simpler: each caller gets its own instance.

### 2. Keep predicates as simple, stateless helpers

You can keep the predicates as plain globals without coupling them to the singleton logic:

```go
var (
    policyPredicate = predicate.NewTypedPredicateFuncs[*admissionv1.ValidatingAdmissionPolicy](func(policy *admissionv1.ValidatingAdmissionPolicy) bool {
        return policy.Name == policyName && policy.DeletionTimestamp == nil
    })

    bindingPredicate = predicate.NewTypedPredicateFuncs[*admissionv1.ValidatingAdmissionPolicyBinding](func(binding *admissionv1.ValidatingAdmissionPolicyBinding) bool {
        return binding.Name == policyBindingName && binding.DeletionTimestamp == nil
    })
)
```

If you want to go one step further in decoupling, you can also thread in the namespace/owner ref via parameters instead of reading env/globals inside the helpers, but the change above already removes the main indirection and global mutability without affecting functionality.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 20021419104

Details

  • 115 of 189 (60.85%) changed or added relevant lines in 4 files are covered.
  • 5 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.2%) to 76.505%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/components/components.go 0 5 0.0%
controllers/admissionpolicy/resources.go 54 60 90.0%
controllers/admissionpolicy/admission_policy_controller.go 60 123 48.78%
Files with Coverage Reduction New Missed Lines %
pkg/webhooks/validator/validator.go 5 90.63%
Totals Coverage Status
Change from base Build 20004527574: -0.2%
Covered Lines: 8183
Relevant Lines: 10696

💛 - Coveralls

@nunnatsa
Copy link
Collaborator Author

nunnatsa commented Dec 8, 2025

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In reconcilePolicy/reconcileBinding you detect label changes with CompareLabels, but then call MergeLabels on the desired object (policy/binding) instead of the live object (foundPolicy/foundBinding), so the updated labels are never persisted; swap the arguments so you actually mutate the object being updated.

This is just wrong. the signature of MergeLabels is MergeLabels(src, tgt *metav1.ObjectMeta), so the arguments are in the right order.

  • The logger acquisition in Reconcile uses logger, err := logr.FromContext(ctx), but logr.FromContext does not return an error; this code as written will not compile and should instead use logr.FromContextOrDiscard(ctx) or a single-value FromContext call.

This is just wrong. logr.FromContext(ctx) does return error.

  • The startup reconciliation event is sent on startupEvent before add(mgr, r) wires the channel into the controller and the channel is then closed via defer, so the startup request is likely dropped; consider moving the send after the controller is watching the channel and avoiding closing it prematurely so the initial reconcile is guaranteed to run.

This is not true. Since the channel is cached channel, we can read from it even if it is closed. This code is working.

Just for example (runnable version here: https://go.dev/play/p/x0GaTpwpx3o):

func main() {
	ch := make(chan string, 1)
	ch <- "hello"
	close(ch)

	go func() {
		fmt.Println("Starting listening to the channel")
		str := <-ch
		if str == "" {
		} else {
			fmt.Printf("Got %q\n", str)
			<-ch
			fmt.Println("Now the channel is finally closed")
		}
	}()

	time.Sleep(time.Second)
	fmt.Println("Done; exiting...")
}

The output is:

Starting listening to the channel
Got "hello"
Now the channel is finally closed
Done; exiting...

@hco-bot
Copy link
Collaborator

hco-bot commented Dec 8, 2025

hco-e2e-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-sno-azure

@kubevirt-bot
Copy link
Contributor

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-sno-azure

In response to this:

hco-e2e-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-sno-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hco-bot
Copy link
Collaborator

hco-bot commented Dec 8, 2025

hco-e2e-upgrade-prev-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-azure
hco-e2e-upgrade-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure
hco-e2e-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-gcp
hco-e2e-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-azure

@kubevirt-bot
Copy link
Contributor

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-azure, ci/prow/hco-e2e-operator-sdk-gcp, ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-azure

In response to this:

hco-e2e-upgrade-prev-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-azure
hco-e2e-upgrade-operator-sdk-sno-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure
hco-e2e-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-gcp
hco-e2e-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hco-bot
Copy link
Collaborator

hco-bot commented Dec 8, 2025

hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.
/override ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure
hco-e2e-upgrade-prev-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure

@kubevirt-bot
Copy link
Contributor

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure

In response to this:

hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.
/override ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure
hco-e2e-upgrade-prev-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nunnatsa
Copy link
Collaborator Author

nunnatsa commented Dec 8, 2025

/retest

}

// add adds a new Controller to mgr with r as the reconcile.Reconciler
func add(mgr manager.Manager, r *ReconcileAdmissionPolicy) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func add(mgr manager.Manager, r *ReconcileAdmissionPolicy) error {
func addController(mgr manager.Manager, r *ReconcileAdmissionPolicy) error {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use this name in all existing controller. Let's keep the name for consistency.

@hco-bot
Copy link
Collaborator

hco-bot commented Dec 8, 2025

hco-e2e-kv-smoke-gcp lane succeeded.
/override ci/prow/hco-e2e-kv-smoke-azure

@kubevirt-bot
Copy link
Contributor

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-kv-smoke-azure

In response to this:

hco-e2e-kv-smoke-gcp lane succeeded.
/override ci/prow/hco-e2e-kv-smoke-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hco-bot
Copy link
Collaborator

hco-bot commented Dec 8, 2025

hco-e2e-upgrade-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-azure

@kubevirt-bot
Copy link
Contributor

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-operator-sdk-azure

In response to this:

hco-e2e-upgrade-operator-sdk-aws lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note-none Denotes a PR that doesn't merit a release note. size/XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants