KEP-4876: Mutable CSINode Allocatable Property #4875
base: master
Conversation
torredil commented Sep 25, 2024 (edited)
- One-line PR description: This PR adds a new KEP for Mutable CSINode Allocatable Property.
- Issue link: Mutable CSINode Allocatable Property #4876
- Other comments:
```golang
// the CSINode allocatable capacity for this driver. If not set, periodic updates
// are disabled, and updates occur only upon detecting capacity-related failures.
// +optional
AllocatableUpdateInterval *metav1.Duration
```
Would it be a fixed interval only? I think it'd be useful to support exponential backoff here too.
Yes, a fixed interval period only. I'm not sure that it would make sense to update this value using an exponential backoff strategy, but open to the idea.
This updated logic allows the `Allocatable.Count` field to be modified when the feature gate is enabled, while ensuring all other fields remain immutable. When the feature gate is disabled, it falls back to the existing validation logic for backward compatibility.
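The gated validation described above can be sketched as follows. The types are pared-down stand-ins for the real API types, and `validateDriverUpdate` is a hypothetical helper, not the actual apiserver code.

```golang
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// Minimal stand-ins for the real storage.k8s.io types, for illustration only.
type VolumeNodeResources struct{ Count *int32 }

type CSINodeDriver struct {
	Name        string
	NodeID      string
	Allocatable *VolumeNodeResources
}

// validateDriverUpdate sketches the feature-gated check: with the gate on,
// only the allocatable count may change; with it off, the driver entry is
// fully immutable. oldD and newD are copies, so mutating them is safe.
func validateDriverUpdate(oldD, newD CSINodeDriver, gateEnabled bool) error {
	if gateEnabled {
		// Normalize the mutable field (only Count in this sketch) before
		// comparing everything else for equality.
		oldD.Allocatable = newD.Allocatable
	}
	if !reflect.DeepEqual(oldD, newD) {
		return errors.New("CSINodeDriver fields other than allocatable.count are immutable")
	}
	return nil
}

func main() {
	c1, c2 := int32(8), int32(4)
	oldD := CSINodeDriver{Name: "ebs.csi.aws.com", NodeID: "node-1", Allocatable: &VolumeNodeResources{Count: &c1}}
	newD := oldD
	newD.Allocatable = &VolumeNodeResources{Count: &c2}
	fmt.Println(validateDriverUpdate(oldD, newD, true))  // count change allowed with gate on
	fmt.Println(validateDriverUpdate(oldD, newD, false)) // rejected with gate off
}
```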
Another validation concerns which entities can update the field: only the kubelet should update it, not end users, so that might mean a change in the kube-apiserver as well.
Talked with eddie offline; AFAIK this is done in the node authorizer:

```
plugin/pkg/auth/authorizer/node/node_authorizer.go: return r.authorizeCSINode(nodeName, attrs)
plugin/pkg/auth/authorizer/node/node_authorizer.go:// authorizeCSINode authorizes node requests to CSINode storage.k8s.io/csinodes
plugin/pkg/auth/authorizer/node/node_authorizer.go:func (r *NodeAuthorizer) authorizeCSINode(nodeName string, attrs authorizer.Attributes) (authorizer.Decision, string, error) {
plugin/pkg/auth/authorizer/node/node_authorizer.go: return authorizer.DecisionNoOpinion, "can only get, create, update, patch, or delete a CSINode", nil
plugin/pkg/auth/authorizer/node/node_authorizer.go: return authorizer.DecisionNoOpinion, "cannot authorize CSINode subresources", nil
```
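The behavior those grep hits point at can be sketched as below: the node authorizer allows a kubelet to act only on the CSINode named after its own node, and offers no opinion otherwise (leaving other subjects to RBAC, which is why it cannot restrict non-kubelet users). The types here are simplified stand-ins, and the function is an illustration, not the real implementation.

```golang
package main

import "fmt"

type Decision int

const (
	DecisionNoOpinion Decision = iota
	DecisionAllow
)

// Attributes is a pared-down stand-in for authorizer.Attributes.
type Attributes struct {
	Verb     string
	Resource string
	Name     string
}

// authorizeCSINode sketches the node-authorizer check: a kubelet may
// get/create/update/patch/delete only the CSINode matching its own node name.
// Requests it does not recognize get DecisionNoOpinion, so other authorizers
// (e.g. RBAC) still decide for non-kubelet subjects.
func authorizeCSINode(nodeName string, attrs Attributes) (Decision, string) {
	if attrs.Resource != "csinodes" {
		return DecisionNoOpinion, "not a CSINode request"
	}
	switch attrs.Verb {
	case "get", "create", "update", "patch", "delete":
	default:
		return DecisionNoOpinion, "can only get, create, update, patch, or delete a CSINode"
	}
	if attrs.Name != nodeName {
		return DecisionNoOpinion, "can only access CSINode with the same name as the requesting node"
	}
	return DecisionAllow, ""
}

func main() {
	d, _ := authorizeCSINode("node-1", Attributes{Verb: "update", Resource: "csinodes", Name: "node-1"})
	fmt.Println(d == DecisionAllow)
}
```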
Hmm, I've been thinking more about this... if I understand the ask correctly, it is to ensure that only the kubelet can update CSINode objects, and to prevent any other entity, including privileged users, from updating them.
It seems that the node authorizer is designed to authorize API requests made by kubelets, not to restrict requests from other sources. In other words, no change to the node authorizer could address @mauriciopoppe's request (and no changes are needed on the node authorizer side anyway, because it already allows updating the CSINode object, per @aojea's finding above : )
I'm curious whether there is precedent before we continue down this path. As far as I know, we don't restrict creation of CSINode objects to the kubelet either; wouldn't it have made sense to enforce this restriction for object creation as well? It seems odd to allow privileged entities to create the object but not update it.
I am not aware of any APIs where we restrict which users can write an object. The only distinction is cluster-scoped vs namespace-scoped APIs.
/cc
PRR is sufficient for alpha, but I have some scalability concerns and would like to see some quick back-of-the-envelope math for the load and what we can do to mitigate it.
The `ResourceExhausted` error is directly reported on the `VolumeAttachment` object associated with the relevant attachment.

```golang
if err := kl.volumeManager.WaitForAttachAndMount(pod); err != nil {
```
Any consideration for checking in https://github.com/kubernetes/kubernetes/blob/cc67c4cf34d6c6e73458835154d06d0b6012e50f/pkg/volume/csi/csi_attacher.go#L146?
The logic in syncPod is CSI-agnostic.
I spent countless hours exploring this path when writing the KEP, and the main issue is that we need to query a CSI driver's NodeGetInfo endpoint to retrieve the updated allocatable value before updating the CSINode object. That operation can't be done on the KCM side; it needs to be done on the node side :(
Besides this, it would be helpful to create an event on the pod during pod construction to let cluster operators know that the pod has failed to come up due to a volume attachment error caused by insufficient capacity. The idea is that cluster operators or https://github.com/kubernetes-sigs/descheduler could leverage this event to "remove" stateful pods that are stuck in `ContainerCreating` so that they may be scheduled to a different node that does have capacity.
Would love to hear your recommendation for a path forward here.
PRR looks good and the scalability section is valuable. We'll have to build it out a little more as we gain experience in alpha. /approve
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: deads2k, torredil. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
- Modifying the core scheduling logic of Kubernetes.
- Implementing cloud provider-specific solutions within Kubernetes core.
- Re-scheduling pods stuck in a `ContainerCreating` state.
Maybe this should be reconsidered in beta, as it sounds like a major pain point for users. But this is more a question for SIG Node (cc @dchen1107 @SergeyKanzhelev).
- Modifying the core scheduling logic of Kubernetes.
- Implementing cloud provider-specific solutions within Kubernetes core.
- Re-scheduling pods stuck in a `ContainerCreating` state.
Is it something that we can check at pod admission instead? As we do with the Device Plugin, where we allocate devices on admission and can then use them later without the risk of failing container creation.
```golang
// are disabled, and updates occur only upon detecting capacity-related failures.
// The minimum allowed value for this field is 10 seconds.
// +optional
NodeAllocatableUpdatePeriodSeconds *metav1.Duration
```
Would it be better to use a stream for CSIDriver to report back changes? Same way as we do Device Health in Device Plugin?
None of the CSI RPCs support streaming IIRC.
Signed-off-by: torredil <[email protected]>