Conversation

samip5-bot bot commented Mar 27, 2025

This PR contains the following updates:

| Package | Update | Change |
| ------- | ------ | ------ |
| gpu-operator (source) | major | `v24.9.2` -> `v25.10.0` |

Warning: Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

NVIDIA/gpu-operator (gpu-operator)

v25.10.0: GPU Operator 25.10.0 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.10/release-notes.html

v25.3.4: GPU Operator 25.3.4 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.3.4/release-notes.html

v25.3.3: GPU Operator 25.3.3 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.3.3/release-notes.html

v25.3.2: GPU Operator 25.3.2 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.3.2/release-notes.html

v25.3.1: GPU Operator 25.3.1 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.3.1/release-notes.html

v25.3.0: GPU Operator 25.3.0 Release

Compare Source

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/25.3.0/release-notes.html


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

samip5-bot bot commented Mar 27, 2025

--- HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRole: gpu-operator/nvidia-gpu-operator-node-feature-discovery

+++ HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRole: gpu-operator/nvidia-gpu-operator-node-feature-discovery

@@ -5,12 +5,19 @@

   name: nvidia-gpu-operator-node-feature-discovery
   labels:
     app.kubernetes.io/name: node-feature-discovery
     app.kubernetes.io/instance: nvidia-gpu-operator
     app.kubernetes.io/managed-by: Helm
 rules:
+- apiGroups:
+  - ''
+  resources:
+  - namespaces
+  verbs:
+  - watch
+  - list
 - apiGroups:
   - ''
   resources:
   - nodes
   - nodes/status
   verbs:
--- HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRole: gpu-operator/gpu-operator

+++ HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRole: gpu-operator/gpu-operator

@@ -66,30 +66,39 @@

   - ''
   resources:
   - namespaces
   verbs:
   - get
   - list
-  - create
   - watch
   - update
   - patch
 - apiGroups:
   - ''
   resources:
   - events
-  - pods
-  - pods/eviction
   verbs:
   - create
   - get
   - list
   - watch
-  - update
-  - patch
   - delete
+- apiGroups:
+  - ''
+  resources:
+  - pods
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - ''
+  resources:
+  - pods/eviction
+  verbs:
+  - create
 - apiGroups:
   - apps
   resources:
   - daemonsets
   verbs:
   - get
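
The eviction permission moving to the `pods/eviction` subresource and the new namespace rule for NFD are easy to spot-check after the upgrade. A minimal sketch, assuming the default service account names this chart renders (`node-feature-discovery` and `gpu-operator` in the `gpu-operator` namespace, per the bindings below):

```sh
# NFD now needs watch/list on namespaces:
kubectl auth can-i watch namespaces \
  --as=system:serviceaccount:gpu-operator:node-feature-discovery

# The operator keeps eviction rights only via the pods/eviction subresource:
kubectl auth can-i create pods --subresource=eviction \
  --as=system:serviceaccount:gpu-operator:gpu-operator
```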
--- HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRoleBinding: gpu-operator/gpu-operator

+++ HelmRelease: gpu-operator/nvidia-gpu-operator ClusterRoleBinding: gpu-operator/gpu-operator

@@ -9,14 +9,11 @@

     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/component: gpu-operator
 subjects:
 - kind: ServiceAccount
   name: gpu-operator
   namespace: gpu-operator
-- kind: ServiceAccount
-  name: node-feature-discovery
-  namespace: gpu-operator
 roleRef:
   kind: ClusterRole
   name: gpu-operator
   apiGroup: rbac.authorization.k8s.io
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator Role: gpu-operator/nvidia-gpu-operator-node-feature-discovery-worker

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Role: gpu-operator/nvidia-gpu-operator-node-feature-discovery-worker

@@ -14,12 +14,13 @@

   resources:
   - nodefeatures
   verbs:
   - create
   - get
   - update
+  - delete
 - apiGroups:
   - ''
   resources:
   - pods
   verbs:
   - get
--- HelmRelease: gpu-operator/nvidia-gpu-operator Role: gpu-operator/gpu-operator

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Role: gpu-operator/gpu-operator

@@ -82,7 +82,17 @@

   - get
   - list
   - create
   - watch
   - update
   - delete
+- apiGroups:
+  - nfd.k8s-sigs.io
+  resources:
+  - nodefeatures
+  verbs:
+  - get
+  - list
+  - watch
+  - create
+  - update
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator DaemonSet: gpu-operator/nvidia-gpu-operator-node-feature-discovery-worker

+++ HelmRelease: gpu-operator/nvidia-gpu-operator DaemonSet: gpu-operator/nvidia-gpu-operator-node-feature-discovery-worker

@@ -22,35 +22,38 @@

         app.kubernetes.io/name: node-feature-discovery
         app.kubernetes.io/instance: nvidia-gpu-operator
         role: worker
     spec:
       dnsPolicy: ClusterFirstWithHostNet
       priorityClassName: system-node-critical
+      imagePullSecrets: null
       serviceAccountName: node-feature-discovery
       securityContext: {}
       hostNetwork: false
       containers:
       - name: worker
         securityContext:
           allowPrivilegeEscalation: false
           capabilities:
             drop:
             - ALL
           readOnlyRootFilesystem: true
           runAsNonRoot: true
-        image: registry.k8s.io/nfd/node-feature-discovery:v0.16.6
+        image: registry.k8s.io/nfd/node-feature-discovery:v0.18.2
         imagePullPolicy: IfNotPresent
         livenessProbe:
-          grpc:
-            port: 8082
+          httpGet:
+            path: /healthz
+            port: http
           initialDelaySeconds: 10
         readinessProbe:
+          httpGet:
+            path: /healthz
+            port: http
+          initialDelaySeconds: 5
           failureThreshold: 10
-          grpc:
-            port: 8082
-          initialDelaySeconds: 5
         env:
         - name: NODE_NAME
           valueFrom:
             fieldRef:
               fieldPath: spec.nodeName
         - name: POD_NAME
@@ -67,21 +70,17 @@

           requests:
             cpu: 5m
             memory: 64Mi
         command:
         - nfd-worker
         args:
-        - -feature-gates=NodeFeatureAPI=true
         - -feature-gates=NodeFeatureGroupAPI=false
-        - -metrics=8081
-        - -grpc-health=8082
+        - -port=8080
         ports:
-        - containerPort: 8081
-          name: metrics
-        - containerPort: 8082
-          name: health
+        - containerPort: 8080
+          name: http
         volumeMounts:
         - name: host-boot
           mountPath: /host-boot
           readOnly: true
         - name: host-os-release
           mountPath: /host-etc/os-release
@@ -94,15 +93,12 @@

           readOnly: true
         - name: host-lib
           mountPath: /host-lib
           readOnly: true
         - name: host-proc-swaps
           mountPath: /host-proc/swaps
-          readOnly: true
-        - name: source-d
-          mountPath: /etc/kubernetes/node-feature-discovery/source.d/
           readOnly: true
         - name: features-d
           mountPath: /etc/kubernetes/node-feature-discovery/features.d/
           readOnly: true
         - name: nfd-worker-conf
           mountPath: /etc/kubernetes/node-feature-discovery
@@ -123,15 +119,12 @@

       - name: host-lib
         hostPath:
           path: /lib
       - name: host-proc-swaps
         hostPath:
           path: /proc/swaps
-      - name: source-d
-        hostPath:
-          path: /etc/kubernetes/node-feature-discovery/source.d/
       - name: features-d
         hostPath:
           path: /etc/kubernetes/node-feature-discovery/features.d/
       - name: nfd-worker-conf
         configMap:
           name: nvidia-gpu-operator-node-feature-discovery-worker-conf
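
NFD v0.18 drops the separate gRPC health (8082) and metrics (8081) ports in favor of a single HTTP port 8080 with `/healthz` probes, as the hunks above show. A hedged way to confirm the endpoint after rollout; the label selector is taken from the pod template above, and the port-forward scaffolding is purely illustrative:

```sh
# Forward the worker's single HTTP port and hit the new /healthz endpoint.
POD=$(kubectl -n gpu-operator get pod \
  -l app.kubernetes.io/name=node-feature-discovery,role=worker \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n gpu-operator port-forward "$POD" 8080:8080 &
PF_PID=$!
sleep 2
curl -s http://localhost:8080/healthz
kill "$PF_PID"
```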
--- HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/nvidia-gpu-operator-node-feature-discovery-master

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/nvidia-gpu-operator-node-feature-discovery-master

@@ -21,13 +21,15 @@

     metadata:
       labels:
         app.kubernetes.io/name: node-feature-discovery
         app.kubernetes.io/instance: nvidia-gpu-operator
         role: master
     spec:
+      dnsPolicy: ClusterFirstWithHostNet
       priorityClassName: system-node-critical
+      imagePullSecrets: null
       serviceAccountName: node-feature-discovery
       enableServiceLinks: false
       securityContext: {}
       hostNetwork: false
       containers:
       - name: master
@@ -35,30 +37,31 @@

           allowPrivilegeEscalation: false
           capabilities:
             drop:
             - ALL
           readOnlyRootFilesystem: true
           runAsNonRoot: true
-        image: registry.k8s.io/nfd/node-feature-discovery:v0.16.6
+        image: registry.k8s.io/nfd/node-feature-discovery:v0.18.2
         imagePullPolicy: IfNotPresent
+        startupProbe:
+          httpGet:
+            path: /healthz
+            port: http
+          failureThreshold: 30
         livenessProbe:
-          grpc:
-            port: 8082
-          initialDelaySeconds: 10
+          httpGet:
+            path: /healthz
+            port: http
         readinessProbe:
+          httpGet:
+            path: /healthz
+            port: http
           failureThreshold: 10
-          grpc:
-            port: 8082
-          initialDelaySeconds: 5
         ports:
         - containerPort: 8080
-          name: grpc
-        - containerPort: 8081
-          name: metrics
-        - containerPort: 8082
-          name: health
+          name: http
         env:
         - name: NODE_NAME
           valueFrom:
             fieldRef:
               fieldPath: spec.nodeName
         command:
@@ -67,17 +70,15 @@

           limits:
             memory: 4Gi
           requests:
             cpu: 100m
             memory: 128Mi
         args:
-        - -crd-controller=true
-        - -feature-gates=NodeFeatureAPI=true
+        - -enable-leader-election
         - -feature-gates=NodeFeatureGroupAPI=false
-        - -metrics=8081
-        - -grpc-health=8082
+        - -port=8080
         volumeMounts:
         - name: nfd-master-conf
           mountPath: /etc/kubernetes/node-feature-discovery
           readOnly: true
       volumes:
       - name: nfd-master-conf
@@ -88,28 +89,17 @@

             path: nfd-master.conf
       affinity:
         nodeAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - preference:
               matchExpressions:
-              - key: node-role.kubernetes.io/master
-                operator: In
-                values:
-                - ''
-            weight: 1
-          - preference:
-              matchExpressions:
               - key: node-role.kubernetes.io/control-plane
                 operator: In
                 values:
                 - ''
             weight: 1
       tolerations:
       - effect: NoSchedule
-        key: node-role.kubernetes.io/master
-        operator: Equal
-        value: ''
-      - effect: NoSchedule
         key: node-role.kubernetes.io/control-plane
         operator: Equal
         value: ''
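
Besides the same probe and port consolidation, the master now passes `-enable-leader-election` instead of `-crd-controller=true`, and the deprecated `node-role.kubernetes.io/master` affinity term and toleration are dropped in favor of `control-plane` only. Leader election is coordinated through a Lease object whose exact name is an NFD implementation detail, so a generic listing is the safe check:

```sh
# Leader election uses a Lease; the exact Lease name is NFD-internal,
# so just list what exists in the namespace after rollout:
kubectl -n gpu-operator get leases
```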
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/nvidia-gpu-operator-node-feature-discovery-gc

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/nvidia-gpu-operator-node-feature-discovery-gc

@@ -24,18 +24,29 @@

         app.kubernetes.io/instance: nvidia-gpu-operator
         role: gc
     spec:
       serviceAccountName: node-feature-discovery
       dnsPolicy: ClusterFirstWithHostNet
       priorityClassName: system-node-critical
+      imagePullSecrets: null
       securityContext: {}
       hostNetwork: false
       containers:
       - name: gc
-        image: registry.k8s.io/nfd/node-feature-discovery:v0.16.6
+        image: registry.k8s.io/nfd/node-feature-discovery:v0.18.2
         imagePullPolicy: IfNotPresent
+        livenessProbe:
+          httpGet:
+            path: /healthz
+            port: http
+          initialDelaySeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /healthz
+            port: http
+          initialDelaySeconds: 5
         env:
         - name: NODE_NAME
           valueFrom:
             fieldRef:
               fieldPath: spec.nodeName
         command:
@@ -53,9 +64,9 @@

           capabilities:
             drop:
             - ALL
           readOnlyRootFilesystem: true
           runAsNonRoot: true
         ports:
-        - name: metrics
-          containerPort: 8081
+        - name: http
+          containerPort: 8080
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/gpu-operator

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Deployment: gpu-operator/gpu-operator

@@ -28,13 +28,13 @@

         openshift.io/scc: restricted-readonly
     spec:
       serviceAccountName: gpu-operator
       priorityClassName: system-node-critical
       containers:
       - name: gpu-operator
-        image: nvcr.io/nvidia/gpu-operator:v24.9.2
+        image: nvcr.io/nvidia/gpu-operator:v25.10.0
         imagePullPolicy: IfNotPresent
         command:
         - gpu-operator
         args:
         - --leader-elect
         - --zap-time-encoding=epoch
@@ -44,13 +44,13 @@

           value: ''
         - name: OPERATOR_NAMESPACE
           valueFrom:
             fieldRef:
               fieldPath: metadata.namespace
         - name: DRIVER_MANAGER_IMAGE
-          value: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.7.0
+          value: nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.9.0
         volumeMounts:
         - name: host-os-release
           mountPath: /host-etc/os-release
           readOnly: true
         livenessProbe:
           httpGet:
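
The operator image and `DRIVER_MANAGER_IMAGE` move in lockstep here (v25.10.0 and v0.9.0). A one-liner to confirm what actually rolled out, using the Deployment name from the diff header:

```sh
# Confirm the rolled-out operator image matches the chart bump:
kubectl -n gpu-operator get deploy gpu-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```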
--- HelmRelease: gpu-operator/nvidia-gpu-operator ClusterPolicy: gpu-operator/cluster-policy

+++ HelmRelease: gpu-operator/nvidia-gpu-operator ClusterPolicy: gpu-operator/cluster-policy

@@ -10,51 +10,48 @@

     app.kubernetes.io/component: gpu-operator
 spec:
   hostPaths:
     rootFS: /
     driverInstallDir: /run/nvidia/driver
   operator:
-    defaultRuntime: docker
     runtimeClass: nvidia
     initContainer:
       repository: nvcr.io/nvidia
       image: cuda
-      version: 12.6.3-base-ubi9
+      version: 13.0.1-base-ubi9
       imagePullPolicy: IfNotPresent
   daemonsets:
     labels:
-      helm.sh/chart: gpu-operator-v24.9.2
+      helm.sh/chart: gpu-operator-v25.10.0
       app.kubernetes.io/managed-by: gpu-operator
     tolerations:
     - effect: NoSchedule
       key: nvidia.com/gpu
       operator: Exists
     priorityClassName: system-node-critical
     updateStrategy: RollingUpdate
     rollingUpdate:
       maxUnavailable: '1'
   validator:
-    repository: nvcr.io/nvidia/cloud-native
-    image: gpu-operator-validator
-    version: v24.9.2
+    repository: nvcr.io/nvidia
+    image: gpu-operator
+    version: v25.10.0
     imagePullPolicy: IfNotPresent
     plugin:
-      env:
-      - name: WITH_WORKLOAD
-        value: 'false'
+      env: []
   mig:
     strategy: single
   psa:
     enabled: false
   cdi:
-    enabled: false
-    default: false
+    enabled: true
+    default: null
   driver:
     enabled: false
     useNvidiaDriverCRD: false
-    useOpenKernelModules: false
+    kernelModuleType: auto
     usePrecompiled: false
     repository: registry.skysolutions.fi/library/nvidia
     image: driver
     version: 550.90.07
     imagePullPolicy: IfNotPresent
     startupProbe:
@@ -65,34 +62,21 @@

     rdma:
       enabled: false
       useHostMofed: false
     manager:
       repository: nvcr.io/nvidia/cloud-native
       image: k8s-driver-manager
-      version: v0.7.0
-      imagePullPolicy: IfNotPresent
-      env:
-      - name: ENABLE_GPU_POD_EVICTION
-        value: 'true'
-      - name: ENABLE_AUTO_DRAIN
-        value: 'false'
-      - name: DRAIN_USE_FORCE
-        value: 'false'
-      - name: DRAIN_POD_SELECTOR_LABEL
-        value: ''
-      - name: DRAIN_TIMEOUT_SECONDS
-        value: 0s
-      - name: DRAIN_DELETE_EMPTYDIR_DATA
-        value: 'false'
+      version: v0.9.0
+      imagePullPolicy: IfNotPresent
     repoConfig:
       configMapName: ''
     certConfig:
       name: ''
     licensingConfig:
-      configMapName: ''
       nlsEnabled: true
+      secretName: ''
     virtualTopology:
       config: ''
     kernelModuleConfig:
       name: ''
     upgradePolicy:
       autoUpgrade: true
@@ -113,19 +97,14 @@

     enabled: false
     image: vgpu-manager
     imagePullPolicy: IfNotPresent
     driverManager:
       repository: nvcr.io/nvidia/cloud-native
       image: k8s-driver-manager
-      version: v0.7.0
-      imagePullPolicy: IfNotPresent
-      env:
-      - name: ENABLE_GPU_POD_EVICTION
-        value: 'false'
-      - name: ENABLE_AUTO_DRAIN
-        value: 'false'
+      version: v0.9.0
+      imagePullPolicy: IfNotPresent
   kataManager:
     enabled: false
     config:
       artifactsDir: /opt/nvidia-gpu-operator/artifacts/runtimeclasses
       runtimeClasses:
       - artifacts:
@@ -138,35 +117,30 @@

           url: nvcr.io/nvidia/cloud-native/kata-gpu-artifacts:ubuntu22.04-535.86.10-snp
         name: kata-nvidia-gpu-snp
         nodeSelector:
           nvidia.com/cc.capable: 'true'
     repository: nvcr.io/nvidia/cloud-native
     image: k8s-kata-manager
-    version: v0.2.2
+    version: v0.2.3
     imagePullPolicy: IfNotPresent
   vfioManager:
     enabled: true
     repository: nvcr.io/nvidia
     image: cuda
-    version: 12.6.3-base-ubi9
+    version: 13.0.1-base-ubi9
     imagePullPolicy: IfNotPresent
     driverManager:
       repository: nvcr.io/nvidia/cloud-native
       image: k8s-driver-manager
-      version: v0.7.0
-      imagePullPolicy: IfNotPresent
-      env:
-      - name: ENABLE_GPU_POD_EVICTION
-        value: 'false'
-      - name: ENABLE_AUTO_DRAIN
-        value: 'false'
+      version: v0.9.0
+      imagePullPolicy: IfNotPresent
   vgpuDeviceManager:
     enabled: true
     repository: nvcr.io/nvidia/cloud-native
     image: vgpu-device-manager
-    version: v0.2.8
+    version: v0.4.1
     imagePullPolicy: IfNotPresent
     config:
       default: default
       name: ''
   ccManager:
     enabled: false
@@ -177,13 +151,13 @@

     imagePullPolicy: IfNotPresent
     env: []
   toolkit:
     enabled: true
     repository: nvcr.io/nvidia/k8s
     image: container-toolkit
-    version: v1.17.4-ubuntu20.04
+    version: v1.18.0
     imagePullPolicy: IfNotPresent
     env:
     - name: CONTAINERD_CONFIG
       value: /var/lib/rancher/k3s/agent/etc/containerd/config.toml
     - name: CONTAINERD_SOCKET
       value: /run/k3s/containerd/containerd.sock
@@ -193,96 +167,76 @@

       value: 'true'
     installDir: /usr/local/nvidia
   devicePlugin:
     enabled: true
     repository: nvcr.io/nvidia
     image: k8s-device-plugin
-    version: v0.17.0
-    imagePullPolicy: IfNotPresent
-    env:
-    - name: PASS_DEVICE_SPECS
-      value: 'true'
-    - name: FAIL_ON_INIT_ERROR
-      value: 'true'
-    - name: DEVICE_LIST_STRATEGY
-      value: envvar
-    - name: DEVICE_ID_STRATEGY
-      value: uuid
-    - name: NVIDIA_VISIBLE_DEVICES
-      value: all
-    - name: NVIDIA_DRIVER_CAPABILITIES
-      value: all
+    version: v0.18.0
+    imagePullPolicy: IfNotPresent
     config:
       name: time-slicing-config
       default: any
   dcgm:
     enabled: false
     repository: nvcr.io/nvidia/cloud-native
     image: dcgm
-    version: 3.3.9-1-ubuntu22.04
+    version: 4.4.1-2-ubuntu22.04
     imagePullPolicy: IfNotPresent
   dcgmExporter:
     enabled: true
     repository: nvcr.io/nvidia/k8s
     image: dcgm-exporter
-    version: 3.3.9-3.6.1-ubuntu22.04
-    imagePullPolicy: IfNotPresent
-    env:
-    - name: DCGM_EXPORTER_LISTEN
-      value: :9400
-    - name: DCGM_EXPORTER_KUBERNETES
-      value: 'true'
-    - name: DCGM_EXPORTER_COLLECTORS
-      value: /etc/dcgm-exporter/dcp-metrics-included.csv
+    version: 4.4.1-4.6.0-distroless
+    imagePullPolicy: IfNotPresent
     serviceMonitor:
       additionalLabels: {}
       enabled: false
       honorLabels: false
       interval: 15s
       relabelings: []
+    service:
+      internalTrafficPolicy: Cluster
   gfd:
     enabled: true
     repository: nvcr.io/nvidia
     image: k8s-device-plugin
-    version: v0.17.0
-    imagePullPolicy: IfNotPresent
-    env:
-    - name: GFD_SLEEP_INTERVAL
-      value: 60s
-    - name: GFD_FAIL_ON_INIT_ERROR
-      value: 'true'
+    version: v0.18.0
+    imagePullPolicy: IfNotPresent
   migManager:
     enabled: true
     repository: nvcr.io/nvidia/cloud-native
     image: k8s-mig-manager
-    version: v0.10.0-ubuntu20.04
-    imagePullPolicy: IfNotPresent
-    env:
-    - name: WITH_REBOOT
-      value: 'false'
+    version: v0.13.0
+    imagePullPolicy: IfNotPresent
     config:
       name: null
       default: all-disabled
     gpuClientsConfig:
       name: ''
   nodeStatusExporter:
     enabled: false
-    repository: nvcr.io/nvidia/cloud-native
-    image: gpu-operator-validator
-    version: v24.9.2
+    repository: nvcr.io/nvidia
+    image: gpu-operator
+    version: v25.10.0
+    imagePullPolicy: IfNotPresent
+  gds:
+    enabled: false
+    repository: nvcr.io/nvidia/cloud-native
+    image: nvidia-fs
+    version: 2.26.6
     imagePullPolicy: IfNotPresent
   gdrcopy:
     enabled: false
     repository: nvcr.io/nvidia/cloud-native
     image: gdrdrv
-    version: v2.4.1-2
+    version: v2.5.1
     imagePullPolicy: IfNotPresent
   sandboxWorkloads:
     enabled: false
     defaultWorkload: container
   sandboxDevicePlugin:
     enabled: true
     repository: nvcr.io/nvidia
     image: kubevirt-gpu-device-plugin
-    version: v1.2.10
+    version: v1.4.0
     imagePullPolicy: IfNotPresent
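
The most significant behavioral change in this ClusterPolicy diff is `cdi.enabled` flipping from `false` to `true` (with `default` left unset), alongside the validator moving from the dedicated `gpu-operator-validator` image into the main `gpu-operator` image. To read back the rendered CDI settings after reconciliation (resource name taken from the diff header):

```sh
# Read back the CDI settings from the rendered, cluster-scoped ClusterPolicy:
kubectl get clusterpolicies.nvidia.com cluster-policy \
  -o jsonpath='{.spec.cdi}{"\n"}'
```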
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator Job: gpu-operator/nvidia-gpu-operator-node-feature-discovery-prune

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Job: gpu-operator/nvidia-gpu-operator-node-feature-discovery-prune

@@ -18,49 +18,45 @@

         app.kubernetes.io/name: node-feature-discovery
         app.kubernetes.io/instance: nvidia-gpu-operator
         app.kubernetes.io/managed-by: Helm
         role: prune
     spec:
       serviceAccountName: nvidia-gpu-operator-node-feature-discovery-prune
+      imagePullSecrets: null
       containers:
       - name: nfd-master
         securityContext:
           allowPrivilegeEscalation: false
           capabilities:
             drop:
             - ALL
           readOnlyRootFilesystem: true
           runAsNonRoot: true
-        image: registry.k8s.io/nfd/node-feature-discovery:v0.16.6
+        image: registry.k8s.io/nfd/node-feature-discovery:v0.18.2
         imagePullPolicy: IfNotPresent
         command:
         - nfd-master
         args:
         - -prune
       restartPolicy: Never
       affinity:
         nodeAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - preference:
               matchExpressions:
-              - key: node-role.kubernetes.io/master
-                operator: In
-                values:
-                - ''
-            weight: 1
-          - preference:
-              matchExpressions:
               - key: node-role.kubernetes.io/control-plane
                 operator: In
                 values:
                 - ''
             weight: 1
       tolerations:
       - effect: NoSchedule
-        key: node-role.kubernetes.io/master
-        operator: Equal
-        value: ''
-      - effect: NoSchedule
         key: node-role.kubernetes.io/control-plane
         operator: Equal
         value: ''
+      resources:
+        limits:
+          memory: 4Gi
+        requests:
+          cpu: 100m
+          memory: 128Mi
 
--- HelmRelease: gpu-operator/nvidia-gpu-operator Job: gpu-operator/gpu-operator-upgrade-crd

+++ HelmRelease: gpu-operator/nvidia-gpu-operator Job: gpu-operator/gpu-operator-upgrade-crd

@@ -32,15 +32,15 @@

       - effect: NoSchedule
         key: node-role.kubernetes.io/control-plane
         operator: Equal
         value: ''
       containers:
       - name: upgrade-crd
-        image: nvcr.io/nvidia/gpu-operator:v24.9.2
+        image: nvcr.io/nvidia/gpu-operator:v25.10.0
         imagePullPolicy: IfNotPresent
         command:
-        - /bin/sh
+        - sh
         - -c
         - |
           kubectl apply -f /opt/gpu-operator/nvidia.com_clusterpolicies.yaml; kubectl apply -f /opt/gpu-operator/nvidia.com_nvidiadrivers.yaml; kubectl apply -f /opt/gpu-operator/nfd-api-crds.yaml;
       restartPolicy: OnFailure
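
The upgrade-crd Job applies three CRD manifests before the operator starts. A quick existence check once the Job completes; the first two CRD names follow from the manifest filenames above, while the NFD one is an assumption inferred from the `nfd.k8s-sigs.io` API group seen in the Role diff earlier:

```sh
# First two CRD names follow from the manifest filenames in the Job;
# the NFD one is inferred from the nfd.k8s-sigs.io group (assumption).
kubectl get crd clusterpolicies.nvidia.com \
  nvidiadrivers.nvidia.com \
  nodefeatures.nfd.k8s-sigs.io
```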
 

samip5-bot bot commented Mar 27, 2025

--- k8s/media/apps/gpu/operator/app Kustomization: flux-system/nvidia-gpu-operator HelmRelease: gpu-operator/nvidia-gpu-operator

+++ k8s/media/apps/gpu/operator/app Kustomization: flux-system/nvidia-gpu-operator HelmRelease: gpu-operator/nvidia-gpu-operator

@@ -12,13 +12,13 @@

     spec:
       chart: gpu-operator
       sourceRef:
         kind: HelmRepository
         name: nvidia
         namespace: flux-system
-      version: v24.9.2
+      version: v25.10.0
   install:
     crds: CreateReplace
     remediation:
       retries: 3
   interval: 15m
   maxHistory: 2
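
To apply the bump without waiting for the 15m reconcile interval, a standard Flux nudge works; HelmRelease name and namespace are taken from the diff header above:

```sh
# Kick the HelmRelease immediately after merge instead of waiting 15m:
flux reconcile helmrelease nvidia-gpu-operator \
  -n gpu-operator --with-source
```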

samip5-bot bot force-pushed the renovate/media-gpu-operator-25.x branch from 190e1c6 to 19b47ff on June 12, 2025 20:03
samip5-bot bot changed the title from "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.0 )" to "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.1 )" on Jun 12, 2025
samip5-bot bot changed the title from "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.1 )" to "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.2 )" on Jul 26, 2025
samip5-bot bot force-pushed the renovate/media-gpu-operator-25.x branch from 19b47ff to ce51447 on July 26, 2025 00:07
samip5-bot bot force-pushed the renovate/media-gpu-operator-25.x branch from ce51447 to f2a2aa8 on September 11, 2025 00:06
samip5-bot bot changed the title from "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.2 )" to "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.3 )" on Sep 11, 2025
samip5-bot bot force-pushed the renovate/media-gpu-operator-25.x branch from f2a2aa8 to 46bdf5e on September 19, 2025 20:03
samip5-bot bot changed the title from "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.3 )" to "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.4 )" on Sep 19, 2025
| datasource | package      | from    | to       |
| ---------- | ------------ | ------- | -------- |
| helm       | gpu-operator | v24.9.2 | v25.10.0 |
samip5-bot bot force-pushed the renovate/media-gpu-operator-25.x branch from 46bdf5e to 968b9d7 on October 25, 2025 04:03
samip5-bot bot changed the title from "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.3.4 )" to "feat(helm)!: Update chart gpu-operator ( v24.9.2 → v25.10.0 )" on Oct 25, 2025