
[Bug]: nvidia-driver-daemonset podAntiAffinity causes permanent Misscheduled state on multi-GPU node clusters #2255

@dor1a

Description


Describe the bug

On clusters with two or more GPU nodes, the nvidia-driver-daemonset DaemonSet permanently reports
Desired=1 and Misscheduled=1, even though a driver pod is Running and Ready on every GPU node.

The root cause is a requiredDuringSchedulingIgnoredDuringExecution podAntiAffinity
rule hardcoded in assets/state-driver/0500_daemonset.yaml:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app.kubernetes.io/component
          operator: In
          values:
          - nvidia-driver
      topologyKey: kubernetes.io/hostname

This conflicts with the fundamental behavior of a DaemonSet, which is supposed to run
one pod on every eligible node. The DaemonSet controller marks the pod on the second
node as misscheduled because the anti-affinity rule prohibits scheduling it there, but
it cannot evict the pod due to IgnoredDuringExecution. The result is a permanent
Misscheduled=1 state.

This rule cannot be overridden via daemonsets.affinity: {} in values.yaml
or ClusterPolicy, because it is injected directly from the asset file.
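For reference, this is a minimal sketch of the override that has no effect (assuming the chart-level daemonsets key in the GPU Operator's values.yaml; the driver DaemonSet's affinity still comes from the asset manifest):

```yaml
# values.yaml -- attempted override; has no effect on nvidia-driver-daemonset
# because its podAntiAffinity is hardcoded in assets/state-driver/0500_daemonset.yaml
daemonsets:
  affinity: {}
```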

Reproduced on both v25.3.4 and v26.3.0.

To Reproduce

  1. Deploy GPU Operator on a cluster with 2 or more GPU nodes
  2. Run kubectl get ds nvidia-driver-daemonset -n gpu-operator
  3. Observe DESIRED=1, AVAILABLE=1, but driver pods running on all GPU nodes
  4. Run kubectl describe ds nvidia-driver-daemonset -n gpu-operator | grep Misscheduled
  5. Observe Number of Nodes Misscheduled: 1

Expected behavior

DESIRED should match the number of GPU nodes, and Misscheduled should be 0.
DaemonSets already guarantee one pod per node, so requiredDuringScheduling
podAntiAffinity is redundant and causes incorrect state reporting on multi-GPU-node clusters.

Suggested Fix

Replace requiredDuringSchedulingIgnoredDuringExecution with
preferredDuringSchedulingIgnoredDuringExecution, or remove the podAntiAffinity
block entirely from assets/state-driver/0500_daemonset.yaml.
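If the first option is taken, the relaxed rule would look roughly like this in 0500_daemonset.yaml (same label selector and topology key as the current rule; note that the preferred form requires wrapping the term in weight/podAffinityTerm per the Kubernetes API):

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/component
            operator: In
            values:
            - nvidia-driver
        topologyKey: kubernetes.io/hostname
```

With a preferred rule the DaemonSet controller no longer counts the pods on additional GPU nodes as misscheduled, since the constraint is a scheduling preference rather than a hard requirement.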

Environment

  • GPU Operator Version: v25.3.4 (also reproduced on v26.3.0)
  • OS: Ubuntu 24.04
  • Kernel Version: 6.8.0-87-generic
  • Container Runtime Version: containerd (k3s embedded)
  • Kubernetes Distro and Version: k3s v1.33.5

Information to attach

NAME                                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE
nvidia-driver-daemonset                    1         1         1       1            1
nvidia-operator-validator                  1         1         1       1            1

Both daemonsets show DESIRED=1 while driver pods are Running/Ready on 2 GPU nodes:

nvidia-driver-daemonset-xxxxx   1/1   Running   k3s-gpu-worker1
nvidia-driver-daemonset-yyyyy   1/1   Running   k3s-gpu-worker2
Number of Nodes Misscheduled: 1

Labels

    bug, needs-triage
