[Bug]: Latest release does not work on Talos Linux because of incorrect version parsing #2239

@finomen

Describe the bug
The latest release fails to list GPUs because it cannot parse the OS version:
{"level":"error","ts":1774112196.4380157,"msg":"Reconciler error","controller":"clusterpolicy-controller","object":{"name":"cluster-policy"},"namespace":"","name":"cluster-policy","reconcileID":"991b62e2-34f1-4844-9ed3-fe049d7a5c04","error":"failed to retrieve GPU node OS tag: error processing OS major version v1: strconv.Atoi: parsing "v1": invalid syntax"}

The annotations on the node are the following:
"feature.node.kubernetes.io/system-os_release.ID": "talos",
"feature.node.kubernetes.io/system-os_release.VERSION_ID": "v1.12.6",
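
For reference, the failure can be reproduced outside the operator with a few lines of Go. This is only a sketch: `naiveMajor` is a hypothetical stand-in, not the operator's actual function, and it assumes the major version is taken as everything before the first dot of VERSION_ID and passed straight to `strconv.Atoi`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// naiveMajor is a hypothetical stand-in for the parsing step: it takes
// everything before the first dot and converts it with strconv.Atoi.
func naiveMajor(versionID string) (int, error) {
	major := strings.SplitN(versionID, ".", 2)[0]
	n, err := strconv.Atoi(major)
	if err != nil {
		return 0, fmt.Errorf("error processing OS major version %s: %w", major, err)
	}
	return n, nil
}

func main() {
	// Talos reports VERSION_ID with a leading "v", so the major part is "v1".
	_, err := naiveMajor("v1.12.6")
	fmt.Println(err)
	// prints: error processing OS major version v1: strconv.Atoi: parsing "v1": invalid syntax
}
```

A VERSION_ID without the prefix, such as "22.04", goes through the same function without error, which would explain why only distributions like Talos are affected.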

Operator 25.10.1 works fine, so this was likely broken in 3fb6dd3.

To Reproduce
Install the operator on Talos Linux.

Expected behavior
The operator works on Talos Linux.
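
One way the version could be handled is to tolerate a leading "v" before converting the major part. The sketch below is an assumption about what a fix might look like, not the operator's code: `majorVersion` is a hypothetical helper, and whether stripping the prefix is the right behavior for the OS tag is for the maintainers to decide.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion is a hypothetical helper: it trims whitespace and an optional
// leading "v", then parses everything before the first dot as an integer.
func majorVersion(versionID string) (int, error) {
	trimmed := strings.TrimPrefix(strings.TrimSpace(versionID), "v")
	major, _, _ := strings.Cut(trimmed, ".")
	n, err := strconv.Atoi(major)
	if err != nil {
		return 0, fmt.Errorf("error processing OS major version %q: %w", versionID, err)
	}
	return n, nil
}

func main() {
	for _, id := range []string{"v1.12.6", "22.04", "9"} {
		n, err := majorVersion(id)
		fmt.Println(id, n, err)
	}
}
```

With this helper, "v1.12.6" yields 1, while values such as "22.04" and "9" parse as before; an empty or non-numeric VERSION_ID still returns an error.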

Environment (please provide the following information):

  • GPU Operator Version: 26.3.0
  • OS: Talos 1.12.6
  • Kernel Version: 6.18.18-talos
  • Container Runtime Version: 2.1.6
  • Kubernetes Distro and Version: v1.34.0

Labels: bug