[GSD-11523] GPU Reset Behavior on Out-of-Memory with Intel Arc and xe Driver #842

@arguellocarlos

Hi there,

I'm looking for some insight into how GPU reset is handled when running into out-of-memory (OOM) issues.

My system is running:

  • Kernel: 6.15.9.arch1-1
  • Intel oneAPI Base Toolkit: 2025.2
  • Intel Compute Runtime: 25.27.34303.5
  • Driver: xe

Hardware:

  • AMD Ryzen 9 9900X
  • Intel Arc B580
  • 48GB DDR5 RAM @ 6200MHz
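
In case it helps with reproducing the setup, here is a minimal sketch for confirming the kernel version and which kernel driver is bound to each GPU. It assumes the standard Linux sysfs layout under `/sys/class/drm`; the function name `bound_gpu_drivers` is a hypothetical helper, not part of any Intel tool.

```python
import os
import platform
from pathlib import Path


def bound_gpu_drivers(drm_root="/sys/class/drm"):
    """Map each DRM card (card0, card1, ...) to the kernel driver bound to it.

    Hypothetical helper: assumes the usual sysfs layout, where
    <drm_root>/cardN/device/driver is a symlink to the driver directory.
    """
    drivers = {}
    root = Path(drm_root)
    if not root.is_dir():
        return drivers
    for card in sorted(root.iterdir()):
        # Skip connector entries (e.g. card0-DP-1) and render nodes.
        if not card.name.startswith("card") or not card.name[4:].isdigit():
            continue
        driver_link = card / "device" / "driver"
        if driver_link.exists():
            # The last path component of the link target is the driver name.
            drivers[card.name] = os.path.basename(os.path.realpath(driver_link))
    return drivers


if __name__ == "__main__":
    print("Kernel:", platform.release())
    for card, driver in bound_gpu_drivers().items():
        print(f"{card}: {driver}")
```

On a system like the one described here, the Arc card would be expected to report `xe`, though the card numbering depends on the machine.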

When I run AI workloads like image generation, the GPU occasionally runs out of memory. When that happens, the entire desktop freezes and becomes completely unresponsive, requiring a hard reboot to recover. I'm particularly wary of hard resets since I have a couple of mechanical drives configured in a RAID array, and I'd really prefer to avoid any risk of data corruption or filesystem damage.

Labels: Component: Other (component not covered by existing component labels); Type: Question (general question about usage, functionality, or best practices)