fix: correct NVIDIA CUDA queue guard in run_opencl_fft #271

Merged: inducer merged 1 commit into inducer:main from xywei:fix-nvidia-cuda-queue-check on Mar 17, 2026

xywei (Contributor) commented Mar 17, 2026

Closes #270.

Summary

  • fix an inverted queue check in run_opencl_fft for the NVIDIA CUDA path
  • raise only when wait_for contains events from a different command queue
  • keep same-queue events valid so the marker workaround behaves as intended

Why

The previous condition was `if not evt.command_queue != queue`, which is logically equivalent to `evt.command_queue == queue`. As a result, it raised in the safe same-queue case and let events from a different queue through without an error.
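
To make the double negation concrete, here is a minimal, self-contained sketch. It does not use pyopencl: `StubQueue` and `StubEvent` are hypothetical stand-ins, and `guard_fixed` is only one plausible way to write the corrected check, not necessarily the exact line merged in this PR.

```python
from dataclasses import dataclass


class StubQueue:
    """Hypothetical stand-in for a command queue; compares by identity."""


@dataclass
class StubEvent:
    """Hypothetical stand-in for an event that remembers its queue."""
    command_queue: StubQueue


MESSAGE = "Different queues not supported with NVIDIA CUDA"


def guard_inverted(wait_for, queue):
    # Condition quoted above: "not (a != b)" reduces to "a == b",
    # so this raises exactly when the event is on the SAME queue.
    for evt in wait_for:
        if not evt.command_queue != queue:
            raise RuntimeError(MESSAGE)


def guard_fixed(wait_for, queue):
    # Assumed corrected form: raise only for events from another queue.
    for evt in wait_for:
        if evt.command_queue != queue:
            raise RuntimeError(MESSAGE)


def raises(guard, wait_for, queue):
    """Return True if the guard rejects the given wait list."""
    try:
        guard(wait_for, queue)
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    q1, q2 = StubQueue(), StubQueue()
    for name, guard in [("inverted", guard_inverted), ("fixed", guard_fixed)]:
        same = raises(guard, [StubEvent(q1)], q1)
        other = raises(guard, [StubEvent(q2)], q1)
        print(f"{name}: same-queue raises={same}, other-queue raises={other}")
```

With these stand-ins, the inverted guard rejects the same-queue wait list and accepts the cross-queue one, while the assumed fix does the opposite.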

Validation

  • reproduced downstream failure in volumential on NVIDIA CUDA before the fix (RuntimeError: Different queues not supported with NVIDIA CUDA)
  • after this one-line change, test/test_volume_fmm.py::test_volume_fmm_laplace passes for NVIDIA CUDA on ipa

inducer merged commit 118ae83 into inducer:main on Mar 17, 2026
9 checks passed
inducer (Owner) commented Mar 17, 2026

Thx!

Development

Successfully merging this pull request may close these issues.

run_opencl_fft: NVIDIA CUDA queue check is inverted (raises on same queue)
