@Snehal-Reddy (Contributor)

This PR fixes a bug in the load_scope! macro in crates/cuda_std/src/atomic/intrinsics.rs.

Previously, the macro was using the $scope identifier (e.g., device, block) when generating the PTX instruction string. This resulted in invalid assembly instructions like ld.relaxed.device.u32.

This change updates the macro to use $scope_asm (e.g., gpu, cta), ensuring valid PTX suffixes are generated (e.g., ld.relaxed.gpu.u32).
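
To illustrate the difference, here is a minimal host-side sketch. It is not the actual load_scope! macro from cuda_std: ptx_load_mnemonic!, its buggy/fixed selector, and the helper functions are hypothetical names made up for this example, and the macro only builds the instruction string instead of emitting inline assembly. The device/gpu and block/cta pairs are the ones named in the description above.

```rust
// Hypothetical, simplified stand-in for the real macro: it only builds the
// PTX mnemonic as a string so the two variants can be compared on the host.
// `$scope` is the Rust-side scope name; `$scope_asm` is the PTX scope suffix.
macro_rules! ptx_load_mnemonic {
    // Behaviour before the fix: the Rust-side name leaks into the mnemonic.
    ($ordering:ident, $scope:ident, $scope_asm:ident, buggy) => {
        concat!("ld.", stringify!($ordering), ".", stringify!($scope), ".u32")
    };
    // Behaviour after the fix: the PTX scope suffix is used instead.
    ($ordering:ident, $scope:ident, $scope_asm:ident, fixed) => {
        concat!("ld.", stringify!($ordering), ".", stringify!($scope_asm), ".u32")
    };
}

/// Mnemonic produced for device scope before the fix.
fn buggy_device() -> &'static str {
    ptx_load_mnemonic!(relaxed, device, gpu, buggy)
}

/// Mnemonic produced for device scope after the fix.
fn fixed_device() -> &'static str {
    ptx_load_mnemonic!(relaxed, device, gpu, fixed)
}

/// Mnemonic produced for block scope after the fix.
fn fixed_block() -> &'static str {
    ptx_load_mnemonic!(relaxed, block, cta, fixed)
}
```

In this sketch, buggy_device() yields ld.relaxed.device.u32, while fixed_device() and fixed_block() yield ld.relaxed.gpu.u32 and ld.relaxed.cta.u32. The only change between the two arms is which metavariable is stringified, which mirrors the one-identifier nature of the fix described here.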

fixes #354

Linked issue: Invalid PTX generation for atomic load/store: incorrect scope suffix