Fix CUDA tests by aaronj0 · Pull Request #866 · compiler-research/CppInterOp

aaronj0 · 2026-03-23T16:02:12Z

Drop hardcoded CUDA arch provided to the IncrementalCompilerBuilder in createClangInterpreter. While debugging issues with the CUDA support that seems broken on master, after fixing the interpreter creation, I found that the second interpreter creation held an error in cudaGetLastError. CUDA and HIP reuse that option so it would not indicate whether the interpreter was CUDA enabled. All 4 CUDA tests pass locally with an sm_89 device, LLVM 21

github-actions

clang-tidy made some suggestions

github-actions · 2026-03-23T16:22:46Z

+  auto sym = (ret(*)(__VA_ARGS__))Lib.getAddressOfSymbol(#sym);                \
+  if (!sym)                                                                    \
+    return "";
+  LOAD(cuInit, int, unsigned)


warning: do not use C-style cast to convert between unrelated types [cppcoreguidelines-pro-type-cstyle-cast]

LOAD(cuInit, int, unsigned) ^

Additional context

lib/CppInterOp/Compatibility.h:222: expanded from macro 'LOAD'

auto sym = (ret(*)(__VA_ARGS__))Lib.getAddressOfSymbol(#sym); \ ^

github-actions · 2026-03-23T16:22:47Z

+  if (!sym)                                                                    \
+    return "";
+  LOAD(cuInit, int, unsigned)
+  LOAD(cuDeviceGet, int, uint32_t*, int)


warning: no header providing "uint32_t" is directly included [misc-include-cleaner]

lib/CppInterOp/Compatibility.h:14:

+ #include <cstdint>

github-actions · 2026-03-23T16:22:47Z

+      cuDeviceGetAttribute(&maj, /*MAJOR*/ 75, dev) ||
+      cuDeviceGetAttribute(&min, /*MINOR*/ 76, dev))
+    return "";
+  return "sm_" + std::to_string(maj) + std::to_string(min);


warning: no header providing "std::to_string" is directly included [misc-include-cleaner]

return "sm_" + std::to_string(maj) + std::to_string(min); ^

github-actions · 2026-03-23T16:22:47Z

-static bool HasCudaSDK() {
-  auto supportsCudaSDK = []() {
+// perform SDK and runtime checks on a single CUDA interpreter
+static std::pair<bool, bool> CheckCudaSupport() {


warning: no header providing "std::pair" is directly included [misc-include-cleaner]

unittests/CppInterOp/CUDATest.cpp:9:

+ #include <utility>

mcbarton · 2026-03-23T17:22:54Z

-    CB.SetOffloadArch("sm_35");
+    for (const auto* arg : args)
+      if (llvm::StringRef(arg).starts_with("--offload-arch="))
+        OffloadArch = llvm::StringRef(arg).substr(strlen("--offload-arch="));


This is probably a stupid question. Do we have any logic to check that the user provided gpu architechture is a valid one? I'm saying this assuming the answer is yes.

mcbarton · 2026-03-23T17:25:34Z

  compat::Interpreter* interp = &getInterp(I);
  const auto& LO = interp->getCI()->getLangOpts();
+
+  // CUDA and HIP reuse C++ language standards, so LangStd alone reports CXX.


This comment isn't specific to this PR, but given we are looking at gpu related stuff, can the interpreter handle, c++ + openmp (gpu offloading)?

I believe you need to pass -fopenmp to the interpreter, but so far only CPU has been tested and not GPU offloading

vgvassilev · 2026-03-24T05:44:48Z

+      if (llvm::StringRef(arg).starts_with("--offload-arch="))
+        OffloadArch = llvm::StringRef(arg).substr(strlen("--offload-arch="));


I guess we will need some generic way to rely on what's already in clang. Here is what gemini proposes and I think it is worth investigating:

#include "clang/Driver/CudaInstallationDetector.h" #include "clang/Driver/Options.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/TargetParser/Host.h" // ... inside createClangInterpreter ... // 1. Parse the raw args into a format the Detector understands const auto &Opts = clang::driver::getDriverOptTable(); unsigned MissingArgIndex, MissingArgCount; // Use 0 as the included/excluded flags for a general parse auto InputArgs = Opts.ParseArgs(args, MissingArgIndex, MissingArgCount); // 2. Identify the Triple (Host is usually the intended target for JIT) llvm::Triple HostTriple(llvm::sys::getDefaultTargetTriple()); // 3. Initialize the Detector (No CI needed!) // The detector will look for --cuda-path in InputArgs automatically clang::driver::CudaInstallationDetector Detector( /*Driver=*/ {}, // We can pass an empty driver object or a dummy HostTriple, InputArgs); if (CudaEnabled && Detector.isValid()) { // 4. Resolve the Architecture std::string SelectedArch; if (auto *Arg = InputArgs.getLastArg(clang::driver::options::OPT_offload_arch_EQ)) { SelectedArch = Arg->getValue(); } else { SelectedArch = detectNVIDIAGPUArch(); } // 5. Validation Logic if (!SelectedArch.empty()) { clang::CudaArch CArch = clang::StringToCudaArch(SelectedArch); // If the hardware is newer than the SDK, downgrade to the SDK's max supported arch if (Detector.version() < clang::MinVersionForCudaArch(CArch)) { CArch = clang::MaxVersionForCudaArch(Detector.version()); SelectedArch = clang::CudaArchToString(CArch); } CB.SetOffloadArch(SelectedArch); } }

The advantage is that now you know where the sdk is and you also can pass the relevant include paths automatically:

if (CudaEnabled && Detector.isValid()) { // 1. Add the CUDA include directory (e.g., /usr/local/cuda/include) // We use -isystem to treat them as system headers (suppressing warnings) std::string IncludeFlag = "-isystem" + Detector.getIncludePath(); // Note: Ensure the string's lifetime matches the 'args' usage. // If 'args' is a vector of const char*, you may need to manage the allocation. args.push_back(strdup(IncludeFlag.c_str())); // 2. Add the path to the CUDA builtins (libdevice) // This is required for math functions like __sinf, __expf, etc. std::string LibDevicePath = Detector.getLibDevicePath(); if (!LibDevicePath.empty()) { std::string LinkBitcode = "--cuda-device-lib=" + LibDevicePath; args.push_back(strdup(LinkBitcode.c_str())); } }

This is somewhat what the current implementation does, we use clang::Driver to create the clang::driver::CudaInstallationDetector and iterate over the paths it inserts based on the CUDA directory

This is done by creating a compilation object with the driver and fetching the arg with this clang::driver::options::OPT_cuda_path_EQ enum value

codecov · 2026-03-25T05:54:14Z

Codecov Report

❌ Patch coverage is 91.78082% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.86%. Comparing base (8c06fd6) to head (3587c83).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
lib/CppInterOp/Compatibility.h	91.78%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #866      +/-   ##
==========================================
+ Coverage   78.67%   78.86%   +0.19%     
==========================================
  Files          11       11              
  Lines        4014     4079      +65     
==========================================
+ Hits         3158     3217      +59     
- Misses        856      862       +6

Files with missing lines	Coverage Δ
lib/CppInterOp/Compatibility.h	`87.50% <91.78%> (+2.06%)`	⬆️

Files with missing lines	Coverage Δ
lib/CppInterOp/Compatibility.h	`87.50% <91.78%> (+2.06%)`	⬆️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

github-actions

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 10 out of 17. Check the log or trigger a new build to see more.

github-actions

clang-tidy made some suggestions

github-actions

clang-tidy made some suggestions

mcbarton · 2026-03-25T10:06:27Z

This PR needs rebasing to get rid of the Emscripten llvm 20 jobs.

aaronj0 · 2026-03-25T10:13:16Z

I have some fixups to increase code coverage..

vgvassilev

LGTM!

github-actions

clang-tidy made some suggestions

github-actions

clang-tidy made some suggestions

`compat::createClangInterpreter` now auto detects the CUDA SDK path via CudaInstallationDetector, and the GPU compute capability (sm_xx) using the CUDA Driver API (cuDeviceGetAttribute). User provided flags still get first priority. This makes CUDA support with CppInterOp much more robust.

Consolidate SDK an dRuntime checks into a fixture that only runs once. Adds tests for `compat::detectCudaInstallPath` and `compat::detectNVPTXArch` While debugging issues with the CUDA support, after fixing the interpreter creation, I found that the second interpreter creation held an error in cudaGetLastError because we perform the check sequentially for HasCudaRuntime. Now, we have a single `CheckCudaSupport` that performs both checks at once. Passes locally

aaronj0 requested a review from vgvassilev March 23, 2026 16:05

github-actions Bot reviewed Mar 23, 2026

View reviewed changes

vgvassilev reviewed Mar 23, 2026

View reviewed changes

Comment thread lib/CppInterOp/Compatibility.h Outdated

mcbarton reviewed Mar 23, 2026

View reviewed changes

vgvassilev reviewed Mar 24, 2026

View reviewed changes

Comment thread lib/CppInterOp/Compatibility.h Outdated

Comment thread lib/CppInterOp/Compatibility.h Outdated

Comment thread lib/CppInterOp/Compatibility.h Outdated

Comment thread unittests/CppInterOp/CUDATest.cpp Outdated

vgvassilev reviewed Mar 24, 2026

View reviewed changes

vgvassilev force-pushed the fix-cuda-tests branch from b6c9e3a to 657cf09 Compare March 25, 2026 05:40

aaronj0 force-pushed the fix-cuda-tests branch 2 times, most recently from 75923f6 to d5fea35 Compare March 25, 2026 08:55

github-actions Bot reviewed Mar 25, 2026

View reviewed changes

aaronj0 force-pushed the fix-cuda-tests branch from d5fea35 to 1534a6b Compare March 25, 2026 09:24

github-actions Bot reviewed Mar 25, 2026

View reviewed changes

aaronj0 force-pushed the fix-cuda-tests branch from 1534a6b to a01fd41 Compare March 25, 2026 09:44

aaronj0 mentioned this pull request Mar 25, 2026

Fix GetLanguage for CUDA/HIP where CXX standard was returned #870

Merged

vgvassilev reviewed Mar 25, 2026

View reviewed changes

Comment thread lib/CppInterOp/Compatibility.h Outdated

vgvassilev reviewed Mar 25, 2026

View reviewed changes

Comment thread lib/CppInterOp/Compatibility.h Outdated

aaronj0 force-pushed the fix-cuda-tests branch from a01fd41 to 0947a76 Compare March 25, 2026 09:59

vgvassilev reviewed Mar 25, 2026

View reviewed changes

Comment thread unittests/CppInterOp/CUDATest.cpp Outdated

github-actions Bot reviewed Mar 25, 2026

View reviewed changes

Comment thread lib/CppInterOp/Compatibility.h

Comment thread lib/CppInterOp/Compatibility.h

Comment thread unittests/CppInterOp/CUDATest.cpp Outdated

Comment thread unittests/CppInterOp/CUDATest.cpp

vgvassilev force-pushed the fix-cuda-tests branch from 0947a76 to 235f249 Compare March 25, 2026 10:11

aaronj0 force-pushed the fix-cuda-tests branch 2 times, most recently from 41bdb5f to fdc6461 Compare March 25, 2026 10:53

vgvassilev approved these changes Mar 25, 2026

View reviewed changes

github-actions Bot reviewed Mar 25, 2026

View reviewed changes

Comment thread unittests/CppInterOp/CUDATest.cpp Outdated

Comment thread unittests/CppInterOp/CUDATest.cpp Outdated

Comment thread unittests/CppInterOp/CUDATest.cpp

aaronj0 force-pushed the fix-cuda-tests branch 2 times, most recently from cfaaf7d to d94e0ec Compare March 25, 2026 12:19

github-actions Bot reviewed Mar 25, 2026

View reviewed changes

Comment thread unittests/CppInterOp/CUDATest.cpp

Comment thread unittests/CppInterOp/CUDATest.cpp

aaronj0 mentioned this pull request Mar 25, 2026

[cuda] Add CUB BlockReduce test #872

Merged

vgvassilev force-pushed the fix-cuda-tests branch from d94e0ec to 441f174 Compare March 25, 2026 14:51

aaronj0 added 2 commits March 25, 2026 16:53

aaronj0 force-pushed the fix-cuda-tests branch from 441f174 to 3587c83 Compare March 25, 2026 15:54

vgvassilev merged commit 80d9655 into compiler-research:main Mar 25, 2026
31 of 33 checks passed

		if (llvm::StringRef(arg).starts_with("--offload-arch="))
		OffloadArch = llvm::StringRef(arg).substr(strlen("--offload-arch="));

Conversation

aaronj0 commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mcbarton Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

mcbarton Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

aaronj0 Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

vgvassilev Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

aaronj0 Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

aaronj0 Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aaronj0 commented Mar 23, 2026 •

edited

Loading

codecov Bot commented Mar 25, 2026 •

edited

Loading