[Driver][SYCL][NewOffloadModel] Pull in device libraries at device compile by mdtoguchi · Pull Request #21396 · intel/llvm

mdtoguchi · 2026-02-27T16:32:44Z

Pull in the device libraries during compile time when using the new offload model. This is done by using the -mlink-builtin-bitcode option, passing in each individual device library as needed.

This removes the need to link the device libraries at link time: the driver no longer has to pass the required libraries to the clang-linker-wrapper, and the clang-linker-wrapper no longer has to work out which device libraries are needed at link.

With this move, the *.new.o device libraries are no longer used during the device linking stage by the clang-linker-wrapper. The build has been updated to stop creating these binaries.
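
As an illustration of the description above, here is a minimal sketch of how a driver could turn a list of device library paths into cc1 arguments, emitting one `-mlink-builtin-bitcode` pair per library. The helper name `buildDeviceLibArgs` and any paths passed to it are hypothetical and not taken from this PR; the actual driver code works on `llvm::opt::ArgStringList` rather than `std::vector<std::string>`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): emit one
// "-mlink-builtin-bitcode <path>" pair per device library so that each
// library is linked in during device compilation instead of at link time.
std::vector<std::string>
buildDeviceLibArgs(const std::vector<std::string> &DeviceLibPaths) {
  std::vector<std::string> CC1Args;
  CC1Args.reserve(DeviceLibPaths.size() * 2);
  for (const std::string &Path : DeviceLibPaths) {
    CC1Args.push_back("-mlink-builtin-bitcode");
    CC1Args.push_back(Path);
  }
  return CC1Args;
}
```

An empty input list yields no arguments, which in this sketch stands for the case where no device libraries are needed.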


clang/lib/Driver/ToolChains/Clang.cpp

jinge90 · 2026-03-12T02:06:31Z

Hi, @mdtoguchi @bader
When I tried this PR to reproduce an asan issue, I encountered the following warning:
clang-linker-wrapper: warning: SYCL device library file list is empty
It seems that we should remove this warning.
I also see failures for msan/tsan AOT; let me reproduce them on my side and take a look.
Thanks very much.

Signed-off-by: jinge90 <ge.jin@intel.com>

jinge90 · 2026-03-12T08:39:25Z

The root cause is that msan and tsan include 'device globals', and the workflow of -mlink-builtin-bitcode in the compilation phase differs from the current devicelib linking. Each device image is linked with the required device library bitcode, and those linked device images are then linked together to produce the final device image. If msan/tsan is applied, every device image is linked with the msan/tsan device library bitcode and each one includes the same device global, which is why we see multiple symbol definitions in the pre-CI failures.
I just pushed a small fix to bypass these failures and unblock this PR: it makes all device globals weak. Let us wait for the pre-CI results.

Thanks very much.
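
The workaround described above relies on weak linkage: when several linked images each carry a definition of the same symbol, weak definitions can be merged into one instead of being reported as duplicates. Below is a minimal host-side sketch of the idea using the GNU `weak` attribute, which Clang and GCC accept on ELF targets. The struct and variable names are hypothetical; the real libdevice code applies this to `DeviceGlobal` objects.

```cpp
// Hypothetical stand-in for a sanitizer launch-info object.
struct LaunchInfo {
  unsigned long long Flags = 0;
};

// If this definition ends up in several linked modules, marking it weak
// lets the linker keep a single copy instead of reporting a duplicate
// symbol definition.
__attribute__((weak)) LaunchInfo GlobalLaunchInfo;

// Accessor that only shows the symbol behaves like an ordinary global.
unsigned long long readLaunchFlags() { return GlobalLaunchInfo.Flags; }
```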

clang/lib/Driver/ToolChains/SYCL.cpp


srividya-sundaram · 2026-03-18T02:04:27Z

Will this increase the build time? Especially when compiling for multiple targets or when there are multiple source files?

clang/lib/Driver/ToolChains/SYCL.cpp

srividya-sundaram · 2026-03-18T02:09:30Z

clang/lib/Driver/ToolChains/SYCL.h

@@ -189,6 +188,10 @@ class LLVM_LIBRARY_VISIBILITY SYCLToolChain : public ToolChain {
      const llvm::opt::ArgList &Args,
      llvm::opt::ArgStringList &CC1Args) const override;

+  llvm::SmallVector<BitCodeLibraryInfo, 12>


Can you add documentation comments for all newly added functions?

mdtoguchi · 2026-03-18T16:12:05Z

I did a quick -ftime-trace check against the device compilation and the LinkInModulesPass adds about 3ms-4ms of time to the overall compilation of that source file. Compared to the overall compilation time, this should be negligible.

bader · 2026-03-18T18:04:20Z

This is done by using the -mlink-builtin-bitcode option, passing in each individual device library as needed.

This doesn't sound right. There must be just one library of builtins. We need to extend this pattern to SPIR-V target now -

llvm/clang/lib/Driver/ToolChains/SYCL.cpp, lines 553 to 572 in 06c2553:

    if (TargetTriple.isNVPTX()) {
      if (!NoOffloadLib)
        LibraryList.push_back(
            Args.MakeArgString("devicelib-nvptx64-nvidia-cuda.bc"));
      return LibraryList;
    }
    if (TargetTriple.isAMDGCN()) {
      if (!NoOffloadLib)
        LibraryList.push_back(
            Args.MakeArgString("devicelib-amdgcn-amd-amdhsa.bc"));
      return LibraryList;
    }
    // Ignore no-offloadlib for NativeCPU device library, it provides some
    // critical builtins which must be linked with user's device image.
    if (TargetTriple.isNativeCPU()) {
      LibraryList.push_back(Args.MakeArgString("libsycl-nativecpu_utils.bc"));
      return LibraryList;
    }
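
To make the "one library per target" pattern concrete, here is a self-contained sketch that mirrors the snippet above and adds a SPIR-V case. The `DeviceTarget` enum and the function name are simplified stand-ins for the driver's triple checks, and the SPIR-V file name is a placeholder assumption: no such file is named in this PR.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for the target triple checks in the driver.
enum class DeviceTarget { NVPTX, AMDGCN, NativeCPU, SPIRV };

// Sketch: return the single builtin library for a target.
std::vector<std::string> getDeviceLibList(DeviceTarget Target,
                                          bool NoOffloadLib) {
  std::vector<std::string> LibraryList;
  switch (Target) {
  case DeviceTarget::NVPTX:
    if (!NoOffloadLib)
      LibraryList.push_back("devicelib-nvptx64-nvidia-cuda.bc");
    break;
  case DeviceTarget::AMDGCN:
    if (!NoOffloadLib)
      LibraryList.push_back("devicelib-amdgcn-amd-amdhsa.bc");
    break;
  case DeviceTarget::NativeCPU:
    // Always linked: it provides builtins the device image depends on.
    LibraryList.push_back("libsycl-nativecpu_utils.bc");
    break;
  case DeviceTarget::SPIRV:
    // Hypothetical file name, used only to show the shape of the pattern.
    if (!NoOffloadLib)
      LibraryList.push_back("devicelib-spirv-placeholder.bc");
    break;
  }
  return LibraryList;
}
```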

mdtoguchi · 2026-03-18T19:28:43Z

We can create a single .bc file and link it in similar to the other targets. @bader Is there an issue with pulling in multiple bitcode files?

bader · 2026-03-18T21:30:02Z

Basically, using a single file simplifies compiler implementation, debugging, distribution, etc.
Technically, multiple bitcode files work as well, but using multiple files complicates work on the built-ins library without any other benefit. Here is the GitHub issue related to this topic: #21512.

mdtoguchi · 2026-03-18T22:10:30Z

As there is no current functional problem with multiple bitcode files - I would like to move forward with this PR and work on providing a single devicelib in a separate change. Is that OK?

bader · 2026-03-18T22:12:44Z

Yes. Thanks!

jinge90 · 2026-03-19T01:33:38Z

Hi, @bader @mdtoguchi
I suggest having the following device lib files:
For cmath-related devicelib files -> libm.bc
For others such as string, assert, random -> libc.bc
For Intel math functions -> libimf.bc
For device sanitizer libs: unchanged
This aligns better with the community GPU libc: libm.bc/a and libc.bc/a are provided in LLVM libc for GPU targets. By doing so, we don't need to change driver code when switching to LLVM libc for libdevice functionality.
For the Intel math functions, we can align with the Intel compiler style for normal CPU targets: the Intel compiler package ships libimf.a/so for CPU platforms.
For the sanitizer libs, I suggest keeping separate lib files, since they are not linked by default but are controlled by the option "-fsanitize=xxx".

Thanks very much.
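
The proposed grouping could be sketched as follows. The names libm.bc, libc.bc and libimf.bc come from the proposal itself; the function name and the sanitizer file naming scheme are made-up placeholders, since the proposal leaves the sanitizer libraries unchanged and does not name them here.

```cpp
#include <string>
#include <vector>

// Sketch of the proposed grouping. An empty Sanitizer string means that no
// -fsanitize=<name> option was given.
std::vector<std::string> proposedDeviceLibs(const std::string &Sanitizer) {
  std::vector<std::string> Libs = {
      "libm.bc",   // cmath-related functions
      "libc.bc",   // string, assert, random and similar
      "libimf.bc", // Intel math functions
  };
  // Sanitizer libraries stay separate and are only added on request.
  // The file name below is a hypothetical placeholder.
  if (!Sanitizer.empty())
    Libs.push_back("sanitizer-" + Sanitizer + "-placeholder.bc");
  return Libs;
}
```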

bader · 2026-03-19T15:18:49Z

Sounds good to me. @jinge90, I expect you, not @mdtoguchi, to implement this proposal. Right?

mdtoguchi · 2026-03-19T15:55:23Z

@bader, I will follow this plan - thanks!


jinge90 · 2026-03-20T01:58:39Z

Hi, @bader
Yes, I will do this once this PR is merged.
Thanks very much.

Copilot

Pull request overview

This PR updates the SYCL “new offload model” flow so that SYCL device libraries are pulled in during the device compile step (via -mlink-builtin-bitcode / -mlink-bitcode-file) rather than being handled later during link time (e.g., by clang-linker-wrapper). It also removes generation/usage of the *.new.o/*.new.obj device library artifacts and updates driver tests and diagnostics accordingly.

Changes:

Link SYCL device libraries into SPIR/SPIR-V device compilation with -mlink-builtin-bitcode (and post-opt linking), and add a diagnostic when expected device libraries are missing.
Remove build + usage paths for *.new.o/*.new.obj SYCL device libraries; adjust wrapper/linker behavior and tests to match the new model.
Add weak definitions for certain DeviceGlobal symbols to avoid multiple-definition issues when libraries are pulled in earlier.
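
The missing-library diagnostic mentioned in the list above can be pictured with a small sketch: check each expected device library and collect the ones that cannot be opened, so that the caller can report an error for each. The helper name `findMissingDeviceLibs` is hypothetical and uses plain file streams; the PR itself reports the problem through the `err_drv_no_sycl_device_lib` driver diagnostic.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: return the expected device libraries that cannot be
// opened for reading, in the order they were given.
std::vector<std::string>
findMissingDeviceLibs(const std::vector<std::string> &ExpectedLibPaths) {
  std::vector<std::string> Missing;
  for (const std::string &Path : ExpectedLibPaths) {
    std::ifstream File(Path, std::ios::binary);
    if (!File.good())
      Missing.push_back(Path);
  }
  return Missing;
}
```

A caller would treat a non-empty result as a hard error, matching the PR's view that the device libraries are internal to the compiler and are expected to be present.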

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| libdevice/sanitizer/tsan_rtl.cpp | Marks `__TsanLaunchInfo` as weak to avoid duplicate definitions. |
| libdevice/sanitizer/msan_rtl.cpp | Marks `__MsanLaunchInfo` as weak to avoid duplicate definitions. |
| libdevice/crt_wrapper.cpp | Marks `RandNext` `DeviceGlobal` as weak to avoid duplicate definitions. |
| libdevice/cmake/modules/SYCLLibdevice.cmake | Stops producing/installing “new offload” (`.new.o`/`.new.obj`) device lib variants; simplifies IMF host library build. |
| clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp | Skips the “second device linking step” when no extracted device libs are provided. |
| clang/test/lit.site.cfg.py.in | Exposes `LLVM_ENABLE_PROJECTS` to lit as `config.enable_projects`. |
| clang/test/lit.cfg.py | Adds a `libdevice` feature when `libdevice` is enabled in projects. |
| clang/test/Driver/sycl-spirv-ext.cpp | Restricts test to Linux and fixes line continuation formatting. |
| clang/test/Driver/sycl-spirv-default-options.cpp | Restricts test to Linux. |
| clang/test/Driver/sycl-offload-new-driver.cpp | Removes wrapper option checks that referenced `*.new.o` device libs. |
| clang/test/Driver/sycl-instrumentation.cpp | Updates checks to expect `-mlink-builtin-bitcode` usage instead of wrapper-linked `*.new.o`. |
| clang/test/Driver/sycl-device-lib.cpp | Updates device library expectations from `.new.o` to `.bc` compile-time linking flags. |
| clang/test/Driver/sycl-device-lib-diag.cpp | New test verifying error diagnostic when expected SYCL device libs are missing. |
| clang/lib/Driver/ToolChains/SYCL.h | Refactors device-lib listing APIs: adds `getDeviceLibNames`/`getDeviceLibs` returning `BitCodeLibraryInfo`. |
| clang/lib/Driver/ToolChains/SYCL.cpp | Implements compile-time device-lib injection for SPIR/SPIR-V in the new model; adds missing-lib diagnostic path. |
| clang/lib/Driver/ToolChains/Clang.cpp | Stops passing SPIR/SPIR-V SYCL device libraries to `clang-linker-wrapper` via `-sycl-device-libraries`; uses `--bitcode-library` for non-SPIR targets. |
| clang/lib/Driver/Driver.cpp | Updates device-lib selection to use `SYCLToolChain::getDeviceLibNames` (now `.bc` names) during device-link construction. |
| clang/include/clang/Basic/DiagnosticDriverKinds.td | Adds `err_drv_no_sycl_device_lib` diagnostic used when device libraries can’t be found. |

Comments suppressed due to low confidence (1)

clang/lib/Driver/ToolChains/Clang.cpp:11535

appendToList is declared but never used after the switch to --bitcode-library= arguments, which will trigger an unused-variable warning (and can break builds that treat warnings as errors). Remove this lambda or reintroduce its usage if still needed.

    auto appendToList = [](SmallString<256> &List, const Twine &Arg) {
      if (List.size() > 0)
        List += ",";
      List += Arg.str();
    };

clang/lib/Driver/ToolChains/SYCL.cpp

clang/lib/Driver/Driver.cpp

clang/lib/Driver/ToolChains/Clang.cpp

elizabethandrews

I don't see FE changes other than the diagnostic which LGTM

mdtoguchi · 2026-03-23T17:08:10Z

@intel/llvm-reviewers-runtime and @intel/dpcpp-clang-driver-reviewers, please take a look. Thanks!

hchilama

LGTM

srividya-sundaram · 2026-03-23T18:18:17Z

Can an end-to-end test/integration test be added for the full compile-link pipeline?

clang/test/Driver/sycl-device-lib.cpp

mdtoguchi changed the title ~~[Driver][SYCL][NewOffloadModel] Pull in device libraries at device co…~~ [Driver][SYCL][NewOffloadModel] Pull in device libraries at device compile Feb 27, 2026

mdtoguchi added the new-offload-model Enables testing with NewOffloadModel. label Feb 27, 2026

mdtoguchi added 8 commits February 27, 2026 09:51

Clang format

2c71f31

Fix libs to use full paths

f752b55

clang format

95642b4

Revert to using old way of grabbing device libs for old model

6020fab

Only pull in native_cpu device lib via clang-linker-wrapper

c92741b

Pass along device lib location

2372f9a

clang format

9e63e86

Disable link warnings

3ae1a65

srividya-sundaram reviewed Mar 5, 2026

clang/lib/Driver/ToolChains/Clang.cpp (resolved)

mdtoguchi added 7 commits March 9, 2026 15:34

Mark tests as Linux only, cannot mix Windows .bc files with compile

266927d

Move sanitizer libraries to the clang-linker-wrapper

8ac9664

clang format

5539741

Use -mlink-builtin-bitcode-postopt

24cba66

remove commented line

8f1c6b7

Use -mlink-bitcode-file for sanitizer device libs

d67c134

clang format

5dbc786

jinge90 added 3 commits March 11, 2026 23:16

apply inline to device globals in libdevice

b9ca113

Signed-off-by: jinge90 <ge.jin@intel.com>

use weak to replace inline to bypass CPU bug

6113685

Signed-off-by: jinge90 <ge.jin@intel.com>

remove unneeded inline

975ac13

Signed-off-by: jinge90 <ge.jin@intel.com>

YuriPlyakhin reviewed Mar 12, 2026

clang/lib/Driver/ToolChains/SYCL.cpp (outdated, resolved)

mdtoguchi added 3 commits March 12, 2026 16:49

Perform some refactoring

fb1d160

- Uses BitCodeLibraryInfo for determining how to link in the device libs - Removes warning from clang-linker-wrapper about no device libs passed

Move native_cpu device to compilation step and cleanup sycl opts to CLW

fe93e6e

Revert "Move native_cpu device to compilation step and cleanup sycl opts to CLW"

cb2ab14

This reverts commit fe93e6e.

mdtoguchi marked this pull request as ready for review March 16, 2026 15:56

mdtoguchi requested a review from a team as a code owner March 16, 2026 15:56

srividya-sundaram reviewed Mar 18, 2026

clang/lib/Driver/ToolChains/SYCL.cpp (outdated, resolved)

srividya-sundaram reviewed Mar 18, 2026

Address code review comments.

b56e080

Add multiple target test and TODO

00df735

mdtoguchi added 2 commits March 19, 2026 13:28

Emit a diagnostic when any of the expected device libs are not found

ef42cbd

The device libs are expected and internal to the compiler, emit an error if any are not found. Update lit testing to check if the device libs have been added to the targets, so we can better manipulate the testing.

Merge remote-tracking branch 'intel_llvm/sycl' into link-bitcode-compile

c2447be

mdtoguchi requested a review from a team as a code owner March 19, 2026 20:32

add test for missing device lib diagnostic

2464670

mdtoguchi requested a review from srividya-sundaram March 19, 2026 20:36

jsji requested a review from Copilot March 20, 2026 18:01

Copilot started reviewing on behalf of jsji March 20, 2026 18:02

Copilot AI reviewed Mar 20, 2026

clang/lib/Driver/ToolChains/SYCL.cpp (outdated, resolved)

clang/lib/Driver/Driver.cpp (resolved)

clang/lib/Driver/ToolChains/Clang.cpp (outdated, resolved)

Apply suggested reviews from copilot

becc71b

elizabethandrews approved these changes Mar 23, 2026

hchilama approved these changes Mar 23, 2026

srividya-sundaram reviewed Mar 23, 2026

clang/test/Driver/sycl-device-lib.cpp (resolved)

srividya-sundaram approved these changes Mar 23, 2026
