
[webgpu] Set is_channels_last to true by default in ComputeMatMul#27674

Merged
guschmue merged 7 commits into microsoft:main from Jiawei-Shao:use-optional-matmul on Apr 9, 2026

Conversation

@Jiawei-Shao (Contributor) commented Mar 16, 2026

This patch sets `is_channels_last` to true by default in the parameter of `ComputeMatMul` and ignores it in `UseSplitK` when there is no `bias`.
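The defaulting behavior can be sketched in isolation. The helper below is hypothetical (the real `ComputeMatMul` takes tensors and a compute context); it only illustrates how a `std::optional<bool>` parameter lets bias-less callers omit the flag while the helper falls back to channels-last:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the parameter handling in ComputeMatMul:
// callers without a bias omit the argument, and the helper falls back
// to channels-last (the new default) instead of forcing every call
// site to pick a value.
bool ResolveChannelsLast(std::optional<bool> is_channels_last = std::nullopt) {
  return is_channels_last.value_or(true);  // "true by default"
}
```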

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Mar 16, 2026
@Jiawei-Shao (Contributor, Author) commented:

@qjia7 PTAL, thanks!


Copilot AI left a comment


Pull request overview

This PR updates the WebGPU MatMul/Conv Split-K plumbing to treat is_channels_last as an optional signal (only meaningful when bias is present), reducing ambiguity for bias-less call sites and enabling the Split-K MatMul path.

Changes:

  • Change is_channels_last parameters to std::optional<bool> across WebGPU MatMul helpers and Split-K configuration.
  • Update shader-generation helpers (MatMulWriteFnSourceForMatMul) to accept std::optional<bool> and std::string_view.
  • Adjust WebGPU Conv and WebGPU BERT Attention call sites to only pass is_channels_last when bias is used.
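The call-site pattern in the last bullet above can be sketched as follows; `ChannelsLastArg` is an illustrative helper, not an ORT function:

```cpp
#include <cassert>
#include <optional>

// Illustrative helper (not part of ONNX Runtime): Conv and Attention
// call sites forward the layout flag only when a bias tensor
// participates in the MatMul; otherwise they pass std::nullopt.
std::optional<bool> ChannelsLastArg(bool has_bias, bool is_channels_last) {
  if (has_bias) {
    return is_channels_last;  // bias present: the layout matters
  }
  return std::nullopt;  // no bias: leave the flag disengaged
}
```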

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 3 comments.

Summary per file:

  • onnxruntime/core/providers/webgpu/webgpu_utils.h — Updates the Split-K config API to accept an optional `is_channels_last`.
  • onnxruntime/core/providers/webgpu/webgpu_utils.cc — Updates the Split-K gating logic to consult `is_channels_last` only when it is provided.
  • onnxruntime/core/providers/webgpu/vendor/intel/math/matmul.h — Simplifies the Intel MatMul subgroup program interface (bias removed).
  • onnxruntime/core/providers/webgpu/vendor/intel/math/matmul.cc — Updates Intel subgroup shader generation for the new MatMul write-helper signature.
  • onnxruntime/core/providers/webgpu/nn/conv.cc — Passes `is_channels_last` only when the Conv MatMul path includes a bias.
  • onnxruntime/core/providers/webgpu/math/matmul_packed.h — Updates the MatMul program API to carry an optional `is_channels_last`.
  • onnxruntime/core/providers/webgpu/math/matmul_packed.cc — Treats bias presence as `is_channels_last_.has_value()` and threads the optional into shader generation.
  • onnxruntime/core/providers/webgpu/math/matmul.h — Makes `ComputeMatMul` accept an optional `is_channels_last` with a default of `{}`.
  • onnxruntime/core/providers/webgpu/math/matmul.cc — Enforces consistency between bias presence and `is_channels_last` engagement; wires the optional into Split-K selection and MatMul program creation.
  • onnxruntime/core/providers/webgpu/math/gemm_utils.h — Updates the MatMul write-helper signature to an optional `is_channels_last` and `std::string_view`.
  • onnxruntime/core/providers/webgpu/math/gemm_utils.cc — Implements optional-aware bias handling in the MatMul write helper.
  • onnxruntime/core/providers/webgpu/math/gemm_packed.cc — Updates the Split-K selection call site to pass `std::nullopt` for `is_channels_last`.
  • onnxruntime/contrib_ops/webgpu/bert/attention.cc — Builds the MatMul input list conditionally and passes `is_channels_last` only with a bias.
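The Split-K gating described for `webgpu_utils.cc` and `matmul.cc` can be sketched like this. `SplitKEligible` and the channels-last requirement are assumptions for illustration only; the real `UseSplitK` checks additional shape and device conditions:

```cpp
#include <cassert>
#include <optional>

// Sketch of the gating idea: consult is_channels_last only when the
// caller supplied it (i.e. when a bias is present). An unset optional
// no longer blocks the Split-K MatMul path for bias-less call sites.
bool SplitKEligible(bool other_conditions_ok,
                    std::optional<bool> is_channels_last) {
  if (!other_conditions_ok) return false;
  if (!is_channels_last.has_value()) return true;  // no bias: flag ignored
  return *is_channels_last;  // bias present: assume channels-last is required
}
```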


Comment thread onnxruntime/core/providers/webgpu/math/matmul_packed.h Outdated
Comment thread onnxruntime/core/providers/webgpu/math/gemm_utils.h Outdated
Comment thread onnxruntime/core/providers/webgpu/math/matmul.cc
Comment thread onnxruntime/core/providers/webgpu/math/matmul.h Outdated
This patch sets `is_channels_last` to true by default in the parameter
of `ComputeMatMul` and ignores it in `UseSplitK` when there is no
`bias`.
@Jiawei-Shao Jiawei-Shao changed the title from "[webgpu] Always pass is_channels_last with std::optional" to "[webgpu] Set is_channels_last to true by default in ComputeMatMul" on Mar 27, 2026
@Jiawei-Shao Jiawei-Shao requested review from Copilot and qjia7 March 27, 2026 08:29

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.



Comment thread onnxruntime/core/providers/webgpu/webgpu_utils.cc Outdated
Comment thread onnxruntime/core/providers/webgpu/math/matmul.cc Outdated
Comment thread onnxruntime/core/providers/webgpu/webgpu_utils.cc Outdated
@Jiawei-Shao Jiawei-Shao requested a review from qjia7 March 30, 2026 07:18
Comment thread onnxruntime/core/providers/webgpu/webgpu_utils.h Outdated
@Jiawei-Shao Jiawei-Shao requested a review from qjia7 March 31, 2026 03:17
@Jiawei-Shao (Contributor, Author) commented:

Hi @guschmue, could you take a look at this PR?

@guschmue guschmue enabled auto-merge (squash) April 2, 2026 01:18

guschmue commented Apr 2, 2026

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@Jiawei-Shao (Contributor, Author) commented:

The errors are not related to this PR:

The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

@Jiawei-Shao
Contributor Author

The failures on DirectML CI are not related to this PR.


guschmue commented Apr 8, 2026

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@guschmue guschmue closed this Apr 8, 2026
auto-merge was automatically disabled April 8, 2026 19:41

Pull request was closed

@guschmue guschmue reopened this Apr 8, 2026
@guschmue guschmue merged commit 9e3614b into microsoft:main Apr 9, 2026
173 of 264 checks passed

Labels

ep:WebGPU ort-web webgpu provider


4 participants