Add approximate parameter to GELU activation function #1548

Open
alinpahontu2912 wants to merge 6 commits into dotnet:main from alinpahontu2912:feature/gelu-approximate-parameter

Conversation

@alinpahontu2912
Member

Fixes #1368

Add support for the 'approximate' parameter in GELU, matching PyTorch's torch.nn.GELU(approximate='tanh') functionality.

Changes:

  • Add GELU.Approximate enum with 'none' and 'tanh' values
  • Thread approximate parameter through all layers: native C++, PInvoke, Tensor methods, functional API, and module factory
  • Add new overloads (no breaking changes to existing API)
  • Add test for tanh approximation mode
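For context, the two modes compute different closed forms. A minimal Python sketch (illustrative only, not the TorchSharp implementation) of the exact and tanh-approximate GELU definitions as documented by PyTorch:

```python
import math

def gelu_exact(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation selected by approximate='tanh'
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

for x in (-2.0, -0.5, 0.5, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):.6f}  tanh={gelu_tanh(x):.6f}")
```

The two modes agree to a few decimal places but are not bitwise identical, which is why a test can distinguish the tanh path from the exact one.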

Contributor

Copilot AI left a comment

Pull request overview

Adds support for PyTorch’s approximate mode to GELU (notably "tanh"), threading the option through the native (C++), P/Invoke, Tensor, functional, and module APIs, and adding a regression test.

Changes:

  • Introduces Modules.GELU.Approximate (none / tanh) and plumbs it through nn.GELU and nn.functional.gelu.
  • Extends Tensor gelu/gelu_ to accept an approximation mode and updates the corresponding native/PInvoke signatures.
  • Adds a unit test validating the tanh approximation path and that it differs from the exact mode.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Summary per file:

  • test/TorchSharpTest/NN.cs: Adds a test covering GELU tanh approximation behavior.
  • src/TorchSharp/Tensor/Tensor.cs: Adds gelu/gelu_ overloads that pass the approximation mode through to native.
  • src/TorchSharp/PInvoke/LibTorchSharp.THSTensor.cs: Updates P/Invoke signatures to accept the approximation string.
  • src/TorchSharp/NN/Activation/GELU.cs: Adds the approximation enum and overloads in the module factory and functional API.
  • src/Native/LibTorchSharp/THSTensor.h: Updates native exports for GELU to accept an approximation parameter.
  • src/Native/LibTorchSharp/THSTensor.cpp: Passes the approximation through to torch::gelu / torch::gelu_.

alinpahontu2912 and others added 3 commits March 11, 2026 15:54
- Move Approximate enum from GELU module class to neutral
  TorchSharp namespace as GELUApproximate, removing Tensor/functional
  layer dependency on Modules layer
- Add CharSet, BestFitMapping, ThrowOnUnmappableChar attributes to
  THSTensor_gelu/gelu_ DllImport declarations to match existing
  LPStr-based imports pattern
- Update all references in Tensor.cs, GELU.cs, and tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add support for the 'approximate' parameter in GELU, matching PyTorch's
torch.nn.GELU(approximate='tanh') functionality.

Changes:
- Add GELU.Approximate enum with 'none' and 'tanh' values
- Thread approximate parameter through all layers: native C++, PInvoke,
  Tensor methods, functional API, and module factory
- Add new overloads (no breaking changes to existing API)
- Add test for tanh approximation mode

Fixes dotnet#1368

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.


- Keep original THSTensor_gelu/gelu_ exports unchanged for ABI
  compatibility
- Add new THSTensor_gelu_with_approximate/gelu_with_approximate_
  exports that accept the approximate string parameter
- Add null guard in native code, treating null as 'none'
- Update P/Invoke declarations and managed callers accordingly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
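The null guard described in the commit above amounts to mapping a missing approximate string to the exact mode before dispatch. A hedged Python sketch of that contract (the function name and error handling are illustrative, not the actual native code):

```python
def normalize_approximate(approximate):
    # A null/missing mode is treated as "none" (exact GELU),
    # mirroring the guard this commit adds in the native layer.
    if approximate is None:
        return "none"
    if approximate not in ("none", "tanh"):
        raise ValueError(f"unsupported approximate mode: {approximate!r}")
    return approximate
```

Keeping the original exports unchanged and routing the new parameter through separate *_with_approximate entry points preserves ABI compatibility for callers built against the older native library.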
@alinpahontu2912 force-pushed the feature/gelu-approximate-parameter branch from cf93772 to 3ae09b0 on March 27, 2026 at 12:53

Development

Successfully merging this pull request may close these issues.

GELU does not appear to support approximate tanh

2 participants