
chore(deps-dev): bump @huggingface/transformers from 3.8.1 to 4.0.1#831

Merged
carlos-alm merged 3 commits into main from
dependabot/npm_and_yarn/huggingface/transformers-4.0.1
Apr 4, 2026
Conversation


@dependabot dependabot bot commented on behalf of github Apr 4, 2026

Bumps @huggingface/transformers from 3.8.1 to 4.0.1.

Release notes

Sourced from @huggingface/transformers's releases.

4.0.0

🚀 Transformers.js v4

We're excited to announce that Transformers.js v4 is now available on NPM! After a year of development (we started in March 2025 🤯), we're finally ready for you to use it.

npm i @huggingface/transformers

Links: YouTube Video, Blog Post, Demo Collection

New WebGPU backend

The biggest change is undoubtedly the adoption of a new WebGPU Runtime, completely rewritten in C++. We've worked closely with the ONNX Runtime team to thoroughly test this runtime across our ~200 supported model architectures, as well as many new v4-exclusive architectures.

In addition to better operator support (for performance, accuracy, and coverage), this new WebGPU runtime allows the same transformers.js code to be used across a wide variety of JavaScript environments, including browsers, server-side runtimes, and desktop applications. That's right, you can now run WebGPU-accelerated models directly in Node, Bun, and Deno!
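A minimal sketch of what server-side WebGPU usage could look like (assumptions: the pipeline factory with device and dtype options carries over from v3 as documented in the transformers.js docs, and the model name is illustrative — any feature-extraction checkpoint works):

```typescript
// Sketch: WebGPU-accelerated embeddings in Node (illustrative, not from
// this PR). The `device` and `dtype` pipeline options follow the
// transformers.js documentation.
import { pipeline } from "@huggingface/transformers";

const extractor = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2",         // illustrative model choice
  { device: "webgpu", dtype: "q8" }  // GPU backend + quantized weights
);

const output = await extractor("Transformers.js v4 runs in Node too.", {
  pooling: "mean",
  normalize: true,
});
console.log(output.dims); // tensor shape of the pooled embedding
```

The same code runs unchanged in the browser, which is the point of the unified runtime.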

We've proven that it's possible to run state-of-the-art AI models 100% locally in the browser, and now we're focused on performance: making these models run as fast as possible, even in resource-constrained environments. This required completely rethinking our export strategy, especially for large language models: we now re-implement models operation by operation, leveraging specialized ONNX Runtime Contrib Operators such as com.microsoft.GroupQueryAttention, com.microsoft.MatMulNBits, and com.microsoft.QMoE to maximize performance.

For example, by adopting the com.microsoft.MultiHeadAttention operator, we achieved a ~4x speedup for BERT-based embedding models.

New models

Thanks to our new export strategy and ONNX Runtime's expanding support for custom operators, we've been able to add many new models and architectures to Transformers.js v4. These include popular models like GPT-OSS, Chatterbox, GraniteMoeHybrid, LFM2-MoE, HunYuanDenseV1, Apertus, Olmo3, FalconH1, and Youtu-LLM. Many of these required us to implement support for advanced architectural patterns, including Mamba (state-space models), Multi-head Latent Attention (MLA), and Mixture of Experts (MoE). Perhaps most importantly, these models are all compatible with WebGPU, allowing users to run them directly in the browser or server-side JavaScript environments with hardware acceleration. We've released several Transformers.js v4 demos so far... and we'll continue to release more!

Additionally, we've added support for larger models exceeding 8B parameters. In our tests, we've been able to run GPT-OSS 20B (q4f16) at ~60 tokens per second on an M4 Pro Max.

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [@huggingface/transformers](https://github.com/huggingface/transformers.js) from 3.8.1 to 4.0.1.
- [Release notes](https://github.com/huggingface/transformers.js/releases)
- [Commits](https://github.com/huggingface/transformers.js/commits)

---
updated-dependencies:
- dependency-name: "@huggingface/transformers"
  dependency-version: 4.0.1
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies (Pull requests that update a dependency file) and javascript (Pull requests that update javascript code) labels Apr 4, 2026
@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR bumps @huggingface/transformers from 3.8.1 to 4.0.1 (a major version) and includes a companion fix commit that replaces the deprecated quantized: true pipeline option with dtype: 'q8', aligning with the v4 API. The peer dependency is correctly marked optional, consistent with the lazy-load embedding architecture.

Confidence Score: 5/5

Safe to merge — the key v4 breaking change (quantized → dtype) is correctly handled, and all remaining feedback is stylistic.

All findings are P2 style suggestions. The only breaking API change in the v3→v4 upgrade that touches this codebase (quantized: true → dtype: 'q8') is addressed in the companion fix commit, and the Hugging Face docs confirm the mapping is correct. No correctness or data-integrity issues are present.

No files require special attention.
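The migration the summary describes amounts to a one-line change in the pipeline options. A sketch of the before/after shape (option names come from this PR and the v4 release notes; the surrounding pipeline call is omitted):

```typescript
// v3 API (removed in v4): a boolean flag selected quantized weights.
const v3Opts = { quantized: true };

// v4 API: an explicit dtype; "q8" selects 8-bit quantized weights.
const v4Opts = { dtype: "q8" as const };
```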

Important Files Changed

Filename Overview
src/domain/search/models.ts Fix commit replaces deprecated quantized: true pipeline option with dtype: 'q8' for v4 compatibility; ModelConfig.quantized field name is now a semantic holdover but the mapping logic is correct.
package.json Bumps @huggingface/transformers from ^3.8.1 to ^4.0.1 in both devDependencies and peerDependencies; peer dependency is marked optional, consistent with the lazy-load architecture.
package-lock.json Lock file updated by dependabot to reflect the new @huggingface/transformers 4.0.1 resolved version and its transitive dependencies.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[loadModel called] --> B{extractor cached\nfor same model?}
    B -- yes --> C[return cached extractor]
    B -- no --> D[disposeModel]
    D --> E[loadTransformers]
    E --> F{config.quantized?}
    F -- true --> G["pipelineOpts = { dtype: 'q8' }"]
    F -- false --> H["pipelineOpts = {}"]
    G --> I["pipeline('feature-extraction', name, opts)"]
    H --> I
    I --> J[activeModel = config.name]
    J --> C
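The flowchart's logic can be sketched in TypeScript as follows (names like ModelConfig, loadModel, and the injected createPipeline are illustrative, not the repository's actual code):

```typescript
// Illustrative sketch of the caching + option-mapping flow above.
type Dtype = "q8" | "fp32";
interface ModelConfig { name: string; quantized: boolean; }
interface Extractor { model: string; dtype?: Dtype; }

let activeModel: string | null = null;
let extractor: Extractor | null = null;

async function loadModel(
  config: ModelConfig,
  createPipeline: (task: string, name: string, opts: { dtype?: Dtype }) => Promise<Extractor>,
): Promise<Extractor> {
  // Cache hit: the same model is already loaded.
  if (extractor && activeModel === config.name) return extractor;
  extractor = null;   // disposeModel
  activeModel = null;
  // v4 replaces `quantized: true` with an explicit dtype.
  const opts = config.quantized ? { dtype: "q8" as Dtype } : {};
  extractor = await createPipeline("feature-extraction", config.name, opts);
  activeModel = config.name;
  return extractor;
}
```

Injecting createPipeline here stands in for the lazy loadTransformers step in the flowchart and makes the mapping easy to unit-test.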

Reviews (2): Last reviewed commit: "fix: replace deprecated quantized option..."

The quantized pipeline option was removed in @huggingface/transformers v4.
Without this fix, the minilm model loads in fp32 precision (~92MB) instead
of q8 (~23MB), quadrupling memory usage.
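A quick back-of-the-envelope check of these figures (assumption: all-MiniLM-L6-v2 has roughly 22.7M parameters; fp32 stores each weight in 4 bytes, q8 in 1):

```typescript
// Rough check of the ~92MB vs ~23MB memory claim.
const params = 22_700_000;             // assumed MiniLM-L6-v2 parameter count
const fp32MB = (params * 4) / 1e6;     // ~91 MB, close to the ~92MB quoted
const q8MB = (params * 1) / 1e6;       // ~23 MB
const ratio = fp32MB / q8MB;           // 4x, hence "quadrupling memory usage"
```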
@carlos-alm
Contributor

Addressed the P1 finding — replaced { quantized: true } with { dtype: 'q8' } in src/domain/search/models.ts (line 197). This is the v4 API for selecting quantized model weights. All 71 search tests pass.

@greptileai

@carlos-alm carlos-alm merged commit dd88a2d into main Apr 4, 2026
4 checks passed
@carlos-alm carlos-alm deleted the dependabot/npm_and_yarn/huggingface/transformers-4.0.1 branch April 4, 2026 20:49
@github-actions github-actions bot locked and limited conversation to collaborators Apr 4, 2026