
[PyTorch FE] Add 2-bit (u2) weight decompression subgraph support #34542

Draft
ljaljushkin wants to merge 2 commits into openvinotoolkit:master from ljaljushkin:nl/torch_fe_2bit_subgraph_support

Conversation

@ljaljushkin (Contributor) commented Mar 6, 2026

Details:

  • Added u2_compression_stack pattern matcher in utils_quantize.cpp that detects the NNCF
    unpack_uint2 subgraph: aten::stack([bitwise_and(packed, 3), bitwise_and(bitwise_right_shift(packed, 2), 3), bitwise_and(bitwise_right_shift(packed, 4), 3), bitwise_and(bitwise_right_shift(packed, 6), 3)], dim=-1)
    and replaces it with a single element::u2 constant.
  • Added U2ConvertReshape transformation pass that folds Reshape on u2 constants
    into the constant itself (analogous to U4ConvertReshape for u4).
  • Added MarkCompressedWeightConstants transformation pass that marks Convert nodes
    consuming u2/u4/i4 constants with disable_constant_folding and mark_as_decompression
    to prevent MOC from expanding compressed weight constants.
  • Integrated u2 pattern detection in both translate_stack (TorchScript) and
    translate_stack_fx (torch.compile) code paths.
  • Added u2 → u8 type promotion in CPU plugin's transformation_pipeline.cpp.
  • No need for a standalone aten::bitwise_right_shift op converter — the pattern matcher
    consumes the entire unpack subgraph at the aten::stack level.
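To make the matched pattern concrete, here is a minimal NumPy sketch of the unpack semantics the matcher recognizes: each packed uint8 byte holds four 2-bit values (lowest bits first), and the four `bitwise_and`/`bitwise_right_shift` branches of the `aten::stack` recover them. The function names and the little-endian-within-byte packing convention are assumptions for illustration, not OpenVINO or NNCF APIs:

```python
import numpy as np

def pack_uint2(values: np.ndarray) -> np.ndarray:
    """Pack 2-bit values (0..3), last dim a multiple of 4, into uint8 bytes.

    Hypothetical helper: the first value of each group lands in the byte's
    lowest two bits, mirroring the shifts 0/2/4/6 in the matched subgraph.
    """
    v = values.reshape(*values.shape[:-1], -1, 4).astype(np.uint8)
    return v[..., 0] | (v[..., 1] << 2) | (v[..., 2] << 4) | (v[..., 3] << 6)

def unpack_uint2(packed: np.ndarray) -> np.ndarray:
    """Mirror of the aten::stack subgraph described above:
    stack([packed & 3, (packed >> 2) & 3, (packed >> 4) & 3, (packed >> 6) & 3], dim=-1).
    """
    fields = [(packed >> shift) & 3 for shift in (0, 2, 4, 6)]
    return np.stack(fields, axis=-1).reshape(*packed.shape[:-1], -1)

# Round trip: unpacking the packed bytes reproduces the original 2-bit values.
weights = np.random.randint(0, 4, size=(2, 8), dtype=np.uint8)
assert np.array_equal(unpack_uint2(pack_uint2(weights)), weights)
```

The pattern matcher effectively replaces the right-hand side (`unpack_uint2`) with the already-packed data stored directly as an `element::u2` constant, so the bitwise subgraph never appears in the IR.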

This enables export of NNCF models with INT2SymmetricWeightsDecompressor to OpenVINO IR,
following the same approach used for INT4 decompression patterns (PR #27048).
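For context on what the u2 constant feeds downstream, a symmetric 2-bit decompression can be sketched as a per-group dequantization. The fixed zero point of 2 (mapping unsigned codes 0..3 to the signed range -2..1) and all names here are assumptions for illustration, not NNCF's actual `INT2SymmetricWeightsDecompressor` implementation:

```python
import numpy as np

def decompress_u2(codes: np.ndarray, scales: np.ndarray, group_size: int) -> np.ndarray:
    """Hypothetical symmetric dequantization of u2 codes.

    Shifts unsigned 2-bit codes by an assumed zero point of 2, then applies
    one float scale per contiguous group of `group_size` weights.
    """
    signed = codes.reshape(-1, group_size).astype(np.float32) - 2.0
    return (signed * scales.reshape(-1, 1)).reshape(codes.shape)

codes = np.array([0, 1, 2, 3, 3, 2, 1, 0], dtype=np.uint8)
scales = np.array([0.5, 0.25], dtype=np.float32)  # one scale per group of 4
print(decompress_u2(codes, scales, group_size=4))
```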

Tickets:

AI Assistance:

  • AI assistance used: yes
  • Manually converted u2/u4 Qwen/Qwen3-4B and evaluated in OpenVINO:

```
lm_eval \
  --model openvino \
  --model_args pretrained=$IR_DIR \
  --device cpu \
  --tasks lambada_openai
```

| model | quant_config | group_size | Lambada acc | Lambada ppl | Lambada OV acc | Lambada OV ppl |
|---|---|---|---|---|---|---|
| Qwen/Qwen3-4B | FQ_LORA (avg 3-bit, mix of 2/4-bit) | 128/64 | 0.5604 | 9.6826 | 0.5572 | 9.6666 |


Labels

category: CPU (OpenVINO CPU plugin), category: PyTorch FE (OpenVINO PyTorch Frontend)
