
Add LoRA support to StaticAttention for split_mha=False #18345

Open
lucylq wants to merge 6 commits into gh/lucylq/142/head from gh/lucylq/144/head

Conversation


@lucylq lucylq commented Mar 19, 2026

When ModelArgs.target_modules is set, create LoRALinear instead of
nn.Linear for targeted q/k/v/o projections. Only applies to
split_mha=False path. Existing behavior unchanged when target_modules
is None.
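For context, a LoRA linear layer computes the frozen base projection plus a scaled low-rank update. A minimal sketch of the idea (class and attribute names here are illustrative, not the actual LoRALinear in this codebase):

```python
import torch
import torch.nn as nn


class LoRALinearSketch(nn.Module):
    """Minimal LoRA linear: y = W x + (alpha / rank) * B(A(x)).

    Illustrative only; the real LoRALinear may differ in naming,
    initialization, and bias handling.
    """

    def __init__(self, in_dim: int, out_dim: int, rank: int, alpha: float, use_bias: bool = False):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim, bias=use_bias)  # frozen base projection
        self.lora_a = nn.Linear(in_dim, rank, bias=False)      # down-projection A
        self.lora_b = nn.Linear(rank, out_dim, bias=False)     # up-projection B
        self.scaling = alpha / rank
        # Zero-init B so the layer initially matches the base projection.
        nn.init.zeros_(self.lora_b.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```

With B zero-initialized, swapping a targeted nn.Linear for a LoRA layer leaves the forward pass unchanged until the adapters are trained.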

Authored with Claude.

[ghstack-poisoned]

pytorch-bot bot commented Mar 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18345

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job, 7 Pending, 2 Unrelated Failures

As of commit 10f6022 with merge base dd7464a:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

lucylq added 2 commits March 19, 2026 14:50
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]

Copilot AI left a comment


Pull request overview

This PR adds LoRA-aware projection construction to the StaticAttention implementation when using the non-split MHA path (split_mha=False), so that q/k/v/o projections can become LoRALinear based on ModelArgs.target_modules, while keeping existing behavior unchanged when target_modules is None.

Changes:

  • For split_mha=False, conditionally instantiate LoRALinear for q/k/v projections when their corresponding target names are present in config.target_modules.
  • For split_mha=False, conditionally instantiate LoRALinear for the output projection (wo) when output_proj or o_proj is targeted.


Comment on lines +815 to 829
has_lora = config.target_modules is not None
_PROJ_TARGET = {
    "wqs": ("q_proj", self.dim, self.head_dim * self.n_heads),
    "wks": ("k_proj", self.dim, self.head_dim * self.n_kv_heads),
    "wvs": ("v_proj", self.dim, self.head_dim * self.n_kv_heads),
}
for attr, (target, in_dim, out_dim) in _PROJ_TARGET.items():
    if has_lora and target in config.target_modules:
        proj = LoRALinear(
            in_dim=in_dim,
            out_dim=out_dim,
            rank=config.r,
            alpha=config.lora_alpha,
            use_bias=self.attention_qkv_bias,
        )

Copilot AI Mar 19, 2026


When config.target_modules is set but config.r and/or config.lora_alpha are left as None (both are Optional in ModelArgs), this path will attempt to construct LoRALinear(rank=None, alpha=None) and fail with a low-signal TypeError. Consider adding an explicit validation (ValueError with a clear message) before creating any LoRALinear modules, similar to LoRAFeedForward.

Copilot uses AI. Check for mistakes.
Comment on lines +815 to 833
has_lora = config.target_modules is not None
_PROJ_TARGET = {
    "wqs": ("q_proj", self.dim, self.head_dim * self.n_heads),
    "wks": ("k_proj", self.dim, self.head_dim * self.n_kv_heads),
    "wvs": ("v_proj", self.dim, self.head_dim * self.n_kv_heads),
}
for attr, (target, in_dim, out_dim) in _PROJ_TARGET.items():
    if has_lora and target in config.target_modules:
        proj = LoRALinear(
            in_dim=in_dim,
            out_dim=out_dim,
            rank=config.r,
            alpha=config.lora_alpha,
            use_bias=self.attention_qkv_bias,
        )
    else:
        proj = nn.Linear(in_dim, out_dim, bias=self.attention_qkv_bias)
    setattr(self, attr, nn.ModuleList([proj]))



Copilot AI Mar 19, 2026


New behavior is introduced here (direct StaticAttention(..., split_mha=False) now conditionally builds LoRALinear based on config.target_modules), but existing tests in test_static_attention.py only exercise LoRA via from_attention_mha. Please add a unit test that directly constructs StaticAttention with split_mha=False and target_modules set, and asserts the expected projection types and a forward equivalence check.
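The requested test might follow the pattern below. This is a self-contained sketch using stand-in classes, since the real StaticAttention and LoRALinear constructors are not shown here; the actual test would import them from the codebase:

```python
import torch
import torch.nn as nn


class TinyLoRALinear(nn.Module):
    # Stand-in for the repo's LoRALinear; B is zero-initialized so the
    # layer matches its base projection at init.
    def __init__(self, in_dim, out_dim, rank=4, alpha=8.0):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim, bias=False)
        self.lora_a = nn.Linear(in_dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, out_dim, bias=False)
        nn.init.zeros_(self.lora_b.weight)
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))


class TinyAttentionProj(nn.Module):
    # Stand-in for StaticAttention's split_mha=False projection construction.
    def __init__(self, dim, target_modules=None):
        super().__init__()
        has_lora = target_modules is not None
        for attr, target in [("wqs", "q_proj"), ("wks", "k_proj"), ("wvs", "v_proj")]:
            if has_lora and target in target_modules:
                proj = TinyLoRALinear(dim, dim)
            else:
                proj = nn.Linear(dim, dim, bias=False)
            setattr(self, attr, nn.ModuleList([proj]))


def test_lora_projection_types_and_equivalence():
    torch.manual_seed(0)
    plain = TinyAttentionProj(8)
    lora = TinyAttentionProj(8, target_modules=["q_proj", "v_proj"])
    # Targeted projections become LoRA layers; untargeted ones stay nn.Linear.
    assert isinstance(lora.wqs[0], TinyLoRALinear)
    assert isinstance(lora.wks[0], nn.Linear)
    assert isinstance(lora.wvs[0], TinyLoRALinear)
    # With matching base weights and zero-init B, forward outputs agree at init.
    lora.wqs[0].base.load_state_dict(plain.wqs[0].state_dict())
    x = torch.randn(2, 8)
    assert torch.allclose(lora.wqs[0](x), plain.wqs[0](x))
```

The same two assertions (projection types, forward equivalence at init) would apply when constructing the real StaticAttention directly with split_mha=False and target_modules set.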


Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
