Refactor: extract _prepare_eager_model() from CoreML export main() #18343
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18343

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 3 Pending, 4 Unrelated Failures — as of commit 06a59b2 with merge base dd7464a:
- FLAKY - The following jobs failed but were likely due to flakiness present on trunk.
- BROKEN TRUNK - The following job failed but was already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
Refactors the CoreML static Llama export script by extracting model-preparation steps into a reusable helper, enabling consistent application of dtype conversion, splitting, quantization, and graph-break insertion (useful for multi-method exports).
Changes:
- Added `_prepare_model(model, args, float_dtype)` to encapsulate dtype + transformation steps.
- Replaced inlined preparation logic in `main()` with a single call to `_prepare_model(...)`.
```python
def _prepare_eager_model(model, args, float_dtype):
    """Apply splitting, quantization, and graph breaks to a model."""
```
The `_prepare_model` docstring says it applies splitting/quantization/graph breaks, but the function also changes dtype and sets eval mode. Consider updating the docstring to reflect the full set of side effects (and that it mutates the module in-place).
Suggested change:

```python
    """
    Prepare an eager model for export by applying dtype conversion, eval mode,
    splitting, quantization, and optional graph breaks.

    This function mutates the given ``model`` in-place:

    * moves it to ``float_dtype``,
    * sets it to evaluation mode,
    * optionally splits linear layers,
    * optionally quantizes embeddings and linear layers, and
    * optionally wraps the first/last transformer blocks with graph breaks.

    The same (mutated) ``model`` instance is returned.
    """
```
```python
bitwidth, group_size = args.embedding_quantize.split(",")
bitwidth = int(bitwidth)
group_size = int(group_size)
assert bitwidth in [4, 8], "CoreML only supports 4-bit and 8-bit quantization"
```
`args.embedding_quantize` is parsed via `.split(",")` and validated with `assert`. Since this is user-controlled CLI input, consider doing explicit validation and raising a `ValueError`/`argparse.ArgumentTypeError` with a clear message (and avoid `assert`, which can be stripped with `python -O`).
Suggested change:

```python
try:
    bitwidth_str, group_size_str = args.embedding_quantize.split(",")
except ValueError as e:
    raise ValueError(
        f"Invalid value for --embedding_quantize: {args.embedding_quantize!r}. "
        "Expected format is 'BITWIDTH,GROUP_SIZE', e.g. '4,32'."
    ) from e
try:
    bitwidth = int(bitwidth_str)
    group_size = int(group_size_str)
except ValueError as e:
    raise ValueError(
        f"Invalid value for --embedding_quantize: {args.embedding_quantize!r}. "
        "BITWIDTH and GROUP_SIZE must be integers, e.g. '4,32'."
    ) from e
if bitwidth not in (4, 8):
    raise ValueError(
        f"Unsupported BITWIDTH {bitwidth} in --embedding_quantize. "
        "CoreML only supports 4-bit and 8-bit quantization."
    )
```
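The `argparse.ArgumentTypeError` route the review mentions can also be sketched as a `type=` callable, so argparse prints a clean usage error instead of a traceback. The function name `embedding_quantize_type` is illustrative, not from the PR.

```python
import argparse


def embedding_quantize_type(value: str):
    """Parse 'BITWIDTH,GROUP_SIZE' (e.g. '4,32'), raising
    argparse.ArgumentTypeError on malformed input."""
    parts = value.split(",")
    if len(parts) != 2:
        raise argparse.ArgumentTypeError(
            f"expected 'BITWIDTH,GROUP_SIZE', e.g. '4,32'; got {value!r}"
        )
    try:
        bitwidth, group_size = int(parts[0]), int(parts[1])
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"BITWIDTH and GROUP_SIZE must be integers; got {value!r}"
        )
    if bitwidth not in (4, 8):
        raise argparse.ArgumentTypeError(
            f"CoreML only supports 4-bit and 8-bit quantization; got {bitwidth}"
        )
    return bitwidth, group_size


parser = argparse.ArgumentParser()
parser.add_argument("--embedding_quantize", type=embedding_quantize_type)
```

With this, `--embedding_quantize 16,32` fails at argument-parsing time with a one-line error, and downstream code receives an already-validated `(bitwidth, group_size)` tuple.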
This PR needs a
Extract eager model preparation logic (dtype conversion, linear splitting,
quantization, graph breaks) into a reusable _prepare_model() helper.
No functional change — pure refactor.
Generated with Claude.
Used for multi-method export; extract out the eager model instantiation and transforms so that we can apply them to each method.
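That multi-method motivation can be sketched as follows: instead of preparing one model inline in `main()`, instantiate a fresh eager model per method and run the same preparation on each. All names here (`_make_model`, `prepare_methods`, the method names) are hypothetical stand-ins, not the PR's actual API.

```python
from types import SimpleNamespace


def _make_model(method_name):
    """Hypothetical per-method factory; stands in for eager model instantiation."""
    return SimpleNamespace(name=method_name, prepared=False, dtype=None)


def _prepare(model, args, float_dtype):
    """Stand-in for the extracted helper: the same transforms run per method."""
    model.prepared = True
    model.dtype = float_dtype
    return model


def prepare_methods(method_names, args, float_dtype):
    """Build and prepare one eager model per export method."""
    return {
        name: _prepare(_make_model(name), args, float_dtype)
        for name in method_names
    }


models = prepare_methods(["prefill", "decode"], SimpleNamespace(), "float16")
```

The design point is that dtype conversion, splitting, quantization, and graph breaks stay consistent across methods because they all flow through the one helper.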