Commit 7b46d93
Add LoRA multimethod export to CoreML static attention
Add support for exporting LoRA adapters as separate methods in a CoreML
PTE file. CoreML POSITIONAL weight sharing deduplicates base weights
across methods so the binary overhead is just the lora_a/lora_b weights.
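
For context, a minimal sketch of the LoRA computation each adapter method adds on top of the shared base weights; the class name, defaults, and layout below are illustrative, not the repo's actual implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = base(x) + (alpha / rank) * lora_b(lora_a(x)).

    The base weight is identical across methods, so POSITIONAL sharing can
    deduplicate it; only the low-rank lora_a/lora_b weights differ per adapter.
    """

    def __init__(self, in_dim: int, out_dim: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim, bias=False)
        self.lora_a = nn.Linear(in_dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, out_dim, bias=False)
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```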
- StaticAttention: LoRA-aware projection creation for split_mha=False
- utils.py: skip_names + LoRALinear guard in replace_linear_with_split_linear (sketched after this list)
- export: --adapter CLI, adapter loading, _exclude_lora quantization filter (also sketched below),
  skip_split_names for POSITIONAL sharing, multimethod export branches
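
A hedged sketch of the skip_names + LoRALinear guard, reusing the LoRALinear class from the sketch above; the real replace_linear_with_split_linear has a different signature, and make_split stands in for the actual split-linear construction:

```python
from typing import Callable, Sequence

def replace_linear_skipping_lora(
    module: nn.Module,
    make_split: Callable[[nn.Linear], nn.Module],
    skip_names: Sequence[str] = (),
    prefix: str = "",
) -> None:
    # Recursively swap plain nn.Linear children for a split variant, leaving
    # LoRALinear modules (and any fully qualified name in skip_names) intact
    # so their weights keep the layout POSITIONAL sharing expects.
    for name, child in module.named_children():
        fqn = f"{prefix}.{name}" if prefix else name
        if fqn in skip_names or isinstance(child, LoRALinear):
            continue  # guard: never split inside a LoRA adapter
        if isinstance(child, nn.Linear):
            setattr(module, name, make_split(child))
        else:
            replace_linear_skipping_lora(child, make_split, skip_names, fqn)
```

And a hypothetical _exclude_lora-style filter in the (module, fqn) -> bool shape that torchao's quantize_ accepts; the actual filter in export.py may differ:

```python
def _exclude_lora(module: nn.Module, fqn: str) -> bool:
    # True = quantize this module. Base linears get quantized; the small,
    # precision-sensitive lora_a/lora_b projections stay in full precision.
    is_lora = "lora_a" in fqn or "lora_b" in fqn
    return isinstance(module, nn.Linear) and not is_lora
```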
Authored with Claude.
ghstack-source-id: 16ab852
ghstack-comment-id: 4093471437
Pull-Request: #183441
3 files changed: 264 additions, 144 deletions
Changed paths:
- examples/apple/coreml/llama
- examples/models/llama