Motivation
Constructing PTQConfig.overrides for LLMs is currently verbose and error-prone. Users must manually spell out deeply nested module paths (e.g., model.layers.{i}.self_attn.q_proj) even for very common patterns. This leads to:
- Poor readability in example scripts
- High maintenance cost when model structures differ slightly
- Increased risk of silent misconfiguration (e.g., incorrect key paths)
We should provide a helper that encapsulates these patterns while remaining extensible across model families.
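To make the pain point concrete, here is a sketch of the kind of manual override construction users write today. The exact PTQConfig schema and the per-module option names are assumptions for illustration; the key paths follow the Llama module layout mentioned above.

```python
# Illustrative only: manual construction of deeply nested override paths.
# The per-module value shape ({"weight_bits": ...}) is a hypothetical schema.
num_hidden_layers = 32  # e.g., a Llama-7B-sized model

overrides = {}
for i in range(num_hidden_layers):
    for proj in ("q_proj", "k_proj", "v_proj", "o_proj"):
        # Every attention projection needs its own hand-written key path,
        # which is where silent misconfiguration (typos in paths) creeps in.
        overrides[f"model.layers.{i}.self_attn.{proj}"] = {"weight_bits": 4}
```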
Proposal
Introduce a model-aware builder:
build_llm_ptq_config(...)
Key idea:
- Public API is model-agnostic
- Internally dispatches to model-specific mapping logic
Example:
cfg = build_llm_ptq_config(
    model_type="llama",
    num_hidden_layers=len(model.model.layers),
    wrapper_variant="prefill",
    activation_dtype=DType.int(16),
    default_qscheme=QScheme.PER_TENSOR_SYMM,
    linear_weight_bits=4,
    embedding_weight_bits=8,
    lm_head_weight_bits=4,
)