
[tune][xegpu] infrastructure for tuning, applied to XeGPU matmul example#63

Open
rolfmorel wants to merge 15 commits into main from users/rolfmorel/lh-xegpu-autotuning

Conversation

Contributor

@rolfmorel rolfmorel commented Mar 5, 2026

Adds the ability to extract search spaces from schedules with knobs.

Adds transform-interpreter semantics to the constrain_params op, which interprets the smt ops in its region so that it is possible to check constraints on int params, as well as calculate new int params, at interpreter time.

Modifies the schedule for the XeGPU matmul (& mlp) example to embed its tuning problem.

lighthouse.tune implements:

  • trace-ing of transform schedules and the SMT regions they include:
    • yields an AST of Nodes, each evaluatable w.r.t. an environment, with leaves such as
      • Constant, which represents just a constant in the IR and evaluates to a constant int, and
      • Knob, the representative of a transform.tune.knob, which takes its value from the environment while knowing what its possible values are,
        while
      • Apply depends on other Nodes, as it models Values produced by ops that depend on other values, and
      • Predicate models a condition/constraint that needs to hold, as a boolean-valued function on Nodes, for execution to proceed past a particular op.
  • rewrite-ing of transform schedules, solely through setting the selected attr on knob ops and the selected_region attr on alternatives ops.
  • enumerate-ing all valid assignments for knob and alternatives tuneables.
  • __main__-ing to take a .mlir schedule, derive all valid knob configurations, and output the corresponding concrete, transform-interpreter-interpretable schedules.
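To make the design above concrete, here is a minimal plain-Python sketch of the Node AST and the enumeration loop. All names and signatures are illustrative only (the real lighthouse.tune Nodes wrap MLIR values and knob ops; enumerate_valid, possibilities, and holds are hypothetical helpers here):

```python
from abc import ABC, abstractmethod
from itertools import product


class Node(ABC):
    """A traced value from the schedule, evaluatable w.r.t. an environment."""

    @abstractmethod
    def evaluate(self, env: dict) -> int: ...


class Constant(Node):
    """A constant in the IR; evaluates to a fixed int."""

    def __init__(self, value: int):
        self.value = value

    def evaluate(self, env):
        return self.value


class Knob(Node):
    """Stands in for a transform.tune.knob: knows its options, takes its value from env."""

    def __init__(self, name: str, options: tuple[int, ...]):
        self.name, self.options = name, options

    def possibilities(self):
        return self.options

    def evaluate(self, env):
        return env[self]  # the value picked by the current assignment


class Apply(Node):
    """Models a Value produced by an op that depends on other values."""

    def __init__(self, fn, *operands: Node):
        self.fn, self.operands = fn, operands

    def evaluate(self, env):
        return self.fn(*(o.evaluate(env) for o in self.operands))


class Predicate:
    """A boolean-valued condition over Nodes that must hold to proceed."""

    def __init__(self, fn, *operands: Node):
        self.fn, self.operands = fn, operands

    def holds(self, env):
        return bool(self.fn(*(o.evaluate(env) for o in self.operands)))


def enumerate_valid(knobs, predicates):
    """Yield every knob assignment (environment) satisfying all predicates."""
    for values in product(*(k.possibilities() for k in knobs)):
        env = dict(zip(knobs, values))
        if all(p.holds(env) for p in predicates):
            yield env
```

For example, two knobs tile_m ∈ {8, 16, 32} and vec ∈ {4, 8} under a constraint tile_m * vec <= 128 would yield five of the six possible assignments.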

Inside lighthouse.dialects, add (extension) dialects:

  • smt_ext: A wrapper for ir.Values of !smt.int type so we can support python operations on them (e.g. addition), also with integers.
  • transform_tune_ext: A wrapper for ir.Values produced directly by transform.tune.knob ops so we can do python operations on them, in particular add constraints, and a knob() -> KnobValue helper.
  • transform_smt_ext: so we can have a version of transform.smt.constrain_params which has transform-interpreter semantics: by tracing the body, containing smt ops, we obtain a function that we can apply to the transform.params which were arguments to constrain_params. That is, this version of constrain_params has a proper TransformOpInterface implementation.
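The operator-overloading idea behind smt_ext can be sketched standalone. This is not the actual dialect extension: here the wrapped value is a plain expression tree instead of an ir.Value of !smt.int type, and SMTIntValue's methods are illustrative:

```python
class SMTIntValue:
    """Wraps an SMT int value so Python operators build expressions,
    mixing wrapped values and plain Python ints."""

    def __init__(self, expr):
        self.expr = expr  # in the real dialect this would be an ir.Value

    @staticmethod
    def _lift(other):
        # Auto-wrap plain Python ints as constants.
        if isinstance(other, SMTIntValue):
            return other
        return SMTIntValue(("const", other))

    def __add__(self, other):
        return SMTIntValue(("add", self.expr, SMTIntValue._lift(other).expr))

    __radd__ = __add__  # addition commutes, so `4 + x` reuses __add__

    def __mod__(self, other):
        return SMTIntValue(("mod", self.expr, SMTIntValue._lift(other).expr))

    def eq(self, other):
        """Build an equality constraint (a method, to leave Python's == alone)."""
        return SMTIntValue(("eq", self.expr, SMTIntValue._lift(other).expr))
```

With this, a schedule can write constraints such as `(x % 2).eq(0)` directly against knob-produced values, and the wrapper takes care of lifting the bare ints.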

@rolfmorel rolfmorel marked this pull request as ready for review March 12, 2026 18:22
@rolfmorel rolfmorel requested a review from tkarna March 17, 2026 22:15
return parameters


matmul_param_db = {

Remove this dict definition; params are read from the JSON file now.

PREFETCH_INST_DATA = [8, 16]
NB_WORKITEMS = 16 # workitems in subgroup
@KnobValue.ast_rewrite(in_exprs=True)
def checked_params_or_knobs(

Can we call this something like construct_search_space? I.e., for every entry in the params dict we either keep the given int value and check its validity, or replace a None value with a knob over the valid choices.

NB_WORKITEMS = 16 # workitems in subgroup
@KnobValue.ast_rewrite(in_exprs=True)
def checked_params_or_knobs(
params: dict[str, int | None], layer_id=""

In this func we could rename layer_id -> knob_prefix because that's what we use it for. With the prefix every knob has a unique name; is this just good practice/a debugging aid, i.e. would everything still work even if every layer had knobs with the same name?

Comment on lines +338 to +339
if isinstance(sg_threads, smt_ext.SMTIntValue):
# NB: Constraint only enabled during tuning.

nit: I think you can remove the if now

prefetch_a_k: int | KnobValue,
prefetch_b_k: int | KnobValue,
prefetch_b_n: int | KnobValue,
prefetch_nb: int,

prefetch_nb can be a knob too.


def bundle_xegpu_to_binary(mod, stop_at_stage: str = "") -> ir.Module:
def bundle_xegpu_to_binary(
mod, stop_at_stage: str = ""

missing type annotation for mod

for tuneable_values in product(
*(tuneable.possibilities() for tuneable in tuneables)
):
environment = dict(zip((tunable for tunable in tuneables), tuneable_values))

Suggested change
environment = dict(zip((tunable for tunable in tuneables), tuneable_values))
environment = dict(zip(tuneables, tuneable_values))
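For what it's worth, a quick check that the generator wrapper is redundant (toy stand-in values, not the real tuneable objects):

```python
tuneables = ("tile_m", "tile_n")  # stand-ins for the real tuneable objects
tuneable_values = (32, 64)

# zip consumes any iterable, so wrapping the tuple in a generator changes nothing:
with_gen = dict(zip((t for t in tuneables), tuneable_values))
without = dict(zip(tuneables, tuneable_values))
assert with_gen == without == {"tile_m": 32, "tile_n": 64}
```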



class Node(ABC):
"""Base class for `Node`s which can be evaluated w.r.t. an environment."""

For better clarity a high-level docstring would help: we are tracing the IR to construct a knob dependency tree, "Node" represents a node in that tree, and "environment" means the (tunable, value) map.

Comment on lines +20 to +25
parser.add_argument(
"--mode",
choices=["enumerate"],
default="enumerate",
help="Mode of operation",
)

nit: add --mode once we have more than one mode
