Needs changes from #65 related to using multiple schedules in a workload. The added transform module aims to provide small reusable transform "bundles" to simplify writing schedules. All these helpers are opinionated by design, mostly modeled on what is needed for the example. The APIs could probably be refined. Also, the schedules ended up being mostly simple wrappers around the transform bundles, so perhaps it's not worth having both modules.
Reworked transform module to provide simple APIs over transform ops.
rengolin left a comment
The reason why I wanted to add a python file as a schedule was to be able to reuse all of those new schedules you created and added to the lighthouse scope. We can discuss that later.
Some comments inline.
```python
if dtype == ml_dtypes.bfloat16:
    # For BF16, enforce fixed tile size due to current rewriter pattern matching limitation.
    # TODO: Relax when x86 BF16 pass supports dynamic indexing.
    tile_size = 32
```
Perhaps emit a warning message (stderr?) saying you did this, to avoid surprises.
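A minimal sketch of that suggestion, assuming a helper around the tile-size choice (the function name and message wording are assumptions, not the PR's code):

```python
import sys

# Hypothetical helper: warn on stderr when the BF16 rewriter limitation
# forces the tile size, instead of silently overriding the user's request.
def pick_tile_size(requested: int, is_bf16: bool) -> int:
    if is_bf16 and requested != 32:
        print(
            f"warning: BF16 matmul requires tile_size=32; "
            f"overriding requested {requested}",
            file=sys.stderr,
        )
        return 32
    return requested
```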
```python
        dump_payload=args.dump_kernel,
        dump_schedule=args.dump_schedule,
    )
else:
```
It's here to make it either print or execute.
Explicit sys.exit might be clearer.
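A sketch of the explicit-exit variant; the function and its arguments are illustrative stand-ins, not the PR's actual API:

```python
import sys

# Hypothetical sketch: make the print-or-execute split explicit with
# sys.exit instead of an implicit else branch.
def run_or_dump(dump: bool, payload: str) -> None:
    if dump:
        print(payload)
        sys.exit(0)  # explicit exit: nothing below runs in dump mode
    # ... execute the payload here ...
```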
```python
sched = lh_schedule.create_schedule()
named_seq = lh_schedule.create_named_sequence(
    sched, input_types=[transform.any_op_t()]
)
```
This pattern is captured in @schedule_boilerplate. We should either always use the decorator or get rid of it.
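For illustration, a decorator along those lines could look like this; both helpers below are simplified stand-ins, not the real lighthouse APIs:

```python
import functools

# Simplified stand-ins for the schedule-construction helpers in the diff.
def create_schedule() -> dict:
    return {"sequences": []}

def create_named_sequence(sched: dict) -> list:
    seq: list = []
    sched["sequences"].append(seq)
    return seq

# Hypothetical sketch of what @schedule_boilerplate could factor out:
# the decorated body only populates the named sequence.
def schedule_boilerplate(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        sched = create_schedule()
        seq = create_named_sequence(sched)
        fn(seq, *args, **kwargs)
        return sched
    return wrapper

@schedule_boilerplate
def my_schedule(seq: list) -> None:
    seq.append("tile")
    seq.append("vectorize")
```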
```python
def linalg_to_category() -> ir.Module:
    """
    Morph all linalg ops to category ops.

    Returns:
        Schedule
    """
    return linalg_morph(generic_to_category=True, named_to_category=True)
```
What's the benefit of such small wrappers, which are an alias for a single call? Do you expect they can grow?
These act purely as readability aliases: fixed functionality instead of having to provide and scan arguments.
In this case, I anticipate that normalizing linalg to a specific abstraction would be desired and/or common enough to "justify" these. But they could be removed too.
In my mind, we should not add abstractions/wrappers unless they really simplify something beyond saving a few characters. I find single-operation wrappers more confusing than helpful. Many of the schedules and transforms in this PR are of the one-operation kind and I suggest removing them. Even things like one-op+cleanup don't seem to add anything worth a wrapper; even if the cleanup is mandatory (which ideally it should not be), the benefit is debatable.
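The trade-off can be seen with a hypothetical stand-in for `linalg_morph`: the wrapper is a pure alias with fixed arguments and no added behaviour.

```python
# Stand-in for the real linalg_morph, just to make the comparison runnable.
def linalg_morph(generic_to_category: bool = False,
                 named_to_category: bool = False) -> str:
    return f"morph(generic={generic_to_category}, named={named_to_category})"

def linalg_to_category() -> str:
    # One-op wrapper: saves typing the kwargs at the call site, nothing more.
    return linalg_morph(generic_to_category=True, named_to_category=True)
```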
Also, I would appreciate if we had a single way of expressing and using a compiler pipeline (whether composed of schedules, passes, or whatnot). Can we use the new

Personally, I prefer a simple list of things, which the evaluator dispatches in the right way. No need for extra abstractions like "PassStage" or "TransformsStage"; Python has all the features to do that. As @adam-smnk also mentioned elsewhere, we should be adding abstraction only very sparingly.
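The "simple list of things" idea could be sketched like this; the stage shapes and the `PASSES` table are illustrative assumptions, not an existing API:

```python
# Hypothetical pass registry: pass names map to functions on the module.
PASSES = {
    "canonicalize": lambda m: m + " | canonicalize",
}

def run_pipeline(module: str, stages: list) -> str:
    """Evaluator that dispatches each pipeline element on its type."""
    for stage in stages:
        if callable(stage):           # a schedule: apply it directly
            module = stage(module)
        elif isinstance(stage, str):  # a pass name: look it up, then run it
            module = PASSES[stage](module)
        else:
            raise TypeError(f"unknown pipeline stage: {stage!r}")
    return module

def tile_schedule(m: str) -> str:
    return m + " | tile"
```

A pipeline is then just `[tile_schedule, "canonicalize"]`, with no stage-wrapper classes needed.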
One potential use case of the predefined schedules is to make them accessible directly via

The specific granularity is up for debate, of course.
I guess that'd require further
```python
    memory_writes = self.M * self.N * nbytes  # write C
    return (flop_count, memory_reads, memory_writes)

def payload_module(self) -> ir.Module:
```
Although the payload is simple, we could reuse mlir_gen utils here.
Adds x86-specific vectorization example for matrix multiplication.
Comes with a collection of opinionated but reusable transforms and schedules.
The lowering schedule currently supports F32 (general) and BF16 (avx512, flat layout) matmuls.