Dynamic sparse attention #22
Open
Labels: enhancement (New feature or request), performance (Performance optimizations and resource efficiency), wontfix (This will not be worked on)
Implement and benchmark blocked dynamic sparse attention modules and Metal kernels, as well as a custom paged KV cache.
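The issue does not say which sparsity scheme is intended. As a rough illustration only, the sketch below shows one common reading of "blocked dynamic sparse attention" for a single decode-step query: keys are grouped into fixed-size blocks, each block is scored by the dot product of the query with the block's mean key, and softmax attention runs only over the tokens in the `top_k` best blocks. The function names (`select_blocks`, `block_sparse_attention`), the mean-key scoring rule, and the parameters `block_size` and `top_k` are all assumptions made for this example, not part of the project. It is plain Python for readability, assumes a non-empty key list, and says nothing about how a Metal kernel or paged cache would lay out memory.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_blocks(q: Vector, keys: Sequence[Vector],
                  block_size: int, top_k: int) -> List[int]:
    """Return the indices of the top_k key blocks for this query.

    Assumed scoring rule (hypothetical): q . mean(keys in block).
    """
    scores = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        mean_key = [sum(col) / len(block) for col in zip(*block)]
        scores.append((dot(q, mean_key), start // block_size))
    # Keep the highest-scoring blocks, then restore positional order.
    chosen = sorted(scores, reverse=True)[:top_k]
    return sorted(idx for _, idx in chosen)


def block_sparse_attention(q: Vector, keys: Sequence[Vector],
                           values: Sequence[Vector],
                           block_size: int, top_k: int) -> List[float]:
    """Softmax attention restricted to tokens in the selected blocks."""
    blocks = select_blocks(q, keys, block_size, top_k)
    token_ids = [
        t
        for b in blocks
        for t in range(b * block_size, min((b + 1) * block_size, len(keys)))
    ]
    scale = 1.0 / math.sqrt(len(q))
    logits = [dot(q, keys[t]) * scale for t in token_ids]
    # Subtract the max logit for numerical stability before exponentiating.
    max_logit = max(logits)
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[t][j] for w, t in zip(weights, token_ids)) / total
        for j in range(dim)
    ]
```

When `top_k` covers every block, this reduces to ordinary dense softmax attention, which gives a simple correctness baseline for benchmarking a sparse variant against.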