Hi, thank you for the great work and effort.
The current kernels seem to support only the dimensions of 7B models, with a hidden dimension of 4096.
How can I extend them to larger models like Llama-30B or 65B?
I get an error when I simply add template instantiations for the larger dimensions.
Thank you.