Description
Implement modular forward and backward CUDA kernels for the GDN (Gated Delta Network) algorithm, compatible with Kimi CP.
Context
GDN is a linear attention variant that improves expressiveness with gating and delta-style updates. Implementing modular forward/backward kernels compatible with Kimi CP would enable efficient distributed training via context parallelism.
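As a point of reference for what the kernels must compute, here is a naive sequential sketch of a gated delta-rule recurrence of the form commonly associated with GDN. The exact update used here (per-step gate `alpha_t`, delta step size `beta_t`, and the rank-1 state correction) is an assumption for illustration, not the definitive formulation this issue targets; the real CUDA kernels would compute the same recurrence in chunked/parallel form.

```python
import numpy as np

def gdn_forward_reference(q, k, v, alpha, beta):
    """Sequential reference for an assumed gated delta rule:
        S_t = alpha_t * S_{t-1} @ (I - beta_t * k_t k_t^T) + beta_t * v_t k_t^T
        o_t = S_t @ q_t
    Shapes: q, k: (T, d_k); v: (T, d_v); alpha, beta: (T,).
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_v, d_k))   # recurrent state, carried across time steps
    out = np.empty((T, d_v))
    I = np.eye(d_k)
    for t in range(T):
        # Gated decay of the old state plus a delta-style rank-1 update
        # that writes v_t into the slot addressed by k_t.
        S = alpha[t] * S @ (I - beta[t] * np.outer(k[t], k[t])) \
            + beta[t] * np.outer(v[t], k[t])
        out[t] = S @ q[t]       # read out with the query
    return out
```

Because the state `S` is carried step to step, a CP-compatible kernel would need to pass `S` (and its gradient, for backward) across sequence shards, which is the crux of making the kernels modular.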
Tasks
References