Signed-off-by: ganyi <ygan@amd.com>
Pull request overview
Integrates an optional FlyDSL-based gated-delta-rule (GDR) decode kernel into the vLLM plugin GDN attention backend, and freezes GDN-specific parameters in the Qwen3 Next model implementation.
Changes:
- Add an optional `flydsl_gdr_decode` path for decode-time recurrent attention, with fallback to the existing fused kernel when unavailable.
- Mark `dt_bias` and `A_log` as non-trainable (`requires_grad=False`) parameters in `qwen3_next`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `atom/plugin/vllm/attention_backend/attention_gdn.py` | Adds FlyDSL GDR decode integration with import-time feature gating and fallback path. |
| `atom/models/qwen3_next.py` | Freezes `dt_bias` and `A_log` parameters (no gradients). |
Comments suppressed due to low confidence (2)
atom/plugin/vllm/attention_backend/attention_gdn.py:44
- The large commented-out `maybe_dump_flydsl_gdr_inputs` block adds maintenance overhead and makes it harder to review/grep the file. If this debug dump is needed, wire it up behind a real flag/env var and keep the helper active (or move it to a dedicated debug/util module); otherwise, remove the dead commented code.
super().__init__()
def forward(
self,
q: torch.Tensor,
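One way to act on the suggestion above is to keep the dump helper live but gate it on an environment variable. The following is a minimal sketch, not the PR's code: the variable name `ATOM_DUMP_FLYDSL_GDR_INPUTS`, the function `maybe_dump_inputs`, and the use of `pickle` are all assumptions made for illustration (a real helper for tensors would more likely use `torch.save`).

```python
import os
import pickle
import tempfile
from pathlib import Path

# Hypothetical environment variable name, chosen for this sketch only.
DUMP_ENV_VAR = "ATOM_DUMP_FLYDSL_GDR_INPUTS"


def maybe_dump_inputs(inputs, dump_dir=None, tag="gdr_decode"):
    """Pickle `inputs` to disk only when the env var is set to "1".

    Returns the path written, or None when dumping is disabled.
    """
    if os.environ.get(DUMP_ENV_VAR, "0") != "1":
        return None  # disabled by default: no I/O on the hot path
    out_dir = Path(dump_dir or tempfile.gettempdir())
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{tag}_{os.getpid()}.pkl"
    with open(path, "wb") as f:
        pickle.dump(inputs, f)
    return path
```

Because the helper returns early when the variable is unset, it can stay in the file as active, greppable code instead of a commented-out block.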
atom/plugin/vllm/attention_backend/attention_gdn.py:438
- The PR template sections (Motivation/Technical Details/Test Plan/Test Result) are still empty. Since this change alters the decode attention path and may depend on a specific aiter version, please document the required aiter version and how this was tested/benchmarked.
else:
core_attn_out[:num_actual_tokens] = core_attn_out_non_spec.squeeze(0)
return core_attn_out
USE_FLYDSL_GDR = False
print(
"Failed to import flydsl_gdr_decode. Please make sure you have the latest version of aiter installed."
)
Avoid printing to stdout when the optional flydsl_gdr_decode import fails; this can spam logs in library/server contexts and in tests. Prefer the project’s logging/warnings mechanism (and ideally emit the message only when the decode path is selected) while falling back to the non-flydsl implementation.
Motivation
depends on ROCm/aiter#2746
before
after
Technical Details
Test Plan
Test Result
Submission Checklist