[Cherry-Pick][RL] cherry-pick #7218 support moe-topk use topk_reduce_func (#7217)

Open
zoooo0820 wants to merge 3 commits into PaddlePaddle:release/2.5 from zoooo0820:cp25_align_moe_topk

Conversation

@zoooo0820
Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 7, 2026

Thanks for your contribution!

@zoooo0820 changed the title from "support moe-topk use topk_reduce_func" to "[Cherry-Pick][RL] cherry-pick #7218 support moe-topk use topk_reduce_func" on Apr 7, 2026
@codecov-commenter

codecov-commenter commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 23.07692% with 10 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release/2.5@c735f76). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| fastdeploy/model_executor/layers/moe/moe.py | 9.09% | 8 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@              Coverage Diff               @@
##             release/2.5    #7217   +/-   ##
==============================================
  Coverage               ?   68.52%           
==============================================
  Files                  ?      390           
  Lines                  ?    54348           
  Branches               ?     8569           
==============================================
  Hits                   ?    37241           
  Misses                 ?    14419           
  Partials               ?     2688           
| Flag | Coverage Δ |
| --- | --- |
| GPU | 68.52% <23.07%> (?) |


@fastdeploy-bot fastdeploy-bot left a comment

🤖 AI Code Review | 2026-04-08

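📋 Review Summary

PR overview: Cherry-pick of #7218, which lets MoE top-k use a custom topk_reduce_func for normalization.

Changed scope: fastdeploy/model_executor/layers/moe/, fastdeploy/model_executor/models/glm4_moe.py

Impact tags: [RL] [OP]

📝 PR Convention Check

The PR title follows the Cherry-Pick convention: it includes the [Cherry-Pick] label, the [RL] tag, and the original PR ID #7218.

Issues

No blocking issues found.

Overall Assessment

The implementation is correct, and the cherry-pick stays consistent across 5 files. The new topk_reduce_func parameter lets a model (such as GLM4-Moe) customize how MoE top-k weights are normalized; it takes effect only when FD_USE_PHI_MOE_TOPK=True. Every call path (ep.py, fused_moe_cutlass_backend.py, fused_moe_deepgemm_backend.py) passes the parameter through correctly and reads it with getattr(layer, "topk_reduce_func", None) to preserve backward compatibility.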
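
The backward-compatibility pattern mentioned in this review can be sketched as follows. This is a minimal illustration, not the FastDeploy code: `LegacyLayer`, `AlignedLayer`, and `resolve_reduce_func` are hypothetical names, and only the `getattr(layer, "topk_reduce_func", None)` lookup comes from the review text.

```python
class LegacyLayer:
    """Hypothetical layer built before topk_reduce_func existed."""

    def __init__(self):
        self.top_k = 2  # no topk_reduce_func attribute is ever set


class AlignedLayer:
    """Hypothetical layer that stores a caller-supplied reduction."""

    def __init__(self, topk_reduce_func):
        self.top_k = 2
        self.topk_reduce_func = topk_reduce_func


def resolve_reduce_func(layer):
    # getattr with a default returns None instead of raising AttributeError,
    # so layers that never defined the attribute keep working unchanged.
    return getattr(layer, "topk_reduce_func", None)


legacy_func = resolve_reduce_func(LegacyLayer())
custom_func = resolve_reduce_func(AlignedLayer(max))
```

A backend that receives `None` can keep its previous behavior, while a layer that sets the attribute opts in to the custom reduction.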

@fastdeploy-bot fastdeploy-bot left a comment

🤖 AI Code Review | 2026-04-08

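📋 Review Summary

PR overview: Cherry-pick PR that picks the changes from #7218 onto the release/2.5 branch, letting moe-topk use the topk_reduce_func parameter for normalization.

Changed scope: model_executor/layers/moe/, model_executor/models/glm4_moe.py, tests/operators/test_noaux_tc_redundant.py

Impact tags: [RL] [OP] [Models]

📝 PR Convention Check

The PR description is missing the following; please complete it:

  1. Modifications section: describe the specific code changes in detail.
  2. Checklist: tick the following items as they apply:
    • Format your code, run pre-commit before commit.
    • Add unit tests. Please write the reason in this PR if no unit tests.

Issues

| Severity | File | Summary |
| --- | --- | --- |
| 🟡 Suggestion | tests/operators/test_noaux_tc_redundant.py:150 | Modifying os.environ directly may affect concurrent tests |

Overall Assessment

The change is logically correct, and the topk_reduce_func parameter is passed and used along a complete call chain. In FD_USE_PHI_MOE_TOPK mode, normalization is done externally through topk_reduce_func, which avoids complex computation inside the CUDA kernel; the design is sound.

Removing the moe_topk_select function and using ep_runner.moe_select everywhere simplifies the code structure and improves maintainability. The test cases verify the new feature under a range of parameter configurations.

There is only one minor code-quality issue: the test code modifies a global environment variable directly. Using unittest.mock.patch is recommended instead.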
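
The unittest.mock.patch suggestion could look roughly like the sketch below. It assumes the flag is read from `os.environ` at call time; `phi_topk_enabled` is a hypothetical stand-in for whatever the code under test actually does, and the real test in test_noaux_tc_redundant.py is not shown on this page.

```python
import os
import unittest
from unittest import mock


def phi_topk_enabled():
    # Hypothetical helper: reads the flag at call time, not at import time.
    return os.environ.get("FD_USE_PHI_MOE_TOPK", "0") == "1"


class PhiTopkEnvTest(unittest.TestCase):
    def test_flag_is_scoped_to_the_block(self):
        before = os.environ.get("FD_USE_PHI_MOE_TOPK")
        # patch.dict restores os.environ on exit, even if the body raises.
        with mock.patch.dict(os.environ, {"FD_USE_PHI_MOE_TOPK": "1"}):
            self.assertTrue(phi_topk_enabled())
        self.assertEqual(os.environ.get("FD_USE_PHI_MOE_TOPK"), before)
```

Because the restore happens automatically, a failing assertion inside the block cannot leak the flag into later tests in the same process. If the real code caches the flag at import time, the cached value would need to be patched instead.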

@fastdeploy-bot fastdeploy-bot left a comment

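📋 Review Summary

PR overview: Adds a topk_reduce_func parameter to the MoE layer so that, when FD_USE_PHI_MOE_TOPK=1, the normalization of top-k values can be customized for training alignment.

Changed scope: fastdeploy/model_executor/layers/moe/, models/glm4_moe.py, tests/operators/

Impact tags: [RL] [OP]

PR Convention Check

  • ❌ Title follows the [Cherry-Pick][Tag](#id) format
  • ❌ Description does not fill in required sections such as Motivation/Modifications/Usage

Title suggestion (the current title already conforms):

  • [Cherry-Pick][RL] cherry-pick #7218 support moe-topk use topk_reduce_func

Description template (suggested addition):

## Motivation
Add the `topk_reduce_func` parameter to support custom normalization of top-k values, mainly for training-alignment scenarios. When `FD_USE_PHI_MOE_TOPK=1`, this parameter can supply a custom reduction function (such as sum, mean, or max).

## Modifications
1. Add the `topk_reduce_func` parameter to the `get_moe_scores` function
2. Add the `topk_reduce_func` parameter to `FusedMoE.__init__` and store it as an instance attribute
3. Pass the parameter through the EP/Cutlass/DeepGemm backends
4. Remove the duplicate `moe_topk_select` function from `fused_moe_deepgemm_backend.py`
5. Pass a default topk_reduce_func in the GLM4-MoE model
6. Update the test file to verify the correctness of the PHI MoE topk implementation

Issues

| Severity | File | Summary |
| --- | --- | --- |
| 🟡 Suggestion | tests/operators/test_noaux_tc_redundant.py:161 | The test restores the environment variable in an unsafe way |
| 🟡 Suggestion | Overall change | Test coverage is incomplete (no test for the default FD_USE_PHI_MOE_TOPK=0 path) |

Overall Assessment

The implementation is correct, and introducing the topk_reduce_func parameter provides the flexibility needed for training alignment. The core logic is clear, and each backend passes the new parameter correctly. Improving how the tests manage the environment variable is recommended, so that other test cases are not affected.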
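
To make the idea of a pluggable reduction concrete, here is a small pure-Python sketch. It is not the FastDeploy implementation: `normalize_topk` is a hypothetical function operating on one token's selected weights as a plain list, and the `sum + 1e-20` default is taken from the description given in another review of this PR.

```python
def normalize_topk(topk_weights, topk_reduce_func=None):
    """Divide one token's selected expert weights by a reduced denominator."""
    if topk_reduce_func is None:
        # Assumed default, per the review text: sum of weights plus 1e-20.
        denom = sum(topk_weights) + 1e-20
    else:
        # The caller decides how the denominator is computed.
        denom = topk_reduce_func(topk_weights)
    return [w / denom for w in topk_weights]


weights = [0.5, 0.3]
default_norm = normalize_topk(weights)                     # weights sum to ~1
max_norm = normalize_topk(weights, topk_reduce_func=max)   # largest weight becomes 1
```

Passing the reduction as a callable keeps the selection step unchanged while letting a model match whatever normalization its training framework used.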

@fastdeploy-bot fastdeploy-bot left a comment

🤖 AI Code Review | 2026-04-08

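📋 Review Summary

PR overview: Cherry-pick PR that lets MoE top-k use a custom topk_reduce_func normalization function.
Changed scope: model_executor/layers/moe/, model_executor/models/glm4_moe.py, tests/operators/
Impact tags: [RL] [OP]

📝 PR Convention Check

  • ✅ Title contains valid tags: [RL] [Cherry-Pick]
  • ✅ Cherry-Pick format is correct
  • ⚠️ Description does not fill in the Motivation and Modifications sections (Checklist incomplete)

Title suggestion (can be copied directly):

  • [RL] [Cherry-Pick] support moe-topk use topk_reduce_func(#7218)

Description template (can be copied directly):

## Motivation
Support a custom top-k reduction function so that the normalization of top-k values can be controlled flexibly when FD_USE_PHI_MOE_TOPK is enabled.

## Modifications
1. Add the `topk_reduce_func` parameter to the `get_moe_scores` function
2. Add the `topk_reduce_func` parameter to `FusedMoE.__init__`
3. Update each backend (cutlass, deepgemm, ep) to pass the parameter
4. Pass a default `topk_reduce_func` in the GLM4MoE model
5. Update the test cases to support the new parameter

Issues

| Severity | File | Summary |
| --- | --- | --- |
| 🟡 Suggestion | moe.py:133 | Numerical-stability risk when the value returned by topk_reduce_func is close to 0 |

If there are no issues, write "No blocking issues found."

Overall Assessment

The change is reasonable overall: adding the topk_reduce_func parameter allows custom normalization logic. Default behavior is unchanged (sum + 1e-20), and the function is used only when the FD_USE_PHI_MOE_TOPK environment variable is set. The duplicate moe_topk_select function was removed in favor of get_moe_scores, which makes the code more concise. The tests cover both the old and new modes, but one numerical-stability risk remains.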
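
One possible way to address the stability concern is to clamp the denominator before dividing. The sketch below is an assumption for illustration only: `guarded_normalize` and its `eps` parameter are hypothetical, the weights are assumed non-negative, and nothing on this page says the PR adopted such a guard.

```python
def guarded_normalize(topk_weights, topk_reduce_func, eps=1e-20):
    """Normalize with a caller-supplied reduction, guarding tiny denominators."""
    denom = topk_reduce_func(topk_weights)
    # Clamp from below so a zero or near-zero reduction cannot blow up the
    # quotient. Assumes non-negative weights, so denom is never negative.
    denom = max(denom, eps)
    return [w / denom for w in topk_weights]


all_zero = guarded_normalize([0.0, 0.0], sum)   # stays finite instead of failing
regular = guarded_normalize([0.2, 0.2], sum)    # ordinary case is unaffected
```

Clamping leaves ordinary inputs untouched, whereas adding a constant shifts every result slightly; which trade-off suits training alignment is a decision for the PR authors.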


3 participants