Thanks for your contribution!

"liuruian" does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you already signed the CLA but the status is still pending? Let us recheck it.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #7206 +/- ##
==========================================
Coverage ? 73.92%
==========================================
Files ? 376
Lines ? 52949
Branches ? 8264
==========================================
Hits ? 39140
Misses ? 11068
Partials ? 2741
Flags with carried forward coverage won't be shown.
fastdeploy-bot left a comment
🤖 AI Code Review | 2026-04-07 14:08 CST
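📋 Review Summary
PR overview: Adds low-latency precision test cases for the Hopper architecture
Scope of changes: tests/distributed/
Impact tag: [CI]
📝 PR Convention Check
Neither the PR title nor the description follows the conventions: the title lacks a valid tag, and the description leaves Motivation and Modifications unfilled.
Suggested title (can be copied directly):
[CI] Add hopper low latency precision test
Description template (can be copied directly):
## Motivation
Add a precision verification test for DeepEP low-latency communication on the Hopper architecture to ensure the numerical correctness of the distributed dispatch/combine operations.
## Modifications
- Add `tests/distributed/test_hopper_ll_precision.py`: precision test for DeepEP low_latency_dispatch and low_latency_combine
- Add `tests/distributed/test_hopper_ll_precision_entry.py`: launch entry point for the distributed test
Issues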
| Level | File | Summary |
|---|---|---|
| 🔴 Bug | test_hopper_ll_precision_entry.py:62 | The return_code check is incorrect; non-zero error codes are ignored |
| 🟡 Suggestion | test_hopper_ll_precision.py:1 | Missing copyright header |
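Overall Assessment
The test logic is clear overall and verifies the numerical correctness of DeepEP's dispatch/combine operations. However, the return-code check contains a bug that needs to be fixed, and the copyright header should be added to stay consistent with the rest of the project.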
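To make the return-code issue concrete, here is a minimal Python sketch that contrasts the check currently in the PR with the stricter one the review suggests. It is an illustration under assumptions, not the project's actual entry script: the helper names `run_and_get_code`, `loose_check`, and `strict_check` are hypothetical, the try/except structure is assumed, and the accepted codes `(0, 250)` are taken from the reviewer's suggestion without the review explaining what 250 means.

```python
import subprocess
import sys

# Accepted exit codes, copied from the reviewer's suggestion.
# The review does not explain what 250 signifies; treat it as project-specific.
ACCEPTED_CODES = (0, 250)


def run_and_get_code(cmd, timeout):
    """Run cmd and return (return_code, stdout, stderr).

    A return_code of -1 marks a timeout, mirroring the convention
    described in the review. This helper is hypothetical.
    """
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        stdout, stderr = process.communicate(timeout=timeout)
        return process.returncode, stdout, stderr
    except subprocess.TimeoutExpired:
        process.kill()
        stdout, stderr = process.communicate()
        return -1, stdout, stderr


def loose_check(return_code):
    """The check as written in the PR: only a timeout (-1) is rejected."""
    return return_code not in (-1,)


def strict_check(return_code):
    """The check suggested by the review: only known-good codes pass."""
    return return_code in ACCEPTED_CODES


if __name__ == "__main__":
    # A child process that fails with exit code 1.
    failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
    code, out, err = run_and_get_code(failing_cmd, timeout=30)
    print(f"exit code: {code}")
    print(f"loose check passes:  {loose_check(code)}")
    print(f"strict check passes: {strict_check(code)}")
```

With a child process that exits with code 1, the loose check still passes while the strict check rejects it, which is the silent-failure case the review describes.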
stdout, stderr = process.communicate()
return_code = -1
print(f"std_out: {stdout}")
assert return_code not in (-1,), f"Process exited with code {return_code}, stdout: {stdout}, stderr: {stderr}"
🔴 Bug: the return_code check is incorrect
The current `assert return_code not in (-1,)` only excludes the timeout case (-1). If the subprocess exits because of any other error (e.g., return code 1 or 2), the test does not fail, so errors may be silently ignored.
Following other similar tests in the project (e.g., test_fusedmoe_ep_entry.py, test_chunked_moe.py), consider changing it to:
assert return_code in (0, 250), f"Process exited with code {return_code}, stdout: {stdout}, stderr: {stderr}"
@@ -0,0 +1,124 @@
import unittest
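🟡 Suggestion: missing copyright header
Other test files in the project (e.g., test_hopper_ll_precision_entry.py) all include the Apache 2.0 copyright header; consider adding the following at the top of the file: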
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
- Tag list: [FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]
- `pre-commit` before commit.
- `release` branch: make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.