fix(dashboard): Fix SGLang benchmark workflow and integrate into dashboard by zhuyuhua-v · Pull Request #548 · ROCm/ATOM

zhuyuhua-v · 2026-04-13T07:07:08Z

Description

This PR introduces a dedicated GitHub Actions workflow for benchmarking ATOM-accelerated SGLang (atom-sglang-benchmark.yaml) and officially integrates its performance results into the benchmark dashboard.

To avoid code duplication, it also refactors the existing vLLM Python scripts into generic plugin scripts shared by both frameworks.

Cherry-picks #497 and #575 so that the SGLang Docker image and the benchmark work correctly; see each PR's description for details.

SGLang Benchmark Details

Trigger Method: Manual only (workflow_dispatch).
Target Model: DeepSeek-R1-0528 (FP8 TP8).
Hardware / Runner: Executed on atom-mi355-8gpu-oot-benchmark (MI355 8-GPU machines).
Workflow: Handles custom SGLang Docker image building, model loading, performance profiling, and result artifact generation.

Dashboard Integration

Added a dedicated hue offset (-30) for ATOM-SGLang in index.html to visually distinguish its charts from ATOM and ATOM-vLLM.
Fixed a missing deployment step in the workflow to ensure atom_logo_mini.png is correctly copied to the gh-pages branch.
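
As a rough illustration of what a per-backend hue offset does, the Python sketch below rotates a base color by a number of degrees on the color wheel. The dashboard itself is HTML/JavaScript, so this is not the code in index.html: `BACKEND_HUE_OFFSETS` and `shift_hue` are hypothetical names, and only the -30 value for ATOM-SGLang comes from this PR.

```python
import colorsys

# Hypothetical offset table (degrees). Only the -30 for ATOM-SGLang is
# taken from the PR description; backends not listed get no shift.
BACKEND_HUE_OFFSETS = {"ATOM-SGLang": -30}


def shift_hue(rgb, backend):
    """Rotate an (r, g, b) color with 0-255 channels by the backend's offset."""
    offset = BACKEND_HUE_OFFSETS.get(backend, 0)
    r, g, b = (channel / 255.0 for channel in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + offset / 360.0) % 1.0  # wrap around the color wheel
    shifted = colorsys.hls_to_rgb(h, l, s)
    return tuple(round(channel * 255) for channel in shifted)
```

With this sketch, a series drawn in one base color for ATOM would be drawn in a neighbouring hue for ATOM-SGLang, which is the visual separation the change is after.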

Script Unification (SGLang & vLLM)

Renamed the existing oot_benchmark_*.py scripts to plugin_benchmark_*.py to make them framework-agnostic.
Updated both atom-sglang-benchmark.yaml and atom-vllm-oot-benchmark.yaml to reuse these unified scripts by passing specific arguments (e.g., --title and --default-backend).
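
A minimal sketch of how a shared script might accept the two arguments named above. Only the flag names `--title` and `--default-backend` come from this PR; the defaults, the `label_entry` helper, and the `backend` field name are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Flag names are from the PR; defaults and help text are assumptions.
    parser = argparse.ArgumentParser(description="Shared plugin benchmark script (sketch)")
    parser.add_argument("--title", default="Benchmark Summary",
                        help="Heading used in the generated summary")
    parser.add_argument("--default-backend", default="ATOM-vLLM",
                        help="Backend label applied to results that carry none")
    return parser


def label_entry(entry, default_backend):
    """Return a copy of a result row with a backend label filled in if missing."""
    labelled = dict(entry)
    labelled.setdefault("backend", default_backend)  # hypothetical field name
    return labelled
```

Each workflow would then call the same script with its own values, for example `--default-backend ATOM-SGLang` from the SGLang workflow, rather than keeping a per-framework copy of the script.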

Docker Release Workflow Fix

Fixed mutual exclusivity bug in docker-release.yaml: Previously, if a user checked both only_release_oot and only_release_sglang, the workflow would fail to build either image. The logic has been updated so that if both are checked, both the OOT and SGLang images will be built and released (while still correctly skipping the native image test/push). The UI descriptions for these inputs have been updated to reflect this behavior.
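
The gating described above can be written as a small truth table. This is a sketch in Python rather than the actual workflow expressions in docker-release.yaml; `release_plan` is a hypothetical helper, and the behavior when neither input is checked (build everything) is an assumption not stated in this PR.

```python
def release_plan(only_release_oot: bool, only_release_sglang: bool) -> dict:
    """Decide which images to build for a given pair of workflow inputs."""
    any_only = only_release_oot or only_release_sglang
    return {
        # The native image is skipped as soon as any "only_release_*" is set.
        "native": not any_only,
        # An image is built if it was selected, or if the other one was not
        # singled out (the neither-checked case is an assumption).
        "oot": only_release_oot or not only_release_sglang,
        "sglang": only_release_sglang or not only_release_oot,
    }
```

The key property is the both-checked row: `release_plan(True, True)` selects both the OOT and SGLang images, where the earlier mutually exclusive conditions selected neither.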

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

Copilot

Pull request overview

This PR adds/updates benchmarking automation for ATOM-accelerated SGLang and wires its results into the existing benchmark dashboard, while refactoring the benchmark-processing scripts to be framework-agnostic and shared across vLLM and SGLang.

Changes:

Introduces shared “plugin_*” benchmark scripts and updates both vLLM and SGLang benchmark workflows to use them (including per-backend labeling for the dashboard).
Adds a shared graph-capture patch helper and hooks SGLang plugin mode to apply it (mirroring the vLLM plugin patch behavior).
Updates Docker release workflow logic to allow releasing both OOT and SGLang images when both inputs are selected; adds dashboard color offset for ATOM-SGLang.
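
The shared graph-capture helper could look roughly like the sketch below, which wraps a class's `graph_capture` context manager so that a second capture context is entered around it. This is not the code in atom/plugin/graph_capture_patch.py: `patch_graph_capture` and `extra_capture` are hypothetical names, the nesting order is an assumption, and no real vLLM, SGLang, or aiter API is called.

```python
import contextlib


def patch_graph_capture(coordinator_cls, extra_capture):
    """Wrap coordinator_cls.graph_capture so extra_capture() surrounds it.

    coordinator_cls stands in for a framework's GroupCoordinator and
    extra_capture for a zero-argument context-manager factory.
    """
    original = coordinator_cls.graph_capture
    if getattr(original, "_nested_capture_patched", False):
        return  # already patched, so applying the patch twice is harmless

    @contextlib.contextmanager
    def patched(self, *args, **kwargs):
        # Assumed order: enter the extra capture first, then the framework's.
        with extra_capture():
            with original(self, *args, **kwargs) as ctx:
                yield ctx

    patched._nested_capture_patched = True
    coordinator_cls.graph_capture = patched
```

Under this design, the framework-specific modules only need to import their own coordinator class and pass it to the shared helper, which is consistent with the "delegating" modules described in the file summary.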

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

File	Description
`docker/Dockerfile`	Removes ROCm vision/audio wheel pinning and replaces with runtime validation for both OOT and SGLang images.
`atom/plugin/graph_capture_patch.py`	Adds a shared helper to patch framework `GroupCoordinator.graph_capture` to nest `aiter` capture.
`atom/plugin/vllm/graph_capture_patch.py`	Refactors vLLM patch module to delegate to the shared helper.
`atom/plugin/sglang/graph_capture_patch.py`	Adds SGLang patch module delegating to the shared helper.
`atom/plugin/prepare.py`	Applies the SGLang graph-capture patch during SGLang plugin model preparation.
`.github/workflows/docker-release.yaml`	Fixes release gating logic so selecting both “only_release_*” builds/releases both images.
`.github/workflows/atom-vllm-benchmark.yaml`	Switches to unified plugin scripts; passes title/default backend for dashboard integration.
`.github/workflows/atom-sglang-benchmark.yaml`	Switches to unified plugin scripts; publishes results with ATOM-SGLang backend labeling and data-only gh-pages updates.
`.github/scripts/plugin_benchmark_validate_baseline.py`	Generalizes baseline validation to skip any `*_benchmark_summary.json` and regression report.
`.github/scripts/plugin_benchmark_to_dashboard.py`	Generalizes dashboard conversion; adds `--default-backend` and supports SGLang image tag field.
`.github/scripts/plugin_benchmark_summary.py`	Generalizes summary generation; adds `--title`.
`.github/scripts/plugin_benchmark_regression.py`	Generalizes regression report wording (OOT → generic).
`.github/dashboard/index.html`	Adds backend hue offset for `ATOM-SGLang` to visually separate charts.
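
To make the baseline-validation change concrete, here is a minimal sketch of filtering out derived artifacts by file name before validating results. The `*_benchmark_summary.json` pattern is from the table above; the regression report file name and the `is_result_file` helper are assumptions, since the PR does not spell them out.

```python
from fnmatch import fnmatch

# The first pattern is from the PR; the second file name is hypothetical.
SKIP_PATTERNS = ("*_benchmark_summary.json", "regression_report.json")


def is_result_file(name: str) -> bool:
    """Return True for JSON files that should be validated as raw results."""
    if not name.endswith(".json"):
        return False
    return not any(fnmatch(name, pattern) for pattern in SKIP_PATTERNS)
```

Matching on a wildcard rather than a fixed prefix is what lets the same script skip the summary produced by either framework's workflow.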

Comments suppressed due to low confidence (1)

.github/scripts/plugin_benchmark_to_dashboard.py:91

.github/workflows/pre-checks.yaml runs Black and Ruff on the whole repo; plugin_benchmark_to_dashboard.py currently has formatting that will fail those checks (e.g., overly long lines and whitespace on blank lines around the build_entries definition / image_tag block). Please run Black on this file (which will also remove the trailing whitespace) so CI style checks pass.

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

Copilot

Pull request overview

This PR adds a dedicated GitHub Actions workflow for ATOM+SGLang benchmarking and integrates its results into the existing benchmark dashboard, while refactoring previously OOT/vLLM-specific benchmark scripts into shared “plugin” scripts reusable across frameworks.

Changes:

Introduces/updates the ATOM SGLang benchmark workflow and wires its results into the dashboard publish flow.
Refactors benchmark helper scripts (*_summary, *_to_dashboard, *_validate_baseline, *_regression) into framework-agnostic “plugin_*” variants and updates workflows to use them.
Centralizes the graph-capture patch logic into a shared helper and adds an SGLang-specific delegating patch module.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

File	Description
`docker/Dockerfile`	Removes ROCm vision/audio wheel pinning and replaces with runtime validation; adjusts SGLang step labels.
`atom/plugin/graph_capture_patch.py`	Adds shared framework-agnostic `graph_capture` patch helper.
`atom/plugin/vllm/graph_capture_patch.py`	Refactors vLLM patch module to delegate to shared helper.
`atom/plugin/sglang/graph_capture_patch.py`	Adds SGLang patch module delegating to shared helper.
`atom/plugin/prepare.py`	Applies SGLang `graph_capture` patch during plugin model preparation.
`.github/workflows/docker-release.yaml`	Fixes `only_release_*` mutual-exclusivity behavior so both can be released when both are selected.
`.github/workflows/atom-vllm-benchmark.yaml`	Switches to unified plugin benchmark scripts and passes backend/title parameters.
`.github/workflows/atom-sglang-benchmark.yaml`	Expands model toggles, loads model configs from JSON, uses unified plugin scripts, and publishes dashboard data.
`.github/scripts/plugin_benchmark_validate_baseline.py`	Generalizes baseline validation to ignore summary artifacts and remove OOT-specific wording.
`.github/scripts/plugin_benchmark_to_dashboard.py`	Generalizes dashboard conversion, adds `--default-backend`, supports SGLang image tag field.
`.github/scripts/plugin_benchmark_summary.py`	Generalizes summary title/wording and adds `--title`.
`.github/scripts/plugin_benchmark_regression.py`	Generalizes regression script wording (OOT → benchmark).
`.github/dashboard/index.html`	Adds hue offset for `ATOM-SGLang` to visually distinguish charts.
`.github/benchmark/sglang_benchmark_models.json`	Adds JSON model configuration source for the SGLang benchmark workflow.
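
One way to picture "model toggles" combined with a JSON configuration source is the sketch below, which keeps only the entries whose toggle is switched on. The schema is entirely hypothetical: the actual layout of sglang_benchmark_models.json is not shown in this PR, and the `name` field, the toggle mapping, and `select_models` are assumptions.

```python
import json


def select_models(config_text: str, toggles: dict) -> list:
    """Return the model entries whose name is enabled in toggles.

    config_text is assumed to be a JSON array of objects that each carry
    a "name" key; entries without an enabled toggle are dropped.
    """
    models = json.loads(config_text)
    return [model for model in models if toggles.get(model["name"], False)]
```

In a workflow, the toggles would typically come from manual dispatch inputs, so adding a benchmark case means adding a JSON entry and an input rather than editing the job steps.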

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

zhuyuhua-v added 4 commits April 14, 2026 09:05

add sglang benchmark

f10f054

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

fix docker release workflow

e0542f6

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

update push dashboard data

14e2f4f

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

add graph capture patch(like vLLM) for sglang+atom path

1f36af3

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

zhuyuhua-v force-pushed the yuhua/slg-benchmark branch from e8d9255 to 1f36af3 on April 14, 2026 09:14

zhuyuhua-v added 2 commits April 15, 2026 22:46

Merge branch 'main' into yuhua/slg-benchmark

9121970

remove pinned ROCm torchvision/torchaudio wheels for vllm and sglang

6497547

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

zhuyuhua-v marked this pull request as ready for review April 16, 2026 01:45

Copilot AI review requested due to automatic review settings April 16, 2026 01:45

Copilot started reviewing on behalf of zhuyuhua-v April 16, 2026 01:46

Copilot AI reviewed Apr 16, 2026

Comment thread .github/workflows/atom-sglang-benchmark.yaml

zejunchen-zejun previously approved these changes Apr 16, 2026

zhuyuhua-v dismissed zejunchen-zejun’s stale review via 3009f89 April 16, 2026 03:22

add more cases: ds fp4 tp4 tp8, ds fp8 tp4

2f6a1cd

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

zhuyuhua-v force-pushed the yuhua/slg-benchmark branch from 3009f89 to 2f6a1cd on April 16, 2026 04:51

Copilot AI review requested due to automatic review settings April 16, 2026 04:51

Copilot AI reviewed Apr 16, 2026

Comment thread docker/Dockerfile

Comment thread atom/plugin/vllm/graph_capture_patch.py

Comment thread atom/plugin/sglang/graph_capture_patch.py

Comment thread .github/workflows/atom-sglang-benchmark.yaml

Comment thread .github/workflows/atom-sglang-benchmark.yaml

zejunchen-zejun previously approved these changes Apr 16, 2026

clean format

91e93ce

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>

zhuyuhua-v dismissed zejunchen-zejun’s stale review via 91e93ce April 16, 2026 05:28

zejunchen-zejun approved these changes Apr 16, 2026

wuhuikx approved these changes Apr 16, 2026

wuhuikx requested review from gyohuangxin and valarLip April 16, 2026 06:34

gyohuangxin approved these changes Apr 16, 2026

valarLip approved these changes Apr 16, 2026

valarLip merged commit a2f33bf into main Apr 16, 2026
23 of 29 checks passed

valarLip deleted the yuhua/slg-benchmark branch April 16, 2026 07:56

zhuyuhua-v mentioned this pull request Apr 17, 2026

[fix] add graph capture patch(like vLLM) for sglang+atom path #497

Closed

1 task

valarLip merged 8 commits into main from yuhua/slg-benchmark


6 participants
