Skip to content

Fix NVBandwidth benchmark results parsing bug (#748)#782

Open
NJX-njx wants to merge 1 commit intomicrosoft:mainfrom
NJX-njx:fix/nvbandwidth-parsing-bug-748
Open

Fix NVBandwidth benchmark results parsing bug (#748)#782
NJX-njx wants to merge 1 commit intomicrosoft:mainfrom
NJX-njx:fix/nvbandwidth-parsing-bug-748

Conversation

@NJX-njx
Copy link

@NJX-njx NJX-njx commented Mar 3, 2026

Summary

Fixes #748 - NVBandwidth benchmark results parsing bug causing invalid metric names like _sum_None\ and \device_to_device_latency_sm_sum_None.

Root Cause

When parsing SUM lines, \parse_status['test_name']\ or \parse_status['benchmark_type']\ could be empty/None (e.g., after waived tests or when header parsing fails), leading to invalid metric keys.

Changes

  • Add validation: only add SUM result when both \ est_name\ and \�enchmark_type\ are valid
  • Use test name from SUM line (\match.group(1)) as fallback when \parse_status\ is stale
  • Infer \�enchmark_type\ from test name when not set: 'latency' in name -> 'lat', else 'bw'

Testing

  • Logic verified against \ ests/data/nvbandwidth_results.log\ format
  • Unit tests require CUDA platform (CI will run on Linux)

…ency_sm_sum_None

- Add validation before adding SUM results: only add when test_name and benchmark_type are valid
- Use test name from SUM line as fallback when parse_status is stale (e.g., after waived tests)
- Infer benchmark_type from test name when not set (latency tests -> lat, else bw)

Fixes microsoft#748

Made-with: Cursor
Copilot AI review requested due to automatic review settings March 3, 2026 14:11
@NJX-njx NJX-njx requested a review from a team as a code owner March 3, 2026 14:11
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes NVBandwidth SUM-line parsing to prevent invalid metric keys (e.g., _sum_None) by making SUM parsing more resilient when parsing state is missing or stale.

Changes:

  • Use the SUM line’s captured test name as a fallback when parse_status['test_name'] is empty.
  • Infer benchmark_type from the test name when it wasn’t detected from the matrix header.
  • Guard SUM metric emission to only occur when both test_name and benchmark_type are valid.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 192 to +194
if self.re_summary_pattern.match(line):
value = self.re_summary_pattern.match(line).group(2)
test_name = parse_status['test_name']
match = self.re_summary_pattern.match(line)
value = match.group(2)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re_summary_pattern.match(line) is evaluated twice (once in the if and again to assign match). Consider assigning once (e.g., match = ...; if match:) to avoid duplicated work and keep the control flow simpler.

Copilot uses AI. Check for mistakes.
Comment on lines +195 to +203
# Use test_name from parse_status, fallback to group(1) from SUM line
test_name = parse_status['test_name'] or match.group(1).lower()
benchmark_type = parse_status['benchmark_type']
parse_status['results'][f'{test_name}_sum_{benchmark_type}'] = float(value)
# Infer benchmark_type from test_name if not set (e.g., after waived tests)
if benchmark_type is None:
benchmark_type = 'lat' if 'latency' in test_name else 'bw'
# Only add result when we have valid metric name (avoid _sum_None or sum_None)
if test_name and benchmark_type:
parse_status['results'][f'{test_name}_sum_{benchmark_type}'] = float(value)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new fallback/inference paths for SUM parsing (using match.group(1) when parse_status['test_name'] is empty, and inferring benchmark_type when it's None) aren’t covered by existing nvbandwidth tests (current tests only exercise the full header+matrix path). Adding a unit test with a minimal raw output containing a SUM ... line but no prior matrix header (and/or no preceding Running ...) would help prevent regressions of the _sum_None metric-name bug.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NVBandwidth benchmark results parsing bug

2 participants