Fix NVBandwidth benchmark results parsing bug (#748)#782
Fix NVBandwidth benchmark results parsing bug (#748)#782NJX-njx wants to merge 1 commit intomicrosoft:mainfrom
Conversation
…ency_sm_sum_None - Add validation before adding SUM results: only add when test_name and benchmark_type are valid - Use test name from SUM line as fallback when parse_status is stale (e.g., after waived tests) - Infer benchmark_type from test name when not set (latency tests -> lat, else bw) Fixes microsoft#748 Made-with: Cursor
There was a problem hiding this comment.
Pull request overview
Fixes NVBandwidth SUM-line parsing to prevent invalid metric keys (e.g., _sum_None) by making SUM parsing more resilient when parsing state is missing or stale.
Changes:
- Use the SUM line’s captured test name as a fallback when
parse_status['test_name']is empty. - Infer
benchmark_typefrom the test name when it wasn’t detected from the matrix header. - Guard SUM metric emission to only occur when both
test_nameandbenchmark_typeare valid.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if self.re_summary_pattern.match(line): | ||
| value = self.re_summary_pattern.match(line).group(2) | ||
| test_name = parse_status['test_name'] | ||
| match = self.re_summary_pattern.match(line) | ||
| value = match.group(2) |
There was a problem hiding this comment.
re_summary_pattern.match(line) is evaluated twice (once in the if and again to assign match). Consider assigning once (e.g., match = ...; if match:) to avoid duplicated work and keep the control flow simpler.
| # Use test_name from parse_status, fallback to group(1) from SUM line | ||
| test_name = parse_status['test_name'] or match.group(1).lower() | ||
| benchmark_type = parse_status['benchmark_type'] | ||
| parse_status['results'][f'{test_name}_sum_{benchmark_type}'] = float(value) | ||
| # Infer benchmark_type from test_name if not set (e.g., after waived tests) | ||
| if benchmark_type is None: | ||
| benchmark_type = 'lat' if 'latency' in test_name else 'bw' | ||
| # Only add result when we have valid metric name (avoid _sum_None or sum_None) | ||
| if test_name and benchmark_type: | ||
| parse_status['results'][f'{test_name}_sum_{benchmark_type}'] = float(value) |
There was a problem hiding this comment.
The new fallback/inference paths for SUM parsing (using match.group(1) when parse_status['test_name'] is empty, and inferring benchmark_type when it's None) aren’t covered by existing nvbandwidth tests (current tests only exercise the full header+matrix path). Adding a unit test with a minimal raw output containing a SUM ... line but no prior matrix header (and/or no preceding Running ...) would help prevent regressions of the _sum_None metric-name bug.
Summary
Fixes #748 - NVBandwidth benchmark results parsing bug causing invalid metric names like _sum_None\ and \device_to_device_latency_sm_sum_None.
Root Cause
When parsing SUM lines, \parse_status['test_name']\ or \parse_status['benchmark_type']\ could be empty/None (e.g., after waived tests or when header parsing fails), leading to invalid metric keys.
Changes
Testing