Skip to content

[Bench] Add PTI XPTI instrumentation overhead benchmark#21558

Open
againull wants to merge 6 commits intointel:syclfrom
againull:xpti_bench
Open

[Bench] Add PTI XPTI instrumentation overhead benchmark#21558
againull wants to merge 6 commits intointel:syclfrom
againull:xpti_bench

Conversation

@againull
Copy link
Contributor

Add a new benchmark suite to measure XPTI instrumentation overhead
using the pti-gpu SDK's perf-profiling-overhead test. The benchmark:

  • Clones and builds the pti-gpu repository SDK
  • Runs the perf-profiling-overhead ctest
  • Parses and reports the overhead percentage
    The benchmark is included in the "Full" preset.

Additionally add option to verbose output (false by default).

Co-Authored-By: Claude Sonnet 4.5 noreply@anthropic.com

againull and others added 5 commits March 17, 2026 21:29
Add a new benchmark suite to measure XPTI instrumentation overhead
using the pti-gpu SDK's perf-profiling-overhead test. The benchmark:
- Clones and builds the pti-gpu repository SDK
- Runs the perf-profiling-overhead ctest
- Parses and reports the overhead percentage

The benchmark is included in the "Full" and "SYCL" presets.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@againull againull requested review from a team as code owners March 18, 2026 22:12
# See LICENSE.TXT
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

import os
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import, remove

performs multiple internal iterations and reports a median value. The framework
then computes the median of all these median values.

For stable results, use --iterations=20 or higher.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we need iterations variable in dispatched runs now? Could you add it, please?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... I'm not sure if we can add many new variables in .github/workflows/sycl-ur-perf-benchmarking.yml workflow. Perhaps if we want to add iterations we could not add verbose...?

Also, alternatively we could print a warning or fail if iterations are below 20...?


def git_hash(self) -> str:
# Latest master branch - can be updated as needed
return "master"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm.. we rather use tags/commits, as moving target "master" may lead to unexpected issues.

use_installdir=False,
)

# Patch CMakeLists.txt to increase threshold from 60 to 70
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note: if this threshold is important variable, perhaps we should add support in xpti project to manipulate this value via e.g. env var or cmake option...?

extra_args = [
f"-DCMAKE_C_COMPILER={options.sycl}/bin/clang",
f"-DCMAKE_CXX_COMPILER={options.sycl}/bin/clang++",
"-DCMAKE_CXX_FLAGS=-Wall -Wextra -Wextra-semi -pedantic -Wformat -Wformat-security -Werror=format-security -fstack-protector-strong -D_FORTIFY_SOURCE=2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need -Wall etc. in benchmarking...?

f"-S{self.project.src_dir}/sdk",
f"-B{build_dir}",
]
+ extra_args,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these extra_agrs are only used here, why not merge all options to a single list? unless you wanted to control wheter they are enabled or not (in some cases)


For stable results, use --iterations=20 or higher.
"""
build_dir = self.suite.project.src_dir / "sdk" / "build"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we reuse build_dir from the suite?

performs multiple internal iterations and reports a median value. The framework
then computes the median of all these median values.

For stable results, use --iterations=20 or higher.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... I'm not sure if we can add many new variables in .github/workflows/sycl-ur-perf-benchmarking.yml workflow. Perhaps if we want to add iterations we could not add verbose...?

Also, alternatively we could print a warning or fail if iterations are below 20...?


if not results:
# Fallback: look for simpler overhead pattern
fallback_pattern = r"(?:Overhead|overhead).*:\s*([0-9.]+)\s*%"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps we could use a fallback_pattern in an else to the if match:. This way we could get rid of the extra for and make it a little simpler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants