Pull request overview
Improves ais-check’s HIP runtime detection by scanning common ROCm/HIP installation locations and reporting which discovered HIP runtimes support AIS-related symbols.
Changes:
- Add filesystem-based discovery for `libamdhip64.so` across common ROCm/HIP paths.
- Probe each discovered HIP runtime for AIS symbol availability and store per-library support state.
- Print a per-HIP-library AIS support summary in non-quiet mode.
Pull request overview
Enhances tools/ais-check to improve HIP runtime detection for AMD Infinity Storage (AIS) and to provide more detailed output about which HIP runtime libraries were found and whether each supports AIS.
Changes:
- Add filesystem/environment-based discovery of `libamdhip64.so` across multiple common ROCm install locations.
- Probe AIS symbol availability across all discovered HIP runtime libraries and aggregate results.
- Print a per-library AIS support summary in the non-quiet output.
if err != 0:
    if symbol_status.value != 1:
        symbol = symbol.decode("utf-8")
        err_str = hip.hipGetErrorString(err).decode("utf-8")
        print(
            f"hipGetProcAddress({symbol}) failed with err code"
# 1. Respect runtime linker paths
for p in os.environ.get("LD_LIBRARY_PATH", "").split(":"):
    if p:
        # Clean up the path by removing `..`, etc. and getting
        # an absolute path.
        safe_p = os.path.abspath(os.path.normpath(p))

        candidates.append(os.path.join(safe_p, "libamdhip64.so"))

# 2. Environment variables commonly set by ROCm or modules
for var in ("ROCM_HOME", "ROCM_PATH", "HIP_PATH"):
    base = os.environ.get(var)
I think that ais-check should return success if the current environment supports the fastpath. I think this change breaks that. If the current environment is using ROCm 7.0, but ROCm 7.2 is also installed, would this script return success? If so, I don't think it should.
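One possible way to address this concern, as a rough sketch only: probe just the runtime that the dynamic linker resolves for the bare soname, and base the exit status on that result alone. The helper name `active_runtime_has_symbols` and its `loader` parameter are hypothetical, the AIS symbol names are left to the caller, and the lookup here is a plain `ctypes` attribute lookup (dlsym-style) rather than the `hipGetProcAddress` call the script uses.

```python
import ctypes


def active_runtime_has_symbols(symbols, soname="libamdhip64.so", loader=ctypes.CDLL):
    """Return True only if the default-resolved runtime exports every symbol.

    `loader` is a hypothetical injection point (defaults to ctypes.CDLL) so the
    check can be exercised without a HIP installation.
    """
    try:
        # Loading by bare soname follows the normal linker search order
        # (LD_LIBRARY_PATH, rpath, ld.so cache), i.e. the current environment.
        lib = loader(soname)
    except OSError:
        # No HIP runtime is reachable in the current environment.
        return False
    # ctypes raises AttributeError for a missing export, so hasattr is False.
    return all(hasattr(lib, name) for name in symbols)
```

Under this approach, other discovered installs could still be listed for information, but they would not change the return code.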
print(u.sysname, u.nodename, u.release, u.version, u.machine)

print()
print("Found these HIP libraries (some may be symlinks):")
Might be worth adding the word "redundant" (e.g. redundant symlinks) since /opt/rocm will likely be a duplicate entry (tbh I can't think of typical scenarios where it's not). To fix entirely, we could track ROCm installs by the realpath to get rid of symlinks.
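A minimal sketch of the realpath idea, assuming the discovery step produces a list of candidate paths. The function name `dedupe_by_realpath` is hypothetical and not part of the PR.

```python
import os


def dedupe_by_realpath(candidates):
    """Return existing candidates, one entry per resolved file, in first-seen order."""
    seen = set()
    unique = []
    for path in candidates:
        if not os.path.exists(path):
            # Skip candidates that are not present on this system.
            continue
        # realpath follows symlinks, so /opt/rocm/... and /opt/rocm-X.Y/...
        # collapse to the same entry when one is a symlink to the other.
        real = os.path.realpath(path)
        if real not in seen:
            seen.add(real)
            unique.append(real)
    return unique
```

With this in place, the "some may be symlinks" caveat in the output would no longer be needed, since each reported path is already resolved.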
Played around with getting the absolute path when using find_library. Testing:
Adds better HIP detection and output
AIHIPFILE-156