80 changes: 35 additions & 45 deletions easybuild/easyconfigs/r/rocm/USER.md
@@ -1,60 +1,50 @@
# ROCm user instructions

There is a **big disclaimer** with these modules.
**This is ROCm(tm) installed in a non-standard location which may have
consequences for some programs that may rely on the standard location
for ROCm(tm).**

**THIS IS ROCM INSTALLED IN A WAY IT IS NOT MEANT TO BE INSTALLED.**

The ROCm installations outside of the Cray PE modules (so the 5.2.5, 5.3.3,
5.4.6, 5.6.1 and 6.2.2 modules) come **without any warranty nor support** as
they are not installed in the proper directories suggested by AMD thus may break
links encoded in the RPMs from which these packages were installed and as they
are also not guaranteed to be compatible with modules from the Cray PE as
only HPE Cray can give that warranty and as their inner working and precise
requirements is not public.
Neither can we guarantee that these modules are always compatible with
the HPE Cray Programming Environment as each version of the CPE is developed
for particular ROCm(tm) versions.

- The only modules officially supported by the current AMD GPU driver at the
time of writing (October 2024) are the `5.6.1` and `6.2.2` modules. Using
the `5.6.1` module is recommended only if a performance regression is
observed with the `6.0.3` or `6.2.2` modules. The use of the other modules
(`5.2.5`, `5.3.3` and `5.4.6`) is strongly discouraged and no longer
supported by the LUMI User Support Team.

- The ROCm modules have some PDF documentation in
`$EBROOTROCM/share/doc/rocgdb`, `$EBROOTROCM/share/doc/tracer`,
`$EBROOTROCM/share/doc/rocm_smi` and `$EBROOTROCM/share/doc/amd-dbgapi`. The
time of writing (February 2026) are the `6.2.2`, `6.2.4` and `6.4.4` modules.
Older modules may still be present on the system as a full clean-up is
nearly impossible, but modules older than ROCm(tm) 6.1 will likely not
be fully functional and there is nothing the LUMI User Support Team can
do about that.

The `6.2.2` module is only there for historical reasons as it was installed
before `6.2.4` became available.

- The ROCm modules have PDF documentation in several subdirectories of
`$EBROOTROCM/share/doc`. The
`EBROOTROCM` environment variable is defined after loading the module.

- The `6.2.2` modules can be used with `PrgEnv-amd` but comes without matching
`amd/6.2.2` module. It is sufficient to load the `rocm/6.2.2` module after
- The `6.2.2` and `6.2.4` modules can be used with `PrgEnv-amd` but come without a matching
`amd/6.2.2` or `amd/6.2.4` module. It is sufficient to load the
`rocm/6.2.2` or `rocm/6.2.4` module after
the `PrgEnv-amd` module (or `cpeAMD` module) to enable this ROCm version
also for the compiler wrappers in that programming environment.

- The `6.2.2` modules **is not compatible with the CCE 17.0.0 and 17.0.1
compilers** due to an incompatibility between LLVM 17 on which the CCE is
based and LLVM 18 from ROCm 6.2. The only supported programming environments
- The `6.2.2` and `6.2.4` modules **are not compatible with the CCE 17.0.1
compilers** (in the 23.09 version of the programming environment)
due to an incompatibility between LLVM 17 on which the CCE is
based and LLVM 18 from ROCm(tm) 6.2. The only supported programming environments
are PrgEnv-gnu (or cpeGNU) and PrgEnv-amd (or cpeAMD).

- Since ROCm 6.2, hipSolver depends on SuiteSparse. If an application depends
on hipSolver, it is the user responsibility to load the SuiteSparse module
- Since ROCm(tm) 6.2, hipSolver depends on SuiteSparse. If an application depends
on hipSolver, it is the user's responsibility to load the SuiteSparse module
which corresponds to the CPE they wish to use (cpeAMD or cpeGNU). Note that
the SuiteSparse module needs to be **loaded before** the `rocm/6.2.2` module
or `rocm/6.0.3` will be used.

- In the `CrayEnv` environment, omniperf dependencies have been installed for
all `cray-python` versions available at the time of the module installation
(October 2024, Python 3.9, 3.10 and 3.11) but the `cray-python` module is
not loaded as a dependency to let the choice of the Python version to the
user. Therefore, if you want to use omniperf, you need to load a
`cray-python` module yourself. In the `LUMI` environment, the only supported
version of Python is the one coming from the corresponding release of the
CPE. For example, for `LUMI/24.03` omniperf dependencies have been installed
for version 3.11. **Omniperf is not compatible with the system Python
(version 3.6)**.

Note that using ROCm in containers is still subject to the same driver
compatibility problems. Though containers will solve the problem of ROCm being
the SuiteSparse module needs to be **loaded before** the `rocm` module;
otherwise the regular `rocm` module for the toolchain will be used.
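
The load-order rules in the list above can be sketched as a short shell session. This is only an illustration under stated assumptions: `load_in_order` is a hypothetical helper that does not exist on the system, `SuiteSparse` without a version is a placeholder for whichever SuiteSparse module matches the chosen CPE, and `rocm/6.2.4` is just one of the versions named above.

```shell
# Hypothetical helper (not part of the system): load modules in the given
# order, or only print the commands when no `module` command is available.
load_in_order() {
  for m in "$@"; do
    if command -v module >/dev/null 2>&1; then
      module load "$m"
    else
      echo "module load $m"
    fi
  done
}

# Programming environment first, then SuiteSparse (only needed when the
# application uses hipSolver), and the rocm module last.
# "SuiteSparse" without a version is a placeholder, not a verified module name.
load_in_order PrgEnv-amd SuiteSparse rocm/6.2.4

# EBROOTROCM is only defined once the rocm module is loaded.
if [ -n "${EBROOTROCM:-}" ]; then
  ls "$EBROOTROCM/share/doc"
fi
```

If the application does not use hipSolver, the SuiteSparse entry can simply be dropped from the list.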

Note that using ROCm(tm) in containers is still subject to the same driver
compatibility problems as when using these modules.
Though containers will solve the problem of ROCm(tm) being
installed in a non-standard path (which was needed for the modules as the
standard path is already occupied by a different ROCm version), it will not
solve any problem caused by running a newer version of ROCm on a too old driver
(and there may be problems running an old version of ROCm on a too new driver
standard path is already occupied by a different ROCm(tm) version), it will not
solve any problem caused by running a newer version of ROCm(tm) on a driver that is too old
(and there may be problems running an old version of ROCm(tm) on a driver that is too new
also).