3 changes: 2 additions & 1 deletion Compiling.md
Original file line number Diff line number Diff line change
@@ -34,6 +34,7 @@ As also mentioned in the instructions below but repeated here for visibility, if
* If using the CUDA backend, CUDA 11 or later and a compatible version of CUDNN based on your CUDA version (https://developer.nvidia.com/cuda-toolkit) (https://developer.nvidia.com/cudnn) and a GPU capable of supporting them.
* If using the TensorRT backend, in addition to a compatible CUDA Toolkit (https://developer.nvidia.com/cuda-toolkit), you also need TensorRT (https://developer.nvidia.com/tensorrt) that is at least version 8.5.
* If using the ROCm backend, ROCm 6.4 or later and a GPU capable of supporting it. See the installation guide (https://rocm.docs.amd.com/projects/install-on-linux/en/latest/), and install all the ROCm developer packages rather than just the ROCm runtime packages.
* If using the MIGraphX backend, ROCm 7.0 or later with MIGraphX library installed.
* If using the Eigen backend, Eigen3. With Debian packages, (i.e. apt or apt-get), this should be `libeigen3-dev`.
* zlib, libzip. With Debian packages (i.e. apt or apt-get), these should be `zlib1g-dev`, `libzip-dev`.
* If you want to do self-play training and research, probably Google perftools `libgoogle-perftools-dev` for TCMalloc or some other better malloc implementation. For unknown reasons, the allocation pattern in self-play with large numbers of threads and parallel games causes a lot of memory fragmentation under glibc malloc that will eventually run your machine out of memory, but better mallocs handle it fine.
@@ -42,7 +43,7 @@ As also mentioned in the instructions below but repeated here for visibility, if
* `git clone https://github.com/lightvector/KataGo.git`
* Compile using CMake and make in the cpp directory:
* `cd KataGo/cpp`
* `cmake . -DUSE_BACKEND=OPENCL` or `cmake . -DUSE_BACKEND=CUDA` or `cmake . -DUSE_BACKEND=TENSORRT` or `cmake . -DUSE_BACKEND=EIGEN` or `cmake . -DUSE_BACKEND=ROCM`depending on which backend you want.
* `cmake . -DUSE_BACKEND=OPENCL` or `cmake . -DUSE_BACKEND=CUDA` or `cmake . -DUSE_BACKEND=TENSORRT` or `cmake . -DUSE_BACKEND=EIGEN` or `cmake . -DUSE_BACKEND=ROCM` or `cmake . -DUSE_BACKEND=MIGRAPHX` depending on which backend you want.
* Specify also `-DUSE_TCMALLOC=1` if using TCMalloc.
* Compiling will also call git commands to embed the git hash into the compiled executable, specify also `-DNO_GIT_REVISION=1` to disable it if this is causing issues for you.
* Specify `-DUSE_AVX2=1` to also compile Eigen with AVX2 and FMA support, which will make it incompatible with old CPUs but much faster. (If you want to go further, you can also add `-DCMAKE_CXX_FLAGS='-march=native'` which will specialize to precisely your machine's CPU, but the exe might not run on other machines at all).
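Putting the flags above together, a full MIGraphX build might look like this (a sketch, assuming a ROCm 7.0+ system with MIGraphX installed; swap the backend flag for OPENCL, CUDA, TENSORRT, or EIGEN as needed):

```shell
# Hypothetical end-to-end build with the MIGraphX backend.
git clone https://github.com/lightvector/KataGo.git
cd KataGo/cpp
cmake . -DUSE_BACKEND=MIGRAPHX
make -j "$(nproc)"
# The version output reports which backend was compiled in:
./katago version
```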
8 changes: 5 additions & 3 deletions README.md
@@ -8,7 +8,7 @@
- [GUIs](#guis)
- [Windows and Linux](#windows-and-linux)
- [MacOS](#macos)
- [OpenCL vs CUDA vs TensorRT vs ROCm vs Eigen](#opencl-vs-cuda-vs-tensorrt-vs-rocm-vs-eigen)
- [OpenCL vs CUDA vs TensorRT vs ROCm vs MIGraphX vs Eigen](#opencl-vs-cuda-vs-tensorrt-vs-rocm-vs-migraphx-vs-eigen)
- [How To Use](#how-to-use)
- [Human-style Play and Analysis](#human-style-play-and-analysis)
- [Other Commands:](#other-commands)
@@ -87,8 +87,8 @@ The community also provides KataGo packages for [Homebrew](https://brew.sh) on M

Use `brew install katago`. The latest config files and networks are installed in KataGo's `share` directory. Find them via `brew list --verbose katago`. A basic way to run katago will be `katago gtp -config $(brew list --verbose katago | grep 'gtp.*\.cfg') -model $(brew list --verbose katago | grep .gz | head -1)`. You should choose the Network according to the release notes here and customize the provided example config as with every other way of installing KataGo.

### OpenCL vs CUDA vs TensorRT vs ROCm vs Eigen
KataGo has five backends, OpenCL (GPU), CUDA (GPU), TensorRT (GPU), ROCm (GPU) and Eigen (CPU).
### OpenCL vs CUDA vs TensorRT vs ROCm vs MIGraphX vs Eigen
KataGo has six backends, OpenCL (GPU), CUDA (GPU), TensorRT (GPU), ROCm (GPU), MIGraphX (GPU) and Eigen (CPU).

The quick summary is:
* **To easily get something working, try OpenCL if you have any good or decent GPU.**
@@ -97,12 +97,14 @@ The quick summary is:
* Use Eigen without AVX2 if your CPU is old or on a low-end device that doesn't support AVX2.
* The CUDA backend can work for NVIDIA GPUs with CUDA+CUDNN installed but is likely worse than TensorRT.
* The ROCm backend can work for AMD GPUs with ROCm+MIOpen installed.
* The MIGraphX backend is an alternative AMD GPU backend using MIGraphX instead of MIOpen.

More in detail:
* OpenCL is a general GPU backend that should be able to run with any GPUs or accelerators that support [OpenCL](https://en.wikipedia.org/wiki/OpenCL), including NVIDIA GPUs, AMD GPUs, as well as CPU-based OpenCL implementations or things like Intel Integrated Graphics. This is the most general GPU version of KataGo and doesn't require a complicated install like CUDA does, so it is most likely to work out of the box as long as you have a fairly modern GPU. **However, it also needs to take some time to tune itself when run for the very first time.** For many systems, this will take 5-30 seconds, but on a few older/slower systems, it may take many minutes or longer. Also, the quality of OpenCL implementations is sometimes inconsistent, particularly for Intel Integrated Graphics and for AMD GPUs that are more than several years old, so it might not work on very old machines or on specific buggy newer AMD GPUs; see also [Issues with specific GPUs or GPU drivers](#issues-with-specific-gpus-or-gpu-drivers).
* CUDA is a GPU backend specific to NVIDIA GPUs (it will not work with AMD or Intel or any other GPUs) and requires installing [CUDA](https://developer.nvidia.com/cuda-zone) and [CUDNN](https://developer.nvidia.com/cudnn) and a modern NVIDIA GPU. On most GPUs, the OpenCL implementation will actually beat NVIDIA's own CUDA/CUDNN at performance. The exception is for top-end NVIDIA GPUs that support FP16 and tensor cores, in which case sometimes one is better and sometimes the other is better.
* TensorRT is similar to CUDA, but only uses NVIDIA's TensorRT framework to run the neural network with more optimized kernels. For modern NVIDIA GPUs, it should work whenever CUDA does and will usually be faster than CUDA or any other backend.
* ROCm is a GPU backend specific to AMD GPUs (it will not work with NVIDIA or Intel or any other GPUs) and requires installing [ROCm](https://rocm.docs.amd.com) and [MIOpen](https://rocm.docs.amd.com/projects/MIOpen) and a modern AMD GPU. On most GPUs, the OpenCL implementation will actually beat AMD's own ROCm/MIOpen at performance. The exception is for top-end AMD GPUs that support FP16 and stream processors, in which case sometimes one is better and sometimes the other is better.
* MIGraphX is an alternative GPU backend for AMD GPUs that uses AMD's MIGraphX framework instead of MIOpen. It may offer better performance than the ROCm backend on some GPUs, and requires ROCm 7.0 or later with MIGraphX installed.
* Eigen is a *CPU* backend that should work widely *without* needing a GPU or fancy drivers. Use this if you don't have a good GPU or really any GPU at all. It will be quite significantly slower than OpenCL or CUDA, but on a good CPU can still often get 10 to 20 playouts per second if using the smaller (15 or 20) block neural nets. Eigen can also be compiled with AVX2 and FMA support, which can provide a big performance boost for Intel and AMD CPUs from the last few years. However, it will not run at all on older CPUs (and possibly even some recent but low-power modern CPUs) that don't support these fancy vector instructions.

For **any** implementation, it's recommended that you also tune the number of threads used if you care about optimal performance, as it can make a factor of 2-3 difference in the speed. See "Tuning for Performance" below. However, if you mostly just want to get it working, then the default untuned settings should also be still reasonable.
87 changes: 86 additions & 1 deletion cpp/CMakeLists.txt
@@ -48,7 +48,7 @@ endif()
set(BUILD_DISTRIBUTED 0 CACHE BOOL "Build with http support for contributing to distributed training")
set(USE_BACKEND CACHE STRING "Neural net backend")
string(TOUPPER "${USE_BACKEND}" USE_BACKEND)
set_property(CACHE USE_BACKEND PROPERTY STRINGS "" CUDA TENSORRT OPENCL EIGEN ROCM)
set_property(CACHE USE_BACKEND PROPERTY STRINGS "" CUDA TENSORRT OPENCL EIGEN ROCM MIGRAPHX)

set(USE_TCMALLOC 0 CACHE BOOL "Use TCMalloc")
set(NO_GIT_REVISION 0 CACHE BOOL "Disable embedding the git revision into the compiled exe")
@@ -206,6 +206,62 @@ elseif(USE_BACKEND STREQUAL "ROCM")
# Optional: Enable model-size‑based autotuning and other macros
# add_compile_definitions(HIP_SUPPORTS_FP16)

# --------------------------- MIGRAPHX backend (AMD MIGraphX graph inference) ---------------------------
elseif(USE_BACKEND STREQUAL "MIGRAPHX")
message(STATUS "-DUSE_BACKEND=MIGRAPHX, using AMD MIGraphX backend.")

# Use standard C++ compiler with MIGraphX
set(CMAKE_CXX_STANDARD 17)

# Find MIGraphX manually (avoid CMake config which adds hipcc-specific flags)
# Note: MIGraphX headers are split between two locations:
# - /opt/rocm/lib/migraphx/include/migraphx/ (C++ API headers like program.hpp)
# - /opt/rocm/include/migraphx/ (export.h and other common headers)
find_path(MIGRAPHX_CXX_INCLUDE_DIR migraphx/program.hpp
HINTS /opt/rocm/lib/migraphx/include
PATH_SUFFIXES include)

find_path(MIGRAPHX_INCLUDE_DIR migraphx/export.h
HINTS /opt/rocm/include
PATH_SUFFIXES include)

find_library(MIGRAPHX_LIBRARY migraphx
HINTS /opt/rocm/lib/migraphx/lib /opt/rocm/lib
PATH_SUFFIXES lib lib64)

find_library(MIGRAPHX_GPU_LIBRARY migraphx_gpu
HINTS /opt/rocm/lib/migraphx/lib /opt/rocm/lib
PATH_SUFFIXES lib lib64)

if(NOT MIGRAPHX_CXX_INCLUDE_DIR)
message(FATAL_ERROR "MIGraphX C++ headers not found. Please install MIGraphX.")
endif()

if(NOT MIGRAPHX_LIBRARY)
message(FATAL_ERROR "MIGraphX library not found. Please install MIGraphX.")
endif()

message(STATUS "MIGraphX C++ include: ${MIGRAPHX_CXX_INCLUDE_DIR}")
message(STATUS "MIGraphX include: ${MIGRAPHX_INCLUDE_DIR}")
message(STATUS "MIGraphX library: ${MIGRAPHX_LIBRARY}")
if(MIGRAPHX_GPU_LIBRARY)
message(STATUS "MIGraphX GPU library: ${MIGRAPHX_GPU_LIBRARY}")
endif()

# Source files for MIGraphX backend
set(NEURALNET_BACKEND_SOURCES
neuralnet/migraphxbackend.cpp
)

# Include directories (both locations needed)
include_directories(SYSTEM ${MIGRAPHX_CXX_INCLUDE_DIR})
if(MIGRAPHX_INCLUDE_DIR)
include_directories(SYSTEM ${MIGRAPHX_INCLUDE_DIR})
endif()

# Add ROCm lib directory for linking
link_directories(/opt/rocm/lib)
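The manual `find_path`/`find_library` lookup above searches each HINTS directory, both bare and with each PATH_SUFFIXES entry appended, and takes the first hit. A rough shell sketch of that search order (a hypothetical `find_lib` helper; the `/opt/rocm` layout is an assumption about a default install):

```shell
# Mimics the find_library search order used above: for each hint directory,
# try it bare and then with each suffix, returning the first existing library.
find_lib() {
  name="$1"; shift
  for hint in "$@"; do
    for suffix in "" "lib" "lib64"; do
      candidate="${hint}${suffix:+/$suffix}/lib${name}.so"
      if [ -e "$candidate" ]; then
        echo "$candidate"
        return 0
      fi
    done
  done
  return 1
}

# On a machine with ROCm installed, this should locate libmigraphx.so:
find_lib migraphx /opt/rocm/lib/migraphx/lib /opt/rocm/lib || echo "not found"
```

On a system without ROCm the final call falls through to the not-found branch, mirroring the FATAL_ERROR path guarded above.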

elseif(USE_BACKEND STREQUAL "")
message(WARNING "${ColorBoldRed}WARNING: Using dummy neural net backend, intended for non-neural-net testing only, will fail on any code path requiring a neural net. To use a neural net, specify -DUSE_BACKEND=CUDA, -DUSE_BACKEND=TENSORRT, -DUSE_BACKEND=OPENCL, -DUSE_BACKEND=EIGEN, -DUSE_BACKEND=ROCM, or -DUSE_BACKEND=MIGRAPHX to compile with the respective backend.${ColorReset}")
set(NEURALNET_BACKEND_SOURCES neuralnet/dummybackend.cpp)
@@ -614,6 +670,35 @@ elseif(USE_BACKEND STREQUAL "EIGEN")
endif()
endif()
endif()
elseif(USE_BACKEND STREQUAL "MIGRAPHX")
target_compile_definitions(katago PRIVATE USE_MIGRAPHX_BACKEND)

# Link MIGraphX libraries
target_link_libraries(katago ${MIGRAPHX_LIBRARY})
if(MIGRAPHX_GPU_LIBRARY)
target_link_libraries(katago ${MIGRAPHX_GPU_LIBRARY})
endif()

# Link HIP runtime
find_library(AMDHIP64_LIBRARY amdhip64
HINTS /opt/rocm/lib
PATH_SUFFIXES lib lib64)
if(AMDHIP64_LIBRARY)
target_link_libraries(katago ${AMDHIP64_LIBRARY})
else()
target_link_libraries(katago amdhip64)
endif()

# Link other required libraries
find_library(HIPRTC_LIBRARY hiprtc
HINTS /opt/rocm/lib
PATH_SUFFIXES lib lib64)
if(HIPRTC_LIBRARY)
target_link_libraries(katago ${HIPRTC_LIBRARY})
endif()

# Add ROCm library directory (use target_link_directories so it applies to the
# already-created katago target; a plain link_directories() here would be too late)
target_link_directories(katago PRIVATE /opt/rocm/lib)
endif()

if(USE_BIGGER_BOARDS_EXPENSIVE)
24 changes: 24 additions & 0 deletions cpp/configs/analysis_example.cfg
@@ -258,6 +258,30 @@ nnRandomize = true
# ROCm does not support NHWC, so this is always false.


# MIGraphX GPU settings--------------------------------------
# These only apply when using the MIGraphX version of KataGo.

# IF USING ONE GPU: optionally uncomment and change this if the GPU you want to use turns out to be not device 0
# mgxDeviceToUse = 0

# IF USING TWO GPUS: Uncomment these two lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1

# IF USING THREE GPUS: Uncomment these three lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1
# mgxDeviceToUseThread2 = 2 # change this if the third GPU you want to use turns out to be not device 2

# You can probably guess the pattern if you have four, five, etc. GPUs.

# KataGo will automatically use FP16 or not based on the capabilities of your AMD GPU. If you
# want to force a particular behavior, you can uncomment this line and set it to "true" or
# "false". E.g. it's using FP16 but on your card that's giving an error, or it's not using
# FP16 but you think it should be.
# mgxUseFP16 = auto


# OpenCL-specific GPU settings--------------------------------------
# These only apply when using the OpenCL version of KataGo.

24 changes: 24 additions & 0 deletions cpp/configs/contribute_example.cfg
@@ -123,6 +123,30 @@ watchOngoingGameInFileName = watchgame.txt
# ROCm does not support NHWC, so this is always false.


# MIGraphX GPU settings--------------------------------------
# These only apply when using the MIGraphX version of KataGo.

# IF USING ONE GPU: optionally uncomment and change this if the GPU you want to use turns out to be not device 0
# mgxDeviceToUse = 0

# IF USING TWO GPUS: Uncomment these two lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1

# IF USING THREE GPUS: Uncomment these three lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1
# mgxDeviceToUseThread2 = 2 # change this if the third GPU you want to use turns out to be not device 2

# You can probably guess the pattern if you have four, five, etc. GPUs.

# KataGo will automatically use FP16 or not based on the capabilities of your AMD GPU. If you
# want to force a particular behavior, you can uncomment this line and set it to "true" or
# "false". E.g. it's using FP16 but on your card that's giving an error, or it's not using
# FP16 but you think it should be.
# mgxUseFP16 = auto


# OpenCL GPU settings--------------------------------------
# These only apply when using the OpenCL version of KataGo.

24 changes: 24 additions & 0 deletions cpp/configs/gtp_example.cfg
@@ -496,6 +496,30 @@ searchFactorWhenWinningThreshold = 0.95
# ROCm does not support NHWC, so this is always false.


# MIGraphX GPU settings--------------------------------------
# These only apply when using the MIGraphX version of KataGo.

# IF USING ONE GPU: optionally uncomment and change this if the GPU you want to use turns out to be not device 0
# mgxDeviceToUse = 0

# IF USING TWO GPUS: Uncomment these two lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1

# IF USING THREE GPUS: Uncomment these three lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1
# mgxDeviceToUseThread2 = 2 # change this if the third GPU you want to use turns out to be not device 2

# You can probably guess the pattern if you have four, five, etc. GPUs.

# KataGo will automatically use FP16 or not based on the capabilities of your AMD GPU. If you
# want to force a particular behavior, you can uncomment this line and set it to "true" or
# "false". E.g. it's using FP16 but on your card that's giving an error, or it's not using
# FP16 but you think it should be.
# mgxUseFP16 = auto


# ------------------------------
# OpenCL GPU settings
# ------------------------------
24 changes: 24 additions & 0 deletions cpp/configs/match_example.cfg
@@ -196,6 +196,30 @@ numNNServerThreadsPerModel = 1
# ROCm does not support NHWC, so this is always false.


# MIGraphX GPU settings--------------------------------------
# These only apply when using the MIGraphX version of KataGo.

# IF USING ONE GPU: optionally uncomment and change this if the GPU you want to use turns out to be not device 0
# mgxDeviceToUse = 0

# IF USING TWO GPUS: Uncomment these two lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1

# IF USING THREE GPUS: Uncomment these three lines (AND set numNNServerThreadsPerModel above):
# mgxDeviceToUseThread0 = 0 # change this if the first GPU you want to use turns out to be not device 0
# mgxDeviceToUseThread1 = 1 # change this if the second GPU you want to use turns out to be not device 1
# mgxDeviceToUseThread2 = 2 # change this if the third GPU you want to use turns out to be not device 2

# You can probably guess the pattern if you have four, five, etc. GPUs.

# KataGo will automatically use FP16 or not based on the capabilities of your AMD GPU. If you
# want to force a particular behavior, you can uncomment this line and set it to "true" or
# "false". E.g. it's using FP16 but on your card that's giving an error, or it's not using
# FP16 but you think it should be.
# mgxUseFP16 = auto


# OpenCL GPU settings--------------------------------------
# These only apply when using OpenCL as the backend for inference.
# (For GTP, we only ever have one model, when playing matches, we might have more than one, see match_example.cfg)
2 changes: 2 additions & 0 deletions cpp/main.cpp
@@ -253,6 +253,8 @@ string Version::getKataGoVersionFullInfo() {
#define STRINGIFY2(x) STRINGIFY(x)
out << "Compiled with HIP runtime version " << STRINGIFY2(HIP_TARGET_VERSION) << endl;
#endif
#elif defined(USE_MIGRAPHX_BACKEND)
out << "Using MIGraphX backend" << endl;
#elif defined(USE_EIGEN_BACKEND)
out << "Using Eigen(CPU) backend" << endl;
#else