Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .claude/rules/common-pitfalls.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,10 @@
- Boundary condition symmetry requirements must be maintained

## Compiler-Specific Issues
- Code must compile on gfortran, nvfortran, Cray ftn, and Intel ifx
- CI-gated compilers (must always pass): gfortran, nvfortran, Cray ftn, and Intel ifx
- AMD flang is additionally supported for `--gpu mp` builds but not in the CI matrix
- Each compiler has different strictness levels and warning behavior
- Fypp macros must expand correctly for both GPU and CPU builds
- GPU builds only work with nvfortran, Cray ftn, and AMD flang

## Test System
- Tests are generated **programmatically** in `toolchain/mfc/test/cases.py`, not standalone files
Expand Down
41 changes: 26 additions & 15 deletions .claude/rules/gpu-and-mpi.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,20 +38,27 @@ Inline macros (use `$:` prefix):
- `$:GPU_WAIT()` — Synchronization barrier.

Block macros (use `#:call`/`#:endcall`):
- `GPU_PARALLEL(...)` — GPU parallel region wrapping a code block.
- `GPU_PARALLEL(...)` — GPU parallel region (used for scalar reductions like `maxval`/`minval`).
- `GPU_DATA(copy=..., create=..., ...)` — Scoped data region.
- `GPU_HOST_DATA(use_device_addr=[...])` — Host code with device pointers.

Block macro usage:
Typical GPU loop pattern (used 750+ times in the codebase):
```
#:call GPU_PARALLEL(copyin='[var1]', copyout='[var2]')
$:GPU_LOOP(collapse=N)
do k = 0, n; do j = 0, m
! loop body
end do; end do
#:endcall GPU_PARALLEL
$:GPU_PARALLEL_LOOP(private='[i,j,k,l]', collapse=3)
do l = idwbuff(3)%beg, idwbuff(3)%end
do k = idwbuff(2)%beg, idwbuff(2)%end
do j = idwbuff(1)%beg, idwbuff(1)%end
! loop body
end do
end do
end do
$:END_GPU_PARALLEL_LOOP()
```

WARNING: Do NOT use `GPU_PARALLEL` wrapping `GPU_LOOP` for spatial loops. `GPU_LOOP`
emits empty directives on Cray and AMD compilers, causing silent serial execution.
Use `GPU_PARALLEL_LOOP` / `END_GPU_PARALLEL_LOOP` for all parallel spatial loops.

NEVER write raw `!$acc` or `!$omp` directives. Always use `GPU_*` Fypp macros.
The precheck source lint will catch raw directives and fail.

Expand All @@ -67,13 +74,17 @@ The precheck source lint will catch raw directives and fail.
- These compile only for Cray (`_CRAYFTN`); other compilers skip them

### Compiler-Backend Matrix
| Compiler | `--gpu acc` (OpenACC) | `--gpu mp` (OpenMP) | CPU-only |
|-----------------|----------------------|---------------------|----------|
| GNU gfortran | No | No | Yes |
| NVIDIA nvfortran| Yes (primary) | Yes | Yes |
| Cray ftn (CCE) | Yes | Yes (primary) | Yes |
| Intel ifx | No | No | Yes |
| AMD flang | No | Yes | Yes |

CI-gated compilers (must always pass): gfortran, nvfortran, Cray ftn, Intel ifx.
AMD flang is additionally supported for GPU builds but not in the CI matrix.
Comment on lines +78 to +79
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor imprecision in AMD flang scope description.

Line 79 says AMD flang is additionally supported for GPU builds, but AMD flang is No for --gpu acc (see the table immediately below). The sibling files are more precise: common-pitfalls.md says `--gpu mp` builds and CLAUDE.md says OpenMP target offload GPU builds. A reader skimming the prose without the table could infer broader GPU coverage than is actually supported.

✏️ Suggested wording fix
-AMD flang is additionally supported for GPU builds but not in the CI matrix.
+AMD flang is additionally supported for OpenMP target offload (`--gpu mp`) builds but not in the CI matrix.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CI-gated compilers (must always pass): gfortran, nvfortran, Cray ftn, Intel ifx.
AMD flang is additionally supported for GPU builds but not in the CI matrix.
CI-gated compilers (must always pass): gfortran, nvfortran, Cray ftn, Intel ifx.
AMD flang is additionally supported for OpenMP target offload (`--gpu mp`) builds but not in the CI matrix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/rules/gpu-and-mpi.md around lines 78 - 79, The sentence "AMD flang
is additionally supported for GPU builds" is imprecise; update the prose in
`.claude/rules/gpu-and-mpi.md` to explicitly limit AMD flang support to OpenMP
target/offload GPU builds (i.e., `--gpu mp` / OpenMP target offload), matching
the table and the sibling docs (`common-pitfalls.md`, `CLAUDE.md`); replace the
broad phrase with wording such as "AMD flang is supported only for OpenMP target
(--gpu mp) GPU/offload builds" so the text and table are consistent.


| Compiler | `--gpu acc` (OpenACC) | `--gpu mp` (OpenMP) | CPU-only |
|-----------------|----------------------|------------------------|----------|
| GNU gfortran | No | Experimental (AMD GCN) | Yes |
| NVIDIA nvfortran| Yes (primary) | Yes | Yes |
| Cray ftn (CCE) | Yes | Yes (primary) | Yes |
| Intel ifx | No | Experimental (SPIR64) | Yes |
| AMD flang | No | Yes | Yes |

## Preprocessor Defines (`#ifdef` / `#ifndef`)

Expand Down
5 changes: 3 additions & 2 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@
MFC is an exascale multi-physics CFD solver written in modern Fortran 2008+ with Fypp
preprocessing. It has three executables (pre_process, simulation, post_process), a Python
toolchain for building/running/testing, and supports GPU acceleration via OpenACC and
OpenMP target offload. It must compile with gfortran, nvfortran, Cray ftn, and Intel ifx.
OpenMP target offload. It must compile with gfortran, nvfortran, Cray ftn, and Intel ifx (CI-gated).
AMD flang is additionally supported for OpenMP target offload GPU builds.

## Commands

Expand Down Expand Up @@ -167,4 +168,4 @@ When reviewing PRs, prioritize in this order:
4. MPI correctness (halo exchange, buffer sizing, GPU_UPDATE calls)
5. GPU code (GPU_* Fypp macros only, no raw pragmas)
6. Physics consistency (pressure formula matches model_eqns)
7. Compiler portability (all four compilers)
7. Compiler portability (4 CI-gated compilers + AMD flang for GPU)