Fix 8 HPC-sensitive bugs: GPU kernels, MPI broadcast, domain decomposition #1242

Open

sbryngelson wants to merge 20 commits into MFlowCode:master from sbryngelson:fix/hpc-bugfixes-batch

Conversation

@sbryngelson
Member

@sbryngelson sbryngelson commented Feb 22, 2026

User description

Summary

Batches 8 bug fixes that touch GPU parallel regions, MPI broadcasts, or compiler-sensitive code paths. These need HPC CI validation across NVHPC, CCE, and AMD compilers.

Not included: #1225 (3D viscous GPU private clauses) and #1181 (MUSCL-THINC right-state + GPU privates) remain as separate PRs due to higher risk — they add new private clause variables to GPU parallel loops, which is the most compiler-sensitive change possible.

GPU kernel fixes

Fix z-boundary backward FD stencil (#1173)

  • File: src/common/m_finite_differences.fpp
  • Bug: At the z-domain boundary, the backward FD stencil reads fields(2) (y-velocity) instead of fields(3) (z-velocity), corrupting divergence/gradient calculations for all 3D viscous, hyperelastic, and hypoelastic simulations.
  • Fix: fields(2)%sf(x, y, z-2) → fields(3)%sf(x, y, z-2)
  • Impact: Runs inside GPU_PARALLEL_LOOP in m_viscous.fpp, m_hypoelastic.fpp, m_hyperelastic.fpp, and m_derived_variables.fpp. Affects every 3D simulation using these modules.
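
The effect of the swapped field component can be sketched with a toy derivative calculation (illustrative Python, not the MFC Fortran; the coefficients are the standard second-order backward difference):

```python
import numpy as np

# Toy sketch of the bug pattern (not the MFC source): a second-order backward
# difference for d(vz)/dz at the upper z boundary whose third term mistakenly
# pulls from the y-velocity field instead of the z-velocity field.
def ddz_at_boundary(vy, vz, dz, buggy=False):
    third = vy if buggy else vz        # the bug swaps in the wrong field
    return (3*vz[-1] - 4*vz[-2] + third[-3]) / (2*dz)

vz = np.array([0.0, 1.0, 2.0, 3.0])    # linear profile, exact d(vz)/dz = 1
vy = np.array([5.0, 5.0, 5.0, 5.0])
print(ddz_at_boundary(vy, vz, 1.0))              # 1.0 (correct)
print(ddz_at_boundary(vy, vz, 1.0, buggy=True))  # 3.0 (silently wrong)
```

Because the result stays finite and plausible whenever vy and vz are similar in magnitude, the corruption produces no crash, only quietly wrong 3D divergence values.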

Fix GRCBC subsonic inflow using wrong L index (#1224)

  • File: src/simulation/m_cbc.fpp
  • Bug: In the GRCBC characteristic wave amplitude loop do i = 2, momxb, the code writes L(2) = ... instead of L(i) = .... Only the last species' wave amplitude survives; all others are overwritten.
  • Fix: L(2) → L(i)
  • Impact: Inside a GPU_PARALLEL_LOOP (collapse=2 CBC loop with L as private). Affects multi-species GRCBC subsonic inflow boundary conditions.
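
The failure mode of writing to a fixed index inside a loop can be sketched as follows (illustrative Python, not the MFC source; the amplitude values are made up):

```python
# Toy sketch of the indexing bug: assigning to a fixed slot inside the loop
# means only the last iteration's value survives, while every other slot
# keeps its stale content.
def wave_amplitudes(amps, momxb, buggy=False):
    L = [0.0] * (momxb + 1)           # 1-based-style scratch; index 0 unused
    for i in range(2, momxb + 1):     # mirrors Fortran's `do i = 2, momxb`
        if buggy:
            L[2] = amps[i]            # bug: every species overwrites L(2)
        else:
            L[i] = amps[i]            # fix: each species keeps its own slot
    return L[2:]

amps = {2: 0.1, 3: 0.2, 4: 0.3}
print(wave_amplitudes(amps, 4))              # [0.1, 0.2, 0.3]
print(wave_amplitudes(amps, 4, buggy=True))  # [0.3, 0.0, 0.0]
```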

Fix G_K exponential degradation in damage model (#1227)

  • File: src/common/m_variables_conversion.fpp
  • Bug: G_K = G_K * max((1-damage), 0) is inside the do i = strxb, strxe stress component loop, applying the damage factor N times (exponential (1-damage)^N) instead of once.
  • Fix: Move the damage line before the loop in both s_convert_conservative_to_primitive and s_convert_primitive_to_conservative.
  • Impact: Inside GPU_PARALLEL_LOOP(collapse=3) with G_K as private. Produces exponentially wrong stiffness degradation for multi-component stress tensors.
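
The exponential compounding can be sketched numerically (illustrative Python, not the MFC source; the modulus and damage values are arbitrary):

```python
# Toy sketch of the compounding bug: applying a multiplicative damage factor
# inside the stress-component loop scales the shear modulus by
# (1 - damage)**N instead of (1 - damage).
def degraded_modulus(G_K, damage, n_components, buggy=False):
    factor = max(1.0 - damage, 0.0)
    if not buggy:
        G_K *= factor                 # fix: apply the factor once per cell
    for _ in range(n_components):
        if buggy:
            G_K *= factor             # bug: re-applied for every component
        # ... per-component stress update would use G_K here ...
    return G_K

print(degraded_modulus(10.0, 0.5, 6))              # 5.0
print(degraded_modulus(10.0, 0.5, 6, buggy=True))  # 0.15625 = 10 * 0.5**6
```

With six stress components and 50% damage, the buggy path degrades the modulus by a factor of 64 instead of 2, which is why the error grows with tensor size rather than staying a fixed bias.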

MPI broadcast fixes

Fix bc_x%ve3 never MPI-broadcast (#1175)

  • File: src/simulation/m_mpi_proxy.fpp
  • Bug: A fypp broadcast list contains bc_x%ve2 twice and omits bc_x%ve3 (the z-velocity component) entirely. All non-root ranks receive uninitialized garbage, which is then pushed to GPU via GPU_UPDATE.
  • Fix: Change the duplicate 'bc_x%ve2' entry to 'bc_x%ve3' in the fypp broadcast list.
  • Impact: Every multi-rank 3D simulation using velocity boundary conditions on x-boundaries reads garbage ve3 on non-root ranks.
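
This class of copy-paste error is easy to catch mechanically; a sketch of such a check (illustrative Python — the vb1..vb3/ve1 entries are hypothetical names, only ve2/ve3 are confirmed by this PR):

```python
# Toy duplicate/missing-entry check for a broadcast name list. A duplicated
# entry silently shadows the variable that should have been sent.
from collections import Counter

buggy_list = ['bc_x%vb1', 'bc_x%vb2', 'bc_x%vb3',
              'bc_x%ve1', 'bc_x%ve2', 'bc_x%ve2']   # ve3 missing
expected  = {'bc_x%vb1', 'bc_x%vb2', 'bc_x%vb3',
             'bc_x%ve1', 'bc_x%ve2', 'bc_x%ve3'}

dupes   = [name for name, n in Counter(buggy_list).items() if n > 1]
missing = sorted(expected - set(buggy_list))
print(dupes, missing)   # ['bc_x%ve2'] ['bc_x%ve3']
```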

Fix fluid_rho broadcast using MPI_LOGICAL instead of mpi_p (#1176)

  • File: src/pre_process/m_mpi_proxy.fpp
  • Bug: MPI_BCAST(fluid_rho(1), num_fluids_max, MPI_LOGICAL, ...) broadcasts a real(wp) array with the wrong MPI datatype. On 64-bit wp, MPI_LOGICAL (typically 4 bytes) only transfers half the bytes, silently corrupting density values on non-root ranks.
  • Fix: MPI_LOGICAL → mpi_p
  • Impact: MPI-implementation-dependent. May appear to work with some stacks (if byte layout aligns) and produce garbage with others. Affects pre-process perturbation IC density initialization.
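
The byte-count mismatch can be sketched without MPI at all (illustrative Python; it assumes a 4-byte MPI_LOGICAL and 8-byte real(wp), which are common but implementation-dependent widths):

```python
# Toy sketch of the datatype-size mismatch: broadcasting N 8-byte reals with
# a 4-byte element type moves only half the payload, leaving the tail of the
# receive buffer untouched (stale/garbage on a real non-root rank).
import struct

num_fluids_max = 4
payload = [1.0, 2.5, 0.125, 9.0]           # real(wp) array held by root
wire = struct.pack('<4d', *payload)        # 32 bytes on the wire

bytes_sent = num_fluids_max * 4            # count * sizeof(MPI_LOGICAL)
recv = bytearray(32)                       # non-root buffer (zeros stand in
recv[:bytes_sent] = wire[:bytes_sent]      #  for stale memory); 16/32 move

got = struct.unpack('<4d', bytes(recv))
print(got)   # (1.0, 2.5, 0.0, 0.0) — last two values never arrive
```

Depending on how the MPI stack packs mixed widths, the corruption can also land mid-value and reinterpret bits rather than truncate cleanly, which is why this bug is stack-dependent.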

MPI / compiler-sensitive fixes

Fix loc_violations used uninitialized in MPI_Allreduce (#1186)

  • File: src/pre_process/m_data_output.fpp
  • Bug: loc_violations is passed to s_mpi_allreduce_sum without initialization. Whether this matters depends on the compiler: GCC may zero-initialize stack variables in debug mode, NVHPC and CCE may not.
  • Fix: Initialize to 0 and restructure as a local variable (not module-level, avoiding implicit SAVE).
  • Impact: Pre-process grid validation. Compiler-dependent behavior makes HPC testing essential.

Fix domain decomposition overwriting muscl_order (#1229)

  • File: src/common/m_mpi_common.fpp
  • Bug: An else branch in s_mpi_decompose_computational_domain unconditionally sets recon_order = weno_order for all non-IGR cases, even when recon_type == MUSCL_TYPE where muscl_order was already correctly assigned.
  • Fix: Remove the else branch (2 lines deleted).
  • Impact: Wrong reconstruction order causes incorrect ghost-cell counts in domain decomposition for MUSCL cases. Needs multi-rank MPI validation.
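
The control-flow bug can be sketched as follows (illustrative Python, not the MFC source; the MUSCL_TYPE constant and the IGR stand-in value are assumptions for the sketch):

```python
# Toy sketch of the else-branch bug: an unconditional reset of recon_order
# clobbers the MUSCL value assigned just above it.
MUSCL_TYPE = 2

def pick_recon_order(recon_type, weno_order, muscl_order, igr, buggy=False):
    recon_order = muscl_order if recon_type == MUSCL_TYPE else weno_order
    if igr:
        recon_order = 1                    # stand-in for the IGR setting
    elif buggy:
        recon_order = weno_order           # bug: overwrites muscl_order
    return recon_order

print(pick_recon_order(MUSCL_TYPE, 5, 2, igr=False))              # 2
print(pick_recon_order(MUSCL_TYPE, 5, 2, igr=False, buggy=True))  # 5
```

The fix mirrors the sketch: delete the else branch so the value chosen by the recon_type check survives whenever IGR is inactive.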

Fix NaN check reading wrong index in MPI unpack (#1231)

  • File: src/common/m_mpi_common.fpp
  • Bug: ieee_is_nan(q_comm(i)%sf(j, k, l)) should be q_comm(i)%sf(j + unpack_offset, k, l) — the check reads a stale element instead of the just-unpacked one.
  • Fix: Add unpack_offset to the appropriate index in all 3 directions (x, y, z).
  • Impact: Only compiles under #if defined(__INTEL_COMPILER). Dead code on NVHPC/CCE/GCC, but the diagnostic would miss real NaNs in ghost cells on Intel builds.
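
The stale-index read can be sketched in a toy unpack loop (illustrative Python, not the MFC source; buffer sizes and the offset value are arbitrary):

```python
import math

# Toy sketch of the offset bug: values are unpacked into
# sf[j + unpack_offset], so a NaN check on sf[j] inspects a stale interior
# cell and misses NaNs that just arrived in the ghost region.
def unpack_and_check(sf, recv_buf, unpack_offset, buggy=False):
    bad = []
    for j, v in enumerate(recv_buf):
        sf[j + unpack_offset] = v                  # unpack into ghost cells
        idx = j if buggy else j + unpack_offset    # bug: checks wrong cell
        if math.isnan(sf[idx]):
            bad.append(idx)
    return bad

sf = [0.0] * 6                                     # interior cells are fine
recv = [1.0, float('nan'), 3.0]                    # NaN arrives from neighbor
print(unpack_and_check(sf[:], recv, unpack_offset=3))              # [4]
print(unpack_and_check(sf[:], recv, unpack_offset=3, buggy=True))  # []
```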

Supersedes

Closes #1173, closes #1175, closes #1176, closes #1186, closes #1224, closes #1227, closes #1229, closes #1231

Test plan

  • All GitHub CI checks pass
  • HPC (NVHPC): 3D viscous case, GRCBC multi-species case, damage model case
  • HPC (CCE/AMD): Multi-rank 3D with velocity BCs (validates bc_x%ve3 broadcast)
  • HPC (any): MUSCL multi-rank case (validates domain decomposition fix)
  • Verify no regressions in existing test suite

🤖 Generated with Claude Code


CodeAnt-AI Description

Fix multiple HPC bugs that caused incorrect 3D results, wrong boundary amplitudes, and MPI data corruption

What Changed

  • Corrected a z-boundary finite-difference term so the z-velocity component is used, fixing corrupted divergence/gradient in 3D viscous and elastic simulations
  • Restored proper multi-species GRCBC behavior by writing each species' wave amplitude into its loop index so all species' inflow amplitudes are preserved
  • Fixed MPI broadcasts: bc_x velocity end list now includes the third velocity component, and fluid_rho is broadcast with the correct numeric MPI type to avoid silent bit-corruption on non-root ranks
  • Initialized the local downsample violation counter to zero before reduction so global downsample warnings are accurate and do not sum uninitialized garbage
  • Ensured damage-related shear/elastic moduli are scaled once per cell (moved conditional to avoid repeated multiplicative shrinking inside loops), preventing unintended repeated degradation of elastic terms
  • Tightened NaN checks when unpacking MPI receive buffers so checks examine the correct array locations after offset adjustments

Impact

✅ Correct divergence/gradient in 3D viscous and elastic simulations
✅ All species preserved for subsonic inflow boundary amplitudes
✅ Fewer MPI data corruption cases and accurate cross-rank parameter broadcast


Summary by CodeRabbit

Bug Fixes

  • Corrected field indexing in divergence calculations for improved numerical accuracy.
  • Fixed array indexing validation in parallel communication checks.
  • Corrected material property calculations in elastic simulations.
  • Fixed parallel data transmission for fluid properties.
  • Improved boundary condition variable broadcasting for subsonic inflow handling.

sbryngelson and others added 9 commits February 22, 2026 16:01
The z-direction upper-boundary backward difference at iz_s%end uses
fields(2) (y-component) instead of fields(3) (z-component) in the
third term, corrupting the divergence in all 3D simulations using
this finite difference routine.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The bc_x velocity-end broadcast list has bc_x%ve2 duplicated where
bc_x%ve3 should be. bc_y and bc_z rows are correct. Non-root ranks
get uninitialized bc_x%ve3 in multi-rank 3D runs with x-boundary
velocity BCs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fluid_rho is a real(wp) array but is broadcast with MPI_LOGICAL type,
silently corrupting reference densities via bit reinterpretation on
non-root ranks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loc_violations is never set to 0 before the conditional that may or
may not assign it. Non-violating ranks sum garbage in the reduction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Initializing a local variable in its declaration gives it the SAVE
attribute in Fortran, meaning it would not reset to zero on subsequent
calls. Move the initialization to an executable assignment so the
variable is properly zeroed each time the subroutine is entered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In the GRCBC subsonic inflow loop (do i = 2, momxb), L(2) was
hardcoded instead of L(i), causing only the second wave amplitude
to be updated rather than each wave amplitude in the loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The damage factor was applied inside the stress component loop, causing
G_K (and G) to be multiplied by the damage factor on every iteration.
With N stress components, the effective shear modulus was reduced by
damage^N instead of damage^1. Move the damage application before the
loop so it is applied exactly once per cell.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The IGR conditional block unconditionally reset recon_order to
weno_order in its else branch, overwriting the muscl_order that
was correctly set by the recon_type check above. Remove the else
branch so the original recon_order is preserved when IGR is inactive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The NaN diagnostic check used q_comm(i)%sf(j, k, l) but the value
was unpacked into q_comm(i)%sf(j + unpack_offset, k, l). This meant
the check was reading a stale or unrelated cell instead of the just-
received value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 22, 2026 21:02
@codeant-ai
Contributor

codeant-ai bot commented Feb 22, 2026

CodeAnt AI is reviewing your PR.



@coderabbitai
Contributor

coderabbitai bot commented Feb 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e36eb61 and a90a08f.

📒 Files selected for processing (7)
  • src/common/m_finite_differences.fpp
  • src/common/m_mpi_common.fpp
  • src/common/m_variables_conversion.fpp
  • src/pre_process/m_data_output.fpp
  • src/pre_process/m_mpi_proxy.fpp
  • src/simulation/m_cbc.fpp
  • src/simulation/m_mpi_proxy.fpp

📝 Walkthrough

Walkthrough

This PR addresses seven critical and high-severity bugs across core computational, MPI communication, and boundary condition modules. Changes include correcting a finite-difference stencil field index, fixing MPI broadcast type and variable duplication, initializing uninitialized variables, removing erroneous conditional branches, and adjusting loop indexing patterns.

Changes

Cohort / File(s) Summary
Finite Difference Stencil
src/common/m_finite_differences.fpp
Corrected z-direction boundary backward FD stencil to use fields(3) instead of fields(2) for the third term at the upper z boundary (iz_s%end).
MPI Communication Fixes
src/common/m_mpi_common.fpp, src/pre_process/m_mpi_proxy.fpp, src/simulation/m_mpi_proxy.fpp
Fixed NaN check indexing in MPI unpack to use unpack_offset, removed conditional override of MUSCL reconstruction order, changed fluid_rho broadcast type from MPI_LOGICAL to mpi_p, and corrected bc_x%ve3 broadcast by removing duplicate bc_x%ve2.
Damage Model Calculation
src/common/m_variables_conversion.fpp
Moved shear modulus damage application (G_K, G) from inside the stress component loop to a pre-loop single assignment to prevent exponential compounding of damage across iterations.
Boundary Condition Handling
src/simulation/m_cbc.fpp
Fixed GRCBC subsonic inflow to use loop variable L(i) instead of fixed index L(2) when updating wave amplitudes in the momentum index range.
Variable Initialization
src/pre_process/m_data_output.fpp
Added explicit initialization of loc_violations to 0 at the start of data writing paths to ensure correct behavior in MPI_Allreduce operations.

Possibly related issues

  • #1173: Fix z-boundary backward FD stencil using fields(2) instead of fields(3) — The change in src/common/m_finite_differences.fpp directly addresses this critical bug affecting all 3D divergence computations.
  • #1175: Fix bc_x%ve3 never MPI-broadcast (duplicate ve2 instead) — The correction in src/simulation/m_mpi_proxy.fpp fixes the missing broadcast of boundary condition variable bc_x%ve3.
  • #1176: Fix fluid_rho broadcast using MPI_LOGICAL instead of mpi_p — The fix in src/pre_process/m_mpi_proxy.fpp corrects the MPI datatype mismatch that silently corrupted reference densities.
  • #1186: Fix loc_violations used uninitialized in MPI_Allreduce — The initialization added in src/pre_process/m_data_output.fpp prevents uninitialized values in MPI reductions.
  • #1224: Fix GRCBC subsonic inflow using wrong L index — The loop variable fix in src/simulation/m_cbc.fpp ensures all wave amplitudes are correctly updated.
  • #1227: Fix G_K exponential degradation in damage model — The pre-loop damage application in src/common/m_variables_conversion.fpp prevents the exponential compounding bug.
  • #1229: Fix domain decomposition overwriting muscl_order — The removal of the unconditional else branch in src/common/m_mpi_common.fpp preserves MUSCL reconstruction order.
  • #1231: Fix NaN check reading wrong index in MPI unpack — The offset indexing correction in src/common/m_mpi_common.fpp ensures diagnostic checks inspect correct array cells.

Suggested labels

bug, critical, core-physics, mpi, boundary-conditions

Suggested reviewers

  • wilfonba

Poem

🐰 A rabbit's ode to the bug-fix brigade:

Seven bugs squashed, their indices true,
MPI broadcasts now know what to do,
Damage no longer compounds with each turn,
Divergence stencils compute what they yearn! 🌻

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'Fix 8 HPC-sensitive bugs: GPU kernels, MPI broadcast, domain decomposition' accurately and concisely summarizes the main changes—batching eight critical bug fixes affecting GPU kernels, MPI communication, and domain decomposition logic across multiple files.
Description check ✅ Passed The PR description is comprehensive and well-structured, including a clear summary of all eight bugs, their locations, impacts, and test plans. It follows the template structure with 'Type of change' (Bug fix), 'Testing' section, and GPU changes checklist.
Linked Issues check ✅ Passed All eight linked issues (#1173, #1175, #1176, #1186, #1224, #1227, #1229, #1231) have corresponding code changes in the PR that address their stated objectives: z-boundary FD stencil fix, bc_x%ve3 broadcast, fluid_rho MPI type, loc_violations initialization, GRCBC L indexing, G_K damage application, domain decomposition recon_order, and NaN check index correction.
Out of Scope Changes check ✅ Passed All changes in the PR are directly related to the eight linked issues and their stated objectives. No unrelated modifications to other files or features are present; the six modified files correspond exactly to the bug fixes outlined in the PR description.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@codeant-ai codeant-ai bot added the size:S This PR changes 10-29 lines, ignoring generated files label Feb 22, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 7 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

@codeant-ai

codeant-ai bot commented Feb 22, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Potential OOB (L array)
    The new GRCBC inflow loop assigns to L(i) for i=2..momxb and later accesses L at indices like momxb+1, momxb+2, and advxe. L is declared conditionally as dimension(sys_size) or dimension(20) (for AMD). Verify that in all build/config combinations L has sufficient length (>= max(advxe, momxb+2)) to avoid out-of-bounds writes/reads when the GRCBC branch is enabled. Also ensure GPU/private usage of L matches the declared size (private arrays in GPU regions must be properly sized).

  • MPI count mismatch
    The broadcast of fluid_rho uses num_fluids_max as the count. If the actual number of fluids is num_fluids (broadcast earlier), broadcasting the full max length may send or expect uninitialised padding. Verify intended behaviour and consider using num_fluids to limit the count to active entries.

  • Boundary stencil safety
    The z-boundary backward finite-difference stencil was corrected to reference fields(3) (z-velocity). Verify that the index z-2 is always valid for the code path that runs this branch (no out-of-bounds reads). Confirm the boundary indexes and halo/padded ranges used by the GPU kernel guarantee z-2 is in-bounds in all configurations (including small local subdomains).

  • MPI broadcast list consistency
    The MPI broadcast list was corrected to include distinct bc_x%vb*/bc_x%ve* members (previously there was a duplicate). Confirm that the newly added members exist on the derived types, that their Fortran types match the MPI datatype mpi_p used in the call, and that the ordering/number of MPI_BCAST calls matches the receiving side expectations. A mismatch in types/counts or a missing member can lead to silent data corruption across ranks.

  • MPI datatype correctness
    The call uses mpi_p as the MPI datatype token for fluid_rho. Confirm mpi_p matches the Fortran kind real(wp) (e.g. MPI_DOUBLE_PRECISION or an appropriate alias). A mismatch between mpi_p and the variable kind can lead to incorrect data transfer across MPI implementations or compilers.

@codeant-ai

codeant-ai bot commented Feb 22, 2026

CodeAnt AI finished reviewing your PR.


Copilot AI left a comment


Pull request overview

This PR batches several correctness fixes in GPU-executed kernels and MPI communication paths (broadcasts + halo exchange diagnostics), plus a domain-decomposition bug affecting reconstruction order—aimed at eliminating silent corruption and compiler/MPI-stack-dependent behavior in HPC runs.

Changes:

  • Fix multiple GPU-path physics/kernel correctness issues (FD stencil component, GRCBC wave amplitudes indexing, damage-model shear modulus degradation).
  • Fix MPI communication correctness (missing broadcast field, wrong MPI datatype, Intel-only NaN check indexing) and a domain decomposition reconstruction-order overwrite.
  • Fix compiler-sensitive undefined behavior in pre_process by initializing a reduction input.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/common/m_finite_differences.fpp Corrects z-boundary backward FD stencil to use fields(3) (z-velocity) consistently.
src/simulation/m_cbc.fpp Fixes GRCBC subsonic inflow loop to write L(i) instead of overwriting L(2).
src/common/m_variables_conversion.fpp Applies continuous damage factor to G_K/G once (outside the stress-component loop) to avoid exponential compounding.
src/simulation/m_mpi_proxy.fpp Fixes user-input broadcast list to include bc_x%ve3 (was duplicating bc_x%ve2).
src/pre_process/m_mpi_proxy.fpp Broadcasts fluid_rho with the correct MPI real datatype (mpi_p) instead of MPI_LOGICAL.
src/pre_process/m_data_output.fpp Initializes loc_violations before MPI reduction to avoid undefined values on some compilers.
src/common/m_mpi_common.fpp Fixes Intel-only NaN check to read the just-unpacked element and prevents non-IGR runs from clobbering MUSCL recon order.

@codecov

codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.05%. Comparing base (df28255) to head (a90a08f).
⚠️ Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
src/common/m_variables_conversion.fpp 0.00% 1 Missing and 1 partial ⚠️
src/common/m_finite_differences.fpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1242      +/-   ##
==========================================
- Coverage   44.05%   44.05%   -0.01%     
==========================================
  Files          70       70              
  Lines       20496    20496              
  Branches     1989     1991       +2     
==========================================
- Hits         9030     9029       -1     
  Misses      10328    10328              
- Partials     1138     1139       +1     

☔ View full report in Codecov by Sentry.

sbryngelson and others added 3 commits February 23, 2026 09:48
sbryngelson and others added 6 commits February 23, 2026 09:48
@sbryngelson sbryngelson force-pushed the fix/hpc-bugfixes-batch branch from 3adf7d6 to 239e520 Compare February 23, 2026 14:48
@codeant-ai

codeant-ai bot commented Feb 23, 2026

CodeAnt AI is running Incremental review



@codeant-ai codeant-ai bot added size:S This PR changes 10-29 lines, ignoring generated files and removed size:S This PR changes 10-29 lines, ignoring generated files labels Feb 23, 2026
@codeant-ai

codeant-ai bot commented Feb 23, 2026

CodeAnt AI Incremental review completed.

sbryngelson and others added 2 commits February 23, 2026 10:26
The Intel-compiler NaN check's error message logged the raw loop indices
(j, k, l) instead of the actual array indices used (j + unpack_offset,
etc.), making the diagnostic misleading when debugging MPI receive errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Claude Code Review

7 files changed:

  • src/common/m_finite_differences.fpp
  • src/common/m_mpi_common.fpp
  • src/common/m_variables_conversion.fpp
  • src/pre_process/m_data_output.fpp
  • src/pre_process/m_mpi_proxy.fpp
  • src/simulation/m_cbc.fpp
  • src/simulation/m_mpi_proxy.fpp

Summary

  • z-boundary FD stencil fix (m_finite_differences.fpp): Corrects fields(2) → fields(3) in the backward-difference z-boundary stencil, ensuring the z-velocity component is used instead of y-velocity in divergence/gradient calculations for all 3D viscous, hyperelastic, and hypoelastic simulations.
  • GRCBC loop-index fix (m_cbc.fpp): Changes L(2) → L(i) inside do i = 2, momxb, so each species' characteristic wave amplitude is written to its own index rather than all overwriting index 2.
  • Damage model G_K/G fix (m_variables_conversion.fpp): Moves the cont_damage degradation factor outside the do i = strxb, strxe stress loop so it is applied once per cell rather than multiplicatively on every stress component iteration.
  • MPI broadcast fixes (m_mpi_proxy.fpp ×2, m_mpi_common.fpp): Three separate MPI fixes — bc_x%ve3 was never broadcast (duplicate ve2 entry), fluid_rho was broadcast with MPI_LOGICAL instead of mpi_p (real datatype), and the recon_order = weno_order else branch that silently overwrote correct MUSCL assignments is removed.
  • Uninitialized variable fix (m_data_output.fpp): Initializes loc_violations = 0._wp before the down_sample conditional block, preventing uninitialized garbage from propagating into MPI_Allreduce.
  • NaN diagnostic index fix (m_mpi_common.fpp): Adds unpack_offset to the correct index dimension in all three directional unpack NaN checks so the check examines the freshly unpacked element rather than a stale one.

Findings

No bugs found. Checked for bugs and CLAUDE.md compliance (no CLAUDE.md files exist in this repository).

Improvement opportunities:

  1. src/pre_process/m_mpi_proxy.fpp — The fluid_rho broadcast now correctly uses mpi_p, but the count argument is still num_fluids_max. If only num_fluids entries are active, this broadcasts uninitialized padding to all non-root ranks. Changing the count to num_fluids (broadcast earlier in the same function) would tighten the contract and avoid any reliance on zero-initialization of the max-length array.

  2. src/simulation/m_mpi_proxy.fpp — The fix targets the bc_x broadcast list. It would be worth auditing the analogous bc_y and bc_z broadcast lists in the same fypp loop for similar duplicate/missing entries, as copy-paste errors in these lists tend to co-occur.

  3. src/common/m_mpi_common.fpp — The improved NaN diagnostics (ieee_is_nan with unpack_offset) are gated behind #if defined(__INTEL_COMPILER). Extending coverage to other compilers (NVHPC, CCE, GCC) would make these checks useful on the HPC targets where this PR's other fixes are most critical.


Labels

size:S This PR changes 10-29 lines, ignoring generated files
