Fix missing private clauses in 3D viscous GPU loops#1225

Open
sbryngelson wants to merge 3 commits into MFlowCode:master from sbryngelson:fix/viscous-3d-gpu-private

Conversation


@sbryngelson sbryngelson commented Feb 21, 2026

User description

Summary

  • The 3D shear stress and bulk stress GPU_PARALLEL_LOOP directives were missing rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their private clauses
  • The corresponding 2D loops already included these variables
  • Without privatization, these variables can cause race conditions when running on GPU

Test plan

  • Verify 3D viscous test cases produce correct results on GPU
  • No golden file changes expected (fixes GPU race condition, CPU results unchanged)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance
    • Improved GPU parallelization for viscous stress calculations: results are now consistent across threads in multi-dimensional runs. No changes to public interfaces or CPU behavior; this is an internal fix that removes a class of GPU race conditions.

CodeAnt-AI Description

Fix GPU race conditions in 3D viscous computations by privatizing temporaries

What Changed

  • The 3D shear-stress and bulk-stress GPU loops now treat loop indices and temporary variables (including q, i, j, k, l, rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum and related temporaries) as private to each GPU thread
  • This prevents threads from sharing those temporaries during GPU execution, eliminating data races that could corrupt 3D viscous results on GPU
  • CPU execution and outputs remain unchanged

Impact

✅ Correct 3D viscous results on GPU
✅ Fewer GPU race-condition artifacts in viscous simulations
✅ No change to CPU outputs



@codeant-ai codeant-ai bot added the size:XS This PR changes 0-9 lines, ignoring generated files label Feb 21, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.



Copilot AI left a comment


Pull request overview

Fixes a GPU data-race hazard in the 3D viscous stress kernels by aligning their GPU_PARALLEL_LOOP privatization with the already-correct 2D implementations.

Changes:

  • Add missing mixture variables (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) to the private clauses for the 3D shear-stress and bulk-stress GPU parallel loops.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

321-321: LGTM — consider adding q to the private clause to be fully explicit

The added variables (i, j, k, l, rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) correctly mirror the 2D shear-stress loop at line 104 and fix the GPU race condition.

The integer q (declared at line 83) is used as the induction variable of the inner do q = 1, Re_size(i) seq sub-loop (lines 392–396). While OpenACC treats subroutine-local scalars as implicitly private in a parallel region, the project guideline calls for all loop-local variables to be listed explicitly. The same gap exists in the 2D counterpart at line 104.

💡 Suggested addition
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

(The same addition should be applied to the 2D shear/bulk loops at lines 104 and 214, and to the 3D bulk loop at line 430.)

Based on learnings: "Ensure private(...) declarations on all loop-local variables in GPU-accelerated code to prevent unintended data sharing."



codecov bot commented Feb 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.05%. Comparing base (df28255) to head (3fde8dd).
⚠️ Report is 9 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1225   +/-   ##
=======================================
  Coverage   44.05%   44.05%           
=======================================
  Files          70       70           
  Lines       20496    20496           
  Branches     1989     1989           
=======================================
  Hits         9030     9030           
  Misses      10328    10328           
  Partials     1138     1138           



codeant-ai bot commented Feb 22, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 22, 2026

if (shear_stress) then ! Shear stresses
-$:GPU_PARALLEL_LOOP(collapse=3, private='[alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')


Suggestion: Include the additional loop iterator in the GPU private list for this parallel loop to avoid potential data races and fully comply with the GPU macro guidelines. [custom_rule]

Severity Level: Minor ⚠️

Suggested change
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
Why it matters? ⭐

The improved code adds the loop-local iterator q to the GPU macro private list, which directly fixes a GPU-macro guideline violation in the PR: loop-local scalars used inside parallel regions must be privatized (see "GPU — private(...) on loop-local variables" in the review rules). The inner Re_size loop uses q inside the GPU_PARALLEL_LOOP region and leaving it shared risks data races on device/host. The suggested improved_code is syntactically consistent with existing GPU macro usage and resolves the rule violation.



if (bulk_stress) then ! Bulk stresses
-$:GPU_PARALLEL_LOOP(collapse=3, private='[alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

Suggestion: Add the missing loop iterator to the GPU private list for this bulk-stress parallel loop so that all loop-local scalars are correctly privatized. [custom_rule]

Severity Level: Minor ⚠️

Suggested change
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
Why it matters? ⭐

Like the shear-stress case, the bulk-stress parallel region contains an inner loop "do q = 1, Re_size(i)". Omitting q from the private list violates the GPU macro guideline to mark loop-local iterators private and can produce data races on device. Adding q to the private list directly addresses this correctness issue per the repository GPU rules.



codeant-ai bot commented Feb 22, 2026

CodeAnt AI Incremental review completed.


codeant-ai bot commented Feb 23, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 23, 2026

codeant-ai bot commented Feb 23, 2026

CodeAnt AI Incremental review completed.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

105-105: Minor: trailing space before comma in alpha_visc_sum ,alpha_visc.

Lines 105 and 215 have alpha_visc_sum ,alpha_visc (space before the comma), which is inconsistent with the standard alpha_visc_sum, alpha_visc formatting used in the corresponding 3D blocks at lines 322 and 431.

✏️ Proposed fix (both occurrences)
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum ,alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

Also applies to: 215-215


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e892dc and 164cda3.

📒 Files selected for processing (1)
  • src/simulation/m_viscous.fpp

sbryngelson and others added 3 commits February 23, 2026 09:48
The 3D shear stress and bulk stress GPU parallel loops were missing
rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their
private clauses. The corresponding 2D loops already had these
variables listed. Without privatization, these variables could cause
race conditions on GPU.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add i,j,k,l to the private list for the 3D shear_stress and
bulk_stress GPU parallel loops, matching the pattern already used
by the analogous 2D loops at lines 105 and 215.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The sequential loop iterator q (used in `do q = 1, Re_size(i)`) was not
privatized in any of the four GPU parallel regions. Without explicit
privatization, q is shared across GPU threads on OpenACC and AMD OpenMP
backends, causing a data race in Reynolds number computation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 164cda3 to 9b3ee97 on February 23, 2026 14:48
@github-actions

Claude Code Review

Files changed: 1

  • src/simulation/m_viscous.fpp

Summary of changes

  • Adds q to the private clause of the 2D shear stress GPU_PARALLEL_LOOP directive (num_dims > 1) at line ~105
  • Adds q to the private clause of the 2D bulk stress GPU_PARALLEL_LOOP directive (num_dims > 1) at line ~215
  • Expands the 3D shear stress GPU_PARALLEL_LOOP private clause (num_dims > 2) at line ~322 from 4 variables to 10, matching the 2D equivalent
  • Expands the 3D bulk stress GPU_PARALLEL_LOOP private clause (num_dims > 2) at line ~431 from 4 variables to 10, matching the 2D equivalent
  • Fixes GPU race conditions in 3D viscous stress calculations by privatizing loop indices (i,j,k,l) and intermediate variables (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, q)

Findings

No bugs or CLAUDE.md violations found. The fix is correct: all four GPU parallel loops now carry identical and complete private variable lists.

Improvement opportunities:

  1. Inconsistent spacing in private clause string (src/simulation/m_viscous.fpp, lines ~105, ~215): The 2D loops retained the pre-existing alpha_visc_sum ,alpha_visc (space before comma), while the newly written 3D clauses use alpha_visc_sum, alpha_visc (space after comma). These are the only two lines still carrying the irregular spacing.

  2. Duplicated private variable list across four directives (src/simulation/m_viscous.fpp): The same 10-variable private clause is now copy-pasted into four GPU_PARALLEL_LOOP calls. If a new thread-local variable is introduced in the loop body, all four sites must be updated—exactly the class of bug this PR fixes. Extracting the list into a shared Fypp macro (e.g., VISCOUS_PRIVATE_VARS) would make future additions a single-point change.

  3. No in-tree test exercising 3D GPU privatization (src/simulation/m_viscous.fpp): The PR description notes that the fix targets a GPU race condition only observable in 3D cases on GPU hardware, and states "Verify 3D viscous test cases produce correct results on GPU" as a manual step. A regression test (or CI case) covering 3D viscous flow on GPU would catch any future regression of this class automatically.


Head SHA: 9b3ee97 — reviewed by Claude Code


Labels

size:XS This PR changes 0-9 lines, ignoring generated files
