
Conversation

@danieljvickers
Member

@danieljvickers danieljvickers commented Dec 15, 2025

User description


Description

This PR contains changes to resolve MPI mode on the HiPerGator system.

Fixes #1056

Type of change


  • Bug fix (non-breaking change which fixes an issue)

Scope

How Has This Been Tested?

I ran various examples on the HiPerGator system with NVHPC/25.9 and OpenMPI/5.0.7 loaded.

Checklist


PR Type

Bug fix


Description

  • Fix MPI I/O operations by specifying array element indices

  • Remove debug print statement from serial data reading

  • Update hipergator module configuration for NVHPC/25.9

  • Simplify MPI I/O data size parameter in one code path


Diagram Walkthrough

flowchart LR
  A["MPI I/O Operations"] -->|"Add array indices"| B["MPI_FILE_READ/WRITE calls"]
  C["Debug Code"] -->|"Remove"| D["Print statement"]
  E["Hipergator Config"] -->|"Update modules"| F["NVHPC/25.9 & OpenMPI/5.0.7"]
  B --> G["Fixed MPI I/O"]

File Walkthrough

Relevant files
Bug fix
m_data_input.f90
Remove debug print statement                                                         

src/post_process/m_data_input.f90

  • Removed debug print statement that output q_cons_vf(i)%sf(:, 0, 0)
    after reading data
+0/-1     
m_data_output.fpp
Fix MPI I/O array indexing in write operations                     

src/pre_process/m_data_output.fpp

  • Changed MPI_FILE_WRITE_ALL calls to specify array element (1, 1, 1)
    instead of entire array
  • Applied fix to three separate MPI write operations for different
    variable groups
  • Simplified data size parameter in one write call from
    data_size*mpi_io_type to data_size
+4/-3     
m_start_up.fpp
Fix MPI I/O array indexing in read operations                       

src/simulation/m_start_up.fpp

  • Changed MPI_FILE_READ and MPI_FILE_READ_ALL calls to specify array
    element (1, 1, 1) instead of entire array
  • Applied fix to four separate MPI read operations for different
    variable groups
  • Simplified data size parameter in one read call from
    data_size*mpi_io_type to data_size
+6/-5     
Configuration changes
modules
Update hipergator module configuration                                     

toolchain/modules

  • Replaced manual MPI path configuration with module load openmpi/5.0.7
  • Removed hardcoded CUDA and OpenMPI paths for NVHPC/25.3
  • Updated LD_LIBRARY_PATH to use NVHPC/25.9 math libraries
  • Added UCX network device configuration for InfiniBand optimization
+4/-7     


CodeAnt-AI Description

Fix MPI file I/O alignment, remove stray debug print, and update Hipergator modules

What Changed

  • MPI file read/write now target the first element of each variable array when performing parallel I/O so files are read/written with correct offsets (fixes misaligned or corrupted MPI I/O).
  • Some MPI read/write calls now pass the corrected data-size argument so collective I/O uses the intended element count.
  • Removed a debug print that previously produced extra console output when reading serial data files.
  • Updated the Hipergator module configuration to load NVHPC 25.9 and OpenMPI 5.0.7 and set library and UCX device environment entries for GPU runs.

Impact

✅ Fewer MPI I/O data corruption errors during checkpoint/load
✅ Clearer console output when loading serial data (no stray debug prints)
✅ Works with Hipergator NVHPC 25.9 + OpenMPI 5.0.7 GPU runs

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Summary by CodeRabbit

  • Chores
    • Removed a runtime debug print to reduce console noise during startup and I/O.
    • Adjusted MPI file read/write buffer handling to make parallel I/O more explicit and consistent.
    • Updated GPU/module environment and library path settings to align the runtime/toolchain configuration.


@codeant-ai

codeant-ai bot commented Dec 15, 2025

CodeAnt AI is reviewing your PR.



@coderabbitai
Contributor

coderabbitai bot commented Dec 15, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Removed a debug print in post-process input; changed MPI I/O buffer arguments to use explicit subarray indexing (...%sf(1,1,1)) for multiple reads/writes in startup and output; and updated the toolchain modules file to prefer openmpi/5.0.7, adjust CUDA/LD_LIBRARY_PATH/UCX environment variables and Python to 3.12. No public interfaces changed.

Changes

Cohort / File(s) Summary
Fortran I/O changes
src/post_process/m_data_input.f90, src/pre_process/m_data_output.fpp, src/simulation/m_start_up.fpp
Removed a single runtime debug print in m_data_input.f90. Replaced MPI read/write buffer arguments from MPI_IO_DATA%var(i)%sf to MPI_IO_DATA%var(i)%sf(1,1,1) across serial and parallel branches; data sizes and control flow unchanged.
Toolchain / modules
toolchain/modules
Replaced CUDA/HPC-OMPI–centric environment with openmpi/5.0.7 and python/3.12, removed CUDA_HOME/HPC_OMPI vars, set MFC_CUDA_CC=100, updated LD_LIBRARY_PATH to NVHPC 25.9 math libs, and added UCX_NET_DEVICES. No API changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify MPI buffer addresses/subarray usage matches Fortran array memory layout and any MPI derived datatype requirements.
  • Confirm removed debug print is not relied on by tests or scripts.
  • Validate module changes on target systems (HiPerGator/CI) for MPI+GPU runs.

Possibly related PRs

Suggested labels

Review effort 3/5, size:M

Suggested reviewers

  • sbryngelson

Poem

🐰
I hopped through buffers, careful and neat,
trimmed a print, aligned reads so they meet.
OpenMPI wakes, UCX on the way,
the little rabbit cheered — code saves the day! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name | Status | Explanation | Resolution
Title check | ❓ Inconclusive | The title 'Debug hipergator mpi' is vague and doesn't clearly summarize the actual changes made to the codebase. | Use a more descriptive title that reflects the main changes, such as 'Fix MPI I/O alignment and update Hipergator modules for NVHPC/25.9' or 'Resolve MPI issues on Hipergator by fixing I/O operations and updating module configuration'.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description check | ✅ Passed | The PR description is comprehensive, includes issue reference (#1056), identifies bug fix type, explains testing approach, and provides detailed explanations of changes across all modified files.
Linked Issues check | ✅ Passed | The PR directly addresses issue #1056 by fixing MPI I/O operations (array indexing in read/write calls), removing debug output, and updating Hipergator module configuration for NVHPC/25.9 and OpenMPI/5.0.7 compatibility.
Out of Scope Changes check | ✅ Passed | All changes are directly related to resolving MPI on Hipergator: MPI I/O fixes in data read/write operations, debug cleanup in data input, and module configuration updates for the target system.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa1a187 and a0b3f00.

📒 Files selected for processing (1)
  • toolchain/modules (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • toolchain/modules
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Code Cleanliness Check
  • GitHub Check: Github (ubuntu, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, true)
  • GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
  • GitHub Check: Github (macos, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, true)
  • GitHub Check: Github (macos, mpi, debug, false)
  • GitHub Check: Coverage Test on CodeCov
  • GitHub Check: Build & Publish

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis 🔶

1056 - Partially compliant

Compliant requirements:

  • Resolve via changes to the toolchain modules file.

Non-compliant requirements:

  • Fix HiPerGator GPU runs so they work with MPI enabled.
  • Definition of done: an example can be run successfully with MPI enabled on HiPerGator.

Requires further human verification:

  • Re-run the exact reproduction steps on HiPerGator and confirm GPU+MPI works end-to-end.
  • Confirm multiple examples (and multi-rank runs) work, not just a single case/config.
⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The new MPI-IO calls pass MPI_IO_DATA%var(i)%sf(1, 1, 1) as the buffer start. Elsewhere arrays appear to be read with 0:m,0:n,0:p bounds, so hardcoding (1,1,1) may skip the first plane or even be out-of-bounds depending on the array lower bounds. Consider using lbound-based indexing (e.g., sf(lbound(sf,1), lbound(sf,2), lbound(sf,3))) or ensuring the array is definitively 1-based in all builds.

    call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
                            mpi_io_p, status, ierr)
end do
!Additional variables pb and mv for non-polytropic qbmm
if (qbmm .and. .not. polytropic) then
    do i = sys_size + 1, sys_size + 2*nb*nnode
        var_MOK = int(i, MPI_OFFSET_KIND)

        ! Initial displacement to skip at beginning of file
        disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)

        call MPI_FILE_SET_VIEW(ifile, disp, mpi_io_p, MPI_IO_DATA%view(i), &
                               'native', mpi_info_int, ierr)
        call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
                                mpi_io_p, status, ierr)
    end do
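
A minimal sketch of the lbound-based alternative mentioned above (hypothetical, not part of this PR; it reuses the names from the quoted diff, keeps the existing count expression, and assumes sf is a contiguous 3-D array):

    associate (sf => MPI_IO_DATA%var(i)%sf)
      ! Pass the buffer starting at the array's actual lower bounds so a
      ! 0- or -1-based allocation is not skipped by a hard-coded (1, 1, 1).
      call MPI_FILE_WRITE_ALL(ifile, sf(lbound(sf, 1), lbound(sf, 2), lbound(sf, 3)), &
                              data_size*mpi_io_type, mpi_io_p, status, ierr)
    end associate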
Possible Issue

In one path the count argument changed from data_size*mpi_io_type to data_size for MPI_FILE_READ_ALL. Ensure data_size is in units of the datatype passed (elements of mpi_io_p) and not bytes; otherwise this can under/over-read and cause MPI-IO hangs/corruption on some MPI implementations.

    call MPI_FILE_READ_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
                           mpi_io_p, status, ierr)
end do
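
One way to make the unit assumption explicit is a guard before the call. This is a sketch only, assuming data_size is intended to be an element count and that sf holds exactly the data being read (neither is verified here):

    ! The MPI-IO count is in elements of the datatype mpi_io_p, so it should
    ! match the number of elements in the buffer; abort loudly otherwise.
    if (data_size /= size(MPI_IO_DATA%var(i)%sf)) then
      call s_mpi_abort('MPI-IO count does not match restart buffer size')
    end if
    call MPI_FILE_READ_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
                           mpi_io_p, status, ierr)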
Config Risk

The module config now hardcodes UCX_NET_DEVICES and injects an NVHPC math-lib LD_LIBRARY_PATH path. This may be site-specific and brittle (device names can vary per node/partition), and can override user environment unexpectedly. Consider making UCX selection conditional/optional, or documenting why these exact devices are required on HiPerGator; also verify OpenMPI module load already sets the needed runtime library paths to avoid conflicts.

h-gpu openmpi/5.0.7
h-gpu MFC_CUDA_CC=100 
h-gpu LD_LIBRARY_PATH=/apps/compilers/nvhpc/25.9/Linux_x86_64/25.9/math_libs/12.9/lib64:$LD_LIBRARY_PATH
h-gpu UCX_NET_DEVICES="mlx5_4:1,mlx5_7:1,mlx5_8:1,mlx5_9:1,mlx5_10:1,mlx5_13:1,mlx5_14:1,mlx5_15:1"

@codeant-ai codeant-ai bot added the size:S This PR changes 10-29 lines, ignoring generated files label Dec 15, 2025
Comment on lines +768 to 769
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
mpi_io_p, status, ierr)
Contributor


Suggestion: In the MPI_FILE_WRITE_ALL call, correct the count parameter from data_size*mpi_io_type to data_size to prevent writing an incorrect amount of data and ensure consistency with other changes in the PR. [possible issue, importance: 9]

Suggested change
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
mpi_io_p, status, ierr)
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
mpi_io_p, status, ierr)

Comment on lines +682 to 683
call MPI_FILE_READ(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
mpi_io_p, status, ierr)
Contributor


Suggestion: In the MPI_FILE_READ call, correct the count parameter from data_size*mpi_io_type to data_size to prevent reading an incorrect amount of data and ensure consistency with other changes in the PR. [possible issue, importance: 9]

Suggested change
call MPI_FILE_READ(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
mpi_io_p, status, ierr)
call MPI_FILE_READ(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
mpi_io_p, status, ierr)

@codeant-ai

codeant-ai bot commented Dec 15, 2025

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Array lower-bound assumption
    New MPI read calls pass the buffer as MPI_IO_DATA%var(i)%sf(1, 1, 1). Many arrays in this module are allocated with non-1 lower bounds (e.g. -1:...), so using (1,1,1) will skip or misplace data (or write out-of-bounds), causing incorrect initial conditions or memory corruption. Verify and use the actual lower bounds or pass the whole array pointer consistently.

  • Hard-coded buffer start
    The new MPI write calls use the hard-coded array element (...%sf(1, 1, 1)) as the buffer start. If the array sf has non-1 lower bounds (e.g. -1 or 0) or different allocations for downsampled buffers, indexing (1,1,1) can be out-of-bounds or point to the wrong memory location. This must be validated for every allocation pattern used by MPI_IO_DATA%var(i)%sf.

  • Inconsistent MPI count units
    The code uses different forms for the MPI write count argument: some calls pass data_size*mpi_io_type while one call passes data_size. It's unclear whether mpi_io_type is a bytes-per-element multiplier or an MPI datatype handle. If the count is in "number of elements" but is multiplied by bytes-per-element, writes will be incorrect or fail. Verify expected units for the count parameter and make them consistent across all MPI_FILE_WRITE_ALL calls.

  • Buffer contiguity / file view alignment
    The code relies on passing a starting element together with MPI_FILE_SET_VIEW to position the file pointer. Ensure the memory layout and MPI file view match (contiguity, element order, datatype) so that MPI writes contiguous data from memory into the correct file offsets. If arrays are non-contiguous (slices or derived types), consider creating an MPI derived datatype or packing into a contiguous buffer first (see the sketch after this list).

  • MPI count inconsistency
    Most MPI calls use a count of data_size*mpi_io_type, but one new MPI_FILE_READ_ALL call uses data_size (no multiplication by mpi_io_type). This mismatch between count and datatype likely causes too-small/too-large reads and corrupts data. Ensure the count unit matches the MPI datatype semantics across all branches.
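
For the contiguity and alignment point above, one option is to describe the in-memory block with an MPI subarray datatype instead of relying on base-element addressing. The following is a hypothetical sketch, again reusing the diff's variable names and assuming mpi_io_p is the per-element datatype and the full local block is written:

    integer :: memtype, sizes(3), subsizes(3), starts(3)  ! declarations belong in the routine's specification part

    sizes = shape(MPI_IO_DATA%var(i)%sf)  ! full local allocation
    subsizes = sizes                      ! here the whole block is written
    starts = 0                            ! subarray offsets are zero-based

    call MPI_TYPE_CREATE_SUBARRAY(3, sizes, subsizes, starts, MPI_ORDER_FORTRAN, &
                                  mpi_io_p, memtype, ierr)
    call MPI_TYPE_COMMIT(memtype, ierr)
    call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf, 1, memtype, status, ierr)
    call MPI_TYPE_FREE(memtype, ierr)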

@codeant-ai

codeant-ai bot commented Dec 15, 2025

CodeAnt AI finished reviewing your PR.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/pre_process/m_data_output.fpp (1)

768-783: MPI write buffers: sf(1,1,1) change looks correct; please re‑check count usage.

Switching the MPI-IO buffer from MPI_IO_DATA%var(i)%sf to MPI_IO_DATA%var(i)%sf(1, 1, 1) is a good fix here: it passes the base element of the contiguous array and is consistent with the corresponding MPI read changes in m_start_up, which should help with NVHPC/OpenMPI 5 interfaces. The rest of the call signature (including data_size*mpi_io_type in the bubbles_euler branches) remains coherent.

In the non‑bubbles_euler parallel-IO branch (Line 796), you still use data_size (without *mpi_io_type) as the count, whereas other MPI-IO calls with mpi_io_p use data_size*mpi_io_type. That asymmetry predates this PR, but since you’re already touching these sites, it may be worth confirming that mpi_io_type == 1 is guaranteed in this path; otherwise, aligning this call with the others would make the intent clearer.

Also applies to: 796-797

toolchain/modules (1)

87-90: HiPerGator GPU module lines match the tested stack; consider minor robustness tweaks.

The new h-gpu stanza (NVHPC 25.9 + openmpi/5.0.7, MFC_CUDA_CC=100, NVHPC math_libs in LD_LIBRARY_PATH, and UCX device selection) matches the environment described in the PR and is a reasonable way to encode the working setup into mfc.sh.

Two optional improvements you might consider:

  • Instead of hard-coding /apps/compilers/nvhpc/25.9/.../math_libs/12.9/lib64, derive this from the NVHPC module (for example via $NVHPC) so that future NVHPC minor updates don’t require editing this file.
  • Periodically re-check that the UCX_NET_DEVICES setting matches current HiPerGator RC recommendations, since network topology and best-practice UCX configs can change over time.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ae42aa and 9d8edcc.

📒 Files selected for processing (4)
  • src/post_process/m_data_input.f90 (0 hunks)
  • src/pre_process/m_data_output.fpp (3 hunks)
  • src/simulation/m_start_up.fpp (4 hunks)
  • toolchain/modules (1 hunks)
💤 Files with no reviewable changes (1)
  • src/post_process/m_data_input.f90
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{fpp,f90}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{fpp,f90}: Use 2-space indentation; continuation lines align beneath &
Use lower-case keywords and intrinsics (do, end subroutine, etc.)
Name modules with m_ pattern (e.g., m_transport)
Name public subroutines with s_ pattern (e.g., s_compute_flux)
Name public functions with f_ pattern
Keep subroutine size ≤ 500 lines, helper subroutines ≤ 150 lines, functions ≤ 100 lines, files ≤ 1000 lines
Limit routine arguments to ≤ 6; use derived-type params struct if more are needed
Forbid goto statements (except in legacy code), COMMON blocks, and save globals
Every argument must have explicit intent; use dimension/allocatable/pointer as appropriate
Call s_mpi_abort() for errors, never use stop or error stop

**/*.{fpp,f90}: Indent 2 spaces; continuation lines align under &
Use lower-case keywords and intrinsics (do, end subroutine, etc.)
Name modules with m_<feature> prefix (e.g., m_transport)
Name public subroutines as s_<verb>_<noun> (e.g., s_compute_flux) and functions as f_<verb>_<noun>
Keep private helpers in the module; avoid nested procedures
Enforce size limits: subroutine ≤ 500 lines, helper ≤ 150, function ≤ 100, module/file ≤ 1000
Limit subroutines to ≤ 6 arguments; otherwise pass a derived-type 'params' struct
Avoid goto statements (except unavoidable legacy); avoid global state (COMMON, save)
Every variable must have intent(in|out|inout) specification and appropriate dimension / allocatable / pointer
Use s_mpi_abort(<msg>) for error termination instead of stop
Use !> style documentation for header comments; follow Doxygen Fortran format with !! @param and !! @return tags
Use implicit none statement in all modules
Use private declaration followed by explicit public exports in modules
Use derived types with pointers for encapsulation (e.g., pointer, dimension(:,:,:) => null())
Use pure and elemental attributes for side-effect-free functions; combine them for array ...

Files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
src/simulation/**/*.{fpp,f90}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

src/simulation/**/*.{fpp,f90}: Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)
Allocate large GPU arrays with managed memory or move them into persistent !$acc enter data regions at start-up
Avoid stop/error stop inside GPU device code
Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

src/simulation/**/*.{fpp,f90}: Mark GPU-callable helpers with $:GPU_ROUTINE(function_name='...', parallelism='[seq]') immediately after declaration
Do not use OpenACC or OpenMP directives directly; use Fypp macros from src/common/include/parallel_macros.fpp instead
Wrap tight loops with $:GPU_PARALLEL_FOR(private='[...]', copy='[...]') macro; add collapse=n for safe nested loop merging
Declare loop-local variables with private='[...]' in GPU parallel loop macros
Allocate large arrays with managed or move them into a persistent $:GPU_ENTER_DATA(...) region at start-up
Do not place stop or error stop inside device code

Files:

  • src/simulation/m_start_up.fpp
src/**/*.fpp

📄 CodeRabbit inference engine (.cursor/rules/mfc-agent-rules.mdc)

src/**/*.fpp: Use .fpp file extension for Fypp preprocessed files; CMake transpiles them to .f90
Start module files with Fypp include for macros: #:include 'macros.fpp'
Use the fypp ASSERT macro for validating conditions: @:ASSERT(predicate, message)
Use fypp macro @:ALLOCATE(var1, var2) for device-aware allocation instead of standard Fortran allocate
Use fypp macro @:DEALLOCATE(var1, var2) for device-aware deallocation instead of standard Fortran deallocate

Files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
🧠 Learnings (15)
📓 Common learnings
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Avoid stop/error stop inside GPU device code
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Do not use OpenACC or OpenMP directives directly; use Fypp macros from `src/common/include/parallel_macros.fpp` instead
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight loops with `$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')` macro; add `collapse=n` for safe nested loop merging
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Mark GPU-callable helpers with `$:GPU_ROUTINE(function_name='...', parallelism='[seq]')` immediately after declaration
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

  • toolchain/modules
  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Allocate large arrays with `managed` or move them into a persistent `$:GPU_ENTER_DATA(...)` region at start-up

Applied to files:

  • src/simulation/m_start_up.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Allocate large GPU arrays with managed memory or move them into persistent !$acc enter data regions at start-up

Applied to files:

  • src/simulation/m_start_up.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Do not use OpenACC or OpenMP directives directly; use Fypp macros from `src/common/include/parallel_macros.fpp` instead

Applied to files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)

Applied to files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to **/*.{fpp,f90} : Use `s_mpi_abort(<msg>)` for error termination instead of `stop`

Applied to files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Avoid stop/error stop inside GPU device code

Applied to files:

  • src/simulation/m_start_up.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight loops with `$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')` macro; add `collapse=n` for safe nested loop merging

Applied to files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to **/*.{fpp,f90} : Call s_mpi_abort(<msg>) for errors, never use stop or error stop

Applied to files:

  • src/simulation/m_start_up.fpp
  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Declare loop-local variables with `private='[...]'` in GPU parallel loop macros

Applied to files:

  • src/simulation/m_start_up.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to **/*.{fpp,f90} : Use `wp` (working precision) parameter from `m_precision_select` instead of hardcoded precision like `real*8`

Applied to files:

  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to **/*.{fpp,f90} : Every argument must have explicit intent; use dimension/allocatable/pointer as appropriate

Applied to files:

  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to **/*.{fpp,f90} : Each variable should have one clear purpose; do not use the same variable for multiple purposes

Applied to files:

  • src/pre_process/m_data_output.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to **/*.{fpp,f90} : Every variable must have `intent(in|out|inout)` specification and appropriate `dimension` / `allocatable` / `pointer`

Applied to files:

  • src/pre_process/m_data_output.fpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Detect File Changes
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Code Cleanliness Check
  • GitHub Check: Coverage Test on CodeCov
  • GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
  • GitHub Check: Github (macos, mpi, debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, true)
  • GitHub Check: Github (macos, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, true)
  • GitHub Check: Build & Publish
🔇 Additional comments (1)
src/simulation/m_start_up.fpp (1)

682-692: MPI read buffers: sf(1,1,1) usage is consistent with writes; verify mpi_io_type assumptions.

Updating the MPI-IO reads to use MPI_IO_DATA%var(i)%sf(1, 1, 1) (both in the file-per-process branch and the global restart branch) is aligned with the corresponding write-side changes and should address the Fortran/MPI buffer descriptor issues seen with NVHPC + OpenMPI 5.0.7. Read/write symmetry for all MPI_IO_DATA variables (including qbmm pb/mv) now looks correct.

As in m_data_output, note that:

  • bubbles_euler/elasticity branches use data_size*mpi_io_type as the count;
  • the non‑bubbles branch with MPI_FILE_READ_ALL uses data_size only.

That pattern matches the existing write code, so it’s not a new behavioral change, but given these MPI-IO fixes are specifically targeting a portability bug, it may be worth confirming that mpi_io_type is always 1 in the non‑bubbles case, or otherwise harmonizing the count for clarity.

Also applies to: 828-841, 853-855
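
A minimal way to encode that assumption in the non-bubbles read path is a guard using the repo's Fypp ASSERT macro. This is a sketch; it presumes mpi_io_type is an integer count factor, which has not been verified here:

    ! Fail loudly if the count convention ever changes, instead of silently
    ! under- or over-reading the restart data.
    @:ASSERT(mpi_io_type == 1, 'non-bubbles MPI-IO read assumes mpi_io_type == 1')
    call MPI_FILE_READ_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
                           mpi_io_p, status, ierr)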

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 4 files

Prompt for AI agents (all 2 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/pre_process/m_data_output.fpp">

<violation number="1" location="src/pre_process/m_data_output.fpp:796">
P2: Inconsistent MPI write count parameter: this call uses `data_size` while other MPI_FILE_WRITE_ALL calls in this file use `data_size*mpi_io_type`. This inconsistency could cause incorrect write sizes or runtime MPI errors. Ensure all MPI write operations use the same count/datatype convention.</violation>
</file>

<file name="src/simulation/m_start_up.fpp">

<violation number="1" location="src/simulation/m_start_up.fpp:853">
P2: Inconsistent MPI read count parameter: this call uses `data_size` while other MPI_FILE_READ/MPI_FILE_READ_ALL calls in this file use `data_size*mpi_io_type`. This mismatch could cause incorrect read sizes and data corruption. Ensure all MPI read operations use a consistent count/datatype convention.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

@codecov

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 0% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.29%. Comparing base (8ae42aa) to head (a0b3f00).

Files with missing lines Patch % Lines
src/simulation/m_start_up.fpp 0.00% 5 Missing ⚠️
src/pre_process/m_data_output.fpp 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1089      +/-   ##
==========================================
- Coverage   44.08%   42.29%   -1.79%     
==========================================
  Files          71       71              
  Lines       20332    20423      +91     
  Branches     1981     1982       +1     
==========================================
- Hits         8963     8638     -325     
- Misses      10236    10702     +466     
+ Partials     1133     1083      -50     

☔ View full report in Codecov by Sentry.

@danieljvickers
Member Author

Every single benchmark failed, which doesn't seem like spurious CI to me... Really odd to pass the test suite and have this minor change result in every benchmark failing. Is there any way I can get access to the specific logs for one of these benchmarks? The raw logs point to some other files that aren't available through here.

@danieljvickers
Member Author

@sbryngelson All of the benchmarks are failing with code 143 for a seg fault. I can't see the Phoenix logs directly on here, but the Frontier logs show a seg fault when building pre_process. I just hopped on Frontier and built the cases fine with --case-optimization, GPU, MPI, etc., to match the bench test.

Since it was every single benchmark on both machines, I don't think it was a random fluke, but I am running them again just to be safe. Do you have any thoughts on this in the meantime?

@sbryngelson
Member

I'm not sure what's going on here. Have you tried debugging this? For example, you could revert your changes and see if you still experience failures.

@danieljvickers
Member Author

I manually got on Frontier earlier today and was not able to recreate the compiler seg fault that we are seeing in the logs. However, the code did not work with the Cray compilers, while the issue appears to be gone with NVHPC on Phoenix.

That makes sense, because the test suite doesn't exercise any of the changes (the test suite runs without parallel_io). So it seems this fix won't work with the Cray compilers, sadly. This all ties into the conversation with you, me, and Mat earlier on Slack.


Labels

Review effort 4/5 size:S This PR changes 10-29 lines, ignoring generated files

Development

Successfully merging this pull request may close these issues.

MPI Broken on HiPerGator

2 participants