Debug hipergator mpi #1089
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Removed a debug print in post-process input; changed MPI I/O buffer arguments to use explicit subarray indexing (sf(1, 1, 1)); updated the HiPerGator module configuration.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
PR Reviewer Guide 🔍

Here are some key observations to aid the review process:
```fortran
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
                        mpi_io_p, status, ierr)
```

Suggestion: In the `MPI_FILE_WRITE_ALL` call, correct the count parameter from `data_size*mpi_io_type` to `data_size` to prevent writing an incorrect amount of data and to ensure consistency with other changes in the PR. [possible issue, importance: 9]

Suggested change:

```fortran
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
                        mpi_io_p, status, ierr)
```
```fortran
call MPI_FILE_READ(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size*mpi_io_type, &
                   mpi_io_p, status, ierr)
```

Suggestion: In the `MPI_FILE_READ` call, correct the count parameter from `data_size*mpi_io_type` to `data_size` to prevent reading an incorrect amount of data and to ensure consistency with other changes in the PR. [possible issue, importance: 9]

Suggested change:

```fortran
call MPI_FILE_READ(ifile, MPI_IO_DATA%var(i)%sf(1, 1, 1), data_size, &
                   mpi_io_p, status, ierr)
```
CodeAnt AI finished reviewing your PR.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/pre_process/m_data_output.fpp (1)
768-783: MPI write buffers: the sf(1,1,1) change looks correct; please re-check count usage.

Switching the MPI-IO buffer from `MPI_IO_DATA%var(i)%sf` to `MPI_IO_DATA%var(i)%sf(1, 1, 1)` is a good fix here: it passes the base element of the contiguous array and is consistent with the corresponding MPI read changes in `m_start_up`, which should help with NVHPC/OpenMPI 5 interfaces. The rest of the call signature (including `data_size*mpi_io_type` in the `bubbles_euler` branches) remains coherent.

In the non-`bubbles_euler` parallel-IO branch (line 796), you still use `data_size` (without `*mpi_io_type`) as the count, whereas other MPI-IO calls with `mpi_io_p` use `data_size*mpi_io_type`. That asymmetry predates this PR, but since you're already touching these sites, it may be worth confirming that `mpi_io_type == 1` is guaranteed in this path; otherwise, aligning this call with the others would make the intent clearer.

Also applies to: 796-797
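For context, a minimal standalone sketch of the buffer/count convention under discussion (this is not MFC code; the file name, array shape, and use of `MPI_DOUBLE_PRECISION` are illustrative assumptions). The count argument counts elements of the given datatype, and passing the base element of a contiguous array hands MPI the start address of the whole buffer:

```fortran
program mpi_io_write_sketch
  use mpi
  implicit none
  integer :: ifile, ierr
  integer :: status(MPI_STATUS_SIZE)
  integer :: data_size
  double precision, allocatable :: buf(:, :, :)

  call MPI_INIT(ierr)
  allocate (buf(4, 4, 4))
  buf = 1.0d0
  data_size = size(buf)  ! count is in elements of the datatype, not bytes

  call MPI_FILE_OPEN(MPI_COMM_WORLD, 'sketch.dat', &
                     MPI_MODE_WRONLY + MPI_MODE_CREATE, MPI_INFO_NULL, ifile, ierr)
  ! Passing the base element buf(1, 1, 1) gives MPI the start of the
  ! contiguous buffer; data_size elements of MPI_DOUBLE_PRECISION follow.
  call MPI_FILE_WRITE_ALL(ifile, buf(1, 1, 1), data_size, &
                          MPI_DOUBLE_PRECISION, status, ierr)
  call MPI_FILE_CLOSE(ifile, ierr)
  call MPI_FINALIZE(ierr)
end program mpi_io_write_sketch
```

If the count were instead multiplied by an extra factor while the datatype stayed the same, the call would describe more data than the buffer holds, which is the kind of mismatch the suggestions above flag.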
toolchain/modules (1)

87-90: HiPerGator GPU module lines match the tested stack; consider minor robustness tweaks.

The new h-gpu stanza (NVHPC 25.9 + `openmpi/5.0.7`, `MFC_CUDA_CC=100`, NVHPC math_libs in `LD_LIBRARY_PATH`, and UCX device selection) matches the environment described in the PR and is a reasonable way to encode the working setup into mfc.sh.

Two optional improvements you might consider:

- Instead of hard-coding `/apps/compilers/nvhpc/25.9/.../math_libs/12.9/lib64`, derive this from the NVHPC module (for example via `$NVHPC`) so that future NVHPC minor updates don't require editing this file (a hypothetical sketch follows this list).
- Periodically re-check that the `UCX_NET_DEVICES` setting matches current HiPerGator RC recommendations, since network topology and best-practice UCX configs can change over time.
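A rough sketch of the first point. This is hypothetical: it assumes the nvhpc module exports an `NVHPC` root variable whose math_libs layout matches the hard-coded path, and the actual toolchain/modules syntax may differ:

```sh
# Hypothetical: derive the math_libs location from the root exported by the
# nvhpc module instead of hard-coding the /apps/compilers/nvhpc/25.9/... prefix.
export LD_LIBRARY_PATH="${NVHPC}/math_libs/12.9/lib64:${LD_LIBRARY_PATH}"
```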
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/post_process/m_data_input.f90 (0 hunks)
- src/pre_process/m_data_output.fpp (3 hunks)
- src/simulation/m_start_up.fpp (4 hunks)
- toolchain/modules (1 hunks)
💤 Files with no reviewable changes (1)
- src/post_process/m_data_input.f90
🔇 Additional comments (1)
src/simulation/m_start_up.fpp (1)
682-692: MPI read buffers: sf(1,1,1) usage is consistent with writes; verify mpi_io_type assumptions.

Updating the MPI-IO reads to use `MPI_IO_DATA%var(i)%sf(1, 1, 1)` (both in the file-per-process branch and the global restart branch) is aligned with the corresponding write-side changes and should address the Fortran/MPI buffer descriptor issues seen with NVHPC + OpenMPI 5.0.7. Read/write symmetry for all `MPI_IO_DATA` variables (including qbmm `pb`/`mv`) now looks correct.

As in `m_data_output`, note that:

- `bubbles_euler`/`elasticity` branches use `data_size*mpi_io_type` as the count;
- the non-bubbles branch with `MPI_FILE_READ_ALL` uses `data_size` only.

That pattern matches the existing write code, so it's not a new behavioral change, but given these MPI-IO fixes are specifically targeting a portability bug, it may be worth confirming that `mpi_io_type` is always 1 in the non-bubbles case, or otherwise harmonizing the count for clarity.

Also applies to: 828-841, 853-855
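A matching read-side sketch of the same convention (again not MFC code; the file name, array shape, and datatype are illustrative assumptions), complementing the write sketch earlier in this review:

```fortran
program mpi_io_read_sketch
  use mpi
  implicit none
  integer :: ifile, ierr
  integer :: status(MPI_STATUS_SIZE)
  integer :: data_size
  double precision, allocatable :: buf(:, :, :)

  call MPI_INIT(ierr)
  allocate (buf(4, 4, 4))
  data_size = size(buf)  ! element count must match what was written

  call MPI_FILE_OPEN(MPI_COMM_WORLD, 'sketch.dat', MPI_MODE_RDONLY, &
                     MPI_INFO_NULL, ifile, ierr)
  ! buf(1, 1, 1) is the base element of the contiguous receive buffer;
  ! the count is again in elements of MPI_DOUBLE_PRECISION.
  call MPI_FILE_READ(ifile, buf(1, 1, 1), data_size, &
                     MPI_DOUBLE_PRECISION, status, ierr)
  call MPI_FILE_CLOSE(ifile, ierr)
  call MPI_FINALIZE(ierr)
end program mpi_io_read_sketch
```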
2 issues found across 4 files
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
```xml
<file name="src/pre_process/m_data_output.fpp">
<violation number="1" location="src/pre_process/m_data_output.fpp:796">
P2: Inconsistent MPI write count parameter: this call uses `data_size` while other MPI_FILE_WRITE_ALL calls in this file use `data_size*mpi_io_type`. This inconsistency could cause incorrect write sizes or runtime MPI errors. Ensure all MPI write operations use the same count/datatype convention.</violation>
</file>
<file name="src/simulation/m_start_up.fpp">
<violation number="1" location="src/simulation/m_start_up.fpp:853">
P2: Inconsistent MPI read count parameter: this call uses `data_size` while other MPI_FILE_READ/MPI_FILE_READ_ALL calls in this file use `data_size*mpi_io_type`. This mismatch could cause incorrect read sizes and data corruption. Ensure all MPI read operations use a consistent count/datatype convention.</violation>
</file>
```
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #1089      +/-   ##
==========================================
- Coverage   44.08%   42.29%    -1.79%
==========================================
  Files          71       71
  Lines       20332    20423       +91
  Branches     1981     1982        +1
==========================================
- Hits         8963     8638      -325
- Misses      10236    10702      +466
+ Partials     1133     1083       -50
```

☔ View full report in Codecov by Sentry.
Every single benchmark failed, which doesn't seem like spurious CI to me... Really odd to pass the test suite and have this minor change result in each benchmark failing. Any way I can get access to the specific logs for one of these benchmarks? The raw logs point to some other files that aren't available through here.
@sbryngelson All of the benchmarks are failing with code 143 for a seg fault. I can't see the logs for Phoenix directly on here, but the Frontier logs are seg faulting when building pre_process. I just hopped on Frontier and built the cases fine with --case-optimization, GPU, MPI, etc., to match the bench test. Since it was every single benchmark on both machines, I don't think it was a random fluke, but I am running them again just to be safe. Do you have any thoughts on this in the meantime?
I'm not sure what's going on here. Have you tried debugging this? For example, you could revert your changes and see if you still experience failures.
I manually got on Frontier earlier today and I was not able to recreate the compiler seg fault that we are seeing in the logs. However, the code did not work with the Cray compilers, while the issue appears to be gone with NVHPC on Phoenix. It makes sense because the test suite doesn't exercise any of the changes (the test suite runs without parallel_io). So it seems like this fix won't work with Cray compilers, sadly. This all ties into the conversation with you, me, and Mat earlier on Slack.
User description
Description
This PR contains changes that fix running in MPI mode on the HiPerGator system.
Fixes #1056
How Has This Been Tested?
I ran various examples on the HiPerGator system with NVHPC/25.9 and OpenMPI/5.0.7 loaded.
PR Type
Bug fix
Description
- Fix MPI I/O operations by specifying array element indices
- Remove debug print statement from serial data reading
- Update HiPerGator module configuration for NVHPC/25.9
- Simplify MPI I/O data size parameter in one code path
File Walkthrough

m_data_input.f90 — Remove debug print statement (src/post_process/m_data_input.f90)
- Removed debug print of `q_cons_vf(i)%sf(:, 0, 0)` after reading data

m_data_output.fpp — Fix MPI I/O array indexing in write operations (src/pre_process/m_data_output.fpp)
- Updated `MPI_FILE_WRITE_ALL` calls to specify array element `(1, 1, 1)` instead of the entire array for variable groups
- Changed one count argument from `data_size*mpi_io_type` to `data_size`

m_start_up.fpp — Fix MPI I/O array indexing in read operations (src/simulation/m_start_up.fpp)
- Updated `MPI_FILE_READ` and `MPI_FILE_READ_ALL` calls to specify array element `(1, 1, 1)` instead of the entire array for variable groups
- Changed one count argument from `data_size*mpi_io_type` to `data_size`

modules — Update HiPerGator module configuration (toolchain/modules)
- Switched to `openmpi/5.0.7` and updated `LD_LIBRARY_PATH` to use NVHPC/25.9 math libraries

CodeAnt-AI Description
Fix MPI file I/O alignment, remove stray debug print, and update Hipergator modules
What Changed
Impact
- ✅ Fewer MPI I/O data corruption errors during checkpoint/load
- ✅ Clearer console output when loading serial data (no stray debug prints)
- ✅ Works with HiPerGator NVHPC 25.9 + OpenMPI 5.0.7 GPU runs

💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.