
Conversation

@uturuncoglu
Collaborator

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub-component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on Hera, Derecho, or Hercules.
  • Commit 'test_changes.list' from the previous step.

Description:

Commit Message:

* UFSWM - 
  * AQM - 
  * CDEPS - 
  * CICE - 
  * CMEPS - 
  * CMakeModules - 
  * FV3 - 
    * ccpp-physics - 
    * atmos_cubed_sphere - 
  * GOCART - 
  * HYCOM - 
  * MOM6 - 
  * NOAHMP - 
  * WW3 - 
  * fire_behavior - 
  * stochastic_physics - 
  * ADCIRC -
  * FVCOM -
  * PAHM -
  * ROMS -
  * SCHISM -
  * SCHISM-ESMF -
  * GEOGATE -

Priority:

  • Critical Bugfix: Reason
  • High: Reason
  • Normal

Git Tracking

UFSWM:

  • Closes #
  • None

Sub-component Pull Requests:

  • AQM:
  • CDEPS:
  • CICE:
  • CMEPS:
  • CMakeModules:
  • FV3:
    • ccpp-physics:
    • atmos_cubed_sphere:
  • GOCART:
  • HYCOM:
  • MOM6:
  • NOAHMP:
  • WW3:
  • fire_behavior:
  • stochastic_physics:
  • ADCIRC:
  • FVCOM:
  • PAHM:
  • ROMS:
  • SCHISM:
  • SCHISM-ESMF:
  • GEOGATE:
  • None

UFSWM Blocking Dependencies:

  • Blocked by #
  • None

Documentation:

  • This PR requires a documentation update, and the WM User's Guide has been updated based on the changes in this PR.
  • This PR requires a documentation update, and a WM issue has been opened to track the need for a documentation update; a person responsible for submitting the update has been assigned to the issue (link issue).
  • No documentation update is required for this PR (please explain).

Changes

Regression Test Changes (Please commit test_changes.list):

  • PR Adds New Tests/Baselines.
  • PR Updates/Changes Baselines.
  • No Baseline Changes.

Input data Changes:

  • None.
  • New input data.
  • Updated input data.

Library Changes/Upgrades:

  • Required
    • Library names w/versions:
    • Git Stack Issue (JCSDA/spack-stack#)
  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • GaeaC6
    • Derecho
    • Ursa
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@mansurjisan mansurjisan self-requested a review August 7, 2025 20:16
@uturuncoglu uturuncoglu changed the title bring duck rt Bring new regression test/s for Duck configuration Aug 7, 2025
@uturuncoglu
Collaborator Author

@mansurjisan @yunfangsun JFYI, I fixed the issue with generating the baseline files dynamically. I just needed to change the seq -w used in the regression test file to seq -f "%06g" (it was just an issue with the number of zeros in the file names). I also created a baseline and moved the input files under /work2/noaa/nems/tufuk/RT (the original location used by UFS Coastal). At this point, the test is failing with the following log,

 Comparing outputs/schout_000000_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000000_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000001_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000001_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000002_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000002_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000003_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000003_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000004_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000004_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000005_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000005_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000006_1.nc .....USING NCCMP......OK
 Comparing outputs/schout_000006_2.nc .....USING NCCMP......OK
 Comparing outputs/schout_000007_1.nc .....USING NCCMP......NOT IDENTICAL
...
...
 Comparing outputs/schout_000019_1.nc .....USING NCCMP......NOT IDENTICAL
...

So, it seems that data from some PETs are not identical. I checked one of the files with NCAR's cprnc tool as follows,

cprnc -m /work2/noaa/nems/tufuk/RT/NEMSfv3gfs/develop-20250721/coastal_sandy_duck_atm2sch_intel/outputs/schout_000019_1.nc /work2/noaa/stmp/tufuk/stmp/tufuk/FV3_RT/rt_843742/coastal_sandy_duck_atm2sch_intel/outputs/schout_000019_1.nc

and it seems that zcor has some NaN values in it. The other variables are fine. It seems that there is a bug in the SCHISM model that needs to be fixed. I am leaving the investigation of the error on the model side to the model developers. I think this is a nice example/demonstration of why we should move to more realistic cases, since we could not catch this issue with the existing artificial test. @josephzhang8 Do you have any idea about the zcor issue?

BTW, the test is a little bit slow since it is checking lots of files. I am not sure what the best way to handle this is (maybe we could have a merge step and then check, but that requires running an additional post-processing step through rt.sh and some customization); we could look at it later.
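
For reference, a minimal sketch of the zero-padding difference behind the seq fix mentioned above (illustrative values; the suffixes must match the schout_NNNNNN_* file names):

seq -w 0 19         # -> 00 01 ... 19 (-w pads only to the width of the endpoint)
seq -f "%06g" 0 19  # -> 000000 000001 ... 000019 (six digits, matching schout_000019_1.nc)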

@josephzhang8
Collaborator

@uturuncoglu: zcoor*.nc can have NaNs, e.g. below the bottom and on dry spots. This output is strictly for post-processing, so if it causes errors, I suggest we turn it off in param.nml (iof_hydro(25)).
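
A minimal sketch of that param.nml change, assuming SCHISM's &SCHOUT namelist group holds the iof_hydro output switches (double-check the index against your SCHISM version):

&SCHOUT
  iof_hydro(25) = 0  ! zcor (z-coordinates at whole levels); disabled to avoid NaN-only diffs
/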

@uturuncoglu
Collaborator Author

@josephzhang8 Thanks for the clarification. It is good to know. But it seems that the positions of those NaNs are changing. We could turn it off on our end, but you might want to double-check on the SCHISM side to be sure that there is no real issue. @mansurjisan I'll update param.nml to remove that variable and recreate the baseline. Let me know what you think.
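
One way to confirm that the NaN placement in zcor is the only difference is nccmp's NaN option; a hedged sketch (paths abbreviated):

nccmp -d -N -v zcor baseline/outputs/schout_000019_1.nc run/outputs/schout_000019_1.nc
# -d compares data, -N (--nans-are-equal) treats NaN == NaN, -v restricts the check to zcor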

@uturuncoglu
Collaborator Author

@josephzhang8 JFYI, some output from cprnc,

zcor   (nSCHISM_vgrid_layers,nSCHISM_hgrid_node,time)  t_index =      4     4
        177     5425  (     1,     2,     1) (     1,   175,     1) (     2,     2,     1) (    11,     2,     1)
                5425                Infinity  -8.093000411987305E+00Infinity  0.000000000000000E+00     NaN  9.007175504665868E-33
                5425                Infinity  -8.093000411987305E+00                       Infinity          7.599358562864073E+35
                5425  (     1,     2,     1) (     1,   175,     1)
          avg abs field values:                 Infinity    rms diff:     NaN   avg rel diff(npos):      NaN
                                                Infinity                        avg decimal digits(ndif):  NaN worst:  0.0
 RMS zcor                                    NaN            NORMALIZED  0.0000E+00

@josephzhang8
Collaborator

Thx. I'll consider adding init for this array...

@uturuncoglu
Collaborator Author

@josephzhang8 Yes, that would be great. Thanks. Even a small/large fill value (1.e-20, 1.e20) together with a mask attribute (e.g. _FillValue) on that variable would work better than NaN.
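
For the file side of that suggestion, a hedged sketch using NCO's ncatted to stamp a _FillValue attribute on zcor (this only creates the attribute; it does not rewrite values already stored as NaN):

ncatted -O -a _FillValue,zcor,o,f,1.0e20 schout_000019_1.nc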

@uturuncoglu
Collaborator Author

@mansurjisan The test is passing now, so you can check on your end.

@josephzhang8
Collaborator

@uturuncoglu : I've added init for zcoor* and tested. Plz pull the latest master from schism repo and let me know if it solves your problem. Thx

@mansurjisan
Collaborator

Hi @uturuncoglu,

Sorry for the delay. The Duck RT test passed on my end without any issues. I used your run directory as a baseline when comparing the results.

Also, I tried to compare the water elevation between my simulation and the baseline run. I tried to combine the outputs from both the baseline run and my run using SCHISM's combine_output_MPI program, but it looks like the local_to_global files are missing from the baseline directory.

I tried to generate a baseline using ./rt.sh -l rt_coastal.conf -a nos-surge -c -n "coastal_sandy_duck_atm2sch intel". The run completed successfully, but I don't see any local_to_global files in my baseline output directory. Could we save those local_to_global files so that we can use the combine_output_MPI program to merge all the NetCDF files and make plots?

My run directories are:

regression test output: /work2/noaa/nos-surge/mjisan/ufs-weather-model-coastal_new_rt/tests/stmp/mjisan/FV3_RT/rt_2238253/coastal_sandy_duck_atm2sch_intel/outputs

Baseline: /work2/noaa/nos-surge/mjisan/ufs-weather-model-coastal_new_rt/tests/stmp/mjisan/FV3_RT/REGRESSION_TEST/coastal_sandy_duck_atm2sch_intel/outputs

Collaborator

@mansurjisan mansurjisan left a comment

Everything looks good to me. The new Duck RT test case passed on my end.

@uturuncoglu
Collaborator Author

@mansurjisan Thanks for checking. That is great. We are not saving the local_to_global files since, at this point, we do not have a way to call the utility and post-process the output before the comparison step. If the individual files written by each processor are the same as the baseline, then we are fine. Are we planning to wait to bring in other tests (i.e. coupled with WW3)? I am asking because I am not sure whether the other Duck configurations are ready.

@mansurjisan
Collaborator

@uturuncoglu,

I think we can add the ATM2WW3 configuration for Duck to the existing RT system, although it's actively going through fine-tuning by @yunfangsun.

@uturuncoglu
Collaborator Author

@mansurjisan @yunfangsun If it is ready, we could add it to this PR.

@mansurjisan
Collaborator

Hi @uturuncoglu - I will take a look at the setup of the ATM2WW3 configuration for the Duck, NC case. If everything works on my end, I can add that to this PR.

@uturuncoglu
Collaborator Author

@mansurjisan That is great. Thanks for your help. Let me know if you need anything from my end.

@yunfangsun
Collaborator

Hi @uturuncoglu and @mansurjisan,

For the ATM+WW3, we also need the data ocean (DOCN) for the exchange of sea level data to WW3. Do we have a template for the data ocean?

Thank you!

@uturuncoglu
Collaborator Author

@mansurjisan @yunfangsun let's schedule a meeting for it, maybe next week. Just like the previous one, we could try to define ATM+WW3 with data ocean this time. BTW, how is ATM+WW3 running now? Is it using data ocean? Could you point me to a run directory so I can check it?

@yunfangsun
Collaborator

yunfangsun commented Aug 21, 2025

Hi @uturuncoglu

You can test /work2/noaa/nos-surge/yunfangs/duck/duck_From_Dan_03312025/stmp/yunfangs/FV3_RT/rt_1298294_atmww3/coastal_duck_atm2ww3_intel

And the ATM+WW3 is now using the data ocean.

@uturuncoglu
Collaborator Author

@yunfangsun okay, let me check; maybe we do not need to have a call. I'll update you.

@uturuncoglu
Collaborator Author

@yunfangsun I am not sure if this is intentional, but I am seeing the following in your docn.stream file,

stream_data_variables01: "msl So_h"

Actually, msl is the mean sea level pressure, and you are passing it to WW3 as the sea surface height. I think that is wrong. Could you double-check your configuration?
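
For contrast, a corrected entry might look like the following sketch, assuming a dedicated water-level file whose variable is named for what it actually contains (both names are hypothetical until the new data file exists):

stream_data_files01: "INPUT/wlv.nc"
stream_data_variables01: "ssh So_h"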

@uturuncoglu
Collaborator Author

@yunfangsun BTW, if you filled msl with the SSH data (I am not sure about its source - if it is not ERA5, then create a new file and name it correctly), please create a new data file or add a new variable to the file. We would like to create these RTs for external users and collaborators. When they use these as a reference to create their own cases, this would be very confusing for them.

@uturuncoglu
Collaborator Author

@yunfangsun BTW, the msl is constant globally but varies in time. We still need a new file with the variable named accordingly. Anyway, @mansurjisan @yunfangsun I think having global input is fine for now, but in the future we need to update the Duck cases to use forcing data subsetted over the region rather than global data. @mansurjisan I think you were working on the RT documentation. Do you have anything for the DATM+SCHISM and DATM+DOCN+WW3 cases? We could add them to the app-level documentation.
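
As a hedged sketch of that regional subsetting, NCO's ncks hyperslab option can cut the global forcing down to a box around Duck, NC (dimension names and bounds are illustrative and depend on the source file's conventions, e.g. 0-360 vs. -180-180 longitudes):

ncks -O -d latitude,34.0,38.0 -d longitude,-78.0,-73.0 era5_global.nc duck_subset.nc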

@yunfangsun
Collaborator

Hi @uturuncoglu ,

I put the sea level data into an ERA5 file; OK, I will make a new nc file with the variable name So_h to replace it.

Thank you!

@yunfangsun
Collaborator

Hi @uturuncoglu ,

The new water level file is /work2/noaa/nos-surge/yunfangs/duck/duck_From_Dan_03312025/stmp/yunfangs/FV3_RT/rt_1298294_atmww3/coastal_duck_atm2ww3_intel/INPUT/wlv.nc.

@uturuncoglu
Collaborator Author

@yunfangsun Thanks. That is great. If you create that data, please name the file accordingly and use correct metadata in it. This will prevent confusion at the user level. Also, please provide me with the details of the configuration and a link/citation to the original source of the data. Again, I would like to put those in the documentation.

@uturuncoglu
Collaborator Author

@yunfangsun Sorry for the extra work, but this is still not correct. Please delete all attributes from the So_h variable and correct them (just having a correct long name, unit, and standard name would be enough). Also, we do not need wind data in this file; that is already in the datm file.
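
A hedged sketch of that cleanup with NCO (the wind variable names u10/v10 are hypothetical; adjust to whatever the file actually contains):

ncatted -O -a ,So_h,d,, wlv.nc    # drop all existing attributes on So_h
ncatted -O -a long_name,So_h,c,c,"sea surface height" -a units,So_h,c,c,"m" -a standard_name,So_h,c,c,"sea_surface_height_above_geoid" wlv.nc
ncks -O -x -v u10,v10 wlv.nc wlv.nc    # remove the wind fields already provided by DATM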

@uturuncoglu
Collaborator Author

@josephzhang8 Thanks for the update. I'll look at updating SCHISM, but I am still working on bringing in these RTs. @SmithJos13 It seems that your PR (https://github.com/oceanmodeling/ufs-weather-model/pull/174/files) requires an updated SCHISM. Will it work with v5.14, or do you need additional changes on the SCHISM side? Maybe you already pushed those to the SCHISM repositories, but I am not aware of it.

@SmithJos13

@uturuncoglu I think some changes have been pushed to SCHISM; I would need to check the latest version.

@josephzhang8
Collaborator

Let me know if @SmithJos13 needs to push to the schism repo. You can put them on master, and I'll cherry-pick the changes to v5.14.

@josephzhang8
Collaborator

> @uturuncoglu I think some changes have been pushed to SCHISM, I would need to check the latest version.

I never saw those changes. You might have pushed them to a fork, not the official schism repo.

@uturuncoglu
Collaborator Author

@josephzhang8 @SmithJos13 Okay. We could figure out what we need when we try to bring in the CICE configuration. @SmithJos13 Let me know when you are ready and have a successful run with the data component and fully coupled configurations, so we can start working on it.

@SmithJos13

@uturuncoglu I'm almost ready. I'm just generating some boundary conditions for SCHISM so I can give the fully coupled Bering Sea domain a burn. If that runs and generates some ice, then we are good to go! I suspect I should know tomorrow or early next week.

@uturuncoglu
Collaborator Author

@mansurjisan @yunfangsun The new RT is ready and passing the test. Please give coastal_sandy_duck_atm2ocn2ww3 a try and let me know how it goes. Also, you could try to run one of the existing WW3 tests, such as the atm2sch one, to be sure we are not breaking those.

@mansurjisan
Collaborator

Thanks, @uturuncoglu. I will give it a try on my end.

Regarding the Duck regression test documentation, I have a version from last year, but it needs to be updated to reflect recent changes. I will work on it.

@yunfangsun
Collaborator

Thank you @uturuncoglu, I will try it.

@yunfangsun
Collaborator

Hi @uturuncoglu,

I have tested the atm+ww3 case coastal_sandy_duck_atm2ocn2ww3 in /work2/noaa/nos-surge/yunfangs/duck/ufs-weather-model_new_rt/stmp/yunfangs/FV3_RT/rt_994128/

The results are fine; the case could be brought into the RT.

@uturuncoglu
Collaborator Author

@mansurjisan @yunfangsun Please let me know if we still have any issues with this PR. Otherwise, I'll merge it soon.

@uturuncoglu uturuncoglu marked this pull request as ready for review September 23, 2025 17:52
@mansurjisan
Collaborator

Hi @uturuncoglu,

I believe @yunfangsun is planning to share another configuration for DATM+SCHISM+WW3. The one we’ve tested so far is the setup where the variables are passed through DOCN.

@mansurjisan
Collaborator

From @yunfangsun

Hi @mansurjisan,

Could you please try the ATM+SCH+WW3 2D coupling case at /work2/noaa/nos-surge/yunfangs/duck/atm_sch_ww3_2d?

Could you please also convert it to a regression test?

Thank you!

@uturuncoglu uturuncoglu added the enhancement New feature or request label Oct 6, 2025
@uturuncoglu
Collaborator Author

@SmithJos13 The new CICE RT is ready and has a baseline. If you don't mind, could you try to run it and check the results on your end (it is configured to run for 6 hours)? You can run it using the following command,

./rt.sh -l rt_coastal.conf -a nems -k -n "coastal_bering_sea_atm2ocn2cice intel"

Also note that you need to check out this branch (feature/new_rt) and set DISKNM="/work2/noaa/nems/tufuk/RT" in rt.sh. Let me know if you hit any issues. Once the fully coupled configuration is ready, just point me to the run directory and I can add a new test for it too.
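
For convenience, one possible sequence for those steps (a sketch assuming the oceanmodeling fork; adjust paths and the rt.sh edit for your machine):

git clone --recursive https://github.com/oceanmodeling/ufs-weather-model
cd ufs-weather-model && git checkout feature/new_rt
git submodule update --init --recursive
# edit tests/rt.sh so your machine's block sets: DISKNM="/work2/noaa/nems/tufuk/RT"
cd tests && ./rt.sh -l rt_coastal.conf -a nems -k -n "coastal_bering_sea_atm2ocn2cice intel"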

@mansurjisan
Collaborator

Hi @uturuncoglu,

I was working on adding the Duck ATM2SCH2WW3 regression test, and while working on it, I noticed the existing coastal_ike_shinnecock_atm2sch2ww3 test is failing. The failure is happening on the feature/new_rt branch.

While running the rt.sh script, something I found suspicious is that the build completes within a minute; when I cloned feature/coastal_app and ran it from there, it took some time to compile and then ran successfully. Looking at the PET00.ESMF_LogFile, I see this message:

20251118 182630.236 INFO             PET00  orb_mvelp = 1.e36
20251118 182630.236 INFO             PET00  orb_obliq = 1.e36
20251118 182630.236 INFO             PET00  stop_n = 120
20251118 182630.236 INFO             PET00  stop_option = nhours
20251118 182630.236 INFO             PET00  stop_ymd = -999
20251118 182630.236 INFO             PET00 ReadAttributes ALLCOMP_attributes:: end:
20251118 182630.236 ERROR            PET00 UFSDriver.F90:633 Not valid  -  No component ww3 found
20251118 182630.236 ERROR            PET00 UFS Driver Grid Comp:src/addon/NUOPC/src/NUOPC_Driver.F90:797 Not valid  - Passing error in return code
20251118 182630.236 ERROR            PET00 UFS Driver Grid Comp:src/addon/NUOPC/src/NUOPC_Driver.F90:486 Not valid  - Passing error in return code
20251118 182630.236 ERROR            PET00 UFS.F90:397 Not valid  - Aborting UFS

The err file's log messages are:

+ export FI_MLX_INJECT_LIMIT=0
+ FI_MLX_INJECT_LIMIT=0
+ sync
+ sleep 1
+ '[' NO = WHEN_RUNNING ']'
+ srun --label -n 80 ./fv3.exe
 2: Abort(1) on node 2 (rank 2 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 2
 4: Abort(1) on node 4 (rank 4 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 4
 6: Abort(1) on node 6 (rank 6 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 6

I didn't find an error log in the out file:

[mjisan@hercules-login-1 coastal_ike_shinnecock_atm2sch2ww3_intel]$ cat out
Model started:   Tue Nov 18 18:26:26 CST 2025
 0:
 0:
 0: * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
 0:      PROGRAM ufs-weather-model HAS BEGUN. COMPILED       0.00     ORG: np23
 0:      STARTING DATE-TIME  NOV 18,2025  18:26:30.110  322  TUE   2460998
 0:
 0:
 0: Compiler version: Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.13.1 Build 20240703_000000
 0:
 0: MPI Library: Intel(R) MPI Library 2021.13 for Linux* OS
 0:
 0: MPI Version: 3.1

I am not sure what's causing this error. This is my run directory on hercules:
/work2/noaa/nos-surge/mjisan/ufs-weather-model/tests/stmp/mjisan/FV3_RT/rt_889618/coastal_ike_shinnecock_atm2sch2ww3_intel

@mansurjisan
Collaborator

The issue appears to be originating from here:

if(APP MATCHES "^(CSTLW|CSTL-ALL)$")

Since CSTLSW is not included in this pattern, CMake is not enabling WW3 during the build process.

However, in the CMake file on the feature/coastal_app branch, the CSTLSW flag exists in the WW3 section:

if(APP MATCHES "^(CSTLW|CSTLAW|CSTLPAW|CSTLFW|CSTLSW|CSTLRW|CSTLRCW|CSTLSW|CSTLPSW|CSTL-ALL)$")

I fixed it on my end for feature/new_rt, and the model is now working as expected. I will include the fix in the PR for Coastal_Sandy_Duck_ATM2SCH2WW3.

@uturuncoglu uturuncoglu changed the title Bring new regression test/s for Duck configuration Bring new regression test/s for Duck and CICE coupled configurations Jan 13, 2026

Labels

enhancement New feature or request

Projects

Status: In Progress
