cam6_4_072: Fix broken RRTMGP GPU tests #1260

Merged: 3 commits, Feb 28, 2025
@@ -3,7 +3,7 @@
./xmlchange ROOTPE='0'
./xmlchange ROF_NCPL=`./xmlquery --value ATM_NCPL`
./xmlchange GLC_NCPL=`./xmlquery --value ATM_NCPL`
-./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -rad rrtmg' --append
+./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -rad rrtmgp_gpu ' --append
./xmlchange TIMER_DETAIL='6'
./xmlchange TIMER_LEVEL='999'
./xmlchange GPU_TYPE=a100
@@ -3,7 +3,7 @@
./xmlchange ROOTPE='0'
./xmlchange ROF_NCPL=`./xmlquery --value ATM_NCPL`
./xmlchange GLC_NCPL=`./xmlquery --value ATM_NCPL`
-./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -rad rrtmg -pcols 760 ' --append
+./xmlchange CAM_CONFIG_OPTS=' -microphys mg3 -rad rrtmgp_gpu -pcols 760 ' --append
./xmlchange TIMER_DETAIL='6'
./xmlchange TIMER_LEVEL='999'
./xmlchange GPU_TYPE=a100
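
Note on the two test-mod hunks above: both changes only swap the value that follows `-rad` in the string appended to CAM_CONFIG_OPTS. A minimal sketch of one way to check which radiation package an option string selects is shown below. The helper name `rad_options` is hypothetical (it is not part of CIME or CAM), and the sample string is the one appended in the second hunk, standing in for whatever `./xmlquery --value CAM_CONFIG_OPTS` would return in a real case. The helper prints every `-rad` value it finds, so a string that still carries an older `-rad` setting would show up as two lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of CIME or CAM): print each value that
# follows a "-rad" flag in a whitespace-separated option string.
rad_options() {
    # Intentionally unquoted so the string splits on whitespace.
    set -- $1
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-rad" ] && [ "$#" -gt 1 ]; then
            printf '%s\n' "$2"
        fi
        shift
    done
}

# Stand-in for: opts=`./xmlquery --value CAM_CONFIG_OPTS`
opts=' -microphys mg3 -rad rrtmgp_gpu -pcols 760 '
rad_options "$opts"
```

In a real case directory the `opts` assignment would be replaced by the `xmlquery` call shown in the comment; the rest of the sketch does not depend on CIME.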
65 changes: 65 additions & 0 deletions doc/ChangeLog
@@ -1,5 +1,70 @@
===============================================================

Tag name: cam6_4_072
Originator(s): sjsprecious
Date: 28 February 2025
One-line Summary: Fix broken RRTMGP GPU tests
Github PR URL: https://github.com/ESCOMP/CAM/pull/1260

Purpose of changes (include the issue number and title text for each relevant GitHub issue):

Fixes #997 - RRTMGP not working with GPUs on derecho

Describe any changes made to build system: N/A

Describe any changes made to the namelist: N/A

List any changes to the defaults for the boundary datasets: N/A

Describe any substantial timing or memory changes: N/A

Code reviewed by: nusbaume

List all files eliminated: N/A

List all files added and what they do: N/A

List all existing files that have been modified, and describe the changes:

M cime_config/testdefs/testmods_dirs/cam/outfrq9s_gpu_default/shell_commands
M cime_config/testdefs/testmods_dirs/cam/outfrq9s_gpu_pcols760/shell_commands
- Update Derecho GPU regression test to use RRTMGP.

M src/physics/rrtmgp/radiation.F90
- Update OpenACC calls to allow RRTMGP to run on Derecho's GPUs.

If there were any failures reported from running test_driver.sh on any test
platform, and checkin with these failures has been OK'd by the gatekeeper,
then copy the lines from the td.*.status files for the failed tests to the
appropriate machine below. All failed tests must be justified.

derecho/intel/aux_cam:

ERP_Ln9.f09_f09_mg17.FCSD_HCO.derecho_intel.cam-outfrq9s (Overall: FAIL)
SMS_Ld1.f09_f09_mg17.FCHIST_GC.derecho_intel.cam-outfrq1d (Overall: DIFF)
- pre-existing failures due to HEMCO not having reproducible results (issues #1018 and #856)

SMS_D_Ln9.f19_f19_mg17.FXHIST.derecho_intel.cam-outfrq9s_amie (Overall: FAIL)
SMS_D_Ln9_P1280x1.ne0CONUSne30x8_ne0CONUSne30x8_mt12.FCHIST.derecho_intel.cam-outfrq9s (Overall: FAIL)
- pre-existing failures due to build-namelist error requiring CLM/CTSM external update

derecho/nvhpc/aux_cam:

ERS_Ln9.ne30pg3_ne30pg3_mg17.F2000dev.derecho_nvhpc.cam-outfrq9s_gpu_default (Overall: DIFF)
- Expected namelist and baseline answer changes due to the addition of RRTMGP.

izumi/nag/aux_cam: ALL PASS

izumi/gnu/aux_cam: ALL PASS

CAM tag used for the baseline comparison tests if different than previous
tag:

Summarize any changes to answers: b4b

===============================================================
===============================================================

Tag name: cam6_4_071
Originator(s): mwaxmonsky
Date: 26 February 2025
34 changes: 32 additions & 2 deletions src/physics/rrtmgp/radiation.F90
@@ -1170,9 +1170,13 @@ subroutine radiation_tend( &

! Compute the gas optics (stored in atm_optics_sw).
! toa_flux is the reference solar source from RRTMGP data.
!$acc data copyin(kdist_sw,pmid_day,pint_day,t_day,gas_concs_sw) &
!$acc copy(atm_optics_sw) &
!$acc copyout(toa_flux)
errmsg = kdist_sw%gas_optics( &
pmid_day, pint_day, t_day, gas_concs_sw, atm_optics_sw, &
toa_flux)
!$acc end data
call stop_on_err(errmsg, sub, 'kdist_sw%gas_optics')

! Scale the solar source
@@ -1190,6 +1194,15 @@ subroutine radiation_tend( &
if (nday > 0) then

! Increment the gas optics (in atm_optics_sw) by the aerosol optics in aer_sw.
!$acc data copyin(coszrs_day, toa_flux, alb_dir, alb_dif, &
!$acc atm_optics_sw, atm_optics_sw%tau, &
!$acc atm_optics_sw%ssa, atm_optics_sw%g, &
!$acc aer_sw, aer_sw%tau, &
!$acc aer_sw%ssa, aer_sw%g, &
!$acc cloud_sw, cloud_sw%tau, &
!$acc cloud_sw%ssa, cloud_sw%g) &
!$acc copy(fswc, fswc%flux_net,fswc%flux_up,fswc%flux_dn, &
!$acc fsw, fsw%flux_net, fsw%flux_up, fsw%flux_dn)
errmsg = aer_sw%increment(atm_optics_sw)
call stop_on_err(errmsg, sub, 'aer_sw%increment')

@@ -1208,7 +1221,7 @@ subroutine radiation_tend( &
atm_optics_sw, top_at_1, coszrs_day, toa_flux, &
alb_dir, alb_dif, fsw)
call stop_on_err(errmsg, sub, 'all-sky rte_sw')

!$acc end data
end if

! Transform RRTMGP outputs to CAM outputs and compute heating rates.
@@ -1264,15 +1277,31 @@ subroutine radiation_tend( &
call rrtmgp_set_gases_lw(icall, state, pbuf, nlay, gas_concs_lw)

! Compute the gas optics and Planck sources.
!$acc data copyin(kdist_lw, pmid_rad, pint_rad, &
!$acc t_rad, t_sfc, gas_concs_lw) &
!$acc copy(atm_optics_lw, atm_optics_lw%tau, &
!$acc sources_lw, sources_lw%lay_source, &
!$acc sources_lw%sfc_source, sources_lw%lev_source_inc, &
!$acc sources_lw%lev_source_dec, sources_lw%sfc_source_jac)
errmsg = kdist_lw%gas_optics( &
pmid_rad, pint_rad, t_rad, t_sfc, gas_concs_lw, &
atm_optics_lw, sources_lw)
!$acc end data
call stop_on_err(errmsg, sub, 'kdist_lw%gas_optics')

! Set LW aerosol optical properties in the aer_lw object.
call rrtmgp_set_aer_lw(icall, state, pbuf, aer_lw)

! Increment the gas optics by the aerosol optics.
!$acc data copyin(atm_optics_lw, atm_optics_lw%tau, &
!$acc aer_lw, aer_lw%tau, &
!$acc cloud_lw, cloud_lw%tau, &
!$acc sources_lw, sources_lw%lay_source, &
!$acc sources_lw%sfc_source, sources_lw%lev_source_inc, &
!$acc sources_lw%lev_source_dec, sources_lw%sfc_source_Jac, &
!$acc emis_sfc) &
!$acc copy(flwc, flwc%flux_net, flwc%flux_up, flwc%flux_dn, &
!$acc flw, flw%flux_net, flw%flux_up, flw%flux_dn)
errmsg = aer_lw%increment(atm_optics_lw)
call stop_on_err(errmsg, sub, 'aer_lw%increment')

@@ -1287,7 +1316,8 @@ subroutine radiation_tend( &
! Compute all-sky LW fluxes
errmsg = rte_lw(atm_optics_lw, top_at_1, sources_lw, emis_sfc, flw)
call stop_on_err(errmsg, sub, 'all-sky rte_lw')

!$acc end data

! Transform RRTMGP outputs to CAM outputs and compute heating rates.
call set_lw_diags()
