Bring recent updates from develop and configure reforecast #3015
Conversation
Make ATM-OCN-ICE coupling model run on AWS. This adds capability to run UFS atm-ocn-ice coupling on AWS. Resolves NOAA-EMC#2858
This PR corrects a bug in the staging job for ocean `MOM.res_#` IC files. The `OCNRES` value was coming in as an integer (e.g. `25`) but the `ocean.yaml.j2` file was checking for `"025"`. Correct to now set OCNRES to be three digits in staging script and also correct the for loop range to include third file. Resolves NOAA-EMC#2864
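The zero-padding fix described above can be sketched in Python (a hypothetical helper for illustration, not the actual staging-script code):

```python
# Hypothetical sketch: normalize OCNRES so an integer-valued resolution
# (e.g. 25) matches the three-digit string ("025") that ocean.yaml.j2
# checks against.
def normalize_ocnres(ocnres) -> str:
    """Render an ocean resolution such as 25 or "25" as "025"."""
    return f"{int(ocnres):03d}"

print(normalize_ocnres(25))     # 025
print(normalize_ocnres("100"))  # 100
```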
…AA-EMC#2738) This PR adds in support for computing files needed for the aerosol analysis **B**. This includes a new task, `aeroanlgenb`. This work was performed by both me and @andytangborn Resolves NOAA-EMC#2501 Resolves NOAA-EMC#2737 --------- Co-authored-by: Andrew.Tangborn <[email protected]> Co-authored-by: Walter Kolczynski - NOAA <[email protected]>
Adds files `atmi009.nc`, `atmi003.nc`, `ratmi009.nc`, and `ratmi003.nc` to list of files to be staged for ICs, if available. These are necessary for starting an IAU run, and are currently missing. Resolves NOAA-EMC#2874
# Description Support global-workflow GEFS C48 on Google Cloud. Make env. var. and yaml file changes so the global-workflow GEFS C48 case can run properly on Google Cloud. Resolves NOAA-EMC#2860
Add the ability to run CI test C96_atm3DVar on Gaea-C5 Resolves NOAA-EMC#2766 Refs NOAA-EMC/prepobs#32 Refs NOAA-EMC/Fit2Obs#28
Use the updated 2013 to 2024 mean MERRA2 climatology instead of 2003 to 2014 mean Depends on NOAA-EMC#2887 Refs: ufs-community/ufs-weather-model#2272 Refs: ufs-community/ufs-weather-model#2273
…MC#2893) This changes the order of the cleanup job so that the working directory is deleted at the end. It also adds the `-ignore_readdir_race` flag to `find` to prevent errors if a file was deleted after the list of files was collected. This can happen if two consecutive cycles run the cleanup job at the same time.
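The `-ignore_readdir_race` behavior can be demonstrated with a minimal shell sketch (the paths here are throwaway temp directories, not workflow paths):

```shell
# Create a scratch directory standing in for a cycle's working directory.
scratch=$(mktemp -d)
touch "${scratch}/old_file.tmp"
# -ignore_readdir_race keeps find from erroring if another cleanup job
# deletes an entry between the directory scan and the action.
find "${scratch}" -ignore_readdir_race -type f -name "*.tmp" -delete
echo "cleanup ok"
```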
This updates the model hash to include the UPP update needed to be able to run the post processor on Orion, thus reenabling support on that system. A note on the UPP: it is using a newer version of g2tmpl that requires a separate spack-stack 1.6.0 installation. This version of g2tmpl will be standard in spack-stack 1.8.0, but for now requires loading separate modules for the UPP. A note on running analyses on Orion: due to a yet-unknown issue causing the BUFR library to run much slower on Orion when compared with Rocky 8, the GSI and GDASApp are expected to run significantly slower than on any other platform (on the order of an hour longer). Lastly, I made adjustments to the build_all.sh script to send more cores to compiling the UFS and GDASApp. Under this configuration, the GSI, UPP, UFS_Utils, and WW3 pre/post executables finish compiling before the UFS when run with 20 cores. Resolves NOAA-EMC#2694 Resolves NOAA-EMC#2851 --------- Co-authored-by: Rahul Mahajan <[email protected]> Co-authored-by: Walter.Kolczynski <[email protected]>
…#2816) - This task is an extension of the empty arch job previously merged. - This feature adds an archive task to GEFS system to archive files locally. - This feature archives files in ensstat directory. Resolves NOAA-EMC#2698 Refs NOAA-EMC#832 NOAA-EMC#2772
The current operational BUFR job begins concurrently with the GFS model run. This PR updates the script and ush to process all forecast hour data simultaneously, then combine the temporary outputs to create BUFR sounding products for each station. The updated job will now start processing data only after the GFS model completes its 180-hour run, handling all forecast files from 000hr to 180hr at once. The new version of the job will need 7 nodes instead of the current operational 4 nodes. This PR depends on the GFS bufr code update NOAA-EMC/gfs-utils#75. With the updates to the bufr codes and scripts, there is no need to add restart capability to the GFS post-process job JGFS_ATMOS_POSTSND. This PR includes these other changes:
- Rename parm/product/bufr_ij13km.txt to parm/product/bufr_ij_gfs_C768.txt
- Rename parm/product/bufr_ij9km.txt to parm/product/bufr_ij_gfs_C1152.txt
- Add a new table file, parm/product/bufr_ij_gfs_C96.txt, for GFSv17 C96 testing.
- Add a new capability to the BUFR package: the job first tries to read bufr_ij_gfs_${CASE}.txt; if the table file is not available, the code will automatically find the nearest neighbor grid point (i, j).

Refs NOAA-EMC#1257 Refs NOAA-EMC/gfs-utils#75
This PR creates a PyGFS class called JEDI, which is to be instantiated every time a JEDI application is run. The AtmAnalysis and AtmEnsAnalysis classes are no longer children of the Analysis class, but rather direct children of the Task class. They each have a JEDI object as an attribute, which is used to run either the variational/ensemble DA JEDI applications or the FV3 increment converter JEDI application, depending on which job they are created for (e.g. atmanlvar vs. atmanlfv3inc). The intention is that a later PR will apply this framework to all analysis tasks, and the PyGFS Analysis class will be removed.
This PR: - Creates a standalone page for FAQ and Common issues - Adds a block of caution on using variables in a users' `bashrc` Fixes: NOAA-EMC#2850
This modifies the way the `config` dictionary is constructed and referenced. Rather than updating a single configuration dictionary with each `RUN`, a `RUN`-based dictionary of `config` dictionaries is created and referenced by the appropriate `RUN` when calculating resources. This also makes the methods that were hidden before NOAA-EMC#2727 hidden again. Resolves NOAA-EMC#2783
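The `RUN`-keyed layout can be illustrated with a small Python sketch (names and values here are hypothetical, not the workflow's actual resource code):

```python
# Hypothetical sketch: build one config dict per RUN instead of mutating
# a single shared dict, so later RUNs cannot clobber earlier ones.
base = {"walltime": "00:30:00", "ntasks": 4}

def build_config(run: str) -> dict:
    cfg = dict(base)  # copy the base so each RUN gets its own dict
    if run == "gfs":
        cfg["ntasks"] = 8  # illustrative per-RUN override
    return cfg

configs = {run: build_config(run) for run in ("gdas", "gfs")}
# Resource calculation then indexes by the appropriate RUN:
print(configs["gfs"]["ntasks"], configs["gdas"]["ntasks"])
```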
This replaces `APRUN` with `APRUN_default` in all of the `.env` files. Resolves NOAA-EMC#2870
This adds 3 missing links from the UPP into parm/ufs to .gitignore. Resolves NOAA-EMC#2901
…ression options (NOAA-EMC#2914) - enables writing native grid model output when doing JEDI-atm DA - updates compression settings for C384 model output Fixes NOAA-EMC#2891
Support global-workflow GEFS C48 on Azure. Make env. var. and yaml file changes so the global-workflow GEFS C48 case can run properly on Azure. Resolves NOAA-EMC#2882
…C#2922) Adds 1 deg ocean and ice information to config.resources so 1 deg ocean jobs can run
In preparation for GDASApp CI tests, add GDASApp build capability to global-workflow and remove memory specifications for Gaea-C5 xml setup (in Ref to NOAA-EMC#2727) Resolves NOAA-EMC#2535 Resolves NOAA-EMC#2910 Resolves NOAA-EMC#2911
As forecast ensemble jobs are added to the global workflow, this PR ensures the output is being cleaned up properly once it is no longer needed. Resolves NOAA-EMC#833
This PR updates the parm/config/gfs/config.resources and env/WCOSS2.env files for the BUFR sounding job "postsnd." It includes adjustments to resource settings such as tasks per node and memory allocations for various GFS resolutions, including C768, C1152, and others. Here are the proposed changes: C768: 7 nodes, 21 tasks per node C1152: 16 nodes, 9 tasks per node
NCO has requested that each COM variable specify whether it is an input or an output. This completes that process for the global-workflow Unified Post Processor (UPP) task. Refs: NOAA-EMC#2451
This PR updates the `develop` branch to use the newer operational `obsproc/v1.2.0` and `prepobs/v1.1.0`. The obsproc/prepobs installs in glopara space on supported platforms use tags cut from the `dev/gfsv17` branches in the respective repos. The installation of `prepobs/v1.1.0` on WCOSS2 is called "gfsv17_v1.1.0" to help avoid GFSv16 users using it instead of the operational module. Also, the `HOMEobsproc` path is updated to set an empty default for `obsproc_run_ver`. This both removes the need to set a default (and constantly update it, which is duplication) and avoid the unset variable error when the fcst jobs use their own load module script that does not know `obsproc_run_ver`: ``` export HOMEobsproc="${BASE_GIT:-}/obsproc/v${obsproc_run_ver:-}" ``` This PR also reverts the prepobs and fit2obs installs on MSU back to the glopara space from the temporary `/work/noaa/global/kfriedma/glopara` space installs. Lastly, this PR also includes updates to complete issue NOAA-EMC#2844 (merge `build.spack.ver` and `run.spack.ver`). Resolves NOAA-EMC#2291 Resolves NOAA-EMC#2840 Resolves NOAA-EMC#2844
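The empty-default trick in that export relies on shell `${var:-}` expansion, shown here in a standalone sketch (the `BASE_GIT` value is made up for illustration):

```shell
set -u  # unset variables are normally fatal under set -u
BASE_GIT="/apps/git"            # hypothetical install root
unset obsproc_run_ver || true   # simulate a job that never sets it
# ${obsproc_run_ver:-} expands to "" instead of raising an unbound
# variable error, so the export succeeds either way.
export HOMEobsproc="${BASE_GIT:-}/obsproc/v${obsproc_run_ver:-}"
echo "${HOMEobsproc}"
```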
This PR removes the GTS BUFR2IODA part of the snow obs prep job and replaces it with a direct read from BUFR snow data at runtime by the JEDI executable. Depends on NOAA-EMC/GDASApp#1276 --------- Co-authored-by: Cory Martin <[email protected]>
This modifies the resources for gdasfcst (everywhere) and enkfgdaseupd (Hera only). For the fcst job, the number of write tasks is increased to prevent out of memory errors from the inline post. For the eupd, the number of tasks is decreased to prevent out of memory errors. The runtime for the eupd job was just over 10 minutes. Resolves NOAA-EMC#2506 Resolves NOAA-EMC#2498 Resolves NOAA-EMC#2916 --------- Co-authored-by: Walter Kolczynski - NOAA <[email protected]>
…A-EMC#2939) NCO has requested that each COM variable specify whether it is an input or an output. This completes that process for the global minimization monitor job. Refs NOAA-EMC#2451
# Description The main purpose of this PR is to remove the need for the ice_prod dependency check script `ush/check_ice_netcdf.sh`. The original purpose of that script is to check for special-case dependencies where `(( cyc + FHMIN ) % FHOUT_ICE ) != 0` (more details on this issue can be found in issue NOAA-EMC#2674). A bugfix for these special cases is expected to come from a PR in the ufs-weather-model. Resolves NOAA-EMC#2721 Refs NOAA-EMC#2721, NOAA-EMC#2674
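The special-case condition can be written as a small Python predicate (a hypothetical mirror of the shell arithmetic, with illustrative values):

```python
# Hypothetical sketch of the ice_prod special-case test: the first ice
# output hour (cyc + FHMIN) is misaligned when it is not a multiple of
# FHOUT_ICE.
def needs_special_handling(cyc: int, fhmin: int, fhout_ice: int) -> bool:
    return (cyc + fhmin) % fhout_ice != 0

print(needs_special_handling(0, 0, 6))  # False: 00z cycle is aligned
print(needs_special_handling(3, 0, 6))  # True: 03z cycle is misaligned
```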
Refines the issue and PR templates to cover some shortcomings and pitfalls we have identified. The fix file issue template is expanded to cover other data sets managed under "glopara". Resolves NOAA-EMC#2589
…#2895) - This task is an extension of the arch job previously merged that archives files in ROTDIR (NOAA-EMC#2816 AntonMFernando-NOAA@2816c3b) - This feature adds an archive task to GEFS system to archive files in HPSSARCH and LOCALARCH. Resolves NOAA-EMC#2698 Refs NOAA-EMC#2816 NOAA-EMC#2772 NOAA-EMC#832 --------- Co-authored-by: David Huber <[email protected]>
The restart interval has been set to FHMAX_GFS for both control and perturbed members.
To facilitate longer and more flexible GFS cadences, the `gfs_cyc` variable is replaced with a specified interval. Up front, this is reflected in a change in the arguments for setup_exp to:
```
--interval <n_hours>
```
where `n_hours` is the interval (in hours) between gfs forecasts. `n_hours` must be a multiple of 6. If 0, no gfs will be run (only gdas; only valid for cycled mode). The default value is 6 (every cycle). (This is a change from the current behavior of 24.)

In cycled mode, there is an additional argument to control which cycle will be the first gfs cycle:
```
--sdate_gfs <YYYYMMDDHH>
```
The default if not provided is `--idate` + 6h (the first full cycle). This is the same as current behavior when `gfs_cyc` is 6, but may vary from current behavior for other cadences. As part of this change, some validation of the dates has been added. `--edate` has also been made optional and defaults to `--idate` if not provided. During `config.base` template-filling, `INTERVAL_GFS` (renamed from `STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as `--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as `gfs_cyc` was being used downstream by verif-global. That has been removed, and instead the workflow will be responsible for running metp only on the correct cycles. This also removes the "do nothing" metp tasks that exit immediately, because only the last GFS cycle in a day would actually process verification. Now, metp has its own cycledef and will (a) always run at 18z, regardless of whether gfs is running at 18z, if the interval is less than 24h; or (b) use the same cycledef as gfs if the interval is 24h or greater. This is simpler than trying to determine the last gfs cycle of a day when it could change from day to day. To facilitate this change, support for the undocumented rocoto dependency tag `taskvalid` is added, as the metp task needs to know whether the cycle has a gfsarch task or not. metp will trigger on gfsarch completing (as before), or look backwards for the last gfsarch to exist.

Additionally, a couple of EE2 issues with the metp job are resolved (even though it is not run in ops):
- the verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Also corrects some dependency issues with the extractvars job for replay and the replay CI test. Depends on NOAA-EMC/EMC_verif-global#137 Resolves NOAA-EMC#260 Refs NOAA-EMC#1299 --------- Co-authored-by: David Huber <[email protected]>
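The interval and start-date defaults described above amount to a few lines of validation, sketched here in Python (hypothetical helpers, not the actual setup_exp code):

```python
from datetime import datetime, timedelta

def validate_interval(n_hours: int) -> int:
    """--interval must be a non-negative multiple of 6 (0 = no gfs)."""
    if n_hours < 0 or n_hours % 6 != 0:
        raise ValueError("--interval must be a non-negative multiple of 6")
    return n_hours

def default_sdate_gfs(idate: datetime) -> datetime:
    """Default first gfs cycle: --idate + 6 h (the first full cycle)."""
    return idate + timedelta(hours=6)

idate = datetime(2024, 10, 11, 0)
print(validate_interval(6), default_sdate_gfs(idate).strftime("%Y%m%d%H"))
```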
The replay analysis data is needed for the repair_replay task.
ush/python/pygfs/task/stage_ic.py
Outdated
```
@@ -4,10 +4,12 @@
import os
from logging import getLogger
from typing import Any, Dict, List
import subprocess
```
Instead of subprocess, I would suggest looking at wxflow's `Executable` class, in particular the `which` instantiation function. This adds a considerable amount of error checking and robustness compared to `subprocess` and avoids using the `shell=True` option, which has security concerns. Side note: there is an open issue in wxflow to add an S3 bucket transfer capability (NOAA-EMC/wxflow#42).
```
import subprocess
```
Thank you for the suggestion, @DavidHuber-NOAA. I will try to use `which` instead of `subprocess` (at least until S3 bucket transfer capability is added to wxflow).
ush/python/pygfs/task/stage_ic.py
Outdated
```
add_to_datetime, to_timedelta, Template, TemplateConstants)
logit, parse_j2yaml, strftime, to_YMD, to_YMDH,
add_to_datetime, to_timedelta, Template, TemplateConstants,
Hsi, Htar)
```
Suggested change:
```
- Hsi, Htar)
+ Hsi, Htar, which)
```
ush/python/pygfs/task/stage_ic.py
Outdated
```
aws_cmd = "aws s3 cp --no-sign-request"
aws_url = "s3://noaa-ufs-gefsv13replay-pds/"
subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03 ./", shell=True)
subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSFLX.GrbF03 ./", shell=True)
```
Suggested change:
```
- aws_cmd = "aws s3 cp --no-sign-request"
- aws_url = "s3://noaa-ufs-gefsv13replay-pds/"
- subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03 ./", shell=True)
- subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSFLX.GrbF03 ./", shell=True)
+ aws_cmd = which("aws")
+ aws_cmd.add_default_arg("s3")
+ aws_cmd.add_default_arg("cp")
+ aws_cmd.add_default_arg("--no-sign-request")
+ aws_url = "s3://noaa-ufs-gefsv13replay-pds/"
+ aws_cmd([aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03", "./"])
+ aws_cmd([aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSFLX.GrbF03", "./"])
```
@DavidHuber-NOAA When I try your suggestion, I get the following error (`/lfs/h2/emc/ptmp/eric.sinsky/GEFS/COMROOT/customexp/gw_reforecast_update/logs/1994010400/gefs_stage_ic.log.9`):
```
File "/lfs/h2/emc/ens/noscrub/eric.sinsky/GEFS/gw_reforecast_update/ush/python/pygfs/task/stage_ic.py", line 70, in execute_stage
  aws_cmd([aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03", "./"])
File "/lfs/h2/emc/ens/noscrub/eric.sinsky/GEFS/gw_reforecast_update/ush/python/wxflow/executable.py", line 199, in __call__
  escaped_cmd = ["'%s'" % arg.replace("'", "'\"'\"'") for arg in cmd]
File "/lfs/h2/emc/ens/noscrub/eric.sinsky/GEFS/gw_reforecast_update/ush/python/wxflow/executable.py", line 199, in <listcomp>
  escaped_cmd = ["'%s'" % arg.replace("'", "'\"'\"'") for arg in cmd]
AttributeError: 'list' object has no attribute 'replace'
```
I then removed the list object and changed line 70 to:
```
aws_cmd(aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03 ./")
```
However, when I do this, I get the following error (`/lfs/h2/emc/ptmp/eric.sinsky/GEFS/COMROOT/customexp/gw_reforecast_update/logs/1994010400/gefs_stage_ic.log`):
```
aws: error: the following arguments are required: paths
```
Whoops, that should not have been a list, but a set of strings sent in as arguments. Please try this instead:
```
- aws_cmd = "aws s3 cp --no-sign-request"
- aws_url = "s3://noaa-ufs-gefsv13replay-pds/"
- subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03 ./", shell=True)
- subprocess.run(aws_cmd + " " + aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSFLX.GrbF03 ./", shell=True)
+ aws_cmd = which("aws")
+ aws_cmd.add_default_arg("s3")
+ aws_cmd.add_default_arg("cp")
+ aws_cmd.add_default_arg("--no-sign-request")
+ aws_url = "s3://noaa-ufs-gefsv13replay-pds/"
+ aws_cmd(aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSPRS.GrbF03", "./")
+ aws_cmd(aws_url + YYYY + "/" + MM + "/" + YYYYMMDDHH + "/GFSFLX.GrbF03", "./")
```
Thank you, @DavidHuber-NOAA. Comma-separating the set of strings fixed the issue.
Unused python packages have been removed from marine_analysis.py.
Merged e7b07d5 into NOAA-EMC:feature/gefs_reforecast
Description
This PR brings recent changes from the develop branch to the GEFS reforecast branch. This PR updates the GEFS reforecast branch to develop hash ac3cde5 (10/11/2024). This version of global-workflow uses the ufs-weather-model hash 6a4e09e (9/9/2024).
Furthermore, this PR ensures the following adjustments for the reforecast:
Type of change
Change characteristics
How has this been tested?
This branch is being tested on WCOSS2. When testing has succeeded, this PR will be marked as ready for review.
Checklist