[1pt] Debugging changes from full-scale flow-based and stage-based CatFIM runs #1786
Open
EmilyDeardorff wants to merge 16 commits into dev from
This PR contains fixes to bugs that arose during the March 2026 full-scale CatFIM runs for the FIM 6.1 release.
The following bugs were addressed:
- Incomplete HUC Dictionary Bug: The `get_huc_dictionary()` function was not creating a complete HUC list and was therefore not pulling all of the required site thresholds from WRDS.
  - Issue was resolved by implementing the `aggregate_wbd_hucs()` function, which uses the WBD geopackage to identify the proper HUC for each site.
- Incomplete NRLDB Data Bug: We recently switched from prioritizing USGS site data to prioritizing NRLDB site data when both are available on WRDS. Unfortunately, a LOT of sites have incomplete NRLDB data or incorrect datum information.
  - Added logic to `__adjust_datum_ft()` to apply corrections for mistyped horizontal and vertical datums (when the intended datum is obvious enough).
  - Implemented the `check_metadata_CRS_availability()` function to check whether sites have CRS available for one or both sources, and updated `get_thresholds()` to use the optional `source_crs_availability` list to determine which data source (USGS or NRLDB) to use based on site metadata availability.
- Status = None Bug: The entire CatFIM run was being aborted in `update_sites_mapping_status()` when a mapped site had no value in the `status_val` object.
  - Updated the logic so the code processes a value of `None` the same as a value of `""`.

Changes
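The `None` handling described in the Status = None fix above can be sketched roughly as follows. This is a minimal illustration only, not the code in `update_sites_mapping_status()`: the helper name `normalize_status` and the sample values are hypothetical.

```python
def normalize_status(status_val):
    """Treat a missing status (None) the same as an empty string.

    Hypothetical helper for illustration; the real logic lives inside
    update_sites_mapping_status().
    """
    return "" if status_val is None else status_val


# Made-up sample values: None and "" both end up as "no status",
# so neither one needs a separate code path (or raises an error).
for raw in (None, "", "example status message"):
    status = normalize_status(raw)
    print(repr(raw), "->", repr(status), "| has status:", bool(status))
```

Normalizing once at the top of the function keeps every later string operation on the status safe, which is one simple way to avoid aborting the whole run on a single site.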
- `data/wrds/download_process_wrds.py`:
  - Replaced the `get_huc_dictionary()` function with the `aggregate_wbd_hucs()` function so it gets the HUCs for each site using geospatial overlay rather than relying on WRDS to have the right HUC information for each site (because that isn't always the case).
  - Updated `download_all_thresholds()` so it can download data based on which source has metadata available.
  - Implemented `check_metadata_CRS_availability()` to check whether sites have CRS available for one or both sources. This function creates the `lid_source_dict` dictionary.
- `data/nws/preprocess_ahps_nws.py`: Updated `get_thresholds()` to have the proper placeholder for the `source_crs_availability` input.
- `data/usgs/get_usgs_rating_curves.py`: Updated `get_thresholds()` to have the proper placeholder for the `source_crs_availability` input.
- `data/usgs/preprocess_ahps_usgs.py`: Updated `get_thresholds()` to have the proper placeholder for the `source_crs_availability` input.
- `tools/catfim/generate_categorical_fim.py`:
  - Replaced the `get_huc_dictionary()` function with the `aggregate_wbd_hucs()` function.
  - Updated the `download_all_thresholds()` call.
  - Updated `update_sites_mapping_status()` so it no longer aborts the entire CatFIM run when the `status_val` for a mapped site is `None`.
- `tools/catfim/generate_categorical_fim_flows.py`: Removed `get_thresholds()` and `get_metadata()` from the inputs because they are not used in this script.
- `tools/tools_shared_functions.py`: Updated `get_thresholds()` to use the optional `source_crs_availability` list to determine which data source (USGS or NRLDB) to use based on site metadata availability.

Testing
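The source selection that `get_thresholds()` performs with the optional `source_crs_availability` input can be sketched as below. This is a simplified illustration under stated assumptions, not the implementation in `tools/tools_shared_functions.py`: the function name `choose_source`, the dictionary shape (site ID mapped to the sources that have CRS metadata), and the sample site IDs are all hypothetical.

```python
# Assumed priority order: NRLDB is preferred when both sources are usable.
PRIORITY = ("NRLDB", "USGS")


def choose_source(lid, source_crs_availability=None):
    """Pick a data source for one site.

    Hypothetical sketch: source_crs_availability is assumed to map a site ID
    to the list of sources that have CRS metadata for that site.
    """
    # Placeholder of None: fall back to the plain priority order,
    # i.e. no change from the behavior without the new input.
    if source_crs_availability is None:
        return PRIORITY[0]

    available = source_crs_availability.get(lid, [])
    for source in PRIORITY:
        if source in available:
            return source
    return None  # neither source has usable CRS metadata for this site


# Made-up site IDs for illustration.
availability = {"site1": ["USGS"], "site2": ["NRLDB", "USGS"], "site3": []}
for lid in availability:
    print(lid, "->", choose_source(lid, availability))
print("site1 with placeholder ->", choose_source("site1"))
```

Keeping `None` as the default means callers that do not have availability information can pass the placeholder and get the earlier behavior unchanged.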
- Tested the new `get_thresholds()` functionality (in `tools/tools_shared_functions.py`) multiple times, both individually and in `data/wrds/download_process_wrds.py` and `tools/catfim/generate_categorical_fim.py`, and it performed as expected.
- The updates to `tools/catfim/generate_categorical_fim.py` were also tested in several small-scale and full-scale flow- and stage-based CatFIM runs and they ran as expected.
- Tested the `get_thresholds()` function with the placeholder of `None` for the `source_crs_availability` input and it worked as expected (no change in previous functionality). This means that this code change should be fine in the scripts where it was implemented (`data/nws/preprocess_ahps_nws.py`, `data/usgs/get_usgs_rating_curves.py`, and `data/usgs/preprocess_ahps_usgs.py`).
- Tested the individual run of `data/wrds/download_process_wrds.py` with the new changes and it also performed as expected.

Deployment Plan (For FIM developers use)
Does the change impact inputs, docker or python packages?
If you are not a FIM dev team member: Please let us know what you need and we can help with it.
If you are a FIM Dev team member:
Please work with the DevOps team and do not just go ahead and do it without some co-ordination.
Copy where you can, assign where you can not, and it is your responsibility to ensure it is done. Please ensure it is completed before the PR is merged.
Does the PR have new or updated Python packages, or changes to the Pipfile, Pipfile.lock, or Dockerfile? DevOps can help or take care of it if you want. We just need to know if it is required.
Require new or adjusted data inputs? Does it have a way to version (folder or file dates)?
Please use caution in removing older versions unless they are at least two versions old. Confirm with DevOps if cleanup might be involved.
If new or updated data sets, has the FIM code, including running fim_pipeline.sh, been updated and tested with the new/adjusted data? You can dev test against subsets if you like.
Notes to DevOps Team or others:
Please add any notes that are helpful for us to make sure it is all done correctly. Do not put actual server names or full true paths, just shortcut paths like 'efs..../inputs/' or 'dev1....inputs', etc.
Issuer Checklist (For developer use)
You may update this checklist before and/or after creating the PR. If you're unsure about any of them, please ask, we're here to help! These items are what we are going to look for before merging your code.
- [...] `[_pt] PR: <description>`
- [...] `dev` branch (the default branch), you have a descriptive Feature Branch name using the format: `dev-<description-of-change>` (e.g. `dev-revise-levee-masking`)
- [...] `dev` branch
- `pre-commit` hooks were run locally
- [...] `4.x.x.x`

Reviewer / Approver Checklist
Merge Checklist (For Technical Lead use only)