-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Description
Overview/Tasks
- Map out
nctogribtranslation for NLDAS2- Confirm mapping with Gopal
- Determine needed inputs for Gopal's routine
- Check if NLDAS2 has been updated to include netcdf download from 1979-2023?
- Yes. It appears that all data has been reformatted to the netCDF in the archive
- Test routine to harvest layers from
ncand set them intogeotiff- Get the appropriate spatial reference system. Is srid 4087 correct? Can simply setting it make it work out?
- Add new variable to allow easier testing comparisons?
insert into dh_variabledefinition (varkey, varcode, varname) values ('nldas2_precip_netcdf', 'nldas2_precip_netcdf', 'nldas2_precip_netcdf'); - Do download of nldas2-B in case that is the difference?
- convert
geotifftogrb - Add script to convert full dataset to grib file into work flow
-
Make optional setting?No. By making this a mandatory step,grid simply becomes the "source" format for the work flow steps in process. - Put into download step (or process?)
- This should be in the process step, but, to avoid redundant conversions since it is the most time consuming processing step, should have a switch to decide whether or not to re-do if the grb files already exist.
- deploy in work flow (see bout time addition, testing single raster seemed very low time of processing)
-
- Handle all data needing translation
- Re-download 2020-2025 to get recent retroactive data update
- Run 01_translate_nc
Scripts
Run Translate Workflow Step for Range of Years
scenario=nldas2
for year in 2020 2021 2022 2023 2024 2025; do
sday=1
ndays=`date -d "${year}-12-31" '+%j'`
# or, if in the middle of a year
#ndays=`date '+%j'`
ndays=$((ndays - 1))
for i in `seq $sday $ndays`; do
job_name=`cbp mm_job_name $scenario $i`
mdate=`date -d "${year}-01-01 $i days" '+%Y-%m-%d'`
echo "sbatch --job-name=$job_name run_model raster_met $scenario "$mdate" auto met process 01_translate_nc"
sbatch --job-name=$job_name run_model raster_met $scenario "$mdate" auto met process 01_translate_nc
done
done
Development
SQL, Spatial Reference and R Tools
- Inventory of dh data: Raster Data Inventory and Analysis #50
- QA link from Connor:
- Point Analysis: Workflow
amalgamate#66 (comment) - Raster plotting: Workflow
amalgamate#66 (comment) - Find specific days: Integrate temporal disaggregation step into
wdmworkflow. #86 (comment) - temporal dissag: Extensively document the need and ramification of the reclassify step. #97
- Point Analysis: Workflow
- https://disc.gsfc.nasa.gov/information/howto?title=How%20to%20Read%20Data%20in%20netCDF%20Format%20with%20R
- https://hydro1.gesdisc.eosdis.nasa.gov/data/NLDAS/NLDAS2_README.pdf
- Projection:
- NetCDF lacks proper projection info
- But, it uses a
MAP_PROJECTION=EQUIDISTANT CYLINDRICAL - Try SRID: 4087
Proposed workflows
- Document formats:
- use
h5lsto get a list of layer names - Export to a tiff -- see code in meta_model for example
- We use tiff instead of grb, since grb can be created ,but not appended, whereas tiff can be appended by
gdal- Considered multiple tiffs with
gdal_merge.pyto put them together but not needed
- Considered multiple tiffs with
- use
- Workflow Steps:
- Create an empty based on Grib template (use old file version)
- Set projection SRID on source file
Code: gdal_create + gdal_warp
- From gdal docs:
gdal_create -if in_red.tif -bands 3 out_rgb.tif
gdalwarp in_red.tif out_rgb.tif -srcband 1 -dstband 1
gdalwarp in_green.tif out_rgb.tif -srcband 1 -dstband 2
gdalwarp in_blue.tif out_rgb.tif -srcband 1 -dstband 3
- Test a single file
- on deq3 since it has new
gdalwarpwhich can update individual bands - Uses
geotiffformat as the grid format is not editable, only creatable.
- on deq3 since it has new
- First: set the file name, the remainder of code is generic
#infile="NLDAS_FORA0125_H.A19890101.0100.020.nc"
infile="NLDAS_FORA0125_H.A20250101.0100.020.nc"
out_type='grb'
- Now do the conversion and import routine:
# out_temp is the template for output format
out_temp="NLDAS_FORA0125_H.A20230101.0100.002.grb"
maskExtent="/opt/model/model_meteorology/data/cbp_extent.csv"
out_file="test_cbp."$out_type # final file to import
lyr_file="test_cbp.tiff" # intermediate file - must have this to accept layer setting, then convert
raster_sql="test_cbp.sql"
rm $out_file
rm $lyr_file
gdal_create -if $out_temp -bands 11 $lyr_file
#gdaltransform -t_srs EPSG:4087 $outtif
# clip
# Note: -srcband is always 1, since we've extracted the band already by specifying the band name like ":Tair"
bnd=0
for i in Tair Qair PSurf Wind_E Wind_N LWdown CRainf_frac CAPE PotEvap Rainf SWdown ; do
#for i in Rainf; do
bnd=$((bnd + 1))
gdalwarp NETCDF:"$infile":$i -t_srs EPSG:4326 $lyr_file -srcband 1 -dstband $bnd
done
gdalwarp -of grib -t_srs EPSG:4326 $lyr_file $out_file
# now test
raster2pgsql $out_file >$raster_sql
echo "drop table test_cbp;" | psql -h dbase2 drupal.dh03
cat $raster_sql |psql -h dbase2 drupal.dh03
- TO test, log on to database and set startup vars:
\pset pager 0
\set band 10
\set yr 2025
\set mo 1
\set da 1
\set hr 1
-- :ttid = the timestamp and tid for the matching day
select tsendtime as ttime from dh_timeseries_weather_varkey where extract(year from to_timestamp(tsendtime)) = :yr and extract(month from to_timestamp(tsendtime)) = :mo and extract(day from to_timestamp(tsendtime)) = :da and extract(hour from to_timestamp(tsendtime)) = :hr limit 1 \gset
- The temp file import for a single hour:
select (stats).* from (
select st_summarystats(rast,:band,TRUE) as stats
from (
select st_clip(a.rast, b.dh_geofield_geom) as rast
from test_cbp as a left outer join dh_feature_fielded as b
on (
b.hydrocode = 'cbp6_met_coverage'
)
) as bar
) as foo;
- Example Result:
count | sum | mean | stddev | min | max
-------+--------------------+--------------------+--------------------+-----+-------------------
4744 | 2132.3541013153736 | 0.4494844227056015 | 0.5993693764500673 | 0 | 3.254499912261963
(1 row)
- summarize the existing rasters
- Note: imports as UTC, so, EST day is 5 hours earlier hence
:ttime - 3600 * 5
select tsendtime, (stats).* from (
select to_timestamp(tsendtime) as tsendtime, st_summarystats(rast,1,TRUE) as stats
from (
select tstime, tsendtime, count(*), st_union(rast) as rast
from dh_timeseries_weather_varkey
where varkey = 'nldas2_precip_hourly_tiled_16x16'
and tsendtime = (:ttime - 3600*5)
group by tstime, tsendtime
) as foo
);
- Result for 2025-01-01 import (imports as UTC, so, EST day is 5 hours earlier
tsendtime | count | sum | mean | stddev | min | max
------------------------+-------+-------------------+---------------------+--------------------+-----+-------------------
2024-12-31 20:00:00-05 | 4632 | 2094.369601050159 | 0.45215233183293585 | 0.6014965362747882 | 0 | 3.254499912261963
(1 row)
Unsure of ate Range? Show a bunch
select tsendtime, (stats).* from (
select to_timestamp(tsendtime) as tsendtime, st_summarystats(rast,1,TRUE) as stats
from (
select tstime, tsendtime, count(*), st_union(rast) as rast
from dh_timeseries_weather_varkey
where varkey = 'nldas2_precip_hourly_tiled_16x16'
and tsendtime >= (:ttime - 3600*15)
and tsendtime <= (:ttime + 3600*15)
group by tstime, tsendtime
) as foo
);
Try a single Land Segment
- Note: To find land segment with non-zero rainfall on the date of interest, See below: "Find Landsegs with Rainfall"
- Show the manual import:
\set landseg 'A51027'
select (stats).* from (
select st_summarystats(rast,:band,TRUE) as stats
from (
select st_clip(a.rast, b.dh_geofield_geom) as rast
from test_cbp as a left outer join dh_feature_fielded as b
on (
b.hydrocode = :'landseg' and b.ftype = 'cbp532_landseg'
)
) as bar
) as foo;
- show the dh_timeseries_weather raster summary for the single hour
select (stats).* from (
select st_summarystats(rast,1,TRUE) as stats
from (
select st_clip(a.rast, b.dh_geofield_geom) as rast
from (
select tstime, tsendtime, count(*), st_union(rast) as rast
from dh_timeseries_weather_varkey
where varkey = 'nldas2_precip_hourly_tiled_16x16'
and tsendtime = (:ttime - 3600*5)
group by tstime, tsendtime
) as a left outer join dh_feature_fielded as b
on (
b.hydrocode = :'landseg' and b.ftype = 'cbp532_landseg'
)
) as bar
) as foo;
- Show the dh_timeseries_weather rasters for a range including the date of interest
select (stats).* from (
select st_summarystats(rast,1,TRUE) as stats
from (
select st_clip(a.rast, b.dh_geofield_geom) as rast
from (
select tstime, tsendtime, count(*), st_union(rast) as rast
from dh_timeseries_weather_varkey
where varkey = 'nldas2_precip_hourly_tiled_16x16'
and tsendtime >= (:ttime - 3600*15)
and tsendtime <= (:ttime + 3600*15)
group by tstime, tsendtime
) as a left outer join dh_feature_fielded as b
on (
b.hydrocode = :'landseg' and b.ftype = 'cbp532_landseg'
)
) as bar
) as foo;
Results single test raster
count | sum | mean | stddev | min | max
-------+-------+--------+--------+-----+--------
10 | 0.125 | 0.0125 | 0.025 | 0 | 0.0625
(1 row)
dh_timeseries_weather imported raster
count | sum | mean | stddev | min | max
-------+---------------------+----------------------+----------------------+-----+---------------------
10 | 0.19900000002235174 | 0.019900000002235175 | 0.029516944293705646 | 0 | 0.07890000194311142
(1 row)
Find Landsegs with Rainfall
select (stats).* from (
select st_summarystats(rast,1,TRUE) as stats
from (
select st_clip(a.rast, b.dh_geofield_geom) as rast
from (
select tstime, tsendtime, count(*), st_union(rast) as rast
from dh_timeseries_weather_varkey
where varkey = 'nldas2_precip_hourly_tiled_16x16'
and tsendtime = (:ttime - 3600*5)
group by tstime, tsendtime
) as a left outer join dh_feature_fielded as b
on (
b.hydrocode = :'landseg' and b.ftype = 'cbp532_landseg'
)
) as bar
) as foo;
Miscellaneous
Use gdalinfo to get layer names.
Names from netCDF file format (2024 forward).
gdalinfo /backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc|grep _NAME
Warning 1: No UNIDATA NC_GLOBAL:Conventions attribute
SUBDATASET_1_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":time_bnds
SUBDATASET_2_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":Tair
SUBDATASET_3_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":Qair
SUBDATASET_4_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":PSurf
SUBDATASET_5_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":Wind_E
SUBDATASET_6_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":Wind_N
SUBDATASET_7_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":LWdown
SUBDATASET_8_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":CRainf_frac
SUBDATASET_9_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":CAPE
SUBDATASET_10_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":PotEvap
SUBDATASET_11_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":Rainf
SUBDATASET_12_NAME=NETCDF:"/backup/meteorology/2025/001/NLDAS_FORA0125_H.A20250101.0200.020.nc":SWdown
Names from grib file format (early 2024 and previous).
rob@deq2:/opt/model/meta_model$ gdalinfo /backup/meteorology/2023/001/NLDAS_FORA0125_H.A20230101.0200.002.grb | grep GRIB_EL
GRIB_ELEMENT=TMP
GRIB_ELEMENT=SPFH
GRIB_ELEMENT=PRES
GRIB_ELEMENT=UGRD
GRIB_ELEMENT=VGRD
GRIB_ELEMENT=DLWRF
GRIB_ELEMENT=var153
GRIB_ELEMENT=CAPE
GRIB_ELEMENT=PEVAP
GRIB_ELEMENT=APCP
GRIB_ELEMENT=DSWRF
Use h5ls to get information.
h5ls /backup/meteorology/2025/001/NLDAS_FORB0125_H.A20250101.2200.020.nc
ACond Dataset {1, 224, 464}
CRainf Dataset {1, 224, 464}
PSurf Dataset {1, 224, 464}
PhiS Dataset {1, 224, 464}
Qair Dataset {1, 224, 464}
Rainf Dataset {1, 224, 464}
SWdown Dataset {1, 224, 464}
Tair Dataset {1, 224, 464}
Wind_E Dataset {1, 224, 464}
Wind_N Dataset {1, 224, 464}
bnds Dataset {2}
lat Dataset {224}
lon Dataset {464}
time Dataset {1}
time_bnds Dataset {1, 2}
root@deq2:~# gdal_merge.py
No input files selected.
Usage: gdal_merge.py [-o out_filename] [-of out_format] [-co NAME=VALUE]*
[-ps pixelsize_x pixelsize_y] [-tap] [-separate] [-q] [-v] [-pct]
[-ul_lr ulx uly lrx lry] [-init "value [value...]"]
[-n nodata_value] [-a_nodata output_nodata_value]
[-ot datatype] [-createonly] input_files
[--help-general]
Variables Used by @gopal-bhatt C++ code:
Indicate a match in the new NLDAS2 raster format .nc
- 1 (yes): TMP Temperature [C] | 2[m]
- 2 (yes): SPFH Specific humidity [kg/kg]
- 4 (yes): u-component of wind
- 11 (yes): SWdown Downward shortwave radiation flux [W/m^2]
- 5 (yes): VGRD (wind speed v-component)
- 3 (yes): PRES SFC (Ground or water surface) Pressure [Pa]
Name mapping:
- New format is only 1 index slot off from old format
| Grib Element | Grib Number | netCDF Name | netCDF Number | Grib Comment | Grib Description |
|---|---|---|---|---|---|
| TMP | 1 | Tair | 2 | Temperature [C] | 2[m] HTGL (Specified height level above ground) |
| SPFH | 2 | Qair | 3 | Specific humidity [kg/kg] | 2[m] HTGL (Specified height level above ground) |
| PRES | 3 | PSurf | 4 | SFC (Ground or water surface) Pressure [Pa] | 0[-] SFC (Ground or water surface) |
| UGRD | 4 | Wind_E | 5 | u-component of wind [m/s] | 10[m] HTGL (Specified height level above ground) |
| VGRD | 5 | Wind_N | 6 | v-component of wind [m/s] | 10[m] HTGL (Specified height level above ground) |
| DLWRF | 6 | LWdown | 7 | Downward longwave radiation flux [W/m^2] | 0[-] SFC (Ground or water surface) |
| var153 | 7 | CRainf_frac | 8 | Undefined | 0[-] SFC (Ground or water surface) |
| CAPE | 8 | CAPE | 9 | Convective available potential energy [J/kg] | 180-0[hPa] SPDY (Level between 2 levels at specified pressure difference from ground to level) |
| PEVAP | 9 | PotEvap | 10 | Potential evaporation [kg/m^2] | 0[-] SFC (Ground or water surface) |
| APCP | 10 | Rainf | 11 | Total precipitation [kg/m^2] | 0[-] SFC (Ground or water surface) |
| DSWRF | 11 | SWdown* | 12 | Downward shortwave radiation flux [W/m^2] | 0[-] SFC (Ground or water surface) |
Excerpt from
| //DEBUG0 cout << "$#" << getRasterValue(hourGDAL, 10, grid[j].row, grid[j].col) << "\n"; |
gridValue[0] = getRasterValue(hourGDAL, 10, 224 - grid[j].row, grid[j].col - 1);
gridValue[1] = getRasterValue(hourGDAL, 1, 224 - grid[j].row, grid[j].col - 1);
gridValue[2] = getRasterValue(hourGDAL, 2, 224 - grid[j].row, grid[j].col - 1);
gridValue[3] = getRasterValue(hourGDAL, 4, 224 - grid[j].row, grid[j].col - 1);
gridValue[4] = getRasterValue(hourGDAL, 11, 224 - grid[j].row, grid[j].col - 1);
gridValue[5] = getRasterValue(hourGDAL, 5, 224 - grid[j].row, grid[j].col - 1);
gridValue[6] = getRasterValue(hourGDAL, 3, 224 - grid[j].row, grid[j].col - 1);
Metadata
Metadata
Assignees
Labels
No labels