Description
We discussed how to make the parameters with multiple dimensions on the parameter file easier to understand. The metadata is constructed to follow the CF conventions:
https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html
But not everyone is aware of those conventions and how they work.
For example, one of the latest files has these dimensions:
netcdf ctsm60_ciso_cwd_hr_params.c250311 {
dimensions:
pft = 79 ;
ndecomp_pools_max = 8 ;
segment = 4 ;
variants = 2 ;
string_length = 40 ;
ntill_stages_max = 3 ;
ntill_intensities_max = 2 ;
variables:
The "coordinates" attribute on each variable names the coordinate variable that describes each of its dimensions. For example, pftname is the coordinate variable for the "pft" dimension. For multidimensional data, the coordinate variables are listed space-delimited, in the same "C" order as the dimensions in the file.
There is also a variantnames coordinate variable for the "variants" dimension.
char variantnames(variants, string_length) ;
variantnames:_FillValue = "" ;
variantnames:long_name = "Description of variant names" ;
variantnames:units = "unitless" ;
.
.
.
variantnames =
"water ",
"carbon " ;
}
So we need coordinate variables for the dimensions: segment, ndecomp_pools_max, ntill_stages_max, and ntill_intensities_max. The "string_length" dimension doesn't need a coordinate since it's just the length of the strings in the metadata, although I suppose a coordinate variable could be added that just says that.
The following variables lack the "coordinates" attribute and should have it added:
double bgc_initial_Cstocks(ndecomp_pools_max) ;
double mimics_fmet(segment) ;
double mimics_initial_Cstocks(ndecomp_pools_max) ;
double mimics_kint(ndecomp_pools_max) ;
double mimics_kmod(ndecomp_pools_max) ;
double mimics_kslope(ndecomp_pools_max) ;
double mimics_mge(ndecomp_pools_max) ;
double mimics_vint(ndecomp_pools_max) ;
double mimics_vmod(ndecomp_pools_max) ;
double mimics_vslope(ndecomp_pools_max) ;
double bgc_till_decompk_multipliers(ntill_stages_max, ndecomp_pools_max, ntill_intensities_max) ;
double mimics_till_decompk_multipliers(ntill_stages_max, ndecomp_pools_max, ntill_intensities_max) ;
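A check like the one used to build the list above could be sketched roughly as follows. This is only an illustration, not an existing tool: the function name `check_coordinates`, the in-memory dictionary layout, and the rule that any variable dimensioned by "string_length" is a label variable are all assumptions made for the example. In practice the dictionary would be filled from the parameter file with a NetCDF reader.

```python
# Hypothetical sketch: find variables whose "coordinates" attribute is
# missing or inconsistent with their dimensions.
# Assumption: variables are held as {name: (dims, attrs)}.
# Assumption: a variable dimensioned by "string_length" is a label
# (coordinate) variable and is not itself checked.

LABEL_DIM = "string_length"


def check_coordinates(variables):
    """Return {variable name: problem description} for offending variables."""
    problems = {}
    for name, (dims, attrs) in variables.items():
        if not dims or LABEL_DIM in dims:
            continue  # scalars and label variables need no attribute
        coords = attrs.get("coordinates")
        if coords is None:
            problems[name] = "missing coordinates attribute"
            continue
        listed = coords.split()  # space-delimited, one per dimension
        if len(listed) != len(dims):
            problems[name] = (
                f"{len(listed)} coordinate names for {len(dims)} dimensions"
            )
            continue
        unknown = [c for c in listed if c not in variables]
        if unknown:
            problems[name] = "unknown coordinate variables: " + " ".join(unknown)
    return problems


# A small subset of the file, typed in by hand for illustration.
example = {
    "pftname": (("pft", "string_length"), {}),
    "c3psn": (("pft",), {"coordinates": "pftname"}),
    "mimics_fmet": (("segment",), {}),
    "bgc_till_decompk_multipliers": (
        ("ntill_stages_max", "ndecomp_pools_max", "ntill_intensities_max"),
        {},
    ),
}

problems = check_coordinates(example)
for var_name, problem in sorted(problems.items()):
    print(f"{var_name}: {problem}")
```

On this subset the sketch reports mimics_fmet and bgc_till_decompk_multipliers and accepts c3psn.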
Another CF convention in use is for "flag" variables, which are either logical variables or take one of a short list of options. For example, c3psn has:
double c3psn(pft) ;
c3psn:_FillValue = NaN ;
c3psn:long_name = "Photosynthetic pathway" ;
c3psn:units = "flag" ;
c3psn:valid_range = 0., 1. ;
c3psn:flag_meanings = "C4 C3" ;
c3psn:flag_values = 0., 1. ;
c3psn:coordinates = "pftname" ;
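To show how the flag attributes fit together, here is a minimal sketch that checks flag_values against flag_meanings and valid_range, and translates a stored value into its meaning. The helper names `check_flag_attrs` and `decode_flag` are hypothetical, and the attributes are typed in by hand from the c3psn example rather than read from the file.

```python
# Hypothetical sketch of a consistency check for CF flag attributes.
# Assumption: attributes are available as a plain dict.


def check_flag_attrs(attrs):
    """Return a list of problems found in the flag attributes."""
    errors = []
    values = list(attrs.get("flag_values", []))
    meanings = attrs.get("flag_meanings", "").split()  # space-delimited
    if len(values) != len(meanings):
        errors.append(
            f"{len(values)} flag_values but {len(meanings)} flag_meanings"
        )
    valid_range = attrs.get("valid_range")
    if valid_range is not None:
        low, high = valid_range
        for value in values:
            if not low <= value <= high:
                errors.append(f"flag value {value} outside valid_range")
    return errors


def decode_flag(attrs, value):
    """Translate a stored flag value into its meaning."""
    pairs = zip(attrs["flag_values"], attrs["flag_meanings"].split())
    for flag_value, meaning in pairs:
        if flag_value == value:
            return meaning
    raise ValueError(f"{value} is not one of the flag_values")


# Attributes copied from the c3psn example above.
c3psn_attrs = {
    "units": "flag",
    "valid_range": (0.0, 1.0),
    "flag_meanings": "C4 C3",
    "flag_values": (0.0, 1.0),
    "coordinates": "pftname",
}

flag_errors = check_flag_attrs(c3psn_attrs)
print(flag_errors)
print(decode_flag(c3psn_attrs, 1.0))
```

For c3psn the check finds no problems, and a stored value of 1 decodes to "C3".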
The remaining CF attributes used on some variables are comment and valid_range.
Definition of done:
- Release spreadsheet with cesm3.0
- Decide what things should be done here and what a timeline might be
- Add a note on this to the User's Guide?
- Add a coordinate variable with the names to all of the listed variables with dimensions (with the same name as the dimension)
- Add a coordinate variable for the rest of the dimensions
- Add an attribute for each variable that gives the general process the variable applies to from the spreadsheet
- Add a comment to each variable that only applies when a given parameterization is on, that says what needs to be turned on for it to be used
- Add a checker that CF conventions are followed? (there are some available to use)
- Add our own such checker?
- Add a tool that can query a case or compset and tell you which parameters are active
- Add some checking to query_paramdata for these kinds of things?
- Have the code read the metadata and either use it internally, or abort if it isn't what's assumed in the code.