Boundary variables for auxiliary coordinates of more than two dimensions #527
Dear all I propose the following changes to remedy the defect described above in Sect 7.1 "Cell boundaries". In addition I propose to insert some new subsection headings, explanatory text, and rearrangement of text in Sect 7.1, to improve clarity. I don't believe this is any change to the meaning of the convention, except for one small generalisation (at the end). On account of that, and because the text changes are quite extensive, I've relabelled the issue as an enhancement. The proposed changes are listed below. You might find it easier to look at the pull request or the HTML of the modified document. Cheers Jonathan
to
This is a small generalisation to admit the possibility of 2D cells in other than the horizontal plane e.g. in (height,latitude). I'm sure CF is being applied to those cases, and they might need 2D auxiliary coordinates. I haven't made the requirement for the vertices to be traversed in a right-handed way apply to cases apart from the horizontal plane, where we already have that requirement. |
I have updated the PR #547 to make two further changes on which there appears to be consensus in discussion 380, namely
I would be grateful for support to be expressed for these changes, so that they can go into 1.12, which we must agree by Monday 11th Nov i.e. next week. Thanks. |
@JonathanGregory > I really like and support the re-arrangement in Chapter 7. Would it be easier for a new reader to see the following order of sections?
I do not know if such a re-ordering is easy to take in this PR or if it should be moved to a later PR. |
@TomLav -- Thinking back to when I first read through the spec, it was easiest when the concepts relevant to what I was working with (output from numerical weather models) came first. I would say that the best ordering for new readers is basic concepts and common use cases first, with more complicated / specialized stuff coming later. So I think your ordering makes a lot of sense. I definitely agree that Boundaries & Formula Terms should be separated out into its own subsection, and that it should come after the section on 1-D bounds. One thing that I think might make it easier for the reader would be to talk about a true 1-D case first, like time bounds. Because in terms of use cases, the different things readers want to know are how to handle time bounds, bounds for a lat-lon grid, bounds for a projected rectilinear grid, bounds for an unstructured grid, and bounds for a parametric z coordinate. I think that's roughly the ordering of most common to least common use case, and it's also the ordering you have suggested (plus the time bounds case). |
+1 to all of what @sethmcg said :-) I've been thinking about this and started to write up some more detailed discussion -- far more detail than we should put in CF, but I think it would be good to have this sort of doc somewhere. Specifically things like @sethmcg mentioned: but a bit more general, like: How to describe: A lat-lon aligned rectangular grid Note that as it stands, CF doesn't (with the exception of the new UGRID spec) talk about the relationship between grids and cells much at all -- it talks about grids, and it talks about cells, and a grid is a particular arrangement of cells, but that has to be inferred (calculated?) from the cell bounds. Or maybe I've missed something, but in any case, I think we need more documentation about that. I've made a start here: It's only a start -- but hopefully you can see where I'm going with this -- I'd love any feedback or help anyone might provide. The end goal of that doc is that both producers of output from gridded models and consumers of such output will know what to do, with a full example. |
Hi @ChrisBarker-NOAA , is it possible to add comments to your markdown page ? Looking through it I remembered the Arakawa C grid (https://en.wikipedia.org/wiki/Arakawa_grids), used by the (now outdated) COSMO-D2 model, which defines some parameters on the nodes, some (like wind speed) on the borders of the cell, and I imagine that some parameters are means for the whole cell. |
I support the proposed change. Don't hold up things for this, but I did notice the statements under figs. 7.1 and 7.2 that say: "Tuples (lon(i),lat(j)) represent grid cell centers." Given the discussion that the cell coordinate values aren't necessarily half way between the bounds, should these statements be reworded? Perhaps "The tuple (lon(i),lat(j)) represent the coordinate location of a cell."? Or perhaps "Tuples (lon(i),lat(j)) represent grid cell nominal centers."? |
@Armin-RS: yes, please do! You can use GitHub issues or PRs, or, if you want to do more than a little bit, I'm happy to give you access. As for "Arakawa C grid" -- yes, I hope to get there -- and there are other complications there -- fluxes through cell walls, etc. |
Thanks for your helpful comments - but please could you refrain from putting substantial comments or holding discussions in the PR. (No offence intended - I am commenting on procedure as chair of the committee.) It's fine to put small comments in the PR, on typos or whatever, and it's convenient since you can add them to the line at which they apply. However, it's hard to follow the discussion, and to review it subsequently, about the wording, arrangement and content if it's not all in one place. For reference, I am therefore copying your comments here:
@TomLav
Thanks for the PR @JonathanGregory. I will discuss a bit further below. We remove the reasonable assumption and add the no-default statement. It strikes me that we have maybe not concluded on the corner-case when the data producer wants to convey that the cells are in fact points. Should we include the corner-case in the no-default disclaimer?
@ChrisBarker-NOAA
What does "the cells are in fact points" mean? in my mind, points, are well, points, and not cells at all, and that's the usual definition of data in CF that doesn't have cell bounds. the closest I've been able to find for a definition of "cell" is:
"finite" to me means, "not a point".
@TomLav
I am starting to wonder if our idea of adding a "no-assumption" sentence in Chapter 7 is that good. Re-reading the very start of section 7:
To me, this reads like the absence of The new Maybe the best solution would be to delete the reasonable assumption sentence from Chapter 4, and add a sentence in Section 7 that reads: "In the absence of :bounds, the data represents the point values of a field". Then it is clear what the interpretation is when no :bounds are provided. Sorry, I am going in circles a bit. Removing the sentence from Chapter 4 is still a very good idea.
@taylor13
I don't think the absence of bounds can imply "point" data unless we make bounds a requirement for data representing cells. In the past, this has not been a requirement, so we can't change this without upsetting backward compatibility.
@TomLav
In that case, the absence of
@ChrisBarker-NOAA
Is that really the case? how in the world can you have cells if you haven't defined them somehow? If it really was the case that that one could put in data representing cells without defining what the cells are, then I suppose what you had was: These data are on cells of unknown geometry -- seems like a bad idea to me, but if that's what CF used to allow, then I guess it still does. So how do you know if the data are point data or cell data? Is it point data if there is no cell_method, and cell data if there is? in the current text:
OK, so that is a "should" not a must.... But if it IS cell data, then it must (?) have a cell_method and/or a cell_measure (e.g. cell-area). I guess it's not useless to have, e.g. have a cell_method and a cell-area with no defined bounds, though not great. Even a cell_method with no other definition of the cells is still some information. Not sure what this means for the text, but maybe something along the lines of, in the intro to 7, something like: "Data is Representative of Cells if there is a cell_method or a cell metric defined" Along with the "should" for the bounds. |
Dear @TomLav @sethmcg @ChrisBarker-NOAA @taylor13 Thanks for your comments. I had already put the formula terms into its own section. @TomLav's suggestion is to move this section to the end. I didn't do that at first because I started this issue with the aim of minimal changes for clarity. However, it's no problem to move it, so I've done so, given that you agree it's better at the end. Taking Seth's suggestions of considering time, and a purely 1D case, I have mentioned 1D time in the preamble of section 7, and I have changed Example 7.1 from latitude to time. Also I moved the following text, which described the same example, into the box of the example, and I have spelled out more details. In addition to Example 7.1, we still also have Fig 7.1, which is about 1D latitude and longitude together, so both cases are now covered. On reviewing the preamble of sect 7, I thought it was not as clear as it could be, so I have rearranged and modified it in other ways too, though not intending to change its meaning. In the last version, I had inserted a statement that, without bounds, you could assume only that the point is somewhere within or on the boundary of the cell. I noticed subsequently that this is a recommendation in the conformance document, not a requirement. Hence, without bounds, nothing at all can be assumed about the relationship between cells and points, as @TomLav suggests. I have now stated this instead in sect 7.1. I have also inserted an explicit statement that if you want to indicate that a cell has zero size you must give it explicit coincident bounds. Also in sect 7.1, I have added more text to describe the recommendation that the point should lie within the cell, and also the requirement that the bounds should be ordered in the same sense as the coordinates. I've added the latter requirement to the conformance document, which didn't have it before, although it was already stated with "must" in the convention. 
I agree with Chris's reluctant conclusions, "I guess it's not useless to have, e.g. a cell_method and a cell-area with no defined bounds, though not great. Even a cell_method with no other definition of the cells is still some information." I've always understood that to be the intention of the convention. Bounds and cell methods are both separately optional, but informative, especially in combination. Bounds are currently optional, not recommended. I suggest we postpone reconsidering whether they should be recommended to version 1.12. I have altered the reference to "centers" that Karl noticed in the captions of Figs 7.1 and 7.2. It now says that the coordinate values "locate the gridpoints". The aim of the present issue is clarification of the current convention. In order to meet the CF 1.12 deadline (beginning the three-week period for acceptance on Monday), we must limit our ambitions. 😃 The PR #547 now shows the text following the above changes. They are also shown in the modified HTML conventions document. What do you think? Best wishes Jonathan |
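The two statements about 1D bounds in the comment above (the recommendation that the coordinate lie within its cell, and the requirement that bounds be ordered in the same sense as the coordinates) can be illustrated with a small check. This is only a sketch: the function name `check_bounds_1d` and the plain-list inputs are assumptions made for illustration, not part of CF or of any CF-checking tool.

```python
def check_bounds_1d(coords, bounds):
    """Return a list of findings for a 1D coordinate and its (n, 2) bounds.

    Hypothetical helper for illustration only; an empty list means no findings.
    """
    if len(coords) != len(bounds):
        return ["coords and bounds differ in length"]

    # Sense of the coordinate: +1 increasing, -1 decreasing, 0 if only one value.
    sense = 0
    if len(coords) > 1:
        sense = 1 if coords[-1] > coords[0] else -1

    findings = []
    for i, (c, (b0, b1)) in enumerate(zip(coords, bounds)):
        low, high = min(b0, b1), max(b0, b1)
        # Recommendation: the coordinate lies within or on the edge of its cell.
        if not low <= c <= high:
            findings.append(f"recommendation: cell {i} does not contain {c}")
        # Requirement: bounds run in the same sense as the coordinate.
        # Coincident bounds (a zero-size cell) satisfy this trivially.
        if sense and (b1 - b0) * sense < 0:
            findings.append(f"requirement: cell {i} bounds are reversed")
    return findings


# Daily time cells with the coordinate at the middle of each day.
time = [0.5, 1.5, 2.5]
time_bnds = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
print(check_bounds_1d(time, time_bnds))
```

A zero-size cell is expressed here by giving both bounds the same value as the coordinate, for example `check_bounds_1d([1.0, 2.0], [(1.0, 1.0), (2.0, 2.0)])`.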
After a quick skim, I think it is good to go for CF 1.12. I'm sure we could always find ways to improve on it, but it is definitely better than before. |
Because both NUG and COARDS define coordinate variables but not cells or bounds, I think it would be useful to continue to have some guidance on the undefined nature of cells in chapter 2. It should perhaps also include some description in terms of pixels as I think this is often a point of confusion for those who are used to dealing with raster/image data and may not think of cell/bounds as something relevant. Perhaps something like:
Probably the raster/image language should be a separate discussion with a more extensive review. |
Absolutely - there may even be a need to add something specifically for pixels -- I've noticed that sometimes pixels are defined by their corner and size, sometimes by the center points. Also: "The raster data model consists of rows and columns of equally sized pixels interconnected to form a planar surface." From a random google: But that's consistent with what I've seen, including GDAL: Key point is that the pixels are all the same size (in the coordinate system used) and that size is defined -- so you don't need to specify all the coordinates -- if you know the corner, the pixel size, and the raster size (number of pixels) you can compute the locations of all the pixels. You can cover that in CF with two coordinate dimensions that happen to be equally spaced -- but even then it's a bit confusing as to whether the coordinates are the center or corner, and the dx and dy are not clearly stated (they can be computed, of course). I know that CF is trying to be general, but saying something about specific cases like this is a good idea -- either as a specific spec, or at least a documented "best practices for storing raster data". Anyway -- new topic :-) |
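The point that a raster's corner, pixel size and pixel count determine every pixel location can be sketched as follows. The function name `raster_axis` and the choice to treat the origin as the outer edge of the first pixel are assumptions made for illustration; a raster that records pixel centers instead would need a half-pixel shift.

```python
def raster_axis(origin, step, count):
    """Return (centers, bounds) for one axis of an evenly spaced raster.

    Assumption: `origin` is the outer edge of the first pixel and `step` is
    the pixel size along this axis (negative for an axis that runs downward).
    """
    bounds = [(origin + i * step, origin + (i + 1) * step) for i in range(count)]
    centers = [(b0 + b1) / 2 for b0, b1 in bounds]
    return centers, bounds


# Three pixels of width 0.5 starting at the edge x = 0.
x, x_bnds = raster_axis(0.0, 0.5, 3)
print(x)
print(x_bnds)
```

The returned `centers` would correspond to a 1D coordinate variable and `bounds` to its boundary variable, which makes the pixel size explicit rather than something to be inferred from the coordinate spacing.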
Dear all Thanks to your comments, enough support has been expressed for us to be able to accept this proposal in three weeks, on 1st December, if no-one raises any concerns before then. @ethanrd, thanks for your suggestion. I agree that it's useful to compare CF with raster pixels. I also agree that we could have an extensive discussion of it, which we don't have time for before CF 1.12 is finalised. However, I've made an attempt, following your suggestion. I've updated the PR #547 and HTML as below. Will this help? Cheers Jonathan
Addition of new text (in bold) to the definition of "cell" in sect 1.3, "Terminology": A region in one or more dimensions whose boundary can be described by a set of vertices recorded in boundary variables. The term interval is sometimes used for one-dimensional cells. A two-dimensional cell is analogous to a pixel in a raster graphic, but is a more general concept (see section 1.4, "Overview").
Addition of new text (in bold) and small replacements, in sect 1.4, "Overview": It is often the case that data values are not representative of single points in time, A two-dimensional cell is analogous to a pixel in a raster graphic, but is a more general concept. Pixels in a raster are rectangular, all of the same size, and arranged in a logically rectangular array with their nominal point locations at their centers. By contrast, two-dimensional cells in a CF field do not necessarily satisfy any of those conditions, though they commonly do. Furthermore, as an alternative to cells in two dimensions, CF defines a convention for the case where each data value is associated with a geographical feature that is described by one or more points, lines or polygons. When data that is representative of cells can be described by simple statistical methods (for instance, mean or maximum), those methods can be indicated using the |
I think we should avoid talking about pixels without more discussion: http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf Title: "A pixel is not a little square" |
Dear Chris I added this text because of your comments and Ethan's that it would be useful to compare the CF concept of a cell with the pixels of a raster. I can take it out again if it's wrong or not useful, but I do think it could help some people who are new to CF, so perhaps we can improve it. What do you think, @ethanrd? You're right that we shouldn't say anything about the shape of pixels. How about this (replacing three of the sentences in the previous draft): A two-dimensional cell is analogous to a pixel in a raster graphic, but is a more general concept. Pixels in a raster are evenly spaced in each dimension and arranged in a logically rectangular array. Two-dimensional cells in a CF field do not necessarily satisfy either of those conditions, though they commonly do. Wiktionary defines "pixel" as "One of the tiny dots that make up the representation of an image in a computer's memory" or "One of the squares that make up a work of pixel art or a zoomed-in image in a computer." About raster graphics, Wikipedia says "A raster graphic represents a two-dimensional picture as a rectangular matrix or grid of pixels." Best wishes Jonathan |
@ChrisBarker-NOAA and @TomLav. Thanks for your comments on the PR #547, which I have implemented, as shown in the HTML. |
Section 7.1 on "Cell boundaries" contains the following text, about providing boundary variables for auxiliary coordinate variables of more than two dimensions:
There is no example given. I suppose it must mean something like this for three dimensions:
Since it talks about traversing the vertices anticlockwise in the lon-lat plane, it must be concerned with horizontal faces of the 3D cells. A horizontal coordinate variable only has to be 3D (rather than 2D) if the cells aren't aligned in vertical columns. Do you agree with this interpretation? If so, we should clarify it. 3D cells don't just have horizontal faces, so this convention is of restricted use. For a more general treatment of bounds in two dimensions, we could refer to UGRID.
Cheers, Jonathan
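As an illustration of the vertex-ordering rule discussed in this issue, the sketch below tests whether each cell of a 2D auxiliary coordinate has its vertices listed anticlockwise in the lon-lat plane, which is how I read the requirement in Sect 7.1. It assumes boundary arrays of shape (n, m, 4) held as nested lists, treats longitude and latitude as planar coordinates, and ignores wrapping at the dateline. The function names are hypothetical. Under the interpretation proposed above for 3D auxiliary coordinates, the same test would presumably apply to the horizontal face at each level.

```python
def signed_area(lons, lats):
    """Shoelace formula: positive when the vertices run anticlockwise."""
    n = len(lons)
    total = 0.0
    for k in range(n):
        j = (k + 1) % n  # next vertex, wrapping back to the first
        total += lons[k] * lats[j] - lons[j] * lats[k]
    return total / 2


def not_anticlockwise_cells(lon_bnds, lat_bnds):
    """Return (j, i) indices of cells whose vertices are not anticlockwise.

    Hypothetical helper: lon_bnds and lat_bnds are nested lists shaped (n, m, 4).
    """
    bad = []
    for j, (lon_row, lat_row) in enumerate(zip(lon_bnds, lat_bnds)):
        for i, (lons, lats) in enumerate(zip(lon_row, lat_row)):
            if signed_area(lons, lats) <= 0:
                bad.append((j, i))
    return bad


# One row of two unit cells: the first anticlockwise, the second clockwise.
lon_bnds = [[[0, 1, 1, 0], [1, 1, 2, 2]]]
lat_bnds = [[[0, 0, 1, 1], [0, 1, 1, 0]]]
print(not_anticlockwise_cells(lon_bnds, lat_bnds))
```

A degenerate cell with zero area is also reported, since its traversal sense cannot be determined from the signed area alone.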