Clarify the conventions for boundary variables #547

Open · wants to merge 9 commits into base: main

JonathanGregory (Contributor) commented

See issue #527 for discussion of these changes.

Release checklist

  • [NA] Authors updated in cf-conventions.adoc? Add in two places: on line 3 and under .Additional Authors in About the authors.
  • [NA] Next version in cf-conventions.adoc up to date? Versioning inspired by SemVer.
  • [Y] history.adoc up to date?
  • [NA] Conformance document up to date?

ChrisBarker-NOAA (Contributor) left a comment

looking good -- thanks!

ch07.adoc Outdated
For cells with fewer vertices than the size of vertex dimension, the unneeded elements must appear as the last elements in the vertex dimension and must be assigned the **`_FillValue`**.
CF can currently describe boundaries for cells which have one or two spatial dimensions, but does not provide conventions to describe the boundaries of cells with three spatial dimensions.
Please refer to <<UGRID>> for development of such conventions.
Contributor

UGRID only kinda-sorta does 3D -- not sure it should be mentioned here.

Contributor

But it probably should be mentioned somewhere near here ....

TomLav commented Nov 7, 2024

Thanks for the PR @JonathanGregory. I will discuss a bit further below.

We remove the reasonable assumption and add the no-default statement. It strikes me that we have maybe not concluded on the corner-case when the data producer wants to convey that the cells are in fact points. Should we include the corner-case in the no-default disclaimer?

Without a boundary variable, the extent of a cell is not known, nor whether adjacent cells are contiguous, separated by a gap, overlapping, or in fact points (cells with zero lengths).

ChrisBarker-NOAA (Contributor) commented

> It strikes me that we have maybe not concluded on the corner-case when the data producer wants to convey that the cells are in fact points.

What does "the cells are in fact points" mean? In my mind, points are, well, points, and not cells at all, and that's the usual definition of data in CF that doesn't have cell bounds.

The closest I've been able to find to a definition of "cell" is:

> When gridded data does not represent the point values of a field but instead represents some characteristic of the field within cells of finite "volume," ...

"Finite", to me, means "not a point".

taylor13 commented Nov 7, 2024

I noticed that the statements under Figs. 7.1 and 7.2 say: "Tuples (lon(i),lat(j)) represent grid cell centers." Given the discussion that the cell coordinate values aren't necessarily halfway between the bounds, should these statements be reworded? Perhaps "Tuples (lon(i),lat(j)) represent the coordinate location of a cell."? Or perhaps "Tuples (lon(i),lat(j)) represent grid cell nominal centers."?

I support this pull request.

TomLav commented Nov 7, 2024

@ChrisBarker-NOAA @JonathanGregory

I am starting to wonder if our idea of adding a "no-assumption" sentence in Chapter 7 is that good.

Re-reading the very start of section 7:

> When gridded data does not represent the point values of a field but instead represents some characteristic of the field within cells of finite "volume," a complete description of the variable should include metadata that describes the domain or extent of each cell, and the characteristic of the field that the cell values represent.

To me, this reads like the absence of :bounds should rightly be interpreted as providing locations for point values of a field. Because providing :bounds was specifically created for the cases where the data does not represent point values. Shouldn't the logical interpretation of the absence of :bounds rather be point values?

The new no-assumption sentence states that we cannot assume anything about cells in the absence of :bounds, but by doing so it actually brings the concept of a cell to mind, which seems contradictory to the first sentence of Chapter 7.

Maybe the best solution would be to delete the reasonable assumption sentence from Chapter 4, and add a sentence in Section 7 that reads: "In the absence of :bounds, the data represents the point values of a field". Then it is clear what the interpretation is when no :bounds are provided.

Sorry, I am going in circles a bit. Removing the sentence from Chapter 4 is still a very good idea.

taylor13 commented Nov 7, 2024

I don't think the absence of bounds can imply "point" data unless we make bounds a requirement for data representing cells. In the past, this has not been a requirement, so we can't change this without upsetting backward compatibility.

TomLav commented Nov 7, 2024

In that case, the absence of :bounds cannot imply anything: neither that the axis values represent cells, nor that such cells have any particular extent or shape.

> Without a boundary variable, the axis values can be assumed neither to hold point positions nor to represent cells. An unambiguous way to define point positions is to use the :bounds attribute to define 0-length cells. The only way to define cells, their position, and extent, is to use the :bounds attribute, as described below.
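
To make the cases under discussion concrete, here is a minimal Python sketch (not part of the PR or of CF itself) of what a boundary variable lets a reader determine for a one-dimensional axis: whether neighbouring cells are contiguous, separated by a gap, or overlapping, and whether the cells are zero-length points. The function names and the sample bounds are hypothetical, and the sketch assumes an increasing axis with lower <= upper in every pair.

```python
def classify_adjacent(bounds):
    """Return the relation between each pair of neighbouring 1-D cells.

    `bounds` is a list of (lower, upper) pairs, assumed to be ordered along
    an increasing axis with lower <= upper in every pair.
    """
    relations = []
    for (_, prev_upper), (next_lower, _) in zip(bounds, bounds[1:]):
        if prev_upper == next_lower:
            relations.append("contiguous")
        elif prev_upper < next_lower:
            relations.append("gap")
        else:
            relations.append("overlapping")
    return relations


def are_points(bounds):
    """True if every cell has zero length, i.e. lower == upper."""
    return all(lower == upper for lower, upper in bounds)


# Hypothetical bounds, all for the same coordinate values [5, 15, 25].
contiguous = [(0, 10), (10, 20), (20, 30)]
separated = [(4, 6), (14, 16), (24, 26)]
overlapping = [(0, 15), (10, 25), (20, 35)]
points = [(5, 5), (15, 15), (25, 25)]

print(classify_adjacent(contiguous))   # ['contiguous', 'contiguous']
print(classify_adjacent(separated))    # ['gap', 'gap']
print(classify_adjacent(overlapping))  # ['overlapping', 'overlapping']
print(are_points(points))              # True
```

All four sets of bounds go with the same coordinate values, which is why the coordinate values alone cannot distinguish between these cases.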

ChrisBarker-NOAA (Contributor) commented

> I don't think the absence of bounds can imply "point" data unless we make bounds a requirement for data representing cells. In the past, this has not been a requirement, so we can't change this without upsetting backward compatibility.

Is that really the case? How in the world can you have cells if you haven't defined them somehow?

If it really was the case that one could put in data representing cells without defining what the cells are, then I suppose what you had was:

These data are on cells of unknown geometry -- seems like a bad idea to me, but if that's what CF used to allow, then I guess it still does.

So how do you know if the data are point data or cell data?

Is it point data if there is no cell_method, and cell data if there is?

In the current text:

> When gridded data does not represent the point values of a field but instead represents some characteristic of the field within cells of finite "volume," a complete description of the variable should include metadata that describes the domain or extent of each cell ... and the characteristic of the field that the cell values represent

OK, so that is a "should", not a "must"...

But if it IS cell data, then it must (?) have a cell_method and/or a cell_measure (e.g. cell-area).

I guess it's not useless to have, e.g., a cell_method and a cell-area with no defined bounds, though it's not great.

Even a cell_method with no other definition of the cells is still some information.

Not sure what this means for the text, but maybe something along the lines of, in the intro to 7, something like:

"Data is Representative of Cells if there is a cell_method or a cell metric defined"

Along with the "should" for the bounds.

JonathanGregory (Contributor, Author) commented

If you have comments (other than trivial ones on typos and words), please could you make them in #527, not here in the PR. Thanks.

ChrisBarker-NOAA (Contributor) left a comment

Aside from one micro copy-edit suggestion, this looks good to go!

Thanks for keeping up with the slog!

ch07.adoc Outdated

[[cell-boundaries, Section 7.1, "Cell Boundaries"]]
=== Cell Boundaries

- To represent cells we add the attribute **`bounds`** to the appropriate coordinate variable(s).
+ To represent cells, the **`bounds`** attribute may be added to the appropriate coordinate variable(s).
Contributor

maybe "to define cells ..." rather than "represent"? -- no biggie.

Contributor Author

Thanks. I've replaced "represent" with "delimit".

ch01.adoc Outdated
It is often the case that data values are not representative of single points in time, space and other dimensions, but rather of intervals or multidimensional cells.
CF defines a **`bounds`** attribute to specify the extent of intervals or cells.
Because both the <<NUG>> and <<COARDS>> define coordinate variables but not cells or bounds, many applications assume that gridpoints are always located at the centers of their cells.
This assumption does not hold in CF. If bounds are not provided, the location of the gridpoint within the cell is undefined, and nothing can assumed about the location and extent of the cell.

missing word: and nothing can be assumed...

Contributor Author

Thanks. Done.

TomLav commented Nov 9, 2024

I support all these modifications and clarifications.

Development

Successfully merging this pull request may close these issues.

Boundary variables for auxiliary coordinates of more than two dimensions
4 participants