Draft: Modify extract_season to return only "full" seasons upon request #2597

Open
wants to merge 15 commits into
base: main
from
5 changes: 5 additions & 0 deletions CITATION.cff
@@ -223,6 +223,11 @@ authors:
    family-names: Garcia Perdomo
    given-names: Karen
    orcid: "https://orcid.org/0009-0004-2333-3358"
  -
    affiliation: "ECCC, Canada"
    family-names: Webb
    given-names: Kristi
    orcid: "https://orcid.org/0000-0002-8610-0672"

cff-version: 1.2.0
date-released: 2024-11-26
5 changes: 4 additions & 1 deletion doc/recipe/preprocessor.rst
@@ -1359,10 +1359,13 @@ See also :func:`esmvalcore.preprocessor.extract_time`.

Extract only the times that occur within a specific season.

This function only has one argument: ``season``. This is the named season to
This function has one required argument: ``season``. This is the named season to
extract, i.e. DJF, MAM, JJA, SON, but also all other sequentially correct
combinations, e.g. JJAS.

The argument ``full`` specifies whether only full seasons are returned, i.e. DJF
but not JF. Default is False.

Note that this function does not change the time resolution. If your original
data is in monthly time resolution, then this function will return three
monthly datapoints per year.
39 changes: 38 additions & 1 deletion esmvalcore/preprocessor/_time.py
@@ -331,7 +331,7 @@ def clip_timerange(cube: Cube, timerange: str) -> Cube:
    return _extract_datetime(cube, t_1, t_2)


def extract_season(cube: Cube, season: str) -> Cube:
def extract_season(cube: Cube, season: str, full: bool = False) -> Cube:
    """Slice cube to get only the data belonging to a specific season.

    Parameters
@@ -341,6 +341,9 @@ def extract_season(cube: Cube, season: str) -> Cube:
    season:
        Season to extract. Available: DJF, MAM, JJA, SON
        and all sequentially correct combinations: e.g. JJAS
    full:
Contributor

I would add more documentation or maybe context. I know what the issue is, but I think a person without context might get confused.

        Only return full seasons e.g. DJF and never JF
        default: False

    Returns
    -------
@@ -378,6 +381,40 @@ def extract_season(cube: Cube, season: str) -> Cube:
        coords_to_remove.append("season_year")

    result = cube.extract(iris.Constraint(clim_season=season))

    if full:  # remove incomplete seasons
        iris.coord_categorisation.add_month_number(
            result, "time", name="month_number"
        )

        # end of season
        send = sstart + len(season) - 1
        if send >= 12:
            send -= 12

        # add coordinate which flags whether months are part of a full season
        full_season = np.full_like(result.coord("month_number").points, 1)
        first_month = np.where(
            result.coord("month_number").points == sstart + 1
        )[0][0]
        last_month = np.where(result.coord("month_number").points == send + 1)[
            0
        ][-1]
        full_season[:first_month] = 0
        full_season[last_month + 1 :] = 0

        full_season_coord = iris.coords.AuxCoord(
            full_season, long_name="full_season"
        )
        result.add_aux_coord(
            full_season_coord, data_dims=result.coord_dims("time")
        )

        result = result.extract(iris.Constraint(full_season=1))
Contributor


Personal preference rather than a requirement, but I would add a debug-level logger message stating which months have been removed. On a similar note: if values exist in the time series but have been masked out, then as far as I understand they won't be removed. I would recommend documenting this somewhere.
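
A minimal sketch of the suggested debug message, using plain Python lists in place of iris coordinates. The helper name `full_season_mask` is hypothetical, and the way `sstart` is derived here (the index of the season string in a doubled month string) is an assumption, since the hunk above does not show where `sstart` is set. The trimming itself follows the first-month/last-month logic in the hunk:

```python
import logging

logger = logging.getLogger(__name__)

# Doubled so that seasons wrapping the year end (e.g. DJF) can be found.
ALL_MONTHS = "JFMAMJJASOND" * 2


def full_season_mask(month_numbers, season):
    """Return 1 for time points inside complete seasons, 0 otherwise.

    ``month_numbers`` is a chronological list with the month (1-12) of
    each time point, already restricted to ``season``.
    """
    # Assumption: 0-based index of the first month of the season.
    sstart = ALL_MONTHS.index(season.upper())
    # 0-based index of the last month of the season, wrapped into 0-11.
    send = (sstart + len(season) - 1) % 12

    # First occurrence of the starting month and last occurrence of the
    # ending month; everything outside this range is an incomplete season.
    first = month_numbers.index(sstart + 1)
    last = len(month_numbers) - 1 - month_numbers[::-1].index(send + 1)

    mask = [1 if first <= i <= last else 0 for i in range(len(month_numbers))]

    removed = [m for m, keep in zip(month_numbers, mask) if not keep]
    logger.debug("Removed months from incomplete seasons: %s", removed)
    return mask


# Two calendar years (Jan-Dec) restricted to DJF: the leading J, F and the
# trailing D do not belong to a complete season.
print(full_season_mask([1, 2, 12, 1, 2, 12], "DJF"))  # [0, 0, 1, 1, 1, 0]
```

The mask depends only on the month of each time point, so time points whose data are masked still count as present. That matches the behaviour described in the comment above and may be worth stating in the docstring.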


        for coord in ["full_season", "month_number"]:
            result.remove_coord(coord)

    for coord in coords_to_remove:
        cube.remove_coord(coord)
    if result is None:
12 changes: 12 additions & 0 deletions tests/unit/preprocessor/_time/test_time.py
@@ -631,6 +631,18 @@ def test_get_jf(self):
        with assert_raises(iris.exceptions.CoordinateNotFoundError):
            self.cube.coord("season_year")

    def test_get_djf_full(self):
Contributor


Again, personal preference rather than a requirement, but I would add a second test that provides a cube in which the season is captured fully (say, from Dec to Nov), calls the function with the keyword full=True, and checks that the returned cube is correct.

        """Test function for winter."""
        sliced = extract_season(self.cube, "DJF", full=True)
        iris.coord_categorisation.add_month_number(sliced, "time")
        assert_array_equal(
            np.array([12, 1, 2]), sliced.coord("month_number").points
        )
        with assert_raises(iris.exceptions.CoordinateNotFoundError):
            self.cube.coord("clim_season")
        with assert_raises(iris.exceptions.CoordinateNotFoundError):
            self.cube.coord("season_year")


class TestClimatology(tests.Test):
    """Test class for :func:`esmvalcore.preprocessor._time.climatology`."""