Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unnecessary time frequency resampling #2112

Merged
merged 9 commits into from
Mar 26, 2025

Conversation

Hem-W
Copy link
Contributor

@Hem-W Hem-W commented Mar 20, 2025

Pull Request Checklist:

What kind of change does this PR introduce?

  • Avoid unnecessary time resampling in preprocess_standardized_index when freq is not None but the same as the input data.

Does this PR introduce a breaking change?

No

Other information:

I am thinking whether it is necessary to prompt users to install flox when they input dask datasets and require time resampling.

@coxipi coxipi self-requested a review March 20, 2025 12:12
Copy link

Warning

This Pull Request is coming from a fork and must be manually tagged approved in order to perform additional testing.

@coxipi
Copy link
Contributor

coxipi commented Mar 20, 2025

Hi @Hem-W !

Thanks for your insight here and in your issue. That's a great point. I think constraining the fast implementation to freq=None is counter-intuitive. If you give a daily input dataset and specify freq="D", you would be right to expect that resampling should be avoided.

@aulemahal @Zeitsperre , have your thoughts evolved on the inclusion of flox by default or not?

@coxipi coxipi added the approved Approved for additional tests label Mar 20, 2025
Copy link
Collaborator

@aulemahal aulemahal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works!

Co-authored-by: Pascal Bourgault <[email protected]>
@Zeitsperre
Copy link
Collaborator

@coxipi Outside the CI issues (being addressed elsewhere), it looks like there's one actual failing test:

  =================================== FAILURES ===================================
  _ TestStandardizedIndices.test_standardized_streamflow_index[MS-1-fisk-ML-values12-0.02] _
...
   tests/test_indices.py:982: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  
  args = (<function assert_allclose.<locals>.compare at 0x13cc64400>, array([ 0.32619351, -1.57770013, -0.43633094,  0.25251413, -0.81498785]), array([ 0.533154, -1.5777  , -0.436331,  0.29581 , -0.814988]))
  kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0, atol=0.02', 'strict': False, ...}
  
      @wraps(func)
      def inner(*args, **kwds):
          with self._recreate_cm():
  >           return func(*args, **kwds)
  E           AssertionError: 
  E           Not equal to tolerance rtol=0, atol=0.02
  E           
  E           Mismatched elements: 2 / 5 (40%)
  E           Max absolute difference among violations: 0.20696049
  E           Max relative difference among violations: 0.38818145
  E            ACTUAL: array([ 0.326194, -1.5777  , -0.436331,  0.252514, -0.814988])
  E            DESIRED: array([ 0.533154, -1.5777  , -0.436331,  0.29581 , -0.814988])
  
  /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py:81: AssertionError

Once the CI has been fixed here, could you quickly verify that this is expected?

@Zeitsperre Zeitsperre added this to the v0.56.0 milestone Mar 26, 2025
@Zeitsperre Zeitsperre merged commit 6085dac into Ouranosinc:main Mar 26, 2025
21 of 22 checks passed
@Zeitsperre
Copy link
Collaborator

@Hem-W Thanks so much for the code contribution!

@Hem-W Hem-W deleted the avoid-unnecessary-time-resampling branch March 27, 2025 03:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Approved for additional tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected time resampling leads to extremely large dask graphs in default installation
4 participants