Leverage new pylibcudf grouped_range_rolling_window for cuDF classic rolling(window: int)
#19162
def _window_to_window_sizes(self, window):
    if is_integer(window):
        return cudautils.grouped_window_sizes_from_offset(
Can we delete this function from `cudautils` now?
This is still used when doing a `groupby.rolling` with a frequency-like window. I think `grouped_range_rolling_window` should be able to handle this case if formulated right, so I can look into it in a follow-up.
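For readers following along, here is a rough pure-Python sketch of what a grouped window-size computation of this kind could look like. It is not the cuDF kernel: the function name `grouped_window_sizes` is hypothetical, and it assumes the keys are sorted ascending within each group, that the offset is positive, and that windows are closed on the right, i.e. `(key - offset, key]`. An integer window can be expressed by using row positions as the keys.

```python
def grouped_window_sizes(keys, group_starts, offset):
    """Hypothetical helper: per-row window sizes that never cross a group.

    keys         -- sortable values (row positions or timestamps),
                    ascending within each group
    group_starts -- index of the first row of each group, ascending
    offset       -- positive window extent in the same units as keys
    """
    n = len(keys)
    sizes = [0] * n
    bounds = list(group_starts) + [n]
    for start, stop in zip(bounds, bounds[1:]):
        left = start  # left edge of the window, reset for every group
        for i in range(start, stop):
            # Drop rows whose key is at least `offset` behind the current row.
            while keys[i] - keys[left] >= offset:
                left += 1
            sizes[i] = i - left + 1
    return sizes


# Integer window of 2 over two groups (rows 0-2 and rows 3-4):
# row positions serve as the keys.
int_sizes = grouped_window_sizes(list(range(5)), [0, 3], 2)

# Frequency-like window of 3 seconds over the same two groups:
# the keys are timestamps in seconds.
freq_sizes = grouped_window_sizes([0, 1, 5, 0, 2], [0, 3], 3)
```

The point of the sketch is only that both window kinds reduce to the same per-group scan, which is why one routine can plausibly serve both cases.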
)
if not is_integer(window):
    gb_size = groupby.size().sort_index()
    self._group_starts = (
I found it a bit hard to follow why `_group_starts` is getting set this way, and that led me down a bit of a rabbit hole. Do you think it would be cleaner to move input validation from `__init__` into separate helper functions, so that `RollingGroupby` and `Rolling` could have completely separate `__init__` functions that call the helpers, and then inline the code for converting windows to sizes? The way `_normalize` and `_window_to_window_sizes` are currently set up seems to improve code reuse, but at a significant expense to readability.
Out of scope for this PR, but a suggestion for future improvement.
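To make the suggestion concrete, something along these lines is one possible shape. This is a sketch only: the classes below are simplified stand-ins rather than the cuDF classes, the helper names `validate_window` and `validate_min_periods` are hypothetical, and the `min_periods` defaults are an assumption modeled on pandas.

```python
from datetime import timedelta
from numbers import Integral


def validate_window(window):
    """Hypothetical shared check: accept non-negative ints or timedeltas."""
    if isinstance(window, bool) or not isinstance(window, (Integral, timedelta)):
        raise TypeError(f"unsupported window type: {type(window).__name__}")
    if isinstance(window, Integral) and window < 0:
        raise ValueError("window must be non-negative")
    return window


def validate_min_periods(min_periods, window):
    """Hypothetical shared check; the defaults are assumed, not taken from cuDF."""
    if min_periods is None:
        return window if isinstance(window, Integral) else 1
    if min_periods < 0:
        raise ValueError("min_periods must be non-negative")
    if isinstance(window, Integral) and min_periods > window:
        raise ValueError("min_periods cannot exceed an integer window")
    return min_periods


class Rolling:
    """Stand-in for the ungrouped case: no group bookkeeping at all."""

    def __init__(self, obj, window, min_periods=None):
        self.obj = obj
        self.window = validate_window(window)
        self.min_periods = validate_min_periods(min_periods, self.window)


class RollingGroupby:
    """Stand-in for the grouped case: group starts are set in one obvious place."""

    def __init__(self, obj, group_starts, window, min_periods=None):
        self.obj = obj
        self.window = validate_window(window)
        self.min_periods = validate_min_periods(min_periods, self.window)
        self.group_starts = list(group_starts)


plain = Rolling([1, 2, 3], window=2)
grouped = RollingGroupby([1, 2, 3, 4], group_starts=[0, 2], window=timedelta(seconds=3))
```

With this layout neither `__init__` depends on hooks overridden in the other class, so the grouped-only state is visible where it is created.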
Yeah, agreed, there's an opportunity here to make things clearer. Additionally, we may not even need this if we can convert this to purely use pylibcudf, as described in #19162 (comment).
/merge
Description
Toward #18709
Does not cover windows that are `BaseIndexer` subclasses or timedeltas.