Add fallback to HStack lowering in cudf-polars #19163

Merged
merged 6 commits into from
Jun 17, 2025

Conversation

rjzamora
Member

Description

Closes #19150
(Although, we can probably do a bit more work to avoid fallback for some non-pointwise expressions)

We were not validating that the items in HStack.columns were all "pointwise" before lowering to partitionwise logic. This adds the possibility for single-partition fallback.
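To illustrate the check described above, here is a minimal toy sketch in plain Python. It is not the cudf-polars implementation: `ToyExpr`, `choose_hstack_strategy`, and `rolling_sum` are hypothetical names invented for this example, and the strategy strings are placeholders. It shows why a non-pointwise column (a trailing rolling sum here) cannot be evaluated independently per partition, and how a lowering step might fall back to a single partition when any column is not pointwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyExpr:
    """Hypothetical stand-in for an expression node (not a cudf-polars class)."""

    name: str
    is_pointwise: bool


def choose_hstack_strategy(columns, partition_count):
    """Pick a lowering strategy for a toy HStack over `columns`.

    Partitionwise evaluation is only safe when every column expression is
    pointwise, i.e. each output row depends only on the same input row.
    """
    if partition_count == 1 or all(expr.is_pointwise for expr in columns):
        return "partitionwise"
    return "single-partition-fallback"


def rolling_sum(values, window):
    """Trailing rolling sum: each output depends on up to `window` rows."""
    return [
        sum(values[max(0, i - window + 1) : i + 1]) for i in range(len(values))
    ]


values = [3, 7, 5, 9, 2, 1]

# Evaluated over the whole column.
whole = rolling_sum(values, 2)  # [3, 10, 12, 14, 11, 3]

# Evaluated independently on two partitions and concatenated: the first row
# of the second partition cannot see the last row of the first partition.
split = rolling_sum(values[:3], 2) + rolling_sum(values[3:], 2)  # [3, 10, 12, 9, 11, 3]

columns = [ToyExpr("a * 2", True), ToyExpr("rolling_sum(a)", False)]
strategy = choose_hstack_strategy(columns, partition_count=2)
```

In this sketch `whole` and `split` differ at the partition boundary, which is the kind of incorrect result a pointwise check guards against. The fallback trades parallelism for correctness, and as noted above, some non-pointwise expressions could probably avoid it with more work.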

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@rjzamora rjzamora self-assigned this Jun 13, 2025
@rjzamora rjzamora requested a review from a team as a code owner June 13, 2025 20:02
@rjzamora rjzamora added the bug, 2 - In Progress, non-breaking, and cudf-polars labels Jun 13, 2025
@github-actions github-actions bot added the Python label Jun 13, 2025
@GPUtester GPUtester moved this to In Progress in cuDF Python Jun 13, 2025
Contributor

@TomAugspurger TomAugspurger left a comment

Thanks @rjzamora, looks great.

Do you think this could affect any other operations? Do you think we should add an assert in _lower_ir_pwise to validate that everything actually is pointwise (probably not, but maybe...)?

Contributor

@Matt711 Matt711 left a comment

I'm curious if the test is failing only with pytest-xdist turned on?

Comment on lines +21 to +25
df = (
    pl.DataFrame({"dt": dates, "a": [3, 7, 5, 9, 2, 1]})
    .with_columns(pl.col("dt").str.strptime(pl.Datetime("ns")))
    .lazy()
)
Contributor

Suggested change
- df = (
-     pl.DataFrame({"dt": dates, "a": [3, 7, 5, 9, 2, 1]})
-     .with_columns(pl.col("dt").str.strptime(pl.Datetime("ns")))
-     .lazy()
- )
+ df = (
+     pl.LazyFrame({"dt": dates, "a": [3, 7, 5, 9, 2, 1]})
+     .with_columns(pl.col("dt").str.strptime(pl.Datetime("ns")))
+ )

Member Author

Seems like engine='gpu' raises when I use this?

Contributor

Yes, I don't think we support strptime inferring the format (cc @brandon-b-miller, who implemented this stuff).

@rjzamora
Member Author

Ahh - Looks like we have a serialization/hashing error when the new test runs in distributed mode :/

@rjzamora rjzamora marked this pull request as draft June 16, 2025 13:36

copy-pr-bot bot commented Jun 16, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rjzamora rjzamora marked this pull request as ready for review June 16, 2025 14:49
Contributor

@wence- wence- left a comment

Tiny suggestions, but I think this looks good.

@rjzamora rjzamora added the 5 - Ready to Merge label and removed the 2 - In Progress label Jun 16, 2025
@rjzamora
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 77d0757 into rapidsai:branch-25.08 Jun 17, 2025
93 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuDF Python Jun 17, 2025
@rjzamora rjzamora deleted the rick/bug/19150 branch June 17, 2025 19:25
Labels
5 - Ready to Merge: Testing and reviews complete, ready to merge
bug: Something isn't working
cudf-polars: Issues specific to cudf-polars
non-breaking: Non-breaking change
Python: Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[BUG]: Incorrect result for rolling with experimental streaming executor and multiple partitions
4 participants