[SPARK-49552][PYTHON] Add DataFrame API support for new 'randstr' and 'uniform' SQL functions #48143

Open: @dtenedor wants to merge 28 commits into base: master


@dtenedor (Contributor) commented on Sep 18, 2024

What changes were proposed in this pull request?

In #48004 we added new SQL functions randstr and uniform. This PR adds DataFrame API support for them.

For example, in Scala:

sql("create table t(col int not null) using csv")
sql("insert into t values (0)")
val df = sql("select col from t")
df.select(randstr(lit(5), lit(0)).alias("x")).select(length(col("x")))
> 5

df.select(uniform(lit(10), lit(20), lit(0)).alias("x")).selectExpr("x > 5")
> true
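
The two checks in the Scala example (a 5-character string, and a value above 5 when the range is 10 to 20) can be modeled without a Spark session. The following is a minimal plain-Python sketch, not Spark's implementation: `randstr_model`, `uniform_model`, and `POOL` are hypothetical names, the character pool (0-9, a-z, A-Z) and the inclusive upper bound are assumptions, and the generated values will not match what Spark produces for the same seed.

```python
import random
import string

# Assumed character pool for the model: digits, lowercase, uppercase.
POOL = string.digits + string.ascii_lowercase + string.ascii_uppercase


def randstr_model(length: int, seed: int) -> str:
    """Hypothetical stand-in for randstr: `length` characters drawn from POOL."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    return "".join(rng.choice(POOL) for _ in range(length))


def uniform_model(low: int, high: int, seed: int) -> int:
    """Hypothetical stand-in for uniform with integer bounds."""
    rng = random.Random(seed)
    # randint includes both endpoints; treating the upper bound as
    # inclusive is an assumption of this model, not something the PR states.
    return rng.randint(low, high)


x = randstr_model(5, 0)
print(len(x))   # 5, mirroring length(col("x")) in the Scala example

y = uniform_model(10, 20, 0)
print(y > 5)    # True, mirroring selectExpr("x > 5")
```

Passing the same seed twice returns the same value in this model, which is the property that lets the PR's examples assert on fixed outputs.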

Why are the changes needed?

This improves DataFrame parity with the SQL API.

Does this PR introduce any user-facing change?

Yes, see above.

How was this patch tested?

This PR adds unit test coverage.

Was this patch authored or co-authored using generative AI tooling?

No.

@dtenedor changed the title from "[WIP][SPARK-49552][Python] Add DataFrame API support for new 'randstr' and 'uniform' SQL functions" to "[SPARK-49552][Python] Add DataFrame API support for new 'randstr' and 'uniform' SQL functions" on Sep 18, 2024
@dtenedor marked this pull request as ready for review on September 18, 2024, 15:59
@dtenedor (Contributor, Author) commented:

cc @HyukjinKwon @MaxGekk here is the DataFrame support for the new randstr and uniform functions :)

@HyukjinKwon changed the title from "[SPARK-49552][Python] Add DataFrame API support for new 'randstr' and 'uniform' SQL functions" to "[SPARK-49552][PYTHON] Add DataFrame API support for new 'randstr' and 'uniform' SQL functions" on Sep 19, 2024
@dtenedor (Contributor, Author) left a comment:

Thanks @zhengruifeng for your review! Responded to your comments, please take another look.

Review comments on python/pyspark/sql/functions/builtin.py (outdated, resolved):

+------+
| ceV0P|
+------+

Member commented:

nit: we normally don't include an empty line at the end of the docstring

Contributor Author replied:

Sounds good, this is done.

+------+
|     7|
+------+

Member commented:

ditto

Contributor Author replied:

Sounds good, this is done.

3 participants