[SPARK-54660][SS] Add RTM trigger to python #53448
    except Exception as e:
        # This error is expected
        self._assert_exception_tree_contains_msg(
            e, "STREAMING_REAL_TIME_MODE.INPUT_STREAM_NOT_SUPPORTED"
If this input stream is not supported by real-time mode, can we test a supported source in the Python test?
viirya
left a comment
The code change looks okay. Only wondering if we can add a supported case in Python test.
@viirya thank you for the review. The issue is that currently no RTM-supported source can be used in PySpark. There is no memory source equivalent in Python, and using Kafka would require us to start a Kafka cluster via Docker, which I had limited success with via:
In my follow-up PR to this I will probably convert the socket source to support RTM and use that to test.
Merged to master. Thanks @jerrypeng
Closes apache#53448 from jerrypeng/SPARK-53998-3.
Authored-by: Jerry Peng <[email protected]>
Signed-off-by: Liang-Chi Hsieh <[email protected]>
What changes were proposed in this pull request?
Add a real-time mode (RTM) trigger to PySpark so that PySpark queries can run in RTM. For now, only stateless queries without UDFs are supported.
Also add the trigger to Spark Connect, since a test fails if the method signatures do not match.
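The signature-parity constraint can be illustrated with a small standalone sketch. It does not use PySpark: `ClassicWriter` and `ConnectWriter` are hypothetical stand-ins for the classic and Spark Connect stream writers, and the `realTime` keyword name is an assumption about the new parameter, not something stated in this description.

```python
import inspect


class ClassicWriter:
    """Hypothetical stand-in for the classic stream writer."""

    def trigger(self, *, processingTime=None, once=None, continuous=None,
                availableNow=None, realTime=None):
        # `realTime` is an assumed name for the new RTM keyword.
        return self


class ConnectWriter:
    """Hypothetical stand-in for the Spark Connect stream writer."""

    def trigger(self, *, processingTime=None, once=None, continuous=None,
                availableNow=None, realTime=None):
        # Must mirror ClassicWriter.trigger, or the parity check fails.
        return self


def signatures_match(cls_a, cls_b, method_name):
    """Return True if both classes declare the method with the same signature."""
    sig_a = inspect.signature(getattr(cls_a, method_name))
    sig_b = inspect.signature(getattr(cls_b, method_name))
    return sig_a == sig_b
```

A check of this kind passes only when both classes expose the same parameters, which is why adding a keyword on one side alone would break such a test.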
Why are the changes needed?
To support running RTM queries in PySpark.
Does this PR introduce any user-facing change?
Yes. This adds an RTM trigger to PySpark.
How was this patch tested?
Added a simple test. I will add more tests in a subsequent PR.
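The test snippet quoted in the review expects an unsupported source to be rejected with a specific error class somewhere in the exception chain. A minimal standalone sketch of that idea follows. `exception_tree_contains` and `start_unsupported_query` are hypothetical names used only for illustration, and the real `_assert_exception_tree_contains_msg` helper may be implemented differently.

```python
def exception_tree_contains(exc, msg):
    """Return True if `msg` appears in `exc` or any chained cause/context."""
    seen = set()
    stack = [exc]
    while stack:
        current = stack.pop()
        if current is None or id(current) in seen:
            continue
        seen.add(id(current))
        if msg in str(current):
            return True
        # Follow both explicit (`raise ... from`) and implicit chaining.
        stack.append(current.__cause__)
        stack.append(current.__context__)
    return False


def start_unsupported_query():
    """Hypothetical stand-in for starting an RTM query on an unsupported source."""
    try:
        raise ValueError("STREAMING_REAL_TIME_MODE.INPUT_STREAM_NOT_SUPPORTED")
    except ValueError as inner:
        # The error class is buried one level down, as it might be in a wrapped error.
        raise RuntimeError("query failed to start") from inner
```

Searching the whole chain rather than only the top-level message keeps such a test robust when the error is wrapped by another exception before it reaches the caller.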
Was this patch authored or co-authored using generative AI tooling?
No