Potential incompatibility of graphar-pyspark with SparkConnect #366
SemyonSinchenko
started this conversation in
General
Replies: 2 comments 5 replies
-
Hi Sem, it seems that Spark Connect is the default in Databricks Runtime. Will it also become the default in Apache Spark? If so, libraries that rely on py4j-based Python bindings would break; that seems like a huge change for Apache Spark.
-
cc/ @lixueclaire, do you have any comments or foresight about the proposal?
-
Starting with version 3.4, Apache Spark introduced Spark Connect, a completely different way for Python to interact with the Scala side. It looks like the recent Databricks Runtime release (14.3 LTS) makes Spark Connect the default way of working, and people are already hitting issues in the Python bindings of Microsoft SynapseML: microsoft/SynapseML#2167
I guess that all libraries that provide a Scala core with Python bindings (via py4j) will face similar issues soon.
SynapseML is a well-known Spark extension, and it was one of my main inspirations when I worked on graphar-pyspark.
Today, while GraphAr is still so young, we have a chance to change everything. So I see rewriting graphar-pyspark from py4j to pure PySpark as an option to make it work with Spark Connect. I can do it, but I want to discuss it first. Maybe I'm missing something important.
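To make the proposal concrete, here is a rough sketch of the two binding styles. All names (`write_graph_py4j`, `GraphWriter`, the package path) are hypothetical and not the real graphar-pyspark API; the point is only the shape of the code, so nothing here needs a live cluster:

```python
# Style 1: py4j bridge -- reaches into the driver JVM through the gateway.
# This is the pattern that breaks under Spark Connect, because the client
# process has no JVM handle (spark._jvm does not exist there).
def write_graph_py4j(spark, df, path):
    # org.example.graphar.GraphWriter is a made-up Scala class for illustration
    jvm_writer = spark._jvm.org.example.graphar.GraphWriter()
    jvm_writer.write(df._jdf, path)  # df._jdf is also a py4j-only handle

# Style 2: pure PySpark -- only public DataFrame APIs. Spark Connect
# serializes these calls to the server, so the same code works in both
# classic and Connect modes.
def write_graph_pyspark(df, path):
    (df.write
       .format("parquet")   # GraphAr payloads are columnar/text formats anyway
       .mode("overwrite")
       .save(path))
```

The rewrite would essentially mean replacing every call of the first kind with an equivalent of the second kind.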
Information from the Databricks Runtime 14 release notes:
The most important point for us is that `_jvm` is no longer available.
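A minimal sketch of what that means in practice: a library can guard its py4j-backed code paths so they fail with a clear message instead of an `AttributeError`. The helper name and the two session stand-ins below are mocks for illustration, not real pyspark classes:

```python
def require_jvm(spark):
    """Return the py4j JVM view, or raise a helpful error when it is absent
    (e.g. under Spark Connect, where the client has no driver JVM handle)."""
    jvm = getattr(spark, "_jvm", None)
    if jvm is None:
        raise RuntimeError(
            "This API needs py4j (spark._jvm), which Spark Connect does not "
            "expose; a pure-PySpark code path is required instead."
        )
    return jvm

# Tiny stand-ins so both cases can be shown without a live cluster:
class ClassicSession:      # classic session exposes a _jvm attribute
    _jvm = object()

class ConnectSession:      # Connect-style session has no _jvm at all
    pass
```

With a guard like this, the py4j path keeps working on classic sessions, while Connect users get an actionable error instead of a crash deep inside the bindings.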