Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-51338][INFRA] Add automated CI build for connect-examples #50187

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

vicennial
Copy link
Contributor

What changes were proposed in this pull request?

Adds a build step for the connect-examples directory (specifically the server-library-example project) in the CI steps (build_and_test.yml)

Why are the changes needed?

No existing automated CI build atm

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI run in this PR

Was this patch authored or co-authored using generative AI tooling?

Generated-by: o3-mini-high

@github-actions github-actions bot added the INFRA label Mar 6, 2025
@hvanhovell
Copy link
Contributor

@HyukjinKwon can you take a look?

@github-actions github-actions bot added the BUILD label Mar 6, 2025
@vicennial vicennial marked this pull request as draft March 6, 2025 21:15
@@ -92,6 +92,7 @@ jobs:
pyspark_pandas_modules=`cd dev && python -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark-pandas')))"`
pyspark=`./dev/is-changed.py -m $pyspark_modules`
pandas=`./dev/is-changed.py -m $pyspark_pandas_modules`
connect_examples=`./dev/is-changed.py -m "connect-examples"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to run this with Java 21 (and other scheduled builds), we would need to add this in https://github.com/apache/spark/blob/master/.github/workflows/build_java21.yml too as an example, .e.g., "connect_examples": "true"

- name: Build server-library-example
run: |
cd connect-examples/server-library-example
mvn clean package
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use build/sbt instead? We use SBT in the PR builder

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good otherwise

@HyukjinKwon HyukjinKwon changed the title [SPARK-51338] Add automated CI build for connect-examples [SPARK-51338][INFRA] Add automated CI build for connect-examples Mar 6, 2025
@LuciferYang
Copy link
Contributor

I've noticed a rather peculiar issue here. It seems that the connect-examples project is dependent on a released version of Spark, which means we can only update to a new version after it has been officially released.

For example, when Spark 4.0.0 is released, the version of this project in the published v4.0.0 tag won't be Spark 4.0.0, but rather some previously released version. Am I right about this? If I'm wrong, please correct me.

If I'm right, doesn't that seem a bit strange? Therefore, I still believe that connect-examples module should be placed outside of the Spark Repo, so that we can update and release it in another codebase after Spark 4.0.0(Or other official versions) is out.

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.apache.connect.examples.serverlibrary</groupId>
<artifactId>spark-server-library-example</artifactId>
<version>1.0.0</version>
<packaging>pom</packaging>
<modules>
<module>common</module>
<module>server</module>
<module>client</module>
</modules>
<properties>
<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.binary>2.13</scala.binary>
<scala.version>2.13.15</scala.version>
<protobuf.version>3.25.4</protobuf.version>
<spark.version>4.0.0-preview2</spark.version>
</properties>
</project>

What do you think about this? @HyukjinKwon @hvanhovell @dongjoon-hyun @cloud-fan @srowen @yaooqinn

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should discuss #50187 (comment) clearly before merging

@srowen
Copy link
Member

srowen commented Mar 8, 2025

Can the examples module simply point to SNAPSHOT versions like everything else in the build? the main branch code is always pointing at unreleased code, but on release, those SNAPSHOT versions are changed to the current release version.

@LuciferYang
Copy link
Contributor

Can the examples module simply point to SNAPSHOT versions like everything else in the build? the main branch code is always pointing at unreleased code, but on release, those SNAPSHOT versions are changed to the current release version.

Yes, if it can be done this way, I think it's ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants