[#2750] feat(spark): Support Spark 4.1#2751
Merged
Merged
Conversation
3b37203 to
1fd752e
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2751 +/- ##
=========================================
Coverage 51.76% 51.77%
+ Complexity 3982 3979 -3
=========================================
Files 600 600
Lines 33228 33228
Branches 3141 3141
=========================================
+ Hits 17202 17203 +1
+ Misses 14910 14908 -2
- Partials 1116 1117 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add a `spark4.1` Maven profile that reuses the existing `client-spark/spark4`
module to compile and run against Spark 4.1.1.
Spark 4.1 introduced two binary-incompatible additions to APIs called from
this module:
- `MapStatus.apply` gained a trailing `checksumVal: Long` parameter.
- `ExternalSorter`'s constructor gained a trailing
`rowBasedChecksums: Array[RowBasedChecksum]` parameter.
Scala default parameters do not surface to Java callers, so a single
`Spark4Compat.java` cannot satisfy both 4.0.x and 4.1.x signatures. Two
parallel source roots ship a matching `Spark4Compat`:
- `src/main/java-spark4_0/` — 3-arg `mapStatus`, 5-arg `ExternalSorter` ctor
- `src/main/java-spark4_1/` — 4-arg `mapStatus` (checksumVal=0L),
6-arg `ExternalSorter` ctor (empty `RowBasedChecksum[]`)
`build-helper-maven-plugin` selects the source root via
`${spark4.compat.source.dir}`; the `spark4` profile keeps the 4.0 default,
the `spark4.1` profile overrides it to the 4.1 variant.
`RssShuffleReader` now calls `Spark4Compat.newExternalSorter(...)` and
`RssShuffleWriter` calls `Spark4Compat.mapStatus(...)`.
The `client-spark/extension` (Spark UI) module already uses
`scala-jakarta` under the `spark4` profile; Spark 4.1's `WebUIPage.render`
keeps the same `jakarta.servlet.http.HttpServletRequest` signature, so the
existing source root works for `spark4.1` as well.
CI:
- parallel.yml: add `profile: spark4.1, java-version: 17` to the matrix.
- sequential.yml: add an `Execute -Pspark4.1` step gated on java-version 17.
1fd752e to
54ad2ef
Compare
…g into body
Response decoders that wrap the frame ByteBuf into a NettyManagedBuffer
body relied on TransportFrameDecoder.shouldRelease() returning false to
transfer ownership of the single ref count from the frame to the body.
Netty 4.2's stricter PooledByteBuf lifecycle exposes this fragility:
RepartitionWithHadoopHybridStorageRssTest fails with
IllegalReferenceCountException: refCnt: 0 from
NettyManagedBuffer.nioByteBuffer when running under Spark 4.1 (which
ships with netty 4.2.7.Final).
Make ownership explicit: each body-wrapping decoder now calls
byteBuf.retain() so the body has its own independent ref count, and
TransportFrameDecoder.shouldRelease() always returns true. The frame
buffer's lifetime is owned by the decoder loop; the body's lifetime is
owned by the response handler.
Updated TransportFrameDecoderTest to match the new contract: every
decoded message expects shouldRelease() == true, and body-bearing
messages release the frame buffer first, then the body.
Verified locally:
mvn -Pspark4.1 -pl integration-test/spark-common,integration-test/spark4 \
-Dtest=RepartitionWithHadoopHybridStorageRssTest test
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
….1.1 Spark 4.1.1 ships jackson-module-scala_2.13:2.20.0, which validates the classpath jackson-databind version on registration and rejects anything outside [2.20.0, 2.21.0). The spark4.1 profile previously pinned jackson-databind/jackson-core to 2.18.2, so RDDOperationScope's static initializer (which calls ObjectMapper.registerModule(DefaultScalaModule)) threw JsonMappingException, surfacing as ExceptionInInitializerError on the first test (AQERepartitionTest) and NoClassDefFoundError on every later test that touches RDDOperationScope (e.g. MapSideCombineTest via SparkContext.parallelize). Bump jackson.version to 2.20.0 so databind/core/module-scala stay in lockstep under -Pspark4.1.
Contributor
Author
zuston
approved these changes
May 25, 2026
Member
|
merged! thanks @LuciferYang |
Contributor
Author
|
Thank you @zuston |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Add a
spark4.1Maven profile that reuses the existingclient-spark/spark4module to compile and run against Spark 4.1.1.API shims
Spark 4.1 introduced two binary-incompatible additions to APIs called from this module:
org.apache.spark.scheduler.MapStatus.apply— gained a trailingchecksumVal: Longparameter.org.apache.spark.util.collection.ExternalSorter— constructor gained a trailingrowBasedChecksums: Array[RowBasedChecksum]parameter.Scala default parameters do not surface to Java callers, so a single
Spark4Compat.javacannot satisfy both 4.0.x and 4.1.x signatures. Two parallel source roots ship a matchingSpark4Compat:src/main/java-spark4_0/— 3-argmapStatus, 5-argExternalSorterctor (current Spark 4.0 shape).src/main/java-spark4_1/— 4-argmapStatus(checksumVal = 0L), 6-argExternalSorterctor (emptyRowBasedChecksum[]).build-helper-maven-pluginselects the source root via${spark4.compat.source.dir}; the existingspark4profile keeps the 4.0 default, the newspark4.1profile overrides it to the 4.1 variant.RssShuffleReadernow callsSpark4Compat.newExternalSorter(...)andRssShuffleWritercallsSpark4Compat.mapStatus(...)instead of constructing those Spark types directly.The
client-spark/extension(Spark UI) module already usesscala-jakartaunder thespark4profile; Spark 4.1'sWebUIPage.renderkeeps the samejakarta.servlet.http.HttpServletRequestsignature, so the existing source root works forspark4.1as well — only aspark4.1profile body that mirrors thespark4one is added.spark4andspark4.1are mutually exclusive at build time; they share a Maven coordinate but produce incompatible bytecode against different Spark majors. Runmvn cleanbetween profile switches locally.Runtime dependency adjustments under
-Pspark4.1Spark 4.1.1 bumped several runtime libraries; the
spark4.1profile pins matching versions and the code makes the necessary lifecycle adjustments:NettyUtils.createEventLoopreferencesio.netty.channel.nio.NioIoHandler(new in Netty 4.2). Netty 4.2 also tightensPooledByteBuf.nioBuffer()to requirerefCnt > 0. Response decoders that wrap the frame buffer in aNettyManagedBufferbody now callretain()explicitly, so the body's reference count is independent of the frame's.TransportFrameDecoder.shouldReleaseis simplified to always return true, andTransportFrameDecoderTestis updated accordingly. Netty 4.1 silently tolerated the previous shape; 4.2 throwsIllegalReferenceCountException.JettyServer.createThreadPoolis already source-compatible with both lines.jackson-module-scala_2.13:2.20.0, which validates the classpathjackson-databindversion on registration and rejects anything outside[2.20.0, 2.21.0). Without the bump,RDDOperationScope's static initializer (ObjectMapper.registerModule(DefaultScalaModule)) throwsJsonMappingException, surfacing asExceptionInInitializerErroron the first SQL/RDD-touching test andNoClassDefFoundErroron every later one.These adjustments live entirely inside the
spark4.1profile and the response-decoder ref-count fix; the default build and-Pspark4(Spark 4.0.2) paths are unchanged.Why are the changes needed?
Following #1805 (Spark 4.0.2 support, now closed), users on Spark 4.1 currently cannot link Uniffle's Spark 4 client because of the API additions above. Closes #2750.
Does this PR introduce any user-facing change?
No (build / packaging only). A new
-Pspark4.1build flag is available; default builds and the existing-Pspark4flag are unchanged.How was this patch tested?
mvn clean package -Pspark4.1 -DskipTestsagainst Spark 4.1.1 — passes.mvn clean package -Pspark4 -DskipTestsagainst Spark 4.0.2 — still passes.Spark4Compat.classdiffers across the two profiles (different bytecode size / arity), confirming the multi-source-root selection works.javap -pon Spark 4.0.2 / 4.1.1 jars from Maven Central match the shim implementations.mvn -Pspark4.1 -pl integration-test/spark4 -am test—AQERepartitionTestandMapSideCombineTestpass after the netty refCnt + jackson alignment fixes.TransportFrameDecoderTestpasses against Netty 4.2 (-Pspark4.1) and continues to pass on the default Netty 4.1 line.parallel.ymlnow runs-Pspark4.1(java-version 17) in the matrix;sequential.ymladds anExecute -Pspark4.1step gated on java-version 17. Latest run on this branch (26386496434) is all-green across the full 45-job matrix.