[SPARK-53176][DEPLOY] Spark launcher should respect --load-spark-defaults
#51905
Conversation
```diff
@@ -369,8 +372,13 @@ private void testCmdBuilder(boolean isDriver, File propertiesFile) throws Exception
   String[] cp = findArgValue(cmd, "-cp").split(Pattern.quote(File.pathSeparator));
   if (isDriver) {
     assertTrue(contains("/driver", cp), "Driver classpath should contain provided entry.");
+    if (propertiesFile == null || loadSparkDefaults) {
+      assertTrue(contains("/driver-default", cp),
+        "Driver classpath should contain provided entry.");
```
this assertion fails without this patch.
LGTM, thanks!
launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java
```java
for (Map.Entry<Object, Object> entry : props.entrySet()) {
  if (!defaultsProps.containsKey(entry.getKey())) {
    defaultsProps.put(entry.getKey(), entry.getValue());
  }
}
```
Hmm? I think the logic should be that if the user-specified prop file contains any entries that also appear in the defaults, they must overwrite the defaults. But here it looks like user-specified entries are only added when they are not already in the default prop file? So it is a merge instead of an overwrite? Is that correct?
Oops, you are right, will fix soon.
@viirya I fixed the behavior, and also refactored the UT to verify the priority of the configs (see the sketch after this list):

- `--conf` first,
- then the user-specified properties file,
- then `spark-defaults.conf`, if it exists.
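
A minimal sketch of the merge that this precedence implies, reusing the names `props` (the user-specified file) and `defaultsProps` (loaded from `spark-defaults.conf`) from the snippet above. This illustrates the intended behavior, not the exact patched code:

```java
import java.util.Properties;

final class MergeSketch {
  // Sketch only: the user-specified file must win over spark-defaults.conf,
  // so copy the defaults in first and let the user file overwrite collisions
  // (the reverse of the buggy loop quoted earlier in this thread).
  static Properties mergeUserPropsOverDefaults(Properties props, Properties defaultsProps) {
    Properties merged = new Properties();
    merged.putAll(defaultsProps); // lowest priority: spark-defaults.conf
    merged.putAll(props);         // user-specified file overwrites defaults
    return merged;                // --conf entries are applied on top elsewhere
  }
}
```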
```java
    }
  }
  if (loadSparkDefaults) {
    assertEquals("/driver", effectiveConfig.get(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
```
Only the default props contain `DRIVER_EXTRA_CLASSPATH`?
Yes, I reuse the existing `spark-defaults.conf`:
https://github.com/apache/spark/blob/master/launcher/src/test/resources/spark-defaults.conf
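
Given the assertion above, that test resource presumably carries an entry along these lines (an illustrative guess; see the linked file for the actual contents):

```
spark.driver.extraClassPath=/driver
```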
```java
private void doTestGetEffectiveConfig(
    File propertiesFile, boolean loadSparkDefaults, boolean confDriverMemory) throws Exception {
```
Does this test use `loadSparkDefaults` to conditionally check something? Seems no?
used at line 118
```java
      assertFalse(effectiveConfig.containsKey(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
    }
  } else {
    assertEquals("/driver", effectiveConfig.get(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
```
Does this require `loadSparkDefaults` to be true?
It's only present in `spark-defaults.conf`.
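
Piecing the quoted fragments together, the assertion structure is presumably along these lines (a reconstruction from the snippets in this thread, with names like `builder` assumed; not the verbatim test):

```java
// Reconstructed shape (assumed surrounding code): how loadSparkDefaults and
// the presence of a user properties file gate the DRIVER_EXTRA_CLASSPATH check.
Map<String, String> effectiveConfig = builder.getEffectiveConfig();
if (propertiesFile != null) {
  if (loadSparkDefaults) {
    // spark-defaults.conf is merged in, so its entry must be visible.
    assertEquals("/driver", effectiveConfig.get(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
  } else {
    // Only the user-specified file is read; the defaults-only key is absent.
    assertFalse(effectiveConfig.containsKey(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
  }
} else {
  // No user file: spark-defaults.conf is loaded as usual.
  assertEquals("/driver", effectiveConfig.get(SparkLauncher.DRIVER_EXTRA_CLASSPATH));
}
```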
@viirya could you take another look?

I will take another look today. Thanks.

Thanks for waiting @pan3793. Looks good to me.

@sunchao Do you want to take another look at the change after you approved?

Thanks @viirya. It looks good to me too!
[SPARK-53176][DEPLOY] Spark launcher should respect `--load-spark-defaults`

### What changes were proposed in this pull request?

SPARK-48392 introduced `--load-spark-defaults`, but it is not applied correctly by the Spark launcher process; this mainly affects the driver when Spark runs in local/client mode. Let's say we have:

```
$ cat > conf/spark-defaults.conf <<EOF
spark.driver.memory=4g
EOF
$ cat > conf/spark-local.conf <<EOF
spark.master=local[4]
EOF
```

```
$ bin/spark-shell --properties-file conf/spark-local.conf --load-spark-defaults
...
scala> spark.sql("SET spark.driver.memory").show()
+-------------------+-----+
|                key|value|
+-------------------+-----+
|spark.driver.memory|   4g|
+-------------------+-----+
```

Even though the Spark conf reports that the driver uses 4 GiB of heap memory, checking the Java process shows the config does not actually take effect; the default 1 GiB is used instead.

```
$ jinfo <spark-submit-pid>
...
VM Arguments:
jvm_args: -Dscala.usejavacp=true -Xmx1g ...
```

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

UT is modified to cover the change, plus manual tests for the above cases.

```
$ jinfo <spark-submit-pid>
...
VM Arguments:
jvm_args: -Dscala.usejavacp=true -Xmx4g ...
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51905 from pan3793/SPARK-53176.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Liang-Chi Hsieh <[email protected]>
(cherry picked from commit b82957c)
Signed-off-by: Liang-Chi Hsieh <[email protected]>
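
To make the failure mode concrete: it is the launcher process, not Spark itself, that turns `spark.driver.memory` into the driver JVM's `-Xmx` flag, so the launcher must see the merged config too. A minimal sketch of why the bug surfaces as `-Xmx1g` (names like `DEFAULT_MEM` are assumptions for illustration; see `AbstractCommandBuilder` for the real code):

```java
import java.util.Map;

final class DriverMemorySketch {
  static final String DEFAULT_MEM = "1g"; // assumed launcher-side fallback

  // The launcher builds the driver JVM command from its *own* view of the
  // config. Before this patch, that view ignored spark-defaults.conf when
  // --properties-file was given, even with --load-spark-defaults, so
  // spark.driver.memory=4g from the defaults file never reached -Xmx.
  static String xmxFlag(Map<String, String> effectiveConfig) {
    String mem = effectiveConfig.getOrDefault("spark.driver.memory", DEFAULT_MEM);
    return "-Xmx" + mem;
  }
}
```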