-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YARN-11708: Setting maximum-application-lifetime using AQCv2 template… #7041
base: trunk
Are you sure you want to change the base?
Conversation
…s doesn't apply on the first submitted app
if (queue == null) { | ||
QueuePath queuePath = new QueuePath(queueName); | ||
|
||
writeLock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This write lock should be before the get queue call, to properly handle the race condition, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing, yes it should be.
try { | ||
queue = queueManager.createQueue(queuePath); | ||
} catch (YarnException | IOException e) { | ||
LOG.error("Failed to create queue '{}': ", queueName, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG.error("Failed to create queue '{}': ", queueName, e); | |
LOG.error("Failed to create queue " + queueName, e); |
Because the current log line wont log the exception, cause will think that is the 2nd param in the message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @susheelgupta7 ! LGTM
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Thanks for the update @susheelgupta7 ! May i ask you to fill the PR description and "How was this patch tested?" parts? |
if (queue == null) { | ||
String message; | ||
if (isAmbiguous(queueName)) { | ||
message = "Application " + app.getApplicationId() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we have some code duplication. We can add the
"Application " + app.getApplicationId() + " submitted by user " + app.getUser()
part to the line 3385
@@ -3366,14 +3367,47 @@ public boolean moveReservedContainer(RMContainer toBeMovedContainer, | |||
|
|||
@Override | |||
public long checkAndGetApplicationLifetime(String queueName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead of add this create new queue logic to the method we could make the getOrCreateQueueFromPlacementContext public and call it before we call the checkAndGetApplicationLifetime method. If i see well this checkAndGetApplicationLifetime used just once in the code
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @susheelgupta7 ! LGTM!
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @susheel-gupta for the patch, I had some comments.
@@ -1566,6 +1567,14 @@ public long checkAndGetApplicationLifetime(String queueName, long lifetime) { | |||
return lifetime; | |||
} | |||
|
|||
@Override | |||
public CSQueue getOrCreateQueueFromPlacementContext(ApplicationId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't return a CSQueue, as Fair Scheduler or the Fifo Scheduler do not utilize Capacity Scheduler queues.
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer; | ||
import org.apache.hadoop.yarn.exceptions.YarnException; | ||
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YarnScheduler should not rely on one of its implementations' utility classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. Yes FairScheduler has its own FSQueue class, which is specific to its design.
🎊 +1 overall
This message was automatically generated. |
…s doesn't apply on the first submitted app
Description of PR
Setting the maximum-application-lifetime property using AQC v2 templates (yarn.scheduler.capacity.root.test.auto-queue-creation-v2.template.maximum-application-lifetime) doesn't apply to the first submitted application (through which the queue is created), only to the subsequent ones. It should apply to the first application as well. So added a interface to check and create auto-queues if it doesn't exist.
How was this patch tested?
Queue present:
root.test
root.default
root.sample.test
root.queuetest
AQCv2 enabled, maxAppLifetime and defaultAppLifetime are set to 8 sec for "root.test".
Scenarios:
root.test
root.test.user
root.test.alice
root.test.alice.bob
root.sample.test
root.default
root.test.alice.bob.notaqc
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?