-
Notifications
You must be signed in to change notification settings - Fork 30
fix: Add timeout for Nuke jobs on EnvEnter/Exit #201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mainline
Are you sure you want to change the base?
fix: Add timeout for Nuke jobs on EnvEnter/Exit #201
Conversation
Signed-off-by: Marin Maksutaj <[email protected]>
|
| - file://{{ Env.File.initData }} | ||
| cancelation: | ||
| mode: NOTIFY_THEN_TERMINATE | ||
| timeout: 600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this interact with the existing settings for Nuke timeouts? https://github.com/aws-deadline/deadline-cloud-for-nuke/blob/mainline/src/deadline/nuke_submitter/deadline_submitter_for_nuke.py#L59
If a user specifies no timeouts, do they now get a default of 10 minutes for the env enter? Related, but I believe if not specified then EnvExit will now have a default due to changes in OpenJobDescription.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought was that by default now it would've been set to 10 minutes (time which I based it off from the other DCCs, for example Cinema4D). Anywhere else we are setting this value that I'm not aware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you mean now, there are default values on the GUI toggles themselves, and those would overwrite the default values on the template.. In that case we could always adjust those values to 10 minutes as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely try out the existing timeout controls and see what behaviour we see when we change values, or choose to not use timeouts. What we end up doing should be clear and expected by the users.
| - file://{{ Env.File.initData }} | ||
| cancelation: | ||
| mode: NOTIFY_THEN_TERMINATE | ||
| timeout: 600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does this action take normally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 minutes seems like a generous number, not too sure how long it actually takes, but cinema4D has the same value specified for both actions.



What was the problem/requirement? (What/Why)
The Nuke adaptor's environment enter and exit actions did not have timeout values specified in the job template. This could lead to situations where if the environment enter or exit gets stuck (and the first layer of timeouts doesn't work), the worker agent would not enforce any timeout.
What was the solution? (How)
Added timeout values of 600 seconds (10 minutes) to both the environment enter and exit actions in the Nuke job template. This provides enough time for the server and Nuke to start/stop with an additional buffer, similar to the approach used in other packages like Cinema 4D.
What is the impact of this change?
This change improves the reliability of Nuke jobs by ensuring that environment enter and exit actions have proper timeout handling. If these actions get stuck, the worker agent will now enforce the timeout and prevent jobs from hanging indefinitely.
How was this change tested?
The change was tested by running the formatting and linting checks, which passed successfully. The timeout values are consistent with the approach used in other packages (e.g., Cinema 4D).
Was this change documented?
No additional documentation was needed as this is an internal implementation detail that doesn't affect the user-facing API or behavior. The timeout values are set to reasonable defaults that don't require user configuration.
Is this a breaking change?
No, this is not a breaking change. It adds timeout values where they were previously missing but doesn't change any existing functionality or interfaces.