Overall workflow state / completed state #5701
Comments
For sub-workflows, we can currently use the workflow's exit code, which kinda works; however, with this it is hard to tell the difference between a stopped workflow and a completed workflow. We could add a new top-level workflow status for "completed" workflows. Currently this state can be effectively detected by querying the task-pool table in the database: if there are no entries, then the workflow has completed.
My sub-workflow example notes this, and addresses it by having the sub-workflow launch script (for the launcher task in the main workflow) check the DB for completion of a known final task in the sub-workflow.
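For anyone reading along, a minimal sketch of that kind of check (the workflow ID, task name, cycle point, and database path below are illustrative assumptions based on the Cylc 8 public database layout, not the actual example script; verify the `task_states` table and `log/db` path against your installation):

```bash
#!/usr/bin/env bash
# Sketch only: poll the sub-workflow's public database until a known
# final task reports "succeeded". All names are hypothetical placeholders.
DB="$HOME/cylc-run/sub_workflow/runN/log/db"
while true; do
    status="$(sqlite3 "$DB" \
        "SELECT status FROM task_states
         WHERE name='final_task' AND cycle='1';")"
    [[ "$status" == "succeeded" ]] && break
    sleep 10
done
```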
However, your suggestion to use the task pool table is an improvement 🎉 I'll amend my example and alert the couple of NIWA teams with sub-workflow use-cases. Also, a new top-level workflow status for "completed" is a good idea.
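For reference, the task-pool variant of the check could look something like this (again a sketch, assuming a `task_pool` table in the same public database):

```bash
#!/usr/bin/env bash
# Sketch only: an empty task pool means the workflow has run to completion.
DB="$HOME/cylc-run/sub_workflow/runN/log/db"
count="$(sqlite3 "$DB" "SELECT COUNT(*) FROM task_pool;")"
if [[ "$count" -eq 0 ]]; then
    echo "sub-workflow completed"
fi
```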
It would be a good idea to make accessing the "complete" status as easy as possible, as this is something that tools like … would need. Ideally we wouldn't need to go to the database at all (managing database connections is a hassle), perhaps a …
See also cylc/cylc-uiserver#618
Problem
Cylc has a clear concept of task and job states, but less so when it comes to the overall workflow state. For example, once the workflow has stopped, there is no easy way to tell the underlying reason without digging through the logs or database. In particular, for non-cycling workflows, or ones with a finite number of cycles, it would be useful to easily tell apart normal termination (the workflow reached and completed the final cycle) from abnormal termination (stall, server crash, ...). Chatting to @oliver-sanders about it, this also seems to be a prerequisite for properly supporting a sub-workflow as a task in the future (I couldn't find a specific issue for it).
Proposed Solution
A possible solution could be to add a workflow-wide status file, akin to `job.status`, that can be scanned for and interrogated for information.
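For illustration, such a file might mirror the key=value layout of the existing `job.status` files; every key below is hypothetical, none of them currently exists:

```
# Hypothetical workflow-level status file -- illustrative keys only:
CYLC_WORKFLOW_STATUS=completed        # or: stopped | stalled | crashed
CYLC_WORKFLOW_STOP_TIME=2023-09-01T12:00:00Z
CYLC_WORKFLOW_FINAL_CYCLE_POINT=3
```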