Make `cylc remove` flow-aware and extend to historical tasks #6370
3b5fab1 to ef5ea12
Code looks good!
ef5ea12 to 4f3aabe
4f3aabe to b6b3cc0
Have been trying this out for sub-graph re-run use cases. The remove functionality is all working correctly 👍, I am able to re-run sub-graphs cleanly without using new flows 🚀. I have encountered some hitches (not related to this PR, all for consideration in follow-on work):
5c362a0 to 40045e3
fb49bcf to 8f72cb4
What's the status on these TODO items from the OP:
Will either come in a follow-up PR or this one depending on how soon @hjoliver reviews this
- Update data store with changed prereqs
- Don't un-queue downstream task if:
  - the task is already preparing
  - the task exists in flows other than that being removed
  - the task's prereqs are still satisfied overall
- Remove the downstream task from the pool if it no longer has any satisfied prerequisite tasks
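The downstream-task conditions in the TODO list above can be read as a small decision procedure. The following is only a sketch of that reading: `DownstreamTask`, its fields, and both function names are hypothetical and are not taken from cylc-flow's task pool code.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DownstreamTask:
    """Hypothetical stand-in for a pool task (not a cylc-flow class)."""
    flow_nums: Set[int]
    is_preparing: bool = False
    # True if the task's prerequisites are still satisfied overall.
    prereqs_satisfied: bool = False
    # IDs of prerequisite tasks that are still satisfied.
    satisfied_prereq_tasks: Set[str] = field(default_factory=set)


def should_unqueue(task: DownstreamTask, removed_flows: Set[int]) -> bool:
    """Apply the three "don't un-queue" conditions from the TODO list."""
    if task.is_preparing:
        return False
    if task.flow_nums - removed_flows:
        # The task also exists in flows other than those being removed.
        return False
    if task.prereqs_satisfied:
        return False
    return True


def should_remove_from_pool(task: DownstreamTask) -> bool:
    """Remove the task only if no prerequisite task remains satisfied."""
    return not task.satisfied_prereq_tasks
```

For example, a task in flows 1 and 2 is left queued when only flow 1 is removed, because it still belongs to flow 2.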
f6a7d29 to 39be53f
Getting in review mode here - sorry for the delay!
-    def get_resolved_dependencies(self) -> List[str]:
+    def get_satisfied_dependencies(self) -> List[str]:
[Vaguely interesting historical aside: I think use of "resolved" here goes all the way back to Cylc 2, when we first got the ability to visualize a graph - not from the workflow definition because there was no workflow definition, just a bunch of task definition files with inputs and outputs - from dependencies "resolved" at runtime.]
NOTE: If `flow_nums` is empty, it means 'all', whereas
if `self.flow_nums` is empty, it means this task is in the 'none' flow
OK nice way to handle that.
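To illustrate the two meanings of "empty" described in the NOTE, here is a minimal sketch. The function `flows_to_remove` and its parameter names are hypothetical, chosen for illustration; this is not the method the NOTE is attached to.

```python
from typing import Set


def flows_to_remove(task_flow_nums: Set[int], flow_nums: Set[int]) -> Set[int]:
    """Return the flow numbers to strip from a task.

    An empty `flow_nums` request means "all flows the task is in".
    An empty `task_flow_nums` means the task is in the 'none' flow,
    so there is nothing to strip.
    """
    if not flow_nums:
        # Empty request: interpreted as 'all'.
        return set(task_flow_nums)
    # Otherwise only flows the task actually belongs to can be removed.
    return task_flow_nums & flow_nums
```

With this reading, `flows_to_remove({1, 2}, set())` selects both flows, while `flows_to_remove(set(), {1})` selects nothing because the task is already a no-flow task.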
"""Return True if any of this task's prerequisite tasks are
satisfied."""
-    """Return True if any of this task's prerequisite tasks are
-    satisfied."""
+    """Return True if any of this task's prerequisites are satisfied."""
Doesn't fit
(my main point was the wording ... I'll transfer to the superseding PR)
# we can infer it was deliberately removed, so don't respawn it.
# we can infer it has just been deliberately removed (N.B. not
# by `cylc remove`), so don't immediately respawn it.
# TODO (follow-up work):
# - this logic fails if task removed after some outputs completed
# - this is does not conform to future "cylc remove" flow-erasure
# behaviour which would result in respawning of the removed task
# See github.com/cylc/cylc-flow/pull/6186/#discussion_r1669727292
LOG.debug(f"Not respawning {point}/{name} - task was removed")
I think I wrote this comment myself, some time ago. Presumably it does not refer to prevention of "conditional respawning" because those tasks would have been removed with completed outputs. The only other non-cylc-remove case is to prevent respawning of a task removed by suicide trigger, I think. Can we adapt the new machinery on this branch to handle that properly too - i.e. record in the DB that the task was removed by suicide trigger, to avoid having to imperfectly infer it? Could be follow-up work of course, if I'm not barking up the wrong tree.
(I'll transfer this to the superseding PR...)
I've been through almost all of the non-test code, looks good. I still need to read the task_pool changes more carefully and do some testing.
@MetRonnie - is this still in draft because of the two "what's left to do" items at the top?
Yes, because Oliver wanted to wait for the whole proposal to be addressed before merging. (FYI I'm going to close this PR and open a new one to stop GitHub collapsing recent review comments)
Superseded by #6472
Note
Superseded by #6472
Partially addresses #5643
Summary
This mostly implements the "Cylc Remove Extension" proposal.
Flow numbers
`cylc remove` now has a `--flow` option for removing a task from specific flows. If not used, it will remove the task from all flows that it belongs to.

If the removed task is active/waiting: if it is removed from a subset of flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).

If a task is removed from all flows that it belongs to, it will become a no-flow task (`flow=None`).

For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums [1].
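The pool behaviour described above can be sketched with plain sets. This is an illustrative model only: the `pool` dictionary and the function `remove_from_flows` are hypothetical and do not mirror cylc-flow's task pool API.

```python
from typing import Dict, Set


def remove_from_flows(
    pool: Dict[str, Set[int]], task_id: str, flows: Set[int]
) -> Set[int]:
    """Remove `task_id` from `flows`; an empty `flows` means all its flows.

    Returns the task's remaining flow numbers. An empty result means the
    task is now a no-flow task and has been dropped from the pool.
    """
    current = pool[task_id]
    remaining = current - (flows or current)
    if remaining:
        # Removed from a subset of its flows: the task stays in the pool.
        pool[task_id] = remaining
    else:
        # Removed from all of its flows: the task leaves the pool.
        del pool[task_id]
    return remaining


pool = {"1/foo": {1, 2}}
after_subset = remove_from_flows(pool, "1/foo", {1})   # stays, in flow 2
after_all = remove_from_flows(pool, "1/foo", set())    # leaves the pool
```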
Historical tasks
`cylc remove` can now remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the `task_states` and `task_outputs` tables) [2], and un-setting any prerequisites of active tasks that the removed task had naturally satisfied [3]. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.

The above also applies to active/waiting tasks that `cylc remove` is used on.

What's left to do
- When removing an active task from all its flows, kill the task.
- Should probably add a functional test with the `--flow` option.

Need to check this one:

Check List
- `CONTRIBUTING.md` and added my name as a Code Contributor.
- `?.?.x` branch.

Footnotes
1. Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.
2. If removing flows would result in two rows in the DB no longer being unique, the SQLite `UPDATE OR REPLACE` statement is used, so the first entry will be removed and the most recent entry will remain.
3. Prerequisites manually satisfied by `cylc set --pre` are not affected by `cylc remove`.
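The `UPDATE OR REPLACE` behaviour mentioned in the footnote on database uniqueness can be demonstrated with Python's built-in `sqlite3` module. The table below is a simplified, assumed schema for illustration (four columns, with flow numbers stored as a text string); it is not the actual cylc-flow database schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: one row per (name, cycle, flow_nums).
conn.execute(
    "CREATE TABLE task_states ("
    " name TEXT, cycle TEXT, flow_nums TEXT, status TEXT,"
    " PRIMARY KEY (name, cycle, flow_nums))"
)
conn.executemany(
    "INSERT INTO task_states VALUES (?, ?, ?, ?)",
    [
        ("foo", "1", "[1]", "failed"),        # older entry, flow 1 only
        ("foo", "1", "[1, 2]", "succeeded"),  # newer entry, flows 1 and 2
    ],
)
# Removing flow 2 from the newer row makes its key collide with the older
# row. With OR REPLACE, SQLite deletes the pre-existing conflicting row
# instead of raising an integrity error.
conn.execute(
    "UPDATE OR REPLACE task_states SET flow_nums = ? WHERE flow_nums = ?",
    ("[1]", "[1, 2]"),
)
rows = conn.execute(
    "SELECT name, cycle, flow_nums, status FROM task_states"
).fetchall()
```

After the update a single row remains, carrying the status of the row that was updated; a plain `UPDATE` in the same situation would fail with a uniqueness error.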