CWL job chaining tests #3696
The md_list_reduced*cwl workflows are added. md_list_reduced_2nests.cwl will chain 2 jobs, while md_list_reduced.cwl (currently) chains none. The test uses the stats logfiles (so toilcwl must be run with --stats and --logDebug), pulling out the lists of CWL jobs that are run in each Toil task.
@douglowe Are both tests supposed to pass?
The second test should fail. Really I should have made the test not equal, but for this prototype I wanted it to display what was being tested, so made it fail the test, rather than testing for failure.
Okay, I added 72df69d so your new test would get run as part of the CI
The parent thread now pickles toilState and passes it to the worker thread, where it is unpickled again. A check for completed predecessors, for jobs with multiple predecessors, has been added to the worker, so that some basic job chaining can be performed for jobs whose remaining predecessor completes within that task. A side effect is that the parent thread no longer correctly records that jobs with multiple predecessors have been completed, so some of the checks at the end of the internal job routine have had to be switched off for the moment.
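A minimal sketch of the idea described above, not the PR's actual code. A plain dictionary stands in for the real toilState, and `finish_and_check` is an invented helper; both are assumptions for illustration. It shows state being pickled by the parent, unpickled in the worker, and a successor with several predecessors becoming chainable once its last outstanding predecessor finishes in the same task.

```python
import pickle
from typing import Dict, Set

# Hypothetical stand-in for toilState: successor ID -> unfinished predecessor IDs.
PendingMap = Dict[str, Set[str]]


def finish_and_check(pending: PendingMap, finished_job: str, successor: str) -> bool:
    """Mark finished_job as done and report whether successor may now run."""
    remaining = pending.setdefault(successor, set())
    remaining.discard(finished_job)
    return not remaining


# Parent side: serialize the state before handing it to the worker.
parent_state: PendingMap = {"merge": {"stepA", "stepB"}}
payload = pickle.dumps(parent_state)

# Worker side: rebuild a private copy and update it as jobs finish.
worker_state: PendingMap = pickle.loads(payload)
first = finish_and_check(worker_state, "stepA", "merge")   # stepB still outstanding
second = finish_and_check(worker_state, "stepB", "merge")  # all predecessors done

# The parent's copy is untouched by the worker's updates, which mirrors the
# bookkeeping gap described in the comment above.
parent_unchanged = parent_state["merge"] == {"stepA", "stepB"}
```

Because the worker only ever modifies its own unpickled copy, anything it learns about completed predecessors has to be reported back to the parent by some other route.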
Some complaints from the type checker:
src/toil/worker.py:59: error: Function is missing a type annotation
src/toil/worker.py:101: error: Function is missing a type annotation for one or more arguments
src/toil/worker.py:185: error: Function is missing a type annotation for one or more arguments
src/toil/utils/toilDebugJob.py:49: error: Missing positional argument "parentToilState" in call to "workerScript"
Hello @douglowe
As of https://github.com/DataBiosphere/toil/pull/3776/files#diff-86e13719d8c065346eadea6d6e3735e45c07874bea3be19d5f7396a22858f18fL83 jobsToBeScheduledWithMultiplePredecessors is no longer a Dict but a Set, so I think your code needs some changes to catch up.
Just to see if the recent changes in Toil 5.5.0 fixed this, I ran the tests without the other code changes and
I pushed this to our repo as
successor = toilState.jobsToBeScheduledWithMultiplePredecessors[
    successor.jobStoreID
]
Hey @douglowe, it has been clarified that jobsToBeScheduledWithMultiplePredecessors is a set[str], not a dictionary. Could you refresh this test in light of changes made since you started this PR?
Thanks again!
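To illustrate why the subscript lookup quoted above stops working once the attribute is a set, here is a small sketch. The job IDs and the separate `job_info` mapping are hypothetical; only the `set[str]` type comes from the review comment.

```python
from typing import Dict, Optional, Set

# Per the review, the attribute now holds only job store IDs.
jobs_with_multiple_predecessors: Set[str] = {"job-42", "job-43"}

# Hypothetical separate lookup for whatever job details the caller needs.
job_info: Dict[str, str] = {"job-42": "merge step", "job-43": "report step"}

successor_id = "job-42"

# A set supports membership tests but not subscripting.
try:
    jobs_with_multiple_predecessors[successor_id]  # type: ignore[index]
    subscript_worked = True
except TypeError:
    subscript_worked = False

# Membership test replaces the old dictionary lookup.
is_waiting = successor_id in jobs_with_multiple_predecessors
description: Optional[str] = job_info[successor_id] if is_waiting else None
```

Any code that relied on the old dictionary values therefore needs another source for that job information, which is the difficulty raised in the replies that follow.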
Ahh - yes, I will correct that now!
Right - my brain is slowly getting back up to speed with this. Not quite the quick fix I was thinking of - as my code was using the dictionary of job information cached here to check what successors the current job had.
I think that a discussion will be needed with the Toil devs, to see where the main Toil development arc for the JobStore is heading. IIRC the intent was to have a mutable central JobStore database, which could be accessed & changed by all jobs (not just the parent job) - which would make the hashed jobstore that my hacky code passes around redundant.
Perhaps we should cut out the test code that I wrote (though this also needs some work, as the regex doesn't match the current output stored in the Stats directory), and save that, as it still shows the kind of workflow in which I'd hope full job chaining would occur. Then, separately, we can work on correcting the information flow and the logic for working out which jobs can be chained and which can't?
This is supposed to fix #3697.
@douglowe It seems like it might make sense for us to schedule a meeting with us and @DailyDreaming to talk about getting this feature actually merged, and how it changes the scheduling design. Is that something you would be interested in?
This PR is a prototype for the test suite to determine if Toil is chaining CWL jobs as we expect / would like.
I'm using a list of Toil tasks, each containing a list of the CWL jobs completed within that task, to test if chaining has occurred or not. The list of tasks is compiled from the stats outputs (requiring the --stats and --logDebug flags) - if a more robust method for getting this information could be suggested I'd be very happy to change this code.
To show the expected outputs I've included an example of two CWL workflows, and a test that one fails and the other passes (because of the subworkflow wrapping the first 2 CWL jobs). While we're working out the best way to write these tests I'll work on more substantial tests for the final setup.
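The task-list comparison described above could be sketched roughly as follows. This is only an illustration: the job names are hypothetical, `chained_groups` is an invented helper, and the step that parses the stats output into these lists is left out because it depends on the log format.

```python
from typing import List

TaskList = List[List[str]]  # one inner list of CWL job names per Toil task


def chained_groups(tasks: TaskList) -> TaskList:
    """Return only the tasks that ran more than one CWL job."""
    return [task for task in tasks if len(task) > 1]


# Hypothetical job names; the real ones come from the workflows under test.
unchained: TaskList = [["step1"], ["step2"], ["step3"]]
chained: TaskList = [["step1", "step2"], ["step3"]]

no_chain_result = chained_groups(unchained)
chain_result = chained_groups(chained)
```

A test can then assert on the exact grouping it expects for each workflow, rather than only on whether any chaining happened.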