Move file virtualization in toil-wdl-runner to task boundaries #5028

Merged 56 commits on Oct 1, 2024
56 commits
89c276a Defer file virtualization to task boundaries and consolidate decl eva… (stxue1, Jul 19, 2024)
c85f173 Implement support for importing relative URL paths; import files at s… (stxue1, Jul 24, 2024)
230efa0 Fix possible invalid lookup and don't import raw URLs (stxue1, Jul 24, 2024)
2955bc3 Merge branch 'master' of github.com:DataBiosphere/toil into issues/50… (stxue1, Jul 24, 2024)
066f780 Get rid of sentinel value implementation + drop files before virtuali… (stxue1, Jul 24, 2024)
d5365a2 Drop missing files during decl eval for outputs + Add a check for inv… (stxue1, Jul 25, 2024)
bbb098d Deal with mypy (stxue1, Jul 25, 2024)
1b6bd02 Don't drop unnecesssarily (stxue1, Jul 25, 2024)
f65d59c Switch to setattr implementation (stxue1, Aug 20, 2024)
be0203e Fix overwriting files (stxue1, Aug 20, 2024)
f60d475 Fix prototype implementation (stxue1, Aug 22, 2024)
552ac74 Merge branch 'master' of github.com:DataBiosphere/toil into issues/50… (stxue1, Aug 23, 2024)
8163c2c Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Aug 27, 2024)
4bbe63d Fix documentation (stxue1, Aug 23, 2024)
05dc240 Merge branch 'issues/5004-wdl-virtualize-only-at-task-boundaries' of … (stxue1, Aug 27, 2024)
a6b18c7 Resolve nested virtualize files issue by converting back to original … (stxue1, Aug 29, 2024)
94da9af Fix virtualization of URLs that aren't toil URIs (stxue1, Aug 29, 2024)
644fd51 Mypy (stxue1, Aug 29, 2024)
d5bedee Merge branch 'master' of github.com:DataBiosphere/toil into issues/50… (stxue1, Aug 29, 2024)
f8c0ef1 Remove documentation that no longer applies (stxue1, Aug 29, 2024)
6365b34 Fix merge conflict issue (stxue1, Aug 29, 2024)
0a35c53 Enable symlink conformance test (stxue1, Aug 29, 2024)
ce21051 whitespace (stxue1, Aug 29, 2024)
343d6d8 Fix virtualize/devirtualize_filename to be called only during command… (stxue1, Aug 31, 2024)
f5f3eea Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 3, 2024)
37efd55 Add documentation and make convert_files function import greedily (stxue1, Sep 6, 2024)
491ba1b Get rid of memoization as File is not hashable (I believe inside LRU) (stxue1, Sep 10, 2024)
8569a81 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 10, 2024)
731120d Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 10, 2024)
b25c80d Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 12, 2024)
ffb2e08 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 16, 2024)
7f17413 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 16, 2024)
82383ab Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 16, 2024)
a7292be Apply suggestions from code review (stxue1, Sep 17, 2024)
e6718cd Update src/toil/wdl/wdltoil.py (stxue1, Sep 18, 2024)
2ad130f Rename, add comments, remove unused code/comments (stxue1, Sep 18, 2024)
cb8b230 Merge branch 'issues/5004-wdl-virtualize-only-at-task-boundaries' of … (stxue1, Sep 18, 2024)
b0db027 Add comments and adjust wdl context usage (stxue1, Sep 18, 2024)
8c24d8f add namespace (stxue1, Sep 19, 2024)
666aef5 integrate namespace into wdl_context (stxue1, Sep 19, 2024)
0b6fb82 properly name wdl value bases (stxue1, Sep 19, 2024)
4d5ecae Remove irrelevant comment (stxue1, Sep 24, 2024)
b43d256 Adjust docstring formatting to remove RST warnings (adamnovak, Sep 24, 2024)
7a36b59 Adjust comment grammar (adamnovak, Sep 24, 2024)
96cde22 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 24, 2024)
8c42888 Remove disallowed backticks (adamnovak, Sep 24, 2024)
fec8663 Merge remote-tracking branch 'upstream/issues/5004-wdl-virtualize-onl… (adamnovak, Sep 24, 2024)
966d855 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 26, 2024)
dcf2f27 Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 26, 2024)
a318208 Extract out the import function to add back memoization and resolve i… (stxue1, Sep 27, 2024)
a545493 Merge remote-tracking branch 'origin/issues/5004-wdl-virtualize-only-… (stxue1, Sep 27, 2024)
3ece42b Merge branch 'master' of github.com:DataBiosphere/toil into issues/50… (stxue1, Sep 27, 2024)
f9db076 change wdl_context to wdlcontext (stxue1, Sep 27, 2024)
c16290f also change typeddict (stxue1, Sep 27, 2024)
cb0f986 Add some documentation for the WDLContext object (stxue1, Sep 27, 2024)
c7c69eb Merge master into issues/5004-wdl-virtualize-only-at-task-boundaries (github-actions[bot], Sep 30, 2024)
3 changes: 3 additions & 0 deletions src/toil/common.py
@@ -1121,6 +1121,9 @@ def _setupAutoDeployment(
             logger.debug('Injecting user script %s into batch system.', userScriptResource)
             self._batchSystem.setUserScript(userScriptResource)
 
+    def url_exists(self, src_uri: str) -> bool:
+        return self._jobStore.url_exists(self.normalize_uri(src_uri))
+
     # Importing a file with a shared file name returns None, but without one it
     # returns a file ID. Explain this to MyPy.

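Note on the `url_exists` addition above: the new method only normalizes the URI and then delegates the existence check to the job store. A minimal, self-contained sketch of that delegation pattern follows. `SketchToil` and `SketchJobStore` are hypothetical stand-ins, not Toil classes, and the `normalize_uri` shown here is an assumed simplification (bare POSIX paths become `file://` URIs) rather than Toil's actual implementation.

```python
import os
import tempfile
from urllib.parse import urlparse


class SketchJobStore:
    """Hypothetical stand-in for a job store; only understands file: URLs."""

    def url_exists(self, src_uri: str) -> bool:
        parsed = urlparse(src_uri)
        if parsed.scheme != "file":
            raise ValueError(f"Unsupported scheme in this sketch: {parsed.scheme!r}")
        return os.path.exists(parsed.path)


class SketchToil:
    """Hypothetical stand-in for the workflow context that owns a job store."""

    def __init__(self, job_store: SketchJobStore) -> None:
        self._jobStore = job_store

    @staticmethod
    def normalize_uri(uri: str) -> str:
        # Assumption: anything with a scheme passes through unchanged, and
        # bare POSIX paths are turned into absolute file: URIs.
        if urlparse(uri).scheme:
            return uri
        return "file://" + os.path.abspath(uri)

    def url_exists(self, src_uri: str) -> bool:
        # Same shape as the added method: normalize first, then delegate.
        return self._jobStore.url_exists(self.normalize_uri(src_uri))


def demo() -> tuple:
    """Check a temporary file by bare path, by URI, and after deletion."""
    toil = SketchToil(SketchJobStore())
    with tempfile.NamedTemporaryFile() as handle:
        by_path = toil.url_exists(handle.name)
        by_uri = toil.url_exists("file://" + handle.name)
    after_delete = toil.url_exists(handle.name)
    return by_path, by_uri, after_delete
```

Because normalization happens in one place, a caller can pass either a bare path or a full URI and get the same answer from the job store.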
2 changes: 1 addition & 1 deletion src/toil/jobStores/googleJobStore.py
@@ -383,7 +383,7 @@ def _get_blob_from_url(cls, url, exists=False):
 
         if exists:
             if not blob.exists():
-                raise NoSuchFileException
+                raise NoSuchFileException(fileName)
             # sync with cloud so info like size is available
             blob.reload()
         return blob
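Note on the `NoSuchFileException` change above: in Python, `raise SomeClass` instantiates the class with no arguments. If the exception's constructor requires a file identifier (assumed here; the diff does not show the constructor), the bare form surfaces as a `TypeError` instead of the intended exception. The sketch below illustrates this with a hypothetical `MissingFileError` class standing in for the real exception.

```python
class MissingFileError(Exception):
    """Hypothetical exception whose constructor requires a file name."""

    def __init__(self, file_name: str) -> None:
        super().__init__(f"File '{file_name}' does not exist.")


def raise_bare() -> None:
    # Raising the class itself makes Python call MissingFileError() with
    # no arguments, which fails before the exception is even constructed.
    raise MissingFileError


def raise_with_name(file_name: str) -> None:
    # Passing the name builds the exception as intended.
    raise MissingFileError(file_name)


def describe(func, *args) -> str:
    """Run func and report the type and message of whatever it raises."""
    try:
        func(*args)
    except Exception as error:
        return f"{type(error).__name__}: {error}"
    return "no error"
```

Calling `describe(raise_bare)` reports a `TypeError`, while `describe(raise_with_name, "data/reads.bam")` reports the intended error along with the missing file's name.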
5 changes: 2 additions & 3 deletions src/toil/test/wdl/wdltoil_test.py
@@ -6,7 +6,7 @@
 import subprocess
 import unittest
 from uuid import uuid4
-from typing import Optional
+from typing import Optional, Union
 
 from unittest.mock import patch
 from typing import Any, Dict, List, Set
@@ -49,11 +49,10 @@ def tearDown(self) -> None:
 WDL_CONFORMANCE_TEST_COMMIT = "2d617b703a33791f75f30a9db43c3740a499cd89"
 # These tests are known to require things not implemented by
 # Toil and will not be run in CI.
-WDL_CONFORMANCE_TESTS_UNSUPPORTED_BY_TOIL= [
+WDL_CONFORMANCE_TESTS_UNSUPPORTED_BY_TOIL = [
     16, # Basic object test (deprecated and removed in 1.1); MiniWDL and toil-wdl-runner do not support Objects, so this will fail if ran by them
     21, # Parser: expression placeholders in strings in conditional expressions in 1.0, Cromwell style; Fails with MiniWDL and toil-wdl-runner
     64, # Legacy test for as_map_as_input; It looks like MiniWDL does not have the function as_map()
-    72, # Symlink passthrough; see <https://github.com/DataBiosphere/toil/issues/5031>
     77, # Test that array cannot coerce to a string. WDL 1.1 does not allow compound types to coerce into a string. This should return a TypeError.
 ]