Releases: DataBiosphere/toil
5.6.0
Changelog
Highlighted Features Added
- Integrate ARM Docker builds to make multi-arch images. (#3802)
- WES support and server mode. (#3779)
- TES batch system prototype. (#3821)
- Support for new resource syntax in PBSPro. (#3048)
- Toil now looks for lost jobs every minute instead of every hour (#3948)
Breaking Changes
- Remove --disableCaching's true/false argument. (#3869)
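For scripts that used to pass a value to this option, the change presumably means it is now given as a bare flag. The sketch below uses a stand-alone argparse parser (not Toil's own option parser) to show how a bare flag behaves; the parser and variable names are hypothetical.

```python
import argparse

# Hypothetical stand-alone parser, for illustration only; this is not Toil's parser.
parser = argparse.ArgumentParser(prog="example-workflow")

# A bare flag: present means True, absent means False. No value follows it.
parser.add_argument("--disableCaching", action="store_true", default=False)

with_flag = parser.parse_args(["--disableCaching"])
without_flag = parser.parse_args([])

print(with_flag.disableCaching)     # True
print(without_flag.disableCaching)  # False
```

With a flag defined this way, an invocation such as `--disableCaching false` would be rejected by argparse as an unrecognized argument rather than being interpreted as a value.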
CWL
- CWL helper jobs: better disk & memory requirements. (#3834)
- CWL: safer test path generation for post-install testing. (#3818)
- More detailed names, which show up in job names sent to BatchSystem schedulers. #3941 (was #3893)
- If you use scatter and collect files, duplicates are correctly dealt with and renamed. (#3968)
- CWL: at the end of a job, ask cwltool to cleanup (#3965)
Kubernetes
Dependencies
- Enable Dependabot updates. (#3827)
- Multiple consolidated dependabot updates. (#3851)
- Update addict requirement from <2.3,>=2.2.1 to >=2.2.1,<2.5. (#3861)
- Bump cwltool to 3.1.20211107152837. (#3833 #3866 #3909)
- Bump cwltest from 2.1.20210626101542 to 2.2.20210901154959. (#3848)
- Bump flake8 from 3.8.4 to 4.0.1. (#3847)
- Allow more docker-py versions. (#3860)
- Remove pyyaml dependency. (#3858)
Misc
- Update cleanup script. (#3937)
- Remove use of sys.maxsize. (#3824)
- Spelling fixes. (#3814)
- Add codeql-analysis for Python. (#3825)
- Update jobstore function names. (#3809)
- Move AMI functions to lib. (#3810)
- Add "make pyupgrade" (py36-plus). (#3805)
- Type hints. (#3930)
- Coalesce status calls in slurm. (#3822)
- Python logging takes format values as *args. (#3852)
- Change quick test to a 10 minute timeout. (#3843)
- Add make uninstall to makefile. (#3883)
- Fix toil kill to find shared pid.log file (with unit test). #3941 (was #3932)
- Update documentation. (#3947)
- Remove remains of Travis. (#3976)
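The logging item above (#3852) appears to refer to the standard-library convention of passing format values to the logging call as arguments instead of pre-formatting the string. A minimal sketch of that convention follows; the logger name and the message contents are made up for illustration.

```python
import io
import logging

# Capture log output in memory so the example is self-contained.
stream = io.StringIO()
logger = logging.getLogger("example")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

job_name, attempts = "align_reads", 3

# Values are passed as *args; logging interpolates them only if the record is emitted.
logger.info("Job %s failed after %d attempts", job_name, attempts)

# DEBUG is below the configured level, so this message is never formatted or written.
logger.debug("Details for %s", job_name)

output = stream.getvalue()
print(output, end="")  # Job align_reads failed after 3 attempts
```

Deferring interpolation this way avoids building strings for messages that are filtered out by the log level.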
Bug Fixes
- Stop checkpoints from being reissued multiple times. (#3931)
- Don't consult LSF config when explicitly defining memory units. (#3820)
- Robustly remove state dirs. (#3836)
- Fix exception checking for exit_code. (#3830)
- Use exitStatus instead of exitReason for batch exit type comparison. (#3839)
- Update cwltest to improve K8 runs. (#3935)
- Consolidated CI Fixes. (#3887)
- Fix CWL conformance tests. (#3891)
- Toil-managed cluster scaling should work again with --metrics. (#3943)
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18, @Hexotical, @tmooney, @nikhil, @kannon92, @douglowe, @mhpopescu, @Phhere, and @gmloose!
5.5.0
Changelog
CWL
- Add podman support; and other fixes from recent cwltool #3799
- Add streaming feature for cwltoil #3694
- Warn users if a different cwltool version is installed #3686
- Turn on all Kubernetes CWL tests that are expected to work on Singularity #3720
- Fix CWL in toil docs jobstore usage #3728
- DOC: update versions of CWL support
- Allow filestore bypass #3652
Misc
- Numerous Type Hints. #3705 #3701 #3693 #3691 #3663 #3688 #3642 #3684 #3682 #3680 #3666 #3675
- Single source of truth for job state #3776
- Do not set default for statePollingWait #3774
- Use absolute local paths when exportFile/importFile do not detect a schema #3767
- Multi-zone balancing within regions for AWS autoscaling groups #3746
- Migrate cloud-config to ignition #3488
- 🎡 Wheel Of Issues 🎰 #3760
- Add back addBatchSystemFactory function #3754
- Redirect stderr to /dev/null of lsf conf queries #3751
- Set number of cores based on job.cores for OpenMP applications #3739
- Google jobstore batching #3740
- Locations of CLI option docs
- Add AWS provisioner storage system #3727
- Set cls.bucket #3726
- Stream vs download jobs #3722
- Update Toil's main python test version to 3.8. #3669
- Move Travis tests to Gitlab #3675
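The exportFile/importFile item above (#3767) describes treating a location with no URL scheme as a local path and making it absolute. A rough sketch of that idea, assuming nothing about Toil's actual code (the helper name `to_absolute_if_local` is hypothetical):

```python
import os
from urllib.parse import urlparse


def to_absolute_if_local(location: str) -> str:
    """Return URLs unchanged; turn scheme-less locations into absolute local paths."""
    if urlparse(location).scheme:
        # Something like s3://bucket/key or file:///data/in.txt: leave it alone.
        return location
    # No scheme detected: interpret as a local path relative to the current directory.
    return os.path.abspath(location)


print(to_absolute_if_local("s3://bucket/key"))  # s3://bucket/key
print(os.path.isabs(to_absolute_if_local("data/in.txt")))  # True
```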
Bug Fixes
- Don't leak symlinks #3795
- Prevent exception from being raised when modifying dir permissions for clean up #3778
- Fix scontrol output parsing #3793
- Fix AttributeError #3742
- Workaround for S3 in us-east-1 #3710
- Time data format #3708
- Fix leader.py batch system std files prefix glob #3679
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18, @Hexotical, @ionox0, @gmloose, @juanesarango, @mhpopescu, @mberacochea, @nikhil!
5.4.0
Changelog
CWL
- Fix cwl and wdl dependency bleed and add stand alone tests. #3582
- Use MpiConfig.load() to handle MPI config file #3574
- Add support for MPI with CWL. #3525
Misc
- Numerous Type Hints. #3571 #3634 #3626 #3625 #3616 #3614 #3601 #3592 #3590 #3581
- Configurable Grafana port #3597
- Handle streaming reads from the cache when the data isn't written back yet #3595
- Balance over pools using AWS ASGs #3490
- Change how Kubernetes schedules and scales for hopefully better scale-down behavior #3587
- Add Owner tags to our AWS buckets when testing. #3577
- Allow workdir override #3586
- Additional info for "Permission denied" error #3579
- Consolidate memory functions #3529
- Pen children #3482
- Sniff raid better #3526
- Add a propagation policy to batch delete #3522
- Cleanup script for buckets, sdb domains, instance profiles, and roles. #3373
- Add decorator for flaky tests. #3510
Bug Fixes
- Enforce valid uuid from getNodeId() #3611
- Removed extra memory multiplication #3608
- Fix non integers lsf memory requested #3609
- Fixes jobCommand unpack #3610
- Fix CWL test 20 on Kubernetes #3572
- Fix #3579 breaking AWS docs #3596
- Update cactus test to fix broken bucket links. #3594
- Allow cleaning up a job whose overlargeID fell off #3584
- Only deploy a user script if we can deploy user scripts #3518
- Reduce path name lengths. #3438
- Insist on credentials for testing AMI finding #3524
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18, @thiagogenez, @julian-klode, @darafferty, @nikhil, @mhpopescu!
5.3.0
Changelog
CWL
- Run CWL conformance tests on Kubernetes. #3323
- CWL symlinking files into work directory #3445
- set resource reqs for all CWL_INTERNAL_JOBS #3442
WDL
- Add WDL flatten() function + tests. #3485
- Add WDL collect_by_key() function + tests. #3476
- Add WDL keys() function + tests. #3460
- Add WDL as_pairs() function + tests. #3364
- Add as_map() WDL function #3448
- Add basic WDL development support #3434
- Add basic WDL 1.0 support #3421
- Update dictionary structure in AnalyzeWDL. #3416
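For readers unfamiliar with the WDL standard-library functions listed above, the following Python sketch approximates their semantics as described in the WDL specification, with WDL Pairs modeled as tuples and Maps as dicts. It is an illustration only, not Toil's implementation.

```python
from typing import Dict, Hashable, List, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")
X = TypeVar("X")


def flatten(arrays: List[List[X]]) -> List[X]:
    """Concatenate an array of arrays into a single array (one level only)."""
    return [item for inner in arrays for item in inner]


def keys(mapping: Dict[K, V]) -> List[K]:
    """Return the keys of a map, in insertion order."""
    return list(mapping)


def as_pairs(mapping: Dict[K, V]) -> List[Tuple[K, V]]:
    """Convert a map into an array of (key, value) pairs."""
    return list(mapping.items())


def as_map(pairs: List[Tuple[K, V]]) -> Dict[K, V]:
    """Convert an array of pairs into a map; duplicate keys are an error."""
    result: Dict[K, V] = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def collect_by_key(pairs: List[Tuple[K, V]]) -> Dict[K, List[V]]:
    """Group the values of an array of pairs by key, keeping encounter order."""
    result: Dict[K, List[V]] = {}
    for key, value in pairs:
        result.setdefault(key, []).append(value)
    return result


print(flatten([[1, 2], [3], []]))                      # [1, 2, 3]
print(collect_by_key([("a", 1), ("b", 2), ("a", 3)]))  # {'a': [1, 3], 'b': [2]}
```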
Misc
- Add extra context for disk usage warning. #3495
- Update developing.rst #3479
- Allow text/encode options for read and write functions in file store #3428
- Type hinting and mypy checking additions #3470 # #3458 #3456
- Quality of life improvements for cactus. #3463
- Add checklists for reviewing and merging PRs #3432
- Ensure that python3 and not python (2) is used #3446
- Support Kubernetes clusters in toil launch-cluster #3357
- Additional check for predecessors. #3417
- Allow setOptions to accept an argparse group. #3426
Bug Fixes
- AWS: Handle socket timeout during AWS discovery #3503
- Use back-up flatcar AMI if not found. #3513
- Lsf command parser #3475
- Retry on s3 throttling. #3504
- Stop deleting all ASGs with tags #3474
- Only determine execute permissions on files we copy/move. #3437
- cope with an invalid HTTP_PROXY #3447
- Fix failing Google jobstore test #3420
- Update Google job store #3412
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18, Arthur Rand, @mberacochea, @Jessime, @thiagogenez, @julian-klode!
5.2.0
Changelog
CWL
- Update to the latest cwltool (3.0.20201121085451 -> 3.0.20201203173111). #3375
- Add better handling for potentially mis-ordered CWL args. #3395
- Confirm CommandInputParameter expression can receive a File object. #3350
WDL
- Make AnalyzeWDL abstract to allow implementation for different WDL versions #3391
Misc
- Clean up memoize.py. #3374
- Refactor bioio library, simplify logging, and small adjustments to utils. #3351
- Use regular division for memory calculation on LSF. #3387
- Remove unused --nodeOptions arg. #3397
- Get boto S3 args for minio from environment variables. (#3370)
- Update EC2 lists. #3376
- Add a retry to AWS destroy. #3379
- Add option to blank mem allocation for SLURM. #3399
Bug Fixes
- Preserve file permissions on imported/exported files. #3322
- Make building a flattened list of live jobs non-recursive. #3394
- Move AWS job store uploads/downloads to boto3 (reopened) #3400
- Use a Docker Hub mirror when pulling Docker images. #3411
- Corrected gridengine maxMEM check to use memorystring #3410
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18, @douglowe, @stevekm, @thiagogenez!
5.1.0
Changelog
WDL
- Add WDL cross function. #3360
Bug Fixes
Thank you to our contributors: @mr-c, @adamnovak, @w-gao, @jonathanxu18!
5.0.0
Caching is now turned on by default.
CWL v1.1 is also now fully supported and passing all conformance tests.
The following batch system names are now deprecated (#3225) and replaced by:
- singleMachine -> single_machine
- gridEngine -> grid_engine
- LSF -> lsf
- Mesos -> mesos
- Slurm -> slurm
- Torque -> torque
- HTCondor -> htcondor
- Kubernetes -> kubernetes
- k8s -> kubernetes
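Scripts or configuration that still carry the old names could be migrated with a simple lookup. The mapping below is copied from the list above; the helper name `normalize_batch_system` is hypothetical and is not part of Toil.

```python
# Deprecated batch system names and their replacements, as listed in these notes.
DEPRECATED_BATCH_SYSTEM_NAMES = {
    "singleMachine": "single_machine",
    "gridEngine": "grid_engine",
    "LSF": "lsf",
    "Mesos": "mesos",
    "Slurm": "slurm",
    "Torque": "torque",
    "HTCondor": "htcondor",
    "Kubernetes": "kubernetes",
    "k8s": "kubernetes",
}


def normalize_batch_system(name: str) -> str:
    """Map a deprecated name to its replacement; pass other names through unchanged."""
    return DEPRECATED_BATCH_SYSTEM_NAMES.get(name, name)


print(normalize_batch_system("singleMachine"))  # single_machine
print(normalize_batch_system("slurm"))          # slurm
```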
Changelog
CWL
- Upgrade to a CWL v1.2.0 capable cwltool (3.0.20200709181526 -> 3.0.20200807132242). #3137
- Export to a bucket if cwl's CreateFile is specified. #3124
- CWL v1.1 Support: loadContents if True for StepValueFrom objects. #3266
- CWL: Fix workdir permissions. #3229
- Test CWL secondary file from s3. #3311
- CWL v1.1 Support: Inplace update has side effect on directory content. #3280
- More CWL v1.2 conformance tests. #3336
- CWL: Reformat w/ Black and better support for tmpdir_prefix with suffix. #3333
WDL
- Add ceil() and floor() WDL functions. #3168
- Create a stdout and stderr file for each WDL task. #3181
- Add WDL write functions and builtin tests. #3236
  - write_lines
  - write_tsv
  - write_json
  - write_map
- Add WDL read functions and builtin tests. #3244
  - read_lines
  - read_tsv
  - read_json
  - read_map
  - read_int
  - read_string
  - read_float
  - read_boolean
- Add transpose() WDL function. #3273
- Add length() WDL function. #3307
- Add sub() WDL function. #3252
- Add range() WDL function. #3277
- Add size() WDL function. #3255
- Add zip() WDL function. #3355
- Implement WDL Pair type. #3304
- Refactor WDL types #3335
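As a rough guide to what a few of the array functions above compute, here is a Python approximation of length(), range(), zip(), and transpose() following the WDL specification, with Pairs modeled as tuples. The `wdl_` prefixes are only there to avoid shadowing Python built-ins; none of this is Toil's code.

```python
from typing import List, Tuple, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def wdl_length(array: List[X]) -> int:
    """Number of elements in an array."""
    return len(array)


def wdl_range(n: int) -> List[int]:
    """Array of integers from 0 up to n - 1."""
    return list(range(n))


def wdl_zip(left: List[X], right: List[Y]) -> List[Tuple[X, Y]]:
    """Pair up elements by position; the arrays must have equal length."""
    if len(left) != len(right):
        raise ValueError("zip() requires arrays of equal length")
    return list(zip(left, right))


def wdl_transpose(matrix: List[List[X]]) -> List[List[X]]:
    """Swap rows and columns of a rectangular array of arrays."""
    return [list(column) for column in zip(*matrix)]


print(wdl_range(3))                             # [0, 1, 2]
print(wdl_zip([1, 2], ["a", "b"]))              # [(1, 'a'), (2, 'b')]
print(wdl_transpose([[0, 1, 2], [3, 4, 5]]))    # [[0, 3], [1, 4], [2, 5]]
```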
Kubernetes
- Adding Labels to Pods/Jobs. #3233
- Make kubernetes & botocore packages fully optional. #3259 #3261
- KubeWatch optimization. #3227
- Catch other kubernetes imports. #3261
- Automatically infer singularity for kubernetes. #3279
Misc
- Allow Caching By Default. #3111
- Update dependencies and loosen restrictions slightly. #3139
- Ensure that falsey (but non-null) values are considered. #3143
- Remove pathlib. #3169
- Move AWS job store uploads/downloads to boto3. #3153
- Update prom/node-exporter to use quay.io. #3186
- Add support for TOIL_CUSTOM_INIT_COMMAND. #3183
- Update docker module version to 4.3.1. #3222
- Add 'stream' and 'demux' options to apiDockerCall. #3224
- Use boto3 to create our leader node. #3145
- Prevent job kind filenames from getting too long. #3230
- Add a retry decorator. #3144
- Remove mesos offer messages. #3308
- Job Concept Unification. #3250
- Add reporting for accessed files of failed jobs. #3309
- Add coverage and refactor. #3314
- Expose --statePollingWait param through toil. #3321
- Record the maximum memory used by LSF jobs. #3327
- Remove custom math functions and replace with built ins. #3338
- Define a changelog process. #3316
- SLURM: getting job details from scontrol show jobs was non-functional and has now been fixed. This behavior in the case of sacct not being configured now works. #3346
- LSF: handle unicode configuration files #3354
- Replace abssympath w/ builtin. #3356
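The retry decorator item above (#3144) names a common pattern: wrap a function so that transient errors trigger another attempt after a delay. The sketch below is a generic version of that pattern and makes no claim about the signature or behavior of Toil's own decorator; the `errors` and `intervals` parameters are assumptions chosen for the example.

```python
import functools
import time


def retry(errors=(Exception,), intervals=(0.0, 0.1, 0.2)):
    """Retry the wrapped function on the given errors, sleeping between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for delay in intervals:
                try:
                    return func(*args, **kwargs)
                except errors:
                    time.sleep(delay)
            # Final attempt: any exception raised here propagates to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator


calls = {"count": 0}


@retry(errors=(ConnectionError,), intervals=(0.0, 0.0))
def flaky():
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = flaky()
print(result, calls["count"])  # ok 3
```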
Bug Fixes
- Turn off watch code in Kubernetes. #3175
- Check against joining the current thread during CachingFileStore destructor. #3187
- Slurm.py error, undefined stdout - stdout should be stderr in code. #3194
- Fix restart flag error (issue #3094). #3142
- Batch System generates a key error instead of a usage message. #3225
- Remove Python 2 imports in WDL. #3228
- Move checkForDeadlocks() in the leader. #3234
- Fix boto3 migration bugs. #3243
- Fix None check when parsing WDL JSON file. #3272
- Fix broken stdout/stderr in torque job. #3306
- Add doubleMem for LSF jobs that die due to imposed memory limit. #3313
- Add 500 status to the list of retriable SDB BotoServerErrors. #3329
- Add a retry logic for BucketNotEmpty error. #3341
- Warn when trying to clean non-existent job store #3288
- Only rm a statefile that exists. #3348
- CWL: Remove unprocessed secondaryFiles from both Dict and List types.
Thank you to our contributors: @mr-c, @arostamianfar, @adamnovak, @diekhans, @w-gao, @jonathanxu18, @jeffrey856, @davidlougheed, @altairwei, @mberacochea, @drkennetz, @ionox0!
4.2.0
- Don't log in the inner scheduling loop. #3065
- Handle anticipated 404 errors without retrying. #3067
- Test and reimplement AMI finding via Flatcar JSON feed. #3061
- Bump Enlighten version to partly fix #3069. #3070
- Get wait duration is far too short of a rest for lsf. #3076
- Unify logging format. #3073
- Support CWL 1.1 Listing. #3058
- Enable CWL v1.2.0-dev3 (3.0.20200324120055 -> 3.0.20200530110633). #3092
- Enable CWL v1.2.0-dev4 (3.0.20200530110633 -> 3.0.20200709181526). #3105
- Add --nodeStorageOverrides option. #3096
- More robust CWL filepaths and secondary files. #3114
- Replace custom mkdir_p with the built in. #3123
- Move conditional execution test to before fill_in_defaults. #3117
Bug Fixes
- Stop CWL from trying to copy downloaded directories from somewhere local. #3053
- Exit the worker with 1 when jobs fail so the batch system can see. #3052
- Eliminate recursion in root finding to fix #3080. #3081
- Add a Cactus Kubernetes test to the integration tests on Gitlab. #3077
- Chaining doesn't make sense for CWL. #3091
- Fix failed job accounting. #3103
- Improve lsf batchsystem stability. #3101
- Deduplicate updated jobs. #3108
- Redundant virtualenv check. #3135
- Change Slurm command output processing. #3133
- Fix for log file being linked more than once. #3130
- Wrap Kubernetes watches to fix #3125. #3126
- Create dirs that do not exist when bind-mounting to docker. #3120
- Fix database state when file upload fails. #3122
Thank you to our contributors: @mr-c, @arostamianfar, @adamnovak, @diekhans, @nikhil, @drkennetz, @jonathanxu18, @tobiaszjarosiewicz, @ionox0!
4.1.0
- Attempt to make batch system more robust and debuggable. #2959
- Kubernetes Shared Caching. #3012
- Test python3.8 and build a python3.8 appliance. #3028
- Use the original pymesos. #3036
- Detect more kinds of virtual environment and normalize prefixes. #3035
- Do More Hashing. Can't have enough. #3025
- Replace reissued jobs message with better progress indicators. #3044
- Detect insufficient resources and deadlocks more usefully. #3043
Bug Fixes
- Don't list specific CWL features, try them all (and fix some CWL running bugs). #3015
- Always honor TOIL_GRIDENGINE_PE and never assume a site-dependent default. #3041
Thank you to our contributors: @mr-c, @arostamianfar, @adamnovak, @diekhans!
4.0.0
This release moves to python3 only, dropping all python2.7 compatibility, and also deprecates the command "cwltoil" in favor of "toil-cwl-runner".
- Update cwltool version (==1.0.20190906054215 -> <=2.0.20200126090152). #2969
- Drop the long-deprecated cwltoil. #2843
- Dropped python2.7. #2973
- Port provisioner tests and only test py3.6. #2842
- CWL: Refactor of link merge + conditionals + pickValue. #2845
- Add import subprocess to executor.py. #2976
- Encode mtail stdin to utf-8 and flush. #2978
- Encode framework message as utf-8 as well. #2981
- Make the output files for grid engine batch systems not try and be in per-host directories. #2956
- Use file locks instead of PID polling to see if other processes sharing the cache are alive. #2982
- Size downloads during download. #2989
- Add timeouts to all SQLite connect statements. #2993
- Eliminate sleep time on --restart. #2990
- Add support for --awsEc2ExtraSecurityGroupId. #2997
- Add moveExports option. #2983
- Set worker threads as daemon to prevent hanging process on error. #2998
- Cut the Threads. #2999
- [Part 1] CWL v1.1 Support #2985
- [Part 2] CWL v1.1 Support #3000
- Update Sphinx and let its version float. #3011
- Add --enableUnlimitedPreemptableRetries option. #2896
- Add less and vim to docker. #3017
- Improve and Unify Log Dumping. #3008
- Make dependency on Python3.6+ machine-readable in setup.py. #3021
- Improve slurm update frequency. #3026
- Make Sphinx a make prepare component and not a dependency. #3023
- Use forked pymesos and http-parser dependencies. #3024
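The file-lock item above (#2982) relies on a general property of advisory locks: the operating system releases a lock when its holder closes the file or exits, so a peer can test liveness by trying to take the lock instead of polling a PID. The sketch below illustrates this with `fcntl.flock` on a Unix-like system; the function names and lock file are hypothetical, and it is not Toil's caching code.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive lock without blocking; return the fd, or None if held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Someone else holds the lock.
        os.close(fd)
        return None
    return fd


def owner_is_alive(path):
    """The owner is considered alive exactly when its lock cannot be taken."""
    fd = try_lock(path)
    if fd is None:
        return True
    # We got the lock, so no live owner exists; release it again.
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
    return False


lock_path = os.path.join(tempfile.mkdtemp(), "worker.lock")

owner_fd = try_lock(lock_path)              # the "owner" takes the lock
alive_while_held = owner_is_alive(lock_path)

os.close(owner_fd)                          # closing releases it, as process exit would
alive_after_release = owner_is_alive(lock_path)

print(alive_while_held, alive_after_release)  # True False
```

Unlike PID polling, this check cannot be fooled by the operating system reusing a process ID for an unrelated process.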
Bug Fixes
- Fix python3 string/bytes error when using --printLogs. #3005
- Fix connection between Kubernetes and Leader. #3004
Thank you to our contributors: @mr-c, @arostamianfar, @adamnovak, @jeffrey856, @kaushik-work, @glmxndr, @johnbradley, @cmarkello, @diekhans!