Releases: kids-first/kf-somatic-workflow
🔧 Fix CNVkit Batch Bug
CNVkit batch had a bug that caused the soft link of the .cns file to fail when the input CRAM had the same name as the input tumor sample. This release fixes that. The following were also addressed and improved:
- Adjust resources for controlfreec as only up to 8 cores seem to ever be used
- Replace b allele germline filter tool with more efficient bcftools one
- Clarify and improve some CNVkit batch tool input doc
- Fix previously undiscovered CNVkit access tool bug (no actual effect on results)
- Added and updated default inputs for blacklist_regions and cnv_blacklist_regions
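The soft link failure described above is the kind of collision that occurs when the link name and the link target are the same path. The sketch below shows one way such a collision can be guarded against. It is only an illustration and not the fix applied in the workflow: the function name `link_unless_same` and the file names are hypothetical.

```python
import os
import tempfile


def link_unless_same(src: str, dest: str) -> bool:
    """Create dest as a symlink to src unless both refer to the same path.

    Returns True if a link was created, False if it was skipped.
    """
    if os.path.abspath(src) == os.path.abspath(dest):
        # Linking a file onto its own name would fail, so skip it.
        return False
    os.symlink(os.path.abspath(src), dest)
    return True


with tempfile.TemporaryDirectory() as workdir:
    cns = os.path.join(workdir, "tumor_sample.cns")
    open(cns, "w").close()

    # Same name as the existing file: no link is attempted.
    same_name = link_unless_same(cns, os.path.join(workdir, "tumor_sample.cns"))

    # Different name: the link is created as usual.
    new_name = link_unless_same(cns, os.path.join(workdir, "renamed.cns"))
    new_link_exists = os.path.islink(os.path.join(workdir, "renamed.cns"))
```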
What's Changed
Full Changelog: v5.2.2...v5.3.0
🤝 CRAM 3.1 Compatibility
- Samtools version bumped to 1.20 so that newer versions of CRAM can be processed
- Samtools image contains generic and custom-compiled builds to improve performance on the following Intel machines:
- skylake-avx512 (c5 ec2)
- icelake-server (c6 ec2)
- sapphirerapids (c7 ec2)
- PileUp step for ControlFreeC replaced with faster samtools version (100% speed increase with half the cores)
What's Changed
- Dm cram 3 1 compat by @dmiller15 in #201
- 🔨 samtools pileup update by @migbro in #203
Full Changelog: v5.2.1...v5.2.2
🏎️ Minor Performance Improvement
A CAVATICA-specific update bumps resources requested for annotation tools so that less instance type switching occurs.
What's Changed
Full Changelog: v5.2.0...v5.2.1
TelomereHunter and Somatic Bug Fixes
TelomereHunter
This release adds TelomereHunter as a tool to the repository. It is not yet part of the main somatic workflow.
Somatic Workflow Fixes
Two quick fixes for the somatic workflow.
- We no longer add "reheader" to the name of prepass VCF outputs. This change was made to ensure that file naming stayed consistent across different run modes
- We changed the way that the chrLen file is built. Occasionally, the b-allele files for male samples would be lacking any chrY calls. When that happens, ControlFREEC will error out with: "An error occurred in GenomeCopyNumber::addBAFinfo: could not find an SNP index for Y". We added a filter step that removes any chromosomes from the chrLen file that are not present in the b-allele file.
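The chrLen filter step described above can be sketched roughly as follows. This is a minimal illustration rather than the workflow's actual implementation. It assumes the chromosome name is the first tab-separated column in both files and that b-allele header lines start with "#". The function names are hypothetical.

```python
def chroms_in_b_allele(b_allele_lines):
    """Collect chromosome names seen in the b-allele records (column 1)."""
    return {
        line.split("\t", 1)[0].strip()
        for line in b_allele_lines
        if line.strip() and not line.startswith("#")
    }


def filter_chr_len(chr_len_lines, b_allele_lines):
    """Keep only chrLen rows whose chromosome appears in the b-allele records."""
    present = chroms_in_b_allele(b_allele_lines)
    return [
        line
        for line in chr_len_lines
        if line.strip() and line.split("\t", 1)[0].strip() in present
    ]


# Toy inputs: a male sample whose b-allele records contain no chrY calls.
chr_len = ["chr1\t248956422\n", "chrX\t156040895\n", "chrY\t57227415\n"]
b_allele = ["#CHROM\tPOS\n", "chr1\t10583\n", "chrX\t2781514\n"]

filtered = filter_chr_len(chr_len, b_allele)
print("".join(filtered), end="")
```

With these toy inputs, the chrY row is dropped from the chrLen records because no b-allele record mentions chrY.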
What's Changed
- 🎉 telomerehunter init by @dmiller15 in #191
- 🐛 remove chrlen chrs not in ballele by @dmiller15 in #192
- Update Production WF Docker Tables by @github-actions in #193
- 🔧 do not rename reheadered vcfs by @dmiller15 in #194
- 🔖 bump release number by @dmiller15 in #195
Full Changelog: v5.1.1...v5.2.0
📝 SNV HotSpot Updates
We have identified the following important cancer HotSpots to be of interest in addition to our existing list:
H3C3, amino acid position 28
H3C14, amino acid position 28
H3C14, amino acid position 35
This release updates the default file pulled for SNV HotSpot annotation to include these rare but important cancer variants.
What's Changed
Full Changelog: v5.1.0...v5.1.1
📝 Default File Update and new Optional Inputs
This release updates a couple of the default annotation file pointers. The files have remained the same, but the CAVATICA file IDs changed because the files were moved in the bucket. Also added the following:
- Added lancet_input_vcf so that if a user needs to re-run Lancet only, and has existing variant calls that they'd like to use to supplement the CDS in Exome+ mode for WGS, they can add them here. This saves time and cost by skipping Strelka2 and Mutect2 if a user already has those calls from a previous run
- Added logic to skip the samtools on-the-fly MD tag recalculation done while converting to BAM format if the file is already BAM. Also added a flag to override this and run the conversion anyway in case the input BAM has MD tag issues
- Updated annotation submodule to the latest relevant release
- Updated the consensus calling workflow to also use the updated default file pointers, and updated the default/suggested public filters for both workflows to ensure error-free GATK soft filtering
- Additional validation checks and performance improvements
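The BAM skip logic in the list above amounts to a small decision rule, sketched here under stated assumptions. Detecting BAM input by file extension and the parameter name `force_conversion` are both hypothetical stand-ins, not the workflow's actual check or flag name.

```python
from pathlib import Path


def needs_bam_conversion(input_path: str, force_conversion: bool = False) -> bool:
    """Decide whether to run the convert-to-BAM step (with MD tag recalculation).

    `force_conversion` stands in for the override flag: it forces the step
    even when the input already looks like a BAM.
    """
    is_bam = Path(input_path).suffix.lower() == ".bam"
    return force_conversion or not is_bam


cram_needs_it = needs_bam_conversion("tumor.cram")
bam_skips_it = needs_bam_conversion("tumor.bam")
bam_forced = needs_bam_conversion("tumor.bam", force_conversion=True)
```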
What's Changed
- 🐎 sets for faster searching by @dmiller15 in #176
- 🐛 more robust run checking by @dmiller15 in #177
- 📚 fix broken link by @dmiller15 in #178
- 🎉 update somatic annotation by @dmiller15 in #179
- 🤖 Update Annotation Submodule for Release v1.2.0 by @migbro in #182
- 🤖 Update Annotation Submodule for Release v1.2.1 by @migbro in #183
- 📝 🧹Update ref pointers, add pure lancet mode, lceanup deprecated by @migbro in #184
- ✏️ minor cwl doc cleanup and doc cleanup by @migbro in #186
Full Changelog: v5.0.0...v5.1.0
Somatic Workflow Optimization
This release covers a wide range of changes that speed up our somatic workflow.
Functional Changes
- Calling interval creation and use have been overhauled. Documentation on this change is available in the repository
- All callers can now be turned on or off at runtime, allowing for limited runs of the workflow. As such, all individual production callers have been removed
- gnomAD annotation is now performed by Echtvar rather than bcftools. Echtvar performs the annotation significantly faster
- Changed reference inputs to include secondary/index files. Workflow will no longer build reference indices to start the workflow. This change significantly cuts down on repetitive work and accelerates the workflow
Bug Fix
- Fixed potentially erroneous pipefails
Documentation Updates
We cleaned up and updated a lot of documentation in this release:
- Added tables detailing the tools and docker images used in our production workflows (somatic and consensus)
- Added a detailed doc about interval creation and usage
- Added a doc detailing Echtvar reference creation
- Primary README was rewritten
What's Changed
- pass disable vep switch by @sakshamphul in #167
- 🏷️ Release 5.0.0 by @dmiller15 in #175
New Contributors
- @sakshamphul made their first contribution in #167
Full Changelog: v4.4.2...v5.0.0
Hotfix WGS Lancet Interval Generation
Discovered a bug that was causing Lancet interval generation for WGS samples to fail. The tool responsible for generating the intervals would fail when bedops encountered very large variant records from Mutect2 or Strelka2. The error was silent due to an erroneous implementation of pipefail. This hotfix resolves both issues by fixing the pipefail implementation and using the megarow binary for bedops.
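To illustrate why such a failure can be silent: without pipefail, a shell pipeline reports only the exit status of its last command. The sketch below uses stand-in commands (`false | cat`) rather than the workflow's actual bedops pipeline, and assumes `bash` is available on the PATH.

```python
import subprocess


def pipeline_exit_code(script: str) -> int:
    """Run a script with bash and return its exit status."""
    return subprocess.run(["bash", "-c", script]).returncode


# `false` fails, but `cat` succeeds, so the pipeline as a whole reports success.
without_pipefail = pipeline_exit_code("false | cat")

# With pipefail set, a failure anywhere in the pipeline makes the pipeline fail.
with_pipefail = pipeline_exit_code("set -o pipefail; false | cat")

print(without_pipefail, with_pipefail)
```

The first pipeline exits with status 0 despite the failing command, while the second exits with a non-zero status.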
What's Changed
- 🐛 fix lancet interval silent failure by @dmiller15 in #163
Full Changelog: v4.4.1...v4.4.2
Hotfix for FilterMutectCalls
Updated the version of GATK used for FilterMutectCalls. The old version (4.1.1.0) errors on some VCFs, which GATK subsequently fixed in 4.1.5.0. We bumped the version of GATK used in FilterMutectCalls to 4.2.0.0.
What's Changed
- 🐛 bump filtermutectcalls docker by @dmiller15 in #162
Full Changelog: v4.4.0...v4.4.1
🚦CNV + SV Switch
Added params that let a user skip the CNV and/or SV tools by setting run_cnv_tools and/or run_sv_tools to false. This feature was added because some inputs, like limited panel data, are too small for accurate/usable CNV and SV calls. Also made minor doc formatting changes to the CWL, allowing longer strings before a line is split.
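As a rough illustration, the two switches could be set in a CWL job inputs file like the one generated below. CWL job files may be written as JSON or YAML. Only the two parameters named in this release are shown. The workflow's other required inputs are omitted, so this fragment is not a complete job file.

```python
import json

# Only the two switches named in this release are shown; a real job file
# would also need the workflow's required inputs, which are omitted here.
job_inputs = {
    "run_cnv_tools": False,
    "run_sv_tools": False,
}

job_json = json.dumps(job_inputs, indent=2)
print(job_json)
```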
What's Changed
Full Changelog: v4.3.7...v4.4.0