Commit e534b24

Merge pull request #854 from Delphine-L/vgp1
VGP 1 - Add Rdeval, heterozygous coverage, and estimated genome size outputs
2 parents 03ad9d9 + f73dbd3 commit e534b24

4 files changed, 1020 additions and 87 deletions

workflows/VGP-assembly-v2/kmer-profiling-hifi-VGP1/CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -1,5 +1,12 @@
 # Changelog
 
+
+## [0.2] - 2025-05-16
+
+### Changes
+- Add RDeval to evaluate PacBio HiFi reads quality
+- Now compute the homozygous read coverage and the estimated genome size in this workflow
+
 ## [0.1.9] - 2024-12-17
 ### Added
 - Annotation for workflow describing its function
Lines changed: 11 additions & 5 deletions
@@ -1,16 +1,18 @@
 # VGP Workflow #1
 
-This workflow produces a Meryl database and Genomescope outputs that will be used to determine parameters for following workflows, and assess the quality of genome assemblies. Specifically, it provides information about the genomic complexity, such as the genome size and levels of heterozygosity and repeat content, as well about the data quality.
+This workflow produces a Meryl database and Genomescope outputs that will be used to determine parameters for following workflows, and assess the quality of genome assemblies. Specifically, it provides information about the genomic complexity, such as the genome size and levels of heterozygosity and repeat content, as well about the data quality. It also provides statistics on the PacBio Hifi reads.
 
 ### Inputs
 
-- A collection of Hifi long reads in FASTQ format
-- *k*-mer length
-- Ploidy
+1. The name of the species being assembled
+2. The Name of the assembly
+3. A collection of Hifi long reads in FASTQ format
+4. *k*-mer length
+5. Ploidy
 
 ### Outputs
 
-- Meryl Database of kmer counts
+- Meryl Database of *k*-mer counts
 - GenomeScope
   - Linear plot
   - Log plot
@@ -19,5 +21,9 @@ This workflow produces a Meryl database and Genomescope outputs that will be use
   - Summary
   - Model
   - Model parameteres
+- RDeval for PacBio Hifi Reads QC
+  - Reads statistics
+  - HTML report
+
 
 ![image](https://github.com/galaxyproject/iwc/assets/4291636/565238fc-f8a9-46ac-8b31-6276410fa436)
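As background for the "*k*-mer counts" output listed in the README, the following is a minimal Python sketch of what a k-mer count table and its histogram contain. It is an illustration only and not Meryl's implementation: the function names (`count_canonical_kmers`, `kmer_histogram`) are hypothetical, and the use of canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) is an assumed convention rather than something stated in this commit.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_canonical_kmers(reads, k):
    """Count canonical k-mers over an iterable of read sequences.

    Hypothetical helper: a k-mer and its reverse complement are counted
    under the same key (assumed convention). Windows containing
    characters other than A, C, G, T are skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip windows with N or other symbols
            counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())
```

A histogram of this shape (multiplicity versus number of distinct k-mers) is the kind of profile that tools such as GenomeScope fit a model to; the sketch keeps everything in memory and is only suitable for toy inputs.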
Lines changed: 22 additions & 24 deletions
@@ -1,46 +1,44 @@
 - doc: Test outline for kmer-profiling-hifi-VGP1.ga
   job:
+    Species Name: Test Species
+    Assembly Name: testSpe1
     Collection of Pacbio Data:
       class: Collection
       collection_type: list
       elements:
       - class: File
         identifier: pacbio_reads
-        location: https://zenodo.org/record/6603774/files/child.fastq?download=1
+        location: https://zenodo.org/record/6098306/files/HiFi_synthetic_50x_01.fasta
         filetype: fastqsanger
-    'K-mer length ': 8
-    Ploidy: 1
+    'K-mer length': 21
+    Ploidy: 2
   outputs:
     GenomeScope linear plot:
-      file: test-data/GenomeScope_Linear_plot.png
-      compare: sim_size
-      delta: 10000
+      asserts:
+        has_size:
+          value: 182100
+          delta: 10000
     GenomeScope log plot:
-      file: test-data/GenomeScope_Log_plot.png
-      compare: sim_size
-      delta: 10000
-    GenomeScope transformed linear plot:
-      file: test-data/GenomeScope_Transformed_linear_plot.png
-      compare: sim_size
-      delta: 10000
-    GenomeScope transformed log plot:
-      file: test-data/GenomeScope_Transformed_log_plot.png
-      compare: sim_size
-      delta: 10000
+      asserts:
+        has_size:
+          value: 195200
+          delta: 10000
     GenomeScope summary:
       asserts:
       - has_text_matching:
-          expression: '27,84. bp'
-      - has_text_matching:
-          expression: '35,91. bp'
+          expression: '7,859,... bp'
     GenomeScope Model Parameters:
       asserts:
       - has_text_matching:
-          expression: '0.09184.*'
+          expression: '0.012.*'
       - has_text_matching:
-          expression: '27.44.*'
+          expression: '12.45.*'
     Merged Meryl Database:
       asserts:
         has_size:
-          value: 105703
-          delta: 10000
+          value: 27710666
+          delta: 1000000
+    Reads Statistics:
+      asserts:
+        has_text:
+          text: "# reads 21224"
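The new assertions in this test file replace exact file comparisons with tolerance and pattern checks. The Python sketch below shows how such checks can be read, assuming `has_size` passes when the file size is within `delta` bytes of `value` and `has_text_matching` passes when the regular expression is found anywhere in the output. The helper names (`size_within`, `text_matches`) are hypothetical and this is not Galaxy's own code; the sample strings are made up for illustration.

```python
import re


def size_within(size_bytes: int, value: int, delta: int = 0) -> bool:
    """Assumed reading of has_size: pass if |size - value| <= delta."""
    return abs(size_bytes - value) <= delta


def text_matches(text: str, expression: str) -> bool:
    """Assumed reading of has_text_matching: regex search anywhere in text."""
    return re.search(expression, text) is not None


# Values taken from the test file above; the observed sizes are invented.
print(size_within(27_200_000, value=27710666, delta=1000000))
print(size_within(182100 + 10001, value=182100, delta=10000))

# Made-up summary line; only the pattern comes from the test file.
print(text_matches("length 7,859,123 bp", "7,859,... bp"))
```

Note that an unescaped `.` in these expressions matches any character, so `'0.012.*'` would also accept a string such as `0x012`. If a stricter check were wanted, the dot could be escaped as `0\.012.*`.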
