Genome size is much smaller than expected #147
Comments
By the way, the assembled size is 906Mb, but the busco evaluation is normal, 96.8% |
Hi, could you paste the main log to here? what is the file size of |
Hello, Mr. Hu; Types Count (#) Bases (bp) Depth (X) Suggested seed_cutoff (genome size: 1741.00Mb, expected seed depth: 45, real seed depth: 45.00): 38807 bp Min. 54156 - |
The log shows that |
Ok, thank you Mr. Hu; |
Hello, dear teacher,
I really hope you can help me,
My assembled genome is much smaller than expected, although the pipeline finishes without errors.
The following is my parameter configuration:
##############################################################
[General]
job_type = sge
job_prefix = test
task = all # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = no
rerun = 3
parallel_jobs = 36
input_type = raw
read_type = ont # clr, ont, hifi
input_fofn = input.fofn
workdir = test_nd_run
submit = qsub -cwd -l vf=60g,q=all.q {script} #for sge
[correct_option]
read_cutoff = 1k
genome_size = 1741M
pa_correction = 51
sort_options = -m 43g -t 18 -k 40
minimap2_options_raw = -t 21
correction_options = -p 15
[assemble_option]
minimap2_options_cns = -t 21 -k17 -w17
nextgraph_options = -a 1
##############################################################
However, I noticed that many nd.asm.f.part*.fasta files in my 03.ctg_graph/03.ctg_cns.sh.work directory are empty. If these files had been generated properly, the assembly would be about the genome size I expected.
The intermediate result for this part is as follows:
##############################################################
ll ctg_cns20 ctg_cns25
ctg_cns20:
total 24564
-rw-rw-r-- 1 25146734 May 21 04:10 nd.asm.f.part019.fasta
-rwxr--r-- 1 957 May 21 04:04 test.sh
-rw-rw-r-- 1 0 May 21 04:10 test.sh.done
ctg_cns25:
total 4
-rw-rw-r-- 1 0 May 21 04:05 nd.asm.f.part024.fasta
-rwxr--r-- 1 957 May 21 04:04 test.sh
-rw-rw-r-- 1 0 May 21 04:05 test.sh.done
##############################################################
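To find every affected job rather than checking directories one by one, a minimal sketch like the following could scan the work directory. It assumes the layout shown in the listing above (one `ctg_cns*` subdirectory per job, each holding one `nd.asm.f.part*.fasta` and a `*.sh.done` marker). The function name `find_empty_outputs` is hypothetical and not part of NextDenovo.

```python
from pathlib import Path


def find_empty_outputs(work_dir):
    """Scan ctg_cns* job directories under work_dir.

    Returns (empty, total_bytes): the 0-byte part files whose job has a
    *.sh.done marker, and the combined size in bytes of all part files.
    """
    empty = []
    total_bytes = 0
    for job_dir in sorted(Path(work_dir).glob("ctg_cns*")):
        if not job_dir.is_dir():
            continue
        # A *.sh.done marker means the scheduler considered the job finished.
        finished = any(job_dir.glob("*.sh.done"))
        for fasta in sorted(job_dir.glob("nd.asm.f.part*.fasta")):
            size = fasta.stat().st_size
            total_bytes += size
            if size == 0 and finished:
                empty.append(fasta)
    return empty, total_bytes
```

Calling `find_empty_outputs("test_nd_run/03.ctg_graph/03.ctg_cns.sh.work")` would list the empty outputs of jobs that were marked done. The byte total is only a rough upper bound on the number of assembled bases, since FASTA headers and newlines are counted too.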
However, even though the generated nd.asm.f.part024.fasta file is empty, the job still completes and is marked as done.
The following is the log of the job submitted to the SGE cluster:
##############################################################
hostname
cd /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/03.ctg_cns.sh.work/ctg_cns25
time /public/apps/miniconda3/bin/python /public/pipline/GenomicAnalysis/01.assemble/01.nextdenovo/NextDenovo_v2.5.0/NextDenovo/lib/ctg_cns.py -p 15 -g /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph1/nd.asm.p.fasta -b /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph1/nd.asm.p.fasta.blc -i 24 -r ont -l /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/03.ctg_cns.input.bams -o nd.asm.f.part024.fasta
[232296 INFO] 2022-05-21 04:06:33 Corrected step options:
[232296 INFO] 2022-05-21 04:06:33
split: 1
auto: True
process: 15
read_type: 1
block_index: 24
window: 5000000
uppercase: False
alignment_score_ratio: 0.8
out: nd.asm.f.part024.fasta
alignment_identity_ratio: 0.8
bam_list: /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/03.ctg_cns.input.bams
genome: /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph1/nd.asm.p.fasta
block: /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/01.ctg_graph.sh.work/ctg_graph1/nd.asm.p.fasta.blc
[232301 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232301 from parent 232296
[232304 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232304 from parent 232296
[232306 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232306 from parent 232296
[232307 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232307 from parent 232296
[232309 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232309 from parent 232296
[232311 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232311 from parent 232296
[232313 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232313 from parent 232296
[232315 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232315 from parent 232296
[232317 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232317 from parent 232296
[232319 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232319 from parent 232296
[232321 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232321 from parent 232296
[232323 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232323 from parent 232296
[232325 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232325 from parent 232296
[232328 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232328 from parent 232296
[232330 INFO] 2022-05-21 04:06:36 Start a corrected worker in 232330 from parent 232296
real 0m2.634s
user 0m1.851s
sys 0m0.672s
touch /calculate/home/user/test/03.test/01.assembly/workdir/test_nd_run/03.ctg_graph/03.ctg_cns.sh.work/ctg_cns25/test.sh.done
##############################################################
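In the log above, 15 workers are started, which matches `process: 15`, and the job finishes in about 2.6 seconds of wall-clock time. To pull the same three numbers out of many job logs, a small sketch such as the one below might help. It assumes the log lines look exactly like the excerpt above, and the name `summarize_cns_log` is hypothetical.

```python
import re

# Patterns are based only on the log excerpt shown above.
WORKER_RE = re.compile(r"Start a corrected worker in \d+ from parent \d+")
PROCESS_RE = re.compile(r"^\s*process:\s*(\d+)\s*$", re.MULTILINE)
REAL_RE = re.compile(r"^real\s+(\d+)m([\d.]+)s\s*$", re.MULTILINE)


def summarize_cns_log(text):
    """Return the worker counts and wall-clock seconds found in one job log."""
    process = PROCESS_RE.search(text)
    real = REAL_RE.search(text)
    return {
        "workers_started": len(WORKER_RE.findall(text)),
        "workers_requested": int(process.group(1)) if process else None,
        # `time` prints e.g. "real 0m2.634s"; convert minutes + seconds.
        "real_seconds": (
            int(real.group(1)) * 60 + float(real.group(2)) if real else None
        ),
    }
```

A job that starts all of its workers, returns within a few seconds, and leaves a 0-byte output would be one to look at more closely. The sketch only reports the numbers and does not diagnose the cause.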
I would like to know how to get an assembly whose size is in line with my expectations, and why the nd.asm.f.part*.fasta files have a size of 0.
Thank you again sincerely!