Incorrect number of contigs in the assembly #274

Open
ahmedelgoutbi opened this issue Jan 7, 2025 · 0 comments

Hi!
I need to assemble more than 30 yeast genomes that were sequenced on a
PacBio Sequel IIe (12.1 Mb each, at roughly 40x coverage). The reference
genome (Saccharomyces cerevisiae S288C) has 16 chromosomes, but I get a
much higher number of contigs (70-100) after assembling with wtdbg2 (see
below for the full commands). Could this be caused by repeated sequences
that the algorithm cannot resolve? If so, should I adjust the parameters
to refine the assembly? Alternatively, is there a parameter (or set of
parameters) that I can tweak to bring the contig count closer to the
expected number of chromosomes?

These are the commands that I used to get the full assembly:

wtdbg2 -x rs -g 12.1m -i raw_data/raw.fasta.gz -t 16 -R -fo assemble_out
wtpoa-cns -t 16 -i assemble.ctg.lay.gz -fo consensus.raw.fa
minimap2 -t 16 -ax map-pb -r2k consensus.raw.fa raw_data/raw.fasta.gz | samtools sort -@4 > polished.bam
samtools view -F0x900 polished.bam | wtpoa-cns -t 16 -d consensus.raw.fa -i - -fo polished.cns.fa
busco -i polished.cns.fa -o yeast_genome -l saccharomycetes_odb10 -m genome -c 12
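For anyone who wants to compare contig counts across assemblies, the numbers can be summarized with a small helper along these lines. This is only a sketch: `contig_stats` is a hypothetical function name, not part of wtdbg2, and it assumes an uncompressed FASTA such as the `polished.cns.fa` produced by the commands above.

```shell
# Hypothetical helper (not part of wtdbg2): print contig count,
# total length, and N50 for an uncompressed FASTA file.
contig_stats() {
    # First pass: emit one length per contig.
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }  # header: flush previous contig
        { len += length($0) }                           # sequence line: accumulate length
        END { if (len > 0) print len }                  # flush the last contig
    ' "$1" | sort -rn | awk '
        # Second pass: lengths arrive longest first.
        { lens[NR] = $1; total += $1 }
        END {
            half = total / 2; run = 0; n50 = 0
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                # N50: length of the contig at which the running sum reaches half the total.
                if (run >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d total_bp=%d N50=%d\n", NR, total, n50
        }
    '
}

# Only run on the polished assembly if it exists in the current directory.
if [ -f polished.cns.fa ]; then
    contig_stats polished.cns.fa
fi
```

Reporting N50 alongside the contig count helps show whether the 70-100 contigs are mostly chromosome-sized pieces plus a tail of small fragments, or whether the assembly is fragmented throughout.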
