For aligning chromosomes of different species with over 100 MY of divergence, is it better to use .masked files? #383
Comments
Using masked genomes is more efficient because it saves computation (masked sequence is not aligned), at the price of possibly losing interesting alignments and therefore relationships between genomes! |
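To get a feel for that trade-off before choosing between the masked and unmasked files, one could measure how much of each chromosome is masked. The snippet below is only a minimal sketch: it assumes soft-masking is encoded as lowercase bases and hard-masking as `N`, and the function name `masked_fraction` is hypothetical, not part of PGGB.

```python
def masked_fraction(seq: str) -> dict:
    """Return the fraction of soft-masked (lowercase) and hard-masked (N) bases.

    Assumption: lowercase letters mark soft-masked repeats and N/n marks
    hard-masked or unknown bases, as in commonly distributed masked FASTA files.
    """
    total = len(seq)
    if total == 0:
        return {"soft": 0.0, "hard": 0.0}
    # N or n is counted as hard-masked, never as soft-masked.
    hard = sum(1 for base in seq if base in "Nn")
    soft = sum(1 for base in seq if base.islower() and base != "n")
    return {"soft": soft / total, "hard": hard / total}


if __name__ == "__main__":
    # Toy sequence: 4 unmasked, 4 soft-masked, 2 hard-masked bases.
    print(masked_fraction("ACGTacgtNN"))
```

A chromosome where most of the sequence is masked leaves little for the aligner to work with, which is where the risk of losing relationships between distant genomes is highest.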
OK, and another question. Is there any way I could use PGGB for reference-guided genome scaffolding?
I have a catfish genome assembly that I'm working on. The QV is > 50, but there are still 500+ gaps after Hi-C scaffolding.
For instance, let's say chromosome 1 has 35 gaps.
Would it be possible to use minimap2 to find the matching pairs of homologous chromosomes in other species, then use PGGB to align the chromosome 1 sequences (possibly masked) in order to find the correct path in the GFA file in Bandage?
Can PGGB accommodate Nanopore reads and HiFi reads, or would it be better to extract the path from the graph and then map the reads onto it to see if the gap can be closed?
This is not really related to this GitHub issue, but I am wondering whether PGGB has ever been used for scaffolding / gap closing using related species. If so, how would one do it?
Thank you very much in advance, Andrea.
|
If I understand your questions correctly (or at least partially), you would like to curate an assembly using assemblies of other related species modeled in a pangenome graph. It sounds like a task for tools able to align sequences against a graph, like Minigraph and GraphAligner, where you align your gapped assembly (scaffolds with NNNNNs for the gaps) against the other-species graphs. PGGB can accommodate everything, but its first step is an all-vs-all alignment, and you don't want to put millions of reads in the input. Moreover, PGGB is a 'garbage in, garbage out' pipeline, so if your reads are noisy, that noise will smear your output. I suspect PGGB could be used for scaffolding/gap closing somehow, but we don't have a pipeline for that (we've never used it that way). |
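As a first step toward that kind of curation, it can help to list where the gaps sit in each scaffold, so alignments against a graph can later be inspected around those coordinates. This is a minimal sketch, assuming gaps are encoded as runs of `N`; the function name `find_gaps` and the `min_len` parameter are hypothetical and not part of PGGB, Minigraph, or GraphAligner.

```python
import re


def find_gaps(seq: str, min_len: int = 1) -> list:
    """Return 0-based, half-open (start, end) coordinates of N runs in a scaffold.

    Assumption: assembly gaps are written as runs of N (or n); runs shorter
    than min_len are ignored.
    """
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]


if __name__ == "__main__":
    # Toy scaffold with two gaps of length 4 and 2.
    scaffold = "ACGTNNNNACGTNNAC"
    for start, end in find_gaps(scaffold):
        print(f"gap at {start}-{end} (length {end - start})")
```

The coordinates are BED-style, so they could be written out and intersected with alignment intervals from whichever graph aligner is used.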
Because in your paper you did this:
Yes, you understand my question correctly. Do you think it would be possible to first use PGGB on related species to build an initial graph, and then use GraphAligner to map the reads onto the PGGB graph? Or is that complete nonsense? |
Yes, the toolkit is complete; there should be new applications of PGGB in the future to help genome assembly. RagTag is quite limited. |
A pangenome-based scaffolder would be hot, but I've never delved so deeply into the problems that I've been able to start hacking on them. Happy to chat separately more about that. PGGB+GraphAligner would make sense if the karyotypes are stable and veeeeeery similar between the different species. |
Hello,
I would like to know whether using masked genomes is more efficient than using non-masked genomes for all-vs-all cross-species alignments.
Thank you in advance for your answer,
Quentin