Can gaftools convert an unstable coordinate GAF to a stable coordinate GAF? #34

Open
peterdfields opened this issue Jan 16, 2025 · 9 comments

@peterdfields

I'm trying to understand haplotype structure in an admixed mapping population using long reads mapped to a pangenome graph generated with minigraph-cactus. One intermediate file I'd like to use to begin the analysis is a GAF that includes the stable genome coordinates where a given read maps across the different references included in the pangenome. I've modified the GFA that GraphAligner uses so that every input genome can be treated as a reference, and I can get a GAF with unstable coordinates as described in https://github.com/lh3/gfatools/blob/master/doc/rGFA.md, but I cannot seem to get the stable-coordinate GAF. Can gaftools do such a conversion? Thank you for your time and advice.

@samarendra-pani
Member

Gaftools should be able to do such a conversion. To convert the whole file, please try the following command:

gaftools view --gfa <input GFA file> --format stable <input GAF file>

Please note that the code does require the GFA file to have SN, SO, and LN tags.
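As a quick way to see whether a given GFA meets this requirement, the tags on each segment line can be inspected directly. The following is a minimal sketch, not part of gaftools: the function name `missing_segment_tags` and the example records are hypothetical, and it assumes a tab-separated GFA whose S-lines carry optional tags of the form TAG:TYPE:VALUE after the sequence field.

```python
import io

REQUIRED_TAGS = ("SN", "SO", "LN")  # the tags named in the comment above


def missing_segment_tags(gfa_lines, required=REQUIRED_TAGS):
    """Return {segment_id: [missing tag names]} for S-lines lacking a required tag."""
    missing = {}
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # only segment lines carry these tags
        fields = line.rstrip("\n").split("\t")
        seg_id = fields[1]
        # Optional tags follow the sequence field (index 2).
        present = {field.split(":", 1)[0] for field in fields[3:]}
        absent = [tag for tag in required if tag not in present]
        if absent:
            missing[seg_id] = absent
    return missing


# Hypothetical two-segment graph: s1 is fully tagged, s2 has no tags at all.
example = io.StringIO(
    "S\ts1\tACGT\tSN:Z:chr1\tSO:i:0\tLN:i:4\n"
    "S\ts2\tGG\n"
    "L\ts1\t+\ts2\t+\t0M\n"
)
report = missing_segment_tags(example)
print(report)
```

To run it on a real file, pass an open file handle instead of the in-memory example; an empty result means every segment line carries all three tags.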

@samarendra-pani samarendra-pani self-assigned this Jan 16, 2025
@peterdfields
Author

@samarendra-pani Thank you for your response! That's great news that gaftools can create the needed output! My GFA was created with minigraph-cactus -> vg convert with the --vg-algorithm flag in order to use GraphAligner. It does not appear to have SN, SO, and LN tags. Do you have a recommendation for converting my present GFA to the GFA format needed for the stable GAF conversion?

@samarendra-pani
Member

I do not know of any tools that can do such a conversion yet. Gaftools assumes the reference graph to have the rGFA format as described here (https://github.com/lh3/gfatools/blob/master/doc/rGFA.md).
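To illustrate why these tags matter, here is a rough sketch of how a segment path in unstable coordinates could be rewritten as stable intervals using SN (stable sequence name), SO (offset on that sequence), and LN (segment length). This is not the gaftools implementation; the helper names and the example graph are hypothetical, and the sketch neither merges adjacent intervals nor adjusts the path start/end columns of a real GAF record.

```python
import re


def parse_segments(gfa_lines):
    """Map segment id -> (SN, SO, LN) from rGFA-style S-lines."""
    segments = {}
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        tags = {}
        for field in fields[3:]:
            name, _type, value = field.split(":", 2)
            tags[name] = value
        # All three tags are required here; a missing one raises KeyError.
        segments[fields[1]] = (tags["SN"], int(tags["SO"]), int(tags["LN"]))
    return segments


def to_stable_path(path, segments):
    """Rewrite an oriented segment path such as '>s1<s2' as stable intervals."""
    parts = []
    for orient, seg_id in re.findall(r"([<>])([^<>]+)", path):
        name, offset, length = segments[seg_id]
        parts.append(f"{orient}{name}:{offset}-{offset + length}")
    return "".join(parts)


# Hypothetical graph with two segments on chr1 and one on another contig.
gfa = [
    "S\ts1\tACGT\tSN:Z:chr1\tSO:i:0\tSR:i:0\tLN:i:4",
    "S\ts2\tGG\tSN:Z:chr1\tSO:i:4\tSR:i:0\tLN:i:2",
    "S\ts3\tTTT\tSN:Z:ctg7\tSO:i:10\tSR:i:1\tLN:i:3",
]
stable = to_stable_path(">s1>s3<s2", parse_segments(gfa))
print(stable)  # >chr1:0-4>ctg7:10-13<chr1:4-6
```

Without SN, SO, and LN on every segment there is nothing to look up, which is why a GFA lacking these tags cannot be used for the conversion as-is.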

Can you post a detailed message on how the GFA you used in your conversion was made? I will recreate the GFA and see if it can be converted into an rGFA (this conversion will also be a good addition to gaftools).

@peterdfields
Author

peterdfields commented Jan 17, 2025

Sure! I started with the .gbz file used primarily for aligning short reads with vg giraffe. I was following along with the steps described here for the GraphAligner alternative until vg giraffe releases long-read capabilities. The main command, as described in their example, for creating the GFA is then

vg convert ./hprc10/hprc10.gbz -f --vg-algorithm > ./hprc10/hprc10.gbz.gfa

Let me know if any additional details would be helpful.

*Quick addendum in case it's helpful: comparing the GFA that minigraph-cactus generates as part of the larger pangenome creation with the GFA that vg convert creates for GraphAligner, I do see SN, SO, and SR tags in the former, but no LN tags. I don't know how this would fit in with gaftools view, given that this GFA wasn't used in the alignment step, i.e., the GFA created by minigraph-cactus does not work with GraphAligner.
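If the only missing tag is LN, it can in principle be derived from the segment sequence, since LN:i records the sequence length in the GFA specification. The sketch below is an untested assumption about what might help, not a gaftools feature: the function name `add_ln_tags` and the example lines are hypothetical. It also does not address the separate concern raised above, that this GFA is not the one the reads were aligned to, so its segment IDs may not match those in the GAF.

```python
def add_ln_tags(gfa_lines):
    """Yield GFA lines, appending LN:i:<sequence length> to S-lines that lack it."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if (
            fields[0] == "S"
            and fields[2] != "*"  # '*' means the sequence is not stored inline
            and not any(f.startswith("LN:") for f in fields[3:])
        ):
            fields.append(f"LN:i:{len(fields[2])}")
        yield "\t".join(fields)


# Hypothetical input: one segment without LN, one with LN, and a link line.
lines = [
    "S\ts1\tACGT\tSN:Z:chr1\tSO:i:0\tSR:i:0",
    "S\ts2\tGG\tSN:Z:chr1\tSO:i:4\tSR:i:0\tLN:i:2",
    "L\ts1\t+\ts2\t+\t0M",
]
fixed = list(add_ln_tags(lines))
for out_line in fixed:
    print(out_line)
```

Segments that already have an LN tag, and non-segment lines, pass through unchanged.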

@peterdfields
Author

Hi @samarendra-pani. I was just wondering if you had had time to have a closer look at this issue? Thank you again for all your help.

@samarendra-pani
Member

Hello @peterdfields

Sorry for the delay. I am busy with something else at the moment. I looked at the GFA generated by vg convert, and it will be hard to recreate the tags from the GFA alone. I will work on a fix next week and let you know.

@peterdfields
Author

@samarendra-pani That's great news, thank you!

I'm hoping to continue working within the minigraph-cactus/vg-giraffe ecosystem because it works so well for short reads (and hopefully soon long reads). However, might you have a recommendation for another toolchain that could possibly generate the requisite inputs (graph, alignment indices, etc.) so that gaftools could generate the stable-coordinate GAF?

@peterdfields
Author

@samarendra-pani I just noticed that vg giraffe now offers long read mapping: https://github.com/vgteam/vg/releases/tag/v1.63.0

Do you think this could simplify the problem at all for creating the stable coordinate GAF?

@samarendra-pani
Member

I can try mapping with vg giraffe later. I am in the middle of writing a script for doing the conversion. Need to make sure it works properly with some test cases.

Does vg giraffe give an option to output a file in stable coordinates? If that can solve the problem, then awesome.
