Running error with Singularity image #422

Closed
shilinti opened this issue Oct 28, 2024 · 4 comments

@shilinti

Hello, thanks for the great work!

I am trying to run PGGB with the Singularity image, and I strictly followed the example command lines, as below:

  1. Pull the Docker image:
    git clone --recursive https://github.com/pangenome/pggb.git

  2. Clone the repo and cd into it:
    git clone --recursive https://github.com/pangenome/pggb.git
    cd pggb

  3. Run pggb:
    singularity run -B ${PWD}/data:/data ../pggb_latest.sif pggb -i /data/HLA/DRB1-3123.fa.gz -p 70 -s 3000 -n 10 -t 16 -V 'gi|568815561' -o /data/out
    which gives me:
    [pggb] warning: there are sequence names (like 'gi|568815592:32578768-32589835') that do not match the Pangenome Sequence Naming (PanSN).
    [pggb] ERROR: -V/--vcf-spec cannot be used if the Pangenome Sequence Naming (PanSN) is not respected.

I checked the solution in issue #388, but I am a little confused: do I have to install all the other components individually (wfmash, seqwish, ...) and pull the Docker image as well? Is there anything I did wrong?

FYI, I did:
./pggb --version
pggb v0.7.2-0-g0e9c9e1

Thanks for the help!
-S.

@AndreaGuarracino
Member

We should probably update the README and/or the input datasets. PanSN is currently mandatory in vg, the tool we use to project the graph into VCF format, but the sequences in DRB1-3123.fa.gz do not follow the PanSN convention. If you remove -V 'gi|568815561' from pggb's command line, which skips the GFA->VCF projection, it should work.
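
As a rough illustration of what "following PanSN" means here, the sketch below lists sequence names that do not have the sample#haplotype#contig shape. The function name non_pansn_names and the exact pattern are assumptions made for illustration, not pggb's own check; it only assumes '#' as the delimiter and three non-empty fields.

```shell
# List FASTA sequence names (read from stdin) that do not look like PanSN.
# Assumption: PanSN names have three non-empty fields separated by '#'
# (sample#haplotype#contig). This is an illustrative check, not pggb's.
non_pansn_names() {
    grep '^>' \
        | sed 's/^>//; s/[[:space:]].*$//' \
        | grep -Ev '^[^#]+#[^#]+#[^#]+$' || true
}
```

For example, zcat DRB1-3123.fa.gz | non_pansn_names should print every name that would trigger the warning, and print nothing once all names have three '#'-separated fields.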

A lazy way to fix the error and keep the VCF output is to just hack the sequence names in DRB1-3123.fa.gz:

zcat DRB1-3123.fa.gz | sed 's/:/#1#/g' > DRB1-3123.pansn.fa
samtools faidx DRB1-3123.pansn.fa # index the new FASTA file

and run pggb with DRB1-3123.pansn.fa.
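
The sed above rewrites every ':' in the file. A slightly more conservative variant, sketched here with the hypothetical helper name pansn_headers, restricts the substitution to header lines so that sequence lines are never touched. It assumes, as in the warning message quoted earlier, that names look like 'gi|568815592:32578768-32589835'.

```shell
# Rewrite ':' to '#1#' on FASTA header lines only (lines starting with '>').
# Reads FASTA from stdin and writes the renamed FASTA to stdout.
# Hypothetical helper; the '#1#' haplotype field mirrors the sed above.
pansn_headers() {
    sed '/^>/ s/:/#1#/g'
}
```

It could be used as zcat DRB1-3123.fa.gz | pansn_headers > DRB1-3123.pansn.fa, after which a name such as 'gi|568815592:32578768-32589835' becomes 'gi|568815592#1#32578768-32589835'.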

@shilinti
Author

Thanks for the fast response!

I tried both ways, and both gave me the same error:
singularity run -B ${PWD}/data:/data ../pggb_latest.sif pggb -i /data/HLA/DRB1-3123.pansn.fa -p 70 -s 3000 -n 10 -t 16 -V 'gi|568815561' -o /data/out
Illegal option --

I also tried the solutions in #291 and #404, and all of them ended up showing the same error.

Following the suggestions in those two issues, I tried the commands below:
singularity run -B ${PWD}/data:/data ../pggb_latest.sif /bin/bash -c "pggb -i /data/HLA/DRB1-3123.pansn.fa -p 70 -s 3000 -G 2000 -n 10 -t 16 -v -V 'gi|568815561:#' -o /data/out -M -m"
Illegal option --
singularity exec -B ${PWD}/data:/data ../pggb_latest.sif /bin/bash -c "pggb -i /data/HLA/DRB1-3123.pansn.fa -p 70 -s 3000 -G 2000 -n 10 -t 16 -v -V 'gi|568815561:#' -o /data/out -M -m"
Illegal option --
singularity exec -B ${PWD}/data:/data ../pggb_latest.sif /bin/bash -c "pggb -i /data/HLA/DRB1-3123.pansn.fa -p 70 -s 3000 -G 2000 -n 10 -t 16 -v -V 'gi|568815561' -o /data/out -M -m"

I am running on an HPC cluster, so I don't have Docker access and cannot run pggb with the docker commands.

Thanks,
-S.

@shilinti
Author

Running 'unset -f which' does the trick!
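
For anyone landing here with the same "Illegal option --" message, a likely explanation (an assumption, not confirmed in this thread) is that the host shell defines which as a shell function, that the function is carried into the container environment, and that the which inside the image does not accept the options the function passes. The sketch below, with the hypothetical helper name drop_which_function, removes such a function only if one is defined.

```shell
# Remove a shell function named `which`, if one is defined in this shell.
# `command -V` is POSIX; its output mentions "function" when `which`
# resolves to a shell function rather than an external program.
drop_which_function() {
    case "$(command -V which 2>/dev/null)" in
        *function*) unset -f which ;;
    esac
}
```

Calling drop_which_function (or simply unset -f which) in the host shell before the singularity run command is the intended use; whether this matches your cluster's setup is something to verify locally.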

Many thanks!
-S.
