Chimera files which start with FASTA and then other formats are incorrectly detected as FASTA #7

Open
fmaccha opened this issue Jul 13, 2024 · 0 comments
fmaccha commented Jul 13, 2024

Files that begin with FASTA records and continue in another format, as shown below, are incorrectly detected as FASTA.

>ref
AGCATGTTAGATAAGATAGCTGTGCTAGTAGGCAGTCAGCGCCAT
>ref2
aggttttataaaacaattaagtctacagagcaactacgcg
@HD     VN:1.0 SO:coordinate

Detection output:

./chimera.txt:
  decompressed:
    id: null
    label: null
  label: FASTA
  id: http://edamontology.org/format_1929

This happens because noodles::fasta reads the trailing SAM header line as FASTA sequence data. Its documentation says:

FASTA is a text format with no formal specification and only has de facto rules. It typically consists of a list of records, each with a definition on the first line and a sequence in the following lines.
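One possible mitigation, sketched here only as an illustration, is to run a stricter line-level check before accepting a file as FASTA. The function name `looks_like_strict_fasta` and the allowed alphabet (ASCII letters plus `*` and `-`) are assumptions made for this sketch. They are not part of noodles or of this project, and a real fix may need a different rule set.

```rust
/// Hypothetical helper (not part of noodles or this project).
/// Returns true only if every non-empty line is either a `>` definition
/// line or a line made up solely of plausible sequence characters.
fn looks_like_strict_fasta(text: &str) -> bool {
    let mut seen_definition = false;
    let mut seen_sequence = false;

    for line in text.lines() {
        // Drop trailing whitespace, including a stray '\r' from CRLF files.
        let line = line.trim_end();
        if line.is_empty() {
            continue;
        }

        if line.starts_with('>') {
            seen_definition = true;
            continue;
        }

        // Sequence data before any definition line is rejected.
        if !seen_definition {
            return false;
        }

        // Assumed alphabet: ASCII letters plus '*' (stop) and '-' (gap).
        // A SAM header such as "@HD VN:1.0 SO:coordinate" fails here
        // because of '@', ':', '.', digits, and inner whitespace.
        let is_sequence = line
            .bytes()
            .all(|b| b.is_ascii_alphabetic() || b == b'*' || b == b'-');
        if !is_sequence {
            return false;
        }
        seen_sequence = true;
    }

    // Require at least one definition line and one sequence line.
    seen_definition && seen_sequence
}
```

With this check, the chimera file from this issue would be rejected at its last line, while the same file without the `@HD` line would still pass. The trade-off is that legitimate FASTA files using characters outside the assumed alphabet would also be rejected.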

@fmaccha fmaccha added the bug Something isn't working label Jul 13, 2024
@fmaccha fmaccha self-assigned this Jul 13, 2024