bamtofastq generated 3 reads, with each read in triplicate #134

@Davidwei7

Dear Sir/Madam,

Hope you are well.

I downloaded the original BAM file for SRR7092170 (link: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&page_size=10&acc=SRR7092170&display=data-access) and converted it with bamtofastq. My command and its output were:

bamtofastq_linux /lustre/miifs01/project/m2_jgu-canshank3/Individual_Processing/Xiang_try_bam/YX_05.bam /lustre/miifs01/project/m2_jgu-canshank3/Individual_Processing/Xiang_try_bam/output
bamtofastq v1.4.1
Writing finished. Observed 127722890 read pairs. Wrote 127722890 read pair.
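
A quick arithmetic check may help relate the read-pair count in this log to the number of output chunks. The sketch below assumes a per-file cap of 50,000,000 reads, which is believed to be the default of bamtofastq's `--reads-per-fastq` option; that value is an assumption here and should be confirmed with `bamtofastq --help`. The helper name `expected_chunks` is hypothetical.

```python
import math

observed_read_pairs = 127_722_890  # taken from the bamtofastq log above
reads_per_fastq = 50_000_000       # ASSUMED per-chunk cap; verify with `bamtofastq --help`


def expected_chunks(n_reads, cap):
    """Number of files needed if each file holds at most `cap` reads."""
    return math.ceil(n_reads / cap)


print(expected_chunks(observed_read_pairs, reads_per_fastq))
```

If the assumption holds, the count printed here is the number of numbered files to expect per read type for a single lane.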

After the conversion, I obtained the following FASTQ files:

  • bamtofastq_S1_L001_I1_001.fastq.gz
  • bamtofastq_S1_L001_I1_002.fastq.gz
  • bamtofastq_S1_L001_I1_003.fastq.gz
  • bamtofastq_S1_L001_R1_001.fastq.gz
  • bamtofastq_S1_L001_R1_002.fastq.gz
  • bamtofastq_S1_L001_R1_003.fastq.gz
  • bamtofastq_S1_L001_R2_001.fastq.gz
  • bamtofastq_S1_L001_R2_002.fastq.gz
  • bamtofastq_S1_L001_R2_003.fastq.gz
  • bamtofastq_S1_L001_R3_001.fastq.gz
  • bamtofastq_S1_L001_R3_002.fastq.gz
  • bamtofastq_S1_L001_R3_003.fastq.gz
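
To make the naming pattern in this list easier to see, the file names can be grouped by read type (I1, R1, R2, R3) and by the trailing number. This is only a sketch: the regular expression and the function name `group_by_read_type` are hypothetical and are based solely on the names listed above, not on any documented naming rule.

```python
import re
from collections import defaultdict

# Pattern inferred from the file names in this issue (an assumption, not a spec).
FASTQ_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<read>[IR]\d)_(?P<chunk>\d{3})\.fastq\.gz$"
)


def group_by_read_type(names):
    """Map each read type (e.g. 'R1') to the sorted list of its chunk numbers."""
    groups = defaultdict(list)
    for name in names:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # skip names that do not follow the pattern
        groups[match.group("read")].append(match.group("chunk"))
    return {read: sorted(chunks) for read, chunks in sorted(groups.items())}


names = [
    f"bamtofastq_S1_L001_{read}_{chunk:03d}.fastq.gz"
    for read in ("I1", "R1", "R2", "R3")
    for chunk in (1, 2, 3)
]
print(group_by_read_type(names))
```

Grouping this way shows four read types, each carrying the same three numbered files.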

Please see the first few lines of these files:

bamtofastq_S1_L001_I1_001.fastq.gz:

@D00536:344:HFYLCBCXY:1:1114:4988:61338 4:N:0:0
AACGACAC
+
CCDDBIII
@D00536:344:HFYLCBCXY:1:1210:1863:95823 4:N:0:0
CGTCCTCT
+
DDDDAGHH
@D00536:344:HFYLCBCXY:1:2113:4071:88090 4:N:0:0
TTGATGGG

bamtofastq_S1_L001_R1_001.fastq.gz:

@D00536:344:HFYLCBCXY:1:1114:4988:61338 1:N:0:0
AATCTCGTTTAAACTACATGCAGGAACAGCAAAGGAAATCCGGCAAATTTGCGCAGTCATTCTCAACACCGGCCATGCAGCAAAATCATCAGTGGAAA
+
DDDDDHIEGHIIIIIHHHIHHHIIIIIIIIIIGHFHHIIGIIIIIGGIIIIIDHHHIIHHHHHHHIHIHFEDHHIFH?1FECG@GHGGGHHHHHIHHH
@D00536:344:HFYLCBCXY:1:1210:1863:95823 1:N:0:0
AAAGAAAAATGGTGAATGATACCCGGTGCTGGCAATCTCGTTTAAACTACATGCAGGAACAGCAAAGGAAATCCGGCAAATTTGCGCAGTCATTCTCA
+
DDCDCHIIHEGCCC1GCEEHHHFHHHIHHII1CCEHGGIIH1EGCHHHHHHHIGHIIHHHEGHIIFIGGHIIIIHIHIIIEHHHHHDHCCFGHHH?C1
@D00536:344:HFYLCBCXY:1:2113:4071:88090 1:N:0:0
AGTTAACGAAAAGAAAAATGGTGAATGATACCCGGTGCTGGCAATCTCGTTTAAACTACATGCAGGAACAGCAAAGGAAATCCGGCAAATTTGCGCAG

bamtofastq_S1_L001_R2_001.fastq.gz:

ATAACATGACCAAC
+
ADA@DIIFI?<FHH
@D00536:344:HFYLCBCXY:1:1210:1863:95823 2:N:0:0
CACGCTACAGATGA
+
DDDDAICCHIIHIE
@D00536:344:HFYLCBCXY:1:2113:4071:88090 2:N:0:0
CACGCTACAGATGA

bamtofastq_S1_L001_R3_001.fastq.gz:

@D00536:344:HFYLCBCXY:1:1114:4988:61338 3:N:0:0
GTAGGCAACA
+
DDDDCIIIIH
@D00536:344:HFYLCBCXY:1:1210:1863:95823 3:N:0:0
CGCAAAATAA
+
DDDDDIIIII
@D00536:344:HFYLCBCXY:1:2113:4071:88090 3:N:0:0
CGCAAAATAA

I am confused about why there is an R3. Was read 1 split into two reads? (I ask because the lengths of R2 and R3, 14 bp and 10 bp, add up to 24 bp.) Also, why are there R1 to R3 files, and why does every read and index file appear in triplicate as 001, 002 and 003?
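
The length observation above can be checked across many records instead of only the first few lines. Below is a minimal sketch, assuming the output files are standard gzipped FASTQ with four lines per record; the function name `read_lengths` and the glob pattern are hypothetical and should be adapted to the actual output directory.

```python
import glob
import gzip
from itertools import islice


def read_lengths(path, n_records=1000):
    """Return the set of sequence lengths in the first n_records of a gzipped FASTQ."""
    lengths = set()
    with gzip.open(path, "rt") as handle:
        # Each FASTQ record is 4 lines: header, sequence, '+', qualities.
        for i, line in enumerate(islice(handle, 4 * n_records)):
            if i % 4 == 1:  # the sequence line of each record
                lengths.add(len(line.rstrip("\n")))
    return lengths


# Hypothetical usage: prints nothing if no matching files are in the current directory.
for path in sorted(glob.glob("bamtofastq_S1_L001_*_001.fastq.gz")):
    print(path, sorted(read_lengths(path)))
```

If R2 and R3 each report one constant length, that supports reading them as fixed-length segments; the sketch does not say what those segments are.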

I am hoping that I have described my problem sufficiently for a response to solve my issue.

Thank you in advance, and thank you for developing this tool. I look forward to your response.

Best Wishes,

David
