I tried to split the combined VCF into samples and controls to compare the counts of known and novel variants in each of them.
bcftools view -s SRR5858157_F8_III.4,SRR5858162_F8_III.3,SRR5858204_F6_II.2 raw_variants_ann.vcf > raw_variants_ann_samples.vcf
bcftools view -s SRR5858160_F7_II.3,SRR5858161_F7_II.2 raw_variants_ann.vcf > raw_variants_ann_controls.vcf
grep -v "^#" raw_variants_ann_controls.vcf | awk '{print $3}' | wc -l                 # total: 35472
grep -v "^#" raw_variants_ann_controls.vcf | awk '{print $3}' | grep "^rs" | wc -l    # known: 10900
grep -v "^#" raw_variants_ann_samples.vcf | awk '{print $3}' | wc -l                  # total: 35472
grep -v "^#" raw_variants_ann_samples.vcf | awk '{print $3}' | grep "^rs" | wc -l     # known: 10900
The numbers are equal, so I think there is something wrong in my code, but I really don't have time to try to solve it: today is the deadline, and we can calculate them from the state file created before.
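A likely explanation for the identical counts: bcftools view -s only drops the genotype columns of the samples that are not listed. It keeps every record of the combined VCF, including sites where none of the retained samples carries a non-reference allele, so both subsets end up with the same rows and the same IDs. The sketch below is one possible fix, not something that was run on this data. It assumes that requiring at least one non-reference allele (-c 1) is the intended definition of "a variant in this group". The function names subset_variant_sites and count_known_novel are made up for illustration. Subsetting and filtering are piped as two separate bcftools calls so that the allele count is taken after the samples have been removed.

```shell
# Hypothetical helper: keep only the listed samples, then keep only the
# sites where at least one of them has a non-reference allele.
# Arguments: comma-separated sample list, input VCF, output VCF.
subset_variant_sites() {
    samples="$1"
    in_vcf="$2"
    out_vcf="$3"
    # Step 1: subset samples (uncompressed BCF on the pipe).
    # Step 2: -c 1 keeps sites with a non-reference allele count of 1 or more.
    bcftools view -Ou -s "$samples" "$in_vcf" \
        | bcftools view -c 1 -Ov -o "$out_vcf" -
}

# Hypothetical helper: count total, known (ID starts with "rs") and novel
# records in one pass over an uncompressed VCF.
count_known_novel() {
    awk -F'\t' '
        !/^#/ { total++; if ($3 ~ /^rs/) known++ }
        END   { printf "total=%d known=%d novel=%d\n", total, known, total - known }
    ' "$1"
}

# Intended usage with the files from this post (left commented out):
# subset_variant_sites SRR5858157_F8_III.4,SRR5858162_F8_III.3,SRR5858204_F6_II.2 raw_variants_ann.vcf raw_variants_ann_samples.vcf
# subset_variant_sites SRR5858160_F7_II.3,SRR5858161_F7_II.2 raw_variants_ann.vcf raw_variants_ann_controls.vcf
# count_known_novel raw_variants_ann_samples.vcf
# count_known_novel raw_variants_ann_controls.vcf
```

If the two groups really differ in which sites they carry, the totals printed for the two files should no longer be forced to match.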
These links were helpful:
- https://bioinformatics.stackexchange.com/questions/3477/how-to-subset-samples-from-a-vcf-file
- https://toolshed.g2.bx.psu.edu/repository/display_tool?repository_id=f667c2ee6f2ca971&tool_config=%2Fsrv%2Ftoolshed%2Fmain%2Fvar%2Fdata%2Frepos%2F002%2Frepo_2516%2Fbcftools_view.xml&changeset_revision=cc016cb332cd