Filter Data (optional)

The bioconda package includes jq as an option for simple filtering of JSONL variant data while combining:

# Select only JSONL rows where the protein change annotation (ANN_hgvs_p) from SnpEff
# is not null and the variant allele frequency (AF) is > 0.01
cat *.jsonl | jq -c 'select(.ANN_hgvs_p != null and .AF > 0.01)' > data.jsonl

You can read more about jq syntax here.
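
If jq is not available, the same row filter can be approximated in Python. This is a minimal sketch, not part of Mucor3: the function names `keep_row` and `filter_jsonl` are hypothetical, and it assumes each line is a JSON object whose `AF` field is a single number. Unlike jq, which orders strings and arrays above numbers in comparisons, this sketch drops rows with a missing or non-numeric `AF`.

```python
import json
from typing import Iterable, Iterator


def keep_row(row: dict, min_af: float = 0.01) -> bool:
    """Approximate jq's select(.ANN_hgvs_p != null and .AF > 0.01)."""
    af = row.get("AF")
    # Assumption: AF is one number per row; anything else is dropped.
    if isinstance(af, bool) or not isinstance(af, (int, float)):
        return False
    return row.get("ANN_hgvs_p") is not None and af > min_af


def filter_jsonl(lines: Iterable[str], min_af: float = 0.01) -> Iterator[str]:
    """Yield compact JSON strings for the rows that pass keep_row."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between concatenated files
        row = json.loads(line)
        if keep_row(row, min_af):
            # Compact separators mimic the one-row-per-line output of jq -c.
            yield json.dumps(row, separators=(",", ":"))
```

To use it as a pipe stage, pass `sys.stdin` to `filter_jsonl` and print each yielded string.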

Note: If more extensive filtering is needed, Mucor3 should work with any NoSQL datastore that accepts JSONL input and produces JSONL output, e.g. Elasticsearch, Apache Drill, or CouchDB. Example Python scripts for Elasticsearch are available in the examples folder. These scripts are not currently included in the bioconda package.