
Bioinformatics pipeline to generate phylogenetic trees from BGC genbank records using Nextflow


JBwdn/BGC_2_Tree


BGC2Tree pipeline

For creating phylogenetic trees of homologous proteins from biosynthetic gene clusters

Requirements: (Tested in WSL2 Ubuntu 20.04.2 LTS)

Example local usage:

nextflow run main.nf --in data/example_query.fasta --db data/gbk_records

Remote usage:

nextflow run jbwdn/bgc_2_tree --in your_local_query.fasta --db your_local_directory_of_gbks
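
Both commands assume that the file given to --in holds exactly one sequence and that the folder given to --db contains .gbk records. The sketch below is one possible pre-flight check in Python; it is not part of the pipeline, and the function names `count_fasta_records` and `check_inputs` are hypothetical.

```python
from pathlib import Path
import tempfile


def count_fasta_records(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_inputs(query_fasta, gbk_dir):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    n_records = count_fasta_records(query_fasta)
    if n_records != 1:
        problems.append(
            f"query must contain exactly 1 sequence, found {n_records}"
        )
    # Assumption: records use the .gbk extension, as in the README text.
    if not any(Path(gbk_dir).glob("*.gbk")):
        problems.append(f"no .gbk files found in {gbk_dir}")
    return problems


# Small self-contained demo using throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    query = Path(tmp) / "query.fasta"
    query.write_text(">query_protein\nMKTAYIAKQR\n")
    records = Path(tmp) / "gbk_records"
    records.mkdir()
    (records / "cluster1.gbk").write_text("LOCUS placeholder\n//\n")
    print(check_inputs(query, records))
```

An empty list means both inputs match what the pipeline expects; otherwise each entry describes one problem to fix before running Nextflow.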

Description:

1. Accept a query FASTA file containing a single sequence and a path to a folder of GenBank (.gbk) records.
2. Use phmmer to find a potential homolog in each record and save them all, with labels, to a single FASTA file.
3. Align the protein sequences with MUSCLE.
4. Build a phylogenetic tree from the alignment with FastTree.
5. Print the paths to the three output files (homologs, alignment and tree) to stdout.
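
The steps above can be sketched as a chain of three command-line calls. The Python snippet below only builds the argument lists and runs nothing; it is an illustration, not the pipeline's actual Nextflow code. It makes several assumptions: each .gbk record has already been converted to a protein FASTA file (phmmer searches protein sequences), MUSCLE is a 3.x version using `-in`/`-out` (MUSCLE 5 uses `-align`/`-output` instead), and the output file names and the function name `build_plan` are hypothetical.

```python
from pathlib import Path


def build_plan(query_fasta, record_proteins_fasta, homologs_fasta, out_dir):
    """Build argument lists for the search, align and tree steps (nothing is run)."""
    out = Path(out_dir)
    hits_table = out / "hits.tbl"       # hypothetical file name
    alignment = out / "alignment.afa"   # hypothetical file name
    tree = out / "tree.nwk"             # hypothetical file name

    return {
        # phmmer <options> <query seqfile> <target seqdb>; --tblout writes a
        # parseable per-target hit table.
        "search": [
            "phmmer", "--tblout", str(hits_table),
            str(query_fasta), str(record_proteins_fasta),
        ],
        # Assumption: MUSCLE 3.x flags.
        "align": ["muscle", "-in", str(homologs_fasta), "-out", str(alignment)],
        # FastTree treats the input as protein by default and writes the
        # Newick tree to stdout, so a runner would redirect stdout to `tree`.
        "tree": ["FastTree", str(alignment)],
        # Step 5: the three output paths that would be reported.
        "outputs": [str(homologs_fasta), str(alignment), str(tree)],
    }


plan = build_plan("query.fasta", "record1_proteins.faa", "homologs.fasta", "results")
for step in ("search", "align", "tree"):
    print(step, "->", " ".join(plan[step]))
```

Each list could be passed to `subprocess.run(..., check=True)`, with the tree step's stdout redirected to the tree file. Keeping the plan as plain data makes it easy to inspect before anything runs.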

See Nextflow and conda docs for installation instructions.

Built using this template.

