ighutil

Tools for working with immunoglobulin heavy chain sequences at high throughput

The project is divided into two sections: a python module for aligning sequences, and a JVM-based tool for post-processing.

Prerequisites

samtools
Java 7
Python 2.7
zlib
A C++11-compatible compiler (tested against gcc 4.8)
Cython (tested against v0.20)

Installation

Run make -C clj - creates clj/bin/ighutil
Run pip install ./python - installs vdjalign Python tool.

Alternatively, Debian packages ighutil and vdjalign can be installed from this repository:

deb     https://s3.amazonaws.com/cmccoy-debian-repo/ stable main

Packages are signed with key 3AA09EC1. Add with:

sudo apt-key adv --keyserver pgp.mit.edu --recv-keys 3AA09EC1

Then:

sudo apt-get update
sudo apt-get install -y ighutil vdjalign

Workflow

Alignment

IGH reads are aligned via the included python package, via the command ighutil align-fastq. The alignment process is simple: each sequence is aligned to all IGHV genes, keeping all alignments within a user-specified value of the maximum alignment score.

Alignment is multithreaded, and SSE3 accelerated using the excellent Smith-Waterman implementation from klib.

The remainder of the sequence is aligned against IGHJ genes; then the remainder from the best IGHJ hit against IGHD genes.

The resulting alignments are stored in BAM format.

Choosing a V/D/J assignment

Remaining post-processing uses subcommands of the JVM interface.

ighutil reset-primary randomly assigns a primary alignment among ties for each segment, and attempts to filter out alleles not present in the data.
ighutil calculate-match-probability calculates the expectation that each aligned base matches germline, treating all maximally scoring alignmnets as equiprobable.
samtools view -F 260 will subset to primary alignments only.

Annotating sequences

Annotate sequences with V/D/J gene assignment, productivity status, and more using:

ighutil annotate-vdj -i <BAM_FILE> -o <CSV_FILE>

Name		Name	Last commit message	Last commit date
Latest commit History 461 Commits
clj		clj
python		python
.gitignore		.gitignore
COPYING		COPYING
Dockerfile		Dockerfile
Gemfile		Gemfile
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ighutil

Prerequisites

Installation

Workflow

Alignment

Choosing a V/D/J assignment

Annotating sequences

About

Releases

Packages

Languages

License

cmccoy/ighutil

Folders and files

Latest commit

History

Repository files navigation

ighutil

Prerequisites

Installation

Workflow

Alignment

Choosing a V/D/J assignment

Annotating sequences

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages