bigbio/quantmsdiann

quantmsdiann

Introduction

quantmsdiann is a bigbio bioinformatics pipeline, built following nf-core guidelines, for quantitative mass spectrometry analysis using DIA-NN. It supports Data-Independent Acquisition (DIA) workflows including label-free, plexDIA (mTRAQ, SILAC, Dimethyl), phosphoproteomics with site localization, and Bruker timsTOF/PASEF data.

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It runs in Docker or Singularity containers, which makes results highly reproducible. The Nextflow DSL2 implementation uses one container per process, so software dependencies are easy to maintain and update.

Pipeline summary

[Figure: quantmsdiann workflow diagram]

The pipeline takes SDRF metadata and mass spectrometry data files (.raw, .mzML, .d, .dia) as input and performs:

  1. Input validation — SDRF parsing and validation via sdrf-pipelines
  2. File preparation — RAW to mzML conversion (ThermoRawFileParser), indexing, Bruker .d handling (tdf2mzml)
  3. In-silico spectral library generation — deep learning-based prediction, or use a user-provided library (--diann_speclib)
  4. Preliminary analysis — per-file calibration and mass accuracy estimation (parallelized)
  5. Empirical library assembly — consensus library from preliminary results with RT profiling
  6. Individual analysis — per-file search with the empirical library (parallelized)
  7. Final quantification — protein/peptide/gene group matrices with cross-run normalization
  8. MSstats conversion — DIA-NN report to MSstats-compatible format
  9. Quality control — interactive QC report via pmultiqc
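
As a rough pre-flight check before launching, the input formats listed above (.raw, .mzML, .d, .dia) can be verified with a small shell helper. This is only a sketch: the function name `check_ms_inputs` is hypothetical and not part of quantmsdiann, and it does not replace the pipeline's own SDRF validation via sdrf-pipelines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of quantmsdiann): reports paths whose
# extension is not one of the input formats the pipeline lists as supported.
# Matching is case-sensitive, so "sample.RAW" would be reported as unsupported.
check_ms_inputs() {
    local rc=0 f
    for f in "$@"; do
        case "$f" in
            *.raw|*.mzML|*.d|*.dia)
                echo "ok: $f" ;;
            *)
                echo "unsupported: $f"
                rc=1 ;;
        esac
    done
    # Non-zero exit status if at least one path was unsupported.
    return "$rc"
}

# Example call (file names are placeholders):
#   check_ms_inputs sample1.raw sample2.mzML notes.txt
```

The function only inspects file names, so it works for Bruker .d directories as well as regular files, as long as the path is given without a trailing slash.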

Supported DIA-NN Versions

| Version         | Profile      | Container                                 | Key features                                  |
|-----------------|--------------|-------------------------------------------|-----------------------------------------------|
| 1.8.1 (default) | diann_v1_8_1 | docker.io/biocontainers/diann:v1.8.1_cv1  | Core DIA analysis, TSV output                 |
| 2.1.0           | diann_v2_1_0 | ghcr.io/bigbio/diann:2.1.0                | Native .raw support, Parquet output           |
| 2.2.0           | diann_v2_2_0 | ghcr.io/bigbio/diann:2.2.0                | Speed optimizations (up to 1.6x on HPC)       |
| 2.3.2           | diann_v2_3_2 | ghcr.io/bigbio/diann:2.3.2                | DDA support (beta), InfinDIA, up to 9 var mods |

Switch versions with e.g. -profile diann_v2_2_0,docker. See the DIA-NN Version Selection guide and full parameter reference for details.

Quick start

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow.

Run with test data:

nextflow run bigbio/quantmsdiann -profile test_dia,docker --outdir results

Run with your own data:

nextflow run bigbio/quantmsdiann \
    --input 'experiment.sdrf.tsv' \
    --database 'proteins.fasta' \
    --outdir './results' \
    -profile docker

Run with a specific DIA-NN version:

nextflow run bigbio/quantmsdiann \
    --input 'experiment.sdrf.tsv' \
    --database 'proteins.fasta' \
    --outdir './results' \
    -profile docker,diann_v2_2_0

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files specified with -c must only be used for tuning process resource specifications, not for defining parameters.
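
A minimal sketch of the params-file approach, assuming the same three parameters used in the commands above (`--input`, `--database`, `--outdir`). The file name `params.yaml` is an arbitrary choice; Nextflow's `-params-file` option accepts YAML or JSON.

```shell
# Write the pipeline parameters to a YAML file (the file name is illustrative).
cat > params.yaml <<'EOF'
input: "experiment.sdrf.tsv"
database: "proteins.fasta"
outdir: "./results"
EOF

# Then launch with the params file instead of individual CLI flags:
#   nextflow run bigbio/quantmsdiann -params-file params.yaml -profile docker
```

Keeping parameters in a file like this leaves `-c` free for resource tuning only, and makes a run easier to repeat with identical settings.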

Documentation

  • Usage — How to run the pipeline, input formats, optional outputs, and custom configuration
  • Parameters — Complete reference of all pipeline parameters organised by category
  • Output — Description of all output files produced by the pipeline

Credits

quantmsdiann is developed and maintained by:

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citation

If you use quantmsdiann in your research, please cite:

Dai et al. "quantms: a cloud-based pipeline for quantitative proteomics" (2024). DOI: 10.5281/zenodo.15573386

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

License

MIT
