bigbio/relink is a bioinformatics pipeline for crosslinking mass spectrometry (XL-MS) data analysis. It processes mass spectrometry data through several stages including file conversion, linear search for mass recalibration, mass recalibration, crosslinking search, and FDR correction using the xiSEARCH/xiFDR suite.
The pipeline is built using Nextflow, a workflow tool that runs tasks portably across multiple compute infrastructures. It uses Docker/Singularity containers, which makes installation trivial and results highly reproducible. The pipeline follows nf-core guidelines.
- File Conversion - Convert Thermo RAW files to MGF format using ThermoRawFileParser
- Linear Search - Run xiSEARCH linear peptide search for mass error estimation
- Mass Recalibration - Calculate and apply mass corrections to MS1 and MS2 spectra
- Crosslinking Search - Run xiSEARCH crosslinking peptide search
- FDR Correction - Apply xiFDR for false discovery rate estimation
- Reporting - Generate MultiQC report with analysis summary
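
The idea behind the Linear Search and Mass Recalibration steps can be illustrated with a small Python sketch. It is not the pipeline's actual implementation: it assumes the systematic mass error is summarized as the median ppm error of confident linear matches and that the same relative shift is removed from every m/z value. The function names and the example values are hypothetical.

```python
from statistics import median


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def estimate_shift_ppm(matches: list[tuple[float, float]]) -> float:
    """Median ppm error over (observed m/z, theoretical m/z) pairs."""
    return median(ppm_error(obs, theo) for obs, theo in matches)


def recalibrate(mz_values: list[float], shift_ppm: float) -> list[float]:
    """Remove a systematic ppm shift from observed m/z values."""
    return [mz / (1.0 + shift_ppm * 1e-6) for mz in mz_values]


# Hypothetical matches from a linear search: (observed m/z, theoretical m/z).
matches = [(500.0025, 500.0), (750.0030, 750.0), (1000.0060, 1000.0)]

shift = estimate_shift_ppm(matches)  # individual errors are about 5, 4 and 6 ppm
corrected = recalibrate([obs for obs, _ in matches], shift)
```

Using the median rather than the mean keeps a few poor matches from dominating the estimated shift. That is a design choice of this sketch, not a documented property of the pipeline.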
- Install Nextflow (>=23.04.0).
- Install any of Docker, Singularity (you can follow this tutorial), Podman, Shifter or Charliecloud for full pipeline reproducibility.
- Download the pipeline and test it on a minimal dataset with a single command:

      nextflow run bigbio/relink -profile test,docker --outdir ./results

- Start running your own analysis:

      nextflow run bigbio/relink \
        -profile docker \
        --input samplesheet.csv \
        --outdir ./results

The input samplesheet is a CSV file with the following columns:
sample,file,fasta,xi_linear_config,xi_crosslink_config
sample1,/path/to/sample1.raw,/path/to/database.fasta,/path/to/xi_linear.conf,/path/to/xi_crosslinking.conf
sample2,/path/to/sample2.raw,/path/to/database.fasta,/path/to/xi_linear.conf,/path/to/xi_crosslinking.conf

| Column | Description |
|---|---|
| sample | Sample identifier (unique) |
| file | Path to RAW or MGF file |
| fasta | Path to FASTA database |
| xi_linear_config | Path to xiSEARCH linear configuration file |
| xi_crosslink_config | Path to xiSEARCH crosslinking configuration file |
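
Before launching a run, it can help to check a samplesheet against the column rules described above. The following Python sketch is a hypothetical helper, not part of the pipeline. It assumes that RAW and MGF files can be recognized by a `.raw` or `.mgf` extension, and it does not check that the paths exist.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "sample", "file", "fasta", "xi_linear_config", "xi_crosslink_config",
]


def validate_samplesheet(handle) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample = (row["sample"] or "").strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample identifier")
        elif sample in seen:
            problems.append(f"line {line_no}: duplicate sample '{sample}'")
        seen.add(sample)
        # Assumption: input type is inferred from the file extension.
        if not (row["file"] or "").lower().endswith((".raw", ".mgf")):
            problems.append(f"line {line_no}: file is not RAW or MGF")
    return problems


header = ",".join(REQUIRED_COLUMNS)
sheet = (
    f"{header}\n"
    "sample1,/path/to/sample1.raw,/path/to/database.fasta,"
    "/path/to/xi_linear.conf,/path/to/xi_crosslinking.conf\n"
)
problems = validate_samplesheet(io.StringIO(sheet))
```

To check a file on disk, pass an open file handle instead of the `io.StringIO` object, for example `validate_samplesheet(open("samplesheet.csv", newline=""))`.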
| Parameter | Description | Default |
|---|---|---|
| --input | Path to input samplesheet CSV | Required |
| --outdir | Output directory for results | ./results |

| Parameter | Description | Default |
|---|---|---|
| --do_recalibration | Perform mass recalibration | true |
| --do_crosslinking_search | Perform crosslinking search | true |
| --do_fdr | Perform FDR correction | true |
| --do_mass_error_plots | Generate mass error plots | false |
| --link_fdr | Link-level FDR threshold (%) | 5 |
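
To give a feel for what a percentage threshold such as `--link_fdr 5` means, here is a minimal Python sketch of a target-decoy estimate for crosslinks. It assumes the commonly used estimator FDR ≈ (TD − DD) / TT, where TT, TD and DD are the numbers of target-target, target-decoy and decoy-decoy matches above a score cutoff. xiFDR itself does considerably more than this, so treat the functions and counts below as illustrative only.

```python
def crosslink_fdr(tt: int, td: int, dd: int) -> float:
    """Estimate the FDR as (TD - DD) / TT, floored at zero."""
    if tt <= 0:
        raise ValueError("at least one target-target match is required")
    return max(td - dd, 0) / tt


def passes_threshold(tt: int, td: int, dd: int, link_fdr_percent: float = 5.0) -> bool:
    """Compare the estimated FDR with a threshold given in percent."""
    return crosslink_fdr(tt, td, dd) <= link_fdr_percent / 100.0


# Hypothetical counts above two different score cutoffs.
strict = crosslink_fdr(tt=400, td=12, dd=2)   # (12 - 2) / 400
lenient = crosslink_fdr(tt=100, td=12, dd=2)  # (12 - 2) / 100

strict_ok = passes_threshold(400, 12, 2)      # within the 5% default
lenient_ok = passes_threshold(100, 12, 2)     # exceeds the 5% default
```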
The pipeline outputs the following directories:
results/
├── mgf/ # Converted MGF files
├── linear_search/ # Linear search results
├── recalibrated/ # Recalibrated MGF files
│ └── plots/ # Mass error plots (optional)
├── crosslinking_search/ # Crosslinking search results
├── fdr/ # FDR-corrected results
├── multiqc/ # MultiQC report
└── pipeline_info/ # Pipeline execution info
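
After a run, a short Python check can confirm that the directories listed above were created. This is a hypothetical convenience script, not something shipped with the pipeline. It assumes all steps were enabled; directories belonging to skipped steps (for example `fdr/` when FDR correction is turned off) would presumably be absent.

```python
from pathlib import Path

# Top-level directories taken from the output tree above.
EXPECTED_DIRS = [
    "mgf",
    "linear_search",
    "recalibrated",
    "crosslinking_search",
    "fdr",
    "multiqc",
    "pipeline_info",
]


def missing_outputs(outdir: str, expected: list[str] = EXPECTED_DIRS) -> list[str]:
    """Return the expected subdirectories that do not exist under outdir."""
    root = Path(outdir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_outputs("./results")
    if missing:
        print("Missing output directories:", ", ".join(missing))
    else:
        print("All expected output directories are present.")
```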
- ThermoRawFileParser - RAW file conversion
- xiSEARCH - Crosslinking peptide search
- xiFDR - FDR estimation
- pyOpenMS - Mass spectrometry data processing
- Polars - Data processing
If you use bigbio/relink for your analysis, please cite:
- xiSEARCH: Mendes, M.L., et al. (2019). "An integrated workflow for crosslinking mass spectrometry." Molecular Systems Biology.
- xiFDR: Fischer, L., & Rappsilber, J. (2017). "Quirks of error estimation in cross-linking/mass spectrometry." Analytical Chemistry.
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
We welcome contributions! Please see our contribution guidelines for details.
For questions, issues, or feature requests, please open an issue on the GitHub repository.
This pipeline is released under the Apache 2.0 License.