Releases: ftwkoopmans/msdap
release 1.0.6
- major update to the differential detection metric
- computation of z-scores was further refined
- additional parameters via analysis_quickstart() give more control
- proteins with just 1 peptide are now skipped by default
- The analysis_quickstart() example on the main GitHub page has been updated accordingly
- Additional details can be found in the differential detection vignette
- new function for summarizing statistical results at gene-level; summarise_stats() can merge results from DEA and differential detection, returning 1 result/statistic per gene (across contrasts and DEA algorithms).
- See the "summarizing results from statistics + gene ID mapping" vignette for more details
- new function for mapping mouse, rat and human proteins to human proteins, e.g. to prepare your data for downstream analyses where human gene identifiers are required. This requires external/downloaded mapping tables.
- Additional details can be found in the summarizing results from statistics + gene ID mapping vignette
- new function for merging technical replicates up-front; merge_replicate_samples(). Usage is situational; in some cases you may prefer to deal with (technical) replicates in downstream statistical modeling. For an example and documentation, see ?merge_replicate_samples()
- the MS-DAP dataset object is now stored in the output directory by default ('dataset.RData')
release 1.0.5
- reworked import function for ProteomeDiscoverer (after installing this MS-DAP update, see the function documentation for import_dataset_proteomediscoverer_txt for details and options)
- bugfix for importing Spectronaut reports that use multiple spectral libraries
- various minor updates
release 1.0.4
This release brings a major update to our support for FragPipe datasets. If you're using FragPipe, make sure to upgrade MS-DAP:
- 'recent' FragPipe versions produce different output files, we have updated MS-DAP accordingly
- MS-DAP now supports 3 different FragPipe LFQ dataset output formats
- For further details, see the documentation on using FragPipe and MS-DAP together
- To update your MS-DAP installation, refer to the brief instructions on the MS-DAP main page
release 1.0.3.1
minor update to resolve issues with DIA-NN report.tsv files that contain multiple rows of data for a precursor ID in the same sample, but with zero values in the retention time and quantity/intensity columns (in version 1.0.3, this would result in the MS-DAP error message "peptide_id*sample_id combinations are not unique ...")
release 1.0.3
- breaking change; some parameters have been renamed. This mostly applies to users writing code that directly invokes dea/normalization functions, or operates on the dataset$de_proteins table
- renamed algo_de to dea_algorithm throughout MS-DAP codebase
- renamed dea_protein_rollup and algo_rollup to rollup_algorithm throughout MS-DAP codebase
- refactored parameter names for normalize_matrix()
- normalization; a new variant of VWMB is now available, Mode Within Mode Between (MWMB). It normalizes (scales) samples within each sample group such that their pairwise log-foldchange modes are zero (whereas VWMB would normalize replicates by reducing peptide variation), then scales between groups such that the log-foldchange mode is zero (i.e. the between-group part is the same as VWMB). If the dataset has (unknown) covariates and a sufficient number of replicates, this might be beneficial because covariate-specific effects are not averaged out as they might be with VWMB. However, our recommendation for general-purpose use (i.e. a good starting point for the initial investigation of a dataset) is still norm_algorithm = c("vsn", "modebetween_protein")
- peptide-to-protein rollup; Tukey's median polish (TMP) is now also available in analysis_quickstart() and dea() through the rollup_algorithm parameter. MaxLFQ remains default; our benchmarking showed it is marginally better overall than TMP
- report; the foldchange-distribution visualizations of within-group outliers are now accompanied by a table with respective scores
- report; configured normalization and rollup algorithms are now used in most QC figures in the PDF report (instead of always using VWMB for the retention-time-error and CoV plots)
- export data; "peptide*sample" data matrices are now also exported when using analysis_quickstart() with parameter output_abundance_tables=TRUE. If you're not using the analysis_quickstart() function, you may directly call export_peptide_abundance_matrix() to create "peptide*sample" TSV files
- docker; the MS-DAP docker container is now built on recent R version 4.2.1 (and added minor improvements to the unix launcher script)
- MS-DAP can now import compressed fasta files and data tables while importing from MaxQuant / DIA-NN / Spectronaut / FragPipe / MetaMorpheus. Supported compression formats; .zip|.gz|.bz2|.xz|.7z|.zst|.lz4
- you can present compressed files as input, or provide file paths as-is without the extension for the archive (e.g. filename="C:/experiment_x/diann/report.tsv"). If the input file is not found, MS-DAP will automatically check if a compressed variant of the file is available (by matching against supported file extensions)
- for upstream software that produces a folder with multiple files that need to be read into MS-DAP; compress each file individually (e.g. for MaxQuant, make separate zip/gzip/Zstandard/etc. files like evidence.txt.zip, proteinGroups.txt.zip and peptides.txt.zip, then work with MS-DAP as usual)
- you can now skip the creation of any output files/directories in analysis_quickstart() by setting output_dir=NA
- more documentation was added to both R functions and the GitHub vignettes
- bugfix for the DEqMS package, resolves a rare error when dealing with proteins that have (near) zero variation
- various minor bugfixes and code speedups (no impact on analysis results)
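
The compressed-input fallback described in the 1.0.3 notes above (if the given file is not found, look for a compressed variant by matching against the supported extensions) can be illustrated with a minimal sketch. This is not MS-DAP code: it is written in Python for illustration only, `resolve_input_path` is a hypothetical helper name, and the order in which extensions are tried is an assumption.

```python
import os
import tempfile

# Extensions as listed in the release note; the lookup order is an assumption.
COMPRESSION_EXTENSIONS = (".zip", ".gz", ".bz2", ".xz", ".7z", ".zst", ".lz4")


def resolve_input_path(path):
    """Return `path` if it exists, else the first existing compressed variant.

    Hypothetical helper for illustration; not part of the MS-DAP API.
    Returns None when neither the file nor a compressed variant is found.
    """
    if os.path.isfile(path):
        return path
    for ext in COMPRESSION_EXTENSIONS:
        candidate = path + ext
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Create a directory holding only report.tsv.gz, then resolve report.tsv."""
    with tempfile.TemporaryDirectory() as tmp:
        compressed = os.path.join(tmp, "report.tsv.gz")
        with open(compressed, "wb") as handle:
            handle.write(b"")  # placeholder file; content is irrelevant for the lookup
        resolved = resolve_input_path(os.path.join(tmp, "report.tsv"))
        return os.path.basename(resolved)
```

In this sketch, asking for `report.tsv` when only `report.tsv.gz` exists resolves to the compressed file, which mirrors the "provide file paths as-is without the extension for the archive" usage described above.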
release 1.0.2
- new implementation for bootstrapped inference of foldchange thresholds in DEA analysis, which scales to large datasets
- this applies to users who set dea_log2foldchange_threshold=NA in analysis_quickstart() to let an algorithm infer a threshold/cutoff for log2 foldchanges in volcano plots
- importantly, be aware that this may result in a minor change in the estimated foldchange threshold (0.1~3% difference in the estimated threshold for 10 datasets we tested)
- keep an eye on this when comparing analyses across MS-DAP versions
- on that note; in datasets with few replicates (for instance WT~KO with 3 replicates each), a bootstrap analysis to find background-level foldchanges based on permutations of sample-to-group assignments is limited and possibly over-conservative (i.e. there aren't many permutations to make). It's recommended to apply this approach to datasets with 5+ replicates
- additional input validations for volcano plots (for users that call this function directly, i.e. outside of the canonical analysis_quickstart() workflow)
- expanded QC plots for within-group variation; peptide foldchange distribution plots now come in 2 styles and have improved color-coding
- bugfix specific to R 4.x (a warning that was raised to an error only in R 4.x)
- MS-DAP has been tested against the new R release 4.2
- documentation updates
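
To make the 1.0.2 note about few replicates concrete ("there aren't many permutations to make"), the following sketch counts how many distinct two-group splits exist for a given number of replicates per group. It is a counting argument only, written in Python for illustration; it does not reproduce how MS-DAP samples or uses permutations, and `distinct_group_splits` is a hypothetical name. It assumes two equally sized groups and treats a split and its mirror image (group labels swapped) as the same, since swapping labels only flips the sign of the foldchanges.

```python
from itertools import combinations


def distinct_group_splits(n_per_group):
    """Count unique two-group splits of 2*n samples, ignoring group label order."""
    samples = range(2 * n_per_group)
    splits = set()
    for group_a in combinations(samples, n_per_group):
        group_b = tuple(s for s in samples if s not in group_a)
        # A/B and B/A yield mirrored foldchanges, so store them as one split
        splits.add(frozenset((group_a, group_b)))
    return len(splits)
```

With 3 replicates per group this yields 10 distinct splits, one of which is the true sample-to-group assignment, so only 9 alternative assignments are available. With 5 replicates per group it yields 126, which gives a permutation-based background estimate considerably more to work with.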
release 1.0.1
minor update:
- updated EncyclopeDIA import function #1
- updated user-guide for importing Spectronaut datasets #2
- new msdap::plot_volcano() function to create volcano plots with custom labels post-analysis #3
- new variance-explained analysis (experimental feature atm), optionally accessible through msdap::analysis_quickstart()
- a few minor bugfixes (cosmetics / user-convenience only, no impact on analysis results)
release 1.0
first production release
beta 0.2.8.2
minor bugfixes
beta 0.2.8.1
- reworked FragPipe import functions, now supports IonQuant (make sure to assign Experiment IDs in the workflow tab of FragPipe)
- protein rollup by MaxLFQ, implementation from iq package is now default
- reworked utility function to export protein abundance matrices
- various minor updates to data visualizations and documentation