- bootstrap_enrichment_test / check_species:
  - Validate species arguments before background generation.
  - Fail fast when fewer than the minimum number of input hits are supplied, so invalid inputs return errors instead of upstream warnings.
- Update maintainer (Alan -> Hiru)
- Fix deprecated ggplot arguments.
- Remove print statements, clean up CTD creation steps.
- Make ewce_plot functionality clearer: it will now fail if make_dendro=TRUE and ggdendro is not installed or CTD is not provided, rather than issuing a warning.
- check_bootstrap_args:
  - The annotation level used for checking cell type names was previously hardcoded to level 1; it now matches the user-supplied annotation level.
- drop_uninformative_genes:
  - Will now catch cases where the expression matrix is a data frame and convert it to a matrix.
    - This was causing weird errors, see issue #92.
    - Fix made in the check_sce() function.
- bootstrap_enrichment_test:
  - New args: standardise_sct_data=, standardise_hits=: let users have more control over data standardisation steps.
  - check_ewce_genelist_inputs: updated accordingly.
  - New arg: store_gene_data to avoid hitting memory limits.
  - Modify test-bootstrap_enrichment_test_2.R to test the new args.
- ewce_plot():
  - Dendrogram not reordering cell types in plot - see issue
    - Occurs when ctd does not contain the plotting info.
    - Fixed now and unit test added.
  - Note that cell type order on the x-axis is based on hierarchical clustering for both plots if make_dendro = TRUE for ewce_plot().
- generate_bootstrap_plots:
  - Use stored gene_data object whenever possible.
  - Only show filtered celltypes.
- generate_bootstrap_plots:
  - Missing available parameters for check_ewce_genelist_inputs() call.
- drop_uninformative_genes:
  - Hash out DGE options. Somehow these got re-exposed to users in Bioc>=3.16.
- check_species:
  - Added new arg sctSpecies_origin_default.
- standardise_ctd:
  - Can now specify sctSpecies_origin, which will be added to the metadata.
- standardise_ctd:
  - When the "annot" slot is not provided in the original CTD, a new one will be created from the matrix column names instead of assigning NULL.
- check_ewce_genelist_inputs / bootstrap_enrichment_test:
  - New arg: sctSpecies_origin lets users clarify that their data originally came from mouse even when it is currently formatted as human orthologs. This is necessary for creating the appropriate background gene lists.
- Remove grDevices as dep entirely.
- fix_celltype_names:
  - Add new arg make_unique to make this function easily usable for vectors where the same celltype appears multiple times.
- bootstrap_enrichment_test:
  - Return gene-level scores based on adaptation of code from generate_bootstrap_plots. Now stored as a list element named gene_data in data.table format.
- generate_bootstrap_plots:
  - Revamp wrap code into reusable subfunctions.
  - Avoid resampling random genes when the gene data is stored in the bootstrap results as gene_data. It will also tell you which of these options it's using.
  - Save with ggsave instead of grDevices.
  - Facet by celltype instead of generating tons of separate plots.
  - Let users decide cutoff threshold with new arg adj_pval_thresh.
  - Now returns a named list with the plots themselves ("plot") and the paths to where they're saved ("paths") rather than just a higher-level directory path in which users had to search for the right files (and didn't ever have access to the ggplot2 objects).
  - Show significance with barplot fill/color instead of asterisks. Much easier to see now.
  - Change savePath arg to the more accurate save_dir. Expose appending BootstrapPlots to the user within the argument.
- generate_bootstrap_plots_for_transcriptome:
  - Change savePath arg to the more accurate save_dir. Expose appending BootstrapPlots to the user within the argument.
  - Save with ggsave instead of grDevices.
- Standardise hits + hitGenes args all to hits.
- Update hex:
  - Offload large source image from DALLE to Releases instead of including it within the package.
- drop_uninformative_genes / generate_celltype_data:
  - Pass verbose arg to matrix formatting functions.
- generate_controlled_bootstrap_geneset:
  - Removed combinedGenes arg as it was not being used anywhere within.
- check_args_for_bootstrap_plot_generation:
  - Removed unused args: ttSpecies, sctSpecies
- test-bootstrap_enrichment_test_2.R
  - "monkey_ctd" tests seem to be running more smoothly than before (not just getting NAs). This might have to do with orthogene databases improving.
  - Reassuringly, "godzilla" tests still fail as expected :)
- Add tests/testthat/Rplots.pdf to .gitignore.
- Use rworkflows GHA.
  - Add rworkflows::use_badges to README.Rmd.
  - Remove Dockerfile (no longer necessary).
  - Make all 3 platforms (Linux, Mac, Windows) use Bioc dev, as ewceData (>=1.7.1) is now required, due to a fix made only in the development version of rtracklayer.
- Remove cowplot dependency.
- Replace all %>% with |>
- calc_quantiles:
  - This function was only used in filter_variance_quantiles.
  - Compare stats::ecdf vs. dplyr::ntile methods.
  - Remove from EWCE as it's no longer used anywhere.
- bin_columns_into_quantiles:
  - Rename arg matrixIn --> vec to reflect what the function actually does.
- filter_variance_quantiles:
  - Change to use bin_columns_into_quantiles instead of calc_quantiles to be consistent with how quantiles are handled in the rest of EWCE.
  - Updated tests in test-get_celltype_table.r to reflect that the number of genes filtered is unaffected by the normalization procedure (when quantiles are computed with stats::quantile).
- ewce_plot:
  - Celltypes were producing NAs because the names in the results were not always standardized in the same way as the CTD. Now this is done internally.
  - Celltypes were not ordered factors, meaning the dendrogram didn't line up correctly.
  - Switched from gridArrange/cowplot to patchwork.
  - Added dedicated unit tests file: test-ewce_plot.r
- Offline runs enabled with functions using reference datasets (from ewceData). These functions have the parameter localhub added to control this.
- GHA fix.
- The orthogene dependency was replacing the user-entered background gene list with one generated from all known genes when the species across gene lists and reference dataset were the same. This has now been fixed.
- standardise_ctd:
  - Always force "specificity_quantiles" to be one of the matrices in each level.
- filter_ctd_genes:
  - Now exported.
  - Can handle standardized CTD format.
- get_ctd_matrix_names: New function to get a list of all data matrices in CTD.
- check_ewce_genelist_inputs:
  - User reported potential bug in code: #71
  - Fixed by removing conditional and instead always filtering out genes not present in CTD/SCT.
- standardise_ctd:
  - Add check_species()
  - Ensure all matrices become sparse when as_sparse=TRUE.
  - Generalize to matrices of any name.
- fix_celltype_names:
  - Ensure all celltype names are unique after standardization.
- genelistSpecies now passed to prepare_genesize_control_network in bootstrap_enrichment_test, meaning gene list species will be inferred from user input.
- drop_uninformative_genes:
  - Expose new args: dge_method, dge_test, min_variance_decile
- merged_ctd: Actually merge the CTDs into one when as_SCE=FALSE.
- Remove hard-coded file path separators (e.g. sprintf("%s/MRK_List2.rpt", tempdir())) to be more compatible with Windows.
- Made substantial updates to orthogene, so going through and making sure everything still works / is able to take advantage of new features (e.g. separation of non121_strategy and agg_func args, many:many mapping):
  - filter_nonorthologs: Pass up args from orthogene::convert_orthologs.
  - generate_celltype_data: @inheritDotParams
- Update GHA.
- Bump to R (>= 4.2) now that we're developing on Bioc 3.16.
- Avoid downloading large "MRK_List2.rpt" file any more than is necessary for testing.
- method argument from orthogene::create_background and orthogene::convert_orthologs is now passed up as an argument to EWCE functions to give users more control. "homologene" chosen as default for all functions. "homologene" has fewer species than "orthogene" but doesn't need to import data from the web. It also has more 1:1 mouse:human orthologs.
- Include notes on mismatches between GitHub documentation and current Bioc release version.
- Allow bin_specificity_into_quantiles to set specificity matrix name produced.
- Merge GHA workflow yamls into one.
- Add try({}) and error=TRUE to avoid "polygon edge not found" error in vignettes.
- Major changes: Pull Request from bschilder_dev branch.
- All functions can now use lists and CellTypeDatasets (CTD) from any species and convert them to a common species (human by default) via orthogene.
- Automated CTD standardisation via standardise_ctd.
- Can handle (sparse) matrices.
- Can create CTD from very large datasets using DelayedArray object class.
- All functions automatically create appropriate gene backgrounds given species.
- More modular, simplified vignettes.
- Additional gene pre-filtering options (DESeq2, MAST, variance quantiles).
- New/improved plotting functions (e.g. plot_ctd).
- Added example bootstrapping enrichment results as extdata to speed up examples (documented in data.R). Accessed via EWCE::example_bootstrap_results().
- Replaced GHA workflow with check-bioc to automatically: run R-CMD checks, run BiocCheck, and rebuild/deploy pkgdown site.
- Parallelised functions:
  - drop_uninformative_genes
  - generate_celltype_data
  - bootstrap_enrichment_test
- Added tests (multiple function tests per file to reduce the number of times ewceData files have to be downloaded):
  - test-DelayedArray
  - test-merge_sce
  - test-get_celltype_table
  - test-list_species
  - test-run_DGE
  - test-check_percent_hits
- Added function is_32bit() to all tests to ensure they don't get run twice on Windows OS.
- Added GitHub Actions workflows:
  - check-bioc-docker.yml: Runs CRAN/Bioc checks, rebuilds and pushes pkgdown website, runs and uploads test coverage report.
  - dockerhub.yml: Builds Bioconductor Docker container with EWCE installed, runs CRAN checks and (if checks are successful) pushes container to neurogenomicslab DockerHub.
- Removed docs folder, as the documentation website comes from the gh-pages branch now, and is automatically built by GHA workflow after each push to main branch.
- Added new exported function fix_celltype_names to help with standardising celltype names in alignment with standardise_ctd.
- generate_bootstrap_plots_for_transcriptome: Now supports any species (not just mouse or human).
  - Converts CTD and DGE table (tt) into output_species gene symbols.
  - Automatically generates appropriate gene background.
  - Faster due to now having the option to only generate certain plot types.
- Provide precomputed results from ewce_expression_data via new example_transcriptome_results function.
- Reduced build runtime and oversized vignettes by not evaluating certain code chunks.
- Prevent extended vignette from running entirely.
- @return documentation for internal functions.
- Added more installation checks to GHA.
- Fixed inconsistent naming of unit test files: test_ ==> test-
- Removed DGE args in drop_uninformative_genes for now until we run benchmarking to see how each affects the EWCE results.
- Make bootstrap_plots function internal.
- Add report on how orthogene improves within- and across-species gene mappings in extended vignette.
- Record extra info in standardise_ctd output:
  - "species": both input_species and output_species
  - "versions": of EWCE, orthogene, and homologene
- EWCE v1.0 on Bioconductor replaces the defunct EWCE v1.3.0 available on Bioconductor v3.5.
- EWCE has been rendered scalable to the analysis of large datasets
- drop_uninformative_genes() has been expanded to allow the utilisation of differential expression approaches.
- EWCE can now handle SingleCellExperiment (SCE) objects or other Ranged SummarizedExperiment (SE) data types as input, as well as the original format, described as a single cell transcriptome (SCT) object.
Deprecated & Defunct
- The following functions have been renamed to use underscore in compliance with Bioconductor nomenclature:
  - check.ewce.genelist.inputs
  - cell.list.dist
  - bootstrap.enrichment.test
  - bin.specificity.into.quantiles
  - bin.columns.into.quantiles
  - add.res.to.merging.list
  - prepare.genesize.control.network
  - prep.dendro
  - get.celltype.table
  - calculate.specificity.for.level
  - calculate.meanexp.for.level
  - generate.celltype.data
  - generate.bootstrap.plots
  - generate.bootstrap.plots.for.transcriptome
  - fix.bad.mgi.symbols
  - fix.bad.hgnc.symbols
  - filter.genes.without.1to1.homolog
  - ewce.plot
  - cells.in.ctd
  - drop.uninformative.genes