Support sparse matrices in vectorize_counts_and_tree #1
Open
r0hansaxena wants to merge 10 commits into main from
* Update action versions in gh action workflows
* Specify cibuildwheel version
* ENH: add graphembed_rs wrapping for network embedding (scikit-bio#2212)
* Fix lint errors in _graphembed.py
* ENH: Parse string taxonomies into tree
* Add changelog entry for string parsing (scikit-bio#2406)
* ENH: Generalize taxonomy parsing in TreeNode.from_taxonomy
* Restrict rank regex to standard prefixes and add extract_rank option
* Update tests to Greengenes2 nomenclature and test extract_rank
* fix Greengenes
* docs: generalize taxonomy parsing description
* Improve taxonomy rank extraction and null handling in from_taxonomy
* cleaning up graphembed commits
Please complete the following checklist:
I have read the contribution guidelines.
I have documented all public-facing changes in the changelog.
This pull request includes code, documentation, or other content derived from external source(s). If this is the case, ensure the external source's license is compatible with scikit-bio's license. Include the license in the licenses directory and add a comment in the code giving proper attribution. Ensure any other requirements set forth by the license and/or author are satisfied.
This pull request does not include code, documentation, or other content derived from external source(s).
Note: This document may also be helpful to see some of the things code reviewers will be verifying when reviewing your pull request.
Description
Adds support for sparse matrix inputs to vectorize_counts_and_tree, improving memory efficiency when working with large biological datasets. Internally, sparse inputs are routed to a new sparse-aware helper that uses SciPy sparse operations instead of expanding the counts into a dense matrix.
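
For reviewers, here is a minimal sketch of the general idea, not the helper added in this PR. It assumes per-tip counts can be aggregated to every tree node by multiplying with a tip-to-node indicator matrix; the function name `counts_by_node`, the toy tree, and the matrix layout are all hypothetical and chosen only for illustration. When the counts are a SciPy sparse matrix, the product is computed with sparse operations and the input is never densified.

```python
import numpy as np
from scipy import sparse


def counts_by_node(counts, tip_to_node):
    """Hypothetical helper: aggregate per-tip counts to every tree node.

    counts      : (n_samples, n_tips) dense array or SciPy sparse matrix
    tip_to_node : (n_tips, n_nodes) 0/1 array; entry [i, j] is 1 when
                  tip i is node j itself or a descendant of node j
    """
    if sparse.issparse(counts):
        # Sparse path: sparse @ sparse yields a sparse result, so the
        # counts are never expanded to a dense array.
        return sparse.csr_matrix(counts) @ sparse.csr_matrix(tip_to_node)
    # Dense path: plain NumPy matrix product.
    return np.asarray(counts) @ np.asarray(tip_to_node)


# Toy tree ((a,b)c,d)root. Rows are tips a, b, d;
# columns are nodes a, b, d, c, root.
tip_to_node = np.array([
    [1, 0, 0, 1, 1],  # a contributes to a, c, root
    [0, 1, 0, 1, 1],  # b contributes to b, c, root
    [0, 0, 1, 0, 1],  # d contributes to d, root
])

# Two samples, mostly zeros, as is typical for feature tables.
dense_counts = np.array([[3, 0, 0],
                         [0, 0, 5]])
sparse_counts = sparse.csr_matrix(dense_counts)

dense_result = counts_by_node(dense_counts, tip_to_node)
sparse_result = counts_by_node(sparse_counts, tip_to_node)
```

Both paths give the same per-node counts; only the storage of the intermediate and final matrices differs, which is where the memory savings for large, mostly-zero tables would come from.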