feat: algorithm attributions #345
Open: tristan-f-r wants to merge 5 commits into Reed-CompBio:main from tristan-f-r:citations
Commits (5)
- 0035242 feat: algorithm attributions (tristan-f-r)
- 40ef631 fix: add attribution include false to analysis test (tristan-f-r)
- 16f2410 test(config): attribution analysis (tristan-f-r)
- b1d1e21 Merge branch 'main' into citations (tristan-f-r)
- 99790f7 fix: add attribution schema (tristan-f-r)
@@ -0,0 +1,39 @@

```python
import urllib.parse
from pathlib import Path

import requests

from spras.runner import algorithms

DOI_BASE = "https://citation.doi.org/format?style=bibtex&lang=en-US&doi="

def format_request(doi: str) -> str:
    return DOI_BASE + urllib.parse.quote(doi)

def get_bibtex(doi: str) -> str:
    response = requests.get(format_request(doi))

    return response.text.strip()

def attribute_algorithms(all_file: str, alg_files: list[str]):
    """
    Attributes all algorithms specified by alg_files, aggregating them in
    all_file.
    """
    algorithm_name_files = [(Path(file).stem, file) for file in alg_files]

    algorithm_citations = [
        (file, [get_bibtex(doi) for doi in algorithms[name].dois]) for (name, file) in algorithm_name_files
    ]

    for alg_output, alg_citations in algorithm_citations:
        Path(alg_output).parent.mkdir(parents=True, exist_ok=True)
        with open(alg_output, '+w') as handle:
            for citation in alg_citations:
                handle.write(citation + '\n')

    Path(all_file).parent.mkdir(parents=True, exist_ok=True)
    with open(all_file, '+w') as handle:
        for _, alg_citations in algorithm_citations:
            for citation in alg_citations:
                handle.write(citation + '\n')
```
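The `get_bibtex` helper in this diff calls `requests.get` without a timeout or a status check, and it requests the same DOI again whenever two algorithms share a paper. The following is a minimal sketch of a more defensive variant, not the PR's implementation. The names `fetch_all` and `_default_get`, the injectable `http_get` parameter, and the 30-second timeout are all assumptions introduced here so the logic can be exercised without network access.

```python
import urllib.parse
from typing import Callable, Iterable

DOI_BASE = "https://citation.doi.org/format?style=bibtex&lang=en-US&doi="


def format_request(doi: str) -> str:
    # Same URL construction as in the diff; quote() leaves "/" unescaped by default.
    return DOI_BASE + urllib.parse.quote(doi)


def _default_get(url: str) -> str:
    # Hypothetical default fetcher. Imported lazily so the rest of the
    # sketch can be used (and tested) without requests installed.
    import requests

    response = requests.get(url, timeout=30)  # timeout value is an assumption
    response.raise_for_status()  # raise on HTTP errors instead of saving an error page
    return response.text


def fetch_all(
    dois: Iterable[str],
    http_get: Callable[[str], str] = _default_get,
) -> dict[str, str]:
    """Fetch each distinct DOI once and return a DOI -> BibTeX mapping."""
    cache: dict[str, str] = {}
    for doi in dois:
        if doi not in cache:
            cache[doi] = http_get(format_request(doi)).strip()
    return cache
```

Passing a stub as `http_get` keeps tests offline; whether the project would prefer caching at this level or in the workflow layer is a design choice this sketch does not settle.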
```diff
@@ -128,3 +128,5 @@ analysis:
     metric: 'euclidean'
   evaluation:
     include: false
+  attribution:
+    include: false
```
```diff
@@ -100,3 +100,5 @@ analysis:
     include: false
   evaluation:
     include: false
+  attribution:
+    include: false
```
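Both config hunks add an `attribution` block with `include: false` under `analysis`. How SPRAS reads this flag is not visible in this diff, so the following is only a sketch of one way such a nested flag could be looked up, treating a missing block as disabled. The function name `attribution_enabled` and the use of a plain dictionary for the parsed config are assumptions for illustration.

```python
from typing import Any, Mapping


def attribution_enabled(config: Mapping[str, Any]) -> bool:
    """Return analysis.attribution.include, treating a missing block as False."""
    # "or {}" also covers keys that are present but set to null in the YAML.
    analysis = config.get("analysis") or {}
    attribution = analysis.get("attribution") or {}
    return bool(attribution.get("include", False))


# Hypothetical parsed config mirroring the keys added in the hunks.
config = {
    "analysis": {
        "evaluation": {"include": False},
        "attribution": {"include": False},
    }
}
print(attribution_enabled(config))
```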
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| @article{Supper_Spangenberg_Planatscher_Dräger_Schröder_Zell_2009, title={BowTieBuilder: modeling signal transduction pathways}, volume={3}, url={http://dx.doi.org/10.1186/1752-0509-3-67}, DOI={10.1186/1752-0509-3-67}, number={1}, journal={BMC Systems Biology}, publisher={Springer Science and Business Media LLC}, author={Supper, Jochen and Spangenberg, Lucía and Planatscher, Hannes and Dräger, Andreas and Schröder, Adrian and Zell, Andreas}, year={2009}, month=june, language={en} } | ||
| @article{Levi_Elkon_Shamir_2021, title={DOMINO: a network‐based active module identification algorithm with reduced rate of false calls}, volume={17}, url={http://dx.doi.org/10.15252/msb.20209593}, DOI={10.15252/msb.20209593}, number={1}, journal={Molecular Systems Biology}, publisher={Springer Science and Business Media LLC}, author={Levi, Hagai and Elkon, Ran and Shamir, Ron}, year={2021}, month=jan, language={en} } | ||
| @article{Gitter_Klein-Seetharaman_Gupta_Bar-Joseph_2010, title={Discovering pathways by orienting edges in protein interaction networks}, volume={39}, url={http://dx.doi.org/10.1093/nar/gkq1207}, DOI={10.1093/nar/gkq1207}, number={4}, journal={Nucleic Acids Research}, publisher={Oxford University Press (OUP)}, author={Gitter, Anthony and Klein-Seetharaman, Judith and Gupta, Anupam and Bar-Joseph, Ziv}, year={2010}, month=nov, pages={e22–e22}, language={en} } | ||
| @article{Yeger-Lotem_Riva_Su_Gitler_Cashikar_King_Auluck_Geddie_Valastyan_Karger_et al._2009, title={Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity}, volume={41}, url={http://dx.doi.org/10.1038/ng.337}, DOI={10.1038/ng.337}, number={3}, journal={Nature Genetics}, publisher={Springer Science and Business Media LLC}, author={Yeger-Lotem, Esti and Riva, Laura and Su, Linhui Julie and Gitler, Aaron D and Cashikar, Anil G and King, Oliver D and Auluck, Pavan K and Geddie, Melissa L and Valastyan, Julie S and Karger, David R and Lindquist, Susan and Fraenkel, Ernest}, year={2009}, month=feb, pages={316–323}, language={en} } | ||
| @article{Tuncbag_Gosline_Kedaigle_Soltis_Gitter_Fraenkel_2016, title={Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package}, volume={12}, url={http://dx.doi.org/10.1371/journal.pcbi.1004879}, DOI={10.1371/journal.pcbi.1004879}, number={4}, journal={PLOS Computational Biology}, publisher={Public Library of Science (PLoS)}, author={Tuncbag, Nurcan and Gosline, Sara J. C. and Kedaigle, Amanda and Soltis, Anthony R. and Gitter, Anthony and Fraenkel, Ernest}, editor={Prlic, Andreas}, year={2016}, month=apr, pages={e1004879}, language={en} } | ||
| @article{Tuncbag_Gosline_Kedaigle_Soltis_Gitter_Fraenkel_2016, title={Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package}, volume={12}, url={http://dx.doi.org/10.1371/journal.pcbi.1004879}, DOI={10.1371/journal.pcbi.1004879}, number={4}, journal={PLOS Computational Biology}, publisher={Public Library of Science (PLoS)}, author={Tuncbag, Nurcan and Gosline, Sara J. C. and Kedaigle, Amanda and Soltis, Anthony R. and Gitter, Anthony and Fraenkel, Ernest}, editor={Prlic, Andreas}, year={2016}, month=apr, pages={e1004879}, language={en} } | ||
| @article{Ritz_Poirel_Tegge_Sharp_Simmons_Powell_Kale_Murali_2016, title={Pathways on demand: automated reconstruction of human signaling networks}, volume={2}, url={http://dx.doi.org/10.1038/npjsba.2016.2}, DOI={10.1038/npjsba.2016.2}, abstractNote={<jats:title>Abstract</jats:title><jats:p>Signaling pathways are a cornerstone of systems biology. Several databases store high-quality representations of these pathways that are amenable for automated analyses. Despite painstaking and manual curation, these databases remain incomplete. We present P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc>, a new computational method to reconstruct the interactions in a signaling pathway of interest. P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> efficiently computes multiple short paths from the receptors to transcriptional regulators (TRs) in a pathway within a background protein interaction network. We use P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> to accurately reconstruct a comprehensive set of signaling pathways from the NetPath and KEGG databases. We show that P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> has higher precision and recall than several state-of-the-art algorithms, while also ensuring that the resulting network connects receptor proteins to TRs. P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc>’s reconstruction of the Wnt pathway identified CFTR, an ABC class chloride ion channel transporter, as a novel intermediary that facilitates the signaling of Ryk to Dab2, which are known components of Wnt/β-catenin signaling. In HEK293 cells, we show that the Ryk–CFTR–Dab2 path is a novel amplifier of β-catenin signaling specifically in response to Wnt 1, 2, 3, and 3a of the 11 Wnts tested. P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> captures the structure of signaling pathways as represented in pathway databases better than existing methods. 
P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc>’s success in reconstructing pathways from NetPath and KEGG databases point to its applicability for complementing manual curation of these databases. P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> may serve as a promising approach for prioritizing proteins and interactions for experimental study, as illustrated by its discovery of a novel pathway in Wnt/β-catenin signaling. Our supplementary website at <jats:ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="http://bioinformatics.cs.vt.edu/~murali/supplements/2016-sys-bio-applications-pathlinker/">http://bioinformatics.cs.vt.edu/~murali/supplements/2016-sys-bio-applications-pathlinker/</jats:ext-link> provides links to the P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> software, input datasets, P<jats:sc>ATH</jats:sc>L<jats:sc>INKER</jats:sc> reconstructions of NetPath pathways, and links to interactive visualizations of these reconstructions on GraphSpace.</jats:p>}, number={1}, journal={npj Systems Biology and Applications}, publisher={Springer Science and Business Media LLC}, author={Ritz, Anna and Poirel, Christopher L and Tegge, Allison N and Sharp, Nicholas and Simmons, Kelsey and Powell, Allison and Kale, Shiv D and Murali, TM}, year={2016}, month=mar, language={en} } | ||
| @article{Poirel_Rodrigues_Chen_Tyson_Murali_2013, title={Top-Down Network Analysis to Drive Bottom-Up Modeling of Physiological Processes}, volume={20}, url={http://dx.doi.org/10.1089/cmb.2012.0274}, DOI={10.1089/cmb.2012.0274}, number={5}, journal={Journal of Computational Biology}, publisher={Mary Ann Liebert Inc}, author={Poirel, Christopher L. and Rodrigues, Richard R. and Chen, Katherine C. and Tyson, John J. and Murali, T.M.}, year={2013}, month=may, pages={409–418}, language={en} } | ||
| @article{Yeger-Lotem_Riva_Su_Gitler_Cashikar_King_Auluck_Geddie_Valastyan_Karger_et al._2009, title={Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity}, volume={41}, url={http://dx.doi.org/10.1038/ng.337}, DOI={10.1038/ng.337}, number={3}, journal={Nature Genetics}, publisher={Springer Science and Business Media LLC}, author={Yeger-Lotem, Esti and Riva, Laura and Su, Linhui Julie and Gitler, Aaron D and Cashikar, Anil G and King, Oliver D and Auluck, Pavan K and Geddie, Melissa L and Valastyan, Julie S and Karger, David R and Lindquist, Susan and Fraenkel, Ernest}, year={2009}, month=feb, pages={316–323}, language={en} } |
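The aggregated file above lists the Tuncbag et al. (2016) and Yeger-Lotem et al. (2009) entries twice, presumably because each algorithm that cites a paper contributes its own copy. If repeated entries are unwanted in the combined output, one option is order-preserving deduplication before writing. This is a sketch only; `dedupe_citations` is a hypothetical helper that is not part of the PR.

```python
from typing import Iterable


def dedupe_citations(citations: Iterable[str]) -> list[str]:
    """Drop repeated BibTeX entries while keeping first-seen order."""
    # dicts preserve insertion order, so fromkeys keeps the first occurrence of each entry.
    return list(dict.fromkeys(citations))


# Placeholder strings stand in for full BibTeX entries.
entries = ["@article{a}", "@article{b}", "@article{a}"]
print(dedupe_citations(entries))
```

This compares entries as exact strings, so it only removes duplicates that are byte-for-byte identical, as the repeated entries in the aggregated file appear to be.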
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Supper_Spangenberg_Planatscher_Dräger_Schröder_Zell_2009, title={BowTieBuilder: modeling signal transduction pathways}, volume={3}, url={http://dx.doi.org/10.1186/1752-0509-3-67}, DOI={10.1186/1752-0509-3-67}, number={1}, journal={BMC Systems Biology}, publisher={Springer Science and Business Media LLC}, author={Supper, Jochen and Spangenberg, Lucía and Planatscher, Hannes and Dräger, Andreas and Schröder, Adrian and Zell, Andreas}, year={2009}, month=june, language={en} } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Levi_Elkon_Shamir_2021, title={DOMINO: a network‐based active module identification algorithm with reduced rate of false calls}, volume={17}, url={http://dx.doi.org/10.15252/msb.20209593}, DOI={10.15252/msb.20209593}, number={1}, journal={Molecular Systems Biology}, publisher={Springer Science and Business Media LLC}, author={Levi, Hagai and Elkon, Ran and Shamir, Ron}, year={2021}, month=jan, language={en} } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Gitter_Klein-Seetharaman_Gupta_Bar-Joseph_2010, title={Discovering pathways by orienting edges in protein interaction networks}, volume={39}, url={http://dx.doi.org/10.1093/nar/gkq1207}, DOI={10.1093/nar/gkq1207}, number={4}, journal={Nucleic Acids Research}, publisher={Oxford University Press (OUP)}, author={Gitter, Anthony and Klein-Seetharaman, Judith and Gupta, Anupam and Bar-Joseph, Ziv}, year={2010}, month=nov, pages={e22–e22}, language={en} } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Yeger-Lotem_Riva_Su_Gitler_Cashikar_King_Auluck_Geddie_Valastyan_Karger_et al._2009, title={Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity}, volume={41}, url={http://dx.doi.org/10.1038/ng.337}, DOI={10.1038/ng.337}, number={3}, journal={Nature Genetics}, publisher={Springer Science and Business Media LLC}, author={Yeger-Lotem, Esti and Riva, Laura and Su, Linhui Julie and Gitler, Aaron D and Cashikar, Anil G and King, Oliver D and Auluck, Pavan K and Geddie, Melissa L and Valastyan, Julie S and Karger, David R and Lindquist, Susan and Fraenkel, Ernest}, year={2009}, month=feb, pages={316–323}, language={en} } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Tuncbag_Gosline_Kedaigle_Soltis_Gitter_Fraenkel_2016, title={Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package}, volume={12}, url={http://dx.doi.org/10.1371/journal.pcbi.1004879}, DOI={10.1371/journal.pcbi.1004879}, number={4}, journal={PLOS Computational Biology}, publisher={Public Library of Science (PLoS)}, author={Tuncbag, Nurcan and Gosline, Sara J. C. and Kedaigle, Amanda and Soltis, Anthony R. and Gitter, Anthony and Fraenkel, Ernest}, editor={Prlic, Andreas}, year={2016}, month=apr, pages={e1004879}, language={en} } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| @article{Tuncbag_Gosline_Kedaigle_Soltis_Gitter_Fraenkel_2016, title={Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package}, volume={12}, url={http://dx.doi.org/10.1371/journal.pcbi.1004879}, DOI={10.1371/journal.pcbi.1004879}, number={4}, journal={PLOS Computational Biology}, publisher={Public Library of Science (PLoS)}, author={Tuncbag, Nurcan and Gosline, Sara J. C. and Kedaigle, Amanda and Soltis, Anthony R. and Gitter, Anthony and Fraenkel, Ernest}, editor={Prlic, Andreas}, year={2016}, month=apr, pages={e1004879}, language={en} } |