Merged
129 commits
7cb68cd
Add R script to process data
oliverfanderson Oct 31, 2024
7165629
Update R script
oliverfanderson Nov 8, 2024
c417102
Preprocessed 14 pathways into S/T sets
oliverfanderson Nov 15, 2024
d512be8
Create node-prizes file from sources/targets
oliverfanderson Nov 20, 2024
0293ae4
Process datasets into Prizes with 100 as score
oliverfanderson Dec 4, 2024
01a0f07
added network thresholding feature and output images
ctrlaltaf Feb 24, 2025
626ae30
output images
ctrlaltaf Feb 27, 2025
e236451
optimized thresholding
ctrlaltaf Mar 31, 2025
6dc7e0a
added command line inputs
ctrlaltaf Mar 31, 2025
45b9c4e
added edge output file generation
ctrlaltaf Mar 31, 2025
614ac74
output files
ctrlaltaf Mar 31, 2025
0fc770c
added updated outputs
ctrlaltaf Mar 31, 2025
316b53b
new method
ctrlaltaf Apr 1, 2025
e983c86
added output to new method
ctrlaltaf Apr 1, 2025
f7a1961
namespace mapped outputs
ctrlaltaf Apr 2, 2025
a482e0f
refactored api calls
ctrlaltaf Apr 2, 2025
ce1b43e
updated output files
ctrlaltaf Apr 3, 2025
8be45da
update with overlap analytics
ntalluri Apr 5, 2025
12700b4
restructerd the files, updated the src files, adding a steps to proce…
ntalluri Apr 5, 2025
a1f78ae
Ignoring big files
ntalluri Apr 5, 2025
0f5bd1b
Remove attributes file
ntalluri Apr 5, 2025
bf13b4e
updating HumanInteractome script and added directions
ntalluri Apr 5, 2025
7cab836
add code to make pathways uniprot ids and spras compatible, update re…
ntalluri Apr 5, 2025
d287d3e
spras compatible pathway data
ntalluri Apr 5, 2025
3608e75
updated human interactome script to make the interactomes file
ntalluri Apr 5, 2025
79db1ff
updated spras compatible code location, removed files, and removed un…
ntalluri Apr 7, 2025
70c2f15
updated readme, gitignore, and src files
ntalluri Apr 7, 2025
848d948
updated the string-uniprot ids for the interactomes
ntalluri Apr 7, 2025
e57489a
renamed and moved files
ntalluri Apr 7, 2025
1940a1e
updated code to include unreviewed ids
ntalluri Apr 8, 2025
f15378e
updated code to deal with directionality
ntalluri Apr 8, 2025
086b668
remove unused code
ntalluri Apr 8, 2025
447561d
updated the pathway-data and src files to accomidate this
ntalluri Apr 8, 2025
63571f8
updated README with instructions on how to genereate synthethic netwo…
ntalluri Apr 8, 2025
4d2e889
switched the files for sources and targets
ntalluri Apr 9, 2025
93fe873
picked new pilot data using the ratios
ntalluri Apr 12, 2025
0cbc0df
added the sources and targets file origins
ntalluri Apr 12, 2025
ea3bef7
updated to get rid of extra step to get overlap analytics
ntalluri Apr 12, 2025
ce81316
update variable names
ntalluri Apr 14, 2025
db2e7a0
update the varibles names in the rscript
ntalluri Apr 14, 2025
7b12cf9
updated code to deal with duplicate edges
ntalluri Apr 14, 2025
2e9f47d
updated the prize values and the rank values
Apr 16, 2025
d882814
Merge branch 'Reed-CompBio:main' into synthetic_networks
ntalluri Apr 16, 2025
9477d85
ci: test 1
tristan-f-r Jun 17, 2025
f4e4777
ci: install correct spras module
tristan-f-r Jun 17, 2025
09e9151
ci: use correct snakefile
tristan-f-r Jun 17, 2025
22455d8
fix: correct config with missing eval key
tristan-f-r Jun 17, 2025
99fb7c7
ci: single core, drop output
tristan-f-r Jun 17, 2025
7c0299a
chore: devcontainer, print correct output
tristan-f-r Jun 17, 2025
52713cb
feat: raw data view
tristan-f-r Jun 17, 2025
b5b68a0
ci: bmp upload pages artifact
tristan-f-r Jun 17, 2025
a2a68f8
fix: recursively mkdir data output
tristan-f-r Jun 17, 2025
17f55ac
ci: bmp pages
tristan-f-r Jun 17, 2025
50ed3a7
ci: test on yeast
tristan-f-r Jun 17, 2025
ad3a098
ci: test pathlinker
tristan-f-r Jun 17, 2025
f1570a5
ci: split into DMMMs and PRAs
tristan-f-r Jun 17, 2025
a285cbb
ci: reorganize into datasets
tristan-f-r Jun 17, 2025
4f49ee0
docs: clearer wording
tristan-f-r Jun 17, 2025
c6136fc
ci: use base yaml config
tristan-f-r Jun 17, 2025
9ce298b
ci: enable pull request checks
tristan-f-r Jun 17, 2025
5ace8f2
fix: proper yaml
tristan-f-r Jun 17, 2025
1ca5c99
style: pre-commit, yamlfmt
tristan-f-r Jun 17, 2025
3b5fbf7
style: fmt
tristan-f-r Jun 17, 2025
882e912
ci: pre-commit checks
tristan-f-r Jun 17, 2025
b82ca89
ci: re-merge config
tristan-f-r Jun 17, 2025
c5d508e
ci: post-process responsenet, set up dual-way ci
tristan-f-r Jun 17, 2025
c35b023
style: fmt
tristan-f-r Jun 17, 2025
9ee295d
style: fmt?
tristan-f-r Jun 17, 2025
e9248ff
fix: use correct dataset name for ms2018, cache ci
tristan-f-r Jun 17, 2025
db70091
ci: drop strict channel priority
tristan-f-r Jun 17, 2025
c41ef42
ci: use correct hash
tristan-f-r Jun 17, 2025
b0dc1aa
ci: downgrade miniconda action
tristan-f-r Jun 17, 2025
f623383
ci: drop caching
tristan-f-r Jun 17, 2025
d61254e
ci: drop tar support!
tristan-f-r Jun 17, 2025
d8ccc15
ci: restrict pathlinker
tristan-f-r Jun 17, 2025
f556948
feat: use astro
tristan-f-r Jun 17, 2025
f69830b
ci: fix config name
tristan-f-r Jun 17, 2025
84362dc
ci: install deps beforehand
tristan-f-r Jun 17, 2025
669602b
feat: dynamically recognize runs
tristan-f-r Jun 17, 2025
e7b66f2
ci: specify build path
tristan-f-r Jun 17, 2025
dca8f85
feat: build time, associated output files
tristan-f-r Jun 17, 2025
62565ca
ci: bmp test
tristan-f-r Jun 18, 2025
320626e
fix: build properly
tristan-f-r Jun 18, 2025
b37b095
fix: build on correct path
tristan-f-r Jun 18, 2025
834aa47
fix: correct trailing slashes
tristan-f-r Jun 18, 2025
2d47ced
feat: begin dataset page
tristan-f-r Jun 18, 2025
452d521
feat: try out new categorization method
tristan-f-r Jun 18, 2025
025f17a
fix: actually grab data type
tristan-f-r Jun 18, 2025
0cb9b61
docs: correct dmm wording
tristan-f-r Jun 18, 2025
d18d17c
fix: correct file paths
tristan-f-r Jun 18, 2025
8915a69
added data processing scripts
AMINOexe Jun 18, 2025
3848f9c
style: fmt
tristan-f-r Jun 18, 2025
9f1aa10
feat: begin automating yeast
tristan-f-r Jun 18, 2025
15c7c9d
ci: run process through snakemake
tristan-f-r Jun 18, 2025
7402fb6
style: fmt
tristan-f-r Jun 18, 2025
c2c8750
ci: pin uv
tristan-f-r Jun 18, 2025
b6babec
docs(contributing): mention run_snakemake.sh
tristan-f-r Jun 18, 2025
4689109
ci: pin uv setup
tristan-f-r Jun 18, 2025
f511d5f
docs(run_snakemake.sh): clarifying comment
tristan-f-r Jun 18, 2025
68ee011
fix: add snakemake through uv
tristan-f-r Jun 18, 2025
4ae1421
fix(run_snakemake): add --cores 1
tristan-f-r Jun 18, 2025
c6aa008
chore: mv astro config to ts
tristan-f-r Jun 18, 2025
5dfa460
fix: move uv over to snakefile
tristan-f-r Jun 18, 2025
efe77d5
Merge branch 'pr/11'
tristan-f-r Jun 18, 2025
ac98c18
chore: mv synthetic data to datasets
tristan-f-r Jun 18, 2025
b5d2472
chore: mv more files around
tristan-f-r Jun 18, 2025
246639c
begin rewriting process pathways script
tristan-f-r Jun 18, 2025
15e4a4a
feat: convert full script to py
tristan-f-r Jun 19, 2025
72007b6
refactor: move everything over to intermediate
tristan-f-r Jun 19, 2025
17ee256
feat(synthetic-data): orchestrate non-interactome generation through …
tristan-f-r Jun 19, 2025
aa4dbe9
style: fmt
tristan-f-r Jun 19, 2025
4244cf9
feat: handle interactome downloading
tristan-f-r Jun 19, 2025
9882ec4
begin refactoring combine.py
tristan-f-r Jun 19, 2025
7c3c46a
fix: clear up last path issues with combine.py
tristan-f-r Jun 20, 2025
3c47f51
feat: full snakemake workflow for synthetic data
tristan-f-r Jun 20, 2025
f1bde4d
chore: run synthetic data snakemake
tristan-f-r Jun 20, 2025
b7376ee
ci: cores 4
tristan-f-r Jun 20, 2025
12e0c08
chore: bmp down to cores 1
tristan-f-r Jun 23, 2025
dcfef76
chore: drop unused docs
tristan-f-r Jun 23, 2025
b6bf8aa
chore: enable ml
tristan-f-r Jun 25, 2025
a4237cc
fix: filter analysis folders
tristan-f-r Jun 25, 2025
a648bba
feat: show pca graphs
tristan-f-r Jun 25, 2025
441fe3d
feat: allow image zoom
tristan-f-r Jun 25, 2025
2ef2f16
fix: rn type-datasets to use correct formatting
tristan-f-r Jun 25, 2025
b2aa2be
automated hiv dataset
AMINOexe Jun 30, 2025
efc877a
chore: fix config
tristan-f-r Jun 30, 2025
5ee88ad
added kegg orthology script
AMINOexe Jun 30, 2025
92deea3
style: lint
tristan-f-r Jul 1, 2025
0a88fdf
chore: rm unreviewed datasets
tristan-f-r Jul 1, 2025
1 change: 1 addition & 0 deletions .devcontainer/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
FROM mcr.microsoft.com/devcontainers/anaconda:1-3
21 changes: 21 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,21 @@
// Small devcontainer which loads anaconda. All postinstallation steps have to be done manually.
// This comes with snakemake and docker-in-docker.

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/anaconda
{
  "name": "Anaconda (Python 3)",
  "build": {
    "context": "..",
    "dockerfile": "Dockerfile"
  },
  "features": {
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    // For yamlfmt
    "ghcr.io/devcontainers/features/go:1": {},
    // For web display
    "ghcr.io/devcontainers/features/node:1": {},
    // For scripting
    "ghcr.io/va-h/devcontainers-features/uv:1": {}
  }
}
105 changes: 105 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,105 @@
name: Test SPRAS

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow one concurrent deployment
concurrency:
  group: 'pages'
  cancel-in-progress: true

jobs:
  pre-commit:
    name: Run pre-commit checks
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Run pre-commit checks
        uses: pre-commit/[email protected]
  checks:
    name: Run workflow
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          submodules: true
      - name: Install uv for scripting
        uses: astral-sh/[email protected]
        with:
          version: "0.7.13"
      - name: Setup conda
        uses: conda-incubator/setup-miniconda@v2
        with:
          activate-environment: spras
          environment-file: spras/environment.yml
          auto-activate-base: false
          miniconda-version: 'latest'
      # Install spras in the environment using pip
      - name: Install spras in conda env
        shell: bash --login {0}
        run: pip install ./spras
      # Log conda environment contents
      - name: Log conda environment
        shell: bash --login {0}
        run: conda list
      - name: Process raw data through Snakemake
        run: sh run_snakemake.sh
      - name: Run Snakemake workflow for DMMMs
        shell: bash --login {0}
        run: snakemake --cores 1 --configfile configs/dmmm.yaml --show-failed-logs -s spras/Snakefile
      # TODO: re-enable PRAs once RN/synthetic data PRs are merged.
      # - name: Run Snakemake workflow for PRAs
      #   shell: bash --login {0}
      #   run: snakemake --cores 1 --configfile configs/pra.yaml --show-failed-logs -s spras/Snakefile
      - name: Setup PNPM
        uses: pnpm/action-setup@v4
        with:
          version: 10
      - name: Install web dependencies
        working-directory: ./web
        run: pnpm install
      - name: Run web builder
        working-directory: ./web
        run: pnpm build
      - name: Upload built website distribution folder
        uses: actions/upload-artifact@v4
        with:
          name: build
          path: web/dist
  pages:
    needs: checks
    if: github.event_name != 'pull_request'
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Download Artifacts
        uses: actions/download-artifact@v4
        with:
          name: build
          path: dist
      - name: Setup Pages
        uses: actions/configure-pages@v2
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: dist
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
18 changes: 11 additions & 7 deletions .gitignore
@@ -14,8 +14,6 @@ dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
@@ -155,8 +153,14 @@ dmypy.json
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

# Snakemake
.snakemake

# Output
/output
/web/output

# pnpm
.pnpm-store
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "spras"]
	path = spras
	url = https://github.com/Reed-CompBio/spras
30 changes: 30 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,30 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
# See https://pre-commit.com/ for documentation
default_language_version:
  # Match this to the version specified in environment.yml
  python: python3.11
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0 # Use the ref you want to point at
    hooks:
      # Attempts to load all yaml files to verify syntax.
      - id: check-yaml
      # Attempts to load all TOML files to verify syntax.
      - id: check-toml
      # Trims trailing whitespace.
      - id: trailing-whitespace
        # Preserves Markdown hard linebreaks.
        args: [--markdown-linebreak-ext=md]
        # Do not trim whitespace from all files, input files may need trailing whitespace for empty values in columns.
        types_or: [markdown, python, yaml]
        # Skip this Markdown file, which has an example of an input text file within it.
        exclude: input/README.md
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: 'v0.0.269'
    hooks:
      - id: ruff
  - repo: https://github.com/google/yamlfmt
    rev: v0.17.0
    hooks:
      - id: yamlfmt
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.13
4 changes: 4 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,4 @@
{
  "recommendations": ["astro-build.astro-vscode"],
  "unwantedRecommendations": []
}
5 changes: 5 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,5 @@
{
  "editor.rulers": [
    150
  ]
}
2 changes: 2 additions & 0 deletions .yamlfmt.yaml
@@ -0,0 +1,2 @@
formatter:
  retain_line_breaks_single: true
23 changes: 23 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,23 @@
# Contributing

## Helping Out

There are `TODO`s throughout the repository whose resolution would improve the reproducibility of datasets or the analysis of algorithm outputs, as well as
[open resolvable issues](https://github.com/Reed-CompBio/spras-benchmarking/).

## Adding a dataset

To add a dataset (see `datasets/yeast-osmotic-stress` for an example):
1. Check that your dataset provider hasn't already been added (some of these datasets act as providers for multiple datasets).
1. Create a new folder at `datasets/<your-dataset>`.
1. Add a `raw` folder containing your data.
1. Add a Snakefile alongside it that converts your `raw` data to `processed` data.
1. Add your Snakefile to the top-level `run_snakemake.sh` file.
1. If your dataset is a paper reproduction, add `reproduction/raw` and `reproduction/processed` folders.
1. Add your datasets to the appropriate `configs`.
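
The folder layout from the steps above can be sketched in Python. This is only an illustration: `scaffold_dataset` is a hypothetical helper that does not exist in this repository, the placeholder Snakefile contents are an assumption, and the remaining steps (filling in `raw`, writing the Snakefile rules, editing `run_snakemake.sh` and `configs`) still have to be done by hand.

```python
from pathlib import Path


def scaffold_dataset(repo_root: Path, name: str, reproduction: bool = False) -> list[Path]:
    """Create the skeleton for datasets/<name> (hypothetical helper, not part of the repo)."""
    dataset_dir = repo_root / "datasets" / name

    # `raw` holds the unprocessed data; `processed` is expected to be
    # generated by the dataset's Snakefile, so it is not created here.
    directories = [dataset_dir / "raw"]
    if reproduction:
        # Paper reproductions carry their own raw/processed pair.
        directories += [
            dataset_dir / "reproduction" / "raw",
            dataset_dir / "reproduction" / "processed",
        ]
    for directory in directories:
        directory.mkdir(parents=True, exist_ok=True)

    # Placeholder Snakefile; never overwrite one that already exists.
    snakefile = dataset_dir / "Snakefile"
    if not snakefile.exists():
        snakefile.write_text("# TODO: rules that convert raw/ into processed/\n")

    return directories + [snakefile]
```

Calling `scaffold_dataset(Path("."), "my-dataset", reproduction=True)` from the repository root would create the folders for a paper reproduction without touching files that are already there.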

## Adding an algorithm

If you want to add an algorithm, refer to the [SPRAS repository](https://github.com/Reed-CompBio/SPRAS) instead.
To test a new algorithm that you have submitted to SPRAS as a pull request, you can swap out the `spras` submodule that this repository uses
for your fork of SPRAS.
32 changes: 30 additions & 2 deletions README.md
@@ -1,2 +1,30 @@
# spras-benchmarking
Benchmarking datasets for the [SPRAS](https://github.com/Reed-CompBio/spras) project
# SPRAS benchmarking

![example workflow](https://github.com/Reed-CompBio/spras-benchmarking/actions/workflows/publish.yml/badge.svg)

Benchmarking datasets for the [SPRAS](https://github.com/Reed-CompBio/spras) project. This repository contains gold-standard datasets for evaluation, as well as paper reproductions and improvements that incorporate new methodologies.

## Setup

This repository depends on SPRAS. To reproduce the benchmarking results locally,
you will need to set up SPRAS, which depends on [Docker](https://www.docker.com/) and [Conda](https://docs.conda.io/projects/conda/en/stable/). If either tool is difficult to install,
a [devcontainer](https://containers.dev/) is available for easy setup.

```sh
conda env create -f spras/environment.yml
conda activate spras
pip install ./spras
```

To run the scripts that postprocess output, use the provided `pyproject.toml` with the Python package manager of your choice. This keeps
the `spras` conda environment separate from these small scripts. (On CI, we use [`uv`](https://docs.astral.sh/uv/).)

To run the benchmarking pipeline, use:

```sh
snakemake --cores 1 --configfile configs/dmmm.yaml --show-failed-logs -s spras/Snakefile
```

> [!NOTE]
> Each dataset category (at the time of writing, DMMM and PRA) has its own configuration file.
> Run the pipeline once for each configuration you need.
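
As a rough sketch of running one configuration at a time, the Python below builds the same `snakemake` invocation shown above for every YAML file in a config directory. The helper names `snakemake_command` and `commands_for` are hypothetical, and the assumption that every `*.yaml` file under `configs/` is a runnable SPRAS configuration is ours; the flags are copied from the command above.

```python
from pathlib import Path


def snakemake_command(config: Path, cores: int = 1) -> list[str]:
    """Build the benchmarking command for one config file (hypothetical helper)."""
    return [
        "snakemake",
        "--cores", str(cores),
        "--configfile", config.as_posix(),
        "--show-failed-logs",
        "-s", "spras/Snakefile",
    ]


def commands_for(config_dir: Path) -> list[list[str]]:
    """One command per YAML config, in sorted order (assumes every *.yaml is a config)."""
    return [snakemake_command(path) for path in sorted(config_dir.glob("*.yaml"))]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root with the `spras` conda environment active; building the commands separately from running them keeps a dry run cheap.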
56 changes: 56 additions & 0 deletions configs/dmmm.yaml
@@ -0,0 +1,56 @@
# Base Settings
hash_length: 7
container_framework: docker
unpack_singularity: false

container_registry:
  base_url: docker.io
  owner: reedcompbio

reconstruction_settings:
  locations:
    reconstruction_dir: "output"
  run: true

analysis:
  summary:
    include: false
  graphspace:
    include: false
  cytoscape:
    include: false
  ml:
    include: true
    aggregate_per_algorithm: true
  evaluation:
    include: false

# Custom settings
algorithms:
  - name: "omicsintegrator1"
    params:
      include: true
      run1:
        b: [2]
        w: [.5]
        d: [10]
        mu: [2]
  - name: "omicsintegrator2"
    params:
      include: true
      run1:
        b: [4]
        g: [0]

datasets:
  - label: dmmmhiv060
    node_files: ["processed_prize_060.txt"]
    edge_files: ["phosphosite-irefindex13.0-uniprot.txt"]
    # Placeholder
    other_files: []
    data_dir: "datasets/hiv/processed"
  - label: dmmmhiv05
    node_files: ["processed_prize_05.txt"]
    edge_files: ["phosphosite-irefindex13.0-uniprot.txt"]
    other_files: []
    data_dir: "datasets/hiv/processed"
62 changes: 62 additions & 0 deletions configs/pra.yaml
@@ -0,0 +1,62 @@
# Base Settings
# TODO: (same for dmmm.yaml): can we deduplicate this using snakemake?
hash_length: 7
container_framework: docker
unpack_singularity: false

container_registry:
  base_url: docker.io
  owner: reedcompbio

reconstruction_settings:
  locations:
    reconstruction_dir: "output"
  run: true

analysis:
  summary:
    include: false
  graphspace:
    include: false
  cytoscape:
    include: false
  ml:
    include: true
    aggregate_per_algorithm: true
  evaluation:
    include: false

# Custom settings
algorithms:
  - name: "omicsintegrator1"
    params:
      include: true
      run1:
        b: [2]
        w: [.5]
        d: [10]
        mu: [2]
  - name: "omicsintegrator2"
    params:
      include: true
      run1:
        b: [4]
        g: [0]
  - name: "pathlinker"
    params:
      include: true
      run1:
        k: [10, 20]
  - name: "allpairs"
    params:
      include: true

datasets:
  - label: pramuscleskeletal2018
    node_files: ["sources.txt", "targets.txt"]
    # DataLoader.py can currently only load a single edge file, which is the primary network
    edge_files: ["interactome.tsv"]
    # Placeholder
    other_files: []
    # Relative path from the spras directory
    data_dir: "datasets/rn-muscle-skeletal/processed"
1 change: 1 addition & 0 deletions datasets/hiv/.gitignore
@@ -0,0 +1 @@
processed