Merged
2 changes: 2 additions & 0 deletions datasets/rn-muscle-skeletal/.gitignore
@@ -0,0 +1,2 @@
processed
raw
18 changes: 18 additions & 0 deletions datasets/rn-muscle-skeletal/README.md
@@ -0,0 +1,18 @@
# ResponseNet Muscle Skeletal dataset

<!-- TODO: Can we find the source? -->
**This is a paper reproduction**. Other algorithms may also run on this dataset, but we have not found a clear original source for the data.

ResponseNet has not been very clear about the data that they worked on. This folder contains all the files needed to replicate their work.

#### File Breakdown
The two files downloaded directly from ResponseNet are `ResponseNetNetwork.json` and `Muscle_Skeletal-Dec2018.tsv`. The JSON file is ResponseNet's sample output, and is what we compared against SPRAS.

`sources.txt` and `targets.txt` were manually curated from `ResponseNetNetwork.json`.

`Muscle_Skeletal-Dec2018.tsv` is the interactome that ResponseNet uses; they provide a direct download on their site.

#### Other information
To download the files yourself, go to https://netbio.bgu.ac.il/respnet/, specifically https://netbio.bgu.ac.il/labwebsite/the-responsenet-v-3-web-server-download-page/.

You can download an interactome directly by selecting the one you are interested in. To download their sample, look for the `sample output` link and wait for ResponseNet to run. At the time of writing, ResponseNet does not allow you to download the source and target files directly. Instead, go to the Cytoscape section of the software and download the Cytoscape `.json` file.
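
As an aid to the manual curation described above, the node identifiers in the downloaded file can be listed with a few lines of Python. This is only a sketch: it assumes the file follows the usual Cytoscape.js export layout (an `elements` entry holding nodes whose `data` carries an `id`), which has not been verified against `ResponseNetNetwork.json`. The helper name `node_ids` and the sample IDs are hypothetical, and the sketch does not decide which nodes are sources or targets.

```python
import json

def node_ids(network: dict) -> list[str]:
    """Collect the `id` of every node in a Cytoscape.js-style export."""
    elements = network.get("elements", {})
    if isinstance(elements, dict):
        # Grouped layout: {"elements": {"nodes": [...], "edges": [...]}}
        nodes = elements.get("nodes", [])
    else:
        # Flat layout: {"elements": [{"group": "nodes", ...}, ...]}
        nodes = [e for e in elements if e.get("group") == "nodes"]
    return [n["data"]["id"] for n in nodes if "id" in n.get("data", {})]

# Made-up miniature export; the IDs are placeholders, not ResponseNet output.
example = json.loads("""
{"elements": {"nodes": [{"data": {"id": "ENSG00000000001"}},
                         {"data": {"id": "ENSG00000000002"}}],
              "edges": [{"data": {"source": "ENSG00000000001",
                                  "target": "ENSG00000000002"}}]}}
""")
ids = node_ids(example)
print("\n".join(ids))
```

For the real file, replace `example` with `json.load(open("ResponseNetNetwork.json"))` and inspect the node attributes by hand before writing `sources.txt` and `targets.txt`.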
29 changes: 29 additions & 0 deletions datasets/rn-muscle-skeletal/Snakefile
@@ -0,0 +1,29 @@
rule all:
    input:
        "processed/sources.txt",
        "processed/targets.txt",
        "processed/interactome.tsv"

rule get_interactome:
    output:
        "raw/Muscle_Skeletal-Dec2018.tsv"
    shell:
        "uv run gdown https://drive.google.com/file/d/1mkvWrCkeDz1DU-PSEsRaGNM20EcpVLwE/view?usp=sharing -O raw/Muscle_Skeletal-Dec2018.tsv"

rule process_interactome:
    input:
        "raw/Muscle_Skeletal-Dec2018.tsv"
    output:
        "processed/interactome.tsv"
    shell:
        "uv run process.py"

rule copy_curated:
    input:
        "curated/sources.txt",
        "curated/targets.txt"
    output:
        "processed/sources.txt",
        "processed/targets.txt"
    shell:
        "mkdir -p processed && cp curated/* processed"
14 changes: 14 additions & 0 deletions datasets/rn-muscle-skeletal/curated/sources.txt
@@ -0,0 +1,14 @@
ENSG00000105993
ENSG00000119401
ENSG00000173402
ENSG00000120729
ENSG00000102119
ENSG00000178209
ENSG00000054654
ENSG00000155657
ENSG00000160789
ENSG00000022267
ENSG00000152795
ENSG00000131018
ENSG00000102683
ENSG00000142156
15 changes: 15 additions & 0 deletions datasets/rn-muscle-skeletal/curated/targets.txt
@@ -0,0 +1,15 @@
ENSG00000108679
ENSG00000026025
ENSG00000182492
ENSG00000169504
ENSG00000159216
ENSG00000164692
ENSG00000082397
ENSG00000141753
ENSG00000136235
ENSG00000113140
ENSG00000069535
ENSG00000198947
ENSG00000196154
ENSG00000122359
ENSG00000168542
20 changes: 20 additions & 0 deletions datasets/rn-muscle-skeletal/process.py
@@ -0,0 +1,20 @@
from pathlib import Path
import pandas
import os

current_directory = Path(os.path.dirname(os.path.realpath(__file__)))
PROCESSED_DIR = current_directory / 'processed'

def process():
    # TODO: what are the actual last two headers called?
    data = pandas.read_csv(current_directory / 'raw' / 'Muscle_Skeletal-Dec2018.tsv',
                           delimiter='\t', header=None,
                           names=["Interactome1", "Interactome2", "Type1",
                                  "Type2", "InteractionType", "Weight",
                                  "Const1", "Const2"])
    data = data.drop(columns=["Type1", "Type2", "InteractionType", "Const1", "Const2"])
    data.insert(3, "Direction", "U")
    data.to_csv(PROCESSED_DIR / 'interactome.tsv', sep='\t', header=False, index=False)

if __name__ == '__main__':
    process()
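
To make the shape of `processed/interactome.tsv` easier to see, here is a minimal standard-library sketch of the same column selection that `process.py` performs: keep the two node columns and the weight, then append a constant `U` direction. The function name `to_edge_rows` and the sample rows are hypothetical, and the eight-column layout is the one assumed by `process.py`. It is not byte-for-byte equivalent to the pandas version, since pandas may reformat numeric weights (for example `0.40` as `0.4`).

```python
import csv
import io

def to_edge_rows(raw_tsv: str) -> str:
    """Keep columns 0, 1 and 5 (node A, node B, weight) and append 'U'."""
    reader = csv.reader(io.StringIO(raw_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow([row[0], row[1], row[5], "U"])
    return out.getvalue()

# Made-up rows in the assumed eight-column layout (not real ResponseNet data).
sample = (
    "ENSG00000000001\tENSG00000000002\tprotein\tprotein\tPPI\t0.75\t1\t1\n"
    "ENSG00000000003\tENSG00000000004\tprotein\tprotein\tPPI\t0.40\t1\t1\n"
)
result = to_edge_rows(sample)
print(result, end="")
```

Each output row has four tab-separated fields: first node, second node, weight, and direction.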