Draft
71 commits
13824a6
chore: re-refactor PRM properties
tristan-f-r Jun 23, 2025
543915f
fix: correct prop names
tristan-f-r Jun 23, 2025
352ba56
style: fmt
tristan-f-r Jun 23, 2025
2fdb13c
chore: add second doi in pl
tristan-f-r Jun 23, 2025
49fd4be
refactor: don't use globals in runner
tristan-f-r Jun 24, 2025
c2e64f7
feat: begin config refactor
tristan-f-r Jun 25, 2025
4d1a19c
feat: mostly structured config
tristan-f-r Jun 25, 2025
b56ecde
feat: add enum variants on ml
tristan-f-r Jun 25, 2025
bf95888
fix: some defaults
tristan-f-r Jun 25, 2025
51d6a7b
feat: fully finish config parsing
tristan-f-r Jun 25, 2025
a27d38d
style: fmt
tristan-f-r Jun 25, 2025
dd4674a
fix: remove dep mark, use strict is None
tristan-f-r Jun 25, 2025
5a8826d
chore: correct config loc
tristan-f-r Jun 25, 2025
a47b0df
fix: specify hac params
tristan-f-r Jun 25, 2025
d721656
Merge branch 'umain' into config-pydantic
tristan-f-r Jun 26, 2025
8d75604
fix: expand class params
tristan-f-r Jun 26, 2025
afa1de5
fix: expand on pca_params
tristan-f-r Jun 26, 2025
5eefc51
fix: drop include dict
tristan-f-r Jun 26, 2025
5243186
fix: call items
tristan-f-r Jun 26, 2025
3b20c48
fix: better typing and deafults
tristan-f-r Jun 26, 2025
4c1fcb6
Merge branch 'umain' into config-pydantic
tristan-f-r Jun 26, 2025
2d4a90f
style: fmt
tristan-f-r Jun 26, 2025
31ba9d8
docs: mention oi2 paper link
tristan-f-r Jul 1, 2025
a296a9b
Merge branch 'umain' into property-expe
tristan-f-r Jul 1, 2025
fcbf673
chore: add nodoi to rwr
tristan-f-r Jul 1, 2025
ec7e19a
Merge branch 'umain' into no-globals
tristan-f-r Jul 3, 2025
b9352e8
fix: use correct naming convention for strwr
tristan-f-r Jul 3, 2025
1c55925
Merge branch 'umain' into config-pydantic
tristan-f-r Jul 9, 2025
2a4fb2e
refactor: add config forbid
tristan-f-r Jul 9, 2025
4ded57e
refactor: update config imports
tristan-f-r Jul 9, 2025
22b5686
refactor: better names to schema files
tristan-f-r Jul 9, 2025
fd091cb
Merge branch 'umain' into property-expe
tristan-f-r Jul 10, 2025
7df701d
chore: add btb doi
tristan-f-r Jul 10, 2025
ea59e4c
fix: no default include, mention model_config allow reason
tristan-f-r Jul 11, 2025
fa7d7c9
fix(config): case-insensitive check on labels
tristan-f-r Jul 11, 2025
52eab21
refactor: merge config
tristan-f-r Jul 14, 2025
5343fd0
chore: deduplicate err
tristan-f-r Jul 14, 2025
3c305f4
docs: use concepts link
tristan-f-r Jul 14, 2025
7d521ce
Merge branch 'property-expe' into config-args
tristan-f-r Jul 14, 2025
49e50a0
refactor: mv util_enum -> util
tristan-f-r Jul 14, 2025
cb28f61
docs: correct util_enum path
tristan-f-r Jul 14, 2025
9c85c56
Merge branch 'config-pydantic' into config-args
tristan-f-r Jul 14, 2025
647f947
feat: rough draft of args design
tristan-f-r Jul 14, 2025
76011e0
feat: type oi1/oi2, rwr/strwr
tristan-f-r Jul 14, 2025
94b50c8
refactor: meo, mcf, pl types
tristan-f-r Jul 14, 2025
09fa1ba
chore: begin slowly updating
tristan-f-r Jul 14, 2025
32d4b5c
refactor: moving more tests
tristan-f-r Jul 14, 2025
9b539e9
fix: correct params
tristan-f-r Jul 14, 2025
da67711
fix: specify default args out of run
tristan-f-r Jul 14, 2025
45cfe87
fix: more defaults
tristan-f-r Jul 14, 2025
a6406e2
Merge branch 'umain' into config-args
tristan-f-r Jul 14, 2025
cf93cec
Merge branch 'no-globals' into config-args
tristan-f-r Jul 14, 2025
e080857
feat: begin algorithm parsing
tristan-f-r Jul 14, 2025
53f55e2
fix: clean up type errors, begin nondetermnism
tristan-f-r Jul 14, 2025
2c938ed
fix: add spras.config to pyproject
tristan-f-r Jul 14, 2025
7c2454b
Merge branch 'config-pydantic' into config-args
tristan-f-r Jul 14, 2025
a4e265d
chore: begin little utility
tristan-f-r Jul 14, 2025
145b2ec
chore: mv container schema changes over
tristan-f-r Jul 14, 2025
5effe69
feat: initial schema
tristan-f-r Jul 15, 2025
398350e
feat: more algs schema handling
tristan-f-r Jul 15, 2025
72c4cbd
feat: default runs for default algorithms
tristan-f-r Jul 15, 2025
2ef2672
feat: function running
tristan-f-r Jul 15, 2025
9442b64
chore: drop play
tristan-f-r Jul 15, 2025
60b562f
fix(config): don't try to parse in config.py
tristan-f-r Jul 15, 2025
c1947e6
fix: subscriptability
tristan-f-r Jul 15, 2025
8beaf72
fix: auto-discriminator mapping & forbid
tristan-f-r Jul 15, 2025
b07a7ef
style: fmt
tristan-f-r Jul 15, 2025
0bcd1d1
fix: coerce fields to validate default
tristan-f-r Jul 15, 2025
1cb5d17
fix: test
tristan-f-r Jul 15, 2025
c93244f
fix: correct all algorithm usage
tristan-f-r Jul 15, 2025
69268f4
chore: talk about resumability
tristan-f-r Jul 15, 2025
2 changes: 0 additions & 2 deletions CONTRIBUTING.md
@@ -142,8 +142,6 @@ and your editor's interpreter is set to using the SPRAS environment over the bas
Note the behaviors of the `request_node_columns` function when there are missing values in that column of the node table and when multiple columns are requested.
`request_node_columns` always returns the `NODEID` column in addition to the requested columns.

-Note: If you encounter a `'property' object is not iterable` error arising from inside the Snakefile, this means that `required_inputs` is not set. This is because when `required_inputs` is not set inside an algorithm wrapper, it falls back to the underlying unimplemented function inside the PRM base class, which, while it is marked as a property function, is non-static; therefore, when the runner utility class tries to dynamically fetch `required_inputs` with reflection, it ends up grabbing the `property` function instead of the underlying error, and tries to iterate over it (since `required_inputs` is usually a list.)
-
Now implement the `generate_inputs` function.
Start by inspecting the `omicsintegrator1.py` example, but note the differences in the expected file formats generated for the two algorithms with respect to the header rows and node prize column.
The selected nodes should be any node in the dataset that has a prize set, any node that is active, any node that is a source, or any node that is a target.
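The note removed from CONTRIBUTING.md in this hunk describes a general Python behavior that can be reproduced outside SPRAS. The sketch below is a minimal illustration, not the project's actual code: the class names `PRM`, `GoodWrapper`, `ForgetfulWrapper` and the helper `fetch_required_inputs` are stand-ins chosen for this example. It shows that looking up a property on a class, rather than on an instance, returns the `property` object itself, and that iterating over it raises the error quoted in the note.

```python
class PRM:
    """Stand-in base class; illustrative only, not SPRAS's real PRM class."""

    @property
    def required_inputs(self):
        raise NotImplementedError("required_inputs must be set by the wrapper")


class GoodWrapper(PRM):
    # A wrapper that sets the attribute shadows the base-class property.
    required_inputs = ["nodes", "edges"]


class ForgetfulWrapper(PRM):
    # required_inputs is not set, so attribute lookup falls back to PRM.
    pass


def fetch_required_inputs(cls):
    # Hypothetical reflection helper: the name is looked up on the class,
    # not on an instance, so the property getter is never called.
    return getattr(cls, "required_inputs")


print(list(fetch_required_inputs(GoodWrapper)))

try:
    list(fetch_required_inputs(ForgetfulWrapper))
except TypeError as err:
    # The NotImplementedError is never reached; iteration fails first.
    print(err)
```

Because the getter never runs, the `NotImplementedError` in the base class is hidden behind a `TypeError`, which is why the original message was confusing.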
4 changes: 2 additions & 2 deletions Snakefile
@@ -5,7 +5,7 @@ import yaml
from spras.dataset import Dataset
from spras.evaluation import Evaluation
from spras.analysis import ml, summary, cytoscape
-import spras.config as _config
+import spras.config.config as _config

# Snakemake updated the behavior in the 6.5.0 release https://github.com/snakemake/snakemake/pull/1037
# and using the wrong separator prevents Snakemake from matching filenames to the rules that can produce them
@@ -25,7 +25,7 @@ algorithm_params = _config.config.algorithm_params
algorithm_directed = _config.config.algorithm_directed
pca_params = _config.config.pca_params
hac_params = _config.config.hac_params
-FRAMEWORK = _config.config.container_framework
+FRAMEWORK = _config.config.container_settings.framework

# Return the dataset or gold_standard dictionary from the config file given the label
def get_dataset(_datasets, label):
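The Snakefile change replaces a flat `container_framework` attribute with a nested `container_settings.framework` lookup. The following sketch shows one way such a nested settings object could be built from the `containers` mapping in the config. It is an assumption-laden illustration: the PR's commit messages mention pydantic, but this example uses standard-library dataclasses instead so it runs without dependencies, and the names `Registry`, `ContainerSettings`, `parse_containers`, and `VALID_FRAMEWORKS` are hypothetical. Only the `docker` default is stated in the config comments; the registry defaults here are simply the values shown in `config.yaml`.

```python
from dataclasses import dataclass, field

# Frameworks listed in the config.yaml comments.
VALID_FRAMEWORKS = {"docker", "singularity", "dsub"}


@dataclass(frozen=True)
class Registry:
    # Assumed defaults, copied from the example config.
    base_url: str = "docker.io"
    owner: str = "reedcompbio"


@dataclass(frozen=True)
class ContainerSettings:
    framework: str = "docker"  # documented default when not specified
    unpack_singularity: bool = False
    registry: Registry = field(default_factory=Registry)


def parse_containers(raw: dict) -> ContainerSettings:
    """Build ContainerSettings from a `containers` mapping, rejecting unknown keys."""
    unknown = set(raw) - {"framework", "unpack_singularity", "registry"}
    if unknown:
        raise ValueError(f"unknown container keys: {sorted(unknown)}")
    framework = raw.get("framework", "docker")
    if framework not in VALID_FRAMEWORKS:
        raise ValueError(f"unsupported framework: {framework!r}")
    return ContainerSettings(
        framework=framework,
        unpack_singularity=bool(raw.get("unpack_singularity", False)),
        registry=Registry(**raw.get("registry", {})),
    )


settings = parse_containers({
    "framework": "docker",
    "unpack_singularity": False,
    "registry": {"base_url": "docker.io", "owner": "reedcompbio"},
})
FRAMEWORK = settings.framework  # mirrors the nested lookup in the Snakefile
print(FRAMEWORK)
```

Grouping the three container options under one object means a misspelled or leftover top-level key, such as the old `container_framework`, can be rejected at load time rather than silently ignored.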
94 changes: 47 additions & 47 deletions config/config.yaml
@@ -1,32 +1,36 @@
# yaml-language-server: $schema=./schema.json

# Global workflow control

# The length of the hash used to identify a parameter combination
hash_length: 7

# Specify the container framework used by each PRM wrapper. Valid options include:
# - docker (default if not specified)
# - singularity -- Also known as apptainer, useful in HPC/HTC environments where docker isn't allowed
# - dsub -- experimental with limited support, used for running on Google Cloud with the All of Us cloud environment.
# - There is no support for other environments at the moment.
container_framework: docker

# Only used if container_framework is set to singularity, this will unpack the singularity containers
# to the local filesystem. This is useful when PRM containers need to run inside another container,
# such as would be the case in an HTCondor/OSPool environment.
# NOTE: This unpacks singularity containers to the local filesystem, which will take up space in a way
# that persists after the workflow is complete. To clean up the unpacked containers, the user must
# manually delete them. For convenience, these unpacked files will exist in the current working directory
# under `unpacked`.
unpack_singularity: false

# Allow the user to configure which container registry containers should be pulled from
# Note that this assumes container names are consistent across registries, and that the
# registry being passed doesn't require authentication for pull actions
container_registry:
base_url: docker.io
# The owner or project of the registry
# For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
owner: reedcompbio
# Collection of container options
containers:
# Specify the container framework used by each PRM wrapper. Valid options include:
# - docker (default if not specified)
# - singularity -- Also known as apptainer, useful in HPC/HTC environments where docker isn't allowed
# - dsub -- experimental with limited support, used for running on Google Cloud with the All of Us cloud environment.
# - There is no support for other environments at the moment.
framework: docker

# Only used if container_framework is set to singularity, this will unpack the singularity containers
# to the local filesystem. This is useful when PRM containers need to run inside another container,
# such as would be the case in an HTCondor/OSPool environment.
# NOTE: This unpacks singularity containers to the local filesystem, which will take up space in a way
# that persists after the workflow is complete. To clean up the unpacked containers, the user must
# manually delete them. For convenience, these unpacked files will exist in the current working directory
# under `unpacked`.
unpack_singularity: false

# Allow the user to configure which container registry containers should be pulled from
# Note that this assumes container names are consistent across registries, and that the
# registry being passed doesn't require authentication for pull actions
registry:
base_url: docker.io
# The owner or project of the registry
# For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
owner: reedcompbio

# This list of algorithms should be generated by a script which checks the filesystem for installs.
# It shouldn't be changed by mere mortals. (alternatively, we could add a path to executable for each algorithm
@@ -48,23 +52,23 @@ container_registry:

algorithms:
- name: "pathlinker"
params:
include: true
include: true
runs:
run1:
k: range(100,201,100)

- name: "omicsintegrator1"
params:
include: true
include: true
runs:
run1:
b: [5, 6]
w: np.linspace(0,5,2)
d: 10
dummy_mode: "file" # Or "terminals", "all", "others"

- name: "omicsintegrator2"
params:
include: true
include: true
runs:
run1:
b: 4
g: 0
@@ -73,48 +77,46 @@ algorithms:
g: 3

- name: "meo"
params:
include: true
include: true
runs:
run1:
max_path_length: 3
local_search: "Yes"
local_search: true
rand_restarts: 10

- name: "mincostflow"
params:
include: true
include: true
runs:
run1:
flow: 1 # The flow must be an int
capacity: 1

- name: "allpairs"
params:
include: true
include: true

- name: "domino"
params:
include: true
include: true
runs:
run1:
slice_threshold: 0.3
module_threshold: 0.05

- name: "strwr"
params:
include: true
include: true
runs:
run1:
alpha: [0.85]
threshold: [100, 200]

- name: "rwr"
params:
include: true
include: true
runs:
run1:
alpha: [0.85]
threshold: [100, 200]

- name: "bowtiebuilder"
params:
include: true
include: true

# Here we specify which pathways to run and other file location information.
# DataLoader.py can currently only load a single dataset
@@ -164,8 +166,6 @@ reconstruction_settings:
# TODO move to global
reconstruction_dir: "output"

run: true

analysis:
# Create one summary per pathway file and a single summary table for all pathways for each dataset
summary:
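The `runs` blocks in `config.yaml` mix scalars (`d: 10`) with lists (`b: [5, 6]`), and `hash_length` sets the length of the hash that identifies each parameter combination. The sketch below illustrates how one run could be expanded into its combinations and labelled. It is a rough approximation under stated assumptions: `expand_run` and `params_hash` are hypothetical helpers, the truncated SHA-1 of a sorted JSON dump is an invented labelling scheme rather than SPRAS's actual hashing, and the `range(...)` and `np.linspace(...)` expressions are assumed to have been evaluated to plain lists beforehand.

```python
import hashlib
import itertools
import json


def expand_run(run: dict) -> list[dict]:
    """Expand one run's parameters into every combination.

    Scalars are treated as single-element lists, so `d: 10` and `d: [10]`
    produce the same combinations.
    """
    names = sorted(run)
    value_lists = [
        list(run[name]) if isinstance(run[name], (list, tuple, range)) else [run[name]]
        for name in names
    ]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def params_hash(params: dict, hash_length: int = 7) -> str:
    """Hypothetical label: truncated SHA-1 of the key-sorted JSON form."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()[:hash_length]


# omicsintegrator1's run1 from config.yaml, with np.linspace(0,5,2)
# already evaluated to [0.0, 5.0] for this sketch.
run1 = {"b": [5, 6], "w": [0.0, 5.0], "d": 10, "dummy_mode": "file"}

for combo in expand_run(run1):
    print(params_hash(combo), combo)
```

Sorting the keys before hashing keeps the label stable regardless of the order in which parameters appear in the YAML file.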
129 changes: 44 additions & 85 deletions config/egfr.yaml
@@ -1,116 +1,76 @@
# The length of the hash used to identify a parameter combination
hash_length: 7

# Specify the container framework used by each PRM wrapper. Valid options include:
# - docker (default if not specified)
# - singularity -- Also known as apptainer, useful in HPC/HTC environments where docker isn't allowed
# - dsub -- experimental with limited support, used for running on Google Cloud
container_framework: docker

# Only used if container_framework is set to singularity, this will unpack the singularity containers
# to the local filesystem. This is useful when PRM containers need to run inside another container,
# such as would be the case in an HTCondor/OSPool environment.
# NOTE: This unpacks singularity containers to the local filesystem, which will take up space in a way
# that persists after the workflow is complete. To clean up the unpacked containers, the user must
# manually delete them.
unpack_singularity: false
# yaml-language-server: $schema=./schema.json

# Allow the user to configure which container registry containers should be pulled from
# Note that this assumes container names are consistent across registries, and that the
# registry being passed doesn't require authentication for pull actions
container_registry:
base_url: docker.io
# The owner or project of the registry
# For example, "reedcompbio" if the image is available as docker.io/reedcompbio/allpairs
owner: reedcompbio
hash_length: 7
containers:
framework: docker
unpack_singularity: false
registry:
base_url: docker.io
owner: reedcompbio

algorithms:
- name: pathlinker
params:
include: true
include: true
runs:
run1:
k:
- 10
- 20
- 70
- name: omicsintegrator1
params:
include: true
include: true
runs:
run1:
b:
- 0.55
- 2
- 10
d:
- 10
g:
- 1e-3
r:
- 0.01
w:
- 0.1
mu:
- 0.008
d: 10
g: 1e-3
r: 0.01
w: 0.1
mu: 0.008
dummy_mode: ["file"]
- name: omicsintegrator2
params:
include: true
include: true
runs:
run1:
b:
- 4
g:
- 0
b: 4
g: 0
run2:
b:
- 2
g:
- 3
b: 2
g: 3
- name: meo
params:
include: true
include: true
runs:
run1:
local_search:
- "Yes"
max_path_length:
- 3
rand_restarts:
- 10
local_search: true
max_path_length: 3
rand_restarts: 10
run2:
local_search:
- "No"
max_path_length:
- 2
rand_restarts:
- 10
local_search: false
max_path_length: 2
rand_restarts: 10
- name: allpairs
params:
include: true
include: true
- name: domino
params:
include: true
include: true
runs:
run1:
slice_threshold:
- 0.3
module_threshold:
- 0.05
slice_threshold: 0.3
module_threshold: 0.05
- name: mincostflow
params:
include: true
include: true
runs:
run1:
capacity:
- 15
flow:
- 80
capacity: 15
flow: 80
run2:
capacity:
- 1
flow:
- 6
capacity: 1
flow: 6
run3:
capacity:
- 5
flow:
- 60
capacity: 5
flow: 60
datasets:
- data_dir: input
edge_files:
@@ -129,7 +89,6 @@ gold_standards:
reconstruction_settings:
locations:
reconstruction_dir: output/egfr
run: true
analysis:
cytoscape:
include: true
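Both config diffs apply the same restructuring to each algorithm entry: `include` moves out of `params` to the top level, the named runs move under a new `runs` key, and MEO's `local_search` changes from `"Yes"`/`"No"` strings to booleans. A user with an existing config might apply that change mechanically along the lines of the sketch below. This is not a tool shipped in the PR: `migrate_algorithm`, `migrate_run`, and `YES_NO` are hypothetical names, and the code assumes the old entry always has `include` under `params`.

```python
YES_NO = {"Yes": True, "No": False}


def migrate_run(run: dict) -> dict:
    """Copy one run, converting local_search "Yes"/"No" values to booleans."""
    new_run = {}
    for key, value in run.items():
        if key == "local_search":
            values = value if isinstance(value, list) else [value]
            converted = [YES_NO.get(v, v) for v in values]
            # A single value is unwrapped, as in the egfr.yaml diff.
            value = converted[0] if len(converted) == 1 else converted
        new_run[key] = value
    return new_run


def migrate_algorithm(old: dict) -> dict:
    """Convert one old-style algorithm entry to the new layout.

    Old: {"name": ..., "params": {"include": ..., "run1": {...}, ...}}
    New: {"name": ..., "include": ..., "runs": {"run1": {...}, ...}}
    """
    params = dict(old["params"])
    new = {"name": old["name"], "include": params.pop("include")}
    if params:  # everything left under params is a named run
        new["runs"] = {name: migrate_run(run) for name, run in params.items()}
    return new


old_meo = {
    "name": "meo",
    "params": {
        "include": True,
        "run1": {"local_search": ["Yes"], "max_path_length": [3], "rand_restarts": [10]},
    },
}
print(migrate_algorithm(old_meo))
```

Entries without runs, such as `allpairs`, end up with only `name` and `include`, which matches the new layout shown in the diff.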