FastTENET

Introduction

  • FastTENET is an accelerated implementation of the TENET algorithm based on manycore computing.

Installation

  • 🐍 Anaconda is recommended for installing and developing FastTENET.
  • 🐧 Linux distributions are tested and recommended for using and developing FastTENET.

Anaconda virtual environment

After installing Anaconda, create a conda virtual environment for FastTENET. In the following command, you can change the Python version (e.g., python=3.7 or python=3.9).

conda create -n fasttenet python=3.9

Now, we can activate our virtual environment for FastTENET as follows.

conda activate fasttenet

Install from PyPi

pip install fasttenet
  • The default backend framework of FastTENET is PyTorch Lightning.
  • Other backend frameworks, such as CuPy, JAX, and TensorFlow, must be installed manually (see "Install backend frameworks" below).

Install from GitHub repository

First, clone the latest version of this repository.

git clone https://github.com/cxinsys/fasttenet.git

Now, we need to install FastTENET as a module.

cd fasttenet
pip install -e .
  • The default backend framework of FastTENET is PyTorch Lightning.

Install backend frameworks

FastTENET supports several backend frameworks, including CuPy, JAX, TensorFlow, PyTorch, and PyTorch Lightning.
To use a framework other than the default, you need to install it manually.


  • PyTorch Lightning

PyTorch Lightning is a required dependency library for FastTENET and is installed automatically when you install FastTENET.
If the library is not installed, you can install it manually via pip.

python -m pip install lightning

  • PyTorch

PyTorch is a required dependency library for FastTENET and is installed automatically when you install FastTENET.
If the library is not installed, you can install it manually (replace xx.x with your CUDA version).

conda install pytorch torchvision torchaudio pytorch-cuda=xx.x -c pytorch -c nvidia

  • CuPy

Install CuPy from Conda-Forge with the cudatoolkit version supported by your driver (replace xx.x with your CUDA version).

conda install -c conda-forge cupy cuda-version=xx.x

  • JAX

Install JAX with CUDA 12.x support.

pip install -U "jax[cuda12]"

Set XLA_PYTHON_CLIENT_PREALLOCATE=false to disable JAX's GPU memory preallocation behavior
(https://jax.readthedocs.io/en/latest/gpu_memory_allocation.html).
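
If you set the variable from Python, it must be set before JAX is first imported. A minimal sketch:

import os

# Disable JAX's GPU memory preallocation; this must happen before
# the first 'import jax' in the process.
os.environ['XLA_PYTHON_CLIENT_PREALLOCATE'] = 'false'

import jax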


  • TensorFlow

Install TensorFlow-GPU with CUDA.

python3 -m pip install tensorflow[and-cuda]

FastTENET tutorial

Create FastTENET instance

The FastTENET class requires data paths as parameters.

parameters

  • dpath_exp_data: expression data path, required
  • dpath_trj_data: trajectory data path, required
  • dpath_branch_data: branch (cell select) data path, required
  • dpath_tf_data: tf data path, required
  • spath_result_matrix: result matrix data path, optional, default: None
  • make_binary: if True, make binary expression and node name file, optional, default: False
import fasttenet as fte

worker = fte.FastTENET(dpath_exp_data=dpath_exp_data,
                       dpath_trj_data=dpath_trj_data,
                       dpath_branch_data=dpath_branch_data,
                       dpath_tf_data=dpath_tf_data,
                       spath_result_matrix=spath_result_matrix,
                       make_binary=True)

  • aligned_data: rearranged data built from the expression, trajectory, and branch data; use when passing data directly, optional
  • node_name: 1d array of node names, required when passing data directly
  • tfs: 1d array of tf names, optional when passing data directly
import numpy as np

import fasttenet as fte

node_name, exp_data = fte.load_exp_data(dpath_exp_data, make_binary=True)
trajectory = fte.load_time_data(dpath_trj_data, dtype=np.float32)
branch = fte.load_time_data(dpath_branch_data, dtype=np.int32)
tf = np.loadtxt(dpath_tf_data, dtype=str)

aligned_data = fte.align_data(data=exp_data, trj=trajectory, branch=branch)
        
worker = fte.FastTENET(aligned_data=aligned_data,
                       node_name=node_name,
                       tfs=tf,
                       spath_result_matrix=spath_result_matrix) # Optional


Run FastTENET

parameters

  • backend: optional, default: 'cpu'
  • device_ids: list or number of devices to use, optional, default: [0] (cpu), [list of all gpu devices] (gpu)
  • procs_per_device: the number of processes to create per device when using non-'cpu' devices, optional, default: 1
  • batch_size: required
  • kp: kernel percentile, optional, default: 0.5
  • binning_method: discretization method for expression values, optional, 'FSBW-L' is recommended to achieve results similar to TENET.
result_matrix = worker.run(backend='gpu',
                           device_ids=8,
                           procs_per_device=4,
                           batch_size=2 ** 16,
                           kp=0.5,
                           binning_method='FSBW-L')
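
The returned result matrix can then be inspected directly. A minimal sketch, assuming run() returns a 2-D NumPy array of pairwise TE values whose rows and columns follow node_name (an assumption; verify against your FastTENET version, and note that supplying tfs may restrict the rows):

import numpy as np

# Hypothetical inspection of the TE matrix returned by run().
print(result_matrix.shape)  # assumed shape: (num_genes, num_genes)

# Largest single TE value and its (source, target) gene pair.
src, tgt = np.unravel_index(np.argmax(result_matrix), result_matrix.shape)
print(node_name[src], '->', node_name[tgt], result_matrix[src, tgt])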


Run FastTENET with config file

  • Before running tutorial_config.py, the batch_size parameter must be modified to fit your GPU memory size
  • You can set parameters and run FastTENET via a YAML file (a hypothetical sketch follows this list)
  • The config file must have values set for all required parameters
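
The sketch below shows what such a config file might look like. The key names here are assumptions, not confirmed names; copy the actual keys from the examples in the repository's configs directory (e.g., config_tuck_sub.yml).

# Hypothetical config sketch -- key names are assumptions; use the
# actual keys from configs/config_tuck_sub.yml in the repository.
dpath_exp_data: expression_dataTuck.csv
dpath_trj_data: pseudotimeTuck.txt
dpath_branch_data: cell_selectTuck.txt
dpath_tf_data: mouse_tfs.txt
spath_result_matrix: TE_result_matrix.txt
backend: gpu
device_ids: 8
procs_per_device: 4
batch_size: 32768
kp: 0.5
binning_method: FSBW-L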

Usage

python tutorial_config.py --config [config file path]

Example

python tutorial_config.py --config ../configs/config_tuck_sub.yml

Output

TE_result_matrix.txt

ex)
TE	GENE_1	GENE_2	GENE_3	...	GENE_M
GENE_1	0	0.05	0.02	...	0.004
GENE_2	0.01	0	0.04	...	0.12
GENE_3	0.003	0.003	0	...	0.001
.
.
.
GENE_M	0.34	0.012	0.032	...	0

Run FastTENET with tutorial_notf.py

  • Before running tutorial_notf.py, the batch_size parameter must be modified to fit your GPU memory size

Usage

python tutorial_notf.py --fp_exp [expression file path] 
                        --fp_trj [trajectory file path] 
                        --fp_br [cell select file path] 
                        --backend [name of backend framework]
                        --num_devices [number of devices]
                        --batch_size [batch size]
                        --sp_rm [save file path]

Example

python tutorial_notf.py --fp_exp expression_dataTuck.csv 
                        --fp_trj pseudotimeTuck.txt 
                        --fp_br cell_selectTuck.txt 
                        --backend lightning
                        --num_devices 8
                        --batch_size 32768
                        --sp_rm TE_result_matrix.txt

Output

TE_result_matrix.txt

Run FastTENET with tutorial_tf.py

  • Before running tutorial_tf.py, the batch_size parameter must be modified to fit your GPU memory size

Usage

python tutorial_tf.py --fp_exp [expression file path] 
                      --fp_trj [trajectory file path] 
                      --fp_br [cell select file path] 
                      --fp_tf [tf file path] 
                      --backend [name of backend framework]
                      --num_devices [number of devices]
                      --batch_size [batch size]
                      --sp_rm [save file path]

Example

python tutorial_tf.py --fp_exp expression_dataTuck.csv 
                      --fp_trj pseudotimeTuck.txt 
                      --fp_br cell_selectTuck.txt 
                      --fp_tf mouse_tfs.txt 
                      --backend lightning
                      --num_devices 8
                      --batch_size 32768
                      --sp_rm TE_result_matrix.txt

Output

TE_result_matrix.txt

ex)
TE	GENE_1	GENE_2	GENE_3	...	GENE_M
GENE_1	0	0.05	0.02	...	0.004
GENE_2	0.01	0	0.04	...	0.12
GENE_3	0.003	0.003	0	...	0.001
.
.
.
GENE_M	0.34	0.012	0.032	...	0


Downstream analysis tutorial

Create NetWeaver instance

parameters

  • result_matrix: TE result matrix from FastTENET, required
  • gene_names: gene names from the result matrix, required
  • tfs: tf list, optional
  • fdr: false discovery rate cutoff, optional, default: 0.01
  • links: number of outdegree links to keep, optional, default: 0
  • is_trimming: if set to True, a trimming operation is applied to the GRN, optional, default: True
  • trim_threshold: trimming threshold, optional, default: 0
import numpy as np

import fasttenet as fte

result_matrix = np.loadtxt(fpath_result_matrix, delimiter='\t', dtype=str)
gene_name = result_matrix[0][1:]
result_matrix = result_matrix[1:, 1:].astype(np.float32)

tf = np.loadtxt(fpath_tf, dtype=str)

weaver = fte.NetWeaver(result_matrix=result_matrix,
                       gene_names=gene_name,
                       tfs=tf,
                       fdr=fdr,
                       links=links,
                       is_trimming=True,
                       trim_threshold=trim_threshold,
                       dtype=np.float32
                       )

Run weaver

  • backend: optional, default: 'cpu'
  • device_ids: list or number of devices to use, optional, default: [0] (cpu), [list of whole gpu devices] (gpu)
  • batch_size: if set to 0, the batch size will be calculated automatically, optional, default: 0
grn, trimmed_grn = weaver.run(backend=backend,
                              device_ids=device_ids,
                              batch_size=batch_size)

Count outdegree

  • grn: required
outdegrees = weaver.count_outdegree(grn)
trimmed_ods = weaver.count_outdegree(trimmed_grn)
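
To find the most connected regulators, the outdegree results can be ranked. A minimal sketch, assuming count_outdegree returns an iterable of (gene, count) pairs (an assumption; check the actual return type in your version):

# Hypothetical ranking of regulators by outdegree; assumes
# (gene, count) pairs -- verify against your FastTENET version.
ranked = sorted(outdegrees, key=lambda pair: int(pair[1]), reverse=True)
for gene, count in ranked[:10]:
    print(gene, count)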


Downstream analysis with reconstruct_grn.py

reconstruct_grn.py is a tutorial script that reconstructs a GRN and writes the GRN and outdegree files.

Usage

When specifying an FDR

python reconstruct_grn.py --fp_rm [result matrix path] --fp_tf [tf file path] --fdr [fdr] --backend [backend] --device_ids [number of devices]

Example

python reconstruct_grn.py --fp_rm TE_result_matrix.txt --fp_tf mouse_tf.txt --fdr 0.01 --backend gpu --device_ids 1

Output

TE_result_matrix.fdr0.01.sif, TE_result_matrix.fdr0.01.sif.outdegrees.txt
TE_result_matrix.fdr0.01.trimIndirect0.sif, TE_result_matrix.fdr0.01.trimIndirect0.sif.outdegrees.txt
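
The .sif outputs follow the Cytoscape Simple Interaction Format. A minimal sketch for reading one, assuming one tab-separated (source, relation, target) edge per line (an assumption; verify against your actual output files):

import csv

# Read a reconstructed GRN from a SIF file into a list of edges.
# The three-column layout is assumed, not confirmed by the source.
with open('TE_result_matrix.fdr0.01.sif') as f:
    edges = list(csv.reader(f, delimiter='\t'))

print(len(edges), 'edges')
print(edges[0])  # expected: [source, relation, target]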

Usage

When specifying the number of links

python reconstruct_grn.py --fp_rm [result matrix path] --fp_tf [tf file path] --links [links] --backend [backend] --device_ids [number of devices]

Example

python reconstruct_grn.py --fp_rm TE_result_matrix.txt --fp_tf mouse_tf.txt --links 1000 --backend gpu --device_ids 1

Output

TE_result_matrix.links1000.sif, TE_result_matrix.links1000.sif.outdegrees.txt
TE_result_matrix.links1000.trimIndirect0.sif, TE_result_matrix.links1000.trimIndirect0.sif.outdegrees.txt

TODO

  • add 'JAX' backend module
  • add 'PyTorch Lightning' backend module
  • add 'TensorFlow' backend module