

Merge pull request #2 from HiDiHlabs/dev
Prepare Publication
niklasmueboe committed Aug 24, 2024
2 parents e07bc39 + 28f81f5 commit 8db4238
Showing 8 changed files with 191 additions and 18 deletions.
21 changes: 21 additions & 0 deletions .github/workflows/publish-pypi.yml
@@ -0,0 +1,21 @@
name: Publish Python Package to PyPI

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi
    permissions:
      id-token: write # to authenticate as Trusted Publisher to pypi.org
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.x"
          cache: "pip"
      - run: pip install build
      - run: python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
10 changes: 6 additions & 4 deletions .pre-commit-config.yaml
@@ -1,3 +1,5 @@
+ci:
+  autoupdate_schedule: quarterly
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.6.0
@@ -13,7 +15,7 @@ repos:
       - id: no-commit-to-branch
         args: [--branch=main]
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.3.5
+    rev: v0.6.1
     hooks:
       - id: ruff
         args: [--fix]
@@ -22,15 +24,15 @@ repos:
     hooks:
       - id: isort
   - repo: https://github.com/psf/black
-    rev: 24.3.0
+    rev: 24.8.0
     hooks:
       - id: black
   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.9.0
+    rev: v1.11.1
     hooks:
       - id: mypy
   - repo: https://github.com/codespell-project/codespell
-    rev: v2.2.6
+    rev: v2.3.0
     hooks:
       - id: codespell
         additional_dependencies:
29 changes: 25 additions & 4 deletions README.md
@@ -17,13 +17,34 @@ but can also be used independently.
 
 ## Installation
 
-`spatialleiden` will be made available on [PyPI](https://pypi.org/) and
-[bioconda](https://bioconda.github.io/). For detailed installation instructions
-please refer to the [documentation](https://spatialleiden.readthedocs.io/en/stable/installation.html).
+`spatialleiden` is available on [PyPI](https://pypi.org/project/spatialleiden/) and
+[bioconda](https://bioconda.github.io/recipes/spatialleiden/README.html).
+
+```sh
+# PyPI
+pip install spatialleiden
+```
+
+```sh
+# or conda
+conda install bioconda::spatialleiden
+```
+
+For detailed installation instructions please refer to the
+[documentation](https://spatialleiden.readthedocs.io/stable/installation.html).
 
 ## Documentation
 
-For documentation of the package please refer to the [ReadTheDocs page](https://spatialleiden.readthedocs.io/)
+For documentation of the package please refer to the
+[ReadTheDocs page](https://spatialleiden.readthedocs.io/).
+
+## Citations
+
+If you are using `spatialleiden` for your research please cite
+
+Müller-Bötticher, N., Sahay, S., Eils, R., and Ishaque, N.
+"SpatialLeiden - Spatially-aware Leiden clustering"
+bioRxiv (2024) https://doi.org/10.1101/2024.08.23.609349
 
 ## Versioning
 
4 changes: 4 additions & 0 deletions docs/source/conf.py
@@ -27,6 +27,7 @@
     "sphinx.ext.napoleon",
     "sphinx.ext.autosummary",
     "sphinx.ext.mathjax",
+    "myst_nb",
 ]
 
 
@@ -41,6 +42,8 @@
 nitpicky = True
 nitpick_ignore = [("py:class", "optional")]
 
+# MyST-NB config
+nb_execution_timeout = 3 * 60
 
 exclude_patterns: list[str] = []
 
@@ -51,6 +54,7 @@
     python=("https://docs.python.org/3", None),
     scanpy=("https://scanpy.readthedocs.io/en/stable/", None),
     scipy=("https://docs.scipy.org/doc/scipy/", None),
+    squidpy=("https://squidpy.readthedocs.io/en/stable/", None),
 )
 
 # -- Options for HTML output -------------------------------------------------
10 changes: 10 additions & 0 deletions docs/source/index.rst
@@ -9,13 +9,23 @@ SpatialLeiden integrates with the `scverse <https://scverse.org/>`_ by leveraging
 `scanpy <https://scanpy.readthedocs.io/>`_ and `anndata <https://anndata.readthedocs.io/>`_
 but can also be used independently.
 
+Citations
+---------
+
+If you are using `spatialleiden` for your research please cite
+
+Müller-Bötticher, N., Sahay, S., Eils, R., and Ishaque, N.
+"SpatialLeiden - Spatially-aware Leiden clustering"
+bioRxiv (2024) https://doi.org/10.1101/2024.08.23.609349
+
 
 .. toctree::
    :maxdepth: 1
    :caption: Contents:
 
    self
    installation
+   usage
    api
 
 
19 changes: 10 additions & 9 deletions docs/source/installation.rst
@@ -5,7 +5,8 @@ Installation
 PyPI and ``pip``
 ----------------
 
-To install ``spatialleiden`` from `PyPI <https://pypi.org/>`_ using ``pip`` just run
+To install ``spatialleiden`` from `PyPI <https://pypi.org/project/spatialleiden/>`_
+using ``pip`` just run
 
 .. code-block:: bash
@@ -15,19 +16,19 @@ To install ``spatialleiden`` from `PyPI <https://pypi.org/>`_ using ``pip`` just run
 
 bioconda and ``conda``
 ----------------------
 
-``spatialleiden`` is not yet available for
-`Miniconda <https://docs.conda.io/en/latest/miniconda.html>`_ installations. But we are
-planning to add it to `bioconda <https://bioconda.github.io/>`_ soon.
+``spatialleiden`` is available for `Miniconda <https://docs.conda.io/en/latest/miniconda.html>`_
+installations from the `bioconda <https://bioconda.github.io/recipes/spatialleiden/README.html>`_
+channel.
 
 
-.. .. code-block:: bash
+.. code-block:: bash
 
-..    conda install -c conda-forge spatialleiden
+   conda install bioconda::spatialleiden
 
-.. .. note::
+.. note::
 
-..    Of course, it is also possible to use ``mamba`` instead of ``conda``
-..    to speed up the installation.
+   Of course, it is also possible to use ``mamba`` instead of ``conda``
+   to speed up the installation.
 
 
 From GitHub
113 changes: 113 additions & 0 deletions docs/source/usage.md
@@ -0,0 +1,113 @@
---
file_format: mystnb
kernelspec:
name: python
jupytext:
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.16.2
---

# Usage

+++

To demonstrate the usage of the `spatialleiden` package we are going to use a MERFISH data set from [Moffitt _et al._ 2018](https://doi.org/10.1126/science.aau5324). It can be downloaded from [figshare](https://figshare.com/articles/dataset/MERFISH_datasets/22565170) and loaded as an {py:class}`anndata.AnnData` object.

```{code-cell} ipython3
---
tags: [hide-cell]
---
from tempfile import NamedTemporaryFile
from urllib.request import urlretrieve
import anndata as ad

with NamedTemporaryFile(suffix=".h5ad") as h5ad_file:
    urlretrieve("https://figshare.com/ndownloader/files/40038538", h5ad_file.name)
    adata = ad.read_h5ad(h5ad_file)
```

First of all, we load the relevant packages that we will be working with and set a random seed that we will use throughout this example to make the results reproducible.

```{code-cell} ipython3
import scanpy as sc
import spatialleiden as sl
import squidpy as sq

seed = 42
```

The data set consists of 155 genes and ~5,500 cells, including annotations for cell types as well as spatial domains.

+++

## SpatialLeiden

We will do some standard preprocessing by log-transforming the data and then using PCA for dimensionality reduction. The PCA will be used to build a kNN graph in the latent gene expression space. This graph is the basis for the Leiden clustering.

```{code-cell} ipython3
sc.pp.log1p(adata)
sc.pp.pca(adata, random_state=seed)
sc.pp.neighbors(adata, random_state=seed)
```
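The kNN graph in expression space is built by `sc.pp.neighbors` above; purely to illustrate the underlying idea, here is a minimal numpy sketch of kNN adjacency (the helper `knn_graph` is hypothetical and not part of `spatialleiden` or `scanpy`):

```python
import numpy as np

def knn_graph(points, k):
    """Boolean adjacency matrix connecting each point to its k nearest
    neighbours (excluding itself), using Euclidean distance."""
    points = np.asarray(points, dtype=float)
    # pairwise Euclidean distances
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    # indices of the k smallest distances per row
    neighbours = np.argsort(dist, axis=1)[:, :k]
    adjacency = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(len(points)), k)
    adjacency[rows, neighbours.ravel()] = True
    return adjacency
```

In practice the scanpy/squidpy implementations are much faster (they use tree-based neighbour search instead of dense pairwise distances), but the resulting graph structure is the same in spirit.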

For SpatialLeiden we need an additional graph representing the connectivities in the topological (physical) space. Here we will use a kNN graph with 10 neighbors that we generate with {py:func}`squidpy.gr.spatial_neighbors`. Alternatives are Delaunay triangulation or, in the case of e.g. Visium data, regular grids.

We can use the calculated distances between neighboring points and transform them into connectivities using the {py:func}`spatialleiden.distance2connectivity` function.

```{code-cell} ipython3
sq.gr.spatial_neighbors(adata, coord_type="generic", n_neighs=10)
adata.obsp["spatial_connectivities"] = sl.distance2connectivity(
    adata.obsp["spatial_distances"]
)
```
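The exact transform used by {py:func}`spatialleiden.distance2connectivity` is not reproduced here; as an illustration of the general idea only (closer neighbours should get stronger edge weights), a hypothetical inverse-distance sketch:

```python
import numpy as np

def distance_to_connectivity(distances):
    """Illustrative transform mapping positive distances to connectivities
    in (0, 1] so that closer points get stronger edge weights.
    (Hypothetical stand-in, NOT the spatialleiden implementation.)"""
    distances = np.asarray(distances, dtype=float)
    connectivity = np.zeros_like(distances)
    nonzero = distances > 0  # zero entries mean "no edge" and stay zero
    connectivity[nonzero] = 1.0 / distances[nonzero]
    # rescale so the strongest connection has weight 1
    if connectivity.max() > 0:
        connectivity /= connectivity.max()
    return connectivity
```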

Now, we can already run {py:func}`spatialleiden.spatialleiden` (which we will also compare to normal Leiden clustering).

The `layer_ratio` determines the weighting between the gene expression layer and the topological layer and is influenced by the graph structures (i.e. how many connections exist, the edge weights, etc.). The lower the value, the closer SpatialLeiden is to normal Leiden clustering, while higher values lead to more spatially homogeneous clusters.

The resolution has the same effect as in Leiden clustering (higher resolution will lead to more clusters) and can be defined for each of the layers (but for now is left at its default value).

```{code-cell} ipython3
sc.tl.leiden(adata, directed=False, random_state=seed)
sl.spatialleiden(adata, layer_ratio=1.8, directed=(False, True), seed=seed)
sc.pl.embedding(adata, basis="spatial", color=["leiden", "spatialleiden"])
```

We can see how Leiden clustering identifies cell types while SpatialLeiden defines domains of the tissue.

+++

## Resolution search

If you already know how many domains you expect in your sample you can use the {py:func}`spatialleiden.search_resolution` function to identify the resolutions needed to obtain the correct number of clusters.

Conceptually, this function first runs Leiden clustering multiple times while changing the resolution to identify the value leading to the desired number of clusters. Next, this procedure is repeated by running SpatialLeiden, but now the resolution of the latent layer (gene expression) is kept fixed and the resolution of the spatial layer is varied.

```{code-cell} ipython3
n_clusters = adata.obs["domain"].nunique()
latent_resolution, spatial_resolution = sl.search_resolution(
    adata,
    n_clusters,
    latent_kwargs={"seed": seed},
    spatial_kwargs={"layer_ratio": 1.8, "seed": seed, "directed": (False, True)},
)
print(f"Latent resolution: {latent_resolution:.3f}")
print(f"Spatial resolution: {spatial_resolution:.3f}")
```
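To make the search strategy concrete, here is a toy sketch of a bisection over the resolution using a stand-in clustering function; it illustrates the concept only and is not the actual `search_resolution` implementation:

```python
def bisect_resolution(cluster_fn, n_target, low=0.01, high=10.0, max_iter=50):
    """Bisect the resolution until cluster_fn(resolution) yields n_target
    clusters (assumes the cluster count grows with the resolution)."""
    mid = (low + high) / 2
    for _ in range(max_iter):
        mid = (low + high) / 2
        n = cluster_fn(mid)
        if n == n_target:
            return mid
        if n < n_target:
            low = mid  # too few clusters -> increase resolution
        else:
            high = mid  # too many clusters -> decrease resolution
    return mid

# toy stand-in: the "number of clusters" grows stepwise with resolution
def toy_clustering(resolution):
    return int(resolution * 4) + 1
```

In the real function the same loop is run twice: once over the latent resolution with plain Leiden, then over the spatial resolution with SpatialLeiden while the latent resolution stays fixed.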

In our case we can compare the resulting clusters to the annotated ground-truth regions. If we are not satisfied with the results, we can go back and tweak other parameters, such as the underlying neighborhood graphs or the `layer_ratio`, to achieve the desired granularity of our results.

```{code-cell} ipython3
sc.pl.embedding(adata, basis="spatial", color=["spatialleiden", "Region"])
```
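To quantify the agreement between the inferred domains and the annotated regions, one common choice is the adjusted Rand index (in practice available as `sklearn.metrics.adjusted_rand_score`); a self-contained sketch of the metric:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two label assignments: 1.0 for identical
    partitions (up to label permutation), ~0 for random agreement."""
    n = len(labels_a)
    contingency = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate partitions
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

For the tutorial data this would be called as, e.g., `adjusted_rand_index(adata.obs["spatialleiden"], adata.obs["Region"])`.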
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -30,11 +30,12 @@ classifiers = [
     "Programming Language :: Python :: 3",
     "Programming Language :: Python :: 3 :: Only",
     "Topic :: Scientific/Engineering",
+    "Topic :: Scientific/Engineering :: Bio-Informatics",
     "Typing :: Typed",
 ]
 
 [project.optional-dependencies]
-docs = ["sphinx", "sphinx-copybutton", "sphinx-rtd-theme"]
+docs = ["sphinx", "sphinx-copybutton", "sphinx-rtd-theme", "squidpy", "myst-nb"]
 dev = ["spatialleiden[docs]", "pre-commit"]
 
 [project.urls]
