@ziw-liu ziw-liu commented Nov 21, 2024

An ideal cell state representation has the following properties:

  • When the cell state does not change, embeddings of the same cell at neighboring time points should be more similar to each other than to embeddings of other cells (continuity)
  • When the cell state does change, the embedding should shift substantially, so that different cell states can be distinguished (dynamic range)

We observed these properties for time-regularized models visually in UMAPs. However, Euclidean distances in UMAP space carry no global meaning, so the observations cannot be directly quantified by comparing UMAP values. This PR provides methods that describe embedding similarity in a way close to how UMAP is computed: for each sample, every other sample is ranked as its $k$-th nearest neighbor, and the displacement in this neighborhood (the rank $k$ of the embedding at $t_{i+1}$ among the neighbors of the embedding at $t_i$) can then be used to describe the fluctuation of embeddings in a way that preserves latent space topology.
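The ranking idea described above can be illustrated with a minimal NumPy sketch. This is not the code added in this PR: the function names `rank_neighbors` and `next_step_ranks` are hypothetical, and the sketch assumes cosine distance (which orders neighbors the same way as angular distance) and a dense pairwise distance matrix, which only suits small sample counts.

```python
import numpy as np


def rank_neighbors(embeddings: np.ndarray) -> np.ndarray:
    """Return ranks[i, j] = k if sample j is the k-th nearest neighbor of sample i.

    Hypothetical helper, assuming cosine distance. Rank 0 is the sample itself.
    """
    # Normalize rows so that the dot product equals cosine similarity.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    distances = 1.0 - unit @ unit.T
    # Make each sample its own nearest neighbor regardless of round-off.
    np.fill_diagonal(distances, -np.inf)
    # The first argsort lists neighbor indices by increasing distance;
    # the second argsort turns that ordering into a rank per sample.
    order = np.argsort(distances, axis=1, kind="stable")
    return np.argsort(order, axis=1, kind="stable")


def next_step_ranks(ranks: np.ndarray, track: np.ndarray) -> np.ndarray:
    """Rank of the embedding at t_{i+1} among the neighbors of the embedding at t_i.

    `track` holds the sample indices of one cell, ordered in time (an assumption
    about how tracks are stored).
    """
    return ranks[track[:-1], track[1:]]
```

Under these assumptions, a track whose state is stable should yield small ranks between consecutive time points (continuity), while a state change should show up as a jump to a large rank (dynamic range). Because only the ordering of neighbors is used, the result does not depend on the absolute scale of the distances.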

Example from ALFI dataset:

image

@ziw-liu ziw-liu added enhancement New feature or request representation Representation learning (SSL) labels Nov 21, 2024
@ziw-liu ziw-liu requested a review from edyoshikun November 21, 2024 01:03
ziw-liu commented Nov 21, 2024

Using standardized features (949df3c) changes the angular distance, but doesn't really change the shape of the rankings curve:

image
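For readers who want to reproduce this kind of comparison, here is a small sketch of per-feature standardization followed by an angular distance. It is an illustration only: how commit 949df3c scales the features is not shown in this thread, and `standardize` and `angular_distance` are hypothetical names.

```python
import numpy as np


def standardize(features: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each feature (column) to zero mean and unit variance."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    # Guard against constant features, whose standard deviation is zero.
    return (features - mean) / np.maximum(std, eps)


def angular_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Angle in radians between two embedding vectors."""
    cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to the valid domain of arccos to absorb floating-point round-off.
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))
```

Standardization shifts and rescales every feature, so the angles between samples generally change, whereas neighbor rankings depend only on the relative ordering of distances.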

@edyoshikun edyoshikun left a comment

Works like a charm! Also super helpful and thanks for this elegant implementation.

@ziw-liu ziw-liu marked this pull request as ready for review November 27, 2024 22:47
@ziw-liu ziw-liu merged commit 987874f into main Nov 27, 2024
4 checks passed
@ziw-liu ziw-liu deleted the rank-neighbors branch November 27, 2024 22:48
edyoshikun pushed a commit that referenced this pull request Dec 18, 2024
* methods to rank nearest neighbors in embeddings

* example script to plot state change of a single track

* test using scaled features
edyoshikun pushed a commit that referenced this pull request Dec 19, 2024
* methods to rank nearest neighbors in embeddings

* example script to plot state change of a single track

* test using scaled features