Implementation of Beta VAE for benchmarking #273

edyoshikun · 2025-07-21T22:44:24Z

This PR adds the following:

BetaVAE for DynaCLR benchmark
DINOV3, Openphenom and SAM2 CLI for Benchmarking
Functions for measuring the smoothness and dynamic range of our embeddings
Archives old code. We will get rid of this but let's do it in a separate PR
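
To make the smoothness and dynamic-range item concrete, here is a minimal pure-Python sketch. It assumes smoothness is judged by comparing the mean cosine distance between adjacent frames of a track with the mean distance between random frame pairs. The helper names and the definition of `dynamic_range` as the gap between the two means are assumptions for illustration, not the functions added in this PR.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity, clipping the similarity to [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    similarity = max(-1.0, min(1.0, dot / norm))
    return 1.0 - similarity


def smoothness_summary(track, n_random_pairs=1000, seed=0):
    """Compare adjacent-frame and random-pair distances (hypothetical helper)."""
    adjacent = [
        cosine_distance(track[t], track[t + 1]) for t in range(len(track) - 1)
    ]
    rng = random.Random(seed)
    random_pairs = []
    for _ in range(n_random_pairs):
        i, j = rng.sample(range(len(track)), 2)  # two distinct frame indices
        random_pairs.append(cosine_distance(track[i], track[j]))
    mean_adjacent = sum(adjacent) / len(adjacent)
    mean_random = sum(random_pairs) / len(random_pairs)
    return {
        "mean_adjacent": mean_adjacent,
        "mean_random": mean_random,
        # Assumed definition: gap between random-pair and adjacent-frame distance.
        "dynamic_range": mean_random - mean_adjacent,
    }


# Toy track: an embedding that rotates slowly in 2D over 50 frames.
track = [(math.cos(0.05 * t), math.sin(0.05 * t)) for t in range(50)]
summary = smoothness_summary(track)
```

Under this reading, a temporally smooth embedding has a small `mean_adjacent` relative to `mean_random`.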

Copilot

Pull Request Overview

This PR adds support for β-VAE (Beta Variational Autoencoder) models and enhances the representation learning capabilities of VisCy. It includes VAE architectures, logging utilities, evaluation metrics, and data handling improvements.

Key Changes:

Added β-VAE model architectures (2.5D and MONAI-based) with encoder/decoder implementations
Implemented comprehensive VAE logging utilities for training monitoring and latent space visualization
Enhanced evaluation metrics including smoothness analysis, displacement computation, and GPU-accelerated distance calculations
Added cell division triplet dataset for .npy file handling
Improved data module with validation augmentation control
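
For readers unfamiliar with the β-VAE objective, the sketch below shows one common form of the loss: a sum-reduced squared reconstruction error plus β times the KL divergence of a diagonal Gaussian posterior from a standard normal prior, both averaged over the batch, with the log-variance clamped before exponentiation. It is plain Python on nested lists rather than the tensors the PR's models use, and the function name, clamp bounds, and normalization choice are assumptions, not the code in `viscy/representation/engine.py`.

```python
import math


def beta_vae_loss(x, x_hat, mu, logvar, beta=1.0, logvar_clip=(-20.0, 20.0)):
    """Return (total, reconstruction, kl), each averaged over the batch.

    Every argument is a list with one list of floats per sample.
    """
    batch_size = len(x)
    # Sum-reduced squared error over all elements of all samples.
    recon = sum(
        (xi - xhi) ** 2
        for sample, sample_hat in zip(x, x_hat)
        for xi, xhi in zip(sample, sample_hat)
    )
    low, high = logvar_clip  # assumed bounds to keep exp() from overflowing
    kl = 0.0
    for mu_row, logvar_row in zip(mu, logvar):
        for m, lv in zip(mu_row, logvar_row):
            lv = max(low, min(high, lv))
            # KL(N(m, exp(lv)) || N(0, 1)) for one latent dimension.
            kl += -0.5 * (1.0 + lv - m * m - math.exp(lv))
    recon /= batch_size
    kl /= batch_size
    return recon + beta * kl, recon, kl


# Toy batch of two samples with a one-dimensional latent.
x = [[0.0, 1.0], [1.0, 0.0]]
x_hat = [[0.0, 0.5], [1.0, 0.0]]
mu = [[0.0], [1.0]]
logvar = [[0.0], [0.0]]
total, recon, kl = beta_vae_loss(x, x_hat, mu, logvar, beta=4.0)
```

Raising `beta` weights the KL term more heavily, trading reconstruction quality for a latent distribution closer to the prior.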

Reviewed Changes

Copilot reviewed 31 out of 42 changed files in this pull request and generated 12 comments.

File	Description
`viscy/transforms/_redef.py`	Added NormalizeIntensityd import and class wrapper; moved RandFlipd inside another class (indentation issue)
`viscy/transforms/__init__.py`	Exported NormalizeIntensityd transform
`viscy/representation/vae_logging.py`	New comprehensive VAE logging utilities for metrics, visualizations, and diagnostics
`viscy/representation/vae.py`	New VAE model implementations (encoder, decoder, 2.5D and MONAI variants)
`viscy/representation/engine.py`	Added BetaVaeModule Lightning wrapper and removed log_embeddings parameter
`viscy/representation/multi_modal.py`	Added embedding_log_frequency parameter
`viscy/representation/evaluation/smoothness.py`	New smoothness metrics computation for embeddings
`viscy/representation/evaluation/lca.py`	Enhanced logistic regression with train_ratio parameter and stratified sampling
`viscy/representation/evaluation/distance.py`	Refactored displacement computation using pairwise distance matrix
`viscy/representation/evaluation/dimensionality_reduction.py`	Added scaling option and random_state to PHATE computation
`viscy/representation/evaluation/clustering.py`	Added GPU-accelerated pairwise distance computation with PyTorch
`viscy/data/triplet.py`	Added augment_validation parameter for controlled augmentation
`viscy/data/cell_division_triplet.py`	New dataset and data module for cell division .npy files
Application scripts	Various evaluation, visualization, and benchmarking scripts
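
The `distance.py` row mentions computing displacement from a pairwise distance matrix. One way to read that, sketched below in plain Python, is to build the full frame-by-frame distance matrix for a track once and then average each off-diagonal to get the mean displacement per time lag. The function names are hypothetical, and Euclidean distance is used here only to keep the example short (the commit log says the PR defaults to cosine distance).

```python
import math


def pairwise_distances(embeddings):
    """Return the full Euclidean distance matrix as nested lists."""
    n = len(embeddings)
    return [
        [math.dist(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]


def displacement_per_lag(distance_matrix):
    """Mean distance for each time lag, read off the matrix diagonals."""
    n = len(distance_matrix)
    result = {}
    for lag in range(1, n):
        # Entries (t, t + lag) lie on the lag-th upper diagonal.
        values = [distance_matrix[t][t + lag] for t in range(n - lag)]
        result[lag] = sum(values) / len(values)
    return result


# Toy track moving one unit per frame along a line.
embeddings = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
displacement = displacement_per_lag(pairwise_distances(embeddings))
```

Computing the matrix once lets every lag reuse the same distances instead of recomputing them per lag.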

mattersoflight

Approving and will do integration test by trying to reduce a couple of panels.

* simulate different embeddings
* update the msd calculation to re-use cdist functions in the repo
* adding a test for the msd
* removing unused msd functions
* renaming msd to compute_track_displacement
* default to cosine distance
* adding the gradient attribution video.
* extend to training ratios
* demo beta_vae 2.5D
* improving the logging for readability and drop pythae baseclasses
* condense the logging to have less tabs.
* fix disentangle metrics
* fixing beta warmup bug
* renaming to loss
* updating architecture to flatten vs spatial VAE with convs
* changing to use mse with mean reduction and normalizing the kl loss by batch size.
* optuna proof of concept
* add normalized sampled into the transforms so we can use it with MONAI's vae
* update loss debugging code
* adding sync for disentanglement metrics
* adding the dataloader for rpe1 dataset and plotting utils
* cleanup the vae and add the monai to lightning. adding configs
* add saving hyperparameters
* fix hyperparameter logging
* add embedding logging to the CLIP version
* test and plot of monaivae
* handle monai_vae 2d
* redefining rotation augmentations
* adding optional scaling to phate
* adding alias and output 2d
* normalizing by also the latent dim and swapping to FP32 for forward pass to avoid overflow with log and exp
* update test for magnitudes
* expose the normalization for vae
* add sam 2 test
* refactor smoothness metrics
* revert to normalize kl wrt to batch size and removing the beta min value
* commit dtwembeddings w sam
* added a clamp to logvar, switch to mse loss sum reduction like the original formulation.
* remove unnecessary vae logging losses.
* add a way to handle when using 'mean' reduction for proper scaling
* adding optional config for middle slice index for computing sam2 embeddings and dinov3
* converting latent stats active_dimensions parameter to float to remove warning
* ruff
* removing the optuna config
* numpy docstring
* fix compute smoothness script
* archiving old scripts
* re org the pc features scripts
* embeddings for phase
* add smoothness (mean rand vs adj frame) to the csv
* archiving old beta vae code
* ruff
* fix format
* fix typo
* remove the archived unnecessary files
* remove the test run archived file
* adding normalizeintensity
* fixing the vae_logging typing and removing PC plotting from here
* fixing the compute_embedding_smoothness docstring
* simplify the distance metrics and removing deprecated functions and scripts
* remove deprecated functions from clustering.py
* add timelapse to grad_attr.py script
* refactoring the betavaemodule. removing the hyperparameter logging, adding the nn.Module as input for typing purposes and removing the fp32 custom fwd
* remove the optuna dependency
* deleting old msd test
* ruff format
* fix to explicitly stratify on fov level
* adding reference to dataset for rpe1
* fix pyproject.toml dev
* format and lint
* restore no-augmentation flag effect
* format tests
* rename the sam2 file
* removing unused arguments for logging embeddings.
* removing duplication in the lca
* remove disentanglement metrics
* vectorized the anchor filtering for celldivisiontriplet dataset
* map the channels to the rpe dataset convention
* fix logistic regression standardization
* update rpe classifier to include mitosis
* ruff
* remove unused logging
* datamodule agnostic
* cleaning up duplicated code in the benchmarking
* cleanup vae
* keeping it consistent and using residual units
* fix typings betavaemonai
* update smoothness to handle adata
* update clustering method and add test
* pre-commit
* Update viscy/data/cell_division_triplet.py Co-authored-by: Copilot <[email protected]>
* Update applications/benchmarking/DynaCLR/SAM2/sam2_visualizations.py Co-authored-by: Copilot <[email protected]>
* Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py Co-authored-by: Copilot <[email protected]>
* Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py Co-authored-by: Copilot <[email protected]>
* Update applications/contrastive_phenotyping/evaluation/smoothness/compute_smoothness.py Co-authored-by: Copilot <[email protected]>
* Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py Co-authored-by: Copilot <[email protected]>
* Update applications/contrastive_phenotyping/evaluation/smoothness/compute_smoothness.py Co-authored-by: Copilot <[email protected]>
* Update applications/contrastive_phenotyping/evaluation/archive/ALFI_MSD_v2.py Co-authored-by: Copilot <[email protected]>
* ValueError on the find peaks function
* add literal to the betavae25d normalization
* clipping similarity that was breaking the tests

---------

Co-authored-by: Ziwen Liu <[email protected]>
Co-authored-by: Copilot <[email protected]>

edyoshikun added 14 commits July 9, 2025 12:14

simulate different embeddings

4cc930c

update the msd calculation to re-use cdist functions in the repo

905247c

adding a test for the msd

c39d1d6

removing unused msd functions

cf97df4

renaming msd to compute_track_displacement

4c1a492

default to cosine distance

8638168

adding the gradient attribution video.

c40b64c

extend to training ratios

b51c1b8

demo beta_vae 2.5D

4eabce3

improving the logging for readability and drop pythae baseclasses

c976f98

condense the logging to have less tabs.

29a822e

fix disentangle metrics

86b3467

fixing beta warmup bug

2bb6d19

renaming to loss

8e8eba8

edyoshikun mentioned this pull request Jul 23, 2025

VanillaVAE #223

Closed

edyoshikun added 12 commits July 23, 2025 14:55

updating architecture to flatten vs spatial VAE with convs

116183d

changing to use mse with mean reduction and normalizing the kl loss by batch size.

e47de7c

optuna proof of concept

cfdc51a

add normalized sampled into the transforms so we can use it with MONAI's vae

bcc1406

update loss debugging code

53a3e2d

adding sync for disentanglement metrics

a3510d0

adding the dataloader for rpe1 dataset and plotting utils

252a4d0

cleanup the vae and add the monai to lightning. adding configs

385322b

add saving hyperparameters

e0bf813

fix hyperparameter logging

a1ad2dc

add embedding logging to the CLIP version

17c1e89

test and plot of monaivae

4269787

ziw-liu reviewed Aug 6, 2025

viscy/representation/engine.py (outdated, resolved)

edyoshikun added 2 commits August 13, 2025 16:26

handle monai_vae 2d

ed7f5a7

redefining rotation augmentations

5e27e7c

fix typings betavaemonai

35c9f75

edyoshikun requested a review from ziw-liu October 23, 2025 00:16

edyoshikun added 5 commits October 27, 2025 16:53

Merge branch 'main' into beta_vae

72441a2

update smoothness to handle adata

eed55f3

update clustering method and add test

4674e91

Merge branch 'main' into beta_vae

6a94385

pre-commit

b4a6398

edyoshikun requested review from alxndrkalinin and srivarra October 28, 2025 21:51

alxndrkalinin requested a review from Copilot October 28, 2025 23:11

Copilot AI reviewed Oct 28, 2025

edyoshikun and others added 10 commits October 29, 2025 17:19

Update viscy/data/cell_division_triplet.py

69d928f

Co-authored-by: Copilot <[email protected]>

Update applications/benchmarking/DynaCLR/SAM2/sam2_visualizations.py

d42d6da

Co-authored-by: Copilot <[email protected]>

Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py

84634e8

Co-authored-by: Copilot <[email protected]>

Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py

817be47

Co-authored-by: Copilot <[email protected]>

Update applications/contrastive_phenotyping/evaluation/smoothness/compute_smoothness.py

2cc16e9

Co-authored-by: Copilot <[email protected]>

Update applications/pseudotime_analysis/evaluation/compare_dtw_embeddings_sam2.py

3acad6f

Co-authored-by: Copilot <[email protected]>

Update applications/contrastive_phenotyping/evaluation/smoothness/compute_smoothness.py

2a65bf0

Co-authored-by: Copilot <[email protected]>

Update applications/contrastive_phenotyping/evaluation/archive/ALFI_MSD_v2.py

9421b03

Co-authored-by: Copilot <[email protected]>

ValueError on the find peaks function

a3f015e

add literal to the betavae25d normalization

cf7450a

mattersoflight requested review from mattersoflight and removed request for ziw-liu November 3, 2025 17:48

mattersoflight approved these changes Nov 3, 2025

edyoshikun mentioned this pull request Nov 3, 2025

Add integration tests for our DynaCLR pipline #333

Open

edyoshikun added 2 commits November 3, 2025 09:57

Merge branch 'main' into beta_vae

6ddf2cf

clipping similarity that was breaking the tests

76f8ddb

edyoshikun merged commit e00448b into main Nov 3, 2025
4 checks passed

edyoshikun deleted the beta_vae branch November 3, 2025 18:32
