
DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution (ACM SIGGRAPH 2025)

Zhe Kong · Le Li · Yong Zhang* · Feng Gao · Shaoshu Yang · Tao Wang · Kaihao Zhang · Zhuoliang Kang ·

Xiaoming Wei · Guanying Chen · Wenhan Luo*

*Corresponding Authors


🏷️ Change Log

🔆 Method Overview

🔧 Dependencies and Installation

The code requires python==3.10.14, pytorch==2.1.1, and torchvision==0.16.1. Please follow the instructions here to install the PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended. The project has been tested with CUDA 12.1.

conda create -n dam-vsr python=3.10.14
conda activate dam-vsr
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.23 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
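After installation, it can be worth confirming that the pinned packages above actually landed in the environment before running inference. The snippet below is a minimal sanity-check sketch (not part of the repository); it only compares installed package versions against the pins from the install commands using the standard library.

```python
# Environment sanity check: compare installed versions against the pins above.
# This is an illustrative helper, not a script shipped with DAM-VSR.
import sys
from importlib import metadata

# Version pins taken from the pip commands in this section.
EXPECTED = {"torch": "2.1.1", "torchvision": "0.16.1", "xformers": "0.0.23"}

def check_env():
    """Return a dict mapping package -> (expected version, installed version or None)."""
    report = {}
    for pkg, want in EXPECTED.items():
        try:
            have = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            have = None
        report[pkg] = (want, have)
    return report

if __name__ == "__main__":
    print(f"python {sys.version_info.major}.{sys.version_info.minor}")
    for pkg, (want, have) in check_env().items():
        status = "OK" if have == want else "MISSING/MISMATCH"
        print(f"{pkg}: expected {want}, found {have} [{status}]")
```

Run it inside the `dam-vsr` conda environment; any MISSING/MISMATCH line points at a package to reinstall.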

⏬ Pretrained Model Preparation

1) Automatic Download

You can download all the required models directly with the following command:

python download.py

All models will be downloaded to the checkpoints directory. Alternatively, you can download each model manually.

2) Manual Download

Download the following models and place them in the checkpoints directory.

1. Video Super-Resolution Models: stabilityai/stable-video-diffusion-img2vid.
2. DAM-VSR: Fucius/DAM-VSR

The checkpoints directory structure should be arranged as:

checkpoints
    ├── stable-diffusion-xl-base-1.0
    ├── sd-turbo
    ├── DAM-VSR
    │       ├── SUPIR-v0Q.ckpt
    │       ├── controlnet
    │       ├── unet
    │       ├── lora
    │       ├── autoencoder_vq_f4.pth
    │       └── resshift_realsrx4_s4_v3.pth
    ├── clip-vit-large-patch14-336
    ├── llava-v1.5-13b
    ├── CLIP-ViT-bigG-14-laion2B-39B-b160k
    ├── stable-video-diffusion-img2vid
    ├── clip-vit-large-patch14
    └── noise_predictor_sd_turbo_v5.pth
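Before launching inference, the layout above can be verified programmatically. The helper below is a hypothetical sketch (the function name and `root` default are assumptions, not repo code); it simply checks that each top-level entry from the tree exists on disk.

```python
# Verify the checkpoints layout shown above.
# Illustrative helper only; adjust `root` if your checkpoints live elsewhere.
from pathlib import Path

# Top-level entries copied from the directory tree in this README.
EXPECTED_ENTRIES = [
    "stable-diffusion-xl-base-1.0",
    "sd-turbo",
    "DAM-VSR",
    "clip-vit-large-patch14-336",
    "llava-v1.5-13b",
    "CLIP-ViT-bigG-14-laion2B-39B-b160k",
    "stable-video-diffusion-img2vid",
    "clip-vit-large-patch14",
    "noise_predictor_sd_turbo_v5.pth",
]

def missing_checkpoints(root="checkpoints"):
    """Return the expected entries that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (base / name).exists()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected checkpoints found.")
```

An empty result means every entry from the tree is present; missing names can then be fetched via `python download.py` or the manual links above.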

🚀 Inference

For image super-resolution, you can choose SUPIR, InvSR, or ResShift.

For real-world or AIGC videos, we recommend SUPIR or InvSR for image super-resolution: SUPIR achieves the best visual quality, while InvSR achieves the best evaluation metrics.

python infer.py \
    --validation_data_dir="example/example1.mp4" \
    --max_cfg 3.0 \
    --backwrad_scale 0.3 \
    --sr_type="supir" \ # or "invsr"
    --use_usm

For synthetic degradations, it is recommended to utilize ResShift for image super-resolution.

python infer.py \
    --validation_data_dir="example/example1.mp4" \
    --max_cfg 1.0 \
    --backwrad_scale 1.0 \
    --lora_path='checkpoints/DAM-VSR/lora/vae-decoder.safetensors' \
    --sr_type="resshift"

We also provide a lighter variant that skips bidirectional sampling for faster generation.

python infer_accelerated.py \
    --validation_data_dir="example/example1.mp4" \
    --sr_type="supir" \ # invsr/resshift
    --use_usm
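To process several clips with the settings shown above, the commands can be assembled programmatically. The helper below is a hypothetical sketch: `build_cmd` and `run_folder` are not part of the repository, and the flag names (including `--backwrad_scale`, spelled as in the commands above) are copied verbatim from the examples.

```python
# Batch inference over a folder of videos by invoking infer.py per clip.
# Illustrative helper only; flag names mirror the example commands above.
import subprocess
from pathlib import Path

def build_cmd(video, sr_type="supir", max_cfg=3.0, backward_scale=0.3, use_usm=True):
    """Build the infer.py argument list for one video."""
    cmd = [
        "python", "infer.py",
        f"--validation_data_dir={video}",
        "--max_cfg", str(max_cfg),
        "--backwrad_scale", str(backward_scale),  # flag spelled as in the README
        f"--sr_type={sr_type}",
    ]
    if use_usm:
        cmd.append("--use_usm")
    return cmd

def run_folder(folder="example", pattern="*.mp4", **kwargs):
    """Run inference sequentially on every matching video in `folder`."""
    for video in sorted(Path(folder).glob(pattern)):
        subprocess.run(build_cmd(video, **kwargs), check=True)
```

For synthetic degradations, call `run_folder` with `sr_type="resshift"`, `max_cfg=1.0`, and `backward_scale=1.0`, matching the ResShift example above.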

❤️ Acknowledgments

This project builds on SUPIR, InvSR, ResShift, svd-temporal-controlnet, and svd_keyframe_interpolation. Thanks for their awesome work.

🎓Citations

If our project helps your research or work, please consider citing our paper:

@inproceedings{kong2025dam,
  title={DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution},
  author={Kong, Zhe and Li, Le and Zhang, Yong and Gao, Feng and Yang, Shaoshu and Wang, Tao and Zhang, Kaihao and Kang, Zhuoliang and Wei, Xiaoming and Chen, Guanying and Luo, Wenhan},
  booktitle={ACM SIGGRAPH 2025},
  year={2025},
}
