
Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment [ICCV 2025]

This repository provides the code for the paper "Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment" (ICCV 2025).

  • Authors: Zhenbang Du*, Yonggan Fu*, Lifu Wang*, Jiayi Qian, Xiao Luo, and Yingyan (Celine) Lin. *Equal Contribution

Abstract

Diffusion models have shown remarkable success across generative tasks, yet their high computational demands challenge deployment on resource-limited platforms. This paper investigates a critical question for compute-optimal diffusion model deployment: Under a post-training setting without fine-tuning, is it more effective to reduce the number of denoising steps or to use cheaper per-step inference? Intuitively, reducing the number of denoising steps increases the variability of the distributions across steps, making the model more sensitive to compression. In contrast, keeping more denoising steps makes the differences across steps smaller, preserving redundancy and making post-training compression more feasible. To systematically examine this, we propose PostDiff, a training-free framework for accelerating pre-trained diffusion models by reducing redundancy at both the input level and the module level in a post-training manner. At the input level, we propose a mixed-resolution denoising scheme based on the insight that reducing generation resolution in early denoising steps can enhance low-frequency components and improve final generation fidelity. At the module level, we employ a hybrid module caching strategy to reuse computations across denoising steps. Extensive experiments and ablation studies demonstrate that (1) PostDiff can significantly improve the fidelity-efficiency trade-off of state-of-the-art diffusion models, and (2) to boost efficiency while maintaining decent generation fidelity, reducing per-step inference cost is often more effective than reducing the number of denoising steps.
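To make the input-level idea concrete, below is a minimal, self-contained Python sketch of mixed-resolution denoising: the early steps run on downsampled latents (which carry the low-frequency content), and the remaining steps run at full resolution. The denoiser stand-in, the update rule, and all names are illustrative assumptions for this sketch, not the actual PostDiff pipeline, which operates on a pre-trained diffusion UNet with a proper noise scheduler.

import torch
import torch.nn.functional as F

def mixed_resolution_denoise(denoiser, latents, timesteps, low_res_steps, scale=0.5):
    # Run the first `low_res_steps` denoising steps at reduced resolution,
    # then upsample the latents and finish the schedule at full resolution.
    h, w = latents.shape[-2:]
    x = F.interpolate(latents, scale_factor=scale, mode="bilinear")
    for t in timesteps[:low_res_steps]:
        x = x - denoiser(x, t)  # placeholder update; a real scheduler step goes here
    x = F.interpolate(x, size=(h, w), mode="bilinear")
    for t in timesteps[low_res_steps:]:
        x = x - denoiser(x, t)
    return x

if __name__ == "__main__":
    toy = lambda x, t: 0.1 * x      # toy noise predictor standing in for the UNet
    z = torch.randn(1, 4, 64, 64)   # latent tensor
    ts = list(range(20, 0, -1))     # 20 denoising steps, high noise to low
    print(mixed_resolution_denoise(toy, z, ts, low_res_steps=10).shape)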


Dependency

conda create -n postdiff python=3.10.12
conda activate postdiff
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia

pip install -r requirements.txt

Inference

# You can change the hyperparameters or the backbone models here
# Recommended settings can be found in the paper

bash inference.sh

or

export CUDA_VISIBLE_DEVICES=0

python main.py \
  --saved_path "ldm" \
  --model "ldm" \
  --prompt "A serene lakeside at sunset, ultra-detailed digital art" \
  --num_img 1 \
  --batch_size 1 \
  --inference_step 20 \
  --strength 0.5 \
  --cfg 7.5 \
  --gate_step 15 \
  --sp_interval 1 \
  --fi_interval 1 \
  --warm_up 0 \
  --cache_interval 2 \
  --cache_branch_id 0 \
  --device cuda:0 \
  --deepcache \
  --use_progressive
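For the module-level side, the sketch below illustrates interval-based feature caching in the spirit of DeepCache: an expensive block is recomputed only every `cache_interval` denoising steps, and its cached output is reused in between. The CachedBlock wrapper and its names are hypothetical and deliberately simplified (the actual hybrid strategy caches deep UNet features while recomputing shallow branches); see the paper for the recommended settings.

import torch
import torch.nn as nn

class CachedBlock(nn.Module):
    # Wraps an expensive sub-module; recomputes its output only every
    # `cache_interval` calls and reuses the cached features otherwise.
    def __init__(self, block, cache_interval=2):
        super().__init__()
        self.block = block
        self.cache_interval = cache_interval
        self.step = 0
        self.cached = None

    def forward(self, x):
        if self.step % self.cache_interval == 0 or self.cached is None:
            self.cached = self.block(x)  # full per-step inference
        self.step += 1
        return self.cached               # cheap per-step inference (reuse)

if __name__ == "__main__":
    deep = CachedBlock(nn.Conv2d(4, 4, 3, padding=1), cache_interval=2)
    x = torch.randn(1, 4, 32, 32)
    for t in range(4):                   # 4 denoising steps
        y = deep(x)                      # recomputed at t=0 and t=2 only
    print(y.shape)

Recomputing on a fixed interval trades a small amount of fidelity for a roughly cache_interval-fold reduction in that block's cost, which is the "cheaper per-step inference" side of the trade-off studied in the paper.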

Citation

If you find this work useful for your research, please cite:

@inproceedings{zdu2025postdiff,
  title={Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment},
  author={Du, Zhenbang and Fu, Yonggan and Wang, Lifu and Qian, Jiayi and Luo, Xiao and Lin, Yingyan (Celine)},
  booktitle={International Conference on Computer Vision},
  year={2025}
}

Acknowledgement

The code is based on T-GATE and DeepCache; thanks to the authors for their awesome work!
