_posts/2025-12-18-vllm-omni-diffusion-cache-acceleration.md (25 additions, 23 deletions)
@@ -1,10 +1,10 @@
-___
+---
 layout: post
 title: "vLLM-Omni Diffusion Cache Acceleration"
 author: "vLLM-Omni Team"
 ---

-# Turbocharge Your Diffusion Inference: vLLM-Omni Integrates Cache-DiT and TeaCache
+# Turbocharge Your Diffusion Inference

 We are thrilled to announce a major performance update for **vLLM-Omni**.
@@ -31,7 +31,7 @@ vLLM-Omni now supports two distinct caching backends to suit your specific needs
 ### 2. TeaCache: Simple & Adaptive
-TeaCache offers a hook-based, adaptive caching mechanism. It monitors the difference between inputs and dynamically decides when to reuse the transformer computations from the previous timestep.
+TeaCache is implemented natively within vLLM-Omni, providing a hook-based, adaptive caching mechanism. It monitors the difference between inputs and dynamically decides when to reuse the transformer computations from the previous timestep.

 ## Performance Benchmarks
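To make the mechanism described in the TeaCache hunk above concrete, here is a minimal, hypothetical sketch of the kind of per-step decision such an adaptive cache makes. The class name, threshold, and interface below are illustrative assumptions for this post, not vLLM-Omni's actual API: the cache compares the current input against the input from the last full forward pass and, when the relative change is small, returns the cached transformer output instead of recomputing it.

```python
import torch


class AdaptiveStepCache:
    """Illustrative TeaCache-style cache (simplified, not the vLLM-Omni API).

    Reuses the previous timestep's transformer output when the current input
    has drifted only slightly from the input seen at the last full compute.
    """

    def __init__(self, rel_threshold: float = 0.05):
        self.rel_threshold = rel_threshold  # hypothetical tuning knob
        self.prev_input = None              # input from the last full forward pass
        self.prev_output = None             # transformer output cached at that step

    def step(self, hidden_states: torch.Tensor, transformer_fn) -> torch.Tensor:
        if self.prev_input is not None:
            # Relative L1 change between the current and last-computed inputs.
            diff = (hidden_states - self.prev_input).abs().mean()
            rel_change = (diff / (self.prev_input.abs().mean() + 1e-8)).item()
            if rel_change < self.rel_threshold:
                # Inputs barely moved: skip the transformer, reuse the cache.
                return self.prev_output

        # Inputs changed enough (or this is the first step): run the transformer.
        output = transformer_fn(hidden_states)
        self.prev_input = hidden_states.detach()
        self.prev_output = output.detach()
        return output
```

In such a scheme, raising the threshold skips more denoising steps (faster, but with more approximation), while lowering it recomputes more often; the real implementation exposes its own tuning parameters.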
@@ -43,20 +43,21 @@ We benchmarked these methods on NVIDIA H200 GPUs using **Qwen-Image** (1024x1024