docs/getting_started/installation/gpu/cuda.inc.md (4 changes: 3 additions & 1 deletion)

@@ -20,9 +20,11 @@ Therefore, it is recommended to install vLLM and vLLM-Omni with a **fresh new**

vLLM-Omni is built on top of vLLM, so install vLLM first with the command below.
```bash
-uv pip install vllm==0.11.0 --torch-backend=auto
+uv pip install vllm==0.12.0 --torch-backend=auto
```

> **@congw729** (Contributor, Dec 23, 2025): We don't have vllm-omni v0.12.0rc1 on PyPI right now, which leads to the problem in Issue #402. We just fixed it yesterday, and we should only consider upgrading this document after we upload the latest wheel.

**Note:** If you encounter attention (attn) errors after upgrading vLLM to 0.12.0, uninstall or upgrade xformers manually, since vLLM 0.12.0 has deprecated its xformers dependency.
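A minimal sketch of that workaround, assuming the same uv-managed environment as above (the note does not pin a specific xformers version):

```bash
# Option 1: remove xformers entirely; vLLM 0.12.0 no longer requires it.
uv pip uninstall xformers

# Option 2: upgrade it instead, if another package in the environment
# still depends on it.
uv pip install --upgrade xformers
```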

#### Installation of vLLM-Omni

```bash
...
```
docs/getting_started/quickstart.md (5 changes: 4 additions & 1 deletion)

@@ -17,9 +17,12 @@ For installation on GPU using a pre-built wheel:
```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
-uv pip install vllm==0.11.0 --torch-backend=auto
+uv pip install vllm==0.12.0 --torch-backend=auto
uv pip install vllm-omni
```

> **Contributor:** ditto

**Note:** If you encounter attention (attn) errors after upgrading vLLM to 0.12.0, uninstall or upgrade xformers manually, since vLLM 0.12.0 has deprecated its xformers dependency.
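A quick sanity check after the two installs, as a sketch (not part of the official docs; assumes the virtual environment created above is still active):

```bash
# Print the vLLM version importable from this environment;
# expect 0.12.0 after the upgrade.
python -c "import vllm; print(vllm.__version__)"

# Show which vllm-omni release was resolved (see the reviewer's note
# above about the missing wheel on PyPI).
uv pip show vllm-omni
```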

For additional details, including alternative installation methods and installation on NPUs and other platforms, please see the installation guide in [installation](installation/README.md).

## Offline Inference