diff --git a/docs/getting_started/installation/gpu/cuda.inc.md b/docs/getting_started/installation/gpu/cuda.inc.md
index c88302939..84b0db447 100644
--- a/docs/getting_started/installation/gpu/cuda.inc.md
+++ b/docs/getting_started/installation/gpu/cuda.inc.md
@@ -20,9 +20,11 @@ Therefore, it is recommended to install vLLM and vLLM-Omni with a **fresh new**
 vLLM-Omni is built based on vLLM. Please install it with command below.
 
 ```bash
-uv pip install vllm==0.11.0 --torch-backend=auto
+uv pip install vllm==0.12.0 --torch-backend=auto
 ```
 
+**Note:** If you encounter attention (attn) backend errors after upgrading vLLM to 0.12.0, uninstall or upgrade xformers manually, since vLLM 0.12.0 has deprecated its xformers dependency.
+
 #### Installation of vLLM-Omni
 
 ```bash
diff --git a/docs/getting_started/quickstart.md b/docs/getting_started/quickstart.md
index 7fe8954c9..be9599ec9 100644
--- a/docs/getting_started/quickstart.md
+++ b/docs/getting_started/quickstart.md
@@ -17,9 +17,12 @@ For installation on GPU using pre-built-wheel:
 ```bash
 uv venv --python 3.12 --seed
 source .venv/bin/activate
-uv pip install vllm==0.11.0 --torch-backend=auto
+uv pip install vllm==0.12.0 --torch-backend=auto
 uv pip install vllm-omni
 ```
+
+**Note:** If you encounter attention (attn) backend errors after upgrading vLLM to 0.12.0, uninstall or upgrade xformers manually, since vLLM 0.12.0 has deprecated its xformers dependency.
+
 For additional details—including alternative installation methods, installation on NPU and other platforms — please see the installation guide in [installation](installation/README.md)
 
 ## Offline Inference
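
For reference, a minimal sketch of the manual xformers fix mentioned in the added notes, assuming the same `uv`-managed virtual environment created in the quickstart; which option resolves the error may depend on your attention backend and CUDA/torch setup:

```bash
# Option 1: remove xformers entirely, since vLLM 0.12.0 is documented above as no longer depending on it
uv pip uninstall xformers

# Option 2 (alternative): upgrade xformers to a build compatible with the installed torch
uv pip install --upgrade xformers
```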