Commit 6eebb68

PatrykWo, mgawarkiewicz-intel, adobrzyni, boiko-habana, and xuechendi committed
Bulk docs cherrypick (#523)
It's a bulk cherry-pick of the documentation from main to the v0.10.2 release.

---------

Signed-off-by: PatrykWo <[email protected]>
Signed-off-by: Iryna Boiko <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: Jacek Czaja <[email protected]>
Signed-off-by: Krzysztof Smusz <[email protected]>
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
Signed-off-by: Paweł Olejniczak <[email protected]>
Signed-off-by: mhelf-intel <[email protected]>
Co-authored-by: Michal Gawarkiewicz <[email protected]>
Co-authored-by: Agata Dobrzyniewicz <[email protected]>
Co-authored-by: Iryna Boiko <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>
Co-authored-by: Michal Adamczyk <[email protected]>
Co-authored-by: Jacek Czaja <[email protected]>
Co-authored-by: Krzysztof Smusz <[email protected]>
Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: Michał Kuligowski <[email protected]>
Co-authored-by: Paweł Olejniczak <[email protected]>
Co-authored-by: Monika Helfer <[email protected]>
1 parent a8371fa commit 6eebb68

26 files changed: +1117 −849 lines

.cd/README.md

Lines changed: 2 additions & 3 deletions
````diff
@@ -148,10 +148,9 @@ cd vllm-gaudi/.cd/
 
 ```bash
 HF_TOKEN=<your huggingface token> \
-DOCKER_IMAGE="vault.habana.ai/gaudi-docker/1.22.0/ubuntu22.04/habanalabs/vllm-installer-2.7.1:latest" \
-VLLM_SERVER_CONFIG_FILE=server/server_text.yaml \
+VLLM_SERVER_CONFIG_FILE=server/server_scenarios_text.yaml \
 VLLM_SERVER_CONFIG_NAME=llama31_8b_instruct \
-VLLM_BENCHMARK_CONFIG_FILE=benchmark/benchmark_text.yaml \
+VLLM_BENCHMARK_CONFIG_FILE=benchmark/benchmark_scenarios_text.yaml \
 VLLM_BENCHMARK_CONFIG_NAME=llama31_8b_instruct \
 docker compose --profile benchmark up
 ```
````
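The multi-line command above uses the shell's inline environment-variable prefix: assignments placed before a command apply only to that single invocation and do not leak into the surrounding shell. A minimal sketch of the pattern (the variable names here are illustrative, not part of the vllm-gaudi configuration):

```shell
# VAR=value prefixes before a command set those variables only for that
# one invocation; the parent shell's environment is left untouched.
GREETING=hello TARGET=world \
sh -c 'echo "$GREETING $TARGET"'
# -> hello world
```

This is why the snippet can be re-run with different `VLLM_*_CONFIG_*` values without exporting anything or editing the compose file.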

README.md

Lines changed: 3 additions & 5 deletions
````diff
@@ -24,10 +24,7 @@ vLLM Gaudi plugin (vllm-gaudi) integrates Intel Gaudi accelerators with vLLM to
 
 This plugin follows the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162) and [[RFC]: Enhancing vLLM Plugin Architecture](https://github.com/vllm-project/vllm/issues/19161) principles, providing a modular interface for Intel Gaudi hardware.
 
-Learn more: 🚀 [vLLM Plugin System Overview](https://docs.vllm.ai/en/latest/design/plugin_system.html)
-
-## Running vLLM on Gaudi with Docker Compose
-We are delivering ready-to-run container images that include both vLLM and Gaudi software. Please follow the [instruction](https://github.com/vllm-project/vllm-gaudi/tree/releases/v0.11.0/.cd) to quickly launch vLLM on Gaudi using a prebuilt Docker image and Docker Compose, with options for custom parameters and benchmarking.
+Learn more: 🚀 [vLLM Plugin System Overview](https://vllm-gaudi.readthedocs.io/en/latest/design/plugin_system.html)
 
 ## Getting Started
 0. Preparation of the Setup
@@ -45,6 +42,7 @@ We are delivering ready-to-run container images that include both vLLM and Gaudi
 git clone https://github.com/vllm-project/vllm-gaudi
 cd vllm-gaudi
 export VLLM_COMMIT_HASH=$(git show "origin/vllm/last-good-commit-for-vllm-gaudi:VLLM_STABLE_COMMIT" 2>/dev/null)
+cd ..
 ```
 
 2. Install vLLM with `pip` or [from source](https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html#build-wheel-from-source):
@@ -54,7 +52,7 @@ We are delivering ready-to-run container images that include both vLLM and Gaudi
 git clone https://github.com/vllm-project/vllm
 cd vllm
 git checkout $VLLM_COMMIT_HASH
-pip install -r <(sed '/^[torch]/d' requirements/build.txt)
+pip install -r <(sed '/^torch/d' requirements/build.txt)
 VLLM_TARGET_DEVICE=empty pip install --no-build-isolation -e .
 cd ..
 ```
````
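The `sed` change in the last hunk above is a genuine bug fix, not a style tweak: `[torch]` is a bracket expression, so `/^[torch]/d` deletes every line beginning with any of the letters t, o, r, c, or h, while `/^torch/d` deletes only lines beginning with the literal string `torch`. A quick sketch with a made-up requirements file (the package names below are illustrative):

```shell
# Hypothetical requirements file to show the difference
printf 'torch==2.7\nrequests\nninja\ncmake\n' > reqs.txt

# Buggy: the character class also drops "requests" (r) and "cmake" (c)
sed '/^[torch]/d' reqs.txt    # prints only: ninja

# Fixed: literal prefix match drops only the torch line
sed '/^torch/d' reqs.txt      # prints: requests, ninja, cmake
```

With the old pattern, build dependencies such as `cmake` and `ninja` would silently be filtered out of the install, which is exactly what the corrected command avoids.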

docs/.nav.yml

Lines changed: 5 additions & 2 deletions
```diff
@@ -2,8 +2,11 @@ nav:
   - Home:
     - vLLM x Intel Gaudi: README.md
   - Getting Started:
-    - getting_started/quickstart.md
-    - getting_started/installation.md
+    - Quick Start:
+      - getting_started/quickstart.md
+      - getting_started/quickstart_configuration.md
+      - getting_started/quickstart_inference.md
+    - Installation: getting_started/installation.md
   - Quick Links:
     - User Guide: user_guide/README.md
     - Developer Guide: dev_guide/README.md
```

docs/README.md

Lines changed: 22 additions & 8 deletions
```diff
@@ -1,4 +1,4 @@
-# Welcome to vLLM x Intel Gaudi
+# Intel® Gaudi® vLLM Plugin
 
 <figure markdown="span" style="display: flex; justify-content: center; align-items: center; gap: 10px; margin: auto;">
   <img src="./assets/logos/vllm-logo-text-light.png" alt="vLLM" style="width: 30%; margin: 0;"> x
@@ -15,14 +15,28 @@
 <a class="github-button" href="https://github.com/vllm-project/vllm-gaudi/fork" data-show-count="true" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork">Fork</a>
 </p>
 
-vLLM Gaudi plugin (vllm-gaudi) integrates Intel Gaudi accelerators with vLLM to optimize large language model inference.
+Welcome to the **vLLM-Gaudi plugin**, a community-maintained integration layer that enables high-performance large language model (LLM) inference on Intel® Gaudi® AI accelerators.
 
-This plugin follows the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162) and [[RFC]: Enhancing vLLM Plugin Architecture](https://github.com/vllm-project/vllm/issues/19161) principles, providing a modular interface for Intel Gaudi hardware.
+## 🔍 What is vLLM-Gaudi?
 
-Learn more:
+The **vLLM-Gaudi plugin** connects the vLLM serving engine with Intel Gaudi hardware, offering optimized inference capabilities for enterprise-scale LLM workloads. It is developed and maintained by the Intel/Gaudi team and follows the Hardware Pluggable [RFC](https://github.com/vllm-project/vllm/issues/11162) and vLLM Plugin Architecture [RFC](https://github.com/vllm-project/vllm/issues/19161) for modular integration.
 
-📚 [Intel Gaudi Documentation](https://docs.habana.ai/en/v1.21.1/index.html)
-🚀 [vLLM Plugin System Overview](https://docs.vllm.ai/en/latest/design/plugin_system.html)
+## 🚀 Why Use It?
 
-## Running vLLM on Gaudi with Docker Compose
-We are delivering ready-to-run container images that include both vLLM and Gaudi software. Please follow the [instruction](https://github.com/vllm-project/vllm-gaudi/tree/releases/v0.11.0/.cd) to quickly launch vLLM on Gaudi using a prebuilt Docker image and Docker Compose, with options for custom parameters and benchmarking.
+- **Optimized for Gaudi**: Supports advanced features like the bucketing mechanism, FP8 quantization, and custom graph caching for fast warm-up and efficient memory use.
+- **Scalable and Efficient**: Designed to maximize throughput and minimize latency for large-scale deployments, making it ideal for production-grade LLM inference.
+- **Community-Ready**: Actively maintained on [GitHub](https://github.com/vllm-project/vllm-gaudi) with contributions from Intel, the Gaudi team, and the broader vLLM ecosystem.
+
+## ✅ Action Items
+
+To get started with the Intel® Gaudi® vLLM Plugin:
+
+- [ ] **Set up your environment** using the [quickstart](getting_started/quickstart.md), locally or in a containerized environment.
+- [ ] **Run inference** using supported models like Llama 3.1, Mixtral, or DeepSeek.
+- [ ] **Explore advanced features** such as FP8 quantization, recipe caching, and expert parallelism.
+- [ ] **Join the community** by contributing to the [vLLM-Gaudi GitHub repo](https://github.com/vllm-project/vllm-gaudi).
+
+### Learn more
+
+📚 [Intel Gaudi Documentation](https://docs.habana.ai/en/latest/index.html)
+📦 [vLLM Plugin System Overview](https://docs.vllm.ai/en/latest/design/plugin_system.html)
```
(5 binary image files changed — 12.3 KB, 17.2 KB, 11.3 KB, 11.4 KB, 10.2 KB — diffs not rendered.)
