dp and tp size for 8 x A100 #1126
-
Hello, I am running Llama 3.1 on 8 x A100. I tried tp 8 + dp 8 with `--mem-fraction-static 0.4`, but it gives `CUDA error: invalid device ordinal`. What is the correct setup if I want both tp and dp?
Replies: 6 comments 2 replies
-
Please run `python3 -m sglang.check_env` and share the output.
-
```
Python: 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]
...
Legend: X = Self
```
-
Because you're using PCIe. Try to add
-
Does dp size x tp size have to equal the total number of GPUs? So if I have 8 GPUs, can I only do tp 4 x dp 2?
-
num_gpu = dp x tp |
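
That constraint can be sketched as a quick sanity check. Note that `valid_parallel_config` is a hypothetical helper for illustration, not part of sglang:

```python
def valid_parallel_config(num_gpus: int, dp: int, tp: int) -> bool:
    # dp size x tp size must equal the total number of GPUs available.
    return dp * tp == num_gpus

# tp 8 + dp 8 would need 64 GPUs, so it fails on an 8-GPU node.
print(valid_parallel_config(8, dp=8, tp=8))
# tp 4 x dp 2 uses exactly 8 GPUs.
print(valid_parallel_config(8, dp=2, tp=4))
```

So on 8 x A100, valid combinations are dp 1 x tp 8, dp 2 x tp 4, dp 4 x tp 2, or dp 8 x tp 1.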