I am deploying meituan/DeepSeek-R1-Block-INT8 on four A800 servers, each with 8 x 40 GB GPUs (32 GPUs in total).
I start the server with the following commands, one per node:
python3 -m sglang.launch_server --model /mnt/oceanfs/DeepSeek-R1-Block-INT8 --tp 32 --dist-init-addr 10.0.0.3:5000 --nnodes 4 --node-rank 0 --trust-remote --host 0.0.0.0 --port 30000 --enable-torch-compile --torch-compile-max-bs 8
python3 -m sglang.launch_server --model /mnt/oceanfs/DeepSeek-R1-Block-INT8 --tp 32 --dist-init-addr 10.0.0.3:5000 --nnodes 4 --node-rank 1 --trust-remote --enable-torch-compile --torch-compile-max-bs 8
python3 -m sglang.launch_server --model /mnt/oceanfs/DeepSeek-R1-Block-INT8 --tp 32 --dist-init-addr 10.0.0.3:5000 --nnodes 4 --node-rank 2 --trust-remote --enable-torch-compile --torch-compile-max-bs 8
python3 -m sglang.launch_server --model /mnt/oceanfs/DeepSeek-R1-Block-INT8 --tp 32 --dist-init-addr 10.0.0.3:5000 --nnodes 4 --node-rank 3 --trust-remote --enable-torch-compile --torch-compile-max-bs 8
However, during startup, I encountered the following error:
"Weight output_partition_size = 576 is not divisible by weight quantization block_n = 128"
Why is this happening, and how can I resolve it?
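To narrow this down, here is a minimal sketch of the divisibility check that the error message describes. It is not SGLang's actual code. It assumes that a tensor-parallel linear layer splits its output dimension evenly across `--tp` ranks, and that the 576 comes from a dense MLP width of 18432 (`intermediate_size`, a value I am assuming for this model; please check it against the `config.json` in the checkpoint). The helper names are hypothetical.

```python
# Sketch only: reproduces the arithmetic behind the error message.
# INTERMEDIATE_SIZE is an ASSUMED value; verify it in the model's config.json.
INTERMEDIATE_SIZE = 18432
BLOCK_N = 128  # weight quantization block size reported in the error


def partition_size(full_size: int, tp: int) -> int:
    """Per-rank output size when full_size is split evenly over tp ranks."""
    if full_size % tp != 0:
        raise ValueError(f"{full_size} is not divisible by tp={tp}")
    return full_size // tp


def is_block_aligned(full_size: int, tp: int, block_n: int = BLOCK_N) -> bool:
    """True if each rank's shard is a whole number of quantization blocks."""
    return full_size % tp == 0 and (full_size // tp) % block_n == 0


if __name__ == "__main__":
    for tp in (8, 16, 32):
        size = partition_size(INTERMEDIATE_SIZE, tp)
        status = "ok" if is_block_aligned(INTERMEDIATE_SIZE, tp) else "not aligned"
        print(f"tp={tp}: output_partition_size={size} ({status})")
```

Under these assumptions, `--tp 32` gives a shard of 576 columns, which is 4.5 blocks of 128, while `--tp 16` gives 1152 (9 blocks) and passes the check. Whether a smaller tensor-parallel size fits in 40 GB per GPU is a separate question that this sketch does not answer.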