enable UT tests for XPU devices #11712
Conversation
python/sglang/test/test_utils.py
Outdated
@@ -1837,3 +1840,34 @@ def wrapper(self):
        return wrapper

    return decorator


def get_gpu_rank():
The function is getting the device count, not the current rank or device index. I feel it would be better to name the function something like get_gpu_count and rename the variable from gpu_rank to gpu_count.
python/sglang/test/test_utils.py
Outdated
    gpu_rank = torch.cuda.device_count()
elif is_rocm():
    gpu_rank = torch.rocm.device_count()
return gpu_rank
Suggest adding a final else to handle the case where none of the device backends apply:
-    return gpu_rank
+    else:
+        gpu_count = 0
+    return gpu_count
python/sglang/test/test_utils.py
Outdated
if is_cuda():
    return torch.cuda.device_memory_used() / 1024**3
elif is_xpu():
    return torch.xpu.device_memory_used() / 1024**3
Suggest adding a final else here as well.
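For example, roughly like this (a sketch only: the function name is a placeholder, the is_cuda/is_xpu helpers and the torch.*.device_memory_used() calls are simply taken from the quoted diff rather than verified here, and returning 0.0 when no supported backend is present is just one possible fallback):

def get_device_memory_used_gb():
    # Used device memory in GiB; backend-specific calls mirror the quoted diff.
    if is_cuda():
        return torch.cuda.device_memory_used() / 1024**3
    elif is_xpu():
        return torch.xpu.device_memory_used() / 1024**3
    else:
        # No supported accelerator backend detected.
        return 0.0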
python/sglang/test/test_utils.py
Outdated
    return torch.xpu.device_memory_used() / 1024**3


def get_gpu_capability():
Is it possible to directly reuse the function below?
sglang/python/sglang/srt/utils/common.py, line 1825 in 7257525
@@ -149,10 +153,8 @@ def causal_conv1d_opcheck_fn(
@pytest.mark.parametrize("width", [4])
@pytest.mark.parametrize("dim", [2048, 2048 + 16, 4096])
def test_causal_conv1d_update(dim, width, seqlen, has_bias, silu_activation, itype):
-    if not torch.cuda.is_available():
-        pytest.skip("CUDA device not available")
By removing this check, are we expecting this test to run on all devices, or only on cuda and xpu?
This case is only in CUDA CI.
@@ -188,10 +190,8 @@ def test_causal_conv1d_update(dim, width, seqlen, has_bias, silu_activation, itype):
def test_causal_conv1d_update_with_batch_gather(
    batch_size, with_padding, dim, width, seqlen, has_bias, silu_activation, itype
):
-    if not torch.cuda.is_available():
-        pytest.skip("CUDA device not available")

@@ -268,11 +268,9 @@ def test_causal_conv1d_update_with_batch_gather(
def test_causal_conv1d_varlen(
    batch, with_padding, dim, seqlen, width, has_bias, silu_activation, itype
):
-    if not torch.cuda.is_available():
-        pytest.skip("CUDA device not available")

class TestCreateKvIndices(CustomTestCase):
    @classmethod
    def setUpClass(cls):
        if not torch.cuda.is_available():
            raise unittest.SkipTest("CUDA is not available")
test/srt/test_get_weights_by_name.py
Outdated
device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")


def get_gpu_rank():
You've added this util function in python/sglang/test/test_utils.py in this PR. Can we directly reuse that one?
from sglang.test.test_utils import empty_gpu_cache


device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")
torch.set_default_device(device_type)
I feel like you could add a util function for this device setup in python/sglang/test/test_utils.py and then reuse it in all these files.
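Something along these lines, for example (a minimal sketch: the helper name set_default_test_device is only a placeholder, and it simply wraps the two lines quoted above, so it assumes a PyTorch version that provides torch.accelerator):

def set_default_test_device():
    # Detect the active accelerator (if any) and make it the default device,
    # falling back to "cpu" when no accelerator backend is available.
    device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")
    torch.set_default_device(device_type)
    return device_type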
@ping1jing2 The PR can benefit Ascend as well, could you give it a review?
mingfeima left a comment:
generally LGTM, just some minor changes required.
python/sglang/test/test_utils.py
Outdated
@@ -1837,3 +1840,34 @@ def wrapper(self):
        return wrapper

    return decorator


def get_gpu_rank():
-def get_gpu_rank():
+def get_device_count():
+    """
+    Returns the number of available devices depending on the backend.
+    Supports CUDA, ROCm, and XPU.
+    """
def empty_gpu_cache():
    if is_xpu():
        torch.xpu.empty_cache()
    elif is_cuda():
        torch.cuda.empty_cache()
Do we need to handle ROCm here as well? It also needs a final else.
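For instance, roughly (a sketch only; on ROCm the call would go through torch.cuda.empty_cache(), since ROCm builds of PyTorch reuse the cuda namespace, and the final else is simply a no-op for CPU-only runs):

def empty_gpu_cache():
    if is_xpu():
        torch.xpu.empty_cache()
    elif is_cuda() or is_rocm():
        # ROCm builds of PyTorch reuse the torch.cuda namespace.
        torch.cuda.empty_cache()
    else:
        # Nothing to release on CPU-only setups.
        pass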
@@ -23,6 +23,9 @@
from sglang.srt.utils.hf_transformers_utils import get_tokenizer
from sglang.test.test_utils import DEFAULT_SMALL_MODEL_NAME_FOR_TEST, CustomTestCase


device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")
What is the purpose of this line?

Enable CI to verify.
@DiweiSun make sure that you install pre-commit according to https://docs.sglang.ai/developer_guide/contribution_guide.html#format-code-with-pre-commit

Thank you, I can't agree more with chunyuan-w's and mingfeima's comments.
/rerun-failed-ci

add xpu support for ut

/rerun-failed-ci

/rerun-failed-ci

/rerun-failed-ci
        device: Device type ("auto", "cuda", "rocm" or "cpu").
            If "auto", will detect available platforms automatically.
    """
-    # Auto-detect device if needed
Why is this removed?
The main issue is here:
sglang/python/sglang/test/test_utils.py, line 418 in 7541da1
This change here may not be ideal, but I think we should not fall back to CPU and should raise an error directly.
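Roughly something like this at the auto-detect step (a sketch only; the surrounding function and its exact signature are not shown here, and the is_cuda/is_rocm/is_xpu helpers are the ones used elsewhere in the PR):

if device == "auto":
    if is_cuda():
        device = "cuda"
    elif is_rocm():
        device = "rocm"
    elif is_xpu():
        device = "xpu"
    else:
        # Do not silently fall back to CPU; surface the problem instead.
        raise RuntimeError("No supported accelerator (CUDA/ROCm/XPU) detected.")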
/rerun-failed-ci

/rerun-failed-ci

/rerun-failed-ci

/rerun-failed-ci

1 similar comment

/rerun-failed-ci
Hello @Kangyan-Zhou, those PR issues aren't caused by my change. Could you help review it?

Yes, I think the PR generally looks good; I just wanted all the CI to pass to get more confidence. I'll keep a close eye on it.
Co-authored-by: jundu <jun.du@intel.com>
Co-authored-by: Gao, Pengfei <pengfei.gao@intel.com>
(Please note that this PR has a large scope; it will be split into smaller PRs as requested.)
This PR enables SGLang UTs on XPU. What we do in this PR:
How to Run UTs on XPU:
Apply this PR/diff on SGLang main, then build the sglang environment via docker/Dockerfile.xpu.
Some cases may require additional PyPI packages, which need to be installed manually:
flashinfer-python
sentencepiece
ray
accelerate
nest-asyncio
Then run:
pytest -v test_*.py