
enable ut test for xpu devices#11712

Merged
Kangyan-Zhou merged 49 commits into sgl-project:main from DiweiSun:molly/ut_enabling_xpu
Feb 3, 2026

Conversation

@DiweiSun
Collaborator

@DiweiSun DiweiSun commented Oct 16, 2025

(Please note that this PR has a large scope. It will be split into smaller PRs as requested.)

This PR enables SGLang UTs on XPU. What we do in this PR:

  1. Enable the multi-hardware config in test/runners.py
  2. Enable the multi-hardware config in test/test_utils.py
  3. Enable multi-hardware support in individual test_*.py files as required.

How to run UTs on XPU:
Apply this PR/diff on SGLang main, then build the sglang environment via docker/Dockerfile.xpu.
Some cases also require PyPI packages that must be installed manually:
flashinfer-python
sentencepiece
ray
accelerate
nest-asyncio

pytest -v test_*.py

@@ -1837,3 +1840,34 @@ def wrapper(self):
return wrapper

return decorator


def get_gpu_rank():
Contributor

@chunyuan-w chunyuan-w Nov 7, 2025


The function is getting the device count, not the current rank or device index. I feel it would be better to name the function something like get_gpu_count and rename the variable from gpu_rank to gpu_count.

gpu_rank = torch.cuda.device_count()
elif is_rocm():
gpu_rank = torch.rocm.device_count()
return gpu_rank
Contributor

Suggest adding a final else to handle the case where none of the device backends apply:

Suggested change
-    return gpu_rank
+    else:
+        gpu_count = 0
+    return gpu_count
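Putting the reviewers' suggestions together (the get_device_count naming plus a final else), a minimal sketch could look like the following. The helper name and the hasattr guard are illustrative, not the PR's actual code; note also that ROCm builds of PyTorch reuse the torch.cuda namespace, so there is no torch.rocm module.

```python
def get_device_count() -> int:
    """Number of available accelerator devices, or 0 when none apply.

    Illustrative sketch of the review suggestions, not the PR's code.
    ROCm builds of PyTorch expose the CUDA API, so the cuda branch
    covers AMD devices as well.
    """
    try:
        import torch
    except ImportError:
        return 0  # no torch at all: treat as zero devices
    if torch.cuda.is_available():  # CUDA and ROCm builds
        return torch.cuda.device_count()
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.xpu.device_count()
    return 0  # final else: no supported backend
```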

if is_cuda():
    return torch.cuda.device_memory_used() / 1024**3
elif is_xpu():
    return torch.xpu.device_memory_used() / 1024**3
Contributor

Suggest adding a final else

return torch.xpu.device_memory_used() / 1024**3
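With the suggested final else applied, the memory helper might read as below. This is a hedged sketch (the get_device_memory_used_gb name is made up) that returns 0.0 on hosts without a supported accelerator rather than failing; it assumes the device_memory_used() API quoted above, which exists in recent PyTorch releases.

```python
def get_device_memory_used_gb() -> float:
    """Used accelerator memory in GiB; 0.0 when no backend applies.

    Sketch only; assumes the device_memory_used() API quoted above.
    """
    try:
        import torch
    except ImportError:
        return 0.0
    if torch.cuda.is_available():
        return torch.cuda.device_memory_used() / 1024**3
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.xpu.device_memory_used() / 1024**3
    return 0.0  # final else, as suggested
```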


def get_gpu_capability():
Contributor

Is it possible to directly reuse the below function?

def get_device_capability(device_id: int = 0) -> Tuple[int, int]:

@@ -149,10 +153,8 @@ def causal_conv1d_opcheck_fn(
@pytest.mark.parametrize("width", [4])
@pytest.mark.parametrize("dim", [2048, 2048 + 16, 4096])
def test_causal_conv1d_update(dim, width, seqlen, has_bias, silu_activation, itype):
    if not torch.cuda.is_available():
        pytest.skip("CUDA device not available")
Contributor

By removing this check, are we expecting this test to run on all devices, or only on cuda and xpu?

Contributor

This case is only in CUDA CI.

@@ -188,10 +190,8 @@ def test_causal_conv1d_update(dim, width, seqlen, has_bias, silu_activation, ity
def test_causal_conv1d_update_with_batch_gather(
    batch_size, with_padding, dim, width, seqlen, has_bias, silu_activation, itype
):
    if not torch.cuda.is_available():
        pytest.skip("CUDA device not available")
Contributor

ditto

@@ -268,11 +268,9 @@ def test_causal_conv1d_update_with_batch_gather(
def test_causal_conv1d_varlen(
    batch, with_padding, dim, seqlen, width, has_bias, silu_activation, itype
):
    if not torch.cuda.is_available():
        pytest.skip("CUDA device not available")
Contributor

ditto


class TestCreateKvIndices(CustomTestCase):
    @classmethod
    def setUpClass(cls):
        if not torch.cuda.is_available():
            raise unittest.SkipTest("CUDA is not available")
Contributor

ditto

device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")


def get_gpu_rank():
Contributor

You've added this util function in python/sglang/test/test_utils.py in this PR. Can we directly reuse that one?

from sglang.test.test_utils import empty_gpu_cache

device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")
torch.set_default_device(device_type)
Contributor

I feel like you could add a util function for this device setting in python/sglang/test/test_utils.py and then you can reuse this util function in all these files.
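A shared helper along the lines the reviewer describes might look like this sketch (the set_default_accelerator name and the exact fallback behavior are assumptions, not the PR's code):

```python
def set_default_accelerator() -> str:
    """Set torch's default device to the active accelerator, else "cpu".

    Sketch of the suggested shared util; mirrors the
    getattr(torch.accelerator.current_accelerator(), "type", "cpu")
    pattern repeated across the test files.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    device_type = "cpu"
    accelerator_api = getattr(torch, "accelerator", None)  # torch >= 2.6
    if accelerator_api is not None:
        # current_accelerator() returns None when no accelerator exists,
        # and getattr(None, "type", "cpu") then yields the cpu fallback.
        device_type = getattr(accelerator_api.current_accelerator(), "type", "cpu")
    torch.set_default_device(device_type)
    return device_type
```

Each test file could then replace its module-level device_type line with a single call to this helper.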

@airMeng
Collaborator

airMeng commented Nov 7, 2025

@ping1jing2 This PR can benefit Ascend as well; could you give it a review?

Collaborator

@mingfeima mingfeima left a comment

generally LGTM, just some minor changes required.

@@ -1837,3 +1840,34 @@ def wrapper(self):
return wrapper

return decorator


def get_gpu_rank():
Collaborator

Suggested change
-def get_gpu_rank():
+def get_device_count():
+    """
+    Returns the number of available devices depending on the backend.
+    Supports CUDA, ROCm, and XPU.
+    """

Comment on lines 1855 to 1941
def empty_gpu_cache():
    if is_xpu():
        torch.xpu.empty_cache()
    elif is_cuda():
        torch.cuda.empty_cache()
Collaborator

do we need to have rocm here?

and also it needs a final else.
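Addressing both points (ROCm coverage and a final else) could look like the sketch below. Since ROCm builds of PyTorch reuse the torch.cuda API, a single cuda branch covers AMD GPUs too, and CPU-only hosts fall through to a no-op.

```python
def empty_gpu_cache() -> None:
    """Release cached allocator memory on the active backend, if any.

    Illustrative sketch with the requested final else; not the PR's code.
    """
    try:
        import torch
    except ImportError:
        return  # nothing to release without torch
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
    elif torch.cuda.is_available():  # CUDA and ROCm builds share this API
        torch.cuda.empty_cache()
    # final else: CPU-only host, nothing cached on a device
```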

@@ -23,6 +23,9 @@
from sglang.srt.utils.hf_transformers_utils import get_tokenizer
from sglang.test.test_utils import DEFAULT_SMALL_MODEL_NAME_FOR_TEST, CustomTestCase

device_type = getattr(torch.accelerator.current_accelerator(), "type", "cpu")
Collaborator

what is the purpose of this line?

@mingfeima mingfeima added the xpu, intel, and run-ci labels Nov 7, 2025
@mingfeima
Collaborator

enable CI to verify.

@mingfeima
Collaborator

@DiweiSun make sure that you install pre-commit according to https://docs.sglang.ai/developer_guide/contribution_guide.html#format-code-with-pre-commit

@ping1jing2
Collaborator

@ping1jing2 This PR can benefit Ascend as well; could you give it a review?

Thank you; I can't agree more with chunyuan-w's and mingfeima's comments.

@github-actions github-actions bot added the documentation, quant, amd, dependencies, lora, multi-modal, deepseek, and speculative-decoding labels Nov 21, 2025
@1pikachu
Contributor

1pikachu commented Jan 4, 2026

/rerun-failed-ci

@1pikachu
Contributor

1pikachu commented Jan 9, 2026

/rerun-failed-ci

@1pikachu
Contributor

/rerun-failed-ci

@1pikachu
Contributor

/rerun-failed-ci

device: Device type ("auto", "cuda", "rocm" or "cpu").
If "auto", will detect available platforms automatically.
"""
# Auto-detect device if needed
Collaborator

Why is this removed?

Contributor

The main issue is here:

except (RuntimeError, ImportError) as e:

This change here may not be ideal, but I think we should not fall back to CPU and should raise an error directly.

@1pikachu
Contributor

/rerun-failed-ci

@1pikachu
Contributor

/rerun-failed-ci

@1pikachu
Contributor

1pikachu commented Feb 2, 2026

/rerun-failed-ci

@Kangyan-Zhou
Collaborator

/rerun-failed-ci

1 similar comment
@1pikachu
Contributor

1pikachu commented Feb 2, 2026

/rerun-failed-ci

@1pikachu
Contributor

1pikachu commented Feb 2, 2026

Hello @Kangyan-Zhou, those PR issues aren't caused by my change. Could you help review it?

@Kangyan-Zhou
Collaborator

Hello @Kangyan-Zhou, those PR issues aren't caused by my change. Could you help review it?

Yes, I think the PR generally looks good; I just wanted all the CI checks to pass for more confidence. I'll keep a close eye on it.

@Kangyan-Zhou Kangyan-Zhou self-assigned this Feb 2, 2026
@Kangyan-Zhou Kangyan-Zhou merged commit 495290a into sgl-project:main Feb 3, 2026
267 of 321 checks passed
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 5, 2026
Co-authored-by: jundu <jun.du@intel.com>
Co-authored-by: Gao, Pengfei <pengfei.gao@intel.com>
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
Co-authored-by: jundu <jun.du@intel.com>
Co-authored-by: Gao, Pengfei <pengfei.gao@intel.com>

Labels

amd, deepseek, dependencies, documentation, hicache, high priority, intel, lora, model-gateway, multi-modal, npu, quant, run-ci, sgl-kernel, speculative-decoding, xpu

9 participants