1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
5. Please use English, otherwise it will be closed.
Describe the bug
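When using the xgrammar backend with an EBNF grammar, SGLang crashes if the model outputs a reserved special token. In the log below, token id 128255 (<|reserved_special_token_247|>) is rejected by xgrammar's GrammarMatcher, and the resulting RuntimeError takes down the scheduler processes: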
[2025-01-24 04:52:54 TP1] Scheduler hit an exception: Traceback (most recent call last):
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1756, in run_scheduler_process
scheduler.event_loop_overlap()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 512, in event_loop_overlap
self.process_batch_result(tmp_batch, tmp_result)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1089, in process_batch_result
self.process_batch_result_decode(batch, result)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1253, in process_batch_result_decode
req.grammar.accept_token(next_token_id)
File "/sgl-workspace/sglang/python/sglang/srt/constrained/xgrammar_backend.py", line 52, in accept_token
assert self.matcher.accept_token(token)
File "/usr/local/lib/python3.10/dist-packages/xgrammar/matcher.py", line 205, in accept_token
return self._handle.accept_token(token_id, debug_print)
RuntimeError: [04:52:54] /workspace/cpp/grammar_matcher.cc:361: Token id 128255: <|reserved_special_token_247|> is regarded as a special token, and cannot be accepted by the GrammarMatcher
[2025-01-24 04:52:54 TP2] Scheduler hit an exception: Traceback (most recent call last):
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1756, in run_scheduler_process
scheduler.event_loop_overlap()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 512, in event_loop_overlap
self.process_batch_result(tmp_batch, tmp_result)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1089, in process_batch_result
self.process_batch_result_decode(batch, result)
File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1253, in process_batch_result_decode
req.grammar.accept_token(next_token_id)
File "/sgl-workspace/sglang/python/sglang/srt/constrained/xgrammar_backend.py", line 52, in accept_token
assert self.matcher.accept_token(token)
File "/usr/local/lib/python3.10/dist-packages/xgrammar/matcher.py", line 205, in accept_token
return self._handle.accept_token(token_id, debug_print)
RuntimeError: [04:52:54] /workspace/cpp/grammar_matcher.cc:361: Token id 128255: <|reserved_special_token_247|> is regarded as a special token, and cannot be accepted by the GrammarMatcher
[2025-01-24 04:52:54] Received sigquit from a child proces. It usually means the child failed.
[2025-01-24 04:52:54] Received sigquit from a child proces. It usually means the child failed.
[2025-01-24 04:52:54] Received sigquit from a child proces. It usually means the child failed.
[2025-01-24 04:52:54] Received sigquit from a child proces. It usually means the child failed.
[2025-01-24 04:52:54] Received sigquit from a child proces. It usually means the child failed.
...
This is followed by an endless stream of the following exception:
[2025-01-24 04:53:06] Exception in callback Loop._read_from_self
handle: <Handle Loop._read_from_self>
Traceback (most recent call last):
File "uvloop/cbhandles.pyx", line 66, in uvloop.loop.Handle._run
File "uvloop/loop.pyx", line 399, in uvloop.loop.Loop._read_from_self
File "uvloop/loop.pyx", line 404, in uvloop.loop.Loop._invoke_signals
File "uvloop/loop.pyx", line 379, in uvloop.loop.Loop._ceval_process_signals
File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/engine.py", line 332, in sigquit_handler
kill_process_tree(os.getpid())
File "/sgl-workspace/sglang/python/sglang/srt/utils.py", line 508, in kill_process_tree
itself.send_signal(signal.SIGQUIT)
File "/usr/local/lib/python3.10/dist-packages/psutil/__init__.py", line 1285, in send_signal
self._send_signal(sig)
File "/usr/local/lib/python3.10/dist-packages/psutil/__init__.py", line 1266, in _send_signal
os.kill(self.pid, sig)
File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/engine.py", line 332, in sigquit_handler
kill_process_tree(os.getpid())
...
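For illustration, one way the scheduler could survive this is to treat a rejected token as a per-request failure instead of letting the exception propagate. The following is only a minimal sketch under assumptions: `FakeMatcher`, `try_accept`, and `process_decode_step` are hypothetical names, not part of SGLang or xgrammar, and the fake matcher merely mimics the RuntimeError seen in the traceback above.

```python
class FakeMatcher:
    """Hypothetical stand-in for a grammar matcher; NOT the xgrammar API."""

    def __init__(self, special_token_ids):
        self.special_token_ids = set(special_token_ids)
        self.accepted = []

    def accept_token(self, token_id):
        # Mimic the failure in the log: special tokens raise RuntimeError.
        if token_id in self.special_token_ids:
            raise RuntimeError(
                f"Token id {token_id} is regarded as a special token"
            )
        self.accepted.append(token_id)
        return True


def try_accept(matcher, token_id):
    """Return True if the token was accepted, False if it was rejected.

    A rejection is either a falsy return value or a RuntimeError raised
    by the matcher; neither is allowed to escape to the caller.
    """
    try:
        return bool(matcher.accept_token(token_id))
    except RuntimeError:
        return False


def process_decode_step(matcher, token_id):
    """Decide what to do with one request after sampling a token.

    Returns "ok" to continue decoding, or "abort" to finish only this
    request with an error instead of crashing the whole scheduler.
    """
    return "ok" if try_accept(matcher, token_id) else "abort"
```

With this shape, a single bad sample would end one request rather than every tensor-parallel worker. Whether aborting, resampling, or masking special tokens before sampling is the right fix is a design decision for the maintainers.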
Reproduction
docker run -d --gpus all \
  -p 8000:8000 \
  -v /home/azureuser/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=*****" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server --model-path deepseek-ai/DeepSeek-R1-Distill-Llama-70B --host 0.0.0.0 --port 8000 --tp 4 --dp 1 --grammar-backend xgrammar
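The report does not include the request that triggered the crash. As a rough sketch of what a grammar-constrained request against this server might look like, assuming SGLang's native `/generate` endpoint and its `ebnf` sampling parameter: the grammar, prompt, and helper names below are made up for illustration and are not confirmed to reproduce the failure.

```python
import json
import urllib.request

# Hypothetical grammar for illustration; the grammar used in the
# original report is not known.
EBNF_GRAMMAR = 'root ::= "yes" | "no"'


def build_payload(prompt, grammar=EBNF_GRAMMAR, max_new_tokens=16):
    """Build a JSON body for a grammar-constrained generation request."""
    return {
        "text": prompt,
        "sampling_params": {
            "ebnf": grammar,  # assumed name of the EBNF sampling parameter
            "max_new_tokens": max_new_tokens,
            "temperature": 0.7,
        },
    }


def send(payload, url="http://localhost:8000/generate"):
    """POST the payload to the server started by the docker command."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_payload("Is the sky blue? Answer yes or no."))` repeatedly would exercise the constrained decoding path. The crash only occurs when the model happens to sample a reserved special token, so it may take many requests to hit.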
Environment
Latest docker image: https://hub.docker.com/layers/lmsysorg/sglang/latest/images/sha256-576f608ad94fda242249416b3d9d27f8448091cfeff5776f6b99d90f4a42c13b
Hardware: Microsoft Azure VM with 4x A100 80 GB GPUs.