[Feature] Control the stage init timeout threshold by --stage-init-timeout #393

tzhouam · 2025-12-20T18:26:45Z

Purpose

This PR makes the stage initialization timeout threshold configurable through --stage-init-timeout, as requested in Issue #386. The threshold was defined in PR #328; the option is renamed from --init-sleep-seconds.

Test Plan

Tested with Qwen2.5/3 Omni 3B in both online and offline modes.

Test Result

Successfully controlled the stage init timeout for both online and offline.



…points - Replaced instances of `AnyTokenizer` with `TokenizerLike` in `output_processor.py`, `async_omni.py`, and `omni_stage.py`. - Updated the tokenizer initialization function to `init_tokenizer_from_config` in `async_omni.py` for consistency.


Copilot

Pull request overview

This PR refactors the stage initialization timeout mechanism by renaming init_sleep_seconds to stage_init_timeout and changing its behavior from a sleep duration between stage starts to a timeout threshold for device lock acquisition during stage initialization. The default value is updated from 20/30 seconds to 300 seconds (5 minutes).

Key changes:

- Renamed parameter from init_sleep_seconds/init-sleep-seconds to stage_init_timeout/stage-init-timeout across all interfaces
- Removed hardcoded 300-second timeout in favor of configurable parameter passed to worker functions
- Removed sleep between stage starts in orchestration logic
- Added informational logging for successful stage initialization with timing
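
To illustrate the behavioral change summarized above, here is a minimal sketch of a lock acquisition bounded by a timeout. It is not the code in omni_stage.py: the helper names `acquire_init_lock` and `release_init_lock`, the `poll_interval` parameter, and the use of `fcntl.flock` (POSIX only) are assumptions made for illustration. Only `stage_init_timeout`, `wait_start`, and `lock_fd` echo names that appear in this PR.

```python
import fcntl
import os
import time


def acquire_init_lock(
    lock_path: str,
    stage_init_timeout: float,
    poll_interval: float = 0.05,  # hypothetical knob, not part of the PR
) -> tuple[int, float]:
    """Poll for an exclusive file lock, giving up after stage_init_timeout seconds.

    Returns the lock file descriptor and the time spent waiting.
    """
    wait_start = time.monotonic()
    lock_fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    while True:
        try:
            # Non-blocking attempt: raises BlockingIOError if another holder exists.
            fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return lock_fd, time.monotonic() - wait_start
        except BlockingIOError:
            if time.monotonic() - wait_start >= stage_init_timeout:
                os.close(lock_fd)
                raise TimeoutError(
                    f"could not acquire {lock_path} within {stage_init_timeout}s"
                )
            time.sleep(poll_interval)


def release_init_lock(lock_fd: int) -> None:
    """Closing the descriptor releases the flock held through it."""
    os.close(lock_fd)
```

With this shape the value is an upper bound on waiting rather than a fixed delay: a stage that gets the lock immediately pays no extra time, which is the difference from the old sleep-based parameter.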

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.


| File | Description |
| --- | --- |
| vllm_omni/entrypoints/omni_stage.py | Added `stage_init_timeout` parameter to `__init__`, removed hardcoded `max_wait_time`, propagated timeout to worker functions, added initialization success logging |
| vllm_omni/entrypoints/omni_llm.py | Renamed parameter from `init_sleep_seconds` to `stage_init_timeout`, updated default from 20 to 300, removed sleep after stage start, updated error messages |
| vllm_omni/entrypoints/cli/serve.py | Renamed CLI argument from `--init-sleep-seconds` to `--stage-init-timeout`, updated default from 30 to 300, improved help text |
| vllm_omni/entrypoints/async_omni.py | Renamed parameter throughout async implementation, updated error messages, removed sleep after stage start |
| tests/e2e/online_serving/test_qwen3_omni.py | Updated test fixture to use new `--stage-init-timeout` argument |
| tests/e2e/offline_inference/test_qwen3_omni.py | Updated test to use new `stage_init_timeout` parameter with value 300 |
| tests/e2e/offline_inference/conftest.py | Renamed parameter in OmniRunner, updated default to 300, improved docstring |
| examples/offline_inference/qwen3_omni/run_single_prompt_tp.sh | Updated script to use new `--stage-init-timeout` argument |
| examples/offline_inference/qwen3_omni/end2end.py | Renamed argument from `--init-sleep-seconds` to `--stage-init-timeout`, updated default to 300, passed to Omni constructor |
| examples/offline_inference/qwen2_5_omni/end2end.py | Renamed argument from `--init-sleep-seconds` to `--stage-init-timeout`, updated default to 300, passed to OmniLLM constructor |


Copilot · 2025-12-20T18:30:45Z

vllm_omni/entrypoints/omni_llm.py

        log_stats: bool = False,
        log_file: str | None = None,
-        init_sleep_seconds: int = 20,
+        stage_init_timeout: int = 20,


There's an inconsistency in the default value for 'stage_init_timeout'. The OmniLLM class uses a default of 20 seconds, while the CLI serve.py uses a default of 300 seconds. Given that this parameter now represents a timeout threshold (not a sleep duration), and considering the change mentioned in the PR description, 300 seconds appears to be the intended default. The default in OmniLLM should be updated to match.
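
One way to keep the two defaults from drifting apart is to define the value once and reference it from both the constructor and the CLI parser. The sketch below is only an illustration under assumptions: `DEFAULT_STAGE_INIT_TIMEOUT`, `OmniLLMSketch`, and `build_parser` are hypothetical names, not identifiers from vllm-omni. Only the parameter name `stage_init_timeout`, the flag `--stage-init-timeout`, and the value 300 come from this PR.

```python
import argparse

# Hypothetical shared constant (seconds); the PR settles on 300 as the default.
DEFAULT_STAGE_INIT_TIMEOUT = 300


class OmniLLMSketch:
    """Stand-in for an entrypoint class; not the real OmniLLM."""

    def __init__(
        self,
        model: str,
        stage_init_timeout: int = DEFAULT_STAGE_INIT_TIMEOUT,
    ) -> None:
        self.model = model
        self.stage_init_timeout = stage_init_timeout


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser whose default is tied to the same constant."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--stage-init-timeout",
        type=int,
        default=DEFAULT_STAGE_INIT_TIMEOUT,
        help="Timeout in seconds for a single stage to initialize.",
    )
    return parser
```

Because both call sites read the same constant, changing the default later only requires touching one line.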

Copilot · 2025-12-20T18:30:46Z

vllm_omni/entrypoints/omni_stage.py

            try:
                _os.close(lock_fd)
                _logging.getLogger(__name__).debug("[Stage-%s] Released initialization lock (fd=%s)", stage_id, lock_fd)
+                _logging.getLogger(__name__).info(


The variable 'wait_start' is only defined inside the 'if device_type == "cuda"' block (line 556), but it's being referenced in the log message on line 630, which is outside that block. This will cause a NameError when running on non-CUDA devices. The variable should be initialized before the device_type check, or the log message should be moved inside the CUDA block.
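
The first of the two fixes suggested in this comment (initializing the variable before the device check) could look roughly like the sketch below. The function `init_stage` and its body are hypothetical and heavily simplified; only `wait_start`, `device_type`, and the `[Stage-%s]` log prefix mirror the snippet under review.

```python
import logging
import time

logger = logging.getLogger(__name__)


def init_stage(stage_id: int, device_type: str) -> float:
    """Return the elapsed initialization time; safe on any device type."""
    # Defined before the branch, so the log line below can always read it.
    wait_start = time.monotonic()

    if device_type == "cuda":
        # Placeholder: lock acquisition and device setup would happen here.
        pass

    elapsed = time.monotonic() - wait_start
    logger.info("[Stage-%s] Initialized in %.2fs", stage_id, elapsed)
    return elapsed
```

The alternative mentioned in the comment, moving the log call inside the CUDA branch, avoids the NameError as well but drops the timing message on other devices.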

examples/offline_inference/qwen3_omni/run_single_prompt_tp.sh

vllm_omni/entrypoints/omni_llm.py

vllm_omni/entrypoints/async_omni.py


hsliuustc0106 · 2025-12-21T00:32:18Z

please also add your test result for Qwen3-Omni

hsliuustc0106 · 2025-12-21T11:59:16Z

examples/offline_inference/qwen3_omni/run_single_prompt_tp.sh

 python end2end.py --output-wav output_audio \
                  --query-type use_audio \
-                  --init-sleep-seconds 90
+                  --stage-init-timeout 90


Why did we choose 90 seconds?

Why not the default 300?

vllm_omni/entrypoints/omni_stage.py


tzhouam · 2025-12-23T08:23:13Z

> please also add your test result for Qwen3-Omni

Tested on Qwen3-Omni; it also works.

hsliuustc0106 · 2025-12-23T12:00:51Z

docs/getting_started/installation/gpu/cuda.inc.md

 vLLM-Omni is built based on vLLM. Please install it with command below.
 ```bash
-uv pip install vllm==0.11.0 --torch-backend=auto
+uv pip install vllm==0.12.0 --torch-backend=auto


Why do we need to change this? It will cause trouble, since we have not released v0.12.0rc.

hsliuustc0106 · 2025-12-23T12:00:58Z

docs/getting_started/installation/gpu/cuda.inc.md

    -p 8091:8091 \
    --ipc=host \
-    vllm/vllm-omni:v0.11.0rc1 \
+    vllm/vllm-omni:v0.12.0rc1 \


Why do we need to change this? It will cause trouble, since we have not released v0.12.0rc.


tzhouam added 5 commits December 19, 2025 11:11

[Doc] Update installation instructions for vLLM and vLLM-Omni to vers…

cc89958

…ion 0.12.0 - Updated installation commands and version references in CUDA, ROCm, and NPU documentation to reflect the new vLLM v0.12.0 release. - Adjusted Docker image tags and version checks accordingly.

rename the --init-sleep-time to --stage-init-timeout to control the s…

6f55cd3

…ingle stage init timeout Signed-off-by: Taichang Zhou <[email protected]>

fix the bug that online timeout is not passed in

3b8787c

Signed-off-by: Taichang Zhou <[email protected]>

correct the log

5cb4759

Signed-off-by: Taichang Zhou <[email protected]>

tzhouam requested review from Gaohan123 and Copilot December 20, 2025 18:26

tzhouam requested a review from hsliuustc0106 as a code owner December 20, 2025 18:26

Copilot started reviewing on behalf of tzhouam December 20, 2025 18:27

follow pre-commit

dfaebd6

Signed-off-by: Taichang Zhou <[email protected]>

Copilot AI reviewed Dec 20, 2025


tzhouam and others added 6 commits December 21, 2025 02:32

Update examples/offline_inference/qwen3_omni/run_single_prompt_tp.sh

90ac08d

Co-authored-by: Copilot <[email protected]> Signed-off-by: Zhou Taichang <[email protected]>

Update vllm_omni/entrypoints/async_omni.py

d2fe522

Co-authored-by: Copilot <[email protected]> Signed-off-by: Zhou Taichang <[email protected]>

Update vllm_omni/entrypoints/omni_llm.py

27c6782

Co-authored-by: Copilot <[email protected]> Signed-off-by: Zhou Taichang <[email protected]>

unify the defaul value for stage init timeout

5003d5a

Signed-off-by: Taichang Zhou <[email protected]>

Merge branch 'dev/control-init-timeout' of https://github.com/tzhouam…

54d008b

…/vllm-omni into dev/control-init-timeout

remove out of block reference for wait_start

154858e

Signed-off-by: Taichang Zhou <[email protected]>

tzhouam added the ready label to trigger buildkite CI label Dec 20, 2025

tzhouam mentioned this pull request Dec 20, 2025

[Bug]: online inference failed for the launch of local qwen3omni instruct 30B #386

Closed

1 task

update the unit test _FakeStage as it accepts the stage_init_timeout …

eb92908

…as a new input parameter Signed-off-by: Taichang Zhou <[email protected]>

hsliuustc0106 reviewed Dec 21, 2025


tzhouam added 4 commits December 22, 2025 16:34

unify the name to stage_init_timeout

3cfa121

Signed-off-by: tzhouam <[email protected]>

Unify to the stage init timeout for the run_single_prompt_tp.sh to th…

9edda8c

…e default value 300s. Signed-off-by: tzhouam <[email protected]>

Merge branch 'main' of https://github.com/tzhouam/vllm-omni into main

0d39365

Merge branch 'main' into dev/control-init-timeout

0dd2912

hsliuustc0106 reviewed Dec 23, 2025


correct installation info

c0770f1

Signed-off-by: Taichang Zhou <[email protected]>
