Description
Motivation.
This living page describes the roadmap to the v0.12.0 release of vllm-omni, which accompanies vLLM v0.12.0. Items marked 🙋 are areas where the committer group is seeking more dedicated contributions.
Proposed Change.
CI/CD
P0: E2E test
- online serving
- Qwen3-Omni [CI] Qwen3-Omni online test #257
- Z-Image & Qwen-Image DALL-E compatible image generation endpoint #292
- offline serving
- Qwen2.5-Omni [E2E] Add Qwen2.5-Omni model test with OmniRunner #168
- Qwen3-Omni [CI] Add Qwen3-omni offline UT #216
- Z-Image [CI] add diffusion ci #174
P1: UT/ST and CI improvements
- UT/ST for current and new features.
- CI workflows for NPU/ROCm, etc.
- CI for wheel package compilation. [Feature]: CI for wheel package compilation #238 @congw729
- CI improvements [ci] Refactor CI files to use new CI pipeline generator #246
Model Support 🙋
P0:
- MiMo-Audio [New Model]: MiMo-Audio from Xiaomi #151
- HunyuanImage-3.0 [New Model]: Add HunYuanImage3.0 #42
- Bagel [New Model]: ByteDance-Seed/BAGEL-7B-MoT #203 [New Model]Bagel model(Diffusion Only) #319
P1:
- LongCat-Flash-Omni [New Model]: LongCat-Flash-Omni #213
- Step-Audio2 [New Model]: Step Audio 2 #271
- Step-Audio-EditX [New Model]: Step Audio EditX #272
- MammothModa2-Preview [New Model]: bytedance-research/MammothModa2-Preview #314
Docs Refinement
P0:
- vLLM-Omni main architecture update arch overview #258
- how to add a new model into vLLM-Omni @R2-Y @ywang96
- EntryPoints design @fake0fan
- AR module design @Gaohan123
- DiT module design @SamitHuang
- Cache acceleration & Attention backend @ZJY0516 @SamitHuang
Core 🙋
P0:
- Support streaming input and output for both offline and online inference. @fake0fan @Gaohan123
- streaming input [Feature] add session based streaming input support to v1 vllm#28973
- streaming output @Gaohan123
- Flexible and robust input processing for mixed modalities. (e.g. `use_audio_in_video`) @ywang96
- Flexible output modality control and support vllm cli args for online serving [RFC]: Support parameter-controlled audio response via vllm serve for Qwen-Omni series #162 @Gaohan123
- Support single card deployment and memory profiling with auto gpu mem util. [RFC]: Automatic GPU Mem Utilization Tuning #160 @tzhouam
- Endpoints
- /v1/images/generation [Feature]: API - OpenAI API for image generation #197
- /v1/audio/speech [Feature]: Add openai entrypoint for /v1/audio/speech #218 add openai create speech endpoint #305
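As a sketch of how the planned image-generation endpoint might be exercised once it lands, the snippet below builds a request body whose field names mirror OpenAI's images API; the actual vllm-omni schema may differ once #197 is merged, so treat every field here as an assumption:

```python
import json

def build_image_request(prompt: str, model: str = "Z-Image",
                        size: str = "1024x1024", n: int = 1) -> str:
    """Build a JSON body for the planned /v1/images/generation endpoint.

    Field names follow OpenAI's images API by analogy; the confirmed
    vllm-omni schema is defined by #197, not by this sketch.
    """
    return json.dumps({"model": model, "prompt": prompt, "size": size, "n": n})

# A client would POST this body to http://<host>:8000/v1/images/generation
body = build_image_request("a watercolor fox in a forest")
```

Keeping the request shape OpenAI-compatible lets existing OpenAI client SDKs talk to the endpoint unchanged.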
P1:
- Abstract request related state updates away from model implementation.
- Support async computation and communication across stages by chunks. @R2-Y [RFC]: Support async computation and communication across stages by chunks #268
Disaggregation
P0:
- Support basic OmniConnector for disaggregated stages within one node. [Feature] Omni Connector + ray supported #215
Mode:
P0:
- (EPD)G
P1:
- E(PD)G
- EPDG
Model adaptation:
- Bagel
- HunyuanImage-3.0
- Qwen3-Omni & LongCat-Omni
Hardware:
P0:
- plugin platform abstraction for multiple hardware registry.
Benchmark 🙋
- Implement `vllm benchmark --omni` for offline serving benchmarks comparing against HF [Benchmark] Benchmark Running Samples for Qwen3 Omni and Dataset Preparation #212
- support both online and offline benchmarks [RFC]: DiT models Performance benchmark(T2I/I2I/T2V/TI2V) #344
- t2i
- t2v
- i2v
- ti2v
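A rough sketch of the kind of per-modality latency measurement an offline benchmark harness could report. The `lambda` request stubs below stand in for actual t2i/t2v model stages and are purely illustrative:

```python
import time
from typing import Callable, Dict

def benchmark(run_once: Callable[[], None], iters: int = 5) -> Dict[str, float]:
    """Time a single-request callable and report mean/max latency in seconds."""
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        latencies.append(time.perf_counter() - start)
    return {"mean_s": sum(latencies) / len(latencies), "max_s": max(latencies)}

# Example: compare hypothetical t2i vs t2v request stubs.
results = {
    "t2i": benchmark(lambda: time.sleep(0.001)),
    "t2v": benchmark(lambda: time.sleep(0.002)),
}
```

A real harness would additionally report throughput and quality metrics against the HF baseline per modality (t2i/t2v/i2v/ti2v).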
vLLM alignment and verification: 🙋
P0:
P1:
- caching
- parallelism
- lora
- multimodal input processing
- PD/EPD disaggregation
Refactor 🙋
P0:
- vLLM 0.12.0 alignment after CI prepared [Rebase] Rebase to vllm 0.12.0 #335
- Stage configs and model implementation optimization and simplification. [RFC]: simplify model stage config for end users #74
P1:
- Simple and Unified init and running arguments setting for both offline and online inference. @tzhouam
- Unified implementation of stage_worker across offline, async online and multi-node. @Gaohan123
For diffusion support, please check the separate issue #85
Feedback Period.
No response
CC List.
@Gaohan123 @ywang96 @Isotr0py @DarkLight1337 @david6666666 @ZJY0516
Any Other Things.
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.