Pull requests: vllm-project/vllm
- [Bugfix] Fix PyNcclCommunicator device assertion for un-indexed CUDA devices (#21869), opened Jul 29, 2025 by CarlosArguilar.
- [DOC] Fix path of v1 related figures (#21868), opened Jul 29, 2025 by heheda12345. Labels: documentation, tpu.
- [Perf] Using __nv_fp8_e4m3 instead of c10::e4m3 for per_token_group_quant (#21867), opened Jul 29, 2025 by yewentao256.
- [BugFix] Fix interleaved sliding window not set for Gemma3n (#21863), opened Jul 29, 2025 by sarckk. Labels: ready.
- [Perf] Parallelize fill_bitmask to accelerate high-throughput guided decoding (#21862), opened Jul 29, 2025 by benchislett. Labels: structured-output, v1.
- [Test] Add Benchmark and Unit Test for per_token_group_quant (#21860), opened Jul 29, 2025 by yewentao256. Labels: performance.
- [Docs] Update docker.md with HF_TOKEN, new model, and podman fix (#21856), opened Jul 29, 2025 by mgoin. Labels: documentation, force-merge.
- [V0 Deprecation] [P/D] Move kv_connector/v1 to kv_connector (2/2) (#21855), opened Jul 29, 2025 by lk-chen. Labels: v1.
- [Performance] Eliminate unnecessary H2D copies in FlashInfer decode (#21854), opened Jul 29, 2025 by MatthewBonanni. Labels: v1.
- [Docs] Improve docs search experience by limiting code block height in search results (#21853), opened Jul 29, 2025 by mgoin. Labels: documentation.
- [Docs] Switch to better markdown linting pre-commit hook (#21851), opened Jul 29, 2025 by hmellor. Labels: ci/build, documentation, performance.
- [WIP] Add Kimi-Audio integration for vLLM (#21849), opened Jul 29, 2025 by HelloWorldU. Labels: new-model.
- [Bugfix] Fixing bug inside MultiModalProfiler (#21842), opened Jul 29, 2025 by shenoyvvarun. Labels: llama, multi-modality.
- [Bugfix][PD] set max_completion_tokens=1 if req has this value (#21841), opened Jul 29, 2025 by Abirdcfly. Labels: documentation.
- [Bugfix] Actually disable processing cache when API server is scaled out (#21839), opened Jul 29, 2025 by DarkLight1337. Labels: frontend, ready.
- [Bugfix] [Performance] DeepEPHighThroughput + DeepSeek: Quant and then Dispatch (#21837), opened Jul 29, 2025 by varun-sundar-rabindranath. Labels: deepseek.