[bugfix] ascend schedule encountered an incorrect req block length in… #2429

liziyu179 · 2025-08-19T02:30:51Z

… the check_watermark_for_prefill function

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@6603288

… the check_watermark_for_prefill function Signed-off-by: liziyu <[email protected]>

gemini-code-assist

Code Review

This PR fixes a bug in the watermark check by using the correct block length. The fix is correct in principle, but the implementation is not robust and can lead to crashes for new requests. I've added a critical comment with a suggested improvement to prevent potential KeyError and IndexError exceptions.

gemini-code-assist · 2025-08-19T02:31:49Z

vllm_ascend/core/scheduler.py

        req_blocks = self.kv_cache_manager.coordinator.get_blocks(
            request.request_id)
-        num_new_blocks = (num_required_blocks - len(req_blocks) -
+        num_new_blocks = (num_required_blocks - len(req_blocks[0]) -
                          len(computed_blocks))


The change from len(req_blocks) to len(req_blocks[0]) is conceptually correct for getting the number of allocated blocks. However, the current implementation is not robust and can lead to runtime errors:

self.kv_cache_manager.coordinator.get_blocks(request.request_id) will raise a KeyError if a request has no blocks allocated yet, which is the case for new requests from the waiting queue.

If get_blocks were to return an empty list for a new request, req_blocks[0] would then raise an IndexError.

This can cause the scheduler to crash. The implementation should be updated to handle these cases gracefully.

        req_blocks = self.kv_cache_manager.coordinator.req_to_blocks.get(
            request.request_id, [])
        num_allocated_blocks = len(req_blocks[0]) if req_blocks else 0
        num_new_blocks = (num_required_blocks - num_allocated_blocks -
                          len(computed_blocks))
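To make the two points of this review concrete, here is a minimal, self-contained sketch of the guarded computation. It does not use vLLM or vllm-ascend code. It assumes, as the `len(req_blocks[0])` fix implies, that a request's blocks are stored as one list per KV cache group, so `len(req_blocks)` counts groups rather than blocks. The names `ReqBlocks`, `num_new_blocks_needed`, and the plain `dict` standing in for the block table are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the block table: request id -> one list of
# block ids per KV cache group (assumed shape, not taken from vLLM).
ReqBlocks = Tuple[List[int], ...]


def num_new_blocks_needed(
    req_to_blocks: Dict[str, ReqBlocks],
    request_id: str,
    num_required_blocks: int,
    num_computed_blocks: int,
) -> int:
    """Return how many additional blocks a prefill would need."""
    # .get() with a default avoids a KeyError for requests with no entry yet.
    req_blocks = req_to_blocks.get(request_id, ())
    # Guard the [0] index so an empty container does not raise IndexError.
    num_allocated = len(req_blocks[0]) if req_blocks else 0
    return num_required_blocks - num_allocated - num_computed_blocks


table: Dict[str, ReqBlocks] = {"req-a": ([10, 11, 12],)}

# The original bug: len() of the outer container counts groups, not blocks.
print(len(table["req-a"]))     # 1
print(len(table["req-a"][0]))  # 3

print(num_new_blocks_needed(table, "req-a", 8, 2))    # 3
print(num_new_blocks_needed(table, "req-new", 8, 2))  # 6
```

A request with no entry is treated as having zero allocated blocks, so the watermark check can still run for requests coming from the waiting queue. Whether the real `get_blocks` raises or returns empty lists for such requests should be confirmed against the vLLM version in use.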

github-actions · 2025-08-19T02:32:29Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

[bugfix] ascend schedule encountered an incorrect req block length in…

1ff20d1

… the check_watermark_for_prefill function Signed-off-by: liziyu <[email protected]>
