AsyaPronina
Contributor

Details:

  • Fix LLM inference on NPU for input prompt of length 1

Tickets:

  • N/A

@AsyaPronina AsyaPronina requested review from a team as code owners October 1, 2025 13:43
@github-actions github-actions bot added category: NPU OpenVINO NPU plugin category: NPUW NPUW plugin labels Oct 1, 2025
@dmatveev dmatveev added this to the 2025.4 milestone Oct 1, 2025
Comment on lines 545 to -546
uint32_t num_embeds_dim = 1 - batch_dim;
if (shape[num_embeds_dim] > max_generation_token_len) {
Contributor


Somehow I overlooked this in the past, but what is batch_dim? Can this 1 - x underflow to some huge positive value here?
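
To make the concern concrete, here is a minimal sketch of what the expression would do if batch_dim is an unsigned 32-bit integer. That type is an assumption: the quoted diff shows only that the result is stored in a uint32_t, not how batch_dim is declared. Both function names below are hypothetical and are not part of the plugin code.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper mirroring the expression under review.
// ASSUMPTION: batch_dim is a 32-bit unsigned value; the quoted diff
// does not show its declaration.
constexpr std::uint32_t num_embeds_dim_unchecked(std::uint32_t batch_dim) {
    // The literal 1 is converted to unsigned, so the subtraction is done
    // modulo 2^32 and wraps around whenever batch_dim > 1.
    return 1 - batch_dim;
}

// Hypothetical guarded variant: only 0 and 1 are accepted as batch_dim,
// so the result is always the "other" of the two leading dimensions.
constexpr std::optional<std::uint32_t> num_embeds_dim_checked(std::uint32_t batch_dim) {
    if (batch_dim > 1) {
        return std::nullopt;  // refuse to compute an index that would wrap
    }
    return 1 - batch_dim;
}

// Compile-time checks of the behaviour described above.
static_assert(num_embeds_dim_unchecked(0) == 1);
static_assert(num_embeds_dim_unchecked(1) == 0);
static_assert(num_embeds_dim_unchecked(2) == 4294967295u);  // wrapped: 2^32 - 1
static_assert(num_embeds_dim_checked(0).value() == 1);
static_assert(!num_embeds_dim_checked(2).has_value());
```

Under that assumption the expression is safe only while batch_dim is 0 or 1. Any larger value yields an index near 2^32, which would then be used in shape[num_embeds_dim]. The sketch needs C++17 for constexpr std::optional and message-less static_assert.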

3 participants