@nikolaystanishev commented Jul 25, 2025

Description

  • When patching key or value model heads, n_key_value_heads is now used instead of n_heads, since grouped-query attention models have fewer key/value heads than query heads.
  • When stacking the per-head results in the attention head patching methods, the key/value results are padded so that tensors with different head dimensions can be stacked (see the sketch below).

The problem is described in the corresponding issue.
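For context, here is a minimal sketch of the two changes above, assuming a grouped-query attention model where n_key_value_heads is smaller than n_heads; the config values, the n_heads_for helper, and the NaN pad value are illustrative, not the PR's actual code.

```python
import torch
import torch.nn.functional as F

# Illustrative GQA config: 8 query heads share 2 key/value heads.
n_layers, n_heads, n_key_value_heads = 4, 8, 2

def n_heads_for(component: str) -> int:
    # Key and value activations only have n_key_value_heads heads under
    # grouped-query attention, so iterate over that count, not n_heads.
    return n_key_value_heads if component in ("k", "v") else n_heads

# One patching result per (layer, head) for each attention component.
results = {c: torch.randn(n_layers, n_heads_for(c)) for c in ("q", "k", "v", "z")}

# torch.stack requires equal shapes, so pad the narrower key/value results
# along the head dimension up to n_heads before stacking.
padded = [
    F.pad(r, (0, n_heads - r.shape[-1]), value=float("nan"))
    for r in results.values()
]
stacked = torch.stack(padded)  # shape: (4 components, n_layers, n_heads)
```

Padding with NaN rather than zero keeps the nonexistent key/value head slots distinguishable from genuine zero patching effects.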

Fixes #980: [Bug Report] Error when patching key or value heads

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility
