Draft: Merge LoRA Adapters with AWQ BaseModels #2418
Whadup wants to merge 3 commits into huggingface:main from
Conversation
BenjaminBossan left a comment:
Thanks for adding merging capabilities to AWQ. I only skimmed the PR so far, but could you please:
- Also implement the `unmerge` method? It should be very similar to the `merge` method, but remove the delta weight.
- There should be a unit test to ensure that merging works, e.g. similar to this test (without DoRA).
- Let's run `make style` on your changes.
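The symmetry between `merge` and `unmerge` that the review asks for can be sketched as follows. This is a minimal illustration with hypothetical standalone functions (PEFT implements these as methods on the LoRA layer classes), using plain NumPy matrices in place of quantized weights:

```python
import numpy as np

def get_delta_weight(lora_A, lora_B, scaling):
    # The LoRA delta: scaling * B @ A, with shape (out_features, in_features).
    return scaling * (lora_B @ lora_A)

def merge(weight, lora_A, lora_B, scaling):
    # Merging folds the delta into the base weight.
    return weight + get_delta_weight(lora_A, lora_B, scaling)

def unmerge(weight, lora_A, lora_B, scaling):
    # Unmerging is the same operation, but it removes the delta weight.
    return weight - get_delta_weight(lora_A, lora_B, scaling)
```

Calling `unmerge` on a merged weight recovers the original base weight exactly (up to floating-point error), which is what a unit test for this behavior would assert.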
@BenjaminBossan Thanks for looking into it already! Your three points are on my agenda; I will give you a ping when I commit the changes.
Great, thanks a lot.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
@Whadup Do you still plan on working on this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
It's not quite clear to me, but it appears that AutoAWQ will be integrated into llm-compressor.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. |
This PR extends the AwqLoraLinear class to allow merging LoRA adapters into the quantized base weights. Instead of re-quantizing the whole model, we reuse the original quantization scales and zeros.
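The core idea of reusing the original scales and zeros can be sketched as below. This is a simplified illustration with hypothetical helper names: real AWQ layers pack 4-bit weights into int32 tensors with group-wise scales, which this sketch ignores in favor of plain per-column asymmetric quantization:

```python
import numpy as np

def quantize(w, scales, zeros, bits=4):
    # Asymmetric uniform quantization with FIXED per-column scales/zeros.
    q = np.clip(np.round(w / scales) + zeros, 0, 2**bits - 1)
    return q.astype(np.int32)

def dequantize(q, scales, zeros):
    # Inverse map back to floating point.
    return (q.astype(np.float64) - zeros) * scales

def merge_into_awq(qweight, scales, zeros, lora_A, lora_B, scaling):
    # 1. Dequantize with the ORIGINAL scales/zeros.
    w = dequantize(qweight, scales, zeros)
    # 2. Add the LoRA delta weight.
    w_merged = w + scaling * (lora_B @ lora_A)
    # 3. Re-quantize with the SAME scales/zeros -- no re-calibration
    #    of the quantization grid is performed.
    return quantize(w_merged, scales, zeros)
```

The trade-off of this approach is that the merged weights are rounded onto the existing quantization grid, so any part of the LoRA delta smaller than half a quantization step is lost; in exchange, merging is cheap and needs no calibration data.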