
Draft: Merge LoRA Adapters with AWQ BaseModels #2418

Closed
Whadup wants to merge 3 commits into huggingface:main from Whadup:awq-lora-merge


Conversation


@Whadup Whadup commented Mar 10, 2025

This PR extends the AwqLoraLinear class to allow merging LoRA adapters into the quantized base weights.
Instead of re-quantizing the whole model, we reuse the original quantization scales and zero points.
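The approach described above can be sketched as follows. This is a minimal NumPy illustration of the idea, not the actual AwqLoraLinear implementation: the function names, the per-row layout of scales and zero points, and the asymmetric round-to-nearest quantization scheme are all assumptions for the sake of the sketch.

```python
import numpy as np

def quantize(w, scales, zeros, n_bits=4):
    # Asymmetric round-to-nearest quantization with FIXED scales/zeros
    # (no recalibration), as assumed in this sketch.
    q = np.clip(np.round(w / scales + zeros), 0, 2 ** n_bits - 1)
    return q.astype(np.int32)

def dequantize(q, scales, zeros):
    # Inverse mapping back to floating point.
    return (q - zeros) * scales

def merge_lora(q_weight, scales, zeros, lora_A, lora_B, scaling, n_bits=4):
    # Dequantize with the original parameters, add the LoRA delta
    # (scaling * B @ A), then requantize with the SAME scales/zeros,
    # so the whole model never needs to be re-calibrated.
    w = dequantize(q_weight, scales, zeros)
    w_merged = w + scaling * (lora_B @ lora_A)
    return quantize(w_merged, scales, zeros, n_bits)
```

Because the scales and zero points are kept fixed, the merged delta is subject to the same rounding as the base weights, so the merge is only exact up to quantization error.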

@Whadup Whadup changed the title Merge LoRA Adapters with AWQ BaseModels [Experimental] Draft: Merge LoRA Adapters with AWQ BaseModels Mar 10, 2025
Member

@BenjaminBossan BenjaminBossan left a comment


Thanks for adding merging capabilities to AWQ. I only skimmed the PR so far, but could you please:

  • Also implement the unmerge method? It should be very similar to the merge method, but remove the delta weight
  • There should be a unit test to ensure that merging works, e.g. similar to this test (without DoRA).
  • Let's run make style on your changes.
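The requested unmerge step would be the inverse of merging: subtract the LoRA delta instead of adding it, then requantize with the stored parameters. A minimal NumPy sketch under the same hypothetical quantization scheme (fixed per-row scales and zero points, asymmetric round-to-nearest; names and layout are assumptions, not the actual peft API):

```python
import numpy as np

def unmerge_lora(q_weight, scales, zeros, lora_A, lora_B, scaling, n_bits=4):
    # Dequantize with the stored scales/zero points, subtract the LoRA
    # delta, and requantize with the same parameters. Note that because
    # merging already rounded the weights, merge followed by unmerge only
    # recovers the original quantized weights up to quantization error.
    w = (q_weight - zeros) * scales
    w = w - scaling * (lora_B @ lora_A)
    q = np.clip(np.round(w / scales + zeros), 0, 2 ** n_bits - 1)
    return q.astype(np.int32)
```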

@Whadup
Author

Whadup commented Mar 11, 2025

@BenjaminBossan Thanks for looking into it already! Your three points are on my agenda; I will ping you when I commit the changes.

@BenjaminBossan
Member

Great, thanks a lot.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

@Whadup Do you still plan on working on this?

@github-actions

github-actions bot commented May 5, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

It's not quite clear to me, but it appears that AutoAWQ will be integrated into llm-compressor:

AutoAWQ Integration: Perform low-bit weight-only quantization efficiently using AutoAWQ, now part of LLM Compressor. Note: This integration should be considered experimental for now.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions bot closed this Jun 8, 2025