Added Support for Custom Quantization #35915

Merged
merged 10 commits into from
Feb 18, 2025

keetrap
Contributor

@keetrap keetrap commented Jan 27, 2025

This PR adds a new feature to support custom quantization in the Transformers library.

Closes #35814

@SunMarc @MekkCyber

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MekkCyber
Contributor

Hi @keetrap, thanks for the PR! It looks great and it's a very handy feature for the community! I just left some very small nits.

Comment on lines 11 to 22
@register_quantization_config("custom")
class CustomConfig(QuantizationConfigMixin):
    def __init__(self):
        self.quant_method = "custom"
        self.bits = 8

    def to_dict(self) -> Dict[str, Any]:
        output = {
            "num_bits": self.bits,
        }
        return output

Contributor

I am not sure the example should live here; maybe we can add some docs about it instead. WDYT, @SunMarc?

Member

Let's put it in the transformers/examples/quantization folder. Also, can you try to write a complete example, e.g. implementing 8-bit quantization?

Member

Can you also document this new feature in the quantization docs?
Maybe add a quick description in the overview docs, and potentially update the following doc or create a new section called Custom Quantization.

Contributor Author

Hey, I tried to add a complete example for 8-bit quantization, and it seems to be working fine as far as I know. However, since I’m still learning, it might be better if someone with more experience could add the example.
It would be easier to add the documentation if there's a complete working example available, as we could reference that in the docs. However, if you'd prefer me to continue with my example and create the documentation based on that, I will do it. Just let me know how you'd like to proceed.

Member

Try to write a complete example for 8-bit quantization! We will help you review it if necessary =)
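
For readers following the thread, here is a minimal sketch of what an 8-bit scheme might look like, assuming simple symmetric (absmax) quantization over plain Python lists. It is not the example that was added in this PR, and `quantize_int8` and `dequantize_int8` are hypothetical names used only for illustration.

```python
def quantize_int8(values):
    """Symmetric (absmax) quantization of floats to integer codes in [-127, 127]."""
    absmax = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = absmax / 127 if absmax > 0 else 1.0
    # Round to the nearest code and clamp to the signed 8-bit range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Illustrative weights only; not taken from any model.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the usual rounding bound for this kind of scheme. A real integration would apply the same idea per tensor or per channel inside a quantizer class.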

Comment on lines 3644 to 3648
try:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
except Exception:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method
# Force-set to `True` for more mem efficiency
Contributor

Maybe it's better to use an if/else statement to avoid catching unintended exceptions, WDYT?

Member

@SunMarc SunMarc left a comment

That's a nice solution, thanks for adding this. I left a couple of comments. cc @ice-tong, in case you also want to have a look.

Comment on lines 3644 to 3647
try:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
except Exception:
    user_agent["quant"] = hf_quantizer.quantization_config.quant_method
Member

Instead of a try/except, you can just check whether the `value` attribute exists.
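
A minimal sketch of the attribute check suggested here, assuming built-in methods are enum members and custom methods are plain strings. `QuantMethod` and `quant_method_name` are hypothetical names for illustration, not the code that was merged.

```python
from enum import Enum


class QuantMethod(str, Enum):
    # Hypothetical stand-in for an enum of built-in quantization methods.
    BITS_AND_BYTES = "bitsandbytes"


def quant_method_name(quant_method):
    """Return the method name for an enum member or a plain string."""
    # Only the presence of `.value` is checked, so unrelated errors are
    # not silently swallowed the way a bare `except Exception` would.
    if hasattr(quant_method, "value"):
        return quant_method.value
    return quant_method


builtin_name = quant_method_name(QuantMethod.BITS_AND_BYTES)  # enum member
custom_name = quant_method_name("custom")                     # plain string
```

The same check can be written as `getattr(quant_method, "value", quant_method)` if a one-liner is preferred.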

Contributor Author

I will update this.

@ice-tong
ice-tong commented Jan 30, 2025

@SunMarc @MekkCyber @keetrap Sorry I'm just seeing this now - I've been on Chinese New Year vacation.

I ran into a problem with the QuantizationMethod enum not being extensible, so I couldn't come up with a good solution.

I think this PR is impressive and aligns well with my expectations. LGTM!
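
To illustrate the extensibility problem mentioned above: Python refuses to subclass an `Enum` that already defines members, whereas a string-keyed registry can be extended from user code. The sketch below uses hypothetical names (`Method`, `CONFIG_REGISTRY`, `register_config`) and is not the implementation in this PR.

```python
from enum import Enum


class Method(Enum):
    # Hypothetical enum of built-in methods.
    AWQ = "awq"


extend_error = None
try:
    # Subclassing an enum that already has members is rejected by Python.
    class ExtendedMethod(Method):
        CUSTOM = "custom"
except TypeError as err:
    extend_error = err

# A plain dict keyed by name can be extended out-of-tree instead.
CONFIG_REGISTRY = {}


def register_config(name):
    """Class decorator that records a config class under `name`."""
    def decorator(cls):
        if name in CONFIG_REGISTRY:
            raise ValueError(f"Config '{name}' is already registered")
        CONFIG_REGISTRY[name] = cls
        return cls
    return decorator


@register_config("custom")
class CustomConfig:
    quant_method = "custom"
```

With a registry like this, a lookup by the `quant_method` string can resolve either a built-in or a user-registered class without touching the enum.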

@SunMarc
Member

SunMarc commented Feb 5, 2025

Thanks for your input, @ice-tong! @keetrap, I've merged a PR that requires some updates to your code. Could you try to fix the conflicts? Thanks!

@keetrap
Contributor Author

keetrap commented Feb 5, 2025

@SunMarc @MekkCyber
I have resolved the conflict, and I do not think any change to the logic is required; correct me if I am wrong. It works locally,
but the tests are failing.
Can we merge this?

@MekkCyber
Contributor

Sorry for the delay, @keetrap. The CI is green, so I will merge now.

@MekkCyber MekkCyber merged commit 8eaae6b into huggingface:main Feb 18, 2025
23 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support register customize quantization method out-of-tree
5 participants