Added Support for Custom Quantization #35915

keetrap · 2025-01-27T17:49:16Z

This PR adds a new feature to support custom quantization in the Transformers library.

Did you read the contributor guideline?
Was this discussed/approved via a GitHub issue?

HuggingFaceDocBuilderDev · 2025-01-28T11:01:07Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

MekkCyber · 2025-01-28T11:05:54Z

Hi @keetrap, thanks for the PR! It looks great and it's a very handy feature for the community! I just left some very small nits.

MekkCyber · 2025-01-28T10:40:43Z

custom_quant_example.py

+@register_quantization_config("custom")
+class CustomConfig(QuantizationConfigMixin):
+    def __init__(self):
+        self.quant_method = "custom"
+        self.bits = 8
+
+    def to_dict(self) -> Dict[str, Any]:
+        output = {
+            "num_bits": self.bits,
+        }
+        return output
+


I am not sure the example should live here; maybe we can add some docs about it instead, wdyt @SunMarc?

Let's put it in the transformers/examples/quantization folder. Also, can you try to do a complete example, e.g. implementing an 8-bit quantization?

Can you also add documentation for this new feature to the quantization docs?
Maybe add a quick description in the overview docs and potentially update the following doc or create a new section called Custom Quantization.

Hey, I tried to add a complete example for 8-bit quantization, and it seems to be working fine as far as I know. However, since I’m still learning, it might be better if someone with more experience could add the example.
It would be easier to add the documentation if there's a complete working example available, as we could reference that in the docs. However, if you'd prefer me to continue with my example and create the documentation based on that, I will do it. Just let me know how you'd like to proceed.

Try to do a complete example for 8-bit quantization! We will help you review it if necessary =)
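
For readers wondering what the arithmetic behind "a complete example for 8-bit quantization" might involve, here is a minimal standalone sketch of symmetric (absmax) int8 quantization in plain Python. It is not the example that ended up in the PR and does not use any Transformers API; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric (absmax) quantization of floats to the int8 range [-127, 127]."""
    absmax = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) input, where any scale would do.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q_weights, q_scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, q_scale)
```

The largest-magnitude weight maps to plus or minus 127, and every restored value differs from its original by at most about half a quantization step. A real quantizer would typically work per tensor or per channel on framework tensors rather than on Python lists.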

MekkCyber · 2025-01-28T11:04:27Z

src/transformers/modeling_utils.py

+            try:
+                user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
+            except Exception:
+                user_agent["quant"] = hf_quantizer.quantization_config.quant_method
            # Force-set to `True` for more mem efficiency


Maybe it's better to use an if/else statement to avoid catching unintended exceptions, wdyt?

SunMarc

That's a nice solution, thanks for adding this. Left a couple of comments cc @ice-tong if you also want to have a look

SunMarc · 2025-01-28T11:19:09Z

src/transformers/modeling_utils.py

+            try:
+                user_agent["quant"] = hf_quantizer.quantization_config.quant_method.value
+            except Exception:
+                user_agent["quant"] = hf_quantizer.quantization_config.quant_method


Instead of a try/except, you can just check whether the value attribute exists.

I will update this.
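
A minimal sketch of the attribute check both reviewers suggest, assuming the quantization method is either an enum member (built-in methods) or a plain string (custom methods). `QuantMethod` and `quant_method_name` are hypothetical stand-ins for illustration, not the code that was merged.

```python
from enum import Enum


class QuantMethod(str, Enum):
    """Hypothetical stand-in for an enum of built-in quantization methods."""

    BNB = "bitsandbytes"


def quant_method_name(quant_method) -> str:
    # Enum members expose `.value`; plain strings (custom methods) do not,
    # so an explicit check avoids swallowing unrelated exceptions.
    if hasattr(quant_method, "value"):
        return quant_method.value
    return quant_method
```

Compared with a bare `except Exception`, this only branches on the one condition that matters, so an unexpected error elsewhere in the expression still surfaces.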

SunMarc · 2025-01-28T11:24:01Z

custom_quant_example.py

Let's put it in the transformers/examples/quantization folder. Also, can you try to do a complete example, e.g. implementing an 8-bit quantization?

SunMarc · 2025-01-28T11:30:17Z

custom_quant_example.py

Can you also add documentation for this new feature to the quantization docs?
Maybe add a quick description in the overview docs and potentially update the following doc or create a new section called Custom Quantization.

ice-tong · 2025-01-30T11:39:28Z

@SunMarc @MekkCyber @keetrap Sorry I'm just seeing this now - I've been on Chinese New Year vacation.

I ran into a problem with the QuantizationMethod enum not being extensible, so I couldn't come up with a good solution.

I think this PR is impressive and aligns well with my expectations. LGTM!
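
The extensibility problem mentioned above can be illustrated with a short sketch: a Python enum that already has members cannot be subclassed to add more, whereas a string-keyed registry filled by a decorator can accept new entries at any time. Everything here (`BuiltinMethod`, `CONFIG_REGISTRY`, `register_config`) is a hypothetical illustration of the general pattern, not the implementation in this PR.

```python
from enum import Enum
from typing import Callable, Dict


class BuiltinMethod(str, Enum):
    """Hypothetical enum of built-in methods."""

    INT8 = "int8"


# Hypothetical registry mapping a method name to its config class.
CONFIG_REGISTRY: Dict[str, type] = {}


def register_config(name: str) -> Callable[[type], type]:
    """Decorator that records a config class under a string key."""

    def decorator(cls: type) -> type:
        if name in CONFIG_REGISTRY:
            raise ValueError(f"Config '{name}' is already registered")
        CONFIG_REGISTRY[name] = cls
        return cls

    return decorator


@register_config("custom")
class CustomConfig:
    def __init__(self, bits: int = 8):
        self.quant_method = "custom"
        self.bits = bits


def can_subclass_enum() -> bool:
    """Return False if adding members by subclassing the enum is rejected."""
    try:
        class Extended(BuiltinMethod):
            CUSTOM = "custom"
    except TypeError:
        # Python refuses to extend an enum that already defines members.
        return False
    return True
```

With a registry like this, third-party code can add a new method name without touching the enum, which is why the method name may end up being a plain string rather than an enum member.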

SunMarc · 2025-02-05T12:51:45Z

Thanks for your input @ice-tong! @keetrap, I've merged a PR that requires you to make some updates to your code. Could you try to fix the conflicts? Thanks!

keetrap · 2025-02-05T16:25:35Z

@SunMarc @MekkCyber
I have resolved the conflict, and I do not think any change to the logic is required; correct me if I am wrong. It is working locally,
but tests are failing.
Can we merge this?

MekkCyber · 2025-02-18T15:13:56Z

Sorry for the delay @keetrap, the CI is green, will merge now.

keetrap added 5 commits January 27, 2025 23:08

Added Support for Custom Quantization

644f9e2

Merge branch 'main' into Custom_Quantization

9a59495

Update code

0b7cd98

Merge branch 'Custom_Quantization' of github.com:keetrap/transformers…

1a44b5f

… into Custom_Quantization

code reformatted

7a9ce1f

MekkCyber reviewed Jan 28, 2025

SunMarc reviewed Jan 28, 2025

keetrap added 2 commits January 28, 2025 17:18

Updated Changes

89d102a

Updated Changes

de6c7e7

keetrap added 2 commits February 5, 2025 20:57

Merge remote-tracking branch 'origin/main' into Custom_Quantization

97bb61d

Merge branch 'main' into Custom_Quantization

28b0ec7

Merge branch 'main' into Custom_Quantization

fd6da66

MekkCyber approved these changes Feb 18, 2025

MekkCyber merged commit 8eaae6b into huggingface:main Feb 18, 2025
23 checks passed

fxmarty-amd mentioned this pull request Feb 24, 2025

Support loading Quark quantized models in Transformers #36372

Merged
