int4_weight_only api got error when saving transformers models #1704
Open · jiqing-feng opened this issue Feb 12, 2025 · 3 comments · May be fixed by huggingface/transformers#36206
Comments
Same error on CUDA: we cannot save the model if we pass a layout object.

```python
import torch
from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
from torchao.dtypes import TensorCoreTiledLayout

model_name = "meta-llama/Llama-3.1-8B-Instruct"
device_map = "cuda:0"

# We support int4_weight_only, int8_weight_only and int8_dynamic_activation_int8_weight.
# More examples and documentation for arguments can be found at
# https://github.com/pytorch/ao/tree/main/torchao/quantization#other-available-quantization-techniques
quantization_config = TorchAoConfig("int4_weight_only", group_size=128, layout=TensorCoreTiledLayout())
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map=device_map,
    quantization_config=quantization_config,
)
quantized_model.save_pretrained("./llama3-8B-ao-int4", safe_serialization=False)
```

error:
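The failure can be reproduced without torchao at all: Python's `json` module cannot serialize arbitrary dataclass instances. A minimal sketch, using a hypothetical stand-in class in place of the real `TensorCoreTiledLayout`:

```python
import json
from dataclasses import dataclass


@dataclass
class TensorCoreTiledLayoutStub:
    """Hypothetical stand-in for torchao.dtypes.TensorCoreTiledLayout."""
    inner_k_tiles: int = 8


try:
    # json.dumps has no built-in encoding for dataclass instances
    json.dumps({"quant_type": "int4_weight_only", "layout": TensorCoreTiledLayoutStub()})
except TypeError as e:
    msg = str(e)
    print(msg)  # Object of type TensorCoreTiledLayoutStub is not JSON serializable
```

This is the same `TypeError` that surfaces from `save_pretrained`, which JSON-serializes the quantization config as its first step.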
andrewor14 added a commit to andrewor14/transformers that referenced this issue on Feb 14, 2025:
**Summary:** TorchAoConfig optionally contains a `torchao.dtypes.Layout` object, which is a dataclass and not JSON serializable, so the following fails:

```python
import json
from torchao.dtypes import TensorCoreTiledLayout
from transformers import TorchAoConfig

config = TorchAoConfig("int4_weight_only", layout=TensorCoreTiledLayout())
config.to_json_string()
json.dumps(config.to_dict())
```

This also causes `quantized_model.save_pretrained(...)` to fail, because the first step of that call is to JSON serialize the config. Fixes pytorch/ao#1704.

**Test Plan:** `python tests/quantization/torchao_integration/test_torchao.py -k test_json_serializable`
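The standard pattern for this kind of error is to teach the serializer how to encode the offending type. A hedged sketch of one such approach (the stub class and hook name are mine, not the actual transformers implementation), using the `default` hook of `json.dumps`:

```python
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class TensorCoreTiledLayoutStub:
    """Hypothetical stand-in for torchao.dtypes.TensorCoreTiledLayout."""
    inner_k_tiles: int = 8


def encode_layout(obj):
    """`default` hook for json.dumps: turn dataclass layouts into plain dicts."""
    if is_dataclass(obj) and not isinstance(obj, type):
        # Tag with the class name so the layout can be identified on load.
        return {"__type__": type(obj).__name__, **asdict(obj)}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


config_dict = {
    "quant_type": "int4_weight_only",
    "group_size": 128,
    "layout": TensorCoreTiledLayoutStub(),
}
serialized = json.dumps(config_dict, default=encode_layout)
print(serialized)
```

With the hook in place, the layout serializes as a nested JSON object instead of raising.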
Hi @jiqing-feng, thanks for reporting this. The issue is the
When I load an int4 CPU quantized model and try to save it, I get this error:
TypeError: Object of type Int4CPULayout is not JSON serializable
To reproduce it:
output:
I was wondering if we could switch to a more JSON-friendly data structure for saving the layout data.
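One "more friendly" structure, as suggested above, would be to store the layout as a plain dict carrying a type tag and its fields, and rebuild the layout object on load. A sketch under assumed names (the stub class, registry, and helpers are hypothetical, not torchao API):

```python
from dataclasses import asdict, dataclass


@dataclass
class Int4CPULayoutStub:
    """Hypothetical stand-in for torchao.dtypes.Int4CPULayout."""


# Map type tags back to layout classes when deserializing.
LAYOUT_REGISTRY = {"Int4CPULayoutStub": Int4CPULayoutStub}


def layout_to_dict(layout):
    # Store the class name alongside the dataclass fields.
    return {"__type__": type(layout).__name__, **asdict(layout)}


def layout_from_dict(d):
    data = dict(d)  # copy so the caller's dict is not mutated
    cls = LAYOUT_REGISTRY[data.pop("__type__")]
    return cls(**data)


d = layout_to_dict(Int4CPULayoutStub())
print(d)  # {'__type__': 'Int4CPULayoutStub'}
restored = layout_from_dict(d)
print(type(restored).__name__)  # Int4CPULayoutStub
```

The dict form is JSON serializable as-is, and the round trip recovers a layout instance of the original class.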