
int4_weight_only API raises an error when saving transformers models #1704

Closed · huggingface/transformers#36206 · @jiqing-feng

Description


When I load an int4 CPU-quantized model and then try to save it, I get this error: TypeError: Object of type Int4CPULayout is not JSON serializable

To reproduce it:

import torch
from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
from torchao.dtypes import Int4CPULayout

model_name = "meta-llama/Llama-3.1-8B-Instruct"
# We support int4_weight_only, int8_weight_only and int8_dynamic_activation_int8_weight
# More examples and documentations for arguments can be found in https://github.com/pytorch/ao/tree/main/torchao/quantization#other-available-quantization-techniques
device_map = "cpu"
quantization_config = TorchAoConfig("int4_weight_only", group_size=128, layout=Int4CPULayout())
quantized_model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map=device_map, quantization_config=quantization_config)
quantized_model.save_pretrained("./llama3-8b-ao-int4", safe_serialization=False)

Output:

Traceback (most recent call last):
  File "/home/jiqingfe/test_torchao.py", line 11, in <module>
    quantized_model.save_pretrained("./llama3-8b-ao-int4", safe_serialization=False)
  File "/home/jiqingfe/transformers/src/transformers/modeling_utils.py", line 2800, in save_pretrained
    model_to_save.config.save_pretrained(save_directory)
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 419, in save_pretrained
    self.to_json_file(output_config_file, use_diff=True)
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 941, in to_json_file
    writer.write(self.to_json_string(use_diff=use_diff))
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 927, in to_json_string
    return json.dumps(config_dict, indent=2, sort_keys=True) + "\n"
  File "/usr/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/usr/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Int4CPULayout is not JSON serializable
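
For context, the failure does not depend on save_pretrained itself: json.dumps simply has no handler for the layout object that ends up inside the serialized quantization config. A minimal sketch of the same failure (the dict key here is only illustrative, not the exact config field name):

import json
from torchao.dtypes import Int4CPULayout

# json.dumps falls back to JSONEncoder.default, which raises TypeError
# for objects it does not know how to serialize, such as the layout.
json.dumps({"layout": Int4CPULayout()})
# TypeError: Object of type Int4CPULayout is not JSON serializable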

Could we switch to a more serialization-friendly data structure for storing the layout in the saved config?
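
One possible direction (a hedged sketch, not an existing torchao/transformers API): convert the layout to a plain dict of its class name and dataclass fields before it is written into the config, so json.dumps can handle it.

import dataclasses
from torchao.dtypes import Int4CPULayout

def layout_to_json_friendly(layout):
    # Hypothetical helper: represent the layout by its class name plus
    # its dataclass fields (if any), so json.dumps can handle the result
    # as long as the field values are simple types.
    fields = dataclasses.asdict(layout) if dataclasses.is_dataclass(layout) else {}
    return {"layout_type": type(layout).__name__, **fields}

print(layout_to_json_friendly(Int4CPULayout()))
# e.g. {'layout_type': 'Int4CPULayout'}

The reverse mapping (class name back to a layout instance) would still be needed when loading, but a dict like this at least lets the config round-trip through JSON.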
