
int4_weight_only API raises an error when saving transformers models #1704

Closed · huggingface/transformers#36206 · @jiqing-feng

Description


When I load an int4 CPU-quantized model and then try to save it, I get this error: TypeError: Object of type Int4CPULayout is not JSON serializable

To reproduce it:

import torch
from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
from torchao.dtypes import Int4CPULayout

model_name = "meta-llama/Llama-3.1-8B-Instruct"
# We support int4_weight_only, int8_weight_only and int8_dynamic_activation_int8_weight
# More examples and documentations for arguments can be found in https://github.com/pytorch/ao/tree/main/torchao/quantization#other-available-quantization-techniques
device_map = "cpu"
quantization_config = TorchAoConfig("int4_weight_only", group_size=128, layout=Int4CPULayout())
quantized_model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map=device_map, quantization_config=quantization_config)
quantized_model.save_pretrained("./llama3-8b-ao-int4", safe_serialization=False)

Output:

Traceback (most recent call last):
  File "/home/jiqingfe/test_torchao.py", line 11, in <module>
    quantized_model.save_pretrained("./llama3-8b-ao-int4", safe_serialization=False)
  File "/home/jiqingfe/transformers/src/transformers/modeling_utils.py", line 2800, in save_pretrained
    model_to_save.config.save_pretrained(save_directory)
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 419, in save_pretrained
    self.to_json_file(output_config_file, use_diff=True)
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 941, in to_json_file
    writer.write(self.to_json_string(use_diff=use_diff))
  File "/home/jiqingfe/transformers/src/transformers/configuration_utils.py", line 927, in to_json_string
    return json.dumps(config_dict, indent=2, sort_keys=True) + "\n"
  File "/usr/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/usr/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Int4CPULayout is not JSON serializable
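
For context, the failure does not depend on save_pretrained itself: json.dumps simply has no handler for the layout object that ends up inside the serialized quantization config. A minimal sketch of the same failure (the dict key here is only illustrative, not the exact config field name):

import json
from torchao.dtypes import Int4CPULayout

# json.dumps falls back to JSONEncoder.default, which raises TypeError
# for objects it does not know how to serialize, such as the layout.
json.dumps({"layout": Int4CPULayout()})
# TypeError: Object of type Int4CPULayout is not JSON serializable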

Could we switch to a more serialization-friendly data structure for storing the layout in the saved config?
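
One possible direction (a hedged sketch, not an existing torchao/transformers API): convert the layout to a plain dict of its class name and dataclass fields before it is written into the config, so json.dumps can handle it.

import dataclasses
from torchao.dtypes import Int4CPULayout

def layout_to_json_friendly(layout):
    # Hypothetical helper: represent the layout by its class name plus
    # its dataclass fields (if any), so json.dumps can handle the result
    # as long as the field values are simple types.
    fields = dataclasses.asdict(layout) if dataclasses.is_dataclass(layout) else {}
    return {"layout_type": type(layout).__name__, **fields}

print(layout_to_json_friendly(Int4CPULayout()))
# e.g. {'layout_type': 'Int4CPULayout'}

The reverse mapping (class name back to a layout instance) would still be needed when loading, but a dict like this at least lets the config round-trip through JSON.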
