GPTAQ

A neural network quantization framework based on GPTQ, with the addition of:

  • Activation quantization (RTN + weight reoptimization + token-wise)
  • Hessian eigenvalues in the sensitivity parameters
  • Cross-layer equalization
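
As a rough illustration of the token-wise RTN (round-to-nearest) activation quantization listed above, the sketch below fake-quantizes each token (row) with its own symmetric scale. This is a minimal sketch and not GPTAQ's own code: the function name `quantize_tokenwise_rtn`, the `bits` parameter, and the symmetric absmax scaling are assumptions made for illustration.

```python
def quantize_tokenwise_rtn(x, bits=8):
    """Fake-quantize activations with one symmetric scale per token.

    x    : list of tokens, each a list of floats (hypothetical layout).
    bits : integer bit width of the signed quantization grid (assumed).
    Returns dequantized values with the same shape as x.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    out = []
    for row in x:
        absmax = max(abs(v) for v in row)
        # An all-zero token needs no scaling; avoid dividing by zero.
        scale = absmax / qmax if absmax > 0 else 1.0
        quantized = []
        for v in row:
            q = round(v / scale)           # round to nearest grid point
            q = max(-qmax, min(qmax, q))   # clamp to the signed range
            quantized.append(q * scale)    # map back to floating point
        out.append(quantized)
    return out


# Usage: two tokens with very different magnitudes each get their own scale.
activations = [[0.1, -0.5, 0.25], [10.0, 2.0, -4.0]]
dequantized = quantize_tokenwise_rtn(activations, bits=8)
```

Using one scale per token keeps a single large-magnitude token from coarsening the grid for all the others, which is the usual motivation for token-wise over per-tensor scaling.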

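Cross-layer equalization is commonly formulated as a per-channel rescaling of two consecutive linear layers separated by a ReLU, chosen so that the weight ranges on both sides of each channel match while the network output stays the same. The sketch below follows that common formulation; whether GPTAQ uses exactly this rule is an assumption, and `equalize_pair` and `mlp_forward` are hypothetical names.

```python
import math


def equalize_pair(w1, b1, w2):
    """Equalize per-channel weight ranges of two consecutive linear layers.

    w1 : rows are output channels of layer 1.
    b1 : bias of layer 1, one entry per output channel.
    w2 : rows are output channels of layer 2; column i consumes channel i.
    Assumes a ReLU (or another positively homogeneous function) in between.
    """
    n = len(w1)
    r1 = [max(abs(v) for v in w1[i]) for i in range(n)]      # range of row i in w1
    r2 = [max(abs(row[i]) for row in w2) for i in range(n)]  # range of column i in w2
    # s_i = sqrt(r1_i / r2_i) makes both ranges equal to sqrt(r1_i * r2_i).
    s = [math.sqrt(r1[i] / r2[i]) if r1[i] > 0 and r2[i] > 0 else 1.0
         for i in range(n)]
    w1_eq = [[v / s[i] for v in w1[i]] for i in range(n)]
    b1_eq = [b1[i] / s[i] for i in range(n)]
    w2_eq = [[row[i] * s[i] for i in range(n)] for row in w2]
    return w1_eq, b1_eq, w2_eq


def mlp_forward(w1, b1, w2, x):
    """Tiny reference network: linear -> ReLU -> linear (no second bias)."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) for row in w2]


# Usage: channel 0 of w1 is large and channel 1 is small; w2 is the opposite.
w1 = [[4.0, -2.0], [0.1, 0.05]]
b1 = [1.0, -0.2]
w2 = [[0.5, 8.0], [-0.25, 2.0]]
w1_eq, b1_eq, w2_eq = equalize_pair(w1, b1, w2)
```

Because ReLU commutes with positive scaling, dividing a channel by `s[i]` before the activation and multiplying by `s[i]` afterwards leaves the function unchanged, so only the weight ranges seen by the quantizer differ.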

Algorithm

[Figure: GPTAQ algorithm]

Experiments

[Figure: Experiments]