
Quartet: Native FP4 Training Can Be Optimal for Large Language Models

This is the official code for the Quartet FP4 training paper (arXiv:2505.14669).

[UPDATE 28.09.25]: Quartet has been accepted to NeurIPS 2025!

[UPDATE 28.09.25]: Check out our latest work on MXFP4/NVFP4 post-training quantization (PTQ).

This work was presented in the GPU MODE lecture series; a recording is available on YouTube.

Quickstart

Create a conda environment and install dependencies (we recommend Python 3.11):

conda create -n env python=3.11
conda activate env

Install the requirements (we recommend installing torch from a channel matching your CUDA setup and compiling fast_hadamard_transform from source):

pip install -r requirements.txt
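
For example, on a CUDA 12.x machine one might use the commands below (illustrative only: the cu121 PyTorch wheel index and the Dao-AILab source repository for fast_hadamard_transform are assumptions, so adjust them to your setup):

pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install git+https://github.com/Dao-AILab/fast-hadamard-transform.git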

Run an end-to-end MXFP4 pseudo-quantization pre-training with:

bash main_setup.sh

The above command trains a 30M-parameter Llama-style model on 3B tokens.
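
To give an intuition for what "pseudo-quantization" means here, below is a minimal, self-contained sketch of MXFP4 fake quantization (quantize, then immediately dequantize, with all arithmetic kept in high precision). This is not the code path used by main_setup.sh; the function name mxfp4_fake_quant, the group size of 32, and the scale construction are illustrative assumptions based on the MXFP4 format (E2M1 elements sharing a power-of-two scale per group).

import torch

# Representable magnitudes of the FP4 E2M1 format
FP4_E2M1_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def mxfp4_fake_quant(x: torch.Tensor, group_size: int = 32) -> torch.Tensor:
    # Groups of `group_size` consecutive elements share one power-of-two scale;
    # each element is rounded to the nearest E2M1 value, then dequantized.
    assert x.numel() % group_size == 0, "illustrative sketch: pad inputs yourself"
    g = x.reshape(-1, group_size)
    amax = g.abs().amax(dim=-1, keepdim=True).clamp_min(1e-12)
    # Power-of-two group scale; 2 is the maximum exponent of E2M1
    scale = torch.exp2(torch.floor(torch.log2(amax)) - 2.0)
    scaled = (g / scale).clamp(-6.0, 6.0)
    grid = FP4_E2M1_GRID.to(x.device, x.dtype)
    # Round-to-nearest onto the FP4 magnitude grid (sign handled separately)
    idx = (scaled.abs().unsqueeze(-1) - grid).abs().argmin(dim=-1)
    q = torch.sign(scaled) * grid[idx]
    return (q * scale).reshape(x.shape)

x = torch.randn(4, 64)
print((x - mxfp4_fake_quant(x)).abs().mean())  # average rounding error

In pseudo-quantized training, a rounding of this kind is applied to tensors in the forward/backward pass while the matrix multiplications themselves stay in high precision, so the numerics of FP4 can be studied on hardware without FP4 tensor-core support.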

MXFP4 Kernels

Quartet kernels are released as part of the QuTLASS library and the FP-Quant training/inference add-on for transformers and vLLM.

To measure the speedups on an RTX 5090, install this QuTLASS commit and run this notebook.

QuTLASS also provides some kernels for B200, and we are working on a B300 implementation as well.

Cite This Work

@misc{castro2025quartetnativefp4training,
      title={Quartet: Native FP4 Training Can Be Optimal for Large Language Models}, 
      author={Roberto L. Castro and Andrei Panferov and Soroush Tabesh and Oliver Sieberling and Jiale Chen and Mahdi Nikdan and Saleh Ashkboos and Dan Alistarh},
      year={2025},
      eprint={2505.14669},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.14669}, 
}
