This repository provides a modular, extensible framework for data preparation, fine-tuning, evaluation, and compression of large language models (LLMs).
- Modular Python package: `llm_training`
- Unified CLI: `llm-train` for data prep, training, and evaluation
- Configuration via JSON/YAML files with OmegaConf
- Support for tensorized model compression (Llama, Mixtral)
- Integrated WandB logging for experiment tracking
- Unit tests with pytest
- MIT licensed with community guidelines
- Install dependencies:

  ```bash
  pip install -e .
  ```

- Run data preparation:

  ```bash
  llm-train data-prep --config configs/data_prep_config.json
  ```

- Fine-tune a model:

  ```bash
  llm-train train --config configs/sft_config.json
  ```

- Evaluate a model:

  ```bash
  llm-train eval --config configs/evaluate_config.json
  ```
For model compression, see the scripts in `scripts/compression/`.
All workflows are configured via JSON or YAML files using OmegaConf. See `configs/` for examples. The config files specify model paths, hyperparameters, dataset paths, and training arguments.
- `data_prep_config.json` - Dataset preparation configuration
- `sft_config.json` - Supervised fine-tuning parameters
- `evaluate_config.json` - Evaluation benchmark settings
- `accelerate_config.yaml` - Multi-GPU training with Accelerate
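A supervised fine-tuning config might look roughly like the following. This is a hypothetical sketch: every key shown is an assumption about a typical SFT setup, not the exact schema in `configs/sft_config.json`.

```json
{
  "model_name_or_path": "meta-llama/Llama-2-7b-hf",
  "dataset_path": "data/sft_dataset.jsonl",
  "output_dir": "outputs/sft-run",
  "learning_rate": 2e-5,
  "num_train_epochs": 3,
  "per_device_train_batch_size": 4,
  "report_to": "wandb"
}
```

Consult the actual files in `configs/` for the supported keys.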
- `llm_training/` - Main Python package with core functionality
- `scripts/` - Standalone scripts for data prep and compression workflows
- `configs/` - Configuration files for different workflows
- `tests/` - Unit tests
Run all tests with:
```bash
pytest
```

Build and run in a reproducible environment:

```bash
docker build -t llm_training .
docker run -it llm_training
```

See CONTRIBUTING.md for guidelines. All contributions and issues are welcome!
See CHANGELOG.md for release history.
If you find this work useful, please cite it as follows:
```bibtex
@misc{sarkar2024llmtraining,
  author       = {Abhijoy Sarkar},
  title        = {LLM Training: A Modular Framework for Fine-tuning Large Language Models},
  year         = {2024},
  publisher    = {GitHub},
  journal      = {GitHub Repository},
  howpublished = {\url{https://github.com/acebot712/llm_training}},
}
```