Mini-LLaMA2: A Compact Implementation of LLaMA2

This project is a compact version of Llama2, a leading open-source language model. It was originally developed as an exercise for Carnegie Mellon University's CS11-711 Advanced NLP course. The project is now open-source and welcomes contributions from the community.

Project Overview

Mini-LLaMA2 is a streamlined version of the Llama2 language model. It supports three tasks: text completion, zero-shot prompt-based sentiment analysis, and task-specific finetuning.

The text completion functionality takes a sentence as input and generates a coherent, grammatically correct English continuation, one token at a time, conditioned on the text so far.
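As a rough illustration of how such a completion loop can work, the sketch below samples a continuation token by token. The function names, the `step_fn` callback (a stand-in for one forward pass of the model), and the temperature parameter are assumptions made for this example; they are not taken from run_llama.py.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature, rng):
    """Choose the next token id from one step of logits."""
    if temperature == 0.0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(step_fn, prompt_ids, max_new_tokens, temperature, seed=0):
    """Append max_new_tokens sampled ids to the prompt.

    step_fn is a hypothetical callback: it receives the ids so far and
    returns a list of next-token logits (one score per vocabulary entry).
    """
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        ids.append(sample_next_token(step_fn(ids), temperature, rng))
    return ids


def toy_step(ids):
    """Toy stand-in for the model: a fixed score for each of 3 tokens."""
    return [0.0, 1.0, 3.0]


greedy_ids = generate(toy_step, [0], max_new_tokens=3, temperature=0.0)
sampled_ids = generate(toy_step, [0], max_new_tokens=3, temperature=1.0)
```

With a temperature of 0 the loop is deterministic, while higher temperatures trade some coherence for variety.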

The zero-shot, prompt-based sentiment analysis functionality enables the model to perform sentiment analysis on two datasets (SST-5 and CFIMDB) without any task-specific finetuning. This demonstrates the model's ability to generalize its understanding of language to new tasks.
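One common way to do prompt-based classification, sketched here under stated assumptions, is to append a prompt to the input, read the model's next-token scores, and compare only the scores of the candidate label words. The function name, the dictionary input format, and the label words are hypothetical; the project's actual prompt and scoring may differ.

```python
import math


def classify_zero_shot(next_token_logprobs, label_words):
    """Pick the label word the model finds most likely after the prompt.

    next_token_logprobs: dict mapping a candidate next word to its
    log-probability (a hypothetical interface to the model's output).
    label_words: the words that stand for each class.
    """
    scores = {w: next_token_logprobs.get(w, float("-inf")) for w in label_words}
    best_score = max(scores.values())
    if best_score == float("-inf"):
        raise ValueError("none of the label words were scored by the model")
    # Renormalize over the label words only, ignoring the rest of the vocabulary.
    z = sum(math.exp(s - best_score) for s in scores.values())
    probs = {w: math.exp(s - best_score) / z for w, s in scores.items()}
    return max(probs, key=probs.get), probs


# Toy scores standing in for real model output.
toy_logprobs = {"positive": -1.0, "negative": -2.5, "the": -0.5}
label, label_probs = classify_zero_shot(toy_logprobs, ["positive", "negative"])
```

Restricting the comparison to label words keeps an unrelated but likely token (here "the") from being returned as the prediction.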

The task-specific finetuning functionality trains the model on labeled data for a specific task, which yields much stronger classification results than zero-shot prompting.
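To make the finetuning idea concrete, here is a minimal sketch of a linear classification head trained with cross-entropy loss and one plain gradient-descent step. It is written in pure Python and is not the code in classifier.py or optimizer.py. The hidden vector, the five-class setup, the learning rate, and updating only the head are all assumptions for illustration.

```python
import math


def linear(hidden, weight, bias):
    """Map a hidden vector to one logit per class."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weight, bias)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the correct class."""
    return -math.log(softmax(logits)[label])


def sgd_step(hidden, weight, bias, label, lr):
    """One gradient-descent update of the head on a single example."""
    probs = softmax(linear(hidden, weight, bias))
    # Gradient of the loss with respect to the logits: probs - one_hot(label).
    grad = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    new_weight = [[w - lr * g * h for w, h in zip(row, hidden)]
                  for row, g in zip(weight, grad)]
    new_bias = [b - lr * g for b, g in zip(bias, grad)]
    return new_weight, new_bias


hidden = [1.0, -2.0, 0.5]            # toy stand-in for a sentence representation
weight = [[0.0] * 3 for _ in range(5)]  # 5 classes, assumed for this example
bias = [0.0] * 5
label = 2

loss_before = cross_entropy(linear(hidden, weight, bias), label)
weight, bias = sgd_step(hidden, weight, bias, label, lr=0.1)
loss_after = cross_entropy(linear(hidden, weight, bias), label)
```

The loss on the example drops after the update, which is the basic mechanism finetuning repeats over the whole training set.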

The model loads pretrained weights from stories42M.pt, an 8-layer, 42M-parameter language model pretrained on the TinyStories dataset. All three tasks therefore start from a pretrained checkpoint rather than from random initialization.
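As a sanity check on the 42M figure, the sketch below counts the parameters of a Llama-style model. Only the layer count (8) comes from the text above. The other hyperparameters (embedding size 512, vocabulary size 32000, feed-forward hidden size 1376, tied input and output embeddings, no bias terms) are assumptions about the checkpoint and should be checked against its actual configuration.

```python
def llama_param_count(dim, n_layers, vocab_size, hidden_dim, tie_embeddings=True):
    """Count weights in a Llama-style decoder (no bias terms assumed)."""
    embed = vocab_size * dim          # token embedding table
    attn = 4 * dim * dim              # query, key, value, and output projections
    ffn = 3 * dim * hidden_dim        # gate, up, and down projections
    norms = 2 * dim                   # two RMSNorm weight vectors per layer
    per_layer = attn + ffn + norms
    final_norm = dim
    output = 0 if tie_embeddings else vocab_size * dim
    return embed + n_layers * per_layer + final_norm + output


# Assumed hyperparameters; only n_layers=8 is stated in this README.
total = llama_param_count(dim=512, n_layers=8, vocab_size=32000, hidden_dim=1376)
print(f"{total:,} parameters")  # 41,689,600 parameters
```

Under these assumptions the count comes to about 41.7M, consistent with the "42M" in the checkpoint name.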

Key components of the model are housed in run_llama.py, llama.py, classifier.py, and optimizer.py. These components include the main model architecture, the classification head, and the optimization algorithm, all of which are crucial to the model's functionality.

In summary, the project provides three main functionalities:

Text completion: Given a sentence, the model generates a coherent, grammatical English continuation.
Zero-shot, prompt-based sentiment analysis: The model performs sentiment analysis on two datasets (SST-5 and CFIMDB) without any task-specific finetuning.
Task-specific finetuning: The model is finetuned for a specific task, providing much stronger classification results.

Getting Started

To set up the environment and install dependencies, follow the instructions in setup.sh.

The main code can be found in run_llama.py. Other important components are in llama.py, classifier.py, and optimizer.py.

Usage

You can run the model with the following command, choosing one of generate, prompt, or finetune as the value of --option:

python3 run_llama.py --option [generate/prompt/finetune] --epochs 5 --lr 2e-5 --train data/sst-train.txt --dev data/sst-dev.txt --test data/sst-test.txt
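For readers unfamiliar with the flag notation, the sketch below shows how these flags could be declared with argparse. It is a hypothetical reconstruction from the command above, not the parser in run_llama.py. The flag types and the defaults (copied from the example values) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Sketch of the run_llama.py flags")
    # Exactly one of the three modes must be selected.
    parser.add_argument("--option", choices=["generate", "prompt", "finetune"],
                        required=True)
    parser.add_argument("--epochs", type=int, default=5)    # assumed default
    parser.add_argument("--lr", type=float, default=2e-5)   # assumed default
    parser.add_argument("--train", type=str, default=None)  # training data path
    parser.add_argument("--dev", type=str, default=None)    # development data path
    parser.add_argument("--test", type=str, default=None)   # test data path
    return parser


args = build_parser().parse_args([
    "--option", "finetune",
    "--epochs", "5",
    "--lr", "2e-5",
    "--train", "data/sst-train.txt",
    "--dev", "data/sst-dev.txt",
    "--test", "data/sst-test.txt",
])
```

Declaring --option with a fixed set of choices makes an invalid mode fail at startup rather than partway through a run.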

Contributing

We welcome contributions to this project. If you have a feature request, bug report, or proposal for code improvements, please open an issue or submit a pull request.

Acknowledgement

This code is based on llama2.c by Andrej Karpathy. Parts of the code are also from the transformers library (Apache License 2.0).

Repository: YEthYuan/mini-Llama2