
DankGPT

A GPT-style language model built from scratch with Python, PyTorch, and tiktoken.

Requirements

  • Python 3.11–3.12 (PyTorch does not yet publish wheels for 3.13+)

Installation

git clone https://github.com/masonentrican/DankGPT.git
cd DankGPT

python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate

Install PyTorch for your hardware, then install the project:

NVIDIA GPU (CUDA)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install -e .

Intel Arc GPU (XPU)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu
pip install -e .

CPU only

pip install -e .

Apple Silicon (MPS)

pip install -e .

MPS is supported by the default PyTorch wheel on macOS.

The correct device is selected automatically at runtime (cuda > xpu > mps > cpu).
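
The priority order above can be sketched as follows. This is an illustrative, dependency-free version of the selection logic: in PyTorch the availability checks would be calls like torch.cuda.is_available() and torch.backends.mps.is_available(), so here the set of available backends is passed in explicitly (the function name `pick_device` is hypothetical, not the project's API):

```python
def pick_device(available):
    """Return the highest-priority available backend (cuda > xpu > mps > cpu)."""
    for name in ("cuda", "xpu", "mps"):
        if name in available:
            return name
    # CPU is always available as the fallback
    return "cpu"
```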

Prepare Data

python scripts/prepare_data.py

Downloads example text into data/raw/.

Download OpenAI Weights (Optional)

python scripts/load_gpt2_model.py

Downloads GPT-2 124M weights from OpenAI into models/gpt2/.

Train Model

python scripts/tests/test_train.py

Trains a small GPT-style model on the prepared dataset with checkpoint saving/loading.
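
The checkpoint save/load pattern works roughly like this. In the real project the state would be model and optimizer state dicts persisted with torch.save; this sketch uses JSON and illustrative keys so the resume logic stands on its own:

```python
import json
from pathlib import Path


def save_checkpoint(path, step, loss_history):
    """Persist enough state to resume training at `step`."""
    Path(path).write_text(json.dumps({"step": step, "losses": loss_history}))


def load_checkpoint(path):
    """Restore saved state, or start fresh if no checkpoint exists yet."""
    p = Path(path)
    if not p.exists():
        return {"step": 0, "losses": []}
    return json.loads(p.read_text())
```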

Testing

Run the full test suite:

python scripts/perform_tests.py

This executes all tests in scripts/tests/ and provides a summary. Available test categories:

  • Model Components: Attention mechanisms (causal, multi-head, self-attention), transformer blocks, normalization layers, GPT model architecture
  • Activation Functions: GELU vs ReLU comparisons
  • Text Generation: Generation with temperature scaling and top-k sampling
  • Training: Full training loop, loss calculation, checkpoint saving/loading
  • Integration: Loading OpenAI GPT-2 weights, model size validation
  • Configuration: Model configuration validation
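
For reference, the GELU-vs-ReLU comparison in the activation tests contrasts functions like these. This is a minimal sketch using the tanh approximation of GELU popularized by GPT-2, not the project's actual module code:

```python
import math


def gelu(x):
    # tanh approximation of GELU, the variant used in GPT-2
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def relu(x):
    # ReLU zeroes all negative inputs; GELU lets small negatives through
    return max(0.0, x)
```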

Individual tests can be run directly, e.g., python scripts/tests/test_gptmodel.py.
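
The temperature scaling and top-k sampling exercised by the generation tests can be sketched in plain Python as below. The function name and list-based interface are illustrative (the project operates on PyTorch tensors), but the technique is the same: divide logits by the temperature, keep only the k highest, softmax, then sample:

```python
import math
import random


def sample_next(logits, temperature=1.0, top_k=None):
    """Sample a token index from logits with temperature and optional top-k."""
    scaled = [l / temperature for l in logits]
    if top_k is not None:
        # mask everything below the k-th largest logit
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    # numerically stable softmax
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # draw an index by cumulative probability
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

With top_k=1 this reduces to greedy decoding; raising the temperature flattens the distribution and makes sampling more diverse.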

Project Structure

src/llm/     # library code (dataset, tokenizer, model, training)
scripts/     # run scripts (prepare_data, train)
data/raw/    # downloaded data (ignored in git)
configs/     # optional configs

Notes

  • data/raw/, .venv/, and build artifacts are ignored by git
  • Installed in editable mode via pip install -e .
