A GPT-style language model built from scratch with Python, PyTorch, and tiktoken.
- Python 3.11–3.12 (3.13+ does not yet have PyTorch wheel support)
```shell
git clone https://github.com/masonentrican/DankGPT.git
cd DankGPT
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
```

Install PyTorch for your hardware, then install the project:
CUDA (NVIDIA):

```shell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install -e .
```

XPU (Intel):

```shell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu
pip install -e .
```

CPU or MPS (macOS):

```shell
pip install -e .
```

MPS is supported by the default PyTorch wheel on macOS.
The correct device is selected automatically at runtime (cuda > xpu > mps > cpu).
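As a rough illustration of that priority order (the function name here is illustrative, not the project's actual helper), device selection might look like:

```python
import torch

def pick_device() -> torch.device:
    """Pick the best available backend: cuda > xpu > mps > cpu."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.xpu is only present in recent PyTorch builds, so guard with hasattr
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
```

Each check falls through to the next, so on a machine with none of the accelerators the model simply runs on CPU.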
```shell
python scripts/prepare_data.py
```

Downloads example text into data/raw/.
```shell
python scripts/load_gpt2_model.py
```

Downloads GPT-2 124M weights from OpenAI into models/gpt2/.
```shell
python scripts/tests/test_train.py
```

Trains a small GPT-style model on the prepared dataset with checkpoint saving/loading.
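The checkpoint logic in test_train.py is project-specific, but the standard PyTorch pattern it relies on, saving model and optimizer state together so training can resume, looks roughly like this (using a stand-in `nn.Linear` instead of the project's GPT model):

```python
import torch
import torch.nn as nn

# Stand-in model and optimizer; the project uses its own GPT model here.
model = nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Save both state dicts in one file so training can resume exactly.
torch.save(
    {
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    "checkpoint.pt",
)

# Restore into fresh instances.
restored = nn.Linear(4, 4)
restored_opt = torch.optim.AdamW(restored.parameters(), lr=1e-3)
state = torch.load("checkpoint.pt")
restored.load_state_dict(state["model_state_dict"])
restored_opt.load_state_dict(state["optimizer_state_dict"])
```

Saving the optimizer state alongside the model matters for AdamW, whose per-parameter moment estimates would otherwise reset on resume.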
Run the full test suite:
```shell
python scripts/perform_tests.py
```

This executes all tests in scripts/tests/ and provides a summary. Available test categories:
- Model Components: Attention mechanisms (causal, multi-head, self-attention), transformer blocks, normalization layers, GPT model architecture
- Activation Functions: GELU vs ReLU comparisons
- Text Generation: Generation with temperature scaling and top-k sampling
- Training: Full training loop, loss calculation, checkpoint saving/loading
- Integration: Loading OpenAI GPT-2 weights, model size validation
- Configuration: Model configuration validation
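As context for the attention tests, a minimal single-head sketch of the causal (masked) scaled dot-product attention they exercise, simplified and not the project's actual implementation:

```python
import torch

def causal_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention where each position attends only to
    itself and earlier positions (the causal mask)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (batch, T, T)
    T = scores.size(-1)
    # Upper-triangular mask blocks attention to future positions.
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

x = torch.randn(2, 5, 8)          # (batch, seq_len, d_model)
out = causal_attention(x, x, x)   # same shape as x
```

Because position 0 can only attend to itself, its output is exactly its own value vector, which is an easy property for a unit test to check.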
Individual tests can be run directly, e.g. `python scripts/tests/test_gptmodel.py`.
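For the generation tests, the combination of temperature scaling and top-k sampling can be sketched as a single decoding step (an illustration, not the project's code):

```python
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 1.0, top_k: int = 50) -> int:
    """Sample one token id from raw logits using temperature scaling
    followed by top-k filtering. top_k must not exceed the vocab size."""
    logits = logits / temperature                 # <1 sharpens, >1 flattens
    top_vals, _ = torch.topk(logits, top_k)
    # Mask everything below the k-th largest logit before the softmax.
    logits = logits.masked_fill(logits < top_vals[-1], float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

logits = torch.randn(100)                         # fake vocab of 100 tokens
next_id = sample_next_token(logits, temperature=0.8, top_k=10)
```

With `top_k=1` this degenerates to greedy decoding, which makes the function easy to sanity-check.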
```
src/llm/     # library code (dataset, tokenizer, model, training)
scripts/     # run scripts (prepare_data, train)
data/raw/    # downloaded data (ignored in git)
configs/     # optional configs
```
- `data/raw/`, `.venv/`, and build artifacts are ignored by git
- Installed in editable mode via `pip install -e .`