Prebuilt Windows binaries of llama.cpp combining Multi-Token Prediction (MTP), TurboQuant KV cache compression, and native sm_120 support (Blackwell consumer GPUs, FP4 tensor cores). For RTX 5060 Ti / 5070 / 5080 / 5090.
Updated Jun 5, 2026
Run llama.cpp with Multi-Token Prediction and TurboQuant on Windows using native sm_120 Blackwell support for RTX 50-series GPUs.