LLMs on any GPU: True Open Source to run LLMs #670
-
The reason nobody is answering is that everyone is too busy working on the details... and that "everybody" is not as many people as you think. I think it's common knowledge that everyone who knows about this agrees with you, but far fewer people are able to make it happen. This project is just one of many that contribute to making it easier to support other hardware.

The smart thing Nvidia did here was to create tooling that makes their hardware much easier to use, and many if not most people who have specialized in data science are not going to spend time becoming experts in compilers and hardware if there is already a way for them to continue with the work they are most interested in. The "invisible hand" of the market worked here: it drove the cost of Nvidia hardware up enough that it started making economic sense for more people to build support for alternative hardware, and there are a number of projects doing this. The HIP fork of llm.c, PyTorch, and tinygrad are just some examples that already work with some AMD GPUs.

You might not realize or believe it, but within maybe two or three years you could skill yourself up to the point of contributing to things like this too, maybe even sooner. Just start coding: try to get a computer to do something that it can't do, however silly or small; just make it do something that will amuse you. At some point that might make mathematics more interesting to you, which might in turn make coding more interesting again. That is just one of the many things likely to happen along the way. If nothing else, it will be almost impossible to do this without it helping you to understand and appreciate more about the world we live in!
-
Very timely... I'm preparing to announce (probably early next week) a project I've been working on to address this. It's called gpu.cpp: a minimalist library that makes portable GPU compute with C++ simple, using the WebGPU API specification as a portable low-level GPU interface: https://github.com/AnswerDotAI/gpu.cpp/tree/main

I think the scope of llm.c is probably already settled on CUDA, but one of my short-term goals is to port the llm.c CUDA kernels to WebGPU and make them available as part of the library. I was thinking of submitting a PR to add a link to the related projects section of the README, if Andrej is okay with including it there.
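For a sense of what this looks like, here is a minimal sketch of a kernel dispatch with gpu.cpp, adapted from the project's hello-world GELU example. Names like `createContext`, `createTensor`, `createKernel`, `dispatchKernel`, `toCPU`, and the `gpu.hpp` header reflect my reading of the library at the time of writing and may differ from the current API:

```cpp
#include <array>
#include <cstdio>
#include <future>

#include "gpu.hpp" // gpu.cpp header; name may differ across versions

using namespace gpu;

// WGSL compute shader: elementwise tanh-approximation GELU over an array.
// WGSL is the shader language of the WebGPU spec, so the same kernel source
// runs on Vulkan, Metal, and DirectX backends via the WebGPU implementation.
static const char *kGelu = R"(
const GELU_SCALING_FACTOR: f32 = 0.7978845608028654; // sqrt(2.0 / PI)
@group(0) @binding(0) var<storage, read_write> inp: array<f32>;
@group(0) @binding(1) var<storage, read_write> out: array<f32>;
@compute @workgroup_size(256)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
    let i: u32 = gid.x;
    if (i < arrayLength(&inp)) {
        let x: f32 = inp[i];
        out[i] = 0.5 * x * (1.0 + tanh(GELU_SCALING_FACTOR
                 * (x + 0.044715 * x * x * x)));
    }
}
)";

int main() {
  constexpr size_t N = 3072;
  Context ctx = createContext();

  std::array<float, N> inputArr, outputArr;
  for (size_t i = 0; i < N; ++i) {
    inputArr[i] = static_cast<float>(i) / 10.0f; // dummy input data
  }
  // Allocate GPU buffers; the input tensor is initialized from host memory.
  Tensor input  = createTensor(ctx, Shape{N}, kf32, inputArr.data());
  Tensor output = createTensor(ctx, Shape{N}, kf32);

  // Compile the WGSL source and bind the two buffers; launch enough
  // workgroups of 256 threads to cover all N elements.
  Kernel op = createKernel(ctx, {kGelu, 256, kf32},
                           Bindings{input, output},
                           {cdiv(N, 256), 1, 1});

  // Dispatch is asynchronous; a promise/future pair signals completion.
  std::promise<void> promise;
  std::future<void> future = promise.get_future();
  dispatchKernel(ctx, op, promise);
  wait(ctx, future);

  // Copy the result back to host memory.
  toCPU(ctx, output, outputArr.data(), sizeof(outputArr));
  printf("gelu(%f) = %f\n", inputArr[1], outputArr[1]);
  return 0;
}
```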
-
If you read the README, you'll see that there are forks for Gaudi 2, AMD, Metal, and WebGPU. Support has not been added to the main repo in order to maintain simplicity, by keeping it in C/CUDA (per #112).
-
While the open-source models themselves are open, in practice one can only run them through CUDA, and therefore only on NVIDIA GPUs, which few can afford. Why not replace CUDA so that LLMs can run on all GPUs?