# llamafile

> **We want to hear from you!**
> Mozilla.ai recently adopted the llamafile project, and we're planning an approach for codebase modernization. Please share what you find most valuable about llamafile and what would make it more useful for your work.
> [Read more via the blog](https://blog.mozilla.ai/llamafile-returns/) and add your voice to the discussion [here](https://github.com/mozilla-ai/llamafile/discussions/809).

[CI status](https://github.com/Mozilla-Ocho/llamafile/actions/workflows/ci.yml)<br/>
[Join us on Discord](https://discord.gg/YuMNeuKStr)<br/><br/>

<img src="images/llamafile-640x640.png" width="320" height="320"
     alt="[line drawing of llama animal head in front of slightly open manila folder filled with files]">

**llamafile lets you distribute and run LLMs with a single file. ([announcement blog post](https://hacks.mozilla.org/2023/11/introducing-llamafile/))**

Our goal is to make open LLMs much more
accessible to both developers and end users. We're doing that by
combining [llama.cpp](https://github.com/ggerganov/llama.cpp) with [Cosmopolitan Libc](https://github.com/jart/cosmopolitan) into one
framework that collapses all the complexity of LLMs down to
a single-file executable (called a "llamafile") that runs
locally on most computers, with no installation.
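
As a minimal sketch of what that looks like in practice (the model name and
download URL here are illustrative):

```sh
# Download a published llamafile (model name and URL are illustrative).
wget https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile/resolve/main/llava-v1.5-7b-q4.llamafile

# Make it executable (macOS, Linux, and the BSDs).
chmod +x llava-v1.5-7b-q4.llamafile

# Run it; by default a local chat UI is served to your browser.
./llava-v1.5-7b-q4.llamafile
```
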
<a href="https://builders.mozilla.org/"><img src="images/mozilla-logo-bw-rgb.png" width="150"></a><br/>
llamafile is a <a href="https://builders.mozilla.org/">Mozilla Builders</a> project.<br/><br/>

## How llamafile works

A llamafile is an executable LLM that you can run on your own
computer. It contains the weights for a given open LLM, along with
everything needed to actually run that model. There's nothing to
install or configure (with a few caveats, discussed in subsequent
sections of this document).

This is all accomplished by combining llama.cpp with Cosmopolitan Libc,
which provides some useful capabilities:
1. llamafiles can run on multiple CPU microarchitectures. We
added runtime dispatching to llama.cpp that lets new Intel systems use
modern CPU features without trading away support for older computers.

2. llamafiles can run on multiple CPU architectures. We do
that by concatenating AMD64 and ARM64 builds with a shell script that
launches the appropriate one. Our file format is compatible with WIN32
and most UNIX shells. It can also be easily converted (by either you
or your users) to the platform-native format whenever required (see
the conversion sketch after this list).

3. llamafiles can run on six OSes (macOS, Windows, Linux,
FreeBSD, OpenBSD, and NetBSD). If you make your own llamafiles, you'll
only need to build your code once, using a Linux-style toolchain. The
GCC-based compiler we provide is itself an Actually Portable Executable,
so you can build your software for all six OSes from the comfort of
whichever one you prefer most for development (a build sketch also
follows this list).

4. The weights for an LLM can be embedded within the llamafile.
We added support for PKZIP to the GGML library. This lets uncompressed
weights be mapped directly into memory, similar to a self-extracting
archive. It enables quantized weights distributed online to be prefixed
with a compatible version of the llama.cpp software, thereby ensuring
that their originally observed behaviors can be reproduced indefinitely.

5. Finally, with the tools included in this project you can create your
*own* llamafiles, using any compatible model weights you want. You can
then distribute these llamafiles to other people, who can easily make
use of them regardless of what kind of computer they have (see the
packaging sketch after this list).
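
As a sketch of the conversion path mentioned in point 2 (file names are
illustrative, and the exact mechanics vary by platform):

```sh
# On Windows, the same file runs natively once it carries an .exe
# suffix, e.g. by copying mymodel.llamafile to mymodel.exe.

# On Unix systems whose login shell rejects the polyglot header,
# launching through sh works as a fallback:
sh -c ./mymodel.llamafile
```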
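
Point 3's build-once workflow, sketched under the assumption that you're
using the repository's GNU make build with the GCC-based toolchain the
project provides:

```sh
# Build once, on whichever OS you prefer; the resulting binaries
# run on all six supported operating systems.
git clone https://github.com/Mozilla-Ocho/llamafile
cd llamafile
make -j8

# Optionally install the resulting tools system-wide.
sudo make install PREFIX=/usr/local
```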
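
And a sketch of points 4 and 5 together, using the `zipalign` tool that
ships with this project (model and file names are illustrative):

```sh
# Start from the bare llamafile runtime, which contains no weights yet.
cp llamafile mymodel.llamafile

# An optional .args file supplies default command-line arguments,
# such as which embedded GGUF weights to load (contents illustrative).
cat >.args <<'EOF'
-m
mymodel.Q4_K_M.gguf
EOF

# Append the weights and .args to the executable's zip archive; the
# flags store them uncompressed so they can be mapped directly into
# memory at run time.
zipalign -j0 mymodel.llamafile mymodel.Q4_K_M.gguf .args

# The result is one self-contained, redistributable file.
./mymodel.llamafile
```
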
## Licensing

While the llamafile project is Apache 2.0-licensed, our changes
to llama.cpp are licensed under MIT (just like the llama.cpp project
itself) so as to remain compatible and upstreamable in the future,
should that be desired.

The llamafile logo on this page was generated with the assistance of DALL·E 3.

[Star History Chart](https://star-history.com/#Mozilla-Ocho/llamafile&Date)