
Commit 9509d91

njbrake and aittalam authored
feat: move docs to mkdocs (#824)
* feat: move docs to mkdocs

* Refactor so that docs are direct copy paste

* Fix links

* Update README.md

  Co-authored-by: Davide Eynard <[email protected]>

* Update docs/quickstart.md

  Co-authored-by: Davide Eynard <[email protected]>

* Apply suggestions from code review

  Co-authored-by: Davide Eynard <[email protected]>

---------

Co-authored-by: Davide Eynard <[email protected]>
1 parent 9d975d0 commit 9509d91

File tree

14 files changed: +962 −789 lines changed

.github/workflows/docs.yml

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@

```yaml
name: Documentation

on:
  push:
    branches: [main]
    paths:
      - mkdocs.yml
      - 'docs/**'
      - '.github/workflows/docs.yml'
  pull_request:
    paths:
      - mkdocs.yml
      - 'docs/**'
      - '.github/workflows/docs.yml'
  workflow_dispatch:

jobs:
  docs:
    permissions:
      contents: write
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - name: Configure git
        run: |
          git config user.name 'github-actions[bot]'
          git config user.email 'github-actions[bot]@users.noreply.github.com'

      - name: Install dependencies
        run: |
          pip install mkdocs-material

      - name: Build docs
        if: github.event_name == 'pull_request'
        run: mkdocs build -s

      - name: Publish docs
        if: ${{ github.event_name == 'push' || github.event_name == 'workflow_dispatch' }}
        run: mkdocs gh-deploy --force
```
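Though not part of this diff, the same commands can be run locally to preview a docs change before pushing — a minimal sketch, assuming `mkdocs-material` is the only docs dependency (as the workflow suggests):

```sh
pip install mkdocs-material
mkdocs build -s   # strict build; the same check the pull_request job runs
mkdocs serve      # live-reloading preview at http://127.0.0.1:8000
```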

README.md

Lines changed: 17 additions & 789 deletions
Large diffs are not rendered by default.

docs/creating_llamafiles.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@

If you want to be able to just say:

```sh
./llava.llamafile
```

...and have it run the web server without having to specify arguments, then you can embed both the weights and a special `.args` file inside, which specifies the default arguments. First, let's create a file named `.args` which has this content:

```sh
-m
llava-v1.5-7b-Q8_0.gguf
--mmproj
llava-v1.5-7b-mmproj-Q8_0.gguf
--host
0.0.0.0
-ngl
9999
...
```

As we can see above, there's one argument per line. The `...` argument optionally specifies where any additional CLI arguments passed by the user are to be inserted. Next, we'll add both the weights and the argument file to the executable:

```sh
cp /usr/local/bin/llamafile llava.llamafile

zipalign -j0 \
  llava.llamafile \
  llava-v1.5-7b-Q8_0.gguf \
  llava-v1.5-7b-mmproj-Q8_0.gguf \
  .args

./llava.llamafile
```
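
Because the `.args` file ends with `...`, any extra flags the user passes on the command line are spliced in at that position. A quick illustration (the `--port` value here is arbitrary, chosen only to show the splice):

```sh
# effective arguments: -m ... --mmproj ... --host 0.0.0.0 -ngl 9999 --port 8081
./llava.llamafile --port 8081
```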
Congratulations. You've just made your own LLM executable that's easy to share with your friends.

## Distribution

One good way to share a llamafile with your friends is by posting it on Hugging Face. If you do that, then it's recommended that you mention in your Hugging Face commit message what git revision or released version of llamafile you used when building your llamafile. That way everyone online will be able to verify the provenance of its executable content. If you've made changes to the llama.cpp or cosmopolitan source code, then the Apache 2.0 license requires you to explain what changed. One way you can do that is by embedding a notice in your llamafile using `zipalign` that describes the changes, and mentioning it in your Hugging Face commit.
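
A minimal sketch of embedding such a notice, reusing the `zipalign` invocation pattern shown earlier (the `NOTICE.txt` name and message are illustrative, not prescribed):

```sh
# describe what you changed relative to upstream llamafile / llama.cpp
echo 'Built from llamafile <git revision> with patched server defaults.' > NOTICE.txt
# add the notice to the archive alongside the weights and .args
zipalign llava.llamafile NOTICE.txt
```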

docs/example_llamafiles.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@

We provide example llamafiles for a variety of models, so you can easily try out llamafile with different kinds of LLMs.

| Model | Size | License | llamafile | other quants |
| --- | --- | --- | --- | --- |
| LLaMA 3.2 1B Instruct | 1.11 GB | [LLaMA 3.2](https://huggingface.co/Mozilla/Llama-3.2-1B-Instruct-llamafile/blob/main/LICENSE) | [Llama-3.2-1B-Instruct.Q6\_K.llamafile](https://huggingface.co/Mozilla/Llama-3.2-1B-Instruct-llamafile/blob/main/Llama-3.2-1B-Instruct.Q6_K.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Llama-3.2-1B-Instruct-llamafile) |
| LLaMA 3.2 3B Instruct | 2.62 GB | [LLaMA 3.2](https://huggingface.co/Mozilla/Llama-3.2-3B-Instruct-llamafile/blob/main/LICENSE) | [Llama-3.2-3B-Instruct.Q6\_K.llamafile](https://huggingface.co/Mozilla/Llama-3.2-3B-Instruct-llamafile/blob/main/Llama-3.2-3B-Instruct.Q6_K.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Llama-3.2-3B-Instruct-llamafile) |
| LLaMA 3.1 8B Instruct | 5.23 GB | [LLaMA 3.1](https://huggingface.co/Mozilla/Meta-Llama-3.1-8B-Instruct-llamafile/blob/main/LICENSE) | [Llama-3.1-8B-Instruct.Q4\_K\_M.llamafile](https://huggingface.co/Mozilla/Meta-Llama-3.1-8B-Instruct-llamafile/resolve/main/Meta-Llama-3.1-8B-Instruct.Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Meta-Llama-3.1-8B-Instruct-llamafile) |
| Gemma 3 1B Instruct | 1.32 GB | [Gemma 3](https://ai.google.dev/gemma/terms) | [gemma-3-1b-it.Q6\_K.llamafile](https://huggingface.co/Mozilla/gemma-3-1b-it-llamafile/resolve/main/google_gemma-3-1b-it-Q6_K.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/gemma-3-1b-it-llamafile) |
| Gemma 3 4B Instruct | 3.50 GB | [Gemma 3](https://ai.google.dev/gemma/terms) | [gemma-3-4b-it.Q6\_K.llamafile](https://huggingface.co/Mozilla/gemma-3-4b-it-llamafile/resolve/main/google_gemma-3-4b-it-Q6_K.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/gemma-3-4b-it-llamafile) |
| Gemma 3 12B Instruct | 7.61 GB | [Gemma 3](https://ai.google.dev/gemma/terms) | [gemma-3-12b-it.Q4\_K\_M.llamafile](https://huggingface.co/Mozilla/gemma-3-12b-it-llamafile/resolve/main/google_gemma-3-12b-it-Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/gemma-3-12b-it-llamafile) |
| QwQ 32B | 7.61 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen\_QwQ-32B-Q4\_K\_M.llamafile](https://huggingface.co/Mozilla/QwQ-32B-llamafile/resolve/main/Qwen_QwQ-32B-Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/QwQ-32B-llamafile) |
| R1 Distill Qwen 14B | 9.30 GB | [MIT](https://choosealicense.com/licenses/mit/) | [DeepSeek-R1-Distill-Qwen-14B-Q4\_K\_M](https://huggingface.co/Mozilla/DeepSeek-R1-Distill-Qwen-14B-llamafile/resolve/main/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/DeepSeek-R1-Distill-Qwen-14B-llamafile) |
| R1 Distill Llama 8B | 5.23 GB | [MIT](https://choosealicense.com/licenses/mit/) | [DeepSeek-R1-Distill-Llama-8B-Q4\_K\_M](https://huggingface.co/Mozilla/DeepSeek-R1-Distill-Llama-8B-llamafile/resolve/main/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/DeepSeek-R1-Distill-Llama-8B-llamafile) |
| LLaVA 1.5 | 3.97 GB | [LLaMA 2](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) | [llava-v1.5-7b-q4.llamafile](https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile/resolve/main/llava-v1.5-7b-q4.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile) |
| Mistral-7B-Instruct v0.3 | 4.42 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Mistral-7B-Instruct-v0.3.Q4\_0.llamafile](https://huggingface.co/Mozilla/Mistral-7B-Instruct-v0.3-llamafile/resolve/main/Mistral-7B-Instruct-v0.3.Q4_0.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Mistral-7B-Instruct-v0.3-llamafile) |
| Granite 3.2 8B Instruct | 5.25 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [granite-3.2-8b-instruct-Q4\_K\_M.llamafile](https://huggingface.co/Mozilla/granite-3.2-8b-instruct-llamafile/resolve/main/granite-3.2-8b-instruct-Q4_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/granite-3.2-8b-instruct-llamafile) |
| Phi-3-mini-4k-instruct | 7.67 GB | [Apache 2.0](https://huggingface.co/Mozilla/Phi-3-mini-4k-instruct-llamafile/blob/main/LICENSE) | [Phi-3-mini-4k-instruct.F16.llamafile](https://huggingface.co/Mozilla/Phi-3-mini-4k-instruct-llamafile/resolve/main/Phi-3-mini-4k-instruct.F16.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Phi-3-mini-4k-instruct-llamafile) |
| Mixtral-8x7B-Instruct | 30.03 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [mixtral-8x7b-instruct-v0.1.Q5\_K\_M.llamafile](https://huggingface.co/Mozilla/Mixtral-8x7B-Instruct-v0.1-llamafile/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/Mixtral-8x7B-Instruct-v0.1-llamafile) |
| OLMo-7B | 5.68 GB | [Apache 2.0](https://huggingface.co/Mozilla/OLMo-7B-0424-llamafile/blob/main/LICENSE) | [OLMo-7B-0424.Q6\_K.llamafile](https://huggingface.co/Mozilla/OLMo-7B-0424-llamafile/resolve/main/OLMo-7B-0424.Q6_K.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/OLMo-7B-0424-llamafile) |
| *Text Embedding Models* | | | | |
| E5-Mistral-7B-Instruct | 5.16 GB | [MIT](https://choosealicense.com/licenses/mit/) | [e5-mistral-7b-instruct-Q5_K_M.llamafile](https://huggingface.co/Mozilla/e5-mistral-7b-instruct/resolve/main/e5-mistral-7b-instruct-Q5_K_M.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/e5-mistral-7b-instruct) |
| mxbai-embed-large-v1 | 0.7 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [mxbai-embed-large-v1-f16.llamafile](https://huggingface.co/Mozilla/mxbai-embed-large-v1-llamafile/resolve/main/mxbai-embed-large-v1-f16.llamafile?download=true) | [See HF repo](https://huggingface.co/Mozilla/mxbai-embed-large-v1-llamafile) |

Here is an example for the Mistral command-line llamafile (filename per the table above):

```sh
./Mistral-7B-Instruct-v0.3.Q4_0.llamafile --temp 0.7 -p '[INST]Write a story about llamas[/INST]'
```

And here is an example for the WizardCoder-Python command-line llamafile:

```sh
./wizardcoder-python-13b.llamafile --temp 0 -e -r '```\n' -p '```c\nvoid *memcpy_sse2(char *dst, const char *src, size_t size) {\n'
```

And here's an example for the LLaVA command-line llamafile:

```sh
./llava-v1.5-7b-q4.llamafile --temp 0.2 --image lemurs.jpg -e -p '### User: What do you see?\n### Assistant:'
```

As before, macOS, Linux, and BSD users will need to use the `chmod` command to grant execution permissions to the file before running these llamafiles for the first time.
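
For example, using the LLaVA llamafile from the table above:

```sh
# grant execute permission once, then run directly
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile
```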
Unfortunately, Windows users cannot make use of many of these example llamafiles because Windows has a maximum executable file size of 4GB, and all of these examples exceed that size. (The LLaVA llamafile works on Windows because it is 30MB shy of the size limit.) But don't lose heart: llamafile allows you to use external weights; this is described later in this document.
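
As a rough illustration of that external-weights approach (filenames are illustrative; the referenced section has the full details), the runner binary stays under the 4GB limit while the weights live in a separate `.gguf` file:

```sh
# small llamafile runner + external weights loaded via -m
./llamafile -m mistral-7b-instruct-v0.3.Q4_0.gguf
```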
**Having trouble? See the [Troubleshooting](troubleshooting.md) page.**

## A note about models

The example llamafiles provided above should not be interpreted as endorsements or recommendations of specific models, licenses, or data sets on the part of Mozilla.

docs/images/llamafile-640x640.png

188 KB (binary file added)

docs/images/mozilla-logo-bw-rgb.png

24.9 KB (binary file added)

docs/index.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@

# llamafile

> **We want to hear from you!**
> Mozilla.ai recently adopted the llamafile project, and we're planning an approach for codebase modernization. Please share what you find most valuable about llamafile and what would make it more useful for your work.
> [Read more via the blog](https://blog.mozilla.ai/llamafile-returns/) and add your voice to the discussion [here](https://github.com/mozilla-ai/llamafile/discussions/809).

[![ci status](https://github.com/Mozilla-Ocho/llamafile/actions/workflows/ci.yml/badge.svg)](https://github.com/Mozilla-Ocho/llamafile/actions/workflows/ci.yml)<br/>
[![Discord](https://dcbadge.vercel.app/api/server/YuMNeuKStr)](https://discord.gg/YuMNeuKStr)<br/><br/>

<img src="images/llamafile-640x640.png" width="320" height="320"
     alt="[line drawing of llama animal head in front of slightly open manila folder filled with files]">

**llamafile lets you distribute and run LLMs with a single file. ([announcement blog post](https://hacks.mozilla.org/2023/11/introducing-llamafile/))**

Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining [llama.cpp](https://github.com/ggerganov/llama.cpp) with [Cosmopolitan Libc](https://github.com/jart/cosmopolitan) into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.<br/><br/>

<a href="https://builders.mozilla.org/"><img src="images/mozilla-logo-bw-rgb.png" width="150"></a><br/>
llamafile is a <a href="https://builders.mozilla.org/">Mozilla Builders</a> project.<br/><br/>

## How llamafile works

A llamafile is an executable LLM that you can run on your own computer. It contains the weights for a given open LLM, as well as everything needed to actually run that model on your computer. There's nothing to install or configure (with a few caveats, discussed in subsequent sections of this document).

This is all accomplished by combining llama.cpp with Cosmopolitan Libc, which provides some useful capabilities:

1. llamafiles can run on multiple CPU microarchitectures. We added runtime dispatching to llama.cpp that lets new Intel systems use modern CPU features without trading away support for older computers.

2. llamafiles can run on multiple CPU architectures. We do that by concatenating AMD64 and ARM64 builds with a shell script that launches the appropriate one. Our file format is compatible with WIN32 and most UNIX shells. It can also easily be converted (by either you or your users) to the platform-native format, whenever required.

3. llamafiles can run on six OSes (macOS, Windows, Linux, FreeBSD, OpenBSD, and NetBSD). If you make your own llamafiles, you'll only need to build your code once, using a Linux-style toolchain. The GCC-based compiler we provide is itself an Actually Portable Executable, so you can build your software for all six OSes from the comfort of whichever one you prefer most for development.

4. The weights for an LLM can be embedded within the llamafile. We added support for PKZIP to the GGML library. This lets uncompressed weights be mapped directly into memory, similar to a self-extracting archive. It enables quantized weights distributed online to be prefixed with a compatible version of the llama.cpp software, thereby ensuring the originally observed behaviors can be reproduced indefinitely. (See the sketch after this list.)

5. Finally, with the tools included in this project you can create your *own* llamafiles, using any compatible model weights you want. You can then distribute these llamafiles to other people, who can easily make use of them regardless of what kind of computer they have.
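
A handy consequence of point 4: because the weights are stored with PKZIP framing, a llamafile is also a valid ZIP archive, so ordinary zip tooling can inspect what's embedded inside it. A quick sketch (the filename is illustrative):

```sh
# list the assets embedded inside a llamafile (weights, .args, etc.)
unzip -l llava.llamafile
```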
## Licensing

While the llamafile project is Apache 2.0-licensed, our changes to llama.cpp are licensed under MIT (just like the llama.cpp project itself) so as to remain compatible and upstreamable in the future, should that be desired.

The llamafile logo on this page was generated with the assistance of DALL·E 3.

[![Star History Chart](https://api.star-history.com/svg?repos=Mozilla-Ocho/llamafile&type=Date)](https://star-history.com/#Mozilla-Ocho/llamafile&Date)
