Skip to content

Commit 56b4f5b

Browse files
committed
doc: starting with
1 parent 5aaa5b6 commit 56b4f5b

5 files changed

Lines changed: 205 additions & 1 deletion

File tree

Lines changed: 128 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,128 @@
1+
---
2+
sidebar_position: 3
3+
title: "🦙Starting with Llama.cpp"
4+
---
5+
6+
## Overview
7+
8+
Open WebUI makes it simple and flexible to connect and manage a local Llama.cpp server to run efficient, quantized language models. Whether you’ve compiled Llama.cpp yourself or you're using precompiled binaries, this guide will walk you through how to:
9+
10+
- Set up your Llama.cpp server
11+
- Load large models locally
12+
- Integrate with Open WebUI for a seamless interface
13+
14+
Let’s get you started!
15+
16+
---
17+
18+
## Step 1: Install Llama.cpp
19+
20+
To run models with Llama.cpp, you first need the Llama.cpp server installed locally.
21+
22+
You can either:
23+
24+
- 📦 [Download prebuilt binaries](https://github.com/ggerganov/llama.cpp/releases)
25+
- 🛠️ Or build it from source by following the [official build instructions](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md)
26+
27+
After installing, make sure `llama-server` is available in your local system path or take note of its location.
28+
29+
---
30+
31+
## Step 2: Download a Supported Model
32+
33+
You can load and run various GGUF-format quantized LLMs using Llama.cpp. One impressive example is the DeepSeek-R1 1.58-bit model optimized by UnslothAI. To download this version:
34+
35+
1. Visit the [Unsloth DeepSeek-R1 repository on Hugging Face](https://huggingface.co/unsloth/DeepSeek-R1-GGUF)
36+
2. Download the 1.58-bit quantized version – around 131GB.
37+
38+
Alternatively, use Python to download programmatically:
39+
40+
```python
41+
# pip install huggingface_hub hf_transfer
42+
43+
from huggingface_hub import snapshot_download
44+
45+
snapshot_download(
46+
repo_id = "unsloth/DeepSeek-R1-GGUF",
47+
local_dir = "DeepSeek-R1-GGUF",
48+
allow_patterns = ["*UD-IQ1_S*"], # Download only 1.58-bit variant
49+
)
50+
```
51+
52+
This will download the model files into a directory like:
53+
```
54+
DeepSeek-R1-GGUF/
55+
└── DeepSeek-R1-UD-IQ1_S/
56+
├── DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf
57+
├── DeepSeek-R1-UD-IQ1_S-00002-of-00003.gguf
58+
└── DeepSeek-R1-UD-IQ1_S-00003-of-00003.gguf
59+
```
60+
61+
📍 Keep track of the full path to the first GGUF file — you’ll need it in Step 3.
62+
63+
---
64+
65+
## Step 3: Serve the Model with Llama.cpp
66+
67+
Start the model server using the llama-server binary. Navigate to your llama.cpp folder (e.g., build/bin) and run:
68+
69+
```bash
70+
./llama-server \
71+
--model /your/full/path/to/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
72+
--port 10000 \
73+
--ctx-size 1024 \
74+
--n-gpu-layers 40
75+
```
76+
77+
🛠️ Tweak the parameters to suit your machine:
78+
79+
- --model: Path to your .gguf model file
80+
- --port: 10000 (or choose another open port)
81+
- --ctx-size: Token context length (can increase if RAM allows)
82+
- --n-gpu-layers: Layers offloaded to GPU for faster performance
83+
84+
Once the server runs, it will expose a local OpenAI-compatible API on:
85+
86+
```
87+
http://127.0.0.1:10000
88+
```
89+
90+
---
91+
92+
## Step 4: Connect Llama.cpp to Open WebUI
93+
94+
To control and query your locally running model directly from Open WebUI:
95+
96+
1. Open Open WebUI in your browser
97+
2. Go to ⚙️ Admin Settings → Connections → OpenAI Connections
98+
3. Click ➕ Add Connection and enter:
99+
100+
- URL: `http://127.0.0.1:10000/v1`
101+
(Or use `http://host.docker.internal:10000/v1` if running WebUI inside Docker)
102+
- API Key: `none` (leave blank)
103+
104+
💡 Once saved, Open WebUI will begin using your local Llama.cpp server as a backend!
105+
106+
![Llama.cpp Connection in Open WebUI](/images/tutorials/deepseek/connection.png)
107+
108+
---
109+
110+
## Quick Tip: Try Out the Model via Chat Interface
111+
112+
Once connected, select the model from the Open WebUI chat menu and start interacting!
113+
114+
![Model Chat Preview](/images/tutorials/deepseek/response.png)
115+
116+
---
117+
118+
## You're Ready to Go!
119+
120+
Once configured, Open WebUI makes it easy to:
121+
122+
- Manage and switch between local models served by Llama.cpp
123+
- Use the OpenAI-compatible API with no key needed
124+
- Experiment with massive models like DeepSeek-R1 — right from your machine!
125+
126+
---
127+
128+
🚀 Have fun experimenting and building!

docs/getting-started/quick-start/starting-with-ollama.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
sidebar_position: 1
3-
title: "🦙 Starting With Ollama"
3+
title: "👉 Starting With Ollama"
44
---
55

66
## Overview
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
---
2+
3+
sidebar_position: 2
4+
title: "🤖 Starting With OpenAI"
5+
6+
---
7+
8+
## Overview
9+
10+
Open WebUI makes it easy to connect and use OpenAI and other OpenAI-compatible APIs. This guide will walk you through adding your API key, setting the correct endpoint, and selecting models — so you can start chatting right away.
11+
12+
---
13+
14+
## Step 1: Get Your OpenAI API Key
15+
16+
To use OpenAI models (such as GPT-4 or GPT-3.5), you need an API key from a supported provider.
17+
18+
You can use:
19+
20+
- OpenAI directly (https://platform.openai.com/account/api-keys)
21+
- Azure OpenAI
22+
- An OpenAI-compatible service (e.g., LocalAI, FastChat, etc.)
23+
24+
👉 Once you have the key, copy it and keep it handy.
25+
26+
For most OpenAI usage, the default API base URL is:
27+
https://api.openai.com/v1
28+
29+
Other providers may use different URLs — check your provider’s documentation.
30+
31+
---
32+
33+
## Step 2: Add the API Connection in Open WebUI
34+
35+
Once Open WebUI is running:
36+
37+
1. Go to the ⚙️ **Admin Settings**.
38+
2. Navigate to **Connections > OpenAI > Manage** (look for the wrench icon).
39+
3. Click ➕ **Add New Connection**.
40+
4. Fill in the following:
41+
- API URL: https://api.openai.com/v1
42+
- API Key: Paste your key here
43+
44+
5. Click Save ✅.
45+
46+
This securely stores your credentials and sets up the connection.
47+
48+
Here’s what it looks like:
49+
50+
![OpenAI Connection Screen](/images/getting-started/quick-start/manage-openai.png)
51+
52+
---
53+
54+
## Step 3: Start Using Models
55+
56+
Once your connection is saved, you can start using models right inside Open WebUI.
57+
58+
🧠 You don’t need to download any models — just select one from the Model Selector and start chatting. If a model is supported by your provider, you’ll be able to use it instantly via their API.
59+
60+
Here’s what model selection looks like:
61+
62+
![OpenAI Model Selector](/images/getting-started/quick-start/selector-openai.png)
63+
64+
Simply choose GPT-4, GPT-3.5, or any compatible model offered by your provider.
65+
66+
---
67+
68+
## All Set!
69+
70+
That’s it! Your OpenAI-compatible API connection is ready to use.
71+
72+
With Open WebUI and OpenAI, you get powerful language models, an intuitive interface, and instant access to chat capabilities — no setup headaches.
73+
74+
If you run into issues or need additional support, visit our [help section](/troubleshooting).
75+
76+
Happy prompting! 🎉
76.3 KB
Loading
64.7 KB
Loading

0 commit comments

Comments
 (0)