---
sidebar_position: 3
title: "🦙 Starting with Llama.cpp"
---

## Overview

Open WebUI makes it simple and flexible to connect and manage a local Llama.cpp server for running efficient, quantized language models. Whether you’ve compiled Llama.cpp yourself or are using precompiled binaries, this guide walks you through how to:

- Set up your Llama.cpp server
- Load large models locally
- Integrate with Open WebUI for a seamless interface

Let’s get you started!

---

## Step 1: Install Llama.cpp

To run models with Llama.cpp, you first need the Llama.cpp server installed locally.

You can either:

- 📦 [Download prebuilt binaries](https://github.com/ggerganov/llama.cpp/releases)
- 🛠️ Build it from source by following the [official build instructions](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md)

After installing, make sure `llama-server` is available on your system `PATH`, or take note of its full location.
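To confirm the binary is reachable, you can run a quick sanity check from your terminal. This assumes a POSIX shell; `--version` is supported by recent llama.cpp builds, and `--help` works as a fallback on older ones:

```bash
# Verify llama-server is on your PATH
command -v llama-server

# Print version and build info (use --help if your build lacks --version)
llama-server --version
```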

---

## Step 2: Download a Supported Model

You can load and run various GGUF-format quantized LLMs using Llama.cpp. One impressive example is the DeepSeek-R1 1.58-bit model optimized by UnslothAI. To download this version:

1. Visit the [Unsloth DeepSeek-R1 repository on Hugging Face](https://huggingface.co/unsloth/DeepSeek-R1-GGUF)
2. Download the 1.58-bit quantized version – around 131 GB.

Alternatively, use Python to download it programmatically:
```python
# pip install huggingface_hub hf_transfer
# Note: hf_transfer only accelerates downloads when HF_HUB_ENABLE_HF_TRANSFER=1 is set

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",    # Model repository on Hugging Face
    local_dir="DeepSeek-R1-GGUF",          # Local download directory
    allow_patterns=["*UD-IQ1_S*"],         # Download only the 1.58-bit variant
)
```

This will download the model files into a directory like:

```
DeepSeek-R1-GGUF/
└── DeepSeek-R1-UD-IQ1_S/
    ├── DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf
    ├── DeepSeek-R1-UD-IQ1_S-00002-of-00003.gguf
    └── DeepSeek-R1-UD-IQ1_S-00003-of-00003.gguf
```

📍 Keep track of the full path to the first GGUF file — you’ll need it in Step 3. When a model is split like this, pointing Llama.cpp at the first file is enough; it detects and loads the remaining shards automatically.
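If you prefer the command line, the `huggingface-cli` tool bundled with `huggingface_hub` can perform an equivalent filtered download (the `--include` pattern mirrors `allow_patterns` above):

```bash
# Download only the 1.58-bit variant via the Hugging Face CLI
huggingface-cli download unsloth/DeepSeek-R1-GGUF \
  --include "*UD-IQ1_S*" \
  --local-dir DeepSeek-R1-GGUF
```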

---

## Step 3: Serve the Model with Llama.cpp

Start the model server using the `llama-server` binary. Navigate to your llama.cpp folder (e.g., `build/bin`) and run:

```bash
./llama-server \
    --model /your/full/path/to/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
    --port 10000 \
    --ctx-size 1024 \
    --n-gpu-layers 40
```

🛠️ Tweak the parameters to suit your machine:

- `--model`: Path to your `.gguf` model file (the first shard, if the model is split)
- `--port`: Port to serve on (10000 here, or any other open port)
- `--ctx-size`: Context length in tokens (increase it if your RAM allows)
- `--n-gpu-layers`: Number of layers to offload to the GPU for faster inference

Once the server is running, it exposes a local OpenAI-compatible API at:

```
http://127.0.0.1:10000
```
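Before connecting Open WebUI, you can sanity-check the server with `curl`; llama-server exposes a `/health` route alongside the OpenAI-style endpoints:

```bash
# Returns a small JSON status object once the model is loaded
curl http://127.0.0.1:10000/health

# Lists the served model through the OpenAI-compatible API
curl http://127.0.0.1:10000/v1/models
```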

---

## Step 4: Connect Llama.cpp to Open WebUI

To control and query your locally running model directly from Open WebUI:

1. Open Open WebUI in your browser
2. Go to ⚙️ Admin Settings → Connections → OpenAI Connections
3. Click ➕ Add Connection and enter:

- URL: `http://127.0.0.1:10000/v1`
  (Or use `http://host.docker.internal:10000/v1` if running Open WebUI inside Docker)
- API Key: leave blank, or enter a placeholder such as `none` (Llama.cpp does not require one)

💡 Once saved, Open WebUI will begin using your local Llama.cpp server as a backend!
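To see the kind of request Open WebUI will send on your behalf, you can replay a chat completion against the same endpoint. A minimal example; since llama-server serves the single model it was started with, the `model` field can usually be omitted:

```bash
curl http://127.0.0.1:10000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Hello! Briefly introduce yourself."}
        ],
        "max_tokens": 128
      }'
```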

---

## Quick Tip: Try Out the Model via Chat Interface

Once connected, select the model from the Open WebUI chat menu and start interacting!

---

## You're Ready to Go!

Once configured, Open WebUI makes it easy to:

- Manage and switch between local models served by Llama.cpp
- Use the OpenAI-compatible API with no key needed
- Experiment with massive models like DeepSeek-R1 — right from your machine!

---

🚀 Have fun experimenting and building!