Ollama is a lightweight API server designed for hosting and interacting with machine learning models. Open WebUI is a user-friendly web interface for managing and utilizing these models. Together, they provide a seamless way to deploy, interact with, and manage ML models locally.
I could have used the Open WebUI image that bundles Ollama support, ghcr.io/open-webui/open-webui:ollama, but running each service in its own container allows for more flexibility.
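A two-container setup like this is usually described in a compose file. The following is only a minimal sketch of what such a file might look like, not necessarily the one used for this post. The helper name `write_compose_sketch`, the service names, the image tags, and the volume names are all assumptions. The host ports match the addresses used in this post (11434 for Ollama, 10000 for Open WebUI), and the `devices` entry assumes the GPU is exposed through CDI, which your compose provider may or may not accept.

```shell
# Hypothetical helper: writes an illustrative compose file to the path in $1.
# Service names, image tags, and volume names are assumptions, not the
# author's actual configuration.
write_compose_sketch() {
  cat > "$1" <<'EOF'
services:
  ollama:
    image: docker.io/ollama/ollama:latest
    ports:
      - "11434:11434"        # Ollama API on localhost:11434
    volumes:
      - ollama:/root/.ollama # downloaded models live here
    devices:
      - nvidia.com/gpu=all   # CDI device name (assumes CDI is configured)

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "10000:8080"         # web interface on localhost:10000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  open-webui:
EOF
}
```

Calling `write_compose_sketch compose.yaml` writes the file into the current directory, overwriting any existing file of that name. Because Open WebUI reaches Ollama over the compose network by service name, either image can be swapped or upgraded without touching the other.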
Make sure the NVIDIA Container Toolkit is installed and configured.
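One way to check the configuration with Podman is to confirm that a CDI spec exposes the GPU. The sketch below assumes the toolkit's `nvidia-ctk` command is on the PATH and that CDI is the mechanism in use. The helper name `has_nvidia_cdi_device` is hypothetical; it only inspects text piped into it.

```shell
# Hypothetical helper: succeeds if the text on stdin lists an NVIDIA CDI device
# such as "nvidia.com/gpu=0" or "nvidia.com/gpu=all".
has_nvidia_cdi_device() {
  grep -q -F 'nvidia.com/gpu='
}

# Typical use (requires the toolkit; generating the spec usually needs root):
#   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
#   nvidia-ctk cdi list | has_nvidia_cdi_device && echo "GPU visible via CDI"
```

If the check fails, the containers may still start, but Ollama would likely fall back to the CPU.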
To start the Ollama and Open WebUI containers, run
podman compose up --detach
Both Ollama and Open WebUI can be accessed independently. Ollama is accessed using its API at localhost:11434, while Open WebUI is accessed via a web interface at localhost:10000.
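Because the containers start in the background, the endpoints may take a few seconds to respond. A small retry loop is one way to wait for each service separately. This is a sketch: the helper name `retry` and the attempt and delay values in the usage comments are assumptions, not part of either project.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = the command
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Typical use: wait for each service independently.
#   retry 30 2 curl --silent --fail --output /dev/null http://localhost:11434
#   retry 30 2 curl --silent --fail --output /dev/null http://localhost:10000
```

Keeping the probe command as an argument means the same loop works for both endpoints, or for any other readiness check.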
Check if Ollama is up and running
curl --request GET --location http://localhost:11434
Pull the model deepseek-r1
curl --request POST --location http://localhost:11434/api/pull --data '{
  "model": "deepseek-r1"
}'
Sending a prompt to the model
curl --silent --request POST --location http://localhost:11434/api/generate --data '{
  "model": "deepseek-r1",
  "stream": false,
  "prompt": "Why is the sky blue?"
}' | jq --compact-output
To stop and remove the containers, run
podman compose down --volumes
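Note that `--volumes` also removes the named volumes declared in the compose file. If the models are stored in such a volume (an assumption about the setup), they would need to be pulled again after the next start. The sketch below wraps both variants; the `teardown` function and the `COMPOSE` variable are hypothetical names.

```shell
# Hypothetical wrapper; COMPOSE can be overridden (for example "docker compose").
COMPOSE="${COMPOSE:-podman compose}"

# teardown keep  -> remove containers, keep named volumes (models survive)
# teardown purge -> remove containers and named volumes, as in the command above
teardown() {
  case "$1" in
    keep)  $COMPOSE down ;;
    purge) $COMPOSE down --volumes ;;
    *)     echo "usage: teardown keep|purge" >&2; return 2 ;;
  esac
}
```

Running `teardown keep` is usually the better default while experimenting, since model downloads can be several gigabytes.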