# Ollama Model Viewer

This web service provides a simple frontend to view loaded Ollama models and vRAM usage in the `ollama` namespace.

## Features
- View currently loaded Ollama models
- Monitor vRAM usage on g5.2xlarge nodes
- Web-based interface with Bootstrap styling
- OpenShift OAuth integration for secure access
## Prerequisites

- A running Kubernetes/OpenShift cluster
- `kubectl` or `oc` configured to connect to your cluster
- The `ollama` namespace must exist in your cluster
- An Ollama pod running with label `app=ollama-serve`
## Running Locally

1. Install Go dependencies:

   ```bash
   go mod tidy
   ```

2. Run the service:

   ```bash
   go run main.go
   ```

The service will be available at `http://localhost:8080` and will use your local kubeconfig for authentication.
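Using the local kubeconfig outside the cluster and in-cluster credentials when deployed is the standard client-go pattern. Below is a minimal sketch of that pattern, assuming the usual fallback order; the actual `main.go` may wire this differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// loadConfig prefers in-cluster credentials (when running as a pod) and
// falls back to the local kubeconfig (when run via `go run main.go`).
func loadConfig() (*rest.Config, error) {
	if cfg, err := rest.InClusterConfig(); err == nil {
		return cfg, nil
	}
	kubeconfig := os.Getenv("KUBECONFIG")
	if kubeconfig == "" {
		home, err := os.UserHomeDir()
		if err != nil {
			return nil, err
		}
		kubeconfig = filepath.Join(home, ".kube", "config")
	}
	return clientcmd.BuildConfigFromFlags("", kubeconfig)
}

func main() {
	cfg, err := loadConfig()
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	// Sanity check that the credentials actually work.
	info, err := clientset.Discovery().ServerVersion()
	if err != nil {
		panic(err)
	}
	fmt.Println("connected to", info.GitVersion)
}
```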
## Running with Docker

1. Build the Docker image:

   ```bash
   docker build -t quay.io/redhat-ai-dev/ollama-model-viewer:latest .
   ```

2. Run the Docker container:

   ```bash
   docker run -p 8080:8080 quay.io/redhat-ai-dev/ollama-model-viewer:latest
   ```
## Deploying to OpenShift

To deploy this service to your OpenShift cluster with OAuth authentication:

1. Create the secret resource:

   ```bash
   COOKIE_SECRET=$(openssl rand -base64 32) envsubst < deploy/secret.yaml | oc apply -f -
   ```

2. Deploy using the provided resources:

   ```bash
   oc apply -f deploy/
   ```

3. Get the external route:

   ```bash
   oc get route ollama-model-viewer -n ollama
   ```

See `deploy/README.md` for detailed deployment instructions and security considerations.
## Architecture

The application consists of:

- Main container: Go application serving the web interface
- OAuth proxy sidecar: Handles OpenShift OAuth authentication
- Service account: Carries the RBAC permissions needed to read pods and execute commands in them (see the sketch after this list)
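In client-go terms, "read pods and execute commands" maps to the `pods` resource and the `pods/exec` subresource. The sketch below shows that access pattern: find the pod labeled `app=ollama-serve` and run a command in it. The `nvidia-smi` invocation is an assumption for illustration; the command the service actually runs may differ.

```go
package main

import (
	"bytes"
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/remotecommand"
)

// execInOllamaPod locates the pod labeled app=ollama-serve in the ollama
// namespace and runs a command in it via the pods/exec subresource.
func execInOllamaPod(cfg *rest.Config, cs *kubernetes.Clientset) (string, error) {
	pods, err := cs.CoreV1().Pods("ollama").List(context.TODO(),
		metav1.ListOptions{LabelSelector: "app=ollama-serve"})
	if err != nil {
		return "", err
	}
	if len(pods.Items) == 0 {
		return "", fmt.Errorf("no pod with label app=ollama-serve found")
	}

	req := cs.CoreV1().RESTClient().Post().
		Resource("pods").
		Namespace("ollama").
		Name(pods.Items[0].Name).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			// Illustrative command; the real service may run something else.
			Command: []string{"nvidia-smi", "--query-gpu=memory.used", "--format=csv"},
			Stdout:  true,
			Stderr:  true,
		}, scheme.ParameterCodec)

	exec, err := remotecommand.NewSPDYExecutor(cfg, "POST", req.URL())
	if err != nil {
		return "", err
	}
	var stdout, stderr bytes.Buffer
	err = exec.StreamWithContext(context.TODO(),
		remotecommand.StreamOptions{Stdout: &stdout, Stderr: &stderr})
	return stdout.String(), err
}

func main() {
	cfg, err := rest.InClusterConfig() // or load a kubeconfig as shown earlier
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	out, err := execInOllamaPod(cfg, cs)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```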
## Security

- Uses OpenShift OAuth for authentication
- Requires proper RBAC permissions to access the Kubernetes API (a sketch of the implied Role follows this list)
- Cookie-based session management for the OAuth proxy
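The authoritative manifests live in `deploy/`. As a rough guide only, the permissions implied by the list above (reading pods and creating `pods/exec`) would amount to a Role like the one sketched here with the Kubernetes Go types; the role name is illustrative.

```go
package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Minimal Role implied by the security notes above: read access to
	// pods plus create on pods/exec (needed to run commands in a pod).
	role := rbacv1.Role{
		TypeMeta: metav1.TypeMeta{
			APIVersion: "rbac.authorization.k8s.io/v1",
			Kind:       "Role",
		},
		ObjectMeta: metav1.ObjectMeta{
			Name:      "ollama-model-viewer", // illustrative name
			Namespace: "ollama",
		},
		Rules: []rbacv1.PolicyRule{
			{APIGroups: []string{""}, Resources: []string{"pods"},
				Verbs: []string{"get", "list", "watch"}},
			{APIGroups: []string{""}, Resources: []string{"pods/exec"},
				Verbs: []string{"create"}},
		},
	}
	out, err := yaml.Marshal(role)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```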