
RPC server fails, Ollama 404 #1

@misterjice

Description


I've been up for two days trying to fix this. Very little sleep, very frustrated, even used AI to try to fix this... nothing works.

I have two machines, and I'm trying to run distributed inference with Ollama in WSL.

olol server --host 0.0.0.0 --port 50051 --ollama-host http://localhost:11434
(works on both machines: "Ollama gRPC server started on port 50051")

olol rpc-server --host 0.0.0.0 --port 50052 --device auto
(fails on both machines: "WARNING - Failed to get Ollama status: HTTP 404")

curl against /api/tags lists the models, and ollama ls lists the models, but olol rpc-server does not see them. So the problem must be with olol, since I can use Ollama just fine and query it from other places as well.
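For reference, here is a small stdlib-only sketch that probes the local Ollama daemon and reports which HTTP endpoints respond. /api/tags and /api/version are documented Ollama endpoints; the "status" path olol actually calls is unknown to me, so the /api/status entry below is only a placeholder guess:

```python
import json
import urllib.error
import urllib.request

def probe(base_url: str, paths: list[str]) -> dict[str, str]:
    """Return the HTTP status (or error) for each path on the Ollama daemon."""
    results = {}
    for path in paths:
        try:
            with urllib.request.urlopen(base_url + path, timeout=5) as resp:
                results[path] = str(resp.status)
        except urllib.error.HTTPError as e:
            results[path] = str(e.code)  # e.g. "404" for an endpoint Ollama doesn't serve
        except urllib.error.URLError as e:
            results[path] = f"unreachable ({e.reason})"
    return results

if __name__ == "__main__":
    # /api/tags and /api/version exist in Ollama; /api/status is a guess
    # at what olol might be calling (it does not exist).
    print(json.dumps(probe("http://localhost:11434",
                           ["/api/tags", "/api/version", "/api/status"]), indent=2))
```

Running this on each machine should show 200 for the real endpoints and pinpoint which path returns the 404 that olol logs.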

olol proxy --host 0.0.0.0 --port 8000 --servers "192.168.x.x:50051,192.168.y.y:50051" --distributed --rpc-servers "192.168.x.x:50052,192.168.y.y:50052"

Servers: 2/2 healthy
Distributed Inference: ENABLED
Active Requests: 0
Total Requests: 0
Uptime: 00:00:02
Generate: 0 | Chat: 0 | Embeddings: 0

BUT it says 0 models on both!

What's the deal?

Tried again:

olol rpc-server --host 0.0.0.0 --port 50052 --device cuda --quantize q5_0 --flash-attention --context-window 16384
2025-08-14 23:17:02,870 - INFO - Ollama already running at http://localhost:11434
2025-08-14 23:17:02,870 - INFO - Connected to Ollama at http://localhost:11434
2025-08-14 23:17:03,904 - WARNING - Failed to get Ollama status: HTTP 404
2025-08-14 23:17:05,119 - INFO - Server initialized with device: cuda:0
2025-08-14 23:17:05,119 - INFO - Device capabilities: {'backend_type': 'cuda', 'device_id': 0, 'memory': 25756696576, 'compute_capability': '8.9', 'name': 'NVIDIA GeForce RTX 4090'}
2025-08-14 23:17:05,119 - INFO - Starting Ollama health check thread (interval: 30s)
2025-08-14 23:17:05,125 - INFO - RPC server started on 0.0.0.0:50052 with device cuda:0
2025-08-14 23:17:07,199 - INFO - Discovery service started for server with ID 93bed723-b704-44a6-ab49-8b6f4f46cd75 (IPv6: Supported)
2025-08-14 23:17:07,199 - INFO - Auto-discovery service started

---- UPDATE FROM AI RESEARCH ----

To fix the 404 that the olol RPC server gets back from Ollama, you will need to modify or disable the health/status check request that olol sends to the Ollama HTTP server.

Currently, olol is attempting to fetch a status endpoint that Ollama does not provide, which leads to the HTTP 404. Since Ollama's API doesn't officially have a status endpoint (beyond version checks), the 404 is expected unless olol is updated to use a valid endpoint.
Options to fix this:

1. Modify olol to skip or change the status check:
    - Edit the code in olol that performs the health or status check on the Ollama HTTP server.
    - Change or remove the call to the HTTP path causing the 404.
    - Use /api/version or another valid endpoint for health checks instead.

2. Patch the olol source, if it is open source:
    - Find the section in olol's RPC server code that queries Ollama's HTTP status endpoint during startup or health checks.
    - Replace it with a call to /api/version or a no-op health check that doesn't cause a 404.

3. Request a feature or fix from the olol maintainers:
    - If this behavior can't be configured, consider opening an issue or feature request on olol's repository to support a proper health check or to skip invalid calls.

4. Suppress the warning if a code change is not feasible:
    - If you cannot modify the olol code immediately, you can ignore the 404 warning, because it doesn't block the RPC server's startup or operation.
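If the warning can't be removed at its source, the standard logging module (which olol's timestamped output suggests it uses) can filter it out. A sketch, with the message substring taken from the log above; note that filters attached to a logger don't apply to records propagated from child loggers, so the filter goes on the root handlers:

```python
import logging

class Drop404StatusWarning(logging.Filter):
    """Drop the known-harmless 'Failed to get Ollama status: HTTP 404' warning."""
    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False suppresses the record; True lets it through.
        return "Failed to get Ollama status: HTTP 404" not in record.getMessage()

# Attach to the root logger's handlers so the warning is hidden
# no matter which olol logger emits it.
for handler in logging.getLogger().handlers:
    handler.addFilter(Drop404StatusWarning())
```

This hides only that one message and leaves all other olol logging intact.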

In summary:

- The root cause is olol calling an unsupported Ollama API endpoint.
- The fix involves modifying olol's code or config to avoid that invalid call.
- /api/version works as a simple alternative health-check endpoint.
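The /api/version-based check suggested above could look like this minimal sketch (the function name ollama_healthy is mine, not olol's):

```python
import json
import urllib.error
import urllib.request

def ollama_healthy(base_url: str = "http://localhost:11434") -> bool:
    """Health check via /api/version, an endpoint Ollama is known to serve."""
    try:
        with urllib.request.urlopen(base_url + "/api/version", timeout=5) as resp:
            # A healthy daemon answers 200 with a JSON body like {"version": "0.x.y"}.
            return resp.status == 200 and json.load(resp).get("version") is not None
    except (OSError, json.JSONDecodeError):
        # Connection refused, timeout, or malformed body: treat as unhealthy.
        return False
```

Swapping olol's current status call for something like this would make the periodic health check (the 30s interval in the log) stop producing 404 warnings.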

If needed, I can help identify the piece of code or config in olol causing this and assist with changing it.
