Description
I've been up for two days trying to fix this. Very little sleep, very frustrated, even used AI to try to fix this... nothing works.
I have two machines, and I'm trying to run distributed inference with Ollama in WSL.
olol server --host 0.0.0.0 --port 50051 --ollama-host http://localhost:11434
(works on both machines: "Ollama gRPC server started on port 50051")
olol rpc-server --host 0.0.0.0 --port 50052 --device auto
(fails on both machines: "WARNING - Failed to get Ollama status: HTTP 404")
curl against /api/tags returns the models, and ollama ls lists the models, but olol rpc-server does not see them. So the problem must be with olol, since I can use Ollama just fine and pull from other places as well.
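To make the symptom reproducible, here is a minimal probe (a sketch; the helper is illustrative and not part of olol) that reports the HTTP status for a few Ollama paths. /api/tags and /api/version are real Ollama endpoints; whatever status path olol is hitting is the one returning 404:

```python
# Sketch: report the HTTP status for Ollama API paths. The probe()
# helper is illustrative, not part of olol or Ollama.
import urllib.error
import urllib.request

def probe(base_url, path, opener=urllib.request.urlopen):
    """Return the HTTP status code for base_url + path, or None if unreachable."""
    try:
        with opener(base_url + path, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code   # e.g. 404 for a path Ollama does not serve
    except OSError:
        return None       # Ollama not running / host unreachable

if __name__ == "__main__":
    base = "http://localhost:11434"
    for path in ("/api/tags", "/api/version"):
        print(path, "->", probe(base, path))
```

If both documented paths print 200 while olol still logs the 404 warning, that confirms the failing request is to a path Ollama doesn't serve.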
olol proxy --host 0.0.0.0 --port 8000 --servers "192.168.x.x:50051,192.168.y.y:50051" --distributed --rpc-servers "192.168.x.x:50052,192.168.y.y:50052"
Servers: 2/2 healthy
Distributed Inference: ENABLED
Active Requests: 0
Total Requests: 0
Uptime: 00:00:02
Generate: 0  Chat: 0  Embeddings: 0
BUT it says 0 models on both!
What's the deal?
Tried again:
olol rpc-server --host 0.0.0.0 --port 50052 --device cuda --quantize q5_0 --flash-attention --context-window 16384
2025-08-14 23:17:02,870 - INFO - Ollama already running at http://localhost:11434
2025-08-14 23:17:02,870 - INFO - Connected to Ollama at http://localhost:11434
2025-08-14 23:17:03,904 - WARNING - Failed to get Ollama status: HTTP 404
2025-08-14 23:17:05,119 - INFO - Server initialized with device: cuda:0
2025-08-14 23:17:05,119 - INFO - Device capabilities: {'backend_type': 'cuda', 'device_id': 0, 'memory': 25756696576, 'compute_capability': '8.9', 'name': 'NVIDIA GeForce RTX 4090'}
2025-08-14 23:17:05,119 - INFO - Starting Ollama health check thread (interval: 30s)
2025-08-14 23:17:05,125 - INFO - RPC server started on 0.0.0.0:50052 with device cuda:0
2025-08-14 23:17:07,199 - INFO - Discovery service started for server with ID 93bed723-b704-44a6-ab49-8b6f4f46cd75 (IPv6: Supported)
2025-08-14 23:17:07,199 - INFO - Auto-discovery service started
---- UPDATE FROM AI RESEARCH ----
To fix the issue where the olol RPC server is making a status check call that results in a 404 from Ollama, you will need to modify or disable the health/status check request that olol sends to the Ollama HTTP server.
Currently, olol is attempting to fetch a status endpoint that Ollama does not provide, which leads to the HTTP 404. Since Ollama's API doesn't officially have a status endpoint (beyond version checks), the 404 is expected unless olol is updated to use a valid endpoint.
Options to fix this:

1. Modify olol to skip or change the status check:
   - Edit the code in olol that performs the health or status check on the Ollama HTTP server.
   - Change or remove the call to the HTTP path causing the 404.
   - Use /api/version or another valid endpoint for health checks instead.

2. Patch the olol source, if it is open source:
   - Find the section in olol's RPC server code that queries Ollama's HTTP status endpoint during startup or health checks.
   - Replace it with a call to /api/version, or a no-op health check that doesn't cause a 404.

3. Request a feature or fix from the olol maintainers:
   - If this behavior can't be configured, consider opening an issue or feature request on olol's repository to support a proper health check or skip invalid calls.

4. Suppress the warning if a code change is not feasible:
   - If you cannot modify olol's code immediately, you can ignore the 404 warning, because it doesn't block RPC server startup or operation.
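The suggested workaround of checking /api/version (a real Ollama endpoint) instead of a nonexistent status path could look like the sketch below. The function name and signature are illustrative, not olol's actual code:

```python
# Sketch of a health check against /api/version, which Ollama does serve.
# ollama_healthy() is an illustrative name, not olol's real API.
import json
import urllib.request

def ollama_healthy(base_url="http://localhost:11434",
                   opener=urllib.request.urlopen):
    """Return (healthy, version) based on GET {base_url}/api/version."""
    try:
        with opener(base_url + "/api/version", timeout=2) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
        return True, payload.get("version")
    except (OSError, ValueError):
        # OSError covers connection errors and HTTP errors (HTTPError
        # subclasses it); ValueError covers malformed JSON.
        return False, None
```

A patched olol health-check thread could call this instead of the failing status request and log the returned version on success.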
In summary:
- The root cause is olol calling an unsupported Ollama API endpoint.
- The fix involves modifying olol's code or config to avoid that invalid call.
- /api/version works as a simple alternative endpoint for health checks.
If needed, I can help identify the piece of code or config in olol that causes this and assist with changing it.