Fuseki unresponsive to requests if performing a very large sync #173
This is at startup? This is how it currently works:

As well as the sync that happens when a request comes in (the normal case for an active system), there is a background thread that does a sync every 5 minutes. This ensures an idle server does not get too far behind in an active cluster. Sync at start is done by making that background thread sync immediately when the background task starts, then every 5 minutes. The original idea was not to hold up startup in the case where a server has an existing database. But I can see that if it is a long way behind, it is not useful for serving requests, and if it is not a long way behind, the sync is quick. This could be changed to run the first sync synchronously as the dataset is constructed, before the HTTP server is started.

If this is happening not at startup, then something else is going on. #171 may be the issue, and the fix includes "if already doing a sync elsewhere (same server), do not sync but serve the request from the current (unsync'ed) state", i.e. answer as if the request had arrived before the sync started.

A sync is a write transaction on the database. A write request will be held up because TDB only allows one writer at a time, but any number of readers can overlap (and see the before-write state of the database). The design favours query/reading data - read requests proceed with no need for any data locking.
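A minimal sketch (not the actual RDF Delta code) of the change described above: run the first sync synchronously before the HTTP server is started, then keep the 5-minute background schedule. The `syncWithPatchLog` and `startHttpServer` names are hypothetical stand-ins; the sync body runs inside a Jena write transaction, matching the one-writer/many-readers behaviour described above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.jena.query.Dataset;
import org.apache.jena.system.Txn;
import org.apache.jena.tdb2.TDB2Factory;

public class SyncAtStartup {
    private static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public static void main(String[] args) {
        Dataset dataset = TDB2Factory.connectDataset("DB2");

        // Run the first sync synchronously, before the HTTP server starts,
        // so a server that is a long way behind does not serve stale data.
        syncWithPatchLog(dataset);

        // Keep the background sync every 5 minutes so an idle server
        // does not fall too far behind an active cluster.
        scheduler.scheduleAtFixedRate(
                () -> syncWithPatchLog(dataset),
                5, 5, TimeUnit.MINUTES);

        // startHttpServer(dataset);  // hypothetical: start Fuseki only after the first sync
    }

    // Hypothetical stand-in for applying outstanding patches from the patch log.
    private static void syncWithPatchLog(Dataset dataset) {
        // A sync is one write transaction: TDB allows a single writer at a time,
        // while any number of readers overlap and see the before-write state.
        Txn.executeWrite(dataset, () -> {
            // apply pending patches here
        });
    }
}
```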
Hi @afs,
When I raised the issue it was driven by observations in our environment:
Anyhow, if something like that happens (which might happen in reality), it is essential in an HA setup that a load balancer, or in our case Kubernetes, can spot that (liveness and readiness probes) and not route traffic to such a node; see the sketch below.
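A minimal sketch (not part of Fuseki or RDF Delta) of the kind of readiness signal a Kubernetes probe could use: the endpoint returns 503 until the first sync has completed, so traffic is not routed to a node that is still catching up. The `/ready` path and the `initialSyncDone` flag are assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

import com.sun.net.httpserver.HttpServer;

public class ReadinessProbeExample {
    // Flipped to true once the initial (possibly very large) sync has finished.
    static final AtomicBoolean initialSyncDone = new AtomicBoolean(false);

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        server.createContext("/ready", exchange -> {
            boolean ready = initialSyncDone.get();
            byte[] body = (ready ? "OK" : "SYNCING").getBytes(StandardCharsets.UTF_8);
            // 200 => readiness probe passes; 503 => Kubernetes stops routing traffic here.
            exchange.sendResponseHeaders(ready ? 200 : 503, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        // ... run the large sync, then: initialSyncDone.set(true);
    }
}
```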
If a Fuseki node is performing a very large sync it is unresponsive. This causes a problem in a setup with load balancers, which cannot detect this and keep directing traffic to that Fuseki node.