Make generateEmbeddings session timeout configurable#612
Open
…imeout The embedding session had a hardcoded 30-minute maxDuration, which caused 'Session expired' errors on slow hardware (e.g. CPU-only embedding on a Raspberry Pi with 500+ documents). Since embedding duration is proportional to corpus size and hardware speed, a fixed timeout is inappropriate. Add a maxDuration parameter to EmbedOptions (and the public QMDStore.embed API) that defaults to 0, which disables the timeout. Callers that want a timeout (like the CLI) can pass one explicitly.
Make generateEmbeddings session timeout configurable (default: no timeout)
generateEmbeddings currently uses a hardcoded 30-minute maxDuration for its LLM session. On slower hardware (CPU-only embedding on a Raspberry Pi with 500+ documents, for example) this wall-clock limit is easily exceeded, causing the session to abort mid-run with a 'Session expired' error.
The timeout fires regardless of whether the session is actively doing work. Since embedding duration scales with corpus size and hardware speed, a fixed cap is the wrong constraint here.
This PR adds a maxDuration field to EmbedOptions (and the public QMDStore.embed() API), defaulting to 0, which disables the timer. Callers that want a safety timeout can pass one explicitly.
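For reviewers, here is a minimal sketch of how the default could be resolved. It is not the code in this PR: the `resolveMaxDuration` helper, the `NO_TIMEOUT` constant, the millisecond unit, and the input validation are all assumptions made for illustration, and the real `EmbedOptions` has other fields that are omitted here.

```typescript
// Hypothetical, trimmed-down shape; only the new field is shown.
interface EmbedOptions {
  /** Wall-clock limit for the embedding session (assumed to be milliseconds). 0 or omitted disables it. */
  maxDuration?: number;
}

// Assumed sentinel: 0 means "no timer".
const NO_TIMEOUT = 0;

// Hypothetical helper: fall back to "no timeout" when the caller passes nothing.
function resolveMaxDuration(options: EmbedOptions = {}): number {
  const value = options.maxDuration ?? NO_TIMEOUT;
  // Validation is an illustrative addition, not something the PR is known to do.
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError(`maxDuration must be a non-negative number, got ${value}`);
  }
  return value;
}

// A caller that still wants the old 30-minute cap (e.g. the CLI) passes it explicitly.
const cliMaxDuration = resolveMaxDuration({ maxDuration: 30 * 60 * 1000 });
// A library caller that passes nothing gets no timeout.
const libraryMaxDuration = resolveMaxDuration();
```

Using `??` rather than `||` keeps an explicit `0` distinct from "not provided", although both resolve to the same value under this default.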
The session timeout machinery in LLMSession already handles maxDuration: 0 correctly (skips the setTimeout), so no changes to the session layer are needed.
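The behaviour described above can be pictured roughly as follows. This is a hypothetical stand-in, not the actual `LLMSession` implementation: the `SessionTimer` class and its members are invented for illustration, and the only property taken from the description is that a `maxDuration` of 0 never reaches `setTimeout`.

```typescript
// Hypothetical stand-in for the session's timer logic.
class SessionTimer {
  private handle: ReturnType<typeof setTimeout> | undefined;
  private expiredFlag = false;

  constructor(maxDuration: number, onExpire: () => void = () => {}) {
    // With maxDuration of 0 no timer is scheduled, so the session can never expire.
    if (maxDuration > 0) {
      this.handle = setTimeout(() => {
        this.expiredFlag = true;
        onExpire();
      }, maxDuration);
    }
  }

  /** True while a timer is pending. */
  get armed(): boolean {
    return this.handle !== undefined;
  }

  /** True once the wall-clock limit has fired. */
  get expired(): boolean {
    return this.expiredFlag;
  }

  /** Cancel the pending timer, if any, so it cannot fire after the work is done. */
  release(): void {
    if (this.handle !== undefined) {
      clearTimeout(this.handle);
      this.handle = undefined;
    }
  }
}

const unlimited = new SessionTimer(0);           // no timer scheduled
const capped = new SessionTimer(30 * 60 * 1000); // timer pending until released
capped.release();
```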