Switch MCP HTTP server to stateless mode #624
Open
DavidDwyer87 wants to merge 46 commits into tobi:main
Implements the LLM interface using the Ollama REST API as an alternative to the default node-llama-cpp local GGUF inference.
New files:
- src/ollama.ts: OllamaLLM class with embed(), generate(), expandQuery(), rerank(), modelExists(), dispose() methods using Ollama REST API
- test/ollama.test.ts: 45 unit tests covering all methods, error handling, configuration, and getDefaultLLM() routing
Modified files:
- src/llm.ts: Added getDefaultLLM() function that routes to OllamaLLM when QMD_LLM_BACKEND=ollama, otherwise falls back to LlamaCpp
Configuration (env vars):
- QMD_LLM_BACKEND=ollama: enable Ollama backend
- QMD_OLLAMA_BASE_URL: server URL (default: http://localhost:11434)
- QMD_OLLAMA_EMBED_MODEL: embedding model (default: nomic-embed-text)
- QMD_OLLAMA_GENERATE_MODEL: generation model (default: qwen3:1.7b)
- QMD_OLLAMA_RERANK_MODEL: reranking model (default: qwen3:0.6b)
Reranking uses chat-based relevance scoring since Ollama has no native rerank API. The model outputs relevance scores which are parsed and normalized to [0, 1].
Also verified MCP server works via both stdio and HTTP transports with all 4 tools (query, get, multi_get, status) accessible.
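The chat-based reranking mentioned above (parse a score from the model's reply, then normalize it to [0, 1]) could look roughly like the sketch below. The function name parseRelevanceScore and the 0 to 10 scoring scale are assumptions made for illustration; the scale and parsing rules actually used in src/ollama.ts are not shown in this PR description.

```typescript
// Hypothetical helper, not taken from src/ollama.ts.
// Extracts the first number from a chat reply and maps it into [0, 1].
// Assumption: the prompt asks the model for a score between 0 and maxScore.
function parseRelevanceScore(reply: string, maxScore: number = 10): number {
  const match = reply.match(/-?\d+(?:\.\d+)?/);
  if (!match) {
    return 0; // no number in the reply: treat the document as irrelevant
  }
  const raw = Number.parseFloat(match[0]);
  if (!Number.isFinite(raw)) {
    return 0;
  }
  // Clamp so out-of-range answers still land inside [0, 1].
  return Math.min(1, Math.max(0, raw / maxScore));
}

// Sort candidate documents by their normalized score, best first.
function rankByScore(replies: Record<string, string>): string[] {
  return Object.keys(replies).sort(
    (a, b) => parseRelevanceScore(replies[b]) - parseRelevanceScore(replies[a]),
  );
}
```

Clamping keeps a chatty or out-of-range reply from producing a score outside the documented range, at the cost of hiding that the model misbehaved.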
Adds 9 new tools to the QMD MCP server for full programmatic control:
Collection Management:
- collections: list all collections with stats
- add_collection: add a new collection (name, path, pattern, ignore)
- remove_collection: remove a collection by name
- rename_collection: rename a collection
Context Management:
- contexts: list all contexts + global context
- add_context: add context to a collection path
- remove_context: remove context from a collection path
Indexing:
- update_index: re-index collections from filesystem
- embed: generate vector embeddings for documents
Previously the MCP server only exposed 4 read-only tools (query, get, multi_get, status). Now agents can manage collections, set context, and trigger indexing operations remotely via MCP.
Test coverage: 20 unit tests covering all new tools. Combined with the existing 4 tools, QMD now exposes 13 MCP tools total.
feat: add Ollama LLM provider for remote model inference
feat: add MCP management tools for collections, contexts, and indexing
…llama
When QMD_LLM_BACKEND=ollama is set, QMD no longer downloads or loads local GGUF models. The store layer now uses the generic LLM interface throughout, routing to OllamaLLM when configured.
Changes to src/store.ts:
- getLlm() returns LLM interface instead of LlamaCpp concrete type
- Store.llm field typed as LLM instead of LlamaCpp
- All llmOverride params typed as LLM instead of LlamaCpp
- All getDefaultLlamaCpp() calls replaced with getDefaultLLM()
- chunkDocumentByTokens() uses instanceof guard for backend-specific logic
Changes to src/llm.ts:
- Added intent to ExpandQueryOptions in LLM interface
- Added embedBatch() and embedModelName to LLM interface
- Added SimpleLLMSession for non-LlamaCpp backends
- Updated withLLMSessionForLlm() to accept LLM interface
Changes to src/ollama.ts:
- Added embedModelName getter and embedBatch() method
- Renamed private fields to avoid interface naming conflicts
Changes to test/ollama.test.ts:
- Fixed field name references after private field rename
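The backend routing described above can be illustrated with a small sketch of how the environment variables might be read. The variable names and default values are the ones listed in the Ollama provider commit in this PR; the helper names (usesOllama, readOllamaSettings), the OllamaSettings shape, and the exact-match comparison on QMD_LLM_BACKEND are assumptions for illustration, not the code in src/llm.ts.

```typescript
// Hypothetical helpers; the real getDefaultLLM() in src/llm.ts may differ.
type Env = Record<string, string | undefined>;

interface OllamaSettings {
  baseUrl: string;
  embedModel: string;
  generateModel: string;
  rerankModel: string;
}

// Assumption: the backend is selected by an exact, case-sensitive match.
function usesOllama(env: Env): boolean {
  return env.QMD_LLM_BACKEND === "ollama";
}

// Fall back to the defaults documented in the commit message.
function readOllamaSettings(env: Env): OllamaSettings {
  return {
    baseUrl: env.QMD_OLLAMA_BASE_URL ?? "http://localhost:11434",
    embedModel: env.QMD_OLLAMA_EMBED_MODEL ?? "nomic-embed-text",
    generateModel: env.QMD_OLLAMA_GENERATE_MODEL ?? "qwen3:1.7b",
    rerankModel: env.QMD_OLLAMA_RERANK_MODEL ?? "qwen3:0.6b",
  };
}
```

Passing the environment in as an argument, rather than reading process.env inside the helpers, keeps the routing decision easy to unit test.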
…tion refactor: use LLM interface in store.ts to skip GGUF downloads with Ollama
- /health now returns total indexed documents and docs needing embedding
- Update test to verify new fields
- Add Jenkinsfile for CI/CD pipeline
Remove in-memory session management in favor of creating a fresh McpServer + transport per request (sessionIdGenerator: undefined). This eliminates 'Session not found' errors that occur when the server restarts and clients hold expired session IDs.
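The per-request pattern described above can be sketched without the MCP SDK. Everything in this example is a stand-in: HttpRequest, HttpResponse, RequestScopedServer, and handleMcpRequest are hypothetical names, and rejecting every non-POST method with 405 is an assumption. The actual change uses McpServer and a stateless transport from the SDK.

```typescript
// Hypothetical plain-object shapes standing in for real HTTP types.
interface HttpRequest {
  method: string;
  body: string;
}

interface HttpResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Stand-in for a fresh "server + transport" pair; not the MCP SDK API.
interface RequestScopedServer {
  handle(req: HttpRequest): Promise<HttpResponse>;
  close(): Promise<void>;
}

// Stateless dispatch: no session map, nothing shared between calls.
async function handleMcpRequest(
  req: HttpRequest,
  createServer: () => RequestScopedServer,
): Promise<HttpResponse> {
  if (req.method !== "POST") {
    // Assumption: every non-POST method is rejected in stateless mode.
    return {
      status: 405,
      headers: { Allow: "POST" },
      body: JSON.stringify({ error: "Method not allowed in stateless mode" }),
    };
  }
  const server = createServer(); // fresh instance for this request only
  try {
    return await server.handle(req);
  } finally {
    await server.close(); // clean up even if handling throws
  }
}
```

Because no state outlives a request, a server restart cannot invalidate anything a client is holding. With a streaming response, cleanup would need to wait until the body has been fully sent, which this sketch does not model.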
Summary
- Switch the MCP HTTP server to stateless mode (sessionIdGenerator: undefined)
- Each /mcp request now creates a fresh McpServer + WebStandardStreamableHTTPServerTransport, handles the request, then cleans up
- … /mcp now return 405 (not applicable for stateless)
- Removed the sessions Map, createSession() helper, and related session lifecycle code
Why
When the MCP HTTP server restarts (crash, deploy, OOM), all in-memory sessions are lost. Clients holding old session IDs receive persistent "Session not found" errors and cannot recover without restarting their own process. This is especially painful for remote MCP setups (e.g., type: "remote" in opencode config) where the client and server are on different machines.
Changes
- Removed the sessions Map, createSession(), session routing logic, and stale session handling
- Removed now-unused imports (randomUUID, isInitializeRequest)
- Fresh McpServer + stateless transport per POST request, with cleanup after response
- … /mcp endpoint
Testing
All verified locally with node dist/cli/qmd.js mcp --http --port 8181 --daemon: