Releases: tobi/qmd
v2.1.0
[2.1.0] - 2026-04-05
Code files now chunk at function and class boundaries via tree-sitter,
clickable editor links land you at the right line from search results,
and per-collection model configuration means you can point different
collections at different embedding models. 25+ community PRs fix
embedding stability, BM25 accuracy, and cross-platform launcher issues.
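The per-collection model configuration resolves each model URI in the order given under Changes: config, then environment variable, then built-in default. A minimal sketch of that precedence, assuming a plain dict stands in for a collection's parsed `models:` section. The function name `resolve_model` and the placeholder default URIs are hypothetical, not QMD's actual code or defaults.

```python
import os

# Hypothetical placeholder defaults; QMD's real built-in URIs are not listed here.
BUILTIN_DEFAULTS = {
    "embed": "builtin://embed-default",
    "rerank": "builtin://rerank-default",
    "generate": "builtin://generate-default",
}

# Environment variable names as listed in the changelog.
ENV_VARS = {
    "embed": "QMD_EMBED_MODEL",
    "rerank": "QMD_RERANK_MODEL",
    "generate": "QMD_GENERATE_MODEL",
}


def resolve_model(role, collection_models=None, env=None):
    """Return the model URI for a role: config > env var > built-in default."""
    env = os.environ if env is None else env
    configured = (collection_models or {}).get(role)
    if configured:
        return configured
    from_env = env.get(ENV_VARS[role])
    if from_env:
        return from_env
    return BUILTIN_DEFAULTS[role]
```

A collection that sets only `embed` would therefore still pick up `QMD_RERANK_MODEL` from the environment, or the default when that variable is unset.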
Changes
- AST-aware chunking for code files via `web-tree-sitter`. Supported
  languages: TypeScript/JavaScript, Python, Go, and Rust. Code files
  are chunked at function, class, and import boundaries instead of
  arbitrary text positions. Markdown and unknown file types are unchanged.
  `--chunk-strategy <auto|regex>` flag on `qmd embed` and `qmd query`
  (default `regex`). SDK: `chunkStrategy` option on `embed()` and
  `search()`. `qmd status` shows grammar availability.
- `qmd bench <fixture.json>` command for search quality benchmarks.
  Measures precision@k, recall, MRR, and F1 across BM25, vector, hybrid,
  and full pipeline backends. Ships with an example fixture against
  the eval-docs test collection. #470 (thanks @jmilinovich)
- `models:` section in `index.yml` lets you configure `embed`, `rerank`,
  and `generate` model URIs per collection. Resolution order is
  config > env var (`QMD_EMBED_MODEL`, `QMD_RERANK_MODEL`,
  `QMD_GENERATE_MODEL`) > built-in default. #502
  (thanks @JohnRichardEnders)
- CLI search output now emits clickable OSC 8 terminal hyperlinks when
  stdout is a TTY. Links resolve `qmd://` paths to absolute filesystem
  paths and open in editors via URI templates (default:
  `vscode://file/{path}:{line}:{col}`). Configure with `QMD_EDITOR_URI`
  or `editor_uri` in the YAML config. #508 (thanks @danmackinlay)
- `--no-rerank` flag skips the reranking step in `qmd query`, which is
  useful when you want fast results or don't have a GPU. Also exposed as
  `rerank: false` on the MCP `query` tool. #370 (thanks @mvanhorn),
  #478 (thanks @zestyboy)
- ONNX conversion script for deploying embedding models via
  Transformers.js. #399 (thanks @shreyaskarnik)
- GitHub Actions workflow to build the Nix flake on Linux and macOS.
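The metrics reported by `qmd bench` have standard textbook definitions. The sketch below computes them for ranked result lists. The function names, and the choice to compute recall at the same top-k cutoff as precision, are assumptions; QMD's own benchmark code may differ in detail.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """runs: list of (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Running the same fixture through each backend and comparing these numbers shows where a backend gains recall at the cost of precision.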
Fixes
- Embedding: prevent `qmd embed` from running indefinitely when the
  embedding loop stalls. #458 (thanks @ccc-fff)
- Embedding: truncate oversized text before embedding to prevent GGML
  crash, and bound memory usage during batch embedding. #393
  (thanks @lskun), #395 (thanks @ProgramCaiCai)
- Embedding: set explicit embed context size (default 2048, configurable
  via `QMD_EMBED_CONTEXT_SIZE`) instead of using the model's full
  window. #500
- Embedding: error on dimension mismatch instead of silently rebuilding
  the vec0 table. #501
- Embedding: handle vec0 `OR REPLACE` limitation in `insertEmbedding`.
  #456 (thanks @antonio-mello-ai)
- Embedding: fix model selection when multiple models are configured.
  #494
- BM25: correct field weights to include all 3 FTS columns; title,
  body, and path were not weighted correctly. #462 (thanks @goldsr09)
- BM25: handle hyphenated tokens in FTS5 lex queries so terms like
  "real-time" match correctly. #463 (thanks @goldsr09)
- BM25: preserve underscores in search terms instead of stripping them.
  #404
- BM25: use CTE in `searchFTS` to prevent query planner regression with
  collection filter.
- Reranker: increase default context size 2048→4096 and make
  configurable via `QMD_RERANK_CONTEXT_SIZE`. Fix template overhead
  underestimate 200→512. #453 (thanks @builderjarvis)
- GPU: catch initialization failures and fall back to CPU instead of
  crashing.
- MCP: read version from `package.json` instead of hardcoding. #431
- MCP: include collection name in status output. #416
- Multi-get: support brace expansion patterns in glob matching. #424
- Launcher: prioritize `package-lock.json` to prevent Bun false
  positive. #385 (thanks @rymalia)
- Launcher: remove `$BUN_INSTALL` check that caused false Bun detection.
  #362 (thanks @syedair)
- Launcher: skip Git Bash path detection on WSL. #371
  (thanks @oysteinkrog)
- Model cache: respect `XDG_CACHE_HOME` for model cache directory. #457
  (thanks @antonio-mello-ai)
- SQLite: add macOS Homebrew SQLite support for Bun and restore
  actionable errors. #377 (thanks @serhii12)
- Pin zod to exact 4.2.1 to fix `tsc` build failure. #382
  (thanks @rymalia)
- Preserve dots and original case in `handelize()`: filenames like
  `MEMORY.md` no longer become `memory-md`. #475 (thanks @alexei-led)
- Include `line` in `--json` search output so editor integrations can
  jump directly to `file:line`. #506 (thanks @danmackinlay)
- Nix: fix paths in flake and make Bun dependency a fixed-output
  derivation so sandboxed Linux builds work offline. #479
  (thanks @surma-dump)
- Sync stale `bun.lock` (`better-sqlite3` 11.x → 12.x). CI and release
  script now use `--frozen-lockfile` to prevent recurrence. #386
  (thanks @Mic92)
- Approve native build scripts in pnpm so `better-sqlite3` and
  tree-sitter modules compile correctly. Update vitest ^3.0.0 → ^3.2.4.
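The `line` field in `--json` output and the clickable links listed under Changes both come down to turning a search hit into an editor URI. The sketch below fills the default template named in this release and wraps the result in an OSC 8 escape sequence. The helper names, the percent-encoding of the path, and the `is_tty` parameter are assumptions for illustration, not QMD's implementation.

```python
import sys
from urllib.parse import quote

DEFAULT_TEMPLATE = "vscode://file/{path}:{line}:{col}"  # default named in the changelog


def editor_uri(path, line, col=1, template=DEFAULT_TEMPLATE):
    """Fill the {path}, {line}, and {col} placeholders of an editor URI template."""
    return (
        template.replace("{path}", quote(path))  # percent-encoding is an assumption
        .replace("{line}", str(line))
        .replace("{col}", str(col))
    )


def osc8_link(uri, text):
    """Wrap text in an OSC 8 hyperlink: ESC ] 8 ; ; URI ST text ESC ] 8 ; ; ST."""
    return f"\x1b]8;;{uri}\x1b\\{text}\x1b]8;;\x1b\\"


def render(path, line, label, is_tty):
    """Return a clickable label for terminals, or the plain label otherwise."""
    if is_tty:
        return osc8_link(editor_uri(path, line), label)
    return label


if __name__ == "__main__":
    # Only emit escape sequences when stdout is a terminal.
    print(render("notes/todo.md", 42, "notes/todo.md:42", sys.stdout.isatty()))
```

Gating on the TTY check keeps piped output free of escape sequences, which matters for scripts that parse search results.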
v2.0.1
[2.0.1] - 2026-03-10
Changes
- `qmd skill install` copies the packaged QMD skill into
  `~/.claude/commands/` for one-command setup. #355 (thanks @nibzard)
Fixes
- Fix Qwen3-Embedding GGUF filename case: HuggingFace filenames are
  case-sensitive, and the lowercase variant returned 404. #349 (thanks @byheaven)
- Resolve symlinked global launcher path so `qmd` works correctly when
  installed via `npm i -g`. #352 (thanks @nibzard)
[2.0.0] - 2026-03-10
QMD 2.0 declares a stable library API. The SDK is now the primary interface —
the MCP server is a clean consumer of it, and the source is organized into
src/cli/ and src/mcp/. Also: Node 25 support and a runtime-aware bin wrapper
for bun installs.
Changes
- Stable SDK API with `QMDStore` interface: search, retrieval, collection/context
  management, indexing, lifecycle
- Unified `search()`: pass `query` for auto-expansion or `queries` for
  pre-expanded lex/vec/hyde; replaces the old query/search/structuredSearch split
- New `getDocumentBody()`, `getDefaultCollectionNames()`, `Maintenance` class
- MCP server rewritten as a clean SDK consumer with zero internal store access
- CLI and MCP organized into `src/cli/` and `src/mcp/` subdirectories
- Runtime-aware `bin/qmd` wrapper detects bun vs node to avoid ABI mismatches.
  Closes #319
- `better-sqlite3` bumped to ^12.4.5 for Node 25 support. Closes #257
- Utility exports: `extractSnippet`, `addLineNumbers`, `DEFAULT_MULTI_GET_MAX_BYTES`
Fixes
- Remove unused `import { resolve }` in store.ts that shadowed local export
v1.1.6
[1.1.6] - 2026-03-09
QMD can now be used as a library. import { createStore } from '@tobilu/qmd'
gives you the full search and indexing API — hybrid query, BM25, structured
search, collection/context management — without shelling out to the CLI.
Changes
- SDK / library mode: `createStore({ dbPath, config })` returns a
  `QMDStore` with `query()`, `search()`, `structuredSearch()`, `get()`,
  `multiGet()`, and collection/context management methods. Supports inline
  config (no files needed) or a YAML config path.
- Package exports: `package.json` now declares `main`, `types`, and
  `exports` so bundlers and TypeScript resolve `@tobilu/qmd` correctly.
[1.1.5] - 2026-03-07
Ambiguous queries like "performance" now produce dramatically better results
when the caller knows what they mean. The new intent parameter steers all
five pipeline stages — expansion, strong-signal bypass, chunk selection,
reranking, and snippet extraction — without searching on its own. Design and
original implementation by Ilya Grigorik (@vyalamar) in #180.
Changes
- Intent parameter: optional `intent` string disambiguates queries across
  the entire search pipeline. Available via CLI (`--intent` flag or `intent:`
  line in query documents), MCP (`intent` field on the query tool), and
  programmatic API. Adapted from PR #180 (thanks @vyalamar).
- Query expansion: when intent is provided, the expansion LLM prompt
  includes `Query intent: {intent}`, matching the finetune training data
  format for better-aligned expansions.
- Reranking: intent is prepended to the rerank query so Qwen3-Reranker
  scores with domain context.
- Chunk selection: intent terms scored at 0.5× weight alongside query
  terms (1.0×) when selecting the best chunk per document for reranking.
- Snippet extraction: intent terms scored at 0.3× weight to nudge
  snippets toward intent-relevant lines without overriding query anchoring.
- Strong-signal bypass disabled with intent: when intent is provided, the
  BM25 strong-signal shortcut is skipped, because the obvious keyword match
  may not be what the caller wants.
- MCP instructions: callers are now guided to provide `intent` on every
  search call for disambiguation.
- Query document syntax: `intent:` recognized as a line type. At most one
  per document, cannot appear alone. Grammar updated in `docs/SYNTAX.md`.
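To illustrate how the chunk-selection weights interact, here is a minimal sketch that scores chunks by term overlap, with query terms at 1.0× and intent terms at 0.5× as stated above. The tokenizer, the count-of-matching-terms scoring, and the function names are assumptions; QMD's actual scoring is not reproduced here.

```python
import re

QUERY_WEIGHT = 1.0   # query term weight, per the changelog
INTENT_WEIGHT = 0.5  # intent term weight for chunk selection, per the changelog


def tokenize(text):
    """Lowercased alphanumeric tokens (an assumed, simplistic tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_score(chunk, query, intent=None):
    """Weighted count of query and intent terms that appear in the chunk."""
    words = tokenize(chunk)
    query_terms = tokenize(query)
    score = QUERY_WEIGHT * len(query_terms & words)
    if intent:
        # Terms already in the query are not counted a second time.
        intent_terms = tokenize(intent) - query_terms
        score += INTENT_WEIGHT * len(intent_terms & words)
    return score


def best_chunk(chunks, query, intent=None):
    """Pick the highest-scoring chunk; ties go to the earliest chunk."""
    return max(chunks, key=lambda chunk: chunk_score(chunk, query, intent))
```

With the query "performance" alone, two chunks that both mention the word tie. Adding an intent such as "web page load times" tips the choice toward the chunk about page load latency without letting intent terms outweigh a query match.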
[1.1.2] - 2026-03-07
13 community PRs merged. GPU initialization replaced with node-llama-cpp's
built-in autoAttempt — deleting ~220 lines of manual fallback code and
fixing GPU issues reported across 10+ PRs in one shot. Reranking is faster
through chunk deduplication and a parallelism cap that prevents VRAM
exhaustion.
Changes
- GPU init: use node-llama-cpp's `build: "autoAttempt"` instead of manual
  GPU backend detection. Automatically tries Metal/CUDA/Vulkan and falls back
  gracefully. #310 (thanks @giladgd, the node-llama-cpp author)
- Query `--explain`: `qmd query --explain` exposes retrieval score traces:
  backend scores, per-list RRF contributions, top-rank bonus, reranker
  score, and final blended score. Works in JSON and CLI output. #242
  (thanks @vyalamar)
- Collection ignore patterns: `ignore: ["Sessions/**", "*.tmp"]` in
  collection config to exclude files from indexing. #304 (thanks @sebkouba)
- Multilingual embeddings: `QMD_EMBED_MODEL` env var lets you swap in
  models like Qwen3-Embedding for non-English collections. #273 (thanks
  @daocoding)
- Configurable expansion context: `QMD_EXPAND_CONTEXT_SIZE` env var
  (default 2048). Previously used the model's full 40960-token window,
  wasting VRAM. #313 (thanks @0xble)
- `candidateLimit` exposed: `-C`/`--candidate-limit` flag and MCP
  parameter to tune how many candidates reach the reranker. #255 (thanks
  @pandysp)
- MCP multi-session: HTTP transport now supports multiple concurrent
  client sessions, each with its own server instance. #286 (thanks @joelev)
Fixes
- Reranking performance: cap parallel rerank contexts at 4 to prevent
  VRAM exhaustion on high-core machines. Deduplicate identical chunk texts
  before reranking, so the same content from different files now shares a
  single reranker call. Cache scores by content hash instead of file path.
- Deactivate stale docs when all files are removed from a collection and
  `qmd update` is run. #312 (thanks @0xble)
- Handle emoji-only filenames (`🐘.md` → `1f418.md`) instead of crashing.
  #308 (thanks @debugerman)
- Skip unreadable files during indexing (e.g. iCloud-evicted files returning
  EAGAIN) instead of crashing. #253 (thanks @jimmynail)
- Suppress progress bar escape sequences when stderr is not a TTY. #230
  (thanks @dgilperez)
- Emit format-appropriate empty output (`[]` for JSON, CSV header for CSV,
  etc.) instead of plain text "No results." #228 (thanks @amsminn)
- Correct Windows sqlite-vec package name (`sqlite-vec-windows-x64`) and add
  `sqlite-vec-linux-arm64`. #225 (thanks @ilepn)
- Fix claude plugin setup CLI commands in README. #311 (thanks @gi11es)
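The reranking fix above combines two ideas: identical chunk texts share one reranker call, and scores are cached by content hash instead of file path. A minimal sketch, assuming a caller-supplied `score_fn` stands in for the expensive reranker; the SHA-256 key and all names here are illustrative, not QMD's code.

```python
import hashlib


def content_key(text):
    """Stable key derived from the chunk text, independent of its file path."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def rerank_deduped(query, chunks, score_fn, cache=None):
    """Score (path, text) chunks, calling score_fn once per distinct text.

    Returns (path, score) pairs in input order. `cache` may be passed in to
    reuse scores across calls; it is keyed by (query, content hash).
    """
    cache = {} if cache is None else cache
    results = []
    for path, text in chunks:
        key = (query, content_key(text))
        if key not in cache:
            cache[key] = score_fn(query, text)  # the only expensive step
        results.append((path, cache[key]))
    return results
```

Keying on content means a file that is renamed or copied keeps its cached score, which a path-keyed cache would miss.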
[1.1.1] - 2026-03-06
Fixes
- Reranker: truncate documents exceeding the 2048-token context window
  instead of silently producing garbage scores. Long chunks (e.g. from
  PDF ingestion) now get a fair ranking.
- Nix: add python3 and cctools to build dependencies. #214 (thanks
  @pcasaretto)
[1.1.0] - 2026-02-20
QMD now speaks in query documents — structured multi-line queries where every line is typed (lex:, vec:, hyde:), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit expand: and auto-expanded by the LLM). Lex now supports quoted phrases and negation ("C++ performance" -sports -athlete), making intent-aware disambiguation practical. The formal query grammar is documented in docs/SYNTAX.md.
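To make the lex syntax concrete, the sketch below splits a lex query into plain terms, quoted phrases, and exclusions. It is an illustrative parser written for this note, not QMD's implementation; the authoritative grammar is the one in docs/SYNTAX.md, and the function name and output shape are assumptions.

```python
import re

# Either an optionally negated quoted phrase, or a run of non-space characters.
TOKEN = re.compile(r'(-?)"([^"]+)"|(\S+)')


def parse_lex(query):
    """Split a lex query into terms, quoted phrases, and excluded items."""
    parsed = {"terms": [], "phrases": [], "exclude": []}
    for match in TOKEN.finditer(query):
        negated, phrase, word = match.groups()
        if phrase is not None:
            target = "exclude" if negated else "phrases"
            parsed[target].append(phrase)
        elif word.startswith("-") and len(word) > 1:
            parsed["exclude"].append(word[1:])  # -term
        else:
            parsed["terms"].append(word)
    return parsed
```

For the example above, `parse_lex('"C++ performance" -sports -athlete')` yields one phrase and two exclusions, while a hyphenated word such as `real-time` stays a single term because only a leading hyphen marks negation.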
The npm package now uses the standard #!/usr/bin/env node bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
Changes
- Query document format: multi-line queries with typed sub-queries (`lex:`, `vec:`, `hyde:`). Plain queries remain the default (`expand:` implicit, but not written inside the document). First sub-query gets 2× fusion weight, so put your strongest signal first. Formal grammar in `docs/SYNTAX.md`.
- Lex syntax: full BM25 operator support. `"exact phrase"` for verbatim matching; `-term` and `-"phrase"` for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g. `performance -sports -athlete`).
- `expand:` shortcut: send a single plain query (or start the document with `expand:` on its only line) to auto-expand via the local LLM. Query documents themselves are limited to `lex`, `vec`, and `hyde` lines.
- MCP `query` tool (renamed from `structured_search`): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex.
- HTTP `/query` endpoint (renamed from `/search`; `/search` kept as silent alias).
- `collections` array filter: filter by multiple collections in a single query (`collections: ["notes", "brain"]`). Removed the single `collection` string param; array only.
- Collection `include`/`exclude`: `includeByDefault: false` hides a collection from all queries unless explicitly named via `collections`. CLI: `qmd collection exclude <name>` / `qmd collection include <name>`.
- Collection `update-cmd`: attach a shell command that runs before every `qmd update` (e.g. `git stash && git pull --rebase --ff-only && git stash pop`). CLI: `qmd collection update-cmd <name> '<cmd>'`.
- `qmd status` tips: shows actionable tips when collections lack context descriptions or update commands.
- `qmd collection` subcommands: `show`, `update-cmd`, `include`, `exclude`. Bare `qmd collection` now prints help.
- Packaging: replaced custom bash wrapper with standard `#!/usr/bin/env node` shebang on `dist/qmd.js`. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH.
- Removed MCP tools `search`, `vector_search`, `deep_search`, all superseded by `query`.
- Removed `qmd context check` command.
- CLI timing: each LLM step (expand, embed, rerank) prints elapsed time inline (`Expanding query... (4.2s)`).
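The 2× fusion weight for the first sub-query can be pictured with weighted reciprocal rank fusion (RRF). The sketch below uses the textbook RRF formula, where each list contributes weight / (k + rank). The constant k = 60, the tie-breaking rule, and the function name are assumptions; QMD's fusion also applies other adjustments that are not modeled here.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, k=60, first_weight=2.0):
    """Fuse ranked lists of document ids; the first list gets extra weight."""
    scores = defaultdict(float)
    for index, ranking in enumerate(ranked_lists):
        weight = first_weight if index == 0 else 1.0
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += weight / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))
```

Swapping the order of two sub-queries changes which list's top hit wins a close contest, which is why the changelog advises putting the strongest signal first.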
Fixes
- `qmd collection list` shows `[excluded]` tag for collections with `includeByDefault: false`.
- Default searches now respect `includeByDefault`; excluded collections are skipped unless explicitly named.
- Fix main module detection when installed globally via npm/bun (symlink resolution).
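The `includeByDefault` rule amounts to a small selection step before a search runs. A minimal sketch, assuming collection configs are plain dicts keyed by name; the function name is hypothetical, and silently dropping unknown names is an assumption that QMD may handle differently.

```python
def collections_to_search(all_collections, requested=None):
    """Choose which collections a search should cover.

    all_collections: dict of name -> config dict (may set includeByDefault).
    requested: explicit list of names, or None/empty for a default search.
    """
    if requested:
        # Explicitly named collections are searched even if excluded by default.
        return [name for name in requested if name in all_collections]
    return [
        name
        for name, config in all_collections.items()
        if config.get("includeByDefault", True)
    ]
```

A collection marked `includeByDefault: false` therefore never appears in a default search but is still reachable by naming it.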
v1.1.5
[1.1.5] - 2026-03-07
Ambiguous queries like "performance" now produce dramatically better results
when the caller knows what they mean. The new intent parameter steers all
five pipeline stages — expansion, strong-signal bypass, chunk selection,
reranking, and snippet extraction — without searching on its own. Design and
original implementation by Ilya Grigorik (@vyalamar) in #180.
Changes
- Intent parameter: optional
intentstring disambiguates queries across
the entire search pipeline. Available via CLI (--intentflag orintent:
line in query documents), MCP (intentfield on the query tool), and
programmatic API. Adapted from PR #180 (thanks @vyalamar). - Query expansion: when intent is provided, the expansion LLM prompt
includesQuery intent: {intent}, matching the finetune training data
format for better-aligned expansions. - Reranking: intent is prepended to the rerank query so Qwen3-Reranker
scores with domain context. - Chunk selection: intent terms scored at 0.5× weight alongside query
terms (1.0×) when selecting the best chunk per document for reranking. - Snippet extraction: intent terms scored at 0.3× weight to nudge
snippets toward intent-relevant lines without overriding query anchoring. - Strong-signal bypass disabled with intent: when intent is provided, the
BM25 strong-signal shortcut is skipped — the obvious keyword match may not
be what the caller wants. - MCP instructions: callers are now guided to provide
intenton every
search call for disambiguation. - Query document syntax:
intent:recognized as a line type. At most one
per document, cannot appear alone. Grammar updated indocs/SYNTAX.md.
[1.1.2] - 2026-03-07
13 community PRs merged. GPU initialization replaced with node-llama-cpp's
built-in autoAttempt — deleting ~220 lines of manual fallback code and
fixing GPU issues reported across 10+ PRs in one shot. Reranking is faster
through chunk deduplication and a parallelism cap that prevents VRAM
exhaustion.
Changes
- GPU init: use node-llama-cpp's
build: "autoAttempt"instead of manual
GPU backend detection. Automatically tries Metal/CUDA/Vulkan and falls back
gracefully. #310 (thanks @giladgd — the node-llama-cpp author) - Query
--explain:qmd query --explainexposes retrieval score traces
— backend scores, per-list RRF contributions, top-rank bonus, reranker
score, and final blended score. Works in JSON and CLI output. #242
(thanks @vyalamar) - Collection ignore patterns:
ignore: ["Sessions/**", "*.tmp"]in
collection config to exclude files from indexing. #304 (thanks @sebkouba) - Multilingual embeddings:
QMD_EMBED_MODELenv var lets you swap in
models like Qwen3-Embedding for non-English collections. #273 (thanks
@daocoding) - Configurable expansion context:
QMD_EXPAND_CONTEXT_SIZEenv var
(default 2048) — previously used the model's full 40960-token window,
wasting VRAM. #313 (thanks @0xble) candidateLimitexposed:-C/--candidate-limitflag and MCP
parameter to tune how many candidates reach the reranker. #255 (thanks
@pandysp)- MCP multi-session: HTTP transport now supports multiple concurrent
client sessions, each with its own server instance. #286 (thanks @joelev)
Fixes
- Reranking performance: cap parallel rerank contexts at 4 to prevent
VRAM exhaustion on high-core machines. Deduplicate identical chunk texts
before reranking — same content from different files now shares a single
reranker call. Cache scores by content hash instead of file path. - Deactivate stale docs when all files are removed from a collection and
qmd updateis run. #312 (thanks @0xble) - Handle emoji-only filenames (
🐘.md→1f418.md) instead of crashing.
#308 (thanks @debugerman) - Skip unreadable files during indexing (e.g. iCloud-evicted files returning
EAGAIN) instead of crashing. #253 (thanks @jimmynail) - Suppress progress bar escape sequences when stderr is not a TTY. #230
(thanks @dgilperez) - Emit format-appropriate empty output (
[]for JSON, CSV header for CSV,
etc.) instead of plain text "No results." #228 (thanks @amsminn) - Correct Windows sqlite-vec package name (
sqlite-vec-windows-x64) and add
sqlite-vec-linux-arm64. #225 (thanks @ilepn) - Fix claude plugin setup CLI commands in README. #311 (thanks @gi11es)
[1.1.1] - 2026-03-06
Fixes
- Reranker: truncate documents exceeding the 2048-token context window
instead of silently producing garbage scores. Long chunks (e.g. from
PDF ingestion) now get a fair ranking. - Nix: add python3 and cctools to build dependencies. #214 (thanks
@pcasaretto)
[1.1.0] - 2026-02-20
QMD now speaks in query documents — structured multi-line queries where every line is typed (lex:, vec:, hyde:), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit expand: and auto-expanded by the LLM). Lex now supports quoted phrases and negation ("C++ performance" -sports -athlete), making intent-aware disambiguation practical. The formal query grammar is documented in docs/SYNTAX.md.
The npm package now uses the standard #!/usr/bin/env node bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
Changes
- Query document format: multi-line queries with typed sub-queries (
lex:,vec:,hyde:). Plain queries remain the default (expand:implicit, but not written inside the document). First sub-query gets 2× fusion weight — put your strongest signal first. Formal grammar indocs/SYNTAX.md. - Lex syntax: full BM25 operator support.
"exact phrase"for verbatim matching;-termand-"phrase"for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g.performance -sports -athlete). expand:shortcut: send a single plain query (or start the document withexpand:on its only line) to auto-expand via the local LLM. Query documents themselves are limited tolex,vec, andhydelines.- MCP
querytool (renamed fromstructured_search): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex. - HTTP
/queryendpoint (renamed from/search;/searchkept as silent alias). collectionsarray filter: filter by multiple collections in a single query (collections: ["notes", "brain"]). Removed the singlecollectionstring param — array only.- Collection
include/exclude:includeByDefault: falsehides a collection from all queries unless explicitly named viacollections. CLI:qmd collection exclude <name>/qmd collection include <name>. - Collection
update-cmd: attach a shell command that runs before everyqmd update(e.g.git stash && git pull --rebase --ff-only && git stash pop). CLI:qmd collection update-cmd <name> '<cmd>'. qmd statustips: shows actionable tips when collections lack context descriptions or update commands.qmd collectionsubcommands:show,update-cmd,include,exclude. Bareqmd collectionnow prints help.- Packaging: replaced custom bash wrapper with standard
#!/usr/bin/env nodeshebang ondist/qmd.js. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH. - Removed MCP tools
search,vector_search,deep_search— all superseded byquery. - Removed
qmd context checkcommand. - CLI timing: each LLM step (expand, embed, rerank) prints elapsed time inline (
Expanding query... (4.2s)).
Fixes
qmd collection listshows[excluded]tag for collections withincludeByDefault: false.- Default searches now respect
includeByDefault— excluded collections are skipped unless explicitly named. - Fix main module detection when installed globally via npm/bun (symlink resolution).
v1.1.2
[1.1.2] - 2026-03-07
13 community PRs merged. GPU initialization replaced with node-llama-cpp's
built-in autoAttempt — deleting ~220 lines of manual fallback code and
fixing GPU issues reported across 10+ PRs in one shot. Reranking is faster
through chunk deduplication and a parallelism cap that prevents VRAM
exhaustion.
Changes
- GPU init: use node-llama-cpp's
build: "autoAttempt"instead of manual
GPU backend detection. Automatically tries Metal/CUDA/Vulkan and falls back
gracefully. #310 (thanks @giladgd — the node-llama-cpp author) - Query
--explain:qmd query --explainexposes retrieval score traces
— backend scores, per-list RRF contributions, top-rank bonus, reranker
score, and final blended score. Works in JSON and CLI output. #242
(thanks @vyalamar) - Collection ignore patterns:
ignore: ["Sessions/**", "*.tmp"]in
collection config to exclude files from indexing. #304 (thanks @sebkouba) - Multilingual embeddings:
QMD_EMBED_MODELenv var lets you swap in
models like Qwen3-Embedding for non-English collections. #273 (thanks
@daocoding) - Configurable expansion context:
QMD_EXPAND_CONTEXT_SIZEenv var
(default 2048) — previously used the model's full 40960-token window,
wasting VRAM. #313 (thanks @0xble) candidateLimitexposed:-C/--candidate-limitflag and MCP
parameter to tune how many candidates reach the reranker. #255 (thanks
@pandysp)- MCP multi-session: HTTP transport now supports multiple concurrent
client sessions, each with its own server instance. #286 (thanks @joelev)
Fixes
- Reranking performance: cap parallel rerank contexts at 4 to prevent
VRAM exhaustion on high-core machines. Deduplicate identical chunk texts
before reranking — same content from different files now shares a single
reranker call. Cache scores by content hash instead of file path. - Deactivate stale docs when all files are removed from a collection and
qmd updateis run. #312 (thanks @0xble) - Handle emoji-only filenames (
🐘.md→1f418.md) instead of crashing.
#308 (thanks @debugerman) - Skip unreadable files during indexing (e.g. iCloud-evicted files returning
EAGAIN) instead of crashing. #253 (thanks @jimmynail) - Suppress progress bar escape sequences when stderr is not a TTY. #230
(thanks @dgilperez) - Emit format-appropriate empty output (
[]for JSON, CSV header for CSV,
etc.) instead of plain text "No results." #228 (thanks @amsminn) - Correct Windows sqlite-vec package name (
sqlite-vec-windows-x64) and add
sqlite-vec-linux-arm64. #225 (thanks @ilepn) - Fix claude plugin setup CLI commands in README. #311 (thanks @gi11es)
[1.1.1] - 2026-03-06
Fixes
- Reranker: truncate documents exceeding the 2048-token context window
instead of silently producing garbage scores. Long chunks (e.g. from
PDF ingestion) now get a fair ranking. - Nix: add python3 and cctools to build dependencies. #214 (thanks
@pcasaretto)
[1.1.0] - 2026-02-20
QMD now speaks in query documents — structured multi-line queries where every line is typed (lex:, vec:, hyde:), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit expand: and auto-expanded by the LLM). Lex now supports quoted phrases and negation ("C++ performance" -sports -athlete), making intent-aware disambiguation practical. The formal query grammar is documented in docs/SYNTAX.md.
The npm package now uses the standard #!/usr/bin/env node bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
Changes
- Query document format: multi-line queries with typed sub-queries (
lex:,vec:,hyde:). Plain queries remain the default (expand:implicit, but not written inside the document). First sub-query gets 2× fusion weight — put your strongest signal first. Formal grammar indocs/SYNTAX.md. - Lex syntax: full BM25 operator support.
"exact phrase"for verbatim matching;-termand-"phrase"for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g.performance -sports -athlete). expand:shortcut: send a single plain query (or start the document withexpand:on its only line) to auto-expand via the local LLM. Query documents themselves are limited tolex,vec, andhydelines.- MCP
querytool (renamed fromstructured_search): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex. - HTTP
/queryendpoint (renamed from/search;/searchkept as silent alias). collectionsarray filter: filter by multiple collections in a single query (collections: ["notes", "brain"]). Removed the singlecollectionstring param — array only.- Collection
include/exclude:includeByDefault: falsehides a collection from all queries unless explicitly named viacollections. CLI:qmd collection exclude <name>/qmd collection include <name>. - Collection
update-cmd: attach a shell command that runs before everyqmd update(e.g.git stash && git pull --rebase --ff-only && git stash pop). CLI:qmd collection update-cmd <name> '<cmd>'. qmd statustips: shows actionable tips when collections lack context descriptions or update commands.qmd collectionsubcommands:show,update-cmd,include,exclude. Bareqmd collectionnow prints help.- Packaging: replaced custom bash wrapper with standard
#!/usr/bin/env nodeshebang ondist/qmd.js. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH. - Removed MCP tools
search,vector_search,deep_search— all superseded byquery. - Removed
qmd context checkcommand. - CLI timing: each LLM step (expand, embed, rerank) prints elapsed time inline (
Expanding query... (4.2s)).
Fixes
qmd collection listshows[excluded]tag for collections withincludeByDefault: false.- Default searches now respect
includeByDefault— excluded collections are skipped unless explicitly named. - Fix main module detection when installed globally via npm/bun (symlink resolution).
v1.1.1
[1.1.1] - 2026-03-06
Fixes
- Reranker: truncate documents exceeding the 2048-token context window
instead of silently producing garbage scores. Long chunks (e.g. from
PDF ingestion) now get a fair ranking. - Nix: add python3 and cctools to build dependencies. #214 (thanks
@pcasaretto)
[1.1.0] - 2026-02-20
QMD now speaks in query documents — structured multi-line queries where every line is typed (lex:, vec:, hyde:), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit expand: and auto-expanded by the LLM). Lex now supports quoted phrases and negation ("C++ performance" -sports -athlete), making intent-aware disambiguation practical. The formal query grammar is documented in docs/SYNTAX.md.
The npm package now uses the standard #!/usr/bin/env node bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
Changes
- Query document format: multi-line queries with typed sub-queries (`lex:`, `vec:`, `hyde:`). Plain queries remain the default (`expand:` implicit, but not written inside the document). The first sub-query gets 2× fusion weight, so put your strongest signal first. Formal grammar in [docs/SYNTAX.md](https://github.com/tobi/qmd/blob/main/docs/SYNTAX.md).
- Lex syntax: full BM25 operator support. `"exact phrase"` for verbatim matching; `-term` and `-"phrase"` for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g. `performance -sports -athlete`).
- `expand:` shortcut: send a single plain query (or start the document with `expand:` on its only line) to auto-expand via the local LLM. Query documents themselves are limited to `lex`, `vec`, and `hyde` lines.
- MCP `query` tool (renamed from `structured_search`): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex.
- HTTP `/query` endpoint (renamed from `/search`; `/search` kept as silent alias).
- `collections` array filter: filter by multiple collections in a single query (`collections: ["notes", "brain"]`). Removed the single `collection` string param; array only.
- Collection include/exclude: `includeByDefault: false` hides a collection from all queries unless explicitly named via `collections`. CLI: `qmd collection exclude <name>` / `qmd collection include <name>`.
- Collection `update-cmd`: attach a shell command that runs before every `qmd update` (e.g. `git stash && git pull --rebase --ff-only && git stash pop`). CLI: `qmd collection update-cmd <name> '<cmd>'`.
- `qmd status` tips: shows actionable tips when collections lack context descriptions or update commands.
- `qmd collection` subcommands: `show`, `update-cmd`, `include`, `exclude`. Bare `qmd collection` now prints help.
- Packaging: replaced custom bash wrapper with standard `#!/usr/bin/env node` shebang on `dist/qmd.js`. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH.
- Removed MCP tools `search`, `vector_search`, `deep_search`; all superseded by `query`.
- Removed `qmd context check` command.
- CLI timing: each LLM step (expand, embed, rerank) prints elapsed time inline (`Expanding query... (4.2s)`).
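To make the query document format and lex operators above more concrete, here is a minimal sketch of how such a document could be split into typed sub-queries. It is an illustration only: the authoritative grammar is in docs/SYNTAX.md, and the names `parseQueryDocument`, `parseLexQuery`, `SubQuery`, and `LexTerms` are hypothetical, not part of qmd's API. Plain and `expand:` queries are not handled here.

```typescript
// Hypothetical types and function names; not qmd's actual implementation.
type SubQueryType = "lex" | "vec" | "hyde";

interface SubQuery {
  type: SubQueryType;
  text: string;
  weight: number; // 2 for the first sub-query, 1 for the rest (per the changelog)
}

interface LexTerms {
  terms: string[];    // bare words
  phrases: string[];  // "exact phrase"
  excluded: string[]; // -term and -"phrase"
}

// Split a multi-line query document into typed sub-queries.
function parseQueryDocument(doc: string): SubQuery[] {
  const subQueries: SubQuery[] = [];
  for (const rawLine of doc.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // assumption: blank lines are ignored
    const match = /^(lex|vec|hyde):\s*(.+)$/.exec(line);
    if (match === null) {
      throw new Error(`Not a typed sub-query line: ${line}`);
    }
    subQueries.push({
      type: match[1] as SubQueryType,
      text: match[2],
      weight: subQueries.length === 0 ? 2 : 1,
    });
  }
  return subQueries;
}

// Split a lex query into bare terms, quoted phrases, and exclusions.
function parseLexQuery(text: string): LexTerms {
  const result: LexTerms = { terms: [], phrases: [], excluded: [] };
  // Optional leading "-", then either a quoted phrase or a bare word.
  const tokenPattern = /(-?)(?:"([^"]+)"|(\S+))/g;
  let m: RegExpExecArray | null;
  while ((m = tokenPattern.exec(text)) !== null) {
    const negated = m[1] === "-";
    const phrase: string | undefined = m[2];
    const value = phrase ?? m[3];
    if (negated) result.excluded.push(value);
    else if (phrase !== undefined) result.phrases.push(value);
    else result.terms.push(value);
  }
  return result;
}
```

With this sketch, `parseLexQuery('performance -sports -athlete')` yields one term and two exclusions, matching the disambiguation example in the list above.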
Fixes
- `qmd collection list` shows `[excluded]` tag for collections with `includeByDefault: false`.
- Default searches now respect `includeByDefault`; excluded collections are skipped unless explicitly named.
- Fix main module detection when installed globally via npm/bun (symlink resolution).
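The interaction between `includeByDefault` and the `collections` filter can be sketched as a small resolution function. This is a sketch under assumptions: `CollectionConfig` and `resolveSearchCollections` are hypothetical names, an omitted `includeByDefault` is treated as `true`, and unknown names are silently dropped, none of which is confirmed by the changelog.

```typescript
// Hypothetical shape; qmd's real config type may differ.
interface CollectionConfig {
  name: string;
  includeByDefault?: boolean; // assumption: omitted means true
}

// Decide which collections a query should search.
function resolveSearchCollections(
  all: CollectionConfig[],
  requested?: string[],
): string[] {
  if (requested !== undefined && requested.length > 0) {
    // Explicitly named collections are searched even when excluded by default.
    const known = new Set(all.map((c) => c.name));
    return requested.filter((name) => known.has(name)); // assumption: drop unknown names
  }
  // No explicit filter: skip collections with includeByDefault: false.
  return all.filter((c) => c.includeByDefault !== false).map((c) => c.name);
}
```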
v1.0.7
[1.0.7] - 2026-02-18
Changes
- LLM: add LiquidAI LFM2-1.2B as an alternative base model for query
  expansion fine-tuning. LFM2's hybrid architecture (convolutions + attention)
  is 2x faster at decode/prefill vs standard transformers, a good fit for
  on-device inference.
- CLI: support multiple `-c` flags to search across several collections at
  once (e.g. `qmd search -c notes -c journals "query"`). #191 (thanks
  @openclaw)
Fixes
- Return empty JSON array `[]` instead of no output when `--json` search
  finds no results.
- Resolve relative paths passed to `--index` so they don't produce malformed
  config entries.
- Respect `XDG_CONFIG_HOME` for collection config path instead of always
  using `~/.config`. #190 (thanks @openclaw)
- CLI: empty-collection hint now shows the correct `collection add` command.
  #200 (thanks @vincentkoc)
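The `XDG_CONFIG_HOME` fix can be illustrated with a short lookup sketch. The function name `resolveConfigDir` and the `qmd` subdirectory are assumptions for illustration; ignoring a relative `XDG_CONFIG_HOME` follows the XDG Base Directory Specification rather than anything stated in the changelog.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper; the "qmd" subdirectory name is an assumption.
function resolveConfigDir(
  env: Record<string, string | undefined> = process.env,
  homeDir: string = os.homedir(),
): string {
  const xdg = env.XDG_CONFIG_HOME;
  // Use XDG_CONFIG_HOME only when it is set to an absolute path;
  // otherwise fall back to ~/.config.
  const base =
    xdg !== undefined && path.isAbsolute(xdg)
      ? xdg
      : path.join(homeDir, ".config");
  return path.join(base, "qmd");
}
```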
[1.0.6] - 2026-02-16
Changes
- CLI: `qmd status` now shows models with full HuggingFace links instead of
  static names in `--help`. Model info is derived from the actual configured
  URIs so it stays accurate if models change.
- Release tooling: pre-push hook handles non-interactive shells (CI, editors)
  gracefully; warnings auto-proceed instead of hanging on a tty prompt.
  Annotated tags now resolve correctly for CI checks.
[1.0.5] - 2026-02-16
The npm package now ships compiled JavaScript instead of raw TypeScript,
removing the tsx runtime dependency. A new /release skill automates the
full release workflow with changelog validation and git hook enforcement.
Changes
- Build: compile TypeScript to `dist/` via `tsc` so the npm package no longer
  requires `tsx` at runtime. The `qmd` shell wrapper now runs `dist/qmd.js`
  directly.
- Release tooling: new `/release` skill that manages the full release
  lifecycle: validates changelog, installs git hooks, previews release notes,
  and cuts the release. Auto-populates `[Unreleased]` from git history when
  empty.
- Release tooling: `scripts/extract-changelog.sh` extracts cumulative notes
  for the full minor series (e.g. 1.0.0 through 1.0.5) for GitHub releases.
  Includes `[Unreleased]` content in previews.
- Release tooling: `scripts/release.sh` renames `[Unreleased]` to a versioned
  heading and inserts a fresh empty `[Unreleased]` section automatically.
- Release tooling: pre-push git hook blocks `v*` tag pushes unless
  `package.json` version matches the tag, a changelog entry exists, and CI
  passed on GitHub.
- Publish workflow: GitHub Actions now builds TypeScript, creates a GitHub
  release with cumulative notes extracted from the changelog, and publishes
  to npm with provenance.
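The pre-push gate described above boils down to a few checks on the tag being pushed. The sketch below covers the two that can be done locally (version match and changelog entry) and leaves out the GitHub CI check. The real hook is a git hook script; `validateTagPush` is a hypothetical name, and the `[x.y.z]` heading match is an assumption based on the headings used in this changelog.

```typescript
// Hypothetical helper; returns a list of problems, empty when the push may proceed.
function validateTagPush(
  tag: string,
  packageVersion: string,
  changelog: string,
): string[] {
  if (!tag.startsWith("v")) return []; // only v* tags are gated
  const version = tag.slice(1);
  const problems: string[] = [];

  if (packageVersion !== version) {
    problems.push(`package.json version ${packageVersion} does not match tag ${tag}`);
  }

  // Look for a heading such as "## [1.0.5] - 2026-02-16".
  const hasEntry = changelog
    .split("\n")
    .some((line) => line.replace(/^#+\s*/, "").startsWith(`[${version}]`));
  if (!hasEntry) {
    problems.push(`no changelog entry for ${version}`);
  }

  return problems;
}
```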
[1.0.0] - 2026-02-15
QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking
through parallel GPU contexts. GPU auto-detection replaces the unreliable
gpu: "auto" with explicit CUDA/Metal/Vulkan probing.
Changes
- Runtime: support Node.js (>=22) alongside Bun via a cross-runtime SQLite
  abstraction layer (`src/db.ts`). `bun:sqlite` on Bun, `better-sqlite3` on
  Node. The `qmd` wrapper auto-detects a suitable Node.js install via PATH,
  then falls back to mise, asdf, nvm, and Homebrew locations.
- Performance: parallel embedding & reranking via multiple LlamaContext
  instances, up to 2.7x faster on multi-core machines.
- Performance: flash attention for ~20% less VRAM per reranking context,
  enabling more parallel contexts on GPU.
- Performance: right-sized reranker context (40960 → 2048 tokens, 17x less
  memory) since chunks are capped at ~900 tokens.
- Performance: adaptive parallelism: context count computed from available
  VRAM (GPU) or CPU math cores rather than hardcoded.
- GPU: probe for CUDA, Metal, Vulkan explicitly at startup instead of
  relying on node-llama-cpp's `gpu: "auto"`. `qmd status` shows device info.
- Tests: reorganized into flat `test/` directory with vitest for Node.js and
  bun test for Bun. New `eval-bm25` and `store.helpers.unit` suites.
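The adaptive parallelism item can be pictured as a small sizing function. This is a sketch under assumptions, not qmd's actual heuristic: `computeContextCount`, the per-context memory estimate, the upper cap, and the one-context-per-core rule for CPU are all hypothetical.

```typescript
// Hypothetical inputs; qmd's real heuristic and limits are not given in the changelog.
interface ParallelismInput {
  freeVramBytes?: number;  // undefined when running on CPU
  bytesPerContext: number; // estimated memory needed by one context
  cpuMathCores: number;    // cores available for compute
  maxContexts: number;     // upper bound on parallel contexts
}

// Pick a context count from available resources instead of a hardcoded value.
function computeContextCount(input: ParallelismInput): number {
  const { freeVramBytes, bytesPerContext, cpuMathCores, maxContexts } = input;
  const fromDevice =
    freeVramBytes !== undefined
      ? Math.floor(freeVramBytes / bytesPerContext) // GPU: how many contexts fit
      : cpuMathCores;                               // CPU: assume one per core
  // Always keep at least one context, never exceed the cap.
  return Math.max(1, Math.min(maxContexts, fromDevice));
}
```

Under this sketch, shrinking the per-context memory (as the flash attention and right-sized reranker items do) directly raises the number of contexts that fit in the same VRAM.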
Fixes
- Prevent VRAM waste from duplicate context creation during concurrent
  `embedBatch` calls; the initialization lock now covers the full path.
- Collection-aware FTS filtering so scoped keyword search actually restricts
  results to the requested collection.
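The duplicate-context fix is an instance of a common pattern: memoize the in-flight initialization promise so concurrent callers share one creation. The sketch below shows that general pattern, not qmd's code; `once` is a hypothetical helper, and resetting on failure so a later call can retry is an assumption.

```typescript
// Wrap an async factory so concurrent callers share a single in-flight call.
function once<T>(create: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      // Store the promise before it settles, so a second caller arriving
      // during initialization reuses it instead of creating a duplicate.
      pending = create().catch((err) => {
        pending = undefined; // assumption: allow a retry after a failed init
        throw err;
      });
    }
    return pending;
  };
}
```

The key detail is that the lock is taken before the first `await`: if the guard were set only after creation finished, two concurrent batch calls could both start creating a context.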