fix(embed): truncate oversized chunks to prevent context window crash #316

Open

debugerman wants to merge 1 commit into tobi:main from debugerman:fix/embed-oversized-chunk-crash

Conversation

@debugerman
Contributor

The embedding model (embeddinggemma-300M) has a 2048-token context window. Chunks exceeding this limit cause node-llama-cpp to crash with SIGABRT on Apple Silicon, or silently return null embeddings.

While the chunker targets 900 tokens, edge cases (dense code, base64, format prefixes) can produce chunks that exceed the context window. The reranker already had truncation logic; the embedding path did not.

Changes:

  • Add truncateForEmbedding() in LlamaCpp that tokenizes and truncates text exceeding the 2048-token context window (minus a 100-token overhead); sketched just after this list
  • Apply truncation in both embed() and embedBatch() before calling into node-llama-cpp, preventing SIGABRT and null results
  • Replace first-chunk dimension probing with a virtual probe text, decoupling dimension detection from user data
  • Add test/oversized-chunk.test.ts covering oversized single embed, mixed batch, and formatted chunks with titles
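
The first bullet's helper, as a minimal sketch. It assumes node-llama-cpp's model.tokenize()/model.detokenize() API, and the constants mirror the numbers above; the real diff may differ in details.

```ts
import type { LlamaModel } from "node-llama-cpp";

const CONTEXT_SIZE = 2048;  // embeddinggemma-300M context window
const TOKEN_OVERHEAD = 100; // headroom for special tokens and format prefixes

// Tokenize once, and detokenize only when the text is actually oversized,
// so normal (<=900-token) chunks pass through unchanged.
function truncateForEmbedding(model: LlamaModel, text: string): string {
  const maxTokens = CONTEXT_SIZE - TOKEN_OVERHEAD;
  const tokens = model.tokenize(text);
  if (tokens.length <= maxTokens) return text;
  return model.detokenize(tokens.slice(0, maxTokens));
}
```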

Normal chunks (<=900 tokens) are unaffected; truncation only activates on abnormally large inputs that would otherwise crash or be silently dropped.
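
Concretely, the guard sits in front of every call into node-llama-cpp, reusing truncateForEmbedding() from the sketch above. Another sketch under the same assumptions: getEmbeddingFor() and the .vector property are node-llama-cpp v3's embedding API, while the freestanding signatures here are illustrative stand-ins for the actual LlamaCpp methods.

```ts
import type { LlamaModel, LlamaEmbeddingContext } from "node-llama-cpp";

async function embed(
  model: LlamaModel,
  ctx: LlamaEmbeddingContext,
  text: string
): Promise<readonly number[]> {
  // Truncate before node-llama-cpp ever sees the text, so an oversized
  // chunk can no longer SIGABRT the process or come back null.
  const safe = truncateForEmbedding(model, text);
  return (await ctx.getEmbeddingFor(safe)).vector;
}

async function embedBatch(
  model: LlamaModel,
  ctx: LlamaEmbeddingContext,
  texts: string[]
): Promise<(readonly number[])[]> {
  // Apply the same guard per item, so one oversized chunk cannot
  // abort or poison the whole batch.
  return Promise.all(texts.map((t) => embed(model, ctx, t)));
}
```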

Fixes #303
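
The virtual probe from the third bullet, similarly sketched. The probe string and function name are assumptions; any short, fixed text of known-safe length works, which is the point of decoupling it from user data.

```ts
import type { LlamaEmbeddingContext } from "node-llama-cpp";

// Hypothetical constant: only the output vector's length matters,
// never the content, so user chunks are no longer involved.
const DIMENSION_PROBE_TEXT = "qmd dimension probe";

async function detectEmbeddingDimension(
  ctx: LlamaEmbeddingContext
): Promise<number> {
  const probe = await ctx.getEmbeddingFor(DIMENSION_PROBE_TEXT);
  return probe.vector.length;
}
```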

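And a rough shape for the new test/oversized-chunk.test.ts, assuming a Bun-style test runner and an illustrative import path; neither is confirmed in this thread, and the real exports presumably close over the model and embedding context internally.

```ts
import { test, expect } from "bun:test";
// Hypothetical import path for the embedding wrapper under test.
import { embed, embedBatch } from "../src/llamacpp";

test("oversized single embed returns a vector instead of crashing", async () => {
  const oversized = "token ".repeat(5000); // far beyond the 2048-token window
  const vec = await embed(oversized);
  expect(vec).not.toBeNull();
  expect(vec.length).toBeGreaterThan(0);
});

test("mixed batch embeds normal and oversized chunks alike", async () => {
  const vecs = await embedBatch(["a short, normal chunk", "token ".repeat(5000)]);
  expect(vecs).toHaveLength(2);
  for (const v of vecs) expect(v.length).toBeGreaterThan(0);
});
```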


Development

Successfully merging this pull request may close these issues.

qmd embed crashes when first chunk exceeds EmbeddingGemma context size (should skip gracefully)
