### Self-hosted RAG search engine. Production-ready in 3 minutes.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
- [![Version](https://img.shields.io/badge/version-1.5.3-blue.svg)](CHANGELOG.md)
+ [![Version](https://img.shields.io/badge/version-1.6.1-blue.svg)](CHANGELOG.md)
[![Python](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/)
[![Docker](https://img.shields.io/badge/docker-ready-brightgreen.svg)](https://hub.docker.com/r/flamehaven/filesearch)

---

- ## 🎯 Why FLAMEHAVEN?
+ ## 🎯 Why FLAMEHAVEN FileSearch?

- Stop sending your sensitive documents to third-party services. Get enterprise-grade semantic search running locally in minutes, not days.
+ Stop sending your sensitive documents to third-party services. FLAMEHAVEN FileSearch is a production-grade RAG search engine — BM25+hybrid retrieval, 34 file formats, multi-LLM (Gemini, OpenAI, Claude, Ollama) — running self-hosted in minutes, not days.

```bash
# One command. Three minutes. Done.
- docker run -d -p 8000:8000 -e GEMINI_API_KEY="your_key" flamehaven-filesearch:1.5.2
+ docker run -d -p 8000:8000 -e GEMINI_API_KEY="your_key" flamehaven-filesearch:1.6.1
```

<table>
@@ -57,9 +57,9 @@ Open source & MIT licensed</p>

| Capability | Detail |
|---|---|
- | **Search Modes** | Keyword, semantic, and hybrid with automatic typo correction |
+ | **Search Modes** | Keyword, semantic, and hybrid (BM25+RRF) with automatic typo correction |
| **34 File Formats** | PDF, DOCX/DOC, XLSX, PPTX, RTF, HTML, CSV, LaTeX, WebVTT, images + plain text — see [Document Parsing](docs/wiki/Document_Parsing.md) |
- | **RAG Pipeline** | Structure-aware chunking, sliding-window context enrichment, mtime parse cache |
+ | **RAG Pipeline** | Structure-aware chunking, KnowledgeAtom 2-level indexing, sliding-window context enrichment, mtime parse cache |
| **Ultra-Fast Vectors** | DSP v2.0 generates embeddings in <1ms — no ML frameworks required |
| **Source Attribution** | Every answer links back to the originating document and chunk |
| **Framework SDKs** | LangChain, LlamaIndex, Haystack, CrewAI adapters out of the box |
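
The RAG Pipeline row mentions sliding-window context enrichment without showing what that looks like. Below is a minimal sketch of the general idea, assuming each chunk is enriched by joining it with its immediate neighbours. `EnrichedChunk`, `enrich_with_window`, and the `window` parameter are hypothetical names for illustration and are not taken from the FLAMEHAVEN codebase.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnrichedChunk:
    """Hypothetical container: a chunk plus the text of its neighbours."""
    index: int
    text: str
    context: str


def enrich_with_window(chunks: List[str], window: int = 1) -> List[EnrichedChunk]:
    """Attach up to `window` neighbouring chunks on each side of every chunk."""
    enriched = []
    for i, text in enumerate(chunks):
        lo = max(0, i - window)                # clamp at the start of the document
        hi = min(len(chunks), i + window + 1)  # clamp at the end of the document
        enriched.append(EnrichedChunk(i, text, " ".join(chunks[lo:hi])))
    return enriched
```

A retriever could index `text` for matching and hand `context` to the LLM, so that an answer sees the sentences around the matching chunk. Whether the project does this is an assumption.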
@@ -83,7 +83,7 @@ docker run -d \
  -e GEMINI_API_KEY="your_gemini_api_key" \
  -e FLAMEHAVEN_ADMIN_KEY="secure_admin_password" \
  -v $(pwd)/data:/app/data \
-   flamehaven-filesearch:1.5.2
+   flamehaven-filesearch:1.6.1
```

✅ Server running at `http://localhost:8000`
@@ -167,7 +167,7 @@ pip install flamehaven-filesearch[all]
# Build from source
git clone https://github.com/flamehaven01/Flamehaven-Filesearch.git
cd Flamehaven-Filesearch
- docker build -t flamehaven-filesearch:1.5.2 .
+ docker build -t flamehaven-filesearch:1.6.1 .
```

### Framework Integrations
@@ -259,7 +259,7 @@ security:
</tr>
<tr>
<td>Test Suite</td>
- <td><code>443 tests</code></td>
+ <td><code>476 tests</code></td>
<td>All passing (pytest)</td>
</tr>
<tr>
@@ -299,8 +299,9 @@ flowchart TD
    subgraph Engine["Engine Layer"]
        FP["FileParser\n+ BackendRegistry\n(34 formats)"]
        Cache["ParseCache\n(mtime-based)"]
-         Chunker["TextChunker\n+ ContextExtractor"]
+         Chunker["TextChunker\n+ KnowledgeAtom\n(chunk atoms)"]
        DSP["DSP v2.0\nEmbedding Generator\n(<1ms, zero-ML)"]
+         BM25["BM25 + RRF\nHybrid Search\n(v1.6.0)"]
        Scorer["SemanticScorer\n+ TypoCorrector"]
    end

@@ -383,7 +384,14 @@ Full roadmap: [ROADMAP.md](ROADMAP.md)
- [x] Backend Plugin Architecture — `AbstractFormatBackend` + `BackendRegistry` (v1.5.2)
- [x] Parse cache — mtime-based, `extract_text(use_cache=True)` (v1.5.2)
- [x] ContextExtractor — sliding-window RAG chunk enrichment (v1.5.2)
- - [x] 443 tests; AI-Slop-Detector critical deficits: 0 (v1.5.2)
+ - [x] Multi-provider LLM support — OpenAI, Claude, Ollama, Gemini (v1.5.3)
+
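
The parse cache item above names an mtime-based cache behind `extract_text(use_cache=True)`. The sketch below shows one way such a cache could work, assuming entries are keyed by absolute path and invalidated when the file's modification time changes. The `MtimeParseCache` class, its `parser` callback, and the `misses` counter are hypothetical; only the `use_cache` flag comes from the README.

```python
import os
from typing import Callable, Dict, Tuple


class MtimeParseCache:
    """Hypothetical cache: reuse parsed text until the file's mtime changes."""

    def __init__(self, parser: Callable[[str], str]) -> None:
        self._parser = parser
        self._entries: Dict[str, Tuple[int, str]] = {}
        self.misses = 0  # number of times the parser actually ran

    def extract_text(self, path: str, use_cache: bool = True) -> str:
        key = os.path.abspath(path)
        mtime = os.stat(key).st_mtime_ns
        if use_cache:
            hit = self._entries.get(key)
            # A cached entry is valid only if the stored mtime still matches.
            if hit is not None and hit[0] == mtime:
                return hit[1]
        self.misses += 1
        text = self._parser(key)
        self._entries[key] = (mtime, text)
        return text
```

Comparing modification times is cheap but misses edits that preserve the timestamp; a content hash would be stricter at the cost of reading the file on every call.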
+ ### v1.6.0 (Completed)
+ - [x] BM25 + RRF hybrid search — Korean+English tokenizer, lazy per-store index
+ - [x] KnowledgeAtom 2-level indexing — chunk atoms with fragment URIs
+ - [x] Stable URI scheme — `local://<store>/<quote(abs_path)>`, collision-free
+ - [x] core.py mixin segmentation — 1258 → 221 lines, 3 focused modules
+ - [x] Fix: `search_stream` double intent-refine bug

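
The v1.6.0 list above pairs BM25 with RRF for hybrid search. Assuming RRF here means reciprocal rank fusion, the sketch below merges a keyword ranking and a semantic ranking by summing `1 / (k + rank)` for each document. The function name and the document IDs are hypothetical, and `k = 60` is the constant commonly used in the literature, not a value confirmed for this project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher positions contribute more; k damps the gap between ranks.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["a", "b", "c"]      # hypothetical keyword results
semantic_ranking = ["b", "c", "a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Because RRF uses only rank positions, it needs no score normalization between BM25 and vector similarity, which is a common reason to choose it for hybrid retrieval.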
### v2.0.0 (Q3 2026)
- [ ] Multi-language support (15+ languages) — multilingual stopwords + jieba
@@ -465,9 +473,10 @@ Use the links below to jump to the most relevant guide.
| Topic | Description |
|-------|-------------|
| [Document Parsing](docs/wiki/Document_Parsing.md) | Supported formats, internal parsers, RAG chunking |
+ | [Hybrid Search](docs/wiki/Hybrid_Search.md) | BM25+RRF, KnowledgeAtom indexing, stable URI scheme (v1.6.0) |
| [Framework Integrations](docs/wiki/Framework_Integrations.md) | LangChain, LlamaIndex, Haystack, CrewAI adapters |
| [API Reference](docs/wiki/API_Reference.md) | REST endpoints, payloads, rate limits |
- | [Architecture](docs/wiki/Architecture.md) | How all layers fit together (v1.5.2) |
+ | [Architecture](docs/wiki/Architecture.md) | How all layers fit together (v1.6.0) |
| [Configuration Reference](docs/wiki/Configuration.md) | Full list of environment variables and config fields |
| [Production Deployment](docs/wiki/Production_Deployment.md) | Docker, systemd, reverse proxy, scaling tips |
| [Troubleshooting](docs/wiki/Troubleshooting.md) | Step-by-step debugging playbook |
@@ -536,6 +545,6 @@ Built with amazing open source tools:

Built with 🔥 by the Flamehaven Core Team

- *Last updated: April 19, 2026 • Version 1.5.3*
+ *Last updated: April 19, 2026 • Version 1.6.1*

</div>