Releases: onamfc/rag-chat
Xantus v0.1.0 - Initial Release
I'm excited to announce the initial release of Xantus, a privacy-first RAG (Retrieval-Augmented Generation) chat system that lets you have conversations with your documents using AI.
What is Xantus?
Xantus is an open-source document chat system that allows you to upload documents and ask questions about them using large language models. Unlike cloud-only solutions, Xantus can run completely locally or use cloud providers - your choice.
Key Philosophy:
- Privacy First - All data stays on your system with local AI
- Extensible - MCP integration for external tools
- Multiple UIs - Streamlit interface + OpenAI-compatible API
- Multi-Provider - Supports Ollama, OpenAI, Anthropic, and more
Key Features
Document Processing
- Multiple Format Support - Upload PDF, DOCX, TXT, and Markdown files
- Semantic Search - RAG-powered retrieval with ChromaDB vector store
- Smart Chunking - Configurable chunk size and overlap for optimal retrieval
- Complete Deletion - Properly removes documents from vector store (no orphaned data!)
Chat & Retrieval
- Interactive Chat - Natural conversation interface with context awareness
- Source Citations - See exactly where answers came from with:
  - Document name and page number
  - Chunk index for precise location
  - Relevance score (0-100%)
  - Full text excerpts (expandable)
- Configurable RAG - Adjust similarity_top_k, chunk size, and overlap
- Chat History - Conversation persistence in the UI
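The relevance score attached to each citation is a 0-100% rendering of the vector-similarity result. A minimal sketch of such a conversion (illustrative only; the exact formula Xantus uses may differ):

```python
def relevance_percent(similarity: float) -> int:
    """Map a cosine similarity in [-1, 1] to a 0-100% relevance score.

    Illustrative conversion only -- Xantus's actual scoring may differ.
    """
    clamped = max(-1.0, min(1.0, similarity))
    return round((clamped + 1.0) / 2.0 * 100)
```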
AI Provider Support
- Ollama - Completely local, privacy-first (Llama 3.2, Mistral, etc.)
- OpenAI - GPT-4, GPT-3.5-turbo with streaming support
- Anthropic - Claude Sonnet 4, Claude Haiku
- Hybrid Mode - Cloud LLM + local embeddings for cost optimization
MCP (Model Context Protocol) Integration
- External Tools - Calculator, file system, text processing, weather
- Extensible - Add custom MCP servers easily
- TypeScript Support - Integrated MCP server template
- Built-in Tools:
  - Calculator - Perform mathematical operations
  - File System - Read/write/list files
  - Text Processing - Word count, sentiment analysis, case conversion
  - Weather - Weather data retrieval
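To give a feel for what a built-in tool does, here is a hypothetical sketch of a calculator tool handler. The actual server ships in `mcp-servers/` (TypeScript), so the name `calculate` and this signature are assumptions for illustration:

```python
import ast
import operator

# Hypothetical sketch of a calculator tool handler; the shipped MCP
# server is TypeScript and its API may differ.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def calculate(expression: str) -> float:
    """Safely evaluate basic arithmetic by walking the AST (no eval)."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression}")
    return _eval(ast.parse(expression, mode="eval"))
```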
User Interface
- Streamlit UI - Clean, modern chat interface
- Settings Panel - Toggle RAG context and source citations
- Document Management - Upload, list, and delete documents
- Real-time Updates - Live document list and chat history
- Responsive - Works on desktop and tablet
API & Integration
- RESTful API - OpenAI-compatible chat completions endpoint
- FastAPI - High-performance async API server
- CORS Support - Configurable cross-origin requests
- Health Checks - Monitor system status
- OpenAPI Docs - Auto-generated API documentation at `/docs`
Architecture & Quality
- Dependency Injection - Clean, testable architecture using Injector
- Factory Pattern - Easy to swap LLMs, embeddings, vector stores
- Type Safety - Pydantic models throughout
- Async Support - Non-blocking operations
- Structured Logging - Clear visibility into system operations
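The factory pattern mentioned above makes providers swappable behind a single construction point. A minimal sketch of that idea (class and registry names here are hypothetical; the real factories live in `xantus/components/`):

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical sketch of the provider-factory pattern; Xantus's real
# factories in xantus/components/ may be structured differently.

@dataclass
class LLMConfig:
    provider: str
    model: str

class OllamaLLM:
    def __init__(self, model: str):
        self.model = model

class OpenAILLM:
    def __init__(self, model: str):
        self.model = model

_REGISTRY: Dict[str, Callable[[str], object]] = {
    "ollama": OllamaLLM,
    "openai": OpenAILLM,
}

def create_llm(config: LLMConfig):
    """Instantiate the LLM client for the configured provider."""
    try:
        return _REGISTRY[config.provider](config.model)
    except KeyError:
        raise ValueError(f"Unknown provider: {config.provider}") from None
```

Swapping providers then means changing one config value, not any call sites.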
What's Included
Core Components
```
xantus/
├── xantus/                    # Main Python package
│   ├── api/                   # FastAPI endpoints
│   │   ├── chat_router.py     # Chat completions
│   │   ├── ingest_router.py   # Document upload/management
│   │   └── embeddings_router.py
│   ├── services/              # Business logic
│   │   ├── chat_service.py    # RAG chat with sources
│   │   ├── ingest_service.py  # Document processing
│   │   └── mcp_service.py     # MCP orchestration
│   ├── components/            # Component factories
│   │   ├── llm/               # LLM provider factory
│   │   ├── embeddings/        # Embedding factory
│   │   └── vector_store/      # Vector store factory
│   ├── models/                # Data models
│   └── config/                # Settings management
├── ui/                        # Streamlit interface
├── mcp-servers/               # MCP integration (submodule)
└── config.yaml                # Main configuration
```
Supported Configurations
Fully Local (Privacy-First):
- Ollama for LLM (Llama 3.2, Mistral, etc.)
- HuggingFace embeddings (BAAI/bge-small-en-v1.5)
- ChromaDB vector store
- Zero external API calls
Cloud-Powered:
- Anthropic Claude Sonnet 4 / Haiku
- OpenAI GPT-4 / GPT-3.5-turbo
- OpenAI embeddings
- ChromaDB local storage
Hybrid (Recommended):
- Cloud LLM (better quality)
- Local embeddings (lower cost)
- ChromaDB local storage
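For orientation, a hybrid setup might look like the following `config.yaml` sketch (key names and layout are assumptions; check the shipped `config.yaml` for the actual schema):

```yaml
# Hypothetical hybrid configuration sketch -- key names may differ
# from the actual schema in config.yaml.
llm:
  provider: anthropic
  model: claude-sonnet-4
embeddings:
  provider: huggingface
  model: BAAI/bge-small-en-v1.5
vector_store:
  provider: chromadb
  persist_dir: ./storage/chroma
rag:
  chunk_size: 1024
  chunk_overlap: 200
  similarity_top_k: 5
```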
Technical Details
Vector Store
- Provider: ChromaDB with persistent storage
- Features:
  - Proper document deletion (filters by `file_name`)
  - Metadata recovery on restart
  - Configurable collection names
- Storage: Local SQLite-based storage
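Deletion by `file_name` means every chunk carrying that metadata value is removed, so no orphaned vectors remain. A minimal in-memory stand-in for that behavior (with ChromaDB itself this maps to a delete with a metadata `where` filter; exact call shape may vary by version):

```python
# In-memory stand-in for metadata-filtered deletion; Xantus performs
# the equivalent against the ChromaDB collection.

def delete_document(store: dict, file_name: str) -> int:
    """Remove every chunk whose metadata matches file_name; return count."""
    doomed = [cid for cid, rec in store.items()
              if rec["metadata"].get("file_name") == file_name]
    for cid in doomed:
        del store[cid]
    return len(doomed)
```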
RAG Configuration
- Default chunk size: 1024 characters
- Default overlap: 200 characters
- Default top_k: 5 similar chunks
- Splitter: SentenceSplitter for semantic boundaries
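To make the size/overlap mechanics concrete, here is an illustrative character-based chunker using the defaults above. (Xantus uses LlamaIndex's SentenceSplitter, which additionally respects sentence boundaries; this sketch shows only how size and overlap interact.)

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200):
    """Split text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one.

    Illustrative only -- Xantus's SentenceSplitter also keeps chunks
    aligned to sentence boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The overlap keeps context that straddles a chunk boundary retrievable from either side.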
API Endpoints
Chat:
- `POST /v1/chat/completions` - Chat with RAG context
- `POST /v1/chunks/retrieve` - Retrieve similar chunks
Documents:
- `POST /v1/ingest/file` - Upload document
- `GET /v1/documents` - List all documents
- `DELETE /v1/documents/{id}` - Delete document
System:
- `GET /health` - Health check
- `GET /docs` - OpenAPI documentation
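Because the chat endpoint is OpenAI-compatible, a request body looks like a standard chat-completions payload plus Xantus's source-citation switch. A sketch of building that body (the `model` value and the exact placement of `include_sources` are assumptions):

```python
import json

def build_chat_request(question: str, include_sources: bool = True) -> str:
    """Build a JSON body for POST /v1/chat/completions.

    OpenAI-compatible shape; "model" value and the include_sources
    field placement are assumptions for illustration.
    """
    return json.dumps({
        "model": "default",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
        "include_sources": include_sources,
    })
```

POST the resulting body to your running server (e.g. `http://localhost:8000/v1/chat/completions`; host and port depend on your deployment).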
Known Issues & Limitations
Current Limitations
- Streaming with Sources - Source citations only available in non-streaming mode
- Single Collection - All documents in one ChromaDB collection
- No User Authentication - API is open (add middleware for production)
- In-Memory Metadata - Document metadata not persisted to database (recovered from files)
- No Multi-tenancy - Single-user design (can be extended)
Workarounds
- For streaming, disable the `include_sources` parameter
- For production, add FastAPI authentication middleware
- For multiple users, extend with database-backed metadata storage
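The core of such an authentication middleware is a bearer-token check on the `Authorization` header. A minimal sketch of that check as a pure function (wiring it into FastAPI middleware is not shown; the function name is hypothetical):

```python
import hmac
from typing import Optional

def is_authorized(auth_header: Optional[str], api_key: str) -> bool:
    """Validate an "Authorization: Bearer <token>" header against a key.

    Sketch of the check an auth middleware could perform; uses a
    constant-time comparison to avoid timing leaks.
    """
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    token = auth_header[len("Bearer "):]
    return hmac.compare_digest(token, api_key)
```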
What's Next (Future Releases)
Planned features for upcoming versions:
v0.2.0 (Planned)
- Streaming support with sources
- Persistent metadata storage (PostgreSQL/SQLite)
- Multiple vector store collections
- Reranking support for better retrieval
- Document update/versioning
v0.3.0 (Planned)
- User authentication & authorization
- Multi-tenancy support
- Document folder organization
- Advanced search filters
- Conversation management (save/load)
Future Considerations
- Qdrant vector store support
- Additional embedding providers
- Chat export functionality
- Document preprocessing pipeline
- OCR support for scanned PDFs
- Image/table extraction
Metrics
Lines of Code: ~2,500 (excluding MCP server)
Dependencies: 15 core packages
Supported File Types: 4 (PDF, DOCX, TXT, MD)
LLM Providers: 3 (Ollama, OpenAI, Anthropic)
MCP Tools: 4 built-in tools
Acknowledgments
Built with:
- FastAPI - Modern async web framework
- LlamaIndex - RAG framework
- ChromaDB - Vector database
- Streamlit - UI framework
- Model Context Protocol - Tool integration
License
Xantus is released under the MIT License. See LICENSE for details.
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Report bugs
- Suggest features
- Improve documentation
- Submit pull requests
- Star the repo!
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See README.md and docs/