Releases: onamfc/rag-chat

07 Nov 21:48

Xantus v0.1.0 - Initial Release

I'm excited to announce the initial release of Xantus, a privacy-first RAG (Retrieval-Augmented Generation) chat system that lets you have conversations with your documents using AI.

What is Xantus?

Xantus is an open-source document chat system that allows you to upload documents and ask questions about them using large language models. Unlike cloud-only solutions, Xantus can run completely locally or use cloud providers - your choice.

Key Philosophy:

  • Privacy First - All data stays on your system with local AI
  • Extensible - MCP integration for external tools
  • Multiple UIs - Streamlit interface + OpenAI-compatible API
  • Multi-Provider - Supports Ollama, OpenAI, Anthropic, and more

Key Features

Document Processing

  • Multiple Format Support - Upload PDF, DOCX, TXT, and Markdown files
  • Semantic Search - RAG-powered retrieval with ChromaDB vector store
  • Smart Chunking - Configurable chunk size and overlap for optimal retrieval
  • Complete Deletion - Properly removes documents from vector store (no orphaned data!)

Chat & Retrieval

  • Interactive Chat - Natural conversation interface with context awareness
  • Source Citations - See exactly where answers came from with:
    • Document name and page number
    • Chunk index for precise location
    • Relevance score (0-100%)
    • Full text excerpts (expandable)
  • Configurable RAG - Adjust similarity_top_k, chunk size, and overlap
  • Chat History - Conversation persistence in the UI
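The citation fields listed above can be pictured as a simple data shape. This is a sketch only; the field names below are illustrative and not necessarily Xantus's actual response schema:

```python
from dataclasses import dataclass

@dataclass
class SourceCitation:
    """Illustrative shape of one RAG source citation (names are hypothetical)."""
    document_name: str      # e.g. "report.pdf"
    page_number: int        # page the chunk came from
    chunk_index: int        # position of the chunk within the document
    relevance_score: float  # similarity in [0.0, 1.0], displayed as 0-100%
    excerpt: str            # full text of the retrieved chunk (expandable in the UI)

citation = SourceCitation("report.pdf", 3, 12, 0.87, "…retrieved passage…")
print(f"{citation.document_name} p.{citation.page_number} ({citation.relevance_score:.0%})")
```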

AI Provider Support

  • Ollama - Completely local, privacy-first (Llama 3.2, Mistral, etc.)
  • OpenAI - GPT-4, GPT-3.5-turbo with streaming support
  • Anthropic - Claude Sonnet 4, Claude Haiku
  • Hybrid Mode - Cloud LLM + local embeddings for cost optimization

MCP (Model Context Protocol) Integration

  • External Tools - Calculator, file system, text processing, weather
  • Extensible - Add custom MCP servers easily
  • TypeScript Support - Integrated MCP server template
  • Built-in Tools:
    • Calculator - Perform mathematical operations
    • File System - Read/write/list files
    • Text Processing - Word count, sentiment analysis, case conversion
    • Weather - Weather data retrieval

User Interface

  • Streamlit UI - Clean, modern chat interface
  • Settings Panel - Toggle RAG context and source citations
  • Document Management - Upload, list, and delete documents
  • Real-time Updates - Live document list and chat history
  • Responsive - Works on desktop and tablet

API & Integration

  • RESTful API - OpenAI-compatible chat completions endpoint
  • FastAPI - High-performance async API server
  • CORS Support - Configurable cross-origin requests
  • Health Checks - Monitor system status
  • OpenAPI Docs - Auto-generated API documentation at /docs
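Because the chat endpoint is OpenAI-compatible, any OpenAI-style client can talk to it. A minimal sketch of building the request payload (the endpoint path and the `include_sources` parameter come from these notes; the rest follows the standard OpenAI chat completions shape, and the model name is just an example):

```python
import json

# Standard OpenAI-style chat completions payload for POST /v1/chat/completions.
payload = {
    "model": "llama3.2",  # any model your configured provider serves
    "messages": [
        {"role": "user", "content": "Summarize the uploaded contract."}
    ],
    "stream": False,          # source citations are only returned in non-streaming mode
    "include_sources": True,  # Xantus-specific flag (see Known Issues & Limitations)
}

body = json.dumps(payload)
print(body)
```

Disable `include_sources` (or omit it) when setting `"stream": True`, per the limitations below.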

Architecture & Quality

  • Dependency Injection - Clean, testable architecture using Injector
  • Factory Pattern - Easy to swap LLMs, embeddings, vector stores
  • Type Safety - Pydantic models throughout
  • Async Support - Non-blocking operations
  • Structured Logging - Clear visibility into system operations
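The factory pattern mentioned above boils down to registering provider constructors under names and letting configuration pick one at runtime. A minimal sketch (class and provider names are illustrative, not Xantus's actual API):

```python
from typing import Callable, Dict

class LLMFactory:
    """Toy provider factory: configuration selects a registered builder by name."""

    def __init__(self) -> None:
        self._builders: Dict[str, Callable[[], object]] = {}

    def register(self, name: str, builder: Callable[[], object]) -> None:
        self._builders[name] = builder

    def create(self, name: str) -> object:
        if name not in self._builders:
            raise ValueError(f"unknown LLM provider: {name}")
        return self._builders[name]()

factory = LLMFactory()
factory.register("ollama", lambda: "OllamaLLM()")   # stand-ins for real clients
factory.register("openai", lambda: "OpenAILLM()")
llm = factory.create("ollama")
```

Swapping the embedding model or vector store follows the same shape, which is what makes the components easy to replace.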

What's Included

Core Components

xantus/
├── xantus/                    # Main Python package
│   ├── api/                   # FastAPI endpoints
│   │   ├── chat_router.py     # Chat completions
│   │   ├── ingest_router.py   # Document upload/management
│   │   └── embeddings_router.py
│   ├── services/              # Business logic
│   │   ├── chat_service.py    # RAG chat with sources
│   │   ├── ingest_service.py  # Document processing
│   │   └── mcp_service.py     # MCP orchestration
│   ├── components/            # Component factories
│   │   ├── llm/               # LLM provider factory
│   │   ├── embeddings/        # Embedding factory
│   │   └── vector_store/      # Vector store factory
│   ├── models/                # Data models
│   └── config/                # Settings management
├── ui/                        # Streamlit interface
├── mcp-servers/               # MCP integration (submodule)
└── config.yaml                # Main configuration

Supported Configurations

Fully Local (Privacy-First):

  • Ollama for LLM (Llama 3.2, Mistral, etc.)
  • HuggingFace embeddings (BAAI/bge-small-en-v1.5)
  • ChromaDB vector store
  • Zero external API calls

Cloud-Powered:

  • Anthropic Claude Sonnet 4 / Haiku
  • OpenAI GPT-4 / GPT-3.5-turbo
  • OpenAI embeddings
  • ChromaDB local storage

Hybrid (Recommended):

  • Cloud LLM (better quality)
  • Local embeddings (lower cost)
  • ChromaDB local storage
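The hybrid setup might look like this in config.yaml. The keys and model identifiers below are illustrative, drawn from the configurations listed above; check the shipped config.yaml for the actual schema:

```yaml
llm:
  provider: anthropic           # cloud LLM for answer quality
  model: claude-sonnet-4
embeddings:
  provider: huggingface         # local embeddings to avoid per-token API costs
  model: BAAI/bge-small-en-v1.5
vector_store:
  provider: chromadb            # local persistent storage
  collection: documents
```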

Technical Details

Vector Store

  • Provider: ChromaDB with persistent storage
  • Features:
    • Proper document deletion (filters by file_name)
    • Metadata recovery on restart
    • Configurable collection names
  • Storage: Local SQLite-based storage
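The "proper deletion" behavior described above amounts to a metadata filter over stored chunks. A toy in-memory stand-in (not the actual ChromaDB call) showing the principle:

```python
# Each stored chunk carries metadata including its source file_name. Deleting
# a document means removing every chunk whose file_name matches, so no
# orphaned chunks keep answering queries for a removed file.
index = [
    ("c1", {"file_name": "a.pdf", "chunk_index": 0}),
    ("c2", {"file_name": "a.pdf", "chunk_index": 1}),
    ("c3", {"file_name": "b.pdf", "chunk_index": 0}),
]

def delete_document(store, file_name):
    """Keep only chunks whose metadata does not reference file_name."""
    return [(cid, meta) for cid, meta in store if meta["file_name"] != file_name]

index = delete_document(index, "a.pdf")
print([cid for cid, _ in index])  # ['c3']
```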

RAG Configuration

  • Default chunk size: 1024 characters
  • Default overlap: 200 characters
  • Default top_k: 5 similar chunks
  • Splitter: SentenceSplitter for semantic boundaries
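The defaults above interact simply: each chunk starts chunk_size − overlap characters after the previous one. A naive character-level sketch of that relationship (the actual SentenceSplitter additionally respects sentence boundaries, so real chunk sizes vary):

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200):
    """Naive character chunker: each chunk overlaps the previous by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(3000))
chunks = chunk_text(text)
print(len(chunks), len(chunks[0]))  # 4 1024
```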

API Endpoints

Chat:

  • POST /v1/chat/completions - Chat with RAG context
  • POST /v1/chunks/retrieve - Retrieve similar chunks

Documents:

  • POST /v1/ingest/file - Upload document
  • GET /v1/documents - List all documents
  • DELETE /v1/documents/{id} - Delete document

System:

  • GET /health - Health check
  • GET /docs - OpenAPI documentation

Known Issues & Limitations

Current Limitations

  1. Streaming with Sources - Source citations only available in non-streaming mode
  2. Single Collection - All documents in one ChromaDB collection
  3. No User Authentication - API is open (add middleware for production)
  4. In-Memory Metadata - Document metadata not persisted to database (recovered from files)
  5. No Multi-tenancy - Single-user design (can be extended)

Workarounds

  • For streaming, disable the include_sources parameter
  • For production, add FastAPI authentication middleware
  • For multiple users, extend with database-backed metadata storage

What's Next (Future Releases)

Planned features for upcoming versions:

v0.2.0 (Planned)

  • Streaming support with sources
  • Persistent metadata storage (PostgreSQL/SQLite)
  • Multiple vector store collections
  • Reranking support for better retrieval
  • Document update/versioning

v0.3.0 (Planned)

  • User authentication & authorization
  • Multi-tenancy support
  • Document folder organization
  • Advanced search filters
  • Conversation management (save/load)

Future Considerations

  • Qdrant vector store support
  • Additional embedding providers
  • Chat export functionality
  • Document preprocessing pipeline
  • OCR support for scanned PDFs
  • Image/table extraction

Metrics

  • Lines of Code: ~2,500 (excluding MCP server)
  • Dependencies: 15 core packages
  • Supported File Types: 4 (PDF, DOCX, TXT, MD)
  • LLM Providers: 3 (Ollama, OpenAI, Anthropic)
  • MCP Tools: 4 built-in tools

Acknowledgments

Built with:

License

Xantus is released under the MIT License. See LICENSE for details.

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Ways to contribute:

  • Report bugs
  • Suggest features
  • Improve documentation
  • Submit pull requests
  • Star the repo!

Support