A simple yet powerful RAG (Retrieval Augmented Generation) system built with Go, SQLite, and Ollama. Lil-RAG provides both CLI and HTTP API interfaces for indexing documents and performing semantic similarity searches with compression and deduplication.
- Semantic Vector Search - Advanced similarity search using SQLite with sqlite-vec
- Document Deduplication - Intelligent result deduplication for multi-chunk documents
- Automatic Compression - Transparent gzip compression for optimal storage
- PDF Support - Native PDF parsing with page-based chunking
- Dual Interface - Both CLI and HTTP API for maximum flexibility
- Ollama Integration - Configurable embedding models via Ollama
- High Performance - Optimized Go implementation with efficient SQLite storage
- Profile Configuration - User-friendly configuration management
- File Handling - Support for text files, PDFs, and stdin input
- Complete Documents - Returns full document content, not just chunks
- Go 1.21+ with CGO support
- Ollama with an embedding model installed
- SQLite with sqlite-vec extension support
- Install Go: Download from golang.org
- Install Ollama: Follow instructions at ollama.ai
  # Start Ollama
  ollama serve
  # Pull an embedding model
  ollama pull nomic-embed-text
- SQLite-vec Extension: The Go bindings handle this automatically via CGO
# Clone the repository
git clone https://github.com/your-username/lil-rag.git
cd lil-rag
# Build both CLI and server
make build
# Or build individually
make build-cli # builds bin/lil-rag
make build-server # builds bin/lil-rag-server
# Install to $GOPATH/bin (optional)
make install

# Install CLI directly
go install github.com/your-username/lil-rag/cmd/lil-rag@latest
# Install server directly
go install github.com/your-username/lil-rag/cmd/lil-rag-server@latest

# Start Ollama (in a separate terminal)
ollama serve
# Pull an embedding model
ollama pull nomic-embed-text

# Initialize user profile configuration
./bin/lil-rag config init
# View current settings
./bin/lil-rag config show

# Index direct text
./bin/lil-rag index doc1 "This is about machine learning and neural networks."
# Index from a file
./bin/lil-rag index doc2 document.txt
# Index a PDF file
./bin/lil-rag index doc3 research_paper.pdf
# Index from stdin
echo "Content about artificial intelligence" | ./bin/lil-rag index doc4 -

# Search with default limit (10)
./bin/lil-rag search "machine learning"
# Search with custom limit
./bin/lil-rag search "neural networks" 3
# Get full document content (limit=1 shows complete documents)
./bin/lil-rag search "AI concepts" 1

Example Output:
Found 2 results:
1. ID: doc1 [Best match: Chunk 1] (Score: 0.8542)
This is about machine learning and neural networks. Neural networks are...
[complete document content shown]
2. ID: doc3 [Best match: Page 1] (Score: 0.7891)
Research Paper: Deep Learning Fundamentals...
[complete document content shown]
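
The output above lists one entry per document together with its best-matching chunk. A minimal sketch of how such per-document deduplication could work is shown here. The `ChunkHit` type and `dedupeByDocument` function are hypothetical names chosen for illustration, not taken from the lil-rag source, and the sketch assumes that a higher score means a closer match.

```go
package main

import (
	"fmt"
	"sort"
)

// ChunkHit is a hypothetical record for one matching chunk.
// It is an illustration only, not a type from lil-rag.
type ChunkHit struct {
	DocID string
	Chunk int
	Score float64 // assumption: higher means more similar
}

// dedupeByDocument keeps the highest-scoring chunk for each document
// and returns the survivors ordered by descending score.
func dedupeByDocument(hits []ChunkHit) []ChunkHit {
	best := make(map[string]ChunkHit)
	for _, h := range hits {
		if cur, ok := best[h.DocID]; !ok || h.Score > cur.Score {
			best[h.DocID] = h
		}
	}

	out := make([]ChunkHit, 0, len(best))
	for _, h := range best {
		out = append(out, h)
	}
	// Sort by score, breaking ties by document ID for a stable order.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].DocID < out[j].DocID
	})
	return out
}

func main() {
	hits := []ChunkHit{
		{DocID: "doc1", Chunk: 1, Score: 0.8542},
		{DocID: "doc1", Chunk: 2, Score: 0.6100},
		{DocID: "doc3", Chunk: 1, Score: 0.7891},
	}
	for i, h := range dedupeByDocument(hits) {
		fmt.Printf("%d. ID: %s [Best match: Chunk %d] (Score: %.4f)\n",
			i+1, h.DocID, h.Chunk, h.Score)
	}
}
```

Keeping only the best chunk per document means a long document with many moderately relevant chunks cannot crowd shorter documents out of the result list.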
- index <id> [text|file|-] - Index content with a unique ID
- search <query> [limit] - Search for similar content
- config <init|show> [path] - Manage configuration
# Index examples
lil-rag index doc1 "Hello world" # Direct text
lil-rag index doc2 document.txt # From file
echo "Hello world" | lil-rag index doc3 - # From stdin
cat large_file.txt | lil-rag index doc4 - # Pipe large files
# Search examples
lil-rag search "hello" 10 # Search with limit
lil-rag search "machine learning concepts" # Default limit (10)
# Configuration
lil-rag config init # Initialize profile config
lil-rag config show # Show current config
lil-rag config set ollama.model all-MiniLM-L6-v2 # Update configuration

-db string Database path (overrides profile config)
-data-dir string Data directory (overrides profile config)
-ollama string Ollama URL (overrides profile config)
-model string Embedding model (overrides profile config)
-vector-size int Vector size (overrides profile config)
-help Show help

# Start with default settings (localhost:8080)
./bin/lil-rag-server
# Start with custom host/port
./bin/lil-rag-server --host 0.0.0.0 --port 9000

Visit http://localhost:8080 for the web interface with API documentation.
Index content with a unique document ID.
JSON Request:
curl -X POST http://localhost:8080/api/index \
  -H "Content-Type: application/json" \
  -d '{
    "id": "doc1",
    "text": "This document discusses machine learning algorithms and their applications in modern AI systems."
  }'

File Upload:
curl -X POST http://localhost:8080/api/index \
  -F "id=doc2" \
  -F "file=@document.pdf"

Response:
{
  "success": true,
  "id": "doc1",
  "message": "Successfully indexed 123 characters"
}

Search using query parameters.
# Basic search
curl "http://localhost:8080/api/search?query=machine%20learning&limit=5"
# Single result (returns complete document)
curl "http://localhost:8080/api/search?query=neural%20networks&limit=1"

Search using JSON body (recommended for complex queries).
curl -X POST http://localhost:8080/api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "artificial intelligence applications",
    "limit": 3
  }'

Response:
{
  "results": [
    {
      "ID": "doc1",
      "Text": "This document discusses machine learning algorithms...",
      "Score": 0.8542,
      "Metadata": {
        "chunk_index": 1,
        "chunk_type": "text",
        "is_chunk": true,
        "file_path": "/path/to/compressed/file.gz",
        "matching_chunk": "...algorithms and their applications..."
      }
    }
  ]
}

Health check endpoint for monitoring.
curl http://localhost:8080/api/health

Response:
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "version": "1.0.0"
}

Performance metrics and system information.
curl http://localhost:8080/api/metrics

Lil-RAG uses a profile-based configuration system that stores its settings in a JSON file in your user profile directory (~/.lilrag/config.json).
# Initialize profile configuration with defaults
./bin/lil-rag config init
# View current configuration
./bin/lil-rag config show

The configuration includes:
- Ollama Settings: Endpoint URL, embedding model, and vector size
- Storage: Database path and data directory for indexed content
- Server: HTTP server host and port
Example profile configuration (~/.lilrag/config.json):
{
  "ollama": {
    "endpoint": "http://localhost:11434",
    "embedding_model": "nomic-embed-text",
    "vector_size": 768
  },
  "storage_path": "/home/user/.lilrag/data/minirag.db",
  "data_dir": "/home/user/.lilrag/data",
  "server": {
    "host": "localhost",
    "port": 8080
  }
}

# Set Ollama endpoint
./bin/lil-rag config set ollama.endpoint http://192.168.1.100:11434
# Change embedding model
./bin/lil-rag config set ollama.model all-MiniLM-L6-v2
# Update vector size (must match model)
./bin/lil-rag config set ollama.vector-size 384
# Change data directory
./bin/lil-rag config set data.dir /path/to/my/data
# Update server settings
./bin/lil-rag config set server.port 9000

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"

	"lil-rag/pkg/minirag"
)

func main() {
	// Create configuration
	homeDir, err := os.UserHomeDir()
	if err != nil {
		log.Fatal(err)
	}
	dataDir := filepath.Join(homeDir, ".lilrag", "data")
	config := &minirag.Config{
		DatabasePath: filepath.Join(dataDir, "test.db"),
		DataDir:      dataDir,
		OllamaURL:    "http://localhost:11434",
		Model:        "nomic-embed-text",
		VectorSize:   768,
	}

	// Initialize LilRag
	rag, err := minirag.New(config)
	if err != nil {
		log.Fatal(err)
	}
	defer rag.Close()

	if err := rag.Initialize(); err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()

	// Index content - note the parameter order: text first, then id
	err = rag.Index(ctx, "This is a document about Go programming", "doc1")
	if err != nil {
		log.Fatal(err)
	}

	// Search for similar content
	results, err := rag.Search(ctx, "Go programming", 5)
	if err != nil {
		log.Fatal(err)
	}
	for _, result := range results {
		fmt.Printf("ID: %s, Score: %.4f\n", result.ID, result.Score)
		fmt.Printf("Text: %s\n\n", result.Text)
	}
}

# Run tests
make test
# Build for current platform
make build
# Build for all platforms (Linux, macOS, Windows)
make build-cross
# Format code
make fmt
# Lint code
make lint
# Clean build artifacts
make clean
# Install binaries to $GOPATH/bin
make install
# Show current version
make version

The project uses semantic versioning stored in the VERSION file. When code is merged to the main branch, the build system automatically:
- Increments the patch version (e.g., 1.0.0 → 1.0.1)
- Builds cross-platform binaries for Linux, macOS, and Windows
- Embeds the version into the binaries at build time
- Creates release archives with checksums
- Updates the VERSION file in the repository
The CI/CD system builds binaries for:
- Linux: AMD64, ARM64
- macOS: AMD64 (Intel), ARM64 (Apple Silicon)
- Windows: AMD64
All binaries include the version information and can be checked with:
./lil-rag --version
./lil-rag-server --version

lil-rag/
├── cmd/                    # Main applications
│   ├── lil-rag/            # CLI application
│   └── lil-rag-server/     # HTTP API server
├── pkg/                    # Public library packages
│   ├── minirag/            # Core RAG functionality
│   │   ├── storage.go      # SQLite + sqlite-vec storage
│   │   ├── embedder.go     # Ollama integration
│   │   ├── chunker.go      # Text chunking logic
│   │   ├── compression.go  # Gzip compression
│   │   ├── pdf.go          # PDF parsing
│   │   └── minirag.go      # Main library interface
│   └── config/             # Configuration management
├── internal/               # Private application code
│   └── handlers/           # HTTP request handlers
├── examples/               # Example programs
│   ├── library/            # Library usage example
│   └── profile/            # Profile config example
├── .github/                # GitHub templates and workflows
│   ├── workflows/          # CI/CD pipelines
│   └── ISSUE_TEMPLATE/     # Issue templates
└── docs/                   # Additional documentation

- Storage Layer: SQLite with sqlite-vec for efficient vector operations
- Embedding Layer: Ollama integration with configurable models
- Processing Layer: Text chunking, PDF parsing, and compression
- API Layer: REST endpoints and CLI interface
- Configuration: Profile-based user configuration system
- Profile config location: ~/.lilrag/config.json
- Initialize config if missing: lil-rag config init
- Check config values: lil-rag config show
- Reset to defaults: Delete config file and run lil-rag config init
- Ensure sqlite-vec is installed and available in your SQLite
- The extension file should be accessible as vec0
- Verify Ollama is running: ollama list
- Check the Ollama URL: lil-rag config show
- Update endpoint: lil-rag config set ollama.endpoint http://localhost:11434
- Ensure the embedding model is pulled: ollama pull nomic-embed-text
- Different models have different vector sizes
- Common sizes: 768 (nomic-embed-text), 384 (all-MiniLM-L6-v2), 1536 (text-embedding-ada-002)
- Update vector size: lil-rag config set ollama.vector-size 768
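
A vector size mismatch can be caught before indexing with a small check like the one sketched below. The size table contains only the models named in the list above; `checkVectorSize` and `checkEmbedding` are hypothetical helpers written for this example, not functions provided by lil-rag.

```go
package main

import "fmt"

// knownVectorSizes holds the model sizes listed in this README.
var knownVectorSizes = map[string]int{
	"nomic-embed-text":       768,
	"all-MiniLM-L6-v2":       384,
	"text-embedding-ada-002": 1536,
}

// checkVectorSize compares a configured size with the known size of a model.
// Models missing from the table are accepted, since there is nothing to compare.
func checkVectorSize(model string, configured int) error {
	want, ok := knownVectorSizes[model]
	if !ok {
		return nil
	}
	if want != configured {
		return fmt.Errorf("model %q produces %d-dimensional vectors, but vector size is set to %d",
			model, want, configured)
	}
	return nil
}

// checkEmbedding verifies that an embedding has the configured length.
func checkEmbedding(embedding []float32, configured int) error {
	if len(embedding) != configured {
		return fmt.Errorf("embedding has %d dimensions, expected %d", len(embedding), configured)
	}
	return nil
}

func main() {
	// A matching pair passes; a mismatched pair reports an error.
	fmt.Println(checkVectorSize("nomic-embed-text", 768))
	fmt.Println(checkVectorSize("all-MiniLM-L6-v2", 768))
	fmt.Println(checkEmbedding(make([]float32, 384), 768))
}
```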
- Files are stored in the configured data directory
- Check location: lil-rag config show
- Change location: lil-rag config set data.dir /path/to/data
- Ensure write permissions to the directory
MIT License