A distributed knowledge graph system that tracks what's on all your machines: what's installed, what's running, what changed, what you were looking at, and what you were typing. It uses a master/satellite architecture with always-on queue services, designed for zero data loss.
- System State Graph: Maps entities and topology across all machines (files, projects, containers, services, configs)
- Memory & Change: Event stream tracking file changes, container events, service restarts, package installations
- Visual Timeline: Continuous screenshots with smart diffing, OCR, and lazy processing
- GUI Event Capture: AT-SPI collector (ColdWatch lineage) records focused text and accessibility events
- MCP Registry & Control: Central orchestrator for all MCP servers (Docker + local tools)
- Agent Interface: Natural language queries powered by Ollama (local) with OpenRouter fallback
- Distributed Architecture: Master/satellite/queue topology for managing multiple computers
- Backup Management: Neo4j, configs, Docker volumes → S3 with verification and FIDO2 encryption
- Network Monitoring: Interfaces, firewall rules, DNS, VPN, routing tables
- Git Runner Management: Self-hosted CI/CD runners (GitHub Actions, GitLab CI) with auto-scaling
- Obsidian Vault Sync: Auto-commit, conflict detection, media rsync, backup integration
- Data Retention: Configurable policies per domain with exemptions and GDPR compliance
- FIDO2 Encryption: Hardware key encryption for backups, secrets, and sensitive data
- domains/system_graph/ - Project, software, Docker, config scanners
- domains/memory_change/ - File system watchers, event tracking
- domains/visual_timeline/ - Screenshot capture, OCR, embeddings
- domains/mcp_registry/ - MCP server registry and Docker control
- domains/agent_interface/ - Query routing, LLM integration
- domains/gui_collector/ - AT-SPI ingestion and normalization
- domains/file_ingest/ - Downloads dedupe, routing, and ingestion metadata
- Media Deduplication: SHA-256 hash-based duplicate detection with tag routing
- Document Ingestion: Automatic RAG system feeding with age-based stale management
- Export Processing: Zip extraction and categorization for knowledge artifacts
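The hash-based duplicate detection described above can be illustrated with a minimal sketch. It assumes duplicates are defined as byte-identical files; the function names `sha256_of` and `find_duplicates` are hypothetical and are not taken from the project's code.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by content hash and keep groups with more than one file."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Each returned group maps one SHA-256 digest to all paths sharing that content, so a caller could keep the first path and route or discard the rest.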
- Database: Neo4j 5.x (graph + vector search)
- API: FastAPI + Uvicorn
- LLM: Ollama (local) with OpenRouter fallback
- OCR: Tesseract + AT-SPI
- Containerization: Docker + Docker Compose
- Docker & Docker Compose
- Ollama running at http://192.168.1.69:11434 (or configure in .env)
- X11 display (for screenshot capture)
- Clone the repository
- Copy configuration template:
  cp config.toml.example config.toml
  Or use environment variables:
  cp .env.example .env
- Edit config.toml (or .env) with your configuration
- Start services:
  docker-compose up -d
- Initialize Neo4j schema:
  docker-compose exec api python -m scripts.init_schema
- Neo4j Browser: http://localhost:7474 (user: neo4j, pass: watchman123)
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
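Once the API is reachable at the address above, natural-language queries go to `POST /ask` with a JSON body containing a `query` field, as in the curl examples in this README. The following is a hedged Python sketch of the same call using only the standard library. The helper names `build_ask_request` and `ask` are hypothetical, and because the response schema is not documented here, the body is returned as raw text.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # API address listed in this README


def build_ask_request(query: str, base_url: str = API_BASE) -> urllib.request.Request:
    """Build the POST /ask request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, base_url: str = API_BASE, timeout: float = 30.0) -> str:
    """Send the query and return the raw response body (schema is an unknown here)."""
    request = build_ask_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the stack running, `print(ask("Which MCP servers are running?"))` would issue the same request as the corresponding curl command.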
Locate files/projects:
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Where is my docker-compose.yml for the dashboard?"}'
Recent changes:
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "What changed in /etc since 10:00?"}'
Screen recall:
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Find OCR text about TLS cert from this morning"}'
MCP status:
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Which MCP servers are running?"}'
Trigger system scan:
curl -X POST http://localhost:8000/admin/scan
Force screenshot capture:
curl -X POST http://localhost:8000/admin/screenshot
Start MCP server:
curl -X POST http://localhost:8000/mcp/start/bookmarks

.
├── app/ # FastAPI application
│ ├── api/ # API routes
│ ├── models/ # Pydantic models
│ └── utils/ # Shared utilities
├── domains/ # Domain implementations
├── schemas/ # Neo4j schemas
├── config/ # Configuration files
└── tests/ # Tests
# Run all tests
docker-compose exec api pytest
# Run specific domain tests
docker-compose exec api pytest tests/unit/test_visual_timeline.py
# Integration tests
docker-compose exec api pytest tests/integration/
- Create scanner in appropriate domain (e.g., domains/system_graph/scanners/)
- Implement scanner interface
- Register in scanner registry
- Add tests
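The steps above can be sketched as follows. This README does not show the project's actual scanner interface or registry, so everything here (`Scanner`, `SCANNER_REGISTRY`, `register_scanner`, `CronScanner`, `run_all`) is a hypothetical illustration of the pattern, not the real API.

```python
from abc import ABC, abstractmethod
from typing import Any


class Scanner(ABC):
    """Hypothetical scanner interface; the project's real base class may differ."""

    name: str

    @abstractmethod
    def scan(self) -> list[dict[str, Any]]:
        """Return discovered entities as plain dictionaries."""


# Hypothetical registry mapping scanner names to scanner classes.
SCANNER_REGISTRY: dict[str, type[Scanner]] = {}


def register_scanner(cls: type[Scanner]) -> type[Scanner]:
    """Class decorator that adds a scanner to the registry under its name."""
    if cls.name in SCANNER_REGISTRY:
        raise ValueError(f"scanner {cls.name!r} is already registered")
    SCANNER_REGISTRY[cls.name] = cls
    return cls


@register_scanner
class CronScanner(Scanner):
    """Example scanner returning a fixed entity; a real one would inspect the host."""

    name = "cron"

    def scan(self) -> list[dict[str, Any]]:
        return [{"type": "CronJob", "source": self.name}]


def run_all() -> dict[str, list[dict[str, Any]]]:
    """Instantiate every registered scanner and collect its results by name."""
    return {name: cls().scan() for name, cls in SCANNER_REGISTRY.items()}
```

A decorator-based registry keeps registration next to the class definition, which makes a forgotten registration step easier to spot in review; a unit test can then assert that the new scanner's name appears in the registry.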
The Watchman supports TOML configuration files (recommended) or environment variables (legacy).
Priority order: config.toml > environment variables > .env file > defaults
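One way to picture this priority order is as a layered lookup in which the first source that defines a key wins. The sketch below uses `collections.ChainMap` from the standard library; the function name `layered_settings` and the flattened key names are assumptions for illustration, and the project's actual loader may work differently (for example, by coercing environment strings to typed values).

```python
from collections import ChainMap
from typing import Any, Mapping


def layered_settings(
    toml_values: Mapping[str, Any],
    env_values: Mapping[str, Any],
    dotenv_values: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> ChainMap:
    """Return a view where lookups search config.toml, then env, then .env, then defaults."""
    # ChainMap searches its maps left to right, so the order encodes the priority.
    return ChainMap(dict(toml_values), dict(env_values), dict(dotenv_values), dict(defaults))


settings = layered_settings(
    toml_values={"screenshot_interval": 60},
    env_values={"screenshot_interval": "120", "diff_threshold": "0.2"},
    dotenv_values={"diff_threshold": "0.3"},
    defaults={"screenshot_interval": 300, "diff_threshold": 0.10},
)
```

Here `settings["screenshot_interval"]` resolves to the config.toml value and `settings["diff_threshold"]` to the environment value, since config.toml does not define it. Note that environment values arrive as strings, so a real loader would still need a type-conversion step.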
Copy config.toml.example to config.toml and customize:
[screenshot]
interval = 300 # seconds
enable_diffing = true # Only capture when screen changes
diff_threshold = 0.10 # 10% change required
[screenshot.smart_capture]
enable_smart_capture = true
capture_on_app_switch = true # Capture when changing apps
capture_on_idle_return = true # Capture when returning from idle
[ocr]
enable_lazy_processing = false # Process immediately vs. on-demand
[privacy]
redact_patterns = [".*@.*\\.com", "sk-.*", "ghp_.*"]
exclude_apps = ["keepassxc", "1password"]
[features]
visual_timeline = true
system_graph = true
gui_collector = false # AT-SPI event capture
See config.toml.example for all available options.
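To make the `diff_threshold` setting concrete, here is a minimal sketch of a change check. It assumes a frame is a flat sequence of grayscale pixel values and that "change" means the fraction of pixels that differ; the project may use a different metric (such as perceptual hashing), and the function names are hypothetical.

```python
from typing import Sequence


def changed_fraction(previous: Sequence[int], current: Sequence[int], tolerance: int = 0) -> float:
    """Return the fraction of pixels whose value differs by more than `tolerance`."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    if not previous:
        return 0.0
    changed = sum(1 for a, b in zip(previous, current) if abs(a - b) > tolerance)
    return changed / len(previous)


def should_capture(
    previous: Sequence[int], current: Sequence[int], diff_threshold: float = 0.10
) -> bool:
    """Keep the new frame only when at least `diff_threshold` of the pixels changed."""
    return changed_fraction(previous, current) >= diff_threshold
```

With the default threshold of 0.10, a frame in which 10 of 100 pixels changed would be kept, while one with only 5 changed pixels would be skipped.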
Still supported for backward compatibility:
- SCREENSHOT_INTERVAL: Capture interval in seconds (default: 300)
- REDACT_PATTERNS: Regex patterns to redact from OCR
- EXCLUDE_APPS: Apps to skip screenshot or GUI capture
- IMAGE_RETENTION_DAYS: How long to keep raw images (default: 14)
- OCR_RETENTION_DAYS: How long to keep OCR text (default: 90)
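As an illustration of how redaction patterns might be applied to OCR text, the sketch below uses the three patterns from the config.toml example in this README. It assumes each pattern is matched against whole whitespace-separated tokens; that matching strategy, the `redact` function, and the `[REDACTED]` mask are assumptions rather than the project's documented behavior.

```python
import re
from typing import Sequence

# Patterns from the [privacy] example in this README, written as Python raw strings.
REDACT_PATTERNS = [r".*@.*\.com", r"sk-.*", r"ghp_.*"]


def redact(text: str, patterns: Sequence[str] = REDACT_PATTERNS, mask: str = "[REDACTED]") -> str:
    """Replace every whitespace-separated token that fully matches any pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    # The capturing group keeps the whitespace runs so the original layout is preserved.
    tokens = re.split(r"(\s+)", text)
    return "".join(
        mask if any(rx.fullmatch(token) for rx in compiled) else token
        for token in tokens
    )
```

Matching per token matters because patterns such as `.*@.*\.com` are greedy: applied to a whole line with a plain substitution, they would swallow everything before the address as well.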
Check health: docker-compose ps
View logs: docker-compose logs neo4j
Ensure X11 forwarding: xhost +local:docker
Check DISPLAY variable is set
Verify Tesseract is installed in container
Check OCR worker logs: docker-compose logs ocr-worker
Ensure the container runs under a desktop user, verify GTK_MODULES configuration, and confirm AT-SPI packages are installed. See docs/unified/troubleshooting.md for deeper diagnostics.
Verify minimum file age, destination permissions, and lockfile status. See docs/unified/troubleshooting.md for the full checklist.
- Unified architecture: docs/unified/architecture.md - Complete system overview
- Distributed architecture: docs/unified/distributed_architecture.md - Master/satellite/queue topology
- System management: docs/unified/system_management.md - Git runners, Obsidian, backups, network, retention, FIDO2
- MCP management: docs/unified/mcp_management.md - MCP server orchestration strategy
- Smart capture: docs/unified/smart_capture.md - Screenshot diffing, lazy OCR, smart triggers
- Privacy & data handling: docs/unified/privacy.md
- Testing plan: docs/unified/testing.md
- Troubleshooting: docs/unified/troubleshooting.md
- Logging standards: docs/observability/logging.md
- Phase 0: Foundation & Contracts ✅
- Phase 1: Domain Implementations (70% complete)
  - Visual Timeline (Basic) ✅
  - Visual Timeline (Smart Features) - Planned
    - Screenshot diffing/hashing
    - Smart capture triggers
    - Similarity clustering
    - Lazy OCR processing
  - System Graph Seeders ✅
  - Event & Change Tracking ✅
  - File Ingest Domain ✅
- Phase 2: Distributed & Infrastructure
  - Distributed Architecture
    - Master/satellite/queue modes
    - Data forwarding and offline buffering
    - Machine provisioning API
    - Always-on queue service (Raspberry Pi)
  - MCP Registry & Control (25% - stubs)
    - Complete MCP server lifecycle management
    - Docker Hub integration
    - Local tool installation strategy
  - System Management Domain
    - Software install monitoring (documented)
    - Backup management (orchestrator, verification, S3, FIDO2)
    - Network monitoring (topology, firewall, VPN)
    - Git runner management (GitHub Actions, GitLab CI)
    - Obsidian vault sync (git, rsync, conflicts)
    - Data retention policies (per-domain, exemptions)
    - FIDO2 encryption (backups, secrets)
    - Resource monitoring (CPU, memory, disk, I/O)
    - Configuration management (validation, rollback)
- Phase 3: Integration & Orchestration
  - Agent Interface completion
  - Review API (/review endpoint)
  - MCP orchestration & assignment
- Phase 4: Advanced Features
  - Web UI (React + GraphQL)
  - Browser extension integration
  - Mobile companion app
  - Proactive automation triggers
  - Security monitoring (CVE tracking, audit trail)
MIT
See CONTRIBUTING.md for development guidelines.