Goal: Minimal viable infrastructure
- Project scaffold
- Documentation structure
- uv project setup
- Basic package structure
- Configuration management
- Logging infrastructure
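Configuration management can be sketched as an environment-backed settings object. This is a minimal illustration, not the project's actual implementation; the variable names (`TELEGRAM_TOKEN`, `DB_PATH`, `LOG_LEVEL`) and defaults are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Immutable settings loaded once at startup."""
    telegram_token: str
    db_path: str = "data/bot.db"
    log_level: str = "INFO"

    @classmethod
    def from_env(cls) -> "Settings":
        # Read from environment with safe defaults; the token has
        # no default on purpose so a missing value is easy to detect.
        return cls(
            telegram_token=os.environ.get("TELEGRAM_TOKEN", ""),
            db_path=os.environ.get("DB_PATH", "data/bot.db"),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
        )
```

A frozen dataclass keeps configuration read-only after startup, which makes it safe to share across agents.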
Goal: Single agent responding via Telegram
- LLM provider abstraction (Claude API)
- Basic Telegram interface
- Main dialog agent
- Simple prompt template
- CLI interface for testing
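One way the provider abstraction could look, sketched under assumptions: a small interface that concrete classes (Claude API, later OpenRouter or Ollama) implement, plus a stub provider so the CLI can be tested offline. `LLMResponse`, `EchoProvider`, and the field names are illustrative, not the actual implementation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class LLMResponse:
    """Normalized response shape shared by all providers."""
    text: str
    model: str
    input_tokens: int = 0
    output_tokens: int = 0


class LLMProvider(ABC):
    """Minimal provider interface; concrete classes wrap vendor SDKs."""

    @abstractmethod
    def complete(self, prompt: str, system: str = "") -> LLMResponse: ...


class EchoProvider(LLMProvider):
    """Stub provider for CLI testing without network access."""

    def complete(self, prompt: str, system: str = "") -> LLMResponse:
        return LLMResponse(text=f"echo: {prompt}", model="echo-stub")
```

Normalizing every provider to one response type keeps the dialog agent independent of vendor SDK details.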
Goal: Persistent context across sessions
- SQLite storage layer
- Working memory (session)
- Episodic memory (conversation logs)
- Basic retrieval (FTS5 + fallback)
- Memory injection into prompts
- Core memory blocks (Letta concept)
- Conversation summarization on session end
- Memory importance scoring
Exit criteria: Bot remembers previous conversations.
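The FTS5-with-fallback retrieval could be sketched like this, assuming a `memories_fts` virtual table exists and FTS5 is compiled into the sqlite3 build (true of most CPython distributions); the table and column names are illustrative.

```python
import sqlite3


def search_memories(conn: sqlite3.Connection, query: str) -> list[str]:
    """Try FTS5 full-text search; fall back to LIKE if FTS5 is unavailable."""
    try:
        rows = conn.execute(
            "SELECT content FROM memories_fts WHERE memories_fts MATCH ?",
            (query,),
        ).fetchall()
    except sqlite3.OperationalError:
        # Fallback path for builds without FTS5 or missing virtual table.
        rows = conn.execute(
            "SELECT content FROM memories WHERE content LIKE ?",
            (f"%{query}%",),
        ).fetchall()
    return [r[0] for r in rows]
```

FTS5 gives ranked token matching; the `LIKE` fallback trades quality for portability.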
Goal: LLM flexibility and cost optimization
- Provider router with fallback logic
- Cost tracking per request
- OpenRouter integration
- Local LLM support (Ollama/LM Studio)
- Model selection by task type (TaskType enum)
Exit criteria: Different queries route to appropriate LLM.
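The fallback logic in the provider router can be illustrated with a priority-ordered list of callables; provider names and the `ProviderError` type here are assumptions for the sketch, not the real API.

```python
class ProviderError(Exception):
    """Raised when a provider cannot serve the request."""


def route_with_fallback(providers, prompt):
    """Try (name, callable) pairs in priority order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    raise ProviderError(f"all providers failed: {errors}")


def flaky(prompt):
    # Stand-in for a rate-limited remote provider.
    raise ProviderError("rate limited")


def local(prompt):
    # Stand-in for a local model (e.g. served by Ollama).
    return f"local answer to: {prompt}"
```

Collecting per-provider errors before raising makes the final failure message useful for cost/health diagnostics.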
Goal: Proactive behavior
- Agent lifecycle management (Orchestrator)
- Sleep agent (memory consolidation)
- Awareness agent (proactive checks)
- Async task scheduler
- Semantic memory extraction
Exit criteria: Bot consolidates memories while idle and can proactively notify the user.
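A toy sketch of the idle-triggered sleep agent, assuming the orchestrator polls a last-activity timestamp; class and method names are illustrative only.

```python
import asyncio


class Orchestrator:
    """Toy lifecycle manager: runs a sleep-agent job once the dialog goes idle."""

    def __init__(self, idle_after: float):
        self.idle_after = idle_after
        self.consolidated: list[str] = []

    async def sleep_agent(self) -> None:
        # The real version would summarize episodic memory into semantic facts.
        self.consolidated.append("summary-of-session")

    async def watch_idle(self, last_activity: float) -> None:
        loop = asyncio.get_running_loop()
        # Poll until no activity has occurred for `idle_after` seconds.
        while loop.time() - last_activity < self.idle_after:
            await asyncio.sleep(0.01)
        await self.sleep_agent()


async def demo() -> list[str]:
    orch = Orchestrator(idle_after=0.05)
    await orch.watch_idle(asyncio.get_running_loop().time())
    return orch.consolidated
```

Running consolidation as a background coroutine keeps it off the dialog agent's critical path.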
Goal: Cost-optimized model selection by task type
- Task-based model selection (not fallback)
- Model difficulty registry (HARD/INTERMEDIATE/EASY)
- Task→Difficulty mapping (CHAT→HARD, SUMMARIZATION→INTERMEDIATE, etc.)
- Cost tracking per request with budget enforcement
- Automatic downgrade when approaching budget limit
- Model capability filtering (multimodal, context window)
- Provider fallback and health checking
- Integration tests for routing behavior
Exit criteria: Each task type routes to optimal model by cost/capability. ✅
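The difficulty registry and budget-based downgrade could look roughly like this; the model names and the 90% budget threshold are illustrative assumptions, not the project's actual configuration.

```python
from enum import Enum


class Difficulty(Enum):
    EASY = 1
    INTERMEDIATE = 2
    HARD = 3


# Illustrative mappings; a real registry would live in configuration.
TASK_DIFFICULTY = {
    "CHAT": Difficulty.HARD,
    "SUMMARIZATION": Difficulty.INTERMEDIATE,
    "CLASSIFICATION": Difficulty.EASY,
}
MODEL_BY_DIFFICULTY = {
    Difficulty.HARD: "frontier-model",
    Difficulty.INTERMEDIATE: "mid-tier-model",
    Difficulty.EASY: "small-local-model",
}


def pick_model(task: str, budget_used_ratio: float) -> str:
    difficulty = TASK_DIFFICULTY[task]
    # Automatic downgrade when approaching the budget limit.
    if budget_used_ratio >= 0.9 and difficulty is Difficulty.HARD:
        difficulty = Difficulty.INTERMEDIATE
    return MODEL_BY_DIFFICULTY[difficulty]
```

Routing by difficulty rather than by raw fallback order means cheap tasks never touch expensive models, even when the budget is healthy.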
Goal: Sandboxed environment for code execution
- Workspace directory structure (data/workspace/)
- Python environment isolation (venv per workspace)
- Script execution with output capture (stdout/stderr)
- File read/write within sandbox with path validation
- Execution timeout and resource limits
- Safety validator (blocks dangerous imports/functions)
- CodeAgent for LLM-driven script generation
- Comprehensive test coverage
Exit criteria: Bot can write and execute Python scripts and read results safely. ✅
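Path validation and timeout-limited execution can be sketched as below; this is a simplified illustration (a temp directory standing in for `data/workspace/`, no venv isolation or import safety checks), not the actual sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

WORKSPACE = Path(tempfile.mkdtemp())  # stands in for data/workspace/


def validate_path(relative: str) -> Path:
    """Resolve a path and ensure it stays inside the workspace sandbox."""
    resolved = (WORKSPACE / relative).resolve()
    if not resolved.is_relative_to(WORKSPACE.resolve()):
        raise PermissionError(f"path escapes sandbox: {relative}")
    return resolved


def run_script(code: str, timeout: float = 5.0):
    """Write a script into the sandbox and run it, capturing stdout/stderr.

    subprocess.run raises TimeoutExpired if the script exceeds `timeout`.
    """
    script = validate_path("script.py")
    script.write_text(code)
    proc = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=timeout,
        cwd=WORKSPACE,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Resolving paths before comparison defeats `../` traversal; resource limits beyond the wall-clock timeout (memory, CPU) need OS-level mechanisms.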
Goal: Proactive async task execution
- Task type system (REMINDER, AGENT_TASK, API_CALL, WEB_SEARCH)
- Schedule parsing (delays: "5m", "2h"; patterns: "daily 9am", "weekdays 6pm")
- SQLite-backed task persistence with indexing
- Recurring task rescheduling after execution
- Task execution engine with notification callbacks
- Telegram commands: /remind, /schedule, /tasks, /cancel
- Integration with AwarenessAgent for proactive checks
- Comprehensive integration test suite
Exit criteria: Users can schedule one-time and recurring tasks via natural language. ✅ Documentation: task_system.md
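The delay half of schedule parsing ("5m", "2h") could be implemented roughly as follows; the regex and unit table are an illustrative sketch, and pattern schedules like "daily 9am" would need a separate parser.

```python
import re
from datetime import timedelta

DELAY_RE = re.compile(r"^(\d+)([smhd])$")
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_delay(spec: str) -> timedelta:
    """Parse compact delay specs like '5m' or '2h' into a timedelta."""
    match = DELAY_RE.match(spec.strip().lower())
    if not match:
        raise ValueError(f"unrecognized delay: {spec!r}")
    amount, unit = match.groups()
    return timedelta(**{UNITS[unit]: int(amount)})
```

Returning a `timedelta` lets the scheduler compute the absolute fire time once, at enqueue, so persisted tasks survive restarts.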
Goal: LLM-driven function execution via native APIs
- Tool definition system with @tool decorator
- Native API integration (Anthropic tool use, OpenAI function calling)
- Tool registry with provider-specific converters
- Built-in tools: task management, system utilities
- Tool executor with validation and error handling
- DialogAgent integration with two-pass approach
- Automatic tool result formatting for LLM
- Integration tests with live API calls
Exit criteria: Agent can call functions from natural language using native provider APIs. ✅ Documentation: tool_calling.md
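A minimal sketch of the @tool decorator and registry, assuming specs are later converted to provider-specific schemas; the registry shape and the everything-is-a-string simplification are assumptions, not the actual implementation.

```python
import inspect

TOOL_REGISTRY: dict[str, dict] = {}


def tool(fn):
    """Register fn with a minimal spec derived from its signature.

    Real converters would translate this spec into Anthropic tool-use
    or OpenAI function-calling JSON schemas.
    """
    params = {
        name: {"type": "string"}  # simplification: every parameter is a string
        for name in inspect.signature(fn).parameters
    }
    TOOL_REGISTRY[fn.__name__] = {
        "description": (fn.__doc__ or "").strip(),
        "parameters": params,
        "fn": fn,
    }
    return fn


@tool
def cancel_task(task_id):
    """Cancel a scheduled task by id."""
    return f"cancelled {task_id}"
```

Deriving the spec from the signature and docstring keeps tool definitions in one place, so the LLM-facing description can never drift from the code.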
Goal: Safe autonomous operation
- Action classification system
- Approval workflow
- Audit logging
- Test harness for code changes
- Self-modification sandbox (uses Tool Workspace)
Exit criteria: Agent can propose and safely apply code changes.
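The action classification system could start from a severity enum with an auto-approve threshold; the three levels mirror the read/write/execute split planned above, but the threshold and names here are illustrative assumptions.

```python
from enum import Enum


class Severity(Enum):
    READ = 1     # e.g. reading a file, searching memory
    WRITE = 2    # e.g. editing workspace files
    EXECUTE = 3  # e.g. running generated code, self-modification


# Actions at or below this level proceed without user sign-off.
AUTO_APPROVE_MAX = Severity.READ


def requires_approval(severity: Severity) -> bool:
    """High-risk actions above the auto-approve threshold need user sign-off."""
    return severity.value > AUTO_APPROVE_MAX.value
```

Keeping the threshold as data rather than scattered `if` checks means the approval policy can be tightened or relaxed in one place, and every decision is easy to audit-log.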
Goal: Full multimodal support
- Voice message handling (STT)
- Image understanding
- File processing
- Inline keyboards/actions
- Notification preferences
Exit criteria: Natural multimodal conversations.
- Code agent (uses Tool Workspace)
- Research agent (web search, synthesis)
- Calendar agent (scheduling)
- Calendar APIs
- Note-taking apps
- Smart home
- User isolation
- Shared knowledge base
- User-to-user introductions
- Vertical slices: Each phase delivers usable functionality
- Test-driven: Tests before features
- Documentation-first: Design before code
- Minimal dependencies: Add only when needed
- Machine-readable: Code optimized for AI understanding
Phase 7: Safety & Self-Modification (Not Started)
Phases 0-6 are complete, including:
- ✅ Core infrastructure and memory
- ✅ Multi-provider LLM routing with cost optimization
- ✅ Tool Workspace for sandboxed code execution
- ✅ Task scheduling system for proactive behavior
- ✅ Native API tool calling framework
Next actions:
- Design action classification system (read/write/execute severity levels)
- Implement approval workflow for high-risk actions
- Add audit logging for all agent actions
- Create test harness for proposed code changes
- Build self-modification sandbox using Tool Workspace