This document lists all features that are currently implemented and working in Autobeat.
Last Updated: May 2026 (2026-05-08)
- API translation proxy: HTTP proxy translating Anthropic Messages API to OpenAI Chat Completions; enables Claude on any OpenAI-compatible backend (OpenRouter, Together, vLLM, etc.)
- Proxy configuration: `beat agents config set <agent> proxy openai` CLI and `ConfigureAgent` MCP tool with `proxy` parameter
- Translation architecture: Codecs (bidirectional format translation), middleware (streaming, errors, headers), IR (format-agnostic message passing)
- Ollama runtime integration: Wraps agent spawns with `ollama launch` for local LLM execution; `beat agents config set <agent> runtime ollama`
- Proxy/runtime mutual exclusivity: Setting proxy clears runtime and vice versa
- Model schema relaxed: Model names accept `/`, `:`, `@` separators for Ollama-style identifiers (e.g., `llama3:8b`)
- Interactive orchestrator mode: `beat orchestrate --interactive/-i "<goal>"` runs a foreground TTY session with SIGINT coordination and PID tracking
- Dashboard layout overhaul: Responsive 3-tile top row (Tasks, Workers, Orchestrations), full-width entity browser, full entity names, degraded modes for narrow terminals
- Pipeline management MCP tools: `PipelineStatus`, `ListPipelines`, `CancelPipeline` for first-class pipeline entities
- Skills/docs alignment: Skill content updated to cover v1.2.0–v1.4.0 features
- Migration v24: `pipelines` table for first-class pipeline entities with steps, status, foreign keys, and indexes
- Migration v25: `orchestrations.mode` (CHECK: standard/interactive) and `orchestrations.pid` columns
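The proxy/runtime mutual exclusivity rule can be pictured as a pair of pure update functions. This is only a sketch: the `AgentConfig` shape and the `setProxy`/`setRuntime` names are hypothetical and are not taken from Autobeat's source.

```typescript
// Hypothetical shape of one agent's stored config; field names are assumptions.
interface AgentConfig {
  readonly model?: string;
  readonly proxy?: string; // e.g. "openai"
  readonly runtime?: string; // e.g. "ollama"
}

// Setting a proxy drops any runtime and leaves the other fields untouched.
function setProxy(config: AgentConfig, proxy: string): AgentConfig {
  const { runtime: _dropped, ...rest } = config;
  return { ...rest, proxy };
}

// Setting a runtime drops any proxy, mirroring setProxy.
function setRuntime(config: AgentConfig, runtime: string): AgentConfig {
  const { proxy: _dropped, ...rest } = config;
  return { ...rest, runtime };
}

const withProxy = setProxy({ model: "llama3:8b", runtime: "ollama" }, "openai");
const withRuntime = setRuntime(withProxy, "ollama");
```

Returning a new object rather than mutating in place matches the immutable-domain style described elsewhere in this document.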
- System prompt support: `--system-prompt` flag on `beat run`, `beat loop`, `beat orchestrate`; `systemPrompt` param on MCP `DelegateTask`, `CreateLoop`, `CreateOrchestrator`, `ScheduleTask`, `SchedulePipeline`, `ScheduleLoop` tools
- Per-agent injection: Claude (`--append-system-prompt`), Codex (`-c developer_instructions`), Gemini (`GEMINI_SYSTEM_MD` env var)
- System prompt persistence: Stored per-task in database; survives retry/resume; `TaskStatus` returns it via `includeSystemPrompt: true`
- Custom orchestrator scaffolding: `beat orchestrate init <goal>` CLI command and `InitCustomOrchestrator` MCP tool
- Generated artifacts: State file, exit condition script, and system prompt snippets for delegation, state management, and constraint enforcement
- Agent configuration documentation: README section covering API keys, base URLs, model selection, and local LLM usage
- Migration 23: Adds `system_prompt TEXT` column to `tasks` table (nullable, auto-applied)
- Two-view dashboard: Metrics view (overview tiles + activity feed) and Workspace view (per-orchestration task grid)
- Live agent output streaming: 1-2s latency via per-task polling with ring buffer; auto-tail follows new output
- Cost and token tracking: Captures input/output/cache tokens and USD cost for Claude agents; stored in `task_usage` table; 24h rolling aggregate in Metrics view
- Cancel cascade for orchestrations: `c` key on an orchestration in the dashboard cancels it and all attributed child tasks in one operation
- Responsive layout: Terminal size detection via stderr; graceful degradation to narrow-mode or too-small message
- Orchestrator_id propagation: Sub-tasks spawned by orchestrators are attributed via `orchestrator_id` column; propagated through both CLI (`AUTOBEAT_ORCHESTRATOR_ID` env var) and MCP (`metadata.orchestratorId`) spawn paths
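The live output streaming item mentions a ring buffer. A fixed-capacity buffer that keeps only the newest lines could look roughly like the sketch below; the class name, its API, and the capacity are illustrative assumptions, not Autobeat's actual implementation.

```typescript
// Minimal fixed-capacity ring buffer: once full, each push overwrites the oldest item.
class RingBuffer<T> {
  private readonly items: (T | undefined)[];
  private start = 0; // index of the oldest item
  private count = 0; // number of stored items

  constructor(private readonly capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.items = new Array<T | undefined>(capacity);
  }

  push(item: T): void {
    const index = (this.start + this.count) % this.capacity;
    this.items[index] = item;
    if (this.count < this.capacity) {
      this.count++;
    } else {
      // Buffer was full: the slot just written held the oldest item.
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Returns the stored items from oldest to newest.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.items[(this.start + i) % this.capacity] as T);
    }
    return out;
  }
}

const outputLines = new RingBuffer<string>(3);
for (const line of ["a", "b", "c", "d", "e"]) outputLines.push(line);
const visibleTail = outputLines.toArray(); // oldest entries "a" and "b" were overwritten
```

A bounded buffer like this keeps memory use constant per task panel regardless of how much output an agent produces.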
| Key | Action |
|---|---|
| `v` | Toggle between Metrics and Workspace views |
| `m` | Jump to Metrics view from anywhere |
| `w` | Jump to Workspace view from anywhere |
| `↑`/`↓` in Activity panel | Navigate activity feed rows |
| `Enter` on Activity row | Drill into detail view for that entity |
| `Tab` (from orchestrations panel) | Cycle focus into Activity feed |
| `Esc` in Activity focus | Return to panel grid |
| `Tab` / `Shift+Tab` (Workspace) | Cycle focus: nav ↔ grid panels |
| `1`–`9` (Workspace grid) | Jump to panel by number |
| `f` (Workspace grid) | Toggle fullscreen for focused task panel |
| `[` / `]` | Scroll focused task panel up/down |
| `g` | Jump to top of focused task panel (disables auto-tail) |
| `G` | Jump to bottom of focused task panel (re-engages auto-tail) |
| `c` (Workspace) | Cancel orchestration (nav focus) or child task (grid focus) |
| `PgUp` / `PgDn` (Workspace) | Page through task grid |
| `d` (Workspace grid) | Delete focused child task (terminal status only) |
- Three eval strategies via `evalType` field on loop creation:
  - `'feedforward'` (default): gathers agent findings per iteration and injects them as context into the next iteration's prompt; always continues to `maxIterations`. When no `evalPrompt` is set, acts as a pure pass-through (no eval agent spawned).
  - `'judge'`: two-phase eval+judge. Phase 1 eval agent generates findings; Phase 2 judge agent reads findings and writes a decision to `.autobeat-judge-task-{uuid}` in the working directory (TOCTOU-safe). Falls back safely to `continue: true` if both the structured output and file mechanisms fail.
  - `'schema'`: deterministic Claude `--json-schema` eval; the eval agent must respond with `{"continue": bool, "reasoning": string}`. No judge agent required.
- `judgeAgent` and `judgePrompt` fields for configuring the judge phase independently of the eval agent
- Atomic PID file locking for the schedule executor: `O_EXCL` file creation prevents double-execution races; stale PID files are cleaned up automatically
- `SpawnOptions` interface: `AgentAdapter.spawn()` now accepts a named options object instead of 6 positional parameters (internal refactor, no user-visible behavior change)
- Extracted pure functions with full unit test coverage: `refetchAfterAgentEval`, `handleStopDecision`, `buildEvalPromptBase`, `acquirePidFile`, `checkActiveSchedules`, `registerSignalHandlers`, `startIdleCheckLoop`
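The PID file locking described above relies on exclusive file creation. A minimal sketch of that idea with Node's `fs` module follows; `acquirePidLockSketch` and `isProcessAlive` are hypothetical names (not the project's `acquirePidFile`), and the stale-file policy shown is an assumption.

```typescript
import { closeSync, mkdtempSync, openSync, readFileSync, unlinkSync, writeFileSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Signal 0 delivers nothing; it only checks whether the process exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Returns true if this call created the lock file, false if a live owner holds it.
function acquirePidLockSketch(path: string, pid: number = process.pid): boolean {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const fd = openSync(path, "wx"); // "wx" fails with EEXIST if the file already exists (O_EXCL)
      try {
        writeSync(fd, String(pid));
      } finally {
        closeSync(fd);
      }
      return true;
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") throw err;
      const owner = Number.parseInt(readFileSync(path, "utf8"), 10);
      if (Number.isInteger(owner) && owner > 0 && isProcessAlive(owner)) return false;
      // Assumed policy: an unreadable or dead owner means the file is stale, so remove it and retry once.
      try {
        unlinkSync(path);
      } catch {
        // Another process may have removed it first; the retry decides who wins.
      }
    }
  }
  return false;
}

const lockDir = mkdtempSync(join(tmpdir(), "autobeat-lock-"));
const lockPath = join(lockDir, "executor.pid");
const firstAttempt = acquirePidLockSketch(lockPath); // creates the file
const secondAttempt = acquirePidLockSketch(lockPath); // file exists and its owner (this process) is alive

const stalePath = join(lockDir, "stale.pid");
writeFileSync(stalePath, "not-a-pid"); // simulate a corrupt or stale lock file
const staleAttempt = acquirePidLockSketch(stalePath);
```

The creation step is atomic because the operating system guarantees that only one `O_EXCL` open can succeed for a given path. The stale-file cleanup in this sketch is not itself race-free, so a production version would need extra care there.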
- Migration 18: Adds `orchestrator_id` column to `tasks` for sub-task attribution
- Migration 19: Adds `task_usage` table for token/cost tracking per task
- Migration 21: Adds `workers.last_heartbeat`, `loops.eval_type` (default `'feedforward'`), `loops.judge_agent`, `loops.judge_prompt` columns
- Migration 22: Recreates `loops` table with CHECK constraints on `eval_type` and `judge_agent`. This is a full table rebuild; back up `~/.autobeat/autobeat.db` before upgrading
- `beat dashboard` / `beat dash`: Interactive terminal UI (requires TTY)
- 4 panels: Loops, Tasks, Schedules, Orchestrations
- Keyboard navigation: Tab/Shift+Tab (cycle panels), 1-4 (jump), j/k or ↑/↓ (move selection), Enter (drill-in), Esc (back), f (cycle filter), q (quit), r (refresh)
- Per-panel filter cycles — each panel cycles only its valid statuses
- Detail views with entity-specific field rendering (task, loop, schedule, orchestration)
- Truncation indicators when lists exceed panel capacity
- Smart EmptyState with true counts when filters hide items
- Built with Ink (React for terminal UIs)
- Per-agent `baseUrl` and `model` in `~/.autobeat/config.json`
- `--model` / `-m` flag on `beat run` and `beat orchestrate`
- Provider env var injection at spawn: `ANTHROPIC_BASE_URL`, `OPENAI_BASE_URL`, `GEMINI_BASE_URL`
- User env vars take precedence over injected values
- Claude: experimental-betas auto-disabled when custom `baseUrl` is set (prevents proxy failures)
- Warning printed when `baseUrl` is set on Claude without a detected API key
- Extended MCP tools: `ConfigureAgent` and `ListAgents` support `baseUrl`/`model`
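The precedence rule for injected provider variables can be sketched as a merge in which user-supplied values are applied last. The agent-to-variable mapping and the `buildSpawnEnv` helper below are assumptions for illustration only.

```typescript
type Env = Record<string, string | undefined>;

// Assumed mapping from agent name to the base-URL variable injected at spawn.
const BASE_URL_VARS: Record<string, string> = {
  claude: "ANTHROPIC_BASE_URL",
  codex: "OPENAI_BASE_URL",
  gemini: "GEMINI_BASE_URL",
};

// Builds the child environment: injected values first, then user values on top.
function buildSpawnEnv(agent: string, baseUrl: string | undefined, userEnv: Env): Env {
  const varName = BASE_URL_VARS[agent];
  const env: Env = varName && baseUrl ? { [varName]: baseUrl } : {};
  for (const [key, value] of Object.entries(userEnv)) {
    // Only defined user values override, so an unset variable never erases an injected one.
    if (value !== undefined) env[key] = value;
  }
  return env;
}

const injectedOnly = buildSpawnEnv("claude", "http://localhost:4000", {});
const userWins = buildSpawnEnv("claude", "http://localhost:4000", {
  ANTHROPIC_BASE_URL: "https://user.example",
});
```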
- Migration 16: Adds `model` column to `tasks` and `orchestrations` tables
- Loops can use an AI agent to evaluate iteration results instead of a shell command
- `evalMode: 'agent'` (MCP) / `--eval-mode agent` (CLI)
- Custom evaluation prompts via `evalPrompt` / `--eval-prompt`
- Agent judges pass/fail (retry strategy) or scores 0-100 (optimize strategy)
- `AgentExitConditionEvaluator` and `CompositeExitConditionEvaluator` handle evaluation dispatch
- Skill files in `skills/autobeat/` provide structured context for AI agents
- Capability hierarchy decision tree (Task < Pipeline < Loop < Orchestrator)
- Complete MCP tool and CLI command reference
- Composition patterns and anti-patterns
- Reference guides: orchestration, loops, dependencies, monitoring, capability matrix
- `beat init --install-skills`: install skill files to agent-specific directories
- `--skills-agents claude,codex,gemini`: target specific agents
- Detects existing skills and prompts for update
- Agent-specific paths: `.claude/skills/autobeat`, `.agents/skills/autobeat`, `.gemini/skills/autobeat`
- MCP Instructions: server-side instructions injected into MCP Server for structured agent context
- ListAgents: list available agents with registration and auth status (no parameters)
- ConfigureAgent: check auth status (`check`), store API key (`set`), or reset stored key (`reset`)
- Goal-Driven Execution: Give the orchestrator a goal and it autonomously breaks it into subtasks, delegates to worker agents, monitors progress, and iterates until done
- Meta-Agent Architecture: The orchestrator is a loop running a lead agent whose prompt gives it access to all of Autobeat's CLI commands (`beat run`, `beat status`, `beat loop`, etc.)
- Persistent State File: JSON state file in `~/.autobeat/orchestrator-state/` with plan, step statuses, iteration count, and agent context. Atomic writes (temp + rename) prevent corruption
- Crash Recovery: Orchestrations persisted in SQLite. Recovery manager detects and resumes interrupted orchestrations on startup
- Multi-Agent: Per-orchestration agent selection (Claude, Codex, Gemini)
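The "temp + rename" pattern mentioned for the persistent state file can be sketched with Node's `fs` module as follows. The `writeJsonAtomic` helper, the temp-file naming, and the state fields are illustrative assumptions rather than the project's actual code.

```typescript
import { mkdtempSync, readdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Writes the full payload to a sibling temp file, then renames it over the target.
// A reader sees either the old file or the new one, never a half-written file
// (rename is atomic on POSIX filesystems when both paths are on the same volume).
function writeJsonAtomic(path: string, value: unknown): void {
  const tmp = join(dirname(path), `.${process.pid}-${Date.now()}.tmp`);
  writeFileSync(tmp, JSON.stringify(value, null, 2), "utf8");
  renameSync(tmp, path);
}

// Hypothetical state shape, for illustration only.
const stateDir = mkdtempSync(join(tmpdir(), "autobeat-state-"));
const statePath = join(stateDir, "orchestrator.json");
writeJsonAtomic(statePath, { iteration: 1, plan: ["lint", "test"] });
writeJsonAtomic(statePath, { iteration: 2, plan: ["lint", "test"] });

const reloaded = JSON.parse(readFileSync(statePath, "utf8")) as { iteration: number };
const filesInStateDir = readdirSync(stateDir); // only the final state file should remain
```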
- CreateOrchestrator: Create and start an orchestration with goal, guardrails, and agent selection
- OrchestratorStatus: Get orchestration details including plan steps and state
- ListOrchestrators: List orchestrations with status filter and pagination
- CancelOrchestrator: Cancel an active orchestration with optional reason
- InitCustomOrchestrator: Scaffold a custom orchestrator with state file, exit condition, and system prompt (v1.4.0)
- `beat orchestrate "<goal>"`: Create and run an orchestration (detached by default)
- `beat orchestrate "<goal>" --foreground`: Block and wait for completion (Ctrl+C to cancel)
- `beat orchestrate -i "<goal>"`: Interactive foreground TTY session (v1.5.0)
- `beat orchestrate init "<goal>"`: Scaffold custom orchestrator (state file, exit condition, system prompt) (v1.4.0)
- `beat orchestrate status <id>`: Check orchestration status and plan progress
- `beat orchestrate list [--status <status>]`: List orchestrations with optional status filter
- `beat orchestrate cancel <id> [reason]`: Cancel an active orchestration
- Max Depth (1-10, default 3): Maximum delegation depth
- Max Workers (1-20, default 5): Maximum concurrent worker agents
- Max Iterations (1-200, default 50): Maximum orchestrator loop iterations
- Goal Length: 1-8,000 characters with Zod validation
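The guardrail ranges and defaults listed above can be expressed as a small validation step. The project uses Zod for this; the sketch below uses plain checks to stay dependency-free, and the `validateOrchestration` name and `Result` shape are assumptions.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Ranges and defaults as documented in the guardrail list.
const LIMITS = {
  maxDepth: { min: 1, max: 10, fallback: 3 },
  maxWorkers: { min: 1, max: 20, fallback: 5 },
  maxIterations: { min: 1, max: 200, fallback: 50 },
} as const;

type GuardrailName = keyof typeof LIMITS;
type Guardrails = Record<GuardrailName, number>;

// Applies defaults for omitted guardrails and rejects out-of-range input without throwing.
function validateOrchestration(goal: string, overrides: Partial<Guardrails> = {}): Result<Guardrails> {
  if (goal.length < 1 || goal.length > 8000) {
    return { ok: false, error: "goal must be 1-8000 characters" };
  }
  const resolved: Guardrails = { maxDepth: 3, maxWorkers: 5, maxIterations: 50 };
  for (const name of Object.keys(LIMITS) as GuardrailName[]) {
    const { min, max, fallback } = LIMITS[name];
    const value = overrides[name] ?? fallback;
    if (!Number.isInteger(value) || value < min || value > max) {
      return { ok: false, error: `${name} must be an integer in ${min}-${max}` };
    }
    resolved[name] = value;
  }
  return { ok: true, value: resolved };
}

const defaults = validateOrchestration("Ship the release");
const tooManyWorkers = validateOrchestration("Ship the release", { maxWorkers: 21 });
const emptyGoal = validateOrchestration("");
```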
- OrchestrationCreated: Emitted when a new orchestration is created
- OrchestrationCompleted: Emitted when the orchestration goal is achieved
- OrchestrationCancelled: Emitted when an orchestration is cancelled
- Migration 14: `orchestrations` table with goal, status, guardrail columns, loop FK, and indexes on status and loop_id
- DelegateTask: Submit tasks to background AI agent instances
- TaskStatus: Check status of running/completed tasks
- TaskLogs: Retrieve stdout/stderr output from tasks (with tail option)
- CancelTask: Cancel running tasks with optional reason
- RetryTask: Retry a failed or completed task (creates new task with same prompt)
- Priority Levels: P0 (Critical), P1 (High), P2 (Normal)
- Task Status Tracking: QUEUED, RUNNING, COMPLETED, FAILED, CANCELLED
- Per-Task Configuration: Custom timeout and output buffer per task
- Working Directory Support: Run tasks in specific directories
- Retry Logic: Exponential backoff for operations
- Automatic Scaling: Spawns workers based on CPU and memory availability
- Resource Monitoring: Real-time CPU and memory usage tracking
- Intelligent Limits: Maintains 20% CPU headroom and 1GB RAM reserve
- No Artificial Limits: Uses all available system resources
- CPU Threshold: Configurable CPU usage limit (default: 80%)
- Memory Reserve: Configurable memory reserve (default: 1GB)
- Worker Lifecycle: Automatic cleanup on completion/failure
- Resource Tracking: Per-worker CPU and memory monitoring
- Problem Solved: Load average is a 1-minute rolling average that doesn't reflect recent spawns
- Settling Window: Recently spawned workers are tracked for 15 seconds (configurable via `WORKER_SETTLING_WINDOW_MS`)
- Resource Projection: Includes settling workers in resource calculations to prevent spawn burst overload
- Spawn Delay: Minimum 10 seconds between spawns for stability (configurable via `WORKER_MIN_SPAWN_DELAY_MS`)
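One way to read the settling window and spawn delay together is as a spawn gate that adds a projected load for workers too new to show up in measured usage. The sketch below is an interpretation under stated assumptions: the `SpawnGateInput` shape and the per-worker CPU estimate are hypothetical, and Autobeat's real projection may differ.

```typescript
interface SpawnGateInput {
  now: number; // current time, epoch ms
  cpuUsagePercent: number; // measured usage, which lags behind recent spawns
  recentSpawns: readonly number[]; // spawn timestamps, epoch ms
  settlingWindowMs: number; // documented default: 15000
  minSpawnDelayMs: number; // documented default: 10000
  cpuThresholdPercent: number; // documented default: 80
  estimatedCpuPerWorkerPercent: number; // ASSUMPTION: projected cost of one settling worker
}

function canSpawnWorker(input: SpawnGateInput): boolean {
  // Enforce the minimum delay since the most recent spawn.
  const lastSpawn = input.recentSpawns.length > 0 ? Math.max(...input.recentSpawns) : undefined;
  if (lastSpawn !== undefined && input.now - lastSpawn < input.minSpawnDelayMs) return false;

  // Workers still inside the settling window are not yet reflected in measured usage,
  // so their estimated cost is added to the measurement.
  const settling = input.recentSpawns.filter((t) => input.now - t < input.settlingWindowMs);
  const projected = input.cpuUsagePercent + settling.length * input.estimatedCpuPerWorkerPercent;
  return projected < input.cpuThresholdPercent;
}

const baseInput: SpawnGateInput = {
  now: 100_000,
  cpuUsagePercent: 40,
  recentSpawns: [],
  settlingWindowMs: 15_000,
  minSpawnDelayMs: 10_000,
  cpuThresholdPercent: 80,
  estimatedCpuPerWorkerPercent: 15,
};

const idle = canSpawnWorker(baseInput);
const tooSoon = canSpawnWorker({ ...baseInput, recentSpawns: [95_000] });
const oneSettling = canSpawnWorker({ ...baseInput, recentSpawns: [88_000] });
const burst = canSpawnWorker({ ...baseInput, recentSpawns: [86_000, 87_000, 88_000] });
```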
- SQLite Backend: Persistent task storage with WAL mode
- Complete Task History: All tasks, outputs, and metadata stored
- Automatic Recovery: Restores QUEUED/RUNNING tasks on startup
- Database Cleanup: Automatic removal of old completed tasks (7 days)
- State Recovery: Resumes interrupted tasks after crashes
- Duplicate Prevention: Prevents re-queuing already processed tasks
- Status Reconciliation: Marks crashed RUNNING tasks as FAILED
- Memory Buffering: In-memory capture up to configurable limit (default: 10MB)
- File Overflow: Automatic file storage when buffer exceeded
- Stream Processing: Real-time stdout/stderr capture
- Output Repository: Persistent storage of all task output
- Per-Task Buffer Size: Override buffer limit per task (1KB - 1GB)
- Global Defaults: System-wide output buffer configuration
- Automatic Cleanup: Old output files removed with tasks
- `TASK_TIMEOUT`: Default task timeout (default: 1800000ms = 30min)
- `MAX_OUTPUT_BUFFER`: Default output buffer size (default: 10MB)
- `CPU_THRESHOLD`: CPU usage threshold (default: 80%)
- `MEMORY_RESERVE`: Memory reserve in bytes (default: 1GB)
- `LOG_LEVEL`: Logging verbosity (debug/info/warn/error)
- `OUTPUT_FLUSH_INTERVAL_MS`: Dashboard output polling interval (default: 1000ms; use 5000 to reduce write pressure)
- Validation: Zod schema validation with fallbacks
- Range Checking: Min/max limits for all numeric values
- Graceful Degradation: Falls back to defaults on invalid config
- CLI Spawning: Spawns agent processes (`claude`, `codex`, `gemini`) with proper arguments
- Permission Handling: Agent-specific permission flags (e.g., `--dangerously-skip-permissions` for Claude)
- Working Directory: Supports custom working directories
- Process Monitoring: Tracks PIDs, exit codes, and resource usage
- Timeout Enforcement: Configurable per-task timeouts (1s - 24h)
- Graceful Termination: SIGTERM then SIGKILL for task cancellation
- Exit Code Tracking: Captures and stores process exit codes
- Error Handling: Distinguishes timeout vs failure vs cancellation
- JSON Logs: Production structured logging with context
- Console Logs: Development-friendly console output
- Log Levels: Configurable verbosity (debug/info/warn/error)
- Context Enrichment: Automatic context addition per module
- System Resources: Real-time CPU/memory monitoring
- Task Metrics: Creation, start, completion timestamps
- Worker Tracking: Active worker count and resource usage
- Error Tracking: Structured error logging with context
- `beat mcp start`: Start the MCP server
- `beat mcp test`: Test server startup and validation
- `beat mcp config`: Show MCP configuration examples
- `beat help`: Show help and usage
- `beat run <prompt>`: Delegate task directly to a background agent instance
- `beat status [task-id]`: Check status of all tasks or specific task
- `beat logs <task-id>`: Retrieve task output and logs
- `beat cancel <task-id> [reason]`: Cancel running task with optional reason
- `beat schedule create <prompt> [options]`: Create a cron or one-time scheduled task
- `beat schedule list [--status <status>]`: List schedules with optional status filter
- `beat schedule status <id> [--history]`: Get schedule details and execution history
- `beat schedule pause <id>`: Pause an active schedule
- `beat schedule resume <id>`: Resume a paused schedule
- `beat schedule cancel <id> [reason]`: Cancel a schedule with optional reason
- `beat orchestrate "<goal>"`: Create and run an orchestration (detached)
- `beat orchestrate "<goal>" --foreground`: Block and wait for completion
- `beat orchestrate -i "<goal>"`: Interactive foreground TTY session (v1.5.0)
- `beat orchestrate init "<goal>"`: Scaffold custom orchestrator (v1.4.0)
- `beat orchestrate status <id>`: Orchestration details
- `beat orchestrate list [--status <s>]`: List orchestrations
- `beat orchestrate cancel <id> [reason]`: Cancel orchestration
- `beat pipeline <prompt> [<prompt>]...`: Create chained one-time schedules
- `beat loop <prompt> --until <cmd>`: Retry loop
- `beat loop <prompt> --eval <cmd> --minimize|--maximize`: Optimize loop
- `beat loop list [--status <s>]`: List loops
- `beat loop status <id> [--history]`: Loop details
- `beat loop pause <id>`: Pause loop (v0.8.0)
- `beat loop resume <id>`: Resume loop (v0.8.0)
- `beat loop cancel <id> [--cancel-tasks] [reason]`: Cancel loop
- `beat resume <task-id>`: Resume a failed/completed task from its checkpoint
- `beat resume <task-id> --context "..."`: Resume with additional instructions
- `beat agents list`: List available agents with auth status
- `beat agents check`: Auth status for all agents
- `beat agents config show [agent]`: Show agent config
- `beat agents config set <agent> <key> <value>`: Set agent config (apiKey, baseUrl, model, proxy, runtime)
- `beat agents config reset <agent>`: Clear stored config
- `beat dashboard` / `beat dash`: Interactive terminal UI with metrics and workspace views
- NPM Package: Global installation support
- Local Development: Source code execution
- Claude Desktop: MCP server configuration
- Environment Variables: Runtime configuration options
- MCP Adapter: JSON-RPC 2.0 protocol implementation
- Task Manager: Orchestrates task lifecycle
- Recovery Manager: Startup task recovery with dependency-aware crash detection
- Resource Monitor: System resource tracking
- ReadOnlyContext: Lightweight bootstrap for CLI query commands (~200-400ms faster)
- Hybrid Event-Driven Architecture: Commands (state changes) flow through EventBus; queries use direct repository access
- Event Handlers: Specialized handlers (Persistence, Queue, Worker, Dependency, Schedule, Checkpoint, Loop, Orchestration)
- Singleton EventBus: Shared event bus across all system components (34 events)
- Dependency Injection: Container-based DI with Result types
- Result Pattern: No exceptions in business logic
- Immutable Domain: Readonly data structures
- Database-First Pattern: Single source of truth with no memory-database divergence
- SQLite Worker Coordination: `workers` table with PID-based crash detection for cross-process visibility
- Atomic Transactions: `runInTransaction` for multi-step DB operations with rollback
- Proper Process Handling: Fixed stdin management (`stdio: ['ignore', 'pipe', 'pipe']`)
- Dependency Declaration: Tasks can depend on other tasks via `dependsOn` array in task specification
- Cycle Detection: DFS-based algorithm prevents circular dependencies (A→B→A patterns)
- Transitive Cycle Detection: Detects complex cycles across multiple tasks (A→B→C→A)
- Automatic Resolution: Dependencies automatically resolved on task completion/failure/cancellation
- Blocked Task Management: Tasks with unmet dependencies remain in BLOCKED state until resolved
- Multiple Dependencies: Tasks can depend on multiple prerequisite tasks simultaneously
- Diamond Patterns: Supports complex dependency graphs (A→B, A→C, B→D, C→D)
- Foreign Key Constraints: Database-enforced referential integrity
- Resolution Tracking: Automatic resolution timestamp on dependency completion
- Atomic Transactions: TOCTOU-safe dependency addition with synchronous better-sqlite3 transactions
- Composite Indexes: Optimized queries for dependency lookups and blocked task checks
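The DFS-based cycle check described above can be sketched as a reachability test: adding "task depends on X" creates a cycle exactly when the task is already reachable from X by following existing dependencies. The graph representation and the `wouldCreateCycle` name are assumptions for illustration.

```typescript
// Maps each task ID to the IDs of the tasks it depends on.
type DependencyGraph = ReadonlyMap<string, readonly string[]>;

// Iterative depth-first search from the proposed dependency back toward the task.
function wouldCreateCycle(graph: DependencyGraph, taskId: string, dependsOnId: string): boolean {
  const visited = new Set<string>();
  const stack: string[] = [dependsOnId];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (current === taskId) return true; // taskId is reachable, so the new edge closes a loop
    if (visited.has(current)) continue;
    visited.add(current);
    for (const next of graph.get(current) ?? []) stack.push(next);
  }
  return false;
}

// B depends on A, C depends on B.
const chain: DependencyGraph = new Map([
  ["B", ["A"]],
  ["C", ["B"]],
]);
const transitiveCycle = wouldCreateCycle(chain, "A", "C"); // A -> C -> B -> A
const safeAddition = wouldCreateCycle(chain, "D", "C");

// Diamond in progress: B and C depend on A, D depends on B; adding "D depends on C" is safe.
const diamond: DependencyGraph = new Map([
  ["B", ["A"]],
  ["C", ["A"]],
  ["D", ["B"]],
]);
const diamondOk = wouldCreateCycle(diamond, "D", "C");
```

The `visited` set keeps the search linear in the number of tasks and edges, which matters for diamond-shaped graphs where the same ancestor is reachable along several paths.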
- `continueFrom` Field: Dependent tasks can specify a dependency whose checkpoint context is injected into their prompt
- Automatic Enrichment: When the dependency completes, its output summary, git state, and errors are prepended to the dependent task's prompt
- Race-Safe: Subscribe-first pattern with 5-second timeout ensures checkpoint is available before task runs
- Validation: `continueFrom` must reference a task in the `dependsOn` list (auto-added if missing)
- Chain Support: A→B→C where B receives A's context and C receives B's (which includes A's)
- TaskDependencyAdded: Emitted when new dependency relationship created
- DependencyResolved: Emitted when blocking dependency completes
- TaskUnblocked: Emitted when all dependencies resolved, triggers automatic queuing
- ScheduleTask: Create recurring (cron) or one-time scheduled tasks
- ListSchedules: List all schedules with optional status filter and pagination
- ScheduleStatus: Get schedule details including execution history
- CancelSchedule: Cancel an active schedule with optional reason
- PauseSchedule: Pause an active schedule (can be resumed later)
- ResumeSchedule: Resume a paused schedule
- CreatePipeline (v0.4.1): Create sequential task pipelines with 2–20 steps, per-step delays, priority, and working directory overrides
- SchedulePipeline (v0.6.0): Create recurring or one-time scheduled pipelines with 2–20 steps, each trigger creates a fresh pipeline instance with linear task dependencies
- CRON: Standard 5-field cron expressions for recurring task execution
- ONE_TIME: ISO 8601 datetime for single future execution
- Timezone Support: IANA timezone identifiers (e.g., `America/New_York`) with DST awareness
- Missed Run Policies: `skip` (ignore missed runs), `catchup` (execute missed runs), `fail` (mark as failed)
- Max Runs: Optional limit on number of executions for cron schedules
- Expiration: Optional ISO 8601 expiry datetime for schedules
- Lock-Based Protection: Prevents overlapping executions of the same schedule
- Execution Tracking: Full history of schedule executions with status and timing
- ScheduleCreated: Emitted when a new schedule is created
- ScheduleCancelled: Emitted when a schedule is cancelled
- SchedulePaused: Emitted when a schedule is paused
- ScheduleResumed: Emitted when a schedule is resumed
- ScheduleExecuted: Emitted when a scheduled task is triggered
- Automatic Capture: Checkpoints created on task completion or failure (via `CheckpointHandler`)
- Git State: Branch name, commit SHA, and dirty file list recorded at checkpoint time
- Output Summary: Last 50 lines of stdout/stderr preserved for context injection
- Database Persistence: `task_checkpoints` table (migration v5) with full audit data
- Enriched Prompts: Resumed tasks receive full checkpoint context (previous output, git state, error info)
- Additional Context: Provide extra instructions when resuming to guide the retry
- Retry Chains: Track resume lineage via `parentTaskId` and `retryOf` fields on the new task
- Terminal State Requirement: Only tasks in completed, failed, or cancelled states can be resumed
- ResumeTask: Resume a terminal task with optional `additionalContext` string (max 4000 chars)
- TaskCompleted / TaskFailed: Triggers automatic checkpoint capture via `CheckpointHandler`
- CheckpointRepository: SQLite persistence with prepared statements and Zod boundary validation
- Pluggable Adapters: Agent registry with adapter pattern for agent lifecycle management
- Built-in Agents: Claude (`claude`), OpenAI Codex (`codex`), Google Gemini (`gemini-cli`)
- Per-Task Selection: Choose which agent runs each task via MCP `agent` field or CLI `--agent` flag
- Default Agent: System-wide default agent configured via `beat init` or `~/.autobeat/config.json`
- Auth Checking: Verify agent CLI tools are installed and authenticated before delegation
- `beat init`: Interactive first-time setup; selects the default agent and validates availability
- `beat init --agent <name>`: Non-interactive setup with specified agent
- `beat agents list`: Show registered agents with default marker and auth status
- `agent` field on DelegateTask: Specify agent per task (e.g., `{ agent: "codex" }`)
- Fallback: Uses default agent when no agent specified
- SchedulePipeline MCP Tool: Create a single schedule that triggers a full pipeline (2–20 steps) on each execution
- Cron + One-Time: Supports both recurring cron expressions and single future execution
- Linear Dependencies: Each trigger creates fresh tasks wired with linear dependencies (step N depends on step N-1)
- Per-Step Configuration: Each step can have its own prompt, priority, working directory, and agent override (MCP only)
- Shared Defaults: Schedule-level agent, priority, and working directory apply to all steps unless overridden
- Dependency Failure Cascade: When a pipeline step fails, all downstream steps are automatically cancelled
- Cancel with Tasks: `CancelSchedule` with `cancelTasks: true` cancels in-flight pipeline tasks from current execution
- Concurrency Tracking: Pipeline completion tracked via tail task, which prevents overlapping pipeline executions
- `afterScheduleId` Support: Chain pipelines after other schedules (predecessor dependency injected on step 0)
- `beat schedule create --pipeline --step "lint" --step "test" --cron "0 9 * * *"`: Create scheduled pipeline
- `beat schedule cancel <id> --cancel-tasks`: Cancel schedule and in-flight tasks
- Dependency Failure Cascade: Failed/cancelled upstream tasks now cascade cancellation to dependents (was incorrectly unblocking them)
- Queue Handler Race Condition: Fast-path check prevents blocked tasks from being prematurely enqueued
- CreateLoop: Create an iterative loop that runs a task repeatedly until an exit condition is met (retry or optimize strategy)
- LoopStatus: Get loop details including optional iteration history
- ListLoops: List loops with optional status filter and pagination
- CancelLoop: Cancel an active loop, optionally cancelling in-flight iteration tasks
- PauseLoop (v0.8.0): Pause an active loop mid-iteration
- ResumeLoop (v0.8.0): Resume a paused loop from last checkpoint
- ScheduleLoop (v0.8.0): Compose loops with cron/one-time schedules
- Retry: Run a task until an exit condition passes — shell command returning exit code 0 ends the loop
- Optimize: Run a task, score output with eval script, keep improvements — seek the best score across iterations (minimize or maximize direction)
- Agent Eval Mode: Either strategy can delegate exit condition evaluation to an AI agent instead of a shell command. Pass `evalMode: 'agent'` (MCP) or `--eval-mode agent` (CLI). The agent reads iteration output and returns pass/fail (retry) or a numeric score (optimize). Use `evalPrompt` / `--eval-prompt` to customize the evaluation prompt.
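The optimize strategy's "keep improvements" rule can be illustrated with a small simulation over a list of scores. This is a sketch only: the `runOptimizeSketch` name is hypothetical, and treating only crashed iterations (here `null` scores) as consecutive failures is an assumption, not documented behavior.

```typescript
type Direction = "minimize" | "maximize";
type Outcome = "keep" | "discard" | "crash";

// Each entry is one iteration's score, or null when the iteration produced no score.
function runOptimizeSketch(
  scores: readonly (number | null)[],
  direction: Direction,
  maxConsecutiveFailures = 3,
): { outcomes: Outcome[]; best: number | undefined } {
  const outcomes: Outcome[] = [];
  let best: number | undefined;
  let failures = 0;

  for (const score of scores) {
    if (score === null) {
      outcomes.push("crash");
      failures++;
      if (failures >= maxConsecutiveFailures) break; // safety stop
      continue;
    }
    failures = 0;
    const improved =
      best === undefined || (direction === "maximize" ? score > best : score < best);
    if (improved) best = score;
    outcomes.push(improved ? "keep" : "discard");
  }
  return { outcomes, best };
}

const maximized = runOptimizeSketch([10, 8, 12, null, 11], "maximize");
const minimized = runOptimizeSketch([10, 8, 12], "minimize");
const aborted = runOptimizeSketch([null, null, null, 5], "maximize");
```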
- Task Prompt: Each iteration runs the same prompt (or enriched with checkpoint context if `freshContext` is false)
- Exit Condition: Shell command evaluated after each iteration to determine pass/fail or score
- Multi-Step Iterations: Repeat a full pipeline (2–20 steps) per iteration instead of a single task
- Linear Dependencies: Each pipeline step depends on the previous step within the iteration
- Same Exit Condition: Evaluated after all pipeline steps complete
- Max Iterations: Safety cap on iteration count (0 = unlimited, default: 10)
- Max Consecutive Failures: Stop after N consecutive failures (default: 3)
- Cooldown: Delay between iterations in milliseconds (default: 0)
- Eval Timeout: Timeout for exit condition evaluation (default: 60s, minimum: 1s)
- Fresh Context: Each iteration gets a fresh agent context (default: true) or continues from previous checkpoint
- Eval Mode (`evalMode`): `'shell'` (default) evaluates iteration results via a shell command exit code; `'agent'` delegates evaluation to an AI agent
- Eval Prompt (`evalPrompt`): Optional custom instructions for the agent evaluator (agent mode only). When omitted, the agent uses a default review prompt.
- `beat loop <prompt> --until <cmd>`: Create a retry loop (run until shell command exits 0)
- `beat loop <prompt> --eval <cmd> --minimize|--maximize`: Create an optimize loop (score-based)
- `beat loop <prompt> --eval-mode agent --strategy retry`: Create a retry loop using agent evaluation
- `beat loop <prompt> --eval-mode agent --strategy optimize --maximize [--eval-prompt "..."]`: Create an optimize loop with agent scoring
- `beat loop --pipeline --step "..." --step "..." --until <cmd>`: Create a pipeline loop
- `beat loop list [--status <status>]`: List loops with optional status filter
- `beat loop status <loop-id> [--history]`: Get loop details and iteration history
- `beat loop cancel <loop-id> [--cancel-tasks] [reason]`: Cancel a loop with optional task cancellation
- `beat loop pause <loop-id>` (v0.8.0): Pause an active loop
- `beat loop resume <loop-id>` (v0.8.0): Resume a paused loop
- LoopCreated: Emitted when a new loop is created
- LoopIterationCompleted: Emitted when an iteration finishes with its result (pass/fail/keep/discard/crash)
- LoopCompleted: Emitted when the loop reaches its exit condition or max iterations
- LoopCancelled: Emitted when a loop is cancelled
- Migration 10: `loops` table for loop definitions and state, `loop_iterations` table for per-iteration execution records
- Migration 15: `eval_mode` and `eval_prompt` columns on `loops` table, `eval_feedback` column on `loop_iterations` table
- Distributed Processing: Single-server only
- Web UI: No web-based UI (TUI dashboard available via `beat dashboard`)
- Workflow Templates: No preset YAML/JSON workflow specifications (post-v1 roadmap item)
- Multi-User Support: Single-user focused
- REST API: MCP protocol only
- Autonomous Goal Execution: Meta-agent that uses Autobeat's own infrastructure to break down goals, delegate to workers, monitor progress, and iterate until done
- 4 CLI Commands: `beat orchestrate`, `beat orchestrate status`, `beat orchestrate list`, `beat orchestrate cancel`
- 4 MCP Tools: `CreateOrchestrator`, `OrchestratorStatus`, `ListOrchestrators`, `CancelOrchestrator`
- Persistent State File: Atomic JSON state file with plan, steps, and agent context
- Guardrails: `maxDepth` (3), `maxWorkers` (5), `maxIterations` (50) with configurable limits
- Crash Recovery: SQLite persistence with startup recovery for interrupted orchestrations
- 3 new events (34 total): `OrchestrationCreated`, `OrchestrationCompleted`, `OrchestrationCancelled`
- Migration 14: `orchestrations` table with goal, status, guardrails, loop FK, and indexes
- PauseLoop / ResumeLoop MCP Tools: Pause active loops mid-iteration, resume from last checkpoint
- CLI Commands: `beat loop pause <id>`, `beat loop resume <id>`
- Persistence: Paused state survives server restarts
- ScheduleLoop MCP Tool: Compose loops with cron/one-time schedules; each trigger creates a new loop instance
- CLI: `beat schedule create --loop` for creating scheduled loops
- `--git-branch` flag: Optional git-aware loop iteration tracking
- Commit-per-iteration: One branch for the entire loop, one commit per successful iteration (v0.8.1 fix)
- Revert on failure: Failed/discarded iterations fully reverted to appropriate target commit
- Diff summaries: Git diffs tracked between iterations
- O(1) reset target lookup: `getResetTargetSha()` reads cached `bestIterationCommitSha` directly from the Loop domain instead of scanning iterations from the database
- 2 new events (31 total): `LoopPaused`, `LoopResumed`
- Migration 11: `loop_pause_state` column, `schedule_id` FK on loops table, git config storage
- Migration 12: `git_start_commit_sha` on loops, `git_commit_sha` and `pre_iteration_commit_sha` on loop_iterations
- Migration 13: `best_iteration_commit_sha` on loops for O(1) reset target lookup
- `beat loop pause <id>`: Pause an active loop
- `beat loop resume <id>`: Resume a paused loop
- Backward-compat fallbacks for v0.8.0 data
- `CreateLoop` MCP Tool: Create retry or optimize loops for single tasks or pipelines (2–20 steps)
- Retry Strategy: Run a task until an exit condition shell command returns exit code 0
- Optimize Strategy: Score iterations with an eval script, keep improvements (minimize or maximize)
- Pipeline Loops: Repeat a multi-step pipeline per iteration with linear task dependencies
- Fresh Context: Each iteration gets a clean agent context by default, or continues from previous checkpoint
- Safety Controls: Max iterations (0 = unlimited), max consecutive failures, cooldown between iterations
- Configurable Eval Timeout: Exit condition evaluation timeout (default: 60s)
- CLI:
beat loop,beat loop list,beat loop status,beat loop cancelcommands - 4 MCP Tools:
CreateLoop,LoopStatus,ListLoops,CancelLoop
- 4 New Events:
LoopCreated,LoopIterationCompleted,LoopCompleted,LoopCancelled - Loop Handler: Event-driven iteration engine manages loop lifecycle
- Migration 10:
loopsandloop_iterationstables
- `SchedulePipeline` MCP Tool: Create cron or one-time schedules that trigger a full pipeline (2–20 steps) on each execution
- Linear Task Dependencies: Each trigger creates fresh tasks with `task[i].dependsOn = [task[i-1].id]`
- Per-Step Agent Override: MCP tool supports per-step `agent` field; CLI uses shared `--agent`
- `cancelTasks` on CancelSchedule: Optional flag to also cancel in-flight pipeline tasks from current execution
- ListSchedules Enhancement: Response includes `isPipeline` and `stepCount` indicators
- ScheduleStatus Enhancement: Response includes full `pipelineSteps` when present
- CLI: `--pipeline --step "..." --step "..."` flags for creating scheduled pipelines
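The linear dependency rule `task[i].dependsOn = [task[i-1].id]` can be sketched as follows. `PipelineStep`, `PipelineTask`, `buildPipelineTasks`, and the `makeId` callback are hypothetical names for illustration; only the 2–20 step limit, the per-step `agent` field, and the dependency rule come from the list above.

```typescript
interface PipelineStep {
  prompt: string;
  agent?: string; // optional per-step agent override
}

interface PipelineTask {
  id: string;
  prompt: string;
  agent?: string;
  dependsOn: string[];
}

// Build fresh tasks for one trigger; each task depends on the one before it.
function buildPipelineTasks(
  steps: PipelineStep[],
  makeId: (index: number) => string, // hypothetical id generator
): PipelineTask[] {
  if (steps.length < 2 || steps.length > 20) {
    throw new Error("a pipeline needs 2-20 steps");
  }
  const tasks: PipelineTask[] = [];
  steps.forEach((step, i) => {
    const previous = tasks[i - 1]; // undefined for the first step
    tasks.push({
      id: makeId(i),
      prompt: step.prompt,
      ...(step.agent !== undefined ? { agent: step.agent } : {}),
      dependsOn: previous ? [previous.id] : [],
    });
  });
  return tasks;
}
```

Because every trigger calls the builder again, each execution gets its own task ids and its own dependency chain.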
- Event System Simplification (#91): 18 overhead events removed, 3 services removed (QueryHandler, OutputHandler, AutoscalingManager). Query operations use direct repository calls instead of events. EventBus reduced from 42 to 25 events.
- SQLite Worker Coordination (#94): New `workers` table with PID-based crash detection. Cross-process output visibility via persistent output storage. `WorkerRepository` and `OutputRepository` now required in constructors.
- ReadOnlyContext (#100): Lightweight bootstrap for CLI query commands (`status`, `list`, `logs`). Skips EventBus, worker pool, and schedule executor initialization. ~200-400ms faster startup.
- Atomic Transactions (#85): `runInTransaction` for atomic multi-step DB operations. Synchronous schedule operations with partial failure rollback.
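The rollback-on-partial-failure behavior behind `runInTransaction` can be illustrated with an in-memory stand-in. This sketch is not the SQLite-backed implementation: the `Store` type and the snapshot-and-restore approach are assumptions chosen so the example runs without a database.

```typescript
// In-memory stand-in for a database table: schedule id -> status.
type Store = Map<string, string>;

// Run fn against the store; if it throws, restore the snapshot and rethrow,
// so either every write in fn is visible or none is.
function runInTransaction<T>(store: Store, fn: (tx: Store) => T): T {
  const snapshot = new Map(store);
  try {
    return fn(store);
  } catch (err) {
    store.clear();
    snapshot.forEach((value, key) => store.set(key, value));
    throw err;
  }
}
```

A multi-step operation that fails halfway therefore leaves the store exactly as it was before the call.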
- Dependency Failure Cascade: When upstream task fails or is cancelled, dependent tasks are now cancelled instead of incorrectly unblocked (breaking change)
- Queue Handler Race Condition: Fast-path `dependencyState` check prevents blocked tasks from being enqueued before dependency rows are written to DB
- RecoveryManager Dependency Checks (#84): Crash recovery now validates dependency state before re-queuing tasks
- CancelSchedule Scope (#82): `cancelTasks` now cancels tasks from ALL active executions, not just the latest
- Output totalSize (#95): `totalSize` recalculated after tail-slicing via shared `linesByteSize` utility
- FAIL Policy Atomicity (#83): ScheduleExecutor FAIL policy wrapped in transaction: atomic cancel+audit, event emission after transaction commits
- OutputRepository DIP Compliance (#101): Interface moved from implementations to `core/interfaces.ts`
- BootstrapMode Enum (#104): Boolean flags (`isCli`, `isRun`, `isReadOnly`) replaced with `mode: BootstrapMode` (`'server' | 'cli' | 'run'`)
- Multi-Provider Branding (#86): Neutralize Claude-specific branding for multi-provider positioning
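The `totalSize` fix listed above (#95) amounts to measuring the output after tail-slicing rather than before. A minimal sketch follows, assuming that `linesByteSize` sums UTF-8 byte lengths without newline separators and that a global `TextEncoder` is available; `tailOutput` is a hypothetical helper name.

```typescript
// Assumed behavior: sum of UTF-8 byte lengths, newlines not counted.
function linesByteSize(lines: string[]): number {
  const encoder = new TextEncoder();
  return lines.reduce((sum, line) => sum + encoder.encode(line).length, 0);
}

// Hypothetical helper: keep the last `tail` lines and report the size of
// what is actually returned, not of the full output.
function tailOutput(lines: string[], tail: number): { lines: string[]; totalSize: number } {
  const sliced = tail > 0 ? lines.slice(-tail) : lines;
  return { lines: sliced, totalSize: linesByteSize(sliced) };
}
```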
- Dependency Failure Cascade: Failed/cancelled upstream tasks cascade cancellation to dependents (was incorrectly unblocking)
- Constructor Changes: `WorkerRepository` and `OutputRepository` now required in constructors (#94)
- Event System: EventBus reduced from 42 to 25 events; query operations use direct calls (#91)
- BootstrapOptions: Drops boolean flags, adds `mode: BootstrapMode` (#104)
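The dependency failure cascade listed above can be sketched as a graph walk: when an upstream task fails or is cancelled, every unfinished task that depends on it, directly or transitively, is cancelled rather than unblocked. `TaskNode`, the status names, and `cascadeCancel` are hypothetical and only illustrate the rule.

```typescript
type TaskStatus = "blocked" | "queued" | "running" | "completed" | "failed" | "cancelled";

interface TaskNode {
  id: string;
  status: TaskStatus;
  dependsOn: string[];
}

// Cancel every not-yet-started task that transitively depends on failedId.
// Returns the ids that were cancelled, in the order they were reached.
function cascadeCancel(tasks: Map<string, TaskNode>, failedId: string): string[] {
  const cancelled: string[] = [];
  const pending: string[] = [failedId];
  while (pending.length > 0) {
    const upstream = pending.pop() as string;
    tasks.forEach((task) => {
      const unfinished = task.status === "blocked" || task.status === "queued";
      if (unfinished && task.dependsOn.indexOf(upstream) !== -1) {
        task.status = "cancelled";
        cancelled.push(task.id);
        pending.push(task.id); // its own dependents must be cancelled too
      }
    });
  }
  return cancelled;
}
```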
- Migration 8: `pipeline_steps` column on `schedules` table, `pipeline_task_ids` column on `schedule_executions` table
- Migration 9: `workers` table for cross-process worker tracking (#94)
- Agent Registry: Pluggable adapters for Claude, Codex, and Gemini with per-task agent selection
- `beat init`: Interactive first-time setup wizard for selecting default agent
- `beat agents list`: Show registered agents with default marker and auth status
- Auth Checking: Validates agent CLI availability before task delegation
- MCP + CLI Parity: `agent` field on `DelegateTask` tool, `--agent` flag on `beat run`
- 54 new tests: Handler unit tests (21), final coverage gaps (33)
- Stale cleanup: Removed 3 `it.skip` tests for unimplemented threshold events
- Pipeline Creation via MCP: New `CreatePipeline` tool closes the last CLI/MCP feature parity gap
- 2–20 Steps: Sequential task pipelines with per-step delays, priority, and working directory overrides
- Shared Service: Both MCP and CLI use `ScheduleManagerService.createPipeline()`, so behavior is identical
- Cron & One-Time Schedules: Standard 5-field cron expressions and ISO 8601 one-time scheduling
- Timezone Support: IANA timezone identifiers with DST awareness
- Missed Run Policies: `skip`, `catchup`, or `fail` for overdue triggers
- Lifecycle Management: Pause, resume, cancel schedules with full execution history
- Concurrent Execution Prevention: Lock-based protection against overlapping runs
- 6 MCP Tools: `ScheduleTask`, `ListSchedules`, `ScheduleStatus`, `CancelSchedule`, `PauseSchedule`, `ResumeSchedule`
- CLI + Pipeline: Full CLI parity including `beat pipeline` command for chained one-time schedules
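The missed run policies above can be sketched as a small decision function. The policy names come from this list; their exact semantics here are assumptions (`skip` waits for the next trigger, `catchup` runs once immediately, `fail` records a failed execution), as are the `graceMs` parameter and the `resolveMissedRun` name.

```typescript
type MissedRunPolicy = "skip" | "catchup" | "fail";
type MissedRunAction = "on-time" | "wait-for-next" | "run-now" | "mark-failed";

// Decide what to do with a trigger that was scheduled for scheduledAt and is
// being examined at now. graceMs is a hypothetical tolerance window.
function resolveMissedRun(
  policy: MissedRunPolicy,
  scheduledAt: Date,
  now: Date,
  graceMs: number,
): MissedRunAction {
  const overdue = now.getTime() - scheduledAt.getTime() > graceMs;
  if (!overdue) return "on-time";
  switch (policy) {
    case "skip":
      return "wait-for-next"; // assumed: drop the overdue trigger
    case "catchup":
      return "run-now"; // assumed: execute once immediately
    case "fail":
      return "mark-failed"; // assumed: record the miss as a failure
  }
}
```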
- Auto-Checkpoints: Captured on task completion/failure with git state and output summary
- Enriched Prompts: Resumed tasks receive full context from previous attempt
- Retry Chains: Track resume lineage via `parentTaskId` and `retryOf` fields
- MCP Tool: `ResumeTask` with optional additional context
- CLI: `beat resume <task-id> [--context "..."]`
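Resume lineage can be reconstructed by following `parentTaskId` links, as in the sketch below. It assumes that `parentTaskId` points at the immediately preceding attempt and `retryOf` at the original task; that reading, along with `TaskRecord` and `resumeLineage`, is an assumption for illustration.

```typescript
interface TaskRecord {
  id: string;
  parentTaskId?: string; // assumed: the attempt this task was resumed from
  retryOf?: string; // assumed: the original task at the root of the chain
}

// Walk parentTaskId links back to the root and return ids oldest-first.
// The seen set guards against accidental cycles in the data.
function resumeLineage(tasks: Map<string, TaskRecord>, taskId: string): string[] {
  const lineage: string[] = [];
  const seen = new Set<string>();
  let current = tasks.get(taskId);
  while (current && !seen.has(current.id)) {
    seen.add(current.id);
    lineage.unshift(current.id);
    current = current.parentTaskId ? tasks.get(current.parentTaskId) : undefined;
  }
  return lineage;
}
```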
- Dependency Context Injection: Dependent tasks receive checkpoint context from a specified dependency
- `continueFrom` Field: Added to `DelegateTask` MCP tool and `beat run --continue-from` CLI flag
- Automatic Enrichment: Output summary, git state, and errors prepended to task prompt
- Race-Safe Design: Subscribe-first pattern ensures checkpoint availability before task execution
- Chain Support: Context flows through A→B→C dependency chains
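The automatic enrichment described above, where checkpoint context is prepended to the dependent task's prompt, might look roughly like this. The `Checkpoint` shape, the section labels, and `buildEnrichedPrompt` are hypothetical; only the three kinds of context (output summary, git state, errors) come from the list.

```typescript
interface Checkpoint {
  outputSummary?: string;
  gitCommitSha?: string;
  error?: string;
}

// Prepend whatever checkpoint context exists; return the prompt unchanged
// when there is nothing to add.
function buildEnrichedPrompt(prompt: string, checkpoint?: Checkpoint): string {
  if (!checkpoint) return prompt;
  const sections: string[] = [];
  if (checkpoint.outputSummary) {
    sections.push(`Previous output summary:\n${checkpoint.outputSummary}`);
  }
  if (checkpoint.gitCommitSha) {
    sections.push(`Git state: commit ${checkpoint.gitCommitSha}`);
  }
  if (checkpoint.error) {
    sections.push(`Previous errors:\n${checkpoint.error}`);
  }
  if (sections.length === 0) return prompt;
  return `${sections.join("\n\n")}\n\n${prompt}`;
}
```

In an A→B→C chain, C would receive B's checkpoint, which was itself produced by a task that had already seen A's context.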
- Schedule Service Extraction: ~375 lines of business logic extracted from MCP adapter for CLI reuse
- CLI Bootstrap Helper: `withServices()` eliminates repeated bootstrap boilerplate
- Database Migrations v3-v6: `schedules`, `schedule_executions`, `task_checkpoints` tables, `continue_from` column
- FK Cascade Fix: Separated `save()` from `update()` to prevent cascade data loss
- Complete Rewrite: Moved from direct method calls to event-based coordination
- EventBus: Central coordination hub for all system communication
- Event Handlers: Specialized handlers for different concerns (persistence, queue, workers, output)
- Zero Direct State: TaskManager is stateless, handlers manage all state via events
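The event-based coordination above can be sketched with a minimal typed event bus. This is not Autobeat's EventBus: the class below, the `DemoEvents` map, and the `TaskDelegated` payload shape are assumptions that only show how independent handlers react to one event without calling each other directly.

```typescript
type Handler<T> = (payload: T) => void;

// Minimal synchronous event bus keyed by event name.
class EventBus<Events> {
  private handlers = new Map<keyof Events, Array<(payload: unknown) => void>>();

  // Register a handler; the returned function unsubscribes it.
  subscribe<K extends keyof Events>(type: K, handler: Handler<Events[K]>): () => void {
    const list = this.handlers.get(type) ?? [];
    const stored = handler as (payload: unknown) => void;
    list.push(stored);
    this.handlers.set(type, list);
    return () => {
      const index = list.indexOf(stored);
      if (index !== -1) list.splice(index, 1);
    };
  }

  // Deliver the payload to every handler; returns how many were called.
  emit<K extends keyof Events>(type: K, payload: Events[K]): number {
    const list = (this.handlers.get(type) ?? []).slice();
    list.forEach((handler) => handler(payload));
    return list.length;
  }
}

// Hypothetical event map for the usage example.
type DemoEvents = {
  TaskDelegated: { taskId: string };
};

const demoBus = new EventBus<DemoEvents>();
const demoLog: string[] = [];
demoBus.subscribe("TaskDelegated", (e) => demoLog.push(`persist ${e.taskId}`));
demoBus.subscribe("TaskDelegated", (e) => demoLog.push(`enqueue ${e.taskId}`));
demoBus.emit("TaskDelegated", { taskId: "task-1" });
```

The emitter holds no state of its own; the persistence and queue handlers each keep theirs, which is the point of the "zero direct state" item.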
- Task Management: Direct CLI interface without MCP connection required
- Real-time Testing: Instant task delegation and status checking
- Better DX: No need to reconnect MCP server for testing
- Fixed Output Capture: Resolved Claude CLI hanging issues
- Proper stdin: Uses `stdio: ['ignore', 'pipe', 'pipe']` instead of a hack
- Robust Spawning: Eliminated stdin injection workarounds
Note: This document reflects the actual implemented features. For planned features, see ROADMAP.md.