Skip to content

feat: v0.7.0 — Task & Pipeline Loops#110

Merged
dean0x merged 40 commits intomainfrom
feat/v070-task-loops
Mar 22, 2026
Merged

feat: v0.7.0 — Task & Pipeline Loops#110
dean0x merged 40 commits intomainfrom
feat/v070-task-loops

Conversation

@dean0x
Copy link
Owner

@dean0x dean0x commented Mar 21, 2026

Summary

Adds iterative task and pipeline loop execution to Backbeat (v0.7.0). Loops run a task or pipeline repeatedly, evaluating an exit condition after each iteration to decide whether to continue, keep, or discard results.

  • Retry strategy: Re-run until exit condition passes or limits reached
  • Optimize strategy: Track best score across iterations (maximize/minimize)
  • Pipeline loops: Run multi-step pipelines per iteration with linear dependency chains
  • Recovery: Self-healing on startup — rebuilds in-memory maps, recovers stuck iterations
  • Cleanup: cleanupOldLoops in RecoveryManager (7-day retention, FK cascade deletes iterations)

Key design decisions

  • Loop iteration chaining uses prompt enrichment from checkpoints (no dependsOn to avoid pre-resolved dep deadlock)
  • Only the tail task of a pipeline triggers exit condition evaluation (R4)
  • Atomic iteration start via runInTransaction prevents double-start (R4)
  • Timestamps use epoch number (aligned with Task/Schedule/Worker convention)
  • LoopIteration.taskId is optional to handle ON DELETE SET NULL safely

What's included

  • Domain: Loop, LoopIteration types, LoopStatus/LoopStrategy enums, factory functions
  • Events: LoopCreated, LoopCompleted, LoopCancelled, LoopIterationCompleted
  • Repository: SQLiteLoopRepository with prepared statements, sync ops for transactions, cleanup
  • Handler: LoopHandler — 1,100-line event-driven iteration engine with recordAndContinue helper
  • Service: LoopManagerService for create/cancel/list/get operations
  • MCP tools: CreateLoop, LoopStatus, ListLoops, CancelLoop
  • CLI: beat loop create|status|list|cancel commands
  • Migration v10: loops + loop_iterations tables with indexes and FK cascade
  • Tests: 104 new tests (52 repo, 24 manager, 20 handler, 5 integration, 3 recovery)
  • Tech debt fixes: Timestamp alignment, optional taskId, extracted helper, removed dead param, loop cleanup

Stats

  • 22 commits, 32 files changed, +5,910 / −82 lines
  • 1,102 total tests passing (was 998 on main)
  • Snyk SAST: 0 issues, Biome: clean

Test plan

  • npm run build — clean compile
  • npm run test:core — domain type tests
  • npm run test:repositories — loop repo CRUD, cleanup, NULL taskId, cascade
  • npm run test:services — loop manager + recovery manager cleanup tests
  • npm run test:handlers — loop handler lifecycle, optimize, pipeline, recovery
  • npm run test:integration — end-to-end retry loop, optimize loop, cancel
  • npm run test:cli — CLI routing
  • npm run test:adapters — MCP tool registration
  • Manual: beat loop create --strategy retry --exit-condition 'true' --prompt 'test'

Dean Sharon and others added 22 commits March 21, 2026 01:45
Add foundation types for v0.7.0 Task/Pipeline Loops feature:
- LoopId branded type with factory function
- LoopStatus, LoopStrategy, OptimizeDirection enums
- Loop, LoopIteration, LoopCreateRequest interfaces
- createLoop/updateLoop factory functions (frozen, immutable)
- 4 loop events: LoopCreated, LoopIterationCompleted, LoopCompleted, LoopCancelled
- LoopRepository, SyncLoopOperations, LoopService interfaces

Co-Authored-By: Claude <noreply@anthropic.com>
Add database persistence layer for loops:
- Migration v10: loops table (strategy, task_template, exit_condition,
  eval_direction, status tracking) and loop_iterations table
  (iteration results, scores, task correlation)
- SQLiteLoopRepository: prepared statements, Zod boundary validation,
  JSON serialization for task_template/pipeline_steps, boolean-to-integer
  conversion for fresh_context
- Implements both LoopRepository (async) and SyncLoopOperations (sync)
  interfaces for transaction support

Co-Authored-By: Claude <noreply@anthropic.com>
LoopManagerService implements LoopService with:
- Input validation (prompt length, exitCondition, workingDirectory,
  numeric bounds, strategy-specific evalDirection rules, pipeline
  step count 2-20)
- Agent resolution via resolveDefaultAgent
- Event emission for LoopCreated/LoopCancelled
- Cancel with optional running task cancellation

Extract truncatePrompt into src/utils/format.ts to eliminate
duplication in schedule-manager.ts and cli/commands/status.ts.

Co-Authored-By: Claude <noreply@anthropic.com>
Core iteration engine for v0.7.0 task/pipeline loops:

- Factory pattern (LoopHandler.create()) with async init and recovery
- Event subscriptions: LoopCreated, TaskCompleted, TaskFailed,
  TaskCancelled, LoopCancelled
- Single-task and pipeline iteration dispatch (replicates
  ScheduleHandler pipeline pattern with atomic transactions)
- Exit condition evaluation via execSync with env vars
  (BACKBEAT_LOOP_ID, BACKBEAT_ITERATION, BACKBEAT_TASK_ID)
- Retry strategy: exit code 0 = pass, non-zero = fail
- Optimize strategy: parse last non-empty stdout line as score,
  track bestScore/bestIterationId with direction-aware comparison
- Cooldown timers with .unref() (R14 — don't block process exit)
- Crash recovery: rebuildMaps() + recoverStuckLoops() in create()
- In-memory maps: taskToLoop, pipelineTasks, cooldownTimers
- Pipeline tail-task tracking (R4) — only last task in chain
  goes in taskToLoop map

Co-Authored-By: Claude <noreply@anthropic.com>
Handler setup changes:
- Add LoopRepository & SyncLoopOperations to HandlerDependencies interface
- Add LoopHandler to HandlerSetupResult interface
- Extract loopRepository from container in extractHandlerDependencies()
- Create LoopHandler.create() after CheckpointHandler (needs checkpointRepo)
- Follow same error handling / cleanup pattern as other factory handlers

Bootstrap changes:
- Register loopRepository singleton (SQLiteLoopRepository from database)
- Register loopService singleton (LoopManagerService with eventBus, logger,
  loopRepository, config)
- Store loopHandler in container from HandlerSetupResult

Test fixture:
- Register loopRepository in handler-setup test container
- Update handler count assertion (6 -> 7)

Co-Authored-By: Claude <noreply@anthropic.com>
Add 4 new MCP tools for loop management (v0.7.0):
- CreateLoop: Creates iterative loops with retry or optimize strategy
- LoopStatus: Gets loop details with optional iteration history
- ListLoops: Lists loops with optional status filter
- CancelLoop: Cancels active loops with optional task cancellation

Also adds LoopService as MCPAdapter constructor parameter (following
ScheduleService pattern), updates LoopService.getLoop to accept
historyLimit parameter, and updates all test constructor calls.

Co-Authored-By: Claude <noreply@anthropic.com>
Add loop CLI commands following the schedule command pattern:
- beat loop <prompt> --until <cmd>: Retry loop (pass/fail exit condition)
- beat loop <prompt> --eval <cmd> --direction minimize|maximize: Optimize loop
- beat loop --pipeline --step "..." --step "..." --until <cmd>: Pipeline loop
- beat loop list [--status <status>]: List loops
- beat loop get <id> [--history] [--history-limit N]: Get loop details
- beat loop cancel <id> [--cancel-tasks] [reason]: Cancel a loop

Strategy is inferred from flags (--until -> retry, --eval -> optimize).
Read-only commands (list, get) use lightweight ReadOnlyContext.
Mutation commands (create, cancel) use full bootstrap via withServices.

Also adds:
- loopRepository to ReadOnlyContext
- loopService to withServices return type
- Loop Commands section to CLI help text
- loop routing in main CLI entry point

Co-Authored-By: Claude <noreply@anthropic.com>
Add promptPreview field to TaskStatus all-tasks response in MCP adapter,
giving MCP clients a concise task summary without the full prompt text.

Also standardize all inline prompt truncation to use the shared
truncatePrompt() utility from src/utils/format.ts:
- MCP adapter: TaskStatus single-task, GetSchedule template and pipeline steps
- CLI schedule get: prompt display and pipeline step display
- CLI status: single-task prompt display

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix enrichPromptWithCheckpoint fetching only 1 iteration (the latest)
  when it needs to find the previous one — checkpoint context enrichment
  was effectively broken for !freshContext loops
- Fix cancelLoop race condition: task cancellation now runs BEFORE
  LoopCancelled event emission so iterations still have 'running' status
  when TaskCancellationRequested is emitted
- Fix getLoop warning always logging even on success (missing else branch)
- Extract toOptimizeDirection to shared function (DRY: MCP + CLI)
- Remove unused variables in handleIterationResult and startPipelineIteration
- FEATURES.md: Add Task/Pipeline Loops section (strategies, CLI, MCP tools,
  events, config), What's New in v0.7.0 block, update event count to 29
- ROADMAP.md: Mark v0.7.0 as released, update current status, move
  schedule composition to v0.8.0, add loop follow-on items to v0.8.0
- README.md: Add CreateLoop/LoopStatus/ListLoops/CancelLoop to MCP tools
  table, add beat loop commands to CLI table, check v0.7.0 in roadmap

Co-Authored-By: Claude <noreply@anthropic.com>
Add 3 unit test files for v0.7.0 loop feature:
- loop-repository.test.ts (45 tests): CRUD, iterations, sync ops, JSON round-trips
- loop-manager.test.ts (24 tests): validation, createLoop, getLoop, listLoops, cancelLoop
- loop-handler.test.ts (20 tests): retry/optimize strategies, pipeline, cooldown, cancel, recovery

Update package.json test groups to include new test files in
test:repositories, test:services, and test:handlers.

Co-Authored-By: Claude <noreply@anthropic.com>
Add end-to-end loop lifecycle test (5 tests) covering:
- Retry loop: create -> iterate -> exit condition passes -> complete
- Cancel: running loop cancelled with iteration cleanup
- Retry with recovery: task fails -> new iteration -> eventually succeeds
- Persistence: verify loop and iterations persisted in DB after lifecycle
- Optimize: track best score across iterations with minimize direction

Uses real shell commands for exit conditions (no vi.mock) to avoid
test pollution in non-isolated vitest mode.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…ber)

All other domain types (Task, Schedule, Worker) use epoch number for
timestamps. Loop and LoopIteration used Date objects, requiring manual
conversion for cross-domain comparisons. This changes all loop timestamp
fields to use epoch milliseconds (number) for consistency:

- Domain: Loop.createdAt/updatedAt/completedAt, LoopIteration.startedAt/completedAt
- Migration v10: TEXT → INTEGER column types
- Repository: Remove toISOString()/new Date() conversion layer
- Handler: new Date() → Date.now() for all timestamp creation
- CLI/MCP: Wrap with new Date() at display boundary only
- Tests: Update all timestamp assertions
When task_id is NULL in SQLite after ON DELETE SET NULL, the repository
was creating '' as TaskId — a silently invalid branded type that would
blow up at runtime. Now returns undefined instead, with guards at all
consumer sites (rebuildMaps, recoverStuckLoops, enrichPromptWithCheckpoint,
CLI display, MCP serialization).

Co-Authored-By: Claude <noreply@anthropic.com>
…lication

Extract private recordAndContinue() method that encapsulates the common
pattern across 5 non-terminal iteration branches: update iteration in DB,
emit LoopIterationCompleted event, apply loop state update, check
termination conditions, and schedule next iteration. Eliminates ~80 lines
of near-duplicate code from handleRetryResult and handleOptimizeResult.

Co-Authored-By: Claude <noreply@anthropic.com>
The _taskId parameter was never used inside handleIterationResult or its
delegates (handleRetryResult, handleOptimizeResult). Remove it from the
method signature and both call sites (handleTaskTerminal, recoverStuckLoops).

Co-Authored-By: Claude <noreply@anthropic.com>
Loops accumulated in DB with no cleanup mechanism. This adds:
- LoopRepository.cleanupOldLoops() interface + SQLite implementation
- RecoveryManager Phase 1b: deletes completed/failed/cancelled loops
  older than 7 days during startup recovery
- LoopRepository injected as optional 7th RecoveryManager param
  (existing call sites unaffected)
- FK cascade (ON DELETE CASCADE) auto-deletes associated iterations

Co-Authored-By: Claude <noreply@anthropic.com>
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 21, 2026

Confidence Score: 4/5

  • PR is on the happy path to merge — no new blocking issues found; all prior critical concerns have been addressed.
  • All P0/P1 concerns from the prior round (blocking eval, FK ordering, non-atomic writes, null taskId, pipeline intermediate failure, unguarded transaction failures) have been resolved. The code now uses async exit condition evaluation, wraps critical DB writes in transactions with proper error fallbacks, and correctly guards null taskIds. 104 new tests including integration coverage further reduce merge risk. Docking one point from 5 given the overall size and complexity of the change (1,263-line handler, 22 commits) and the value of a final smoke-test pass against the test plan before merging.
  • src/services/handlers/loop-handler.ts warrants a final read-through alongside the integration tests given its size and centrality to the feature.

Important Files Changed

Filename Overview
src/services/handlers/loop-handler.ts 1,263-line event-driven iteration engine. Prior concerns (blocking execSync, FK ordering, non-atomic TaskFailed writes, unguarded null taskId, pipeline intermediate failure) have all been resolved — async eval is now delegated to ShellExitConditionEvaluator, task save precedes iteration record in a transaction, and handlePipelineIntermediateTask correctly handles intermediate failures. Recovery paths, timer cleanup, and in-memory map maintenance look correct.
src/services/exit-condition-evaluator.ts Async exit condition evaluator using promisified exec. Clean DI boundary for testability; handles retry (exit code 0 = pass) and optimize (last stdout line parsed as float) strategies correctly. Timeout, NaN/Infinity, and exec-error paths are all handled.
src/implementations/loop-repository.ts SQLite repository with prepared statements, Zod boundary validation, and sync counterparts for transactional use. Row-to-domain and domain-to-row conversions are well-structured. cleanupOldLoops uses CASCADE for iteration cleanup.
src/services/loop-manager.ts Service layer with thorough input validation. The previously flagged undefined taskId guard is now present (lines 272-279), safely skipping cancellation for itérations with cleaned-up tasks. cancelLoop correctly cancels running tasks before emitting LoopCancelled.
src/implementations/database.ts Migration v10 adds loops and loop_iterations tables with FK cascade, CHECK constraints, and appropriate indexes. runInTransaction correctly preserves BackbeatError types.
src/services/recovery-manager.ts cleanupOldLoops added alongside existing cleanupOldCompletedTasks; loopRepository is optional for backward compatibility. Loop recovery itself is handled by LoopHandler.recoverStuckLoops at startup.
src/cli/commands/loop.ts CLI command handler with pure parseLoopCreateArgs for testability. Strategy is correctly inferred from --until/--eval flags. Unknown flags return an error. Pipeline steps validated at parse time.
tests/unit/services/handlers/loop-handler.test.ts Comprehensive 1,046-line test suite covering loop lifecycle, optimize/retry strategies, pipeline failures, recovery paths, and cancellation. 20 targeted handler tests + recovery cases.
tests/integration/task-loops.test.ts 467-line end-to-end integration tests covering retry loop, optimize loop, cancellation, and pipeline loops with real SQLite and event bus.

Sequence Diagram

sequenceDiagram
    participant CLI/MCP
    participant LoopManager
    participant EventBus
    participant LoopHandler
    participant TaskRepo
    participant LoopRepo
    participant Evaluator
    participant Worker

    CLI/MCP->>LoopManager: createLoop(request)
    LoopManager->>EventBus: emit(LoopCreated)
    EventBus->>LoopHandler: handleLoopCreated
    LoopHandler->>LoopRepo: save(loop)
    LoopHandler->>LoopHandler: startNextIteration(loop)

    rect rgb(200, 240, 200)
        note over LoopHandler: Atomic transaction
        LoopHandler->>TaskRepo: saveSync(task)
        LoopHandler->>LoopRepo: recordIterationSync(iteration)
    end

    LoopHandler->>EventBus: emit(TaskDelegated)
    EventBus->>Worker: spawn & execute task

    Worker->>EventBus: emit(TaskCompleted)
    EventBus->>LoopHandler: handleTaskTerminal

    LoopHandler->>Evaluator: evaluate(loop, taskId)
    Evaluator-->>LoopHandler: EvalResult {passed, score}

    alt Retry: passed
        rect rgb(200, 240, 200)
            note over LoopHandler: Atomic transaction
            LoopHandler->>LoopRepo: updateIterationSync(pass)
            LoopHandler->>LoopRepo: updateSync(COMPLETED)
        end
        LoopHandler->>EventBus: emit(LoopCompleted)
    else Retry: failed / Optimize: discard
        rect rgb(200, 240, 200)
            note over LoopHandler: recordAndContinue transaction
            LoopHandler->>LoopRepo: updateIterationSync(fail/discard)
            LoopHandler->>LoopRepo: updateSync(++consecutiveFailures)
        end
        LoopHandler->>EventBus: emit(LoopIterationCompleted)
        LoopHandler->>LoopHandler: checkTerminationConditions
        LoopHandler->>LoopHandler: scheduleNextIteration (with cooldown)
    end

    note over LoopHandler: On server restart
    LoopHandler->>LoopRepo: findRunningIterations → rebuildMaps
    LoopHandler->>LoopRepo: findByStatus(RUNNING) → recoverStuckLoops
Loading

Last reviewed commit: "test(loops): add CLI..."

};

try {
const stdout = execSync(loop.exitCondition, {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: execSync blocks Node.js event loop (Flagged by: Security, Performance reviewers)

The evaluateExitCondition() method uses child_process.execSync() which blocks the entire event loop for up to 60 seconds per evaluation. While your timeout prevents indefinite blocking, this prevents any other loop iterations, task completion handlers, or MCP requests from being processed during exit condition evaluation.

Fix: Replace with async exec wrapped in a Promise:

import { exec } from 'child_process';
import { promisify } from 'util';
const execAsync = promisify(exec);

private async evaluateExitCondition(loop: Loop, taskId: TaskId): Promise<EvalResult> {
  try {
    const { stdout } = await execAsync(loop.exitCondition, {
      cwd: loop.workingDirectory,
      timeout: loop.evalTimeout,
      env: { /* filtered */ },
    });
    // ... parse and return as before
  } catch (execError) {
    // ... existing error handling
  }
}

This allows the event loop to continue processing while the script executes.

if (iterationsResult.ok) {
const runningIterations = iterationsResult.value.filter((i) => i.status === 'running');
for (const iteration of runningIterations) {
const cancelResult = await this.eventBus.emit('TaskCancellationRequested', {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Undefined taskId in TaskCancellationRequested event (Security reviewer)

In cancelLoop() when cancelTasks: true, you emit TaskCancellationRequested with iteration.taskId which can be undefined (nullable field due to ON DELETE SET NULL in FK). However, TaskCancellationRequestedEvent requires taskId: TaskId (non-optional), causing type mismatch and potential runtime errors in downstream handlers.

Fix: Filter to only emit for iterations with defined taskId:

const runningIterations = iterationsResult.value.filter(
  (i) => i.status === 'running' && i.taskId !== undefined
);
for (const iteration of runningIterations) {
  await this.eventBus.emit('TaskCancellationRequested', {
    taskId: iteration.taskId!, // Safe: filtered above
    reason: `Loop ${loopId} cancelled`,
  });
}

CREATE INDEX IF NOT EXISTS idx_loop_iterations_task_id ON loop_iterations(task_id);
CREATE INDEX IF NOT EXISTS idx_loop_iterations_status ON loop_iterations(status);
CREATE INDEX IF NOT EXISTS idx_loop_iterations_loop_iteration ON loop_iterations(loop_id, iteration_number DESC);
`);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Missing index on loops.status column (Flagged by: Database, Performance reviewers)

Queries like findByStatus() and cleanupOldLoops() filter on loops.status, but no index exists. As loops accumulate, these degrade to full table scans. The schedules table already has idx_schedules_status for the same pattern.

Fix: Add two indexes to migration v10 in src/implementations/database.ts:

CREATE INDEX IF NOT EXISTS idx_loops_status ON loops(status);
CREATE INDEX IF NOT EXISTS idx_loops_cleanup ON loops(status, completed_at);

The composite index covers both status filtering and cleanup date conditions.

export interface SyncLoopOperations {
updateSync(loop: Loop): void;
recordIterationSync(iteration: LoopIteration): void;
findByIdSync(id: LoopId): Loop | undefined;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: LoopRepository.findById() returns undefined instead of null (Consistency reviewer)

TaskRepository and ScheduleRepository return Result<Task | null> and Result<Schedule | null> for not-found cases. LoopRepository returns Result<Loop | undefined>. This inconsistency in nullable patterns creates confusion across the codebase.

Fix: In src/core/interfaces.ts, update the LoopRepository interface:

findById(id: LoopId): Promise<Result<Loop | null>>;
findByIdSync(id: LoopId): Loop | null;

Then update implementations in src/implementations/loop-repository.ts to return null instead of undefined.

import { err, ok, type Result } from '../../core/result.js';

/**
* Exit condition evaluation result
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: LoopHandler at 1,106 lines approaches God Class threshold (Flagged by: Architecture, Complexity reviewers)

LoopHandler handles loop creation, iteration dispatch, exit condition evaluation, result handling, termination checks, cooldown scheduling, prompt enrichment, recovery, and map rebuilding -- at least 6 distinct responsibilities. This violates SRP and makes maintenance harder.

Suggested refactor (can be follow-up PR):

  • Extract ExitConditionEvaluator interface (evaluation logic, ~60 lines)
  • Extract LoopRecovery class (rebuildMaps + recoverStuckLoops, ~120 lines)
  • Keep iteration dispatch and result handling in LoopHandler

This would bring each component under 400 lines with clearer single responsibility. Consider this for the next iteration.

* Map evalDirection string to OptimizeDirection enum
* Returns undefined for unrecognized values
*/
export function toOptimizeDirection(value: string | undefined): OptimizeDirection | undefined {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: toOptimizeDirection exported from service layer (Architecture reviewer)

The MCP adapter and CLI import toOptimizeDirection from loop-manager.ts (service layer). Pure mapping functions like this belong in the domain/core layer, not inside service implementations. This creates upward dependencies from adapter/interface to concrete service classes.

Fix: Move toOptimizeDirection() to a shared utility:

// In src/utils/format.ts (alongside truncatePrompt)
export function toOptimizeDirection(str?: string): OptimizeDirection | undefined {
  // ... existing logic
}

Then update imports in mcp-adapter.ts and cli/commands/loop.ts to import from utils. (Same refactoring should apply to toMissedRunPolicy in schedule-manager.ts for consistency.)

export interface SyncLoopOperations {
updateSync(loop: Loop): void;
recordIterationSync(iteration: LoopIteration): void;
findByIdSync(id: LoopId): Loop | undefined;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: LoopRepository.findById() returns undefined instead of null (Consistency reviewer)

TaskRepository and ScheduleRepository return Result<Task | null> and Result<Schedule | null>. LoopRepository returns Result<Loop | undefined>. This inconsistency in nullable patterns creates confusion.

Fix: In src/core/interfaces.ts, update LoopRepository interface:

findById(id: LoopId): Promise<Result<Loop | null>>;
findByIdSync(id: LoopId): Loop | null;

Update src/implementations/loop-repository.ts to return null instead of undefined.

* Map evalDirection string to OptimizeDirection enum
* Returns undefined for unrecognized values
*/
export function toOptimizeDirection(value: string | undefined): OptimizeDirection | undefined {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: toOptimizeDirection exported from service layer (Architecture reviewer)

The MCP adapter and CLI import toOptimizeDirection from loop-manager.ts (service layer). Pure mapping functions belong in the domain/core layer, not inside service implementations. This creates upward dependencies from adapter to concrete service.

Fix: Move to shared utility in src/utils/format.ts:

export function toOptimizeDirection(str?: string): OptimizeDirection | undefined {
  switch (str) {
    case 'minimize': return OptimizeDirection.MINIMIZE;
    case 'maximize': return OptimizeDirection.MAXIMIZE;
    default: return undefined;
  }
}

Then update imports in mcp-adapter.ts and cli/commands/loop.ts.

private async enrichPromptWithCheckpoint(loop: Loop, iterationNumber: number, prompt: string): Promise<string> {
// Get enough iterations to find the previous one (ordered by iteration_number DESC)
// We need at least 2: the current iteration we just started + the previous one
const iterationsResult = await this.loopRepo.getIterations(loop.id, iterationNumber, 0);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Over-fetching iterations in enrichPromptWithCheckpoint() (Performance reviewer)

The method calls getIterations(loop.id, iterationNumber, 0) where iterationNumber is current count. For iteration 100, this fetches 100+ rows then does linear .find(). This grows I/O linearly with iteration count.

Fix: Only fetch the 2 most recent iterations:

const iterationsResult = await this.loopRepo.getIterations(loop.id, 2, 0);
const prevIteration = iterationsResult.value.find(
  (i) => i.iterationNumber === iterationNumber - 1
);

This reduces I/O from O(N) to O(1) per iteration.

// Loop create — full bootstrap with event bus
// ============================================================================

async function handleLoopCreate(loopArgs: string[]): Promise<void> {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: handleLoopCreate CLI function is 187 lines with ~25 cyclomatic complexity (Complexity reviewer)

The function contains a 14-branch else if chain for argument parsing (lines 53-137) with interleaved validation. This exceeds the 50-line threshold and makes it hard to add new flags.

Fix: Extract into parseLoopCreateArgs() that returns a typed options object, separating parsing from business logic. This pattern already exists in schedule.ts.

import { err, ok, type Result } from '../../core/result.js';

/**
* Exit condition evaluation result
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: LoopHandler at 1,106 lines approaches God Class threshold (Flagged by: Architecture, Complexity reviewers)

LoopHandler handles loop creation, iteration dispatch, exit condition evaluation, result handling, termination checks, cooldown scheduling, prompt enrichment, recovery, and map rebuilding -- at least 6 distinct responsibilities. This violates SRP.

Suggested refactor (can be follow-up PR):

  • Extract ExitConditionEvaluator interface (evaluation logic, ~60 lines)
  • Extract LoopRecovery class (rebuildMaps + recoverStuckLoops, ~120 lines)
  • Keep iteration dispatch and result handling in LoopHandler

This would bring each component under 400 lines with clearer single responsibility.

@dean0x
Copy link
Owner Author

dean0x commented Mar 21, 2026

Code Review Summary: v0.7.0 Task Loops (PR #110)

10 reports analyzed (Security, Architecture, Performance, Complexity, Consistency, Regression, Tests, Database, Dependencies, Documentation)


Inline Comments Created ✓

9 blocking issues (≥80% confidence) with inline comments on specific files/lines:

  1. execSync blocks event loop (95% conf, Security + Performance)
  2. Undefined taskId in TaskCancellationRequested (95% conf, Security)
  3. Missing loops.status index (95% conf, Database + Performance)
  4. findById() returns undefined not null (92% conf, Consistency)
  5. toOptimizeDirection exported from service layer (82% conf, Architecture)
  6. Over-fetching iterations in checkpoint enrichment (82% conf, Performance)
  7. MCP tool naming LoopStatus vs GetLoop (82% conf, Consistency)
  8. CLI handleLoopCreate too complex (187 lines, 90% conf, Complexity)
  9. LoopHandler at 1,106 lines (God Class, 80% conf, Architecture)

Medium-Confidence Findings (60-79%)

Security (Confidence 60-80%)

  • No upper bound on evalTimeout (85% conf) — Add max 300000ms (5 min) to prevent resource exhaustion
  • Loop exit condition inherits full process environment (80% conf) — Filter to safe variables (PATH, HOME, SHELL, BACKBEAT_*)
  • Unbounded loop termination (70% conf) — Log warning when both maxIterations=0 and maxConsecutiveFailures=0 (unlimited)

Architecture (Confidence 65-85%)

  • execSync hard-coded in LoopHandler violates DI (85% conf) — Extract ExitConditionEvaluator interface for testability
  • updateLoop() accepts Partial<Loop> too permissive (80% conf) — Create dedicated LoopUpdate type like TaskUpdate
  • RecoveryManager loopRepository optional is inconsistent (85% conf) — Make required parameter now that loops are core
  • MCP adapter constructor growing (6 params) (80% conf) — Consider MCPAdapterDependencies object pattern

Performance (Confidence 65-85%)

  • N+1 query in recoverStuckLoops() (85% conf) — Acceptable for startup-only path; batch if loop counts grow

Complexity (Confidence 65-92%)

  • recoverStuckLoops() has 5-level nesting (92% conf) — Extract recoverStuckIteration() helper to flatten
  • handleTaskTerminal() is 97 lines with duplicated failure logic (85% conf) — Route "task failed" through recordAndContinue
  • extractHandlerDependencies() is 60 lines repetitive boilerplate (80% conf) — Consider batch extraction pattern
  • LoopManagerService.createLoop() validation is 166 lines (82% conf) — Extract to validateCreateRequest() helper

Database (Confidence 80-85%)

  • Missing CHECK constraint on eval_direction (85% conf) — Add CHECK(eval_direction IS NULL OR eval_direction IN ('minimize', 'maximize'))
  • Non-atomic iteration + loop update in recordAndContinue (82% conf) — Wrap both updates in transaction, emit event after

Consistency (Confidence 80-90%)

  • LoopRepository.update() signature deviates (90% conf) — Consider aligning to update(id, Partial<Loop>) pattern or document intentional choice
  • ErrorCode.TASK_NOT_FOUND used for loop not-found (85% conf) — Add LOOP_NOT_FOUND error code (pre-existing pattern)
  • CLAUDE.md not updated with loop files/tools (88-92% conf) — Add loop files to table, loop tools to tool list, loop handler to architecture notes, loop tables to database section
  • Release notes v0.7.0 marked "Released" but package.json still 0.6.0 (85% conf) — Change to "In Progress" or bump version

Documentation (Confidence 80-95%)

  • events.ts header stale (says "25" not "29" events) (95% conf) — Update comment to reflect 4 new loop events
  • FEATURES.md design patterns header still says "v0.6.0" (80% conf) — Remove version or update to v0.7.0

Testing (Confidence 80-92%)

  • No unit tests for 4 MCP loop handlers (~240 lines) (92% conf) — Add tests for success/validation/error paths
  • No unit tests for CLI loop commands (400 lines) (88% conf) — Add tests for argument parsing and validation
  • Recovery test doesn't verify recovered behavior end-to-end (80% conf) — After handler creation, emit TaskCompleted event and verify recovery
  • Context enrichment test only verifies checkpoint queried, not prompt enriched (85% conf) — Capture TaskDelegated event and verify enriched prompt content
  • LoopManager cancelLoop(_, _, true) not tested (82% conf) — Add test for cancelTasks=true code path

Dependencies (Confidence 95%)

  • Missing --exclude='**/loop-repository.test.ts' in test:implementations (95% conf) — Add to package.json to prevent duplicate runs

Lower-Confidence Observations (< 60%)

To Investigate

  • Dual toOptimizeDirection implementations (service vs repo) — consolidate if possible
  • In-memory maps not persisted on graceful shutdown
  • Pipeline iteration task creation not explicitly tested for dependency ordering
  • Unbounded in-memory maps taskToLoop, pipelineTasks, cooldownTimers
  • Sequential event emission in pipeline steps (could potentially parallelize with Promise.all())
  • CLI loop cancel defaults differ between MCP (true) and CLI (false) for cancelTasks

Summary Statistics

Category CRITICAL HIGH MEDIUM LOW
Blocking (≥80% conf) 1 8 0 0
Medium Conf (60-79%) - - 20+ -
Pre-existing - - 5 -

Overall Assessment:

  • ✓ Clean additive feature (zero regression risk)
  • ✓ All 1,092 tests pass, Snyk clean, biome clean
  • ✗ 9 actionable blocking issues (mostly architectural, testability, consistency)
  • ✗ 20+ medium-confidence improvements (documentation, edge cases, refactoring)

Quality Gates: Validator ✓, Simplifier ✓, Scrutinizer ✓ (fixed 2 real bugs)


Reviewers

  • Security: Found 3 blocking issues (execSync, undefined taskId, evalTimeout limits)
  • Architecture: Found 4 blocking/should-fix issues (DI, layering, size, types)
  • Performance: Found 2 blocking issues (execSync, missing indexes)
  • Complexity: Found 3 blocking issues (handleLoopCreate, recoverStuckLoops, handleTaskTerminal)
  • Consistency: Found 2 blocking issues (null vs undefined, tool naming)
  • Database: Found 2 blocking issues (missing indexes, atomicity)
  • Documentation: Found 1 CRITICAL (wrong release notes) + 4 HIGH (CLAUDE.md stale)
  • Tests: Found 2 blocking gaps (MCP handlers, CLI commands untested)
  • Regression: Clean ✓ — zero breaking changes detected
  • Dependencies: Found 1 blocking issue (duplicate test execution)

Recommendation: CHANGES_REQUESTED

All 9 blocking inline comments require fixes before merge. Many are straightforward (missing indexes, null consistency, refactoring). Document intent where design deviates from existing patterns.


Submitted by Claude Code git-agent

Dean Sharon and others added 3 commits March 21, 2026 13:33
- Rewrite release notes with actual v0.7.0 loop features (was v0.6.0 content)
- Update CLAUDE.md with loop file locations, MCP tools, handler, and DB tables
- Add missing idx_loops_status index to migration v10 (matches pattern of all other status columns)
- Fix event count comment from 25 to 29 after adding 4 loop events
- Exclude loop-repository.test.ts from test:implementations to prevent duplicate test execution

Co-Authored-By: Claude <noreply@anthropic.com>
- Add null guard before emitting TaskCancellationRequested in cancelLoop
  to prevent runtime error when iteration.taskId is undefined (ON DELETE
  SET NULL)
- Add upper bound validation for evalTimeout (max 300000ms / 5 minutes)
  to prevent unbounded blocking
- Fix over-fetching in enrichPromptWithCheckpoint: limit query to 2 rows
  instead of iterationNumber (avoids fetching 1000 rows at iteration 1000)

Co-Authored-By: Claude <noreply@anthropic.com>
Dean Sharon added 3 commits March 21, 2026 14:40
… recordAndContinue in transaction

Two related fixes for loop handler reliability:

1. Replace execSync with async exec (via promisify) to avoid blocking the
   Node.js event loop during exit condition evaluation. Error shape changes
   from error.status to error.code to match async exec behavior.

2. Wrap recordAndContinue's sequential DB writes (updateIteration +
   updateLoop) in a single runInTransaction call for atomicity. Event
   emission moves after commit to match schedule-handler pattern.
Extract ~150 lines of argument parsing and validation from handleLoopCreate
into a pure function that returns Result<ParsedLoopArgs, string>. Replaces
13 ui.error()+process.exit(1) pairs with Result err() returns, enabling
unit testing without mocking process.exit.
Replace stubLoopService with MockLoopService class following MockTaskManager
pattern. Add 10 tests covering CreateLoop, LoopStatus, ListLoops, and
CancelLoop handlers including success, error propagation, filtering, and
flag passing. Adds TODO noting that simulate* helpers bypass Zod validation.
Dean Sharon and others added 10 commits March 21, 2026 22:46
Co-Authored-By: Claude <noreply@anthropic.com>
…teration

Co-Authored-By: Claude <noreply@anthropic.com>
…ck loops

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove unused imports (BackbeatError, ErrorCode, err, ok, tryCatch
  from loop-repository; LoopRepository, LoopService types from CLI loop)
- Simplify redundant return variable in loop-handler handleTaskTerminal
- Convert split if/if(!...) to if/else in schedule-manager getSchedule
- Remove dead !next guard (already guaranteed truthy by outer condition)
- Change let to const for non-reassigned promptWords in schedule CLI
- Fix import sort order in handler-setup (exit-condition-evaluator
  before handlers/)
Fix H: Wrap iteration-fail + consecutiveFailures in atomic transaction
  (3 locations: handleTaskTerminal, handlePipelineIntermediateTask,
  recoverSingleLoop). Prevents crash between writes from leaving loop
  able to exceed maxConsecutiveFailures.

Fix I: Mark loop FAILED when recordAndContinue transaction fails,
  instead of silently returning (leaving loop stuck RUNNING forever).

Fix J: Recovery handles terminal iteration with RUNNING loop —
  re-derives correct post-commit action (completeLoop or
  startNextIteration) from iteration status. Covers crash between
  DB commit and async cleanup.

Fix K: Atomic transaction for handleRetryResult pass path —
  iteration 'pass' + loop COMPLETED in single transaction with
  double-write for cleanup.

Fix L: recoverSingleLoop CANCELLED path now calls
  checkTerminationConditions + startNextIteration instead of
  returning silently.

12 new tests covering all fix paths including transaction failure
and crash-window recovery scenarios.
Comment on lines +460 to +470
this.taskToLoop.set(task.id, loopId);

// Emit TaskDelegated event AFTER transaction commit
const emitResult = await this.eventBus.emit('TaskDelegated', { task });
if (!emitResult.ok) {
this.logger.error('Failed to emit TaskDelegated for loop iteration', emitResult.error, {
loopId,
iterationNumber,
taskId: task.id,
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 TaskDelegated emit failure orphans in-process loop

If TaskDelegated fails to emit (line 463), taskToLoop has already been populated (line 460) and the task + iteration are committed to the DB, but no handler received the event. The task stays as QUEUED indefinitely within the current process — there is no in-process retry. recoverSingleLoop will see latestIteration.taskId pointing to a non-terminal QUEUED task and conclude "will complete normally via event handler" (line 1205), so it returns without action.

The loop is stuck until server restart, where RecoveryManager.recoverQueuedTasks re-emits TaskQueued for the orphaned task and the pipeline resumes normally. For most deployments this is an acceptable safety net, but in a long-running process any transient subscriber throw during emit will silently freeze the loop.

Removing the taskToLoop entry on emit failure (and not returning silently) would allow the next recoverStuckLoops call to re-start the iteration instead:

const emitResult = await this.eventBus.emit('TaskDelegated', { task });
if (!emitResult.ok) {
  this.logger.error('Failed to emit TaskDelegated for loop iteration', emitResult.error, {
    loopId,
    iterationNumber,
    taskId: task.id,
  });
  this.taskToLoop.delete(task.id); // prevent phantom tracking
}

Comment on lines +563 to +568
});
// Step 0 failure is critical — cannot proceed
if (i === 0) {
return;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Pipeline step 0 early return leaves downstream tasks registered but unrecoverable in-process

When step 0's TaskDelegated fails and the function returns early, steps 1–N have not yet been emitted (the loop exits at i === 0 before reaching them). All N tasks and the iteration record are already committed to the DB, and taskToLoop has entries for all N tasks. The pipelineTasks set is also populated.

Within the current process, step 0 is QUEUED in the DB but has no dispatcher. DependencyHandler was never notified about steps 1–N (their TaskDelegated events were never emitted either), so the dependency chain is not registered in the dependency graph. On restart, RecoveryManager will re-emit TaskQueued for step 0, the chain will be re-processed, and dependency events will be re-dispatched, so the pipeline eventually recovers.

However, if step 0's TaskDelegated emit fails, it would also be worth clearing taskToLoop entries for all steps (similar to the single-task case) so the in-memory map doesn't hold stale references until restart:

if (i === 0) {
  // Step 0 critical — clean up all tracking before returning
  for (const t of tasks) {
    this.taskToLoop.delete(t.id);
  }
  this.pipelineTasks.delete(`${loopId}:${iterationNumber}`);
  return;
}

Comment on lines +393 to +397

if (!txResult.ok) {
this.logger.error('Failed to start next iteration', txResult.error, { loopId });
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Silent failure on iteration-start transaction leaves loop in RUNNING forever (within process)

When the atomic runInTransaction at line 379 fails (e.g., a transient DB error), the function logs and returns — but the loop remains in RUNNING status with no new iteration created. There is no in-process path that marks the loop as FAILED or retries the start.

Recovery on the next restart handles this correctly: recoverSingleLoop finds a RUNNING loop with no new running iteration, and the terminal-status branch (latestIteration.status !== 'running') calls checkTerminationConditions + startNextIteration again. So the safety net exists.

For visibility, consider calling completeLoop here when the failure is definitively non-transient (or at least logging at error level with the loop ID so operators can correlate the stuck loop to the DB error):

if (!txResult.ok) {
  this.logger.error('Failed to start next iteration — loop will recover on restart', txResult.error, { loopId });
  // Optional: await this.completeLoop(loop, LoopStatus.FAILED, 'Failed to atomically start iteration');
  return;
}

…ize tests

Close the coverage gap — every other CLI command had tests except loops.
Adds 39 CLI tests (parseLoopCreateArgs pure function, service integration,
read-only context, cancel) and 5 integration tests (failure paths, maximize
direction, shell eval). Fixes cli-services.test.ts vi.mock compatibility
in non-isolated mode.
@dean0x dean0x merged commit e156c48 into main Mar 22, 2026
2 checks passed
@dean0x dean0x deleted the feat/v070-task-loops branch March 22, 2026 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant