
Implement Ralph Loop methodology for self-correcting implementation #6

Merged
jsegov merged 13 commits into main from jonathanjsegovia/shi-56-spec-ralph
Jan 14, 2026

jsegov commented Jan 12, 2026

Summary

  • Implements Ralph Loop methodology (Phases 1-3) for iterative, self-correcting AI development
  • Adds Stop hooks that intercept exit attempts and retry on failed verification
  • Enables auto-retry for /implement-task, /implement-feature, and /feature-planning

Key Changes

Phase 1: Foundation

  • Stop hooks infrastructure (hooks/hooks.json)
  • State file pattern (.claude/shipspec-*.local.md)
  • Completion markers for hook coordination

Phase 2: Single Task Auto-Retry

  • task-loop-hook.sh: Retries up to 5 times on failed verification
  • Completion promise system for subjective criteria

Phase 3: Advanced Features

  • feature-retry-hook.sh: Per-task retry during full feature implementation
  • planning-refine-hook.sh: Optional task refinement for large tasks (>5 story points)
  • Inline verification with auto-retry

Additional

  • Debug logging to .claude/shipspec-debug.log
  • Ralph Loop methodology documented in README
  • Version bump to 1.1.0

Test plan

  • Run /implement-task with a task that fails verification - verify auto-retry
  • Run /implement-feature - verify per-task retry on failure
  • Run /feature-planning with large tasks - verify refinement prompt appears
  • Verify state files are cleaned up after completion

🤖 Generated with Claude Code


Note

Introduces a resilient implementation loop with auto-retries and standardizes task operations on TASKS.json.

  • Hooks & loop state: Adds hooks/hooks.json, task-loop-hook.sh, feature-retry-hook.sh, planning-refine-hook.sh with JSON state files and completion markers (<task-loop-complete>..., <feature-task-complete>..., <feature-complete>...).
  • Commands: Overhauls /implement-task and /implement-feature to read TASKS.json, update status via task-manager update_status, log to .claude/shipspec-debug.log, and drive auto-retry; adds /cancel-task-loop and /cancel-feature-retry; extends /feature-planning to output both TASKS.json and TASKS.md plus optional refinement loop.
  • Agents:
    • task-manager now parses/validates TASKS.json, finds next/in-progress tasks, and updates status.
    • task-verifier verifies acceptance_criteria/testing from TASKS.json and supports a completion-promise shortcut for subjective checks.
    • planning-validator loads PRD/SDD refs from TASKS.json.
    • task-planner generates TASKS.json+TASKS.md with phases, points, deps.
  • Skills & schemas: Adds skills/task-loop-verify, JSON Schemas for TASKS.json and loop state.
  • Docs & metadata: New CLAUDE.md, expanded README.md; bump plugin versions to 1.1.0 in .claude-plugin/plugin.json and marketplace.

Written by Cursor Bugbot for commit fbed6b5. This will update automatically on new commits.

…lementation

Phase 1: Foundation
- Add Stop hooks infrastructure (hooks/hooks.json)
- Add state file pattern (.claude/shipspec-*.local.md)
- Add completion markers for hook coordination

Phase 2: Single Task Auto-Retry
- task-loop-hook.sh: Retries /implement-task up to 5 times on failed verification
- Add loop state initialization in implement-task.md
- Add completion promise system for subjective criteria in task-verifier.md

Phase 3: Advanced Features
- feature-retry-hook.sh: Per-task retry during /implement-feature
- planning-refine-hook.sh: Optional Phase 8 to refine large tasks (>5 story points)
- Add inline verification with auto-retry in implement-feature.md
- Add task refinement loop in feature-planning.md

Additional changes:
- Add debug logging to .claude/shipspec-debug.log
- Document Ralph Loop methodology in README
- Bump version to 1.1.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude bot commented Jan 12, 2026


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.


Two issues fixed:

1. INCOMPLETE and MISALIGNED verification results now properly clean up
   the loop state file and output completion markers. Previously these
   cases told the user to stop for manual fixes but left the state file
   intact, causing the stop hook to trigger unwanted automatic retries.

2. All three stop hooks now check for state file existence BEFORE reading
   stdin. Since hooks run sequentially and share stdin, the first hook
   was consuming all input, leaving nothing for subsequent hooks. Now
   inactive hooks exit immediately without consuming stdin.
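
The ordering fix in item 2 could look roughly like the sketch below. It is an illustration only, not the hook source: the function name `run_hook` and the state file path passed to it are hypothetical.

```shell
# Hypothetical sketch: an inactive hook returns before touching stdin,
# so a later hook sharing the same stdin still sees the payload.
run_hook() {
  local state_file="$1"

  # Check for the state file first; do not read stdin when inactive.
  if [ ! -f "$state_file" ]; then
    return 0
  fi

  # Only an active hook consumes the payload.
  local payload
  payload="$(cat)"
  printf 'active:%s\n' "$payload"
}
```

With the check placed after the `cat`, an inactive hook would drain stdin and any hook running after it would read nothing.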

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add missing Bash(rm:*) and Bash(mkdir:*) permissions to implement-task.md
  and implement-feature.md allowed-tools frontmatter
- Fix counter reset bug in implement-feature.md by checking current_task_id
  before creating state file (preserves retry counter for same-task retries,
  resets to 1 only for new tasks)
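
A minimal sketch of the counter rule described above, assuming the stored task id and attempt count have already been read from the state file. The helper name `next_attempt` and its argument order are hypothetical.

```shell
# Hypothetical helper: choose the attempt number to write into the state file.
#   $1 = task id recorded in the existing state (empty if there is none)
#   $2 = attempt count recorded in the existing state
#   $3 = task id about to be implemented
next_attempt() {
  local stored_id="$1" stored_attempt="$2" current_id="$3"
  if [ -n "$stored_id" ] && [ "$stored_id" = "$current_id" ]; then
    # Same task is being retried: keep the existing counter.
    echo "$stored_attempt"
  else
    # New task (or no prior state): start again from 1.
    echo 1
  fi
}
```

For example, `next_attempt TASK-001 3 TASK-001` keeps the counter at 3, while `next_attempt TASK-001 3 TASK-002` starts over at 1.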

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add set -euo pipefail for strict error handling
- Fix frontmatter parsing to isolate YAML from prompt body
- Preserve original task prompt in retry feedback (critical fix)
- Add prompt text validation with detailed error messages
- Fix "zero means unlimited" logic for max iterations
- Add comprehensive transcript validation
- Clean up state consistently when max iterations is reached
- Add /cancel-task-loop and /cancel-feature-retry commands
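
The "zero means unlimited" rule from the list above can be sketched as a small predicate. This assumes a max-iterations value of 0 disables the limit; the function name `limit_reached` is hypothetical and not taken from the hooks.

```shell
# Hypothetical predicate: succeeds (exit 0) when the loop should stop
# because the iteration limit has been reached.
#   $1 = current iteration, $2 = configured maximum (0 = no limit)
limit_reached() {
  local iteration="$1" max="$2"
  # A max of 0 never trips the limit; otherwise stop once iteration >= max.
  [ "$max" -gt 0 ] && [ "$iteration" -ge "$max" ]
}
```

A naive `[ "$iteration" -ge "$max" ]` on its own would stop immediately when the maximum is 0, which is the opposite of unlimited.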

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The state file created in Phase 8.2 contained only YAML frontmatter with
no content after the closing ---. The planning-refine-hook.sh extracts
PROMPT_TEXT from content after the frontmatter and validates it's not
empty. When empty, the hook treated the file as corrupted and exited
without blocking, making the entire task refinement auto-retry loop
non-functional.

Changes:
- Changed heredoc delimiter from quoted 'EOF' to unquoted EOF to allow variable expansion
- Added refinement prompt with instructions after frontmatter
- Included completion marker instruction for loop exit
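
The heredoc change relies on standard shell behavior: a quoted delimiter writes the body literally, an unquoted one expands variables. A small demonstration, where the `feature` variable and its value are made up for illustration:

```shell
feature="checkout"   # hypothetical value, for illustration only

# Quoted delimiter: the body is taken literally, so $feature is not expanded.
literal="$(cat <<'EOF'
feature: $feature
EOF
)"

# Unquoted delimiter: variables in the body are expanded.
expanded="$(cat <<EOF
feature: $feature
EOF
)"
```

Here `literal` holds the text `feature: $feature`, while `expanded` holds `feature: checkout`.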

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The awk command used to extract PROMPT_TEXT was skipping all lines
matching ^---$, not just the YAML frontmatter delimiters. If a task
prompt contained a markdown horizontal rule (---), that line was
silently removed, corrupting the prompt during retry.

Fix: Only skip --- lines until we've seen two (the frontmatter
delimiters), then preserve any subsequent --- in the content.
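
One way to express this fix in awk is sketched below. It is an assumption about the shape of the fix, not the committed code; the function name `extract_prompt` is hypothetical.

```shell
# Print everything after the second '---' line of a file.
# Later '---' lines (markdown horizontal rules) are kept.
extract_prompt() {
  awk '
    seen < 2 && /^---$/ { seen++; next }  # skip only the two frontmatter delimiters
    seen >= 2 { print }                   # body, including any horizontal rules
  ' "$1"
}
```

Given a file whose frontmatter is followed by `First paragraph`, `---`, `Second paragraph`, all three body lines are printed, including the rule.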

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jsegov and others added 3 commits January 12, 2026 18:45
- Add empty stdin detection to all hooks to prevent erroneous state
  file deletion when multiple hooks are active simultaneously
- Add context-aware retry messages (first retry says "continue",
  subsequent retries say "previous attempt did not pass")
- Update task-loop-verify SKILL.md to output INCOMPLETE marker
  consistently with implement-task behavior
- Add CLAUDE.md with project architecture and development notes
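
The empty-stdin check from the first item could be sketched as follows. This is a guess at the shape of the guard; `read_payload` is a hypothetical name.

```shell
# Hypothetical guard: read stdin once and report failure when it is empty,
# so the caller can exit without deleting any state file.
read_payload() {
  local payload
  payload="$(cat)"
  if [ -z "$payload" ]; then
    return 1   # nothing on stdin: treat as "not for this hook"
  fi
  printf '%s' "$payload"
}
```

A caller would typically do `payload="$(read_payload)" || exit 0` and only continue to validation and cleanup when a payload was actually received.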

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update marketplace.json version from 1.0.0 to 1.1.0
- Add version bumping instructions to CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
State files are now stored alongside planning artifacts:
- Pointer: .shipspec/active-loop.local.md
- State: .shipspec/planning/<feature>/<loop-type>.local.md

Changes:
- Update all 3 hooks to read pointer file for dynamic state discovery
- Update commands to create pointer + state in new locations
- Update cancel commands to read pointer and clean up both files
- Update task-loop-verify skill to use new paths
- Update documentation (CLAUDE.md, README.md)

This keeps all ShipSpec artifacts in one directory and removes
dependency on .claude/ existing in user repos.
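
A rough sketch of pointer-based state discovery. The commit does not show the pointer file's format, so this assumes, purely for illustration, that the pointer holds the state file path on its first line; `resolve_state_file` is a hypothetical name.

```shell
# Hypothetical: resolve the active state file through the pointer file.
# Assumes the pointer's first line is the path to the state file.
resolve_state_file() {
  local pointer="$1" state_path

  # No pointer means no active loop.
  if [ ! -f "$pointer" ]; then
    return 1
  fi

  state_path="$(head -n 1 "$pointer")"

  # A dangling or empty pointer is treated as inactive.
  if [ -z "$state_path" ] || [ ! -f "$state_path" ]; then
    return 1
  fi

  printf '%s\n' "$state_path"
}
```

A hook could then call `resolve_state_file .shipspec/active-loop.local.md` and exit early when it fails.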

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jsegov and others added 3 commits January 12, 2026 20:50
Replace checkbox-based TASKS.md parsing with deterministic JSON.

Changes:
- TASKS.json: Machine-parseable metadata (status, deps, acceptance criteria, prompts)
- TASKS.md: Human-readable view (context, approach, implementation guidance)
- Loop state files: Pure JSON format (no YAML frontmatter)
- Hooks: jq-only parsing (no sed/awk)
- task-manager: New update_status operation for JSON status updates
- All commands/agents: Updated for JSON operations

Benefits:
- Eliminates checkbox parsing ambiguity
- Type-safe status enum (not_started/in_progress/completed)
- Smaller state files (prompts read from TASKS.json, not copied)
- Schema validation for TASKS.json and state files
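
The status enum named above can be illustrated with a plain shell guard. This is only a sketch of the idea: the PR itself relies on jq parsing and JSON Schema validation, and `is_valid_status` is a hypothetical helper.

```shell
# Hypothetical check: accept only the three status values listed in the commit.
is_valid_status() {
  case "$1" in
    not_started|in_progress|completed) return 0 ;;
    *) return 1 ;;
  esac
}
```

Unlike a checkbox that may be written as `[x]`, `[X]`, or `[ ]`, a value outside the enum, such as `done`, is simply rejected.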

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
In implement-feature.md, after implementing a task within the INCOMPLETE
flow and running verification, only VERIFIED and INCOMPLETE outcomes were
handled. If task-verifier returned BLOCKED (e.g., acceptance criteria
removed during implementation), the task would be left in an inconsistent
state with orphaned state files.

Added **If BLOCKED after implementation:** handling that mirrors Step 4.4:
- Cleans up state files
- Outputs blocked marker for hook compatibility
- Notifies user and prompts for manual intervention

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reframe ShipSpec as the marriage of spec-driven development and the
Ralph Wiggum loop. New opening identifies two problems (vibe coding,
giving up early) and shows how specs + loops solve both. Added ASCII
diagrams, consolidated redundant Ralph section into intro, linked to
source blog posts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Both hooks incorrectly claimed attempt 2 was "just started" with no
verification. In reality, verification always runs before any hook
can trigger a retry. Unified messaging now correctly states that the
previous attempt did not pass acceptance criteria.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jsegov merged commit 8fd5d95 into main Jan 14, 2026
2 checks passed
jsegov deleted the jonathanjsegovia/shi-56-spec-ralph branch January 14, 2026 01:50