Implement Ralph Loop methodology for self-correcting implementation #6
Merged
…lementation

Phase 1: Foundation
- Add Stop hooks infrastructure (hooks/hooks.json)
- Add state file pattern (.claude/shipspec-*.local.md)
- Add completion markers for hook coordination

Phase 2: Single Task Auto-Retry
- task-loop-hook.sh: Retries /implement-task up to 5 times on failed verification
- Add loop state initialization in implement-task.md
- Add completion promise system for subjective criteria in task-verifier.md

Phase 3: Advanced Features
- feature-retry-hook.sh: Per-task retry during /implement-feature
- planning-refine-hook.sh: Optional Phase 8 to refine large tasks (>5 story points)
- Add inline verification with auto-retry in implement-feature.md
- Add task refinement loop in feature-planning.md

Additional changes:
- Add debug logging to .claude/shipspec-debug.log
- Document Ralph Loop methodology in README
- Bump version to 1.1.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Code review: No issues found. Checked for bugs and CLAUDE.md compliance.

Two issues fixed:

1. INCOMPLETE and MISALIGNED verification results now properly clean up the loop state file and output completion markers. Previously these cases told the user to stop for manual fixes but left the state file intact, causing the stop hook to trigger unwanted automatic retries.
2. All three stop hooks now check for state file existence BEFORE reading stdin. Since hooks run sequentially and share stdin, the first hook was consuming all input, leaving nothing for subsequent hooks. Now inactive hooks exit immediately without consuming stdin.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

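The stdin fix in item 2 can be illustrated with a minimal sketch. The function name `run_hook` and its output strings are hypothetical and not taken from the actual hook scripts; the only point shown is the ordering of the state-file check before any read of stdin.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a stop hook's entry guard; names are illustrative only.

# run_hook STATE_FILE
# Prints "inactive" and leaves stdin untouched when the state file is absent;
# otherwise reads all of stdin and reports how many characters it received.
run_hook() {
  local state_file="$1"

  # Guard first: an inactive hook must not consume the shared stdin.
  if [ ! -f "$state_file" ]; then
    echo "inactive"
    return 0
  fi

  # Only an active hook reads the hook input.
  local input
  input="$(cat)"
  echo "active: read ${#input} characters"
}

# Usage: two hooks run in sequence against one shared stdin stream.
demo_dir="$(mktemp -d)"
touch "$demo_dir/active.local.md"
printf 'payload' | {
  run_hook "$demo_dir/absent.local.md"   # inactive, reads nothing
  run_hook "$demo_dir/active.local.md"   # still sees the full payload
}
```

If the guard came after `input="$(cat)"`, the first (inactive) hook would drain the stream and the second would see empty input, which is the failure described above.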
- Add missing Bash(rm:*) and Bash(mkdir:*) permissions to implement-task.md and implement-feature.md allowed-tools frontmatter
- Fix counter reset bug in implement-feature.md by checking current_task_id before creating state file (preserves retry counter for same-task retries, resets to 1 only for new tasks)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

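The counter fix can be sketched roughly as follows. This is an assumption-laden illustration: the `current_task_id` key comes from the commit message, but the `attempt` key, the `key: value` layout, and the function name `next_attempt` are hypothetical stand-ins for whatever the real state file uses.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide which attempt number to write into a new state file.
# Assumes a state file with "current_task_id: ..." and "attempt: ..." lines.

# next_attempt STATE_FILE TASK_ID
# Prints the existing attempt count when the state file already tracks TASK_ID,
# and 1 when the task is new or no usable state exists.
next_attempt() {
  local state_file="$1" task_id="$2"
  local current_task="" attempt=""

  if [ -f "$state_file" ]; then
    current_task="$(sed -n 's/^current_task_id: *//p' "$state_file" | head -n 1)"
    attempt="$(sed -n 's/^attempt: *//p' "$state_file" | head -n 1)"
  fi

  if [ "$current_task" = "$task_id" ] && [[ "$attempt" =~ ^[0-9]+$ ]]; then
    # Same task: preserve the retry counter instead of resetting it.
    echo "$attempt"
  else
    # New task, missing file, or malformed counter: start at 1.
    echo 1
  fi
}
```

Without the `current_task_id` comparison, every re-entry into the command would rewrite the counter to 1 and the retry limit would never be reached.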
- Add set -euo pipefail for strict error handling
- Fix frontmatter parsing to isolate YAML from prompt body
- Preserve original task prompt in retry feedback (critical fix)
- Add prompt text validation with detailed error messages
- Fix "zero means unlimited" logic for max iterations
- Add comprehensive transcript validation
- Consistent state cleanup on max iterations reached
- Add /cancel-task-loop and /cancel-feature-retry commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

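One plausible shape for the "zero means unlimited" check is sketched below. The function name `limit_reached` and its argument order are hypothetical; the sketch only shows that a limit of 0 has to be excluded before the comparison, since otherwise any iteration count is already greater than or equal to 0.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a max-iterations check where 0 disables the limit.

# limit_reached ITERATION MAX_ITERATIONS
# Returns 0 (true) when the loop should stop, 1 (false) otherwise.
limit_reached() {
  local iteration="$1" max="$2"
  # A max of 0 means "no limit", so only compare when max is positive.
  [ "$max" -gt 0 ] && [ "$iteration" -ge "$max" ]
}

# Usage: call it inside an if so a false result is safe under set -e.
if limit_reached 5 5; then
  echo "max iterations reached; cleaning up state"
fi
```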
The state file created in Phase 8.2 contained only YAML frontmatter with no content after the closing ---. The planning-refine-hook.sh extracts PROMPT_TEXT from content after the frontmatter and validates it's not empty. When empty, the hook treated the file as corrupted and exited without blocking, making the entire task refinement auto-retry loop non-functional.

Changes:
- Changed heredoc from 'EOF' to EOF to allow variable expansion
- Added refinement prompt with instructions after frontmatter
- Included completion marker instruction for loop exit

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

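The heredoc change relies on standard shell behavior: a quoted delimiter makes the body literal, while an unquoted one lets the shell expand variables. A small sketch, using a hypothetical `feature` variable rather than the real state file contents:

```shell
#!/usr/bin/env bash
# Demonstrates quoted vs. unquoted heredoc delimiters.
feature="checkout"   # hypothetical value, for illustration only

# Quoted delimiter: the body is taken literally, so $feature is NOT expanded.
quoted="$(cat <<'EOF'
feature: $feature
EOF
)"

# Unquoted delimiter: the shell expands $feature before the body is written.
unquoted="$(cat <<EOF
feature: $feature
EOF
)"

printf '%s\n' "$quoted"     # prints: feature: $feature
printf '%s\n' "$unquoted"   # prints: feature: checkout
```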
The awk command used to extract PROMPT_TEXT was skipping all lines matching ^---$, not just the YAML frontmatter delimiters. If a task prompt contained a markdown horizontal rule (---), that line was silently removed, corrupting the prompt during retry.

Fix: Only skip --- lines until we've seen two (the frontmatter delimiters), then preserve any subsequent --- in the content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

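A minimal sketch of the described fix, assuming the state file starts with a frontmatter block delimited by two `---` lines. The function name `extract_prompt` is hypothetical, and the awk program is one way to express "skip only the first two delimiters", not necessarily the script's exact code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: extract the prompt body that follows YAML frontmatter.

# extract_prompt STATE_FILE
# Prints everything after the second '---' line, keeping any later '---' lines.
extract_prompt() {
  awk '
    seen < 2 && /^---$/ { seen++; next }  # skip only the two frontmatter delimiters
    seen >= 2           { print }         # body: print every line, including ---
  ' "$1"
}

# Usage: a prompt that contains a markdown horizontal rule survives intact.
demo_file="$(mktemp)"
printf -- '---\nactive: true\n---\nStep one\n---\nStep two\n' > "$demo_file"
extract_prompt "$demo_file"
```

The buggy variant corresponds to dropping the `seen < 2` condition, which removes every `---` line regardless of position.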
- Add empty stdin detection to all hooks to prevent erroneous state file deletion when multiple hooks are active simultaneously
- Add context-aware retry messages (first retry says "continue", subsequent retries say "previous attempt did not pass")
- Update task-loop-verify SKILL.md to output INCOMPLETE marker consistently with implement-task behavior
- Add CLAUDE.md with project architecture and development notes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Update marketplace.json version from 1.0.0 to 1.1.0
- Add version bumping instructions to CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

State files are now stored alongside planning artifacts:
- Pointer: .shipspec/active-loop.local.md
- State: .shipspec/planning/<feature>/<loop-type>.local.md

Changes:
- Update all 3 hooks to read pointer file for dynamic state discovery
- Update commands to create pointer + state in new locations
- Update cancel commands to read pointer and clean up both files
- Update task-loop-verify skill to use new paths
- Update documentation (CLAUDE.md, README.md)

This keeps all ShipSpec artifacts in one directory and removes dependency on .claude/ existing in user repos.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

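The pointer-based discovery could look roughly like the sketch below. The two paths come from the commit message; the pointer file's contents are not described there, so the `feature:` and `loop_type:` keys and the function name `resolve_state_file` are assumptions made purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of dynamic state discovery via a pointer file.
# Assumes the pointer holds "feature: ..." and "loop_type: ..." lines.

# resolve_state_file ROOT
# Prints the state file path named by the pointer under ROOT.
# Returns 1 when there is no pointer or it lacks the expected keys.
resolve_state_file() {
  local root="$1"
  local pointer="$root/.shipspec/active-loop.local.md"

  # No pointer means no loop is active.
  [ -f "$pointer" ] || return 1

  local feature loop_type
  feature="$(sed -n 's/^feature: *//p' "$pointer" | head -n 1)"
  loop_type="$(sed -n 's/^loop_type: *//p' "$pointer" | head -n 1)"

  # Treat an incomplete pointer as "no active loop" rather than guessing a path.
  [ -n "$feature" ] && [ -n "$loop_type" ] || return 1

  echo "$root/.shipspec/planning/$feature/$loop_type.local.md"
}
```

A hook built this way needs only the single well-known pointer path, and a cancel command can resolve and remove both files from the same lookup.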
Replace checkbox-based TASKS.md parsing with deterministic JSON.

Changes:
- TASKS.json: Machine-parseable metadata (status, deps, acceptance criteria, prompts)
- TASKS.md: Human-readable view (context, approach, implementation guidance)
- Loop state files: Pure JSON format (no YAML frontmatter)
- Hooks: jq-only parsing (no sed/awk)
- task-manager: New update_status operation for JSON status updates
- All commands/agents: Updated for JSON operations

Benefits:
- Eliminates checkbox parsing ambiguity
- Type-safe status enum (not_started/in_progress/completed)
- Smaller state files (prompts read from TASKS.json, not copied)
- Schema validation for TASKS.json and state files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

In implement-feature.md, after implementing a task within the INCOMPLETE flow and running verification, only VERIFIED and INCOMPLETE outcomes were handled. If task-verifier returned BLOCKED (e.g., acceptance criteria removed during implementation), the task would be left in an inconsistent state with orphaned state files.

Added **If BLOCKED after implementation:** handling that mirrors Step 4.4:
- Cleans up state files
- Outputs blocked marker for hook compatibility
- Notifies user and prompts for manual intervention

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Reframe ShipSpec as the marriage of spec-driven development and the Ralph Wiggum loop. New opening identifies two problems (vibe coding, giving up early) and shows how specs + loops solve both. Added ASCII diagrams, consolidated redundant Ralph section into intro, linked to source blog posts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Both hooks incorrectly claimed attempt 2 was "just started" with no verification. In reality, verification always runs before any hook can trigger a retry. Unified messaging now correctly states that the previous attempt did not pass acceptance criteria.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Summary
- /implement-task, /implement-feature, and /feature-planning

Key Changes

Phase 1: Foundation
- Stop hooks infrastructure (hooks/hooks.json)
- State file pattern (.claude/shipspec-*.local.md)

Phase 2: Single Task Auto-Retry
- task-loop-hook.sh: Retries up to 5 times on failed verification

Phase 3: Advanced Features
- feature-retry-hook.sh: Per-task retry during full feature implementation
- planning-refine-hook.sh: Optional task refinement for large tasks (>5 story points)

Additional
- Debug logging to .claude/shipspec-debug.log

Test plan
- /implement-task with a task that fails verification - verify auto-retry
- /implement-feature - verify per-task retry on failure
- /feature-planning with large tasks - verify refinement prompt appears

🤖 Generated with Claude Code

Note
Introduces a resilient implementation loop with auto-retries and standardizes task operations on TASKS.json.
- Adds hooks/hooks.json, task-loop-hook.sh, feature-retry-hook.sh, and planning-refine-hook.sh, with JSON state files and completion markers (<task-loop-complete>..., <feature-task-complete>..., <feature-complete>...).
- Updates /implement-task and /implement-feature to read TASKS.json, update status via task-manager update_status, log to .claude/shipspec-debug.log, and drive auto-retry; adds /cancel-task-loop and /cancel-feature-retry; extends /feature-planning to output both TASKS.json and TASKS.md plus an optional refinement loop.
- task-manager now parses/validates TASKS.json, finds next/in-progress tasks, and updates status. task-verifier verifies acceptance_criteria/testing from TASKS.json and supports a completion-promise shortcut for subjective checks. planning-validator loads PRD/SDD refs from TASKS.json. task-planner generates TASKS.json + TASKS.md with phases, points, deps.
- Adds skills/task-loop-verify and JSON Schemas for TASKS.json and loop state.
- Adds CLAUDE.md and an expanded README.md; bumps plugin versions to 1.1.0 in .claude-plugin/plugin.json and marketplace.

Written by Cursor Bugbot for commit fbed6b5.