jsegov · Jan 14, 2026 · Jan 12, 2026
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "shipspec",
   "description": "Spec-driven development for big features. When features get too big, plan mode gets too vague—leading to hallucinations during implementation. ShipSpec replaces vague plans with structured PRDs, technical designs, and ordered tasks that keep Claude grounded.",
-  "version": "1.0.0",
+  "version": "1.1.0",
   "author": {
     "name": "ShipSpec"
   },

diff --git a/README.md b/README.md
@@ -192,6 +192,43 @@ The `/implement-feature` command implements all tasks and runs a comprehensive f
 - **Use `/implement-task` to work through tasks manually**: Tracks progress and verifies completion
 - **Use `/implement-feature` for automation**: Implements all tasks and runs comprehensive final review
 
+## Ralph Loop Methodology
+
+This plugin uses the [Ralph Loop](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop) methodology for iterative, self-correcting implementation.
+
+### What is Ralph Loop?
+
+Ralph Loop is a development methodology based on continuous AI agent loops. The core concept: use a **Stop hook** to intercept Claude's exit attempts and feed the same prompt back until the task is complete. This creates a self-referential feedback loop where Claude iteratively improves its work.
+
+### How ShipSpec Uses It
+
+ShipSpec adapts Ralph Loop for structured feature development:
+
+| Feature | Ralph Loop Technique |
+|---------|---------------------|
+| `/implement-task` auto-retry | Stop hook blocks exit on failed verification, retries until VERIFIED or max attempts |
+| `/implement-feature` per-task retry | Same mechanism, applied to each task during full-feature implementation |
+| `/feature-planning` task refinement | Stop hook triggers re-analysis of large tasks (>5 story points) |
+
+### Key Components
+
+1. **Stop Hooks** - Intercept session exit and trigger retry loops
+2. **State Files** - Track iteration count and current task (`.claude/shipspec-*.local.md`)
+3. **Completion Markers** - Signal successful completion (`<task-loop-complete>VERIFIED</task-loop-complete>`)
+4. **Max Iterations** - Safety limit to prevent infinite loops (default: 5 attempts per task)
+
+### Philosophy
+
+From Ralph Loop:
+- **Iteration > Perfection** - Don't aim for perfect on first try; let the loop refine the work
+- **Failures Are Data** - Failed verification tells Claude exactly what to fix
+- **Persistence Wins** - Keep trying until success; the loop handles retry logic automatically
+
+### Learn More
+
+- [Ralph Loop Plugin](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop)
+- [Original Ralph Technique](https://ghuntley.com/ralph/)
+
 ## Issues & Feedback
 
 Found a bug or have a suggestion? [Submit an issue](https://github.com/jsegov/shipspec-claude-code-plugin/issues)
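
Review note on the README hunk above: the "Key Components" list (state file, completion marker, max iterations) implies a small decision procedure. The sketch below is one way to read it, not the plugin's actual hook. The function names `frontmatter_field` and `loop_decision`, the idea of passing the last assistant output as a file, and the "stop once `iteration` reaches `max_iterations`" rule are all assumptions. The real Stop hook input/output contract is not shown in this diff and is not modeled here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: decide whether a retry loop should continue.

# frontmatter_field FILE KEY
# Prints the scalar value of "KEY: value" from the YAML frontmatter
# (the lines between the first two '---' markers).
frontmatter_field() {
  awk -v key="$2" '
    /^---[ \t]*$/ { fences++; next }
    fences >= 2 { exit }
    fences == 1 && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      sub(/^[ \t]+/, "", value)
      sub(/[ \t]+$/, "", value)
      print value
      exit
    }
  ' "$1"
}

# loop_decision STATE_FILE OUTPUT_FILE
# Prints one of: none | complete | exhausted | retry
loop_decision() {
  state_file=$1
  output_file=$2

  # No state file means no loop is active.
  if [ ! -f "$state_file" ]; then
    echo "none"
    return 0
  fi

  # The completion marker ends the loop regardless of the iteration count.
  if grep -Fq '<task-loop-complete>VERIFIED</task-loop-complete>' "$output_file"; then
    echo "complete"
    return 0
  fi

  iteration=$(frontmatter_field "$state_file" iteration)
  max_iterations=$(frontmatter_field "$state_file" max_iterations)

  # Assumed defaults: first attempt, and the README's default of 5 attempts.
  if [ "${iteration:-1}" -ge "${max_iterations:-5}" ]; then
    echo "exhausted"
  else
    echo "retry"
  fi
}
```

Keeping the decision as a pure function of the state file and the last output makes the safety limit easy to check in isolation before wiring it into a hook.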

diff --git a/agents/task-verifier.md b/agents/task-verifier.md
@@ -45,6 +45,22 @@ You will receive:
 
 ## Verification Process
 
+### Step 0: Check for Completion Promise
+
+If the task prompt contains a `## Completion Promise` section:
+
+1. Extract the expected promise text (the identifier between the `<promise>` tags)
+2. Search recent assistant messages in the conversation for `<promise>EXACT_TEXT</promise>`
+3. **If promise found and matches exactly:**
+   - Set `promise_matched = true`
+   - Note in report: "Completion promise detected and matched"
+4. **If promise section exists but no matching promise found:**
+   - Set `promise_matched = false`
+   - Continue to Step 1 (standard verification applies)
+5. **If task has NO Completion Promise section:**
+   - Set `promise_matched = false` (not applicable)
+   - Continue to Step 1 normally
+
 ### Step 1: Extract Acceptance Criteria
 
 Parse the task prompt to find the `## Acceptance Criteria` section. Each line starting with `- [ ]` or `- [x]` is a criterion to verify.
@@ -198,10 +214,24 @@ git diff --name-only HEAD~5
 ## Edge Cases
 
 ### Cannot Verify
-If a criterion cannot be verified (e.g., "User experience is smooth"), mark as CANNOT_VERIFY and note why:
-- Requires manual testing
-- Requires running application
-- Subjective criterion
+
+If a criterion cannot be verified, behavior depends on whether a completion promise was matched in Step 0:
+
+**If completion promise WAS matched (promise_matched = true):**
+- Treat subjective CANNOT_VERIFY criteria as PASS
+- The developer's promise signal counts as explicit verification
+- Example: If criterion is "Documentation is clear (SUBJECTIVE)" and promise matched → PASS
+- Non-subjective CANNOT_VERIFY items (e.g., missing test infrastructure) remain CANNOT_VERIFY
+
+**If completion promise was NOT matched:**
+- Mark subjective criteria as CANNOT_VERIFY with reason:
+  - Requires manual testing
+  - Requires running application
+  - Subjective criterion
+- These do NOT block task completion
+
+**How to identify subjective criteria:**
+Look for markers like "(SUBJECTIVE)", "(subjective)", "clear", "intuitive", "smooth", "user experience" in the criterion text.
 
 ### Partial Completion
 If some criteria pass but others fail, status is INCOMPLETE. List exactly what needs to be fixed.
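
Review note on the "Cannot Verify" hunk above: the new rules reduce to a two-input decision (is the criterion subjective, and was the promise matched). The sketch below illustrates that reading under stated assumptions. The helper names `is_subjective` and `resolve_unverifiable` are hypothetical, and matching the bare words ("clear", "smooth", and so on) case-insensitively is an assumption; the agent prompt only lists the markers.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: resolve a criterion that cannot be checked automatically.

# is_subjective "CRITERION TEXT"
# Succeeds when the text contains one of the markers listed in the agent prompt.
# Caveat: plain substring matching also flags words such as "unclear".
is_subjective() {
  printf '%s\n' "$1" |
    grep -Eiq '\(subjective\)|clear|intuitive|smooth|user experience'
}

# resolve_unverifiable "CRITERION TEXT" PROMISE_MATCHED
# PROMISE_MATCHED is "true" or "false" (the Step 0 result).
# Prints PASS or CANNOT_VERIFY.
resolve_unverifiable() {
  criterion=$1
  promise_matched=$2

  # Only subjective criteria are upgraded, and only when the promise matched.
  if [ "$promise_matched" = "true" ] && is_subjective "$criterion"; then
    echo "PASS"
  else
    echo "CANNOT_VERIFY"
  fi
}
```

For example, `resolve_unverifiable "Documentation is clear (SUBJECTIVE)" true` prints `PASS`, while a criterion blocked by missing test infrastructure stays `CANNOT_VERIFY` in both cases.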
@@ -233,3 +263,28 @@ Always end with one of these clear verdicts:
 3. **Be helpful**: If something fails, explain how to fix it
 4. **Be honest**: If you can't verify something, say so
 5. **Check related files**: Sometimes implementation spans multiple files
+
+## Completion Promise Format
+
+Tasks with subjective criteria can include this section to enable explicit completion signals:
+
+```markdown
+## Completion Promise
+
+To signal completion, output: <promise>UNIQUE_IDENTIFIER</promise>
+
+**This promise should only be output when:**
+- [List specific conditions for this task]
+
+Do NOT output this promise if unsure or to exit early.
+```
+
+**Promise requirements:**
+- The promise text must be a meaningful identifier (e.g., "API_DESIGN_COMPLETE", "DATABASE_SCHEMA_DONE")
+- Must appear verbatim in `<promise>TEXT</promise>` XML tags
+- Match is case-sensitive with no extra whitespace
+
+**When promise is matched:**
+- Subjective criteria marked "(SUBJECTIVE)" are treated as PASS
+- Non-subjective CANNOT_VERIFY items still remain CANNOT_VERIFY
+- Report notes: "Completion promise detected and matched"
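
Review note on `agents/task-verifier.md`: the promise requirements (verbatim, case-sensitive, no extra whitespace) map naturally onto a fixed-string search. The sketch below shows that idea only. The function names `expected_promise` and `promise_matched` are hypothetical, and so is the assumption that the recent assistant messages are available as a plain-text file; the diff does not say how the verifier accesses the conversation.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: Step 0 of the verification process.

# expected_promise TASK_FILE
# Prints the identifier inside the first <promise> tag found in the
# "## Completion Promise" section, or nothing if the section is absent.
expected_promise() {
  awk '
    /^## Completion Promise[ \t]*$/ { in_section = 1; next }
    in_section && /^## / { exit }
    in_section { print }
  ' "$1" | sed -n 's/.*<promise>\([^<]*\)<\/promise>.*/\1/p' | head -n 1
}

# promise_matched EXPECTED_TEXT TRANSCRIPT_FILE
# Prints "true" when <promise>EXPECTED_TEXT</promise> appears verbatim, else "false".
promise_matched() {
  expected=$1
  transcript=$2

  # A task without a promise can never match.
  if [ -z "$expected" ]; then
    echo "false"
    return 0
  fi

  # grep -F treats the pattern as a fixed string, so the comparison is
  # case-sensitive and tolerates no extra whitespace inside the tags.
  if grep -Fq "<promise>${expected}</promise>" "$transcript"; then
    echo "true"
  else
    echo "false"
  fi
}
```

A fixed-string match avoids having to escape identifiers that contain regular-expression metacharacters.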
diff --git a/commands/cancel-feature-retry.md b/commands/cancel-feature-retry.md
@@ -0,0 +1,56 @@
+---
+description: Cancel active feature implementation retry loop
+allowed-tools:
+  - Bash(test:*)
+  - Bash(rm:*)
+  - Read
+---
+
+# Cancel Feature Retry
+
+Cancel an active feature implementation retry loop.
+
+## Step 1: Check for Active Loop
+
+```bash
+test -f .claude/shipspec-feature-retry.local.md && echo "EXISTS" || echo "NOT_FOUND"
+```
+
+**If NOT_FOUND:**
+> "No active feature retry loop found."
+
+**Stop here.**
+
+## Step 2: Read Current State
+
+If EXISTS, read the state file:
+
+```
+Read .claude/shipspec-feature-retry.local.md
+```
+
+Extract from the YAML frontmatter:
+- `current_task_id` - the task being implemented
+- `feature` - the feature name
+- `task_attempt` - current attempt number for the task
+- `max_task_attempts` - maximum attempts per task
+- `tasks_completed` - number of completed tasks
+- `total_tasks` - total number of tasks
+
+## Step 3: Cancel the Loop
+
+Remove the state file:
+
+```bash
+rm .claude/shipspec-feature-retry.local.md
+```
+
+## Step 4: Report
+
+> "Cancelled feature retry for **[FEATURE]**
+>
+> - Current task: [CURRENT_TASK_ID] (attempt [TASK_ATTEMPT]/[MAX_TASK_ATTEMPTS])
+> - Progress: [TASKS_COMPLETED]/[TOTAL_TASKS] tasks completed
+>
+> The current task remains in `[~]` (in-progress) status in TASKS.md.
+> Run `/implement-feature [feature]` to resume, or `/implement-task [feature]` to continue task-by-task."
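
Review note on `commands/cancel-feature-retry.md`: Steps 1 to 4 can be read as a single script. The sketch below is illustrative only, since the command itself is executed by Claude with the `Bash` and `Read` tools. The function names `frontmatter_field` and `cancel_feature_retry` and the optional path argument are hypothetical; the frontmatter keys are the ones listed in Step 2.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: /cancel-feature-retry as one shell function.

# frontmatter_field FILE KEY
# Prints the scalar value of "KEY: value" from the YAML frontmatter.
frontmatter_field() {
  awk -v key="$2" '
    /^---[ \t]*$/ { fences++; next }
    fences >= 2 { exit }
    fences == 1 && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      sub(/^[ \t]+/, "", value)
      sub(/[ \t]+$/, "", value)
      print value
      exit
    }
  ' "$1"
}

# cancel_feature_retry [STATE_FILE]
cancel_feature_retry() {
  state_file=${1:-.claude/shipspec-feature-retry.local.md}

  # Step 1: check for an active loop.
  if [ ! -f "$state_file" ]; then
    printf '%s\n' "No active feature retry loop found."
    return 0
  fi

  # Step 2: read the state before deleting it, so the report can use it.
  feature=$(frontmatter_field "$state_file" feature)
  task_id=$(frontmatter_field "$state_file" current_task_id)
  attempt=$(frontmatter_field "$state_file" task_attempt)
  max_attempts=$(frontmatter_field "$state_file" max_task_attempts)
  completed=$(frontmatter_field "$state_file" tasks_completed)
  total=$(frontmatter_field "$state_file" total_tasks)

  # Step 3: removing the state file is what cancels the loop.
  rm -- "$state_file"

  # Step 4: report.
  printf '%s\n' "Cancelled feature retry for ${feature}"
  printf '%s\n' "- Current task: ${task_id} (attempt ${attempt}/${max_attempts})"
  printf '%s\n' "- Progress: ${completed}/${total} tasks completed"
}
```

Reading the fields before the `rm` matters: once the file is gone, the report has nothing to draw on.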
diff --git a/commands/cancel-task-loop.md b/commands/cancel-task-loop.md
@@ -0,0 +1,53 @@
+---
+description: Cancel active task implementation loop
+allowed-tools:
+  - Bash(test:*)
+  - Bash(rm:*)
+  - Read
+---
+
+# Cancel Task Loop
+
+Cancel an active task implementation loop.
+
+## Step 1: Check for Active Loop
+
+```bash
+test -f .claude/shipspec-task-loop.local.md && echo "EXISTS" || echo "NOT_FOUND"
+```
+
+**If NOT_FOUND:**
+> "No active task loop found."
+
+**Stop here.**
+
+## Step 2: Read Current State
+
+If EXISTS, read the state file:
+
+```
+Read .claude/shipspec-task-loop.local.md
+```
+
+Extract from the YAML frontmatter:
+- `task_id` - the task being implemented
+- `feature` - the feature name
+- `iteration` - current attempt number
+- `max_iterations` - maximum attempts configured
+
+## Step 3: Cancel the Loop
+
+Remove the state file:
+
+```bash
+rm .claude/shipspec-task-loop.local.md
+```
+
+## Step 4: Report
+
+> "Cancelled task loop for **[TASK_ID]** in feature **[FEATURE]**
+>
+> - Was at iteration: [ITERATION]/[MAX_ITERATIONS]
+>
+> The task remains in `[~]` (in-progress) status in TASKS.md.
+> Run `/implement-task [feature]` to resume, or manually update the task status."
diff --git a/commands/feature-planning.md b/commands/feature-planning.md
@@ -323,6 +323,133 @@ The context and description are now incorporated into the PRD, SDD, and TASKS.md
 
 ---
 
+## Phase 8: Task Refinement (Optional)
+
+After generating TASKS.md, analyze task complexity to identify tasks that may be too large.
+
+### 8.1: Identify Large Tasks
+
+Parse TASKS.md and find tasks with estimated effort > 5 story points.
+
+**If no large tasks found:**
+> "All tasks are appropriately sized (≤5 story points). Skipping refinement."
+
+Skip to Completion Summary.
+
+**If large tasks found:**
+
+Show user:
+> "## Task Size Analysis
+>
+> Found **X tasks** with estimated effort > 5 story points:
+>
+> | Task | Title | Story Points |
+> |------|-------|--------------|
+> | TASK-003 | [Title] | 8 |
+> | TASK-007 | [Title] | 13 |
+>
+> Large tasks are harder to implement and verify. Would you like to auto-refine them into smaller subtasks?"
+
+Use AskUserQuestion with options:
+- "Yes, auto-refine large tasks"
+- "No, keep current breakdown"
+
+**If user chooses No:** Skip to Completion Summary.
+
+### 8.2: Initialize Refinement Loop
+
+Create state file:
+```bash
+mkdir -p .claude
+cat > .claude/shipspec-planning-refine.local.md << EOF
+---
+active: true
+feature: [FEATURE_DIR]
+iteration: 1
+max_iterations: 3
+large_tasks:
+  - TASK-003
+  - TASK-007
+tasks_refined: 0
+---
+
+## Task Refinement for [FEATURE_DIR]
+
+Continue refining large tasks in \`.shipspec/planning/[FEATURE_DIR]/TASKS.md\`.
+
+### Instructions:
+1. Read TASKS.md and identify tasks with story points > 5
+2. For each large task, break it into 2-3 smaller subtasks (each ≤3 story points)
+3. Update TASKS.md with the new subtasks
+4. Preserve acceptance criteria across subtasks
+5. Update dependencies pointing to the original task
+
+### Completion:
+When all tasks are ≤5 story points, output:
+\`<planning-refine-complete>\`
+EOF
+```
+
+### 8.3: Refine Each Large Task
+
+For each task in large_tasks:
+
+1. Get full task details from TASKS.md
+2. Delegate to `task-planner` agent:
+   > "Break down TASK-XXX into 2-3 subtasks.
+   >
+   > Original task:
+   > [task prompt]
+   >
+   > Requirements:
+   > - Each subtask should be ≤3 story points
+   > - Preserve the original acceptance criteria distributed across subtasks
+   > - Maintain dependency relationships
+   > - Use format TASK-XXX-A, TASK-XXX-B, etc. for subtask IDs"
+
+3. Replace original task with generated subtasks in TASKS.md
+4. Update dependencies pointing to original task
+5. Run task-manager validate to confirm there are no circular dependencies
+
+**If validation fails:**
+- Rollback changes
+- Mark task as "cannot refine"
+- Continue to next large task
+
+### 8.4: Check Completion
+
+After processing all large tasks:
+
+1. Re-analyze TASKS.md for tasks > 5 points
+2. If large tasks remain AND iteration < max_iterations:
+   - Increment iteration
+   - Add new large tasks to list
+   - Return to 8.3
+3. If no large tasks remain OR max_iterations is reached:
+   - Clean up state file
+   - Show summary
+
+### 8.5: Summary
+
+> "## Task Refinement Complete
+>
+> **Results:**
+> - Original large tasks: X
+> - Successfully refined: Y
+> - Could not refine: Z
+>
+> **New task count:** N (was M)"
+
+**Output completion marker:**
+`<planning-refine-complete>`
+
+Clean up:
+```bash
+rm -f .claude/shipspec-planning-refine.local.md
+```
+
+---
+
 ## Completion Summary
 
 After all phases complete, provide:
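
Review note on Phase 8 in `commands/feature-planning.md`: Steps 8.1 and 8.4 combine a size filter with a bounded loop check. The sketch below shows both under stated assumptions. The real TASKS.md layout is not part of this diff, so the parser assumes the `| Task | Title | Story Points |` table used in the Task Size Analysis message; the function names `large_tasks` and `refinement_next_step` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: find oversized tasks and decide whether to loop again.

# large_tasks FILE [THRESHOLD]
# Prints the IDs of table rows whose story points exceed THRESHOLD (default 5).
# Assumes rows shaped like: | TASK-003 | Some title | 8 |
# Titles containing a literal '|' would break this simple column split.
large_tasks() {
  awk -F'|' -v limit="${2:-5}" '
    {
      id = $2
      points = $4
      gsub(/[ \t]/, "", id)
      gsub(/[ \t]/, "", points)
      # Header and separator rows fail these checks and are skipped.
      if (id ~ /^TASK-[0-9]+(-[A-Z])?$/ && points ~ /^[0-9]+$/ && points + 0 > limit + 0) {
        print id
      }
    }
  ' "$1"
}

# refinement_next_step LARGE_COUNT ITERATION MAX_ITERATIONS
# Prints "refine" to return to step 8.3, or "finish" to clean up and summarize.
refinement_next_step() {
  if [ "$1" -gt 0 ] && [ "$2" -lt "$3" ]; then
    echo "refine"
  else
    echo "finish"
  fi
}
```

With `max_iterations: 3` as in the state file, a run that still has large tasks at iteration 3 finishes anyway, which is the safety limit the loop relies on.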