Merged
Changes from 6 commits
2 changes: 1 addition & 1 deletion .claude-plugin/plugin.json
@@ -1,7 +1,7 @@
{
"name": "shipspec",
"description": "Spec-driven development for big features. When features get too big, plan mode gets too vague—leading to hallucinations during implementation. ShipSpec replaces vague plans with structured PRDs, technical designs, and ordered tasks that keep Claude grounded.",
-"version": "1.0.0",
+"version": "1.1.0",
"author": {
"name": "ShipSpec"
},
37 changes: 37 additions & 0 deletions README.md
@@ -192,6 +192,43 @@ The `/implement-feature` command implements all tasks and runs a comprehensive f
- **Use `/implement-task` to work through tasks manually**: Tracks progress and verifies completion
- **Use `/implement-feature` for automation**: Implements all tasks and runs comprehensive final review

## Ralph Loop Methodology

This plugin uses the [Ralph Loop](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop) methodology for iterative, self-correcting implementation.

### What is Ralph Loop?

Ralph Loop is a development methodology based on continuous AI agent loops. The core concept: use a **Stop hook** to intercept Claude's exit attempts and feed the same prompt back until the task is complete. This creates a self-referential feedback loop where Claude iteratively improves its work.

### How ShipSpec Uses It

ShipSpec adapts Ralph Loop for structured feature development:

| Feature | Ralph Loop Technique |
|---------|---------------------|
| `/implement-task` auto-retry | Stop hook blocks exit on failed verification, retries until VERIFIED or max attempts |
| `/implement-feature` per-task retry | Same mechanism, applied to each task during full-feature implementation |
| `/feature-planning` task refinement | Stop hook triggers re-analysis of large tasks (>5 story points) |

### Key Components

1. **Stop Hooks** - Intercept session exit and trigger retry loops
2. **State Files** - Track iteration count and current task (`.claude/shipspec-*.local.md`)
3. **Completion Markers** - Signal successful completion (`<task-loop-complete>VERIFIED</task-loop-complete>`)
4. **Max Iterations** - Safety limit to prevent infinite loops (default: 5 attempts per task)
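
The four components above combine into one decision each time the session tries to exit. The following is a minimal sketch of that decision, not the plugin's actual hook: the function name `loop_decision`, its two file arguments, and the four result words are hypothetical, and the state file is assumed to carry `iteration` and `max_iterations` lines in its frontmatter.

```bash
# Hypothetical helper (not part of ShipSpec): decide what a retry loop should do.
# $1 = state file with "iteration:" and "max_iterations:" lines (assumed layout)
# $2 = file holding the latest assistant output (assumed input)
loop_decision() {
  state_file=$1
  output_file=$2

  # No state file means no loop is active.
  [ -f "$state_file" ] || { echo "inactive"; return 0; }

  # Completion marker present: verification succeeded, the loop can end.
  if grep -qF '<task-loop-complete>VERIFIED</task-loop-complete>' "$output_file" 2>/dev/null; then
    echo "complete"
    return 0
  fi

  iteration=$(sed -n 's/^iteration:[[:space:]]*//p' "$state_file" | head -n 1)
  max=$(sed -n 's/^max_iterations:[[:space:]]*//p' "$state_file" | head -n 1)

  # Safety limit: stop retrying once the attempt count reaches the maximum.
  if [ "${iteration:-0}" -ge "${max:-5}" ]; then
    echo "give-up"
  else
    echo "retry"
  fi
}
```

A Stop hook built on this would block the exit only for `retry` and let the session end in the other three cases.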

### Philosophy

From Ralph Loop:
- **Iteration > Perfection** - Don't aim for perfect on first try; let the loop refine the work
- **Failures Are Data** - Failed verification tells Claude exactly what to fix
- **Persistence Wins** - Keep trying until success; the loop handles retry logic automatically

### Learn More

- [Ralph Loop Plugin](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop)
- [Original Ralph Technique](https://ghuntley.com/ralph/)

## Issues & Feedback

Found a bug or have a suggestion? [Submit an issue](https://github.com/jsegov/shipspec-claude-code-plugin/issues)
63 changes: 59 additions & 4 deletions agents/task-verifier.md
@@ -45,6 +45,22 @@ You will receive:

## Verification Process

### Step 0: Check for Completion Promise

If the task prompt contains a `## Completion Promise` section:

1. Extract the expected promise text (the identifier between the `<promise>` tags)
2. Search recent assistant messages in the conversation for `<promise>EXACT_TEXT</promise>`
3. **If promise found and matches exactly:**
   - Set `promise_matched = true`
   - Note in report: "Completion promise detected and matched"
4. **If promise section exists but no matching promise found:**
   - Set `promise_matched = false`
   - Continue to Step 1 (standard verification applies)
5. **If task has NO Completion Promise section:**
   - Set `promise_matched = false` (not applicable)
   - Continue to Step 1 normally
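
The exact-match rule in Step 0 can be sketched with a fixed-string search. This is an illustration only: the function name `promise_matched` and the idea of having the recent assistant messages in a plain-text file are assumptions, not something the verifier agent is confirmed to use.

```bash
# Hypothetical helper: succeed only if the transcript contains the exact promise.
# $1 = expected promise identifier
# $2 = file with recent assistant messages (assumed input)
promise_matched() {
  # -F treats the pattern as a fixed string, so the match is case-sensitive
  # and any extra whitespace inside the tags causes a miss.
  grep -qF -- "<promise>$1</promise>" "$2"
}
```

For example, `promise_matched API_DESIGN_COMPLETE transcript.txt` succeeds only when the text `<promise>API_DESIGN_COMPLETE</promise>` appears verbatim in `transcript.txt`.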

### Step 1: Extract Acceptance Criteria

Parse the task prompt to find the `## Acceptance Criteria` section. Each line starting with `- [ ]` or `- [x]` is a criterion to verify.
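
As a rough sketch of this parsing step, the checkbox lines can be pulled out with `awk`. The function name `extract_criteria` is hypothetical, and the sketch assumes the task prompt is available as a file and that the section ends at the next `## ` heading.

```bash
# Hypothetical helper: print the checkbox lines under "## Acceptance Criteria".
# $1 = file containing the task prompt (assumed input)
extract_criteria() {
  awk '
    /^## Acceptance Criteria[[:space:]]*$/ { in_section = 1; next }  # section starts
    /^## / { in_section = 0 }                                        # next H2 heading ends it
    in_section && /^- \[[ x]\]/ { print }                            # keep checkbox lines only
  ' "$1"
}
```

Checkbox lines outside the section, and non-checkbox lines inside it, are ignored.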
@@ -198,10 +214,24 @@ git diff --name-only HEAD~5
## Edge Cases

### Cannot Verify
-If a criterion cannot be verified (e.g., "User experience is smooth"), mark as CANNOT_VERIFY and note why:
-- Requires manual testing
-- Requires running application
-- Subjective criterion

If a criterion cannot be verified, behavior depends on whether a completion promise was matched in Step 0:

**If completion promise WAS matched (promise_matched = true):**
- Treat subjective CANNOT_VERIFY criteria as PASS
- The developer's promise signal counts as explicit verification
- Example: If criterion is "Documentation is clear (SUBJECTIVE)" and promise matched → PASS
- Non-subjective CANNOT_VERIFY items (e.g., missing test infrastructure) remain CANNOT_VERIFY

**If completion promise was NOT matched:**
- Mark subjective criteria as CANNOT_VERIFY with reason:
  - Requires manual testing
  - Requires running application
  - Subjective criterion
- These do NOT block task completion

**How to identify subjective criteria:**
Look for markers like "(SUBJECTIVE)", "(subjective)", "clear", "intuitive", "smooth", "user experience" in the criterion text.
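
The two branches above reduce to a small decision once a criterion has been found unverifiable. The sketch below assumes hypothetical helper names (`is_subjective`, `resolve_cannot_verify`) and treats the listed markers as case-insensitive substrings, which is one possible reading of "look for markers".

```bash
# Hypothetical helper: succeed if the criterion text carries a subjective marker.
is_subjective() {
  printf '%s\n' "$1" | grep -qiE '\(subjective\)|clear|intuitive|smooth|user experience'
}

# Hypothetical helper: final status for a criterion that could not be verified.
# $1 = criterion text, $2 = promise_matched ("true" or "false")
resolve_cannot_verify() {
  if is_subjective "$1" && [ "$2" = "true" ]; then
    echo "PASS"            # promise counts as explicit verification
  else
    echo "CANNOT_VERIFY"   # non-subjective, or no promise matched
  fi
}
```

Substring matching is deliberately loose here: a criterion such as "Cache is cleared on logout" would also be flagged because it contains "clear", so the explicit "(SUBJECTIVE)" marker is the more reliable signal.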

### Partial Completion
If some criteria pass but others fail, status is INCOMPLETE. List exactly what needs to be fixed.
@@ -233,3 +263,28 @@ Always end with one of these clear verdicts:
3. **Be helpful**: If something fails, explain how to fix it
4. **Be honest**: If you can't verify something, say so
5. **Check related files**: Sometimes implementation spans multiple files

## Completion Promise Format

Tasks with subjective criteria can include this section to enable explicit completion signals:

```markdown
## Completion Promise

To signal completion, output: <promise>UNIQUE_IDENTIFIER</promise>

**This promise should only be output when:**
- [List specific conditions for this task]

Do NOT output this promise if unsure or to exit early.
```

**Promise requirements:**
- The promise text must be a meaningful identifier (e.g., "API_DESIGN_COMPLETE", "DATABASE_SCHEMA_DONE")
- Must appear verbatim in `<promise>TEXT</promise>` XML tags
- Match is case-sensitive with no extra whitespace

**When promise is matched:**
- Subjective criteria marked "(SUBJECTIVE)" are treated as PASS
- Non-subjective CANNOT_VERIFY items still remain CANNOT_VERIFY
- Report notes: "Completion promise detected and matched"
56 changes: 56 additions & 0 deletions commands/cancel-feature-retry.md
@@ -0,0 +1,56 @@
---
description: Cancel active feature implementation retry loop
allowed-tools:
- Bash(test:*)
- Bash(rm:*)
- Read
---

# Cancel Feature Retry

Cancel an active feature implementation retry loop.

## Step 1: Check for Active Loop

```bash
test -f .claude/shipspec-feature-retry.local.md && echo "EXISTS" || echo "NOT_FOUND"
```

**If NOT_FOUND:**
> "No active feature retry loop found."

**Stop here.**

## Step 2: Read Current State

If EXISTS, read the state file:

```
Read .claude/shipspec-feature-retry.local.md
```

Extract from the YAML frontmatter:
- `current_task_id` - the task being implemented
- `feature` - the feature name
- `task_attempt` - current attempt number for the task
- `max_task_attempts` - maximum attempts per task
- `tasks_completed` - number of completed tasks
- `total_tasks` - total number of tasks
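
One way to read these fields is a small `awk` filter that looks only inside the leading `---` block. This is a sketch under assumptions: the helper name `frontmatter_field` is hypothetical, and it handles only simple `key: value` scalars, not nested YAML or lists.

```bash
# Hypothetical helper: print one scalar field from the leading YAML frontmatter.
# $1 = field name, $2 = state file
frontmatter_field() {
  awk -v key="$1" '
    NR == 1 && $0 == "---" { in_fm = 1; next }   # opening delimiter
    in_fm && $0 == "---"   { exit }              # closing delimiter: stop reading
    in_fm && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)        # text after "key:"
      sub(/^[ \t]+/, "", value)                  # trim leading whitespace
      print value
      exit
    }
  ' "$2"
}
```

For example, `frontmatter_field current_task_id .claude/shipspec-feature-retry.local.md` prints the task ID, and a missing field prints nothing. Lines in the body of the file are never consulted, so prose that happens to start with `feature:` cannot shadow the real value.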

## Step 3: Cancel the Loop

Remove the state file:

```bash
rm .claude/shipspec-feature-retry.local.md
```

## Step 4: Report

> "Cancelled feature retry for **[FEATURE]**
>
> - Current task: [CURRENT_TASK_ID] (attempt [TASK_ATTEMPT]/[MAX_TASK_ATTEMPTS])
> - Progress: [TASKS_COMPLETED]/[TOTAL_TASKS] tasks completed
>
> The current task remains in `[~]` (in-progress) status in TASKS.md.
> Run `/implement-feature [feature]` to resume, or `/implement-task [feature]` to continue task-by-task."
53 changes: 53 additions & 0 deletions commands/cancel-task-loop.md
@@ -0,0 +1,53 @@
---
description: Cancel active task implementation loop
allowed-tools:
- Bash(test:*)
- Bash(rm:*)
- Read
---

# Cancel Task Loop

Cancel an active task implementation loop.

## Step 1: Check for Active Loop

```bash
test -f .claude/shipspec-task-loop.local.md && echo "EXISTS" || echo "NOT_FOUND"
```

**If NOT_FOUND:**
> "No active task loop found."

**Stop here.**

## Step 2: Read Current State

If EXISTS, read the state file:

```
Read .claude/shipspec-task-loop.local.md
```

Extract from the YAML frontmatter:
- `task_id` - the task being implemented
- `feature` - the feature name
- `iteration` - current attempt number
- `max_iterations` - maximum attempts configured
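
These fields have to be captured before the state file is removed, because the report uses them. A sketch of the whole cancel flow as one function, assuming a hypothetical name `cancel_task_loop`, an optional project-root argument, and simple `key: value` frontmatter lines:

```bash
# Hypothetical sketch of the cancel flow; $1 = project root (defaults to ".").
cancel_task_loop() {
  state_file="${1:-.}/.claude/shipspec-task-loop.local.md"

  # Check for an active loop.
  if [ ! -f "$state_file" ]; then
    echo "No active task loop found."
    return 0
  fi

  # Read the fields first; they are gone once the file is deleted.
  task_id=$(sed -n 's/^task_id:[[:space:]]*//p' "$state_file" | head -n 1)
  feature=$(sed -n 's/^feature:[[:space:]]*//p' "$state_file" | head -n 1)
  iteration=$(sed -n 's/^iteration:[[:space:]]*//p' "$state_file" | head -n 1)
  max=$(sed -n 's/^max_iterations:[[:space:]]*//p' "$state_file" | head -n 1)

  # Cancel the loop by removing the state file.
  rm -- "$state_file"

  # Report what was cancelled.
  echo "Cancelled task loop for $task_id in feature $feature (was at iteration $iteration/$max)"
}
```

The sketch leaves TASKS.md untouched, which matches the documented behavior that the task stays in `[~]` status after cancellation.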

## Step 3: Cancel the Loop

Remove the state file:

```bash
rm .claude/shipspec-task-loop.local.md
```

## Step 4: Report

> "Cancelled task loop for **[TASK_ID]** in feature **[FEATURE]**
>
> - Was at iteration: [ITERATION]/[MAX_ITERATIONS]
>
> The task remains in `[~]` (in-progress) status in TASKS.md.
> Run `/implement-task [feature]` to resume, or manually update the task status."
127 changes: 127 additions & 0 deletions commands/feature-planning.md
@@ -323,6 +323,133 @@ The context and description are now incorporated into the PRD, SDD, and TASKS.md

---

## Phase 8: Task Refinement (Optional)

After generating TASKS.md, analyze task complexity to identify tasks that may be too large.

### 8.1: Identify Large Tasks

Parse TASKS.md and find tasks with estimated effort > 5 story points.
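
A sketch of this scan is shown below. The TASKS.md layout it parses is an assumption made for illustration (a `### TASK-NNN: Title` heading followed by a `Story Points: N` line); the real file may be structured differently, and the function name `large_tasks` is hypothetical.

```bash
# Hypothetical helper: list task IDs whose estimate exceeds a threshold.
# ASSUMED TASKS.md layout (not confirmed by the plugin):
#   ### TASK-003: Title
#   Story Points: 8
# $1 = path to TASKS.md, $2 = threshold (defaults to 5)
large_tasks() {
  awk -v limit="${2:-5}" '
    /^### TASK-[0-9]+/ { id = $2; sub(/:$/, "", id) }            # remember current task ID
    /^Story Points:/   { if ($3 + 0 > limit + 0) print id, $3 }  # numeric comparison
  ' "$1"
}
```

Empty output corresponds to the "all tasks are appropriately sized" branch; otherwise each output line gives a task ID and its story points.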

**If no large tasks found:**
> "All tasks are appropriately sized (≤5 story points). Skipping refinement."

Skip to Completion Summary.

**If large tasks found:**

Show user:
> "## Task Size Analysis
>
> Found **X tasks** with estimated effort > 5 story points:
>
> | Task | Title | Story Points |
> |------|-------|--------------|
> | TASK-003 | [Title] | 8 |
> | TASK-007 | [Title] | 13 |
>
> Large tasks are harder to implement and verify. Would you like to auto-refine them into smaller subtasks?"

Use AskUserQuestion with options:
- "Yes, auto-refine large tasks"
- "No, keep current breakdown"

**If user chooses No:** Skip to Completion Summary.

### 8.2: Initialize Refinement Loop

Create state file:
```bash
mkdir -p .claude
cat > .claude/shipspec-planning-refine.local.md << EOF
---
active: true
feature: [FEATURE_DIR]
iteration: 1
max_iterations: 3
large_tasks:
- TASK-003
- TASK-007
tasks_refined: 0
---

## Task Refinement for [FEATURE_DIR]

Continue refining large tasks in \`.shipspec/planning/[FEATURE_DIR]/TASKS.md\`.

### Instructions:
1. Read TASKS.md and identify tasks with story points > 5
2. For each large task, break it into 2-3 smaller subtasks (each ≤3 story points)
3. Update TASKS.md with the new subtasks
4. Preserve acceptance criteria across subtasks
5. Update dependencies pointing to the original task

### Completion:
When all tasks are ≤5 story points, output:
\`<planning-refine-complete>\`
EOF
```

### 8.3: Refine Each Large Task

For each task in large_tasks:

1. Get full task details from TASKS.md
2. Delegate to `task-planner` agent:
> "Break down TASK-XXX into 2-3 subtasks.
>
> Original task:
> [task prompt]
>
> Requirements:
> - Each subtask should be ≤ 3 story points
> - Preserve the original acceptance criteria distributed across subtasks
> - Maintain dependency relationships
> - Use format TASK-XXX-A, TASK-XXX-B, etc. for subtask IDs"

3. Replace original task with generated subtasks in TASKS.md
4. Update dependencies pointing to original task
5. Run task-manager validate to confirm the changes introduced no circular dependencies

**If validation fails:**
- Rollback changes
- Mark task as "cannot refine"
- Continue to next large task

### 8.4: Check Completion

After processing all large tasks:

1. Re-analyze TASKS.md for tasks > 5 points
2. If large tasks remain AND iteration < max_iterations:
   - Increment iteration
   - Add new large tasks to list
   - Return to 8.3
3. If no large tasks remain OR iteration has reached max_iterations:
   - Clean up state file
   - Show summary
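
The branch between another pass and cleanup can be sketched as follows. The function name `refinement_next_step` and its result words are hypothetical, the count of remaining large tasks is assumed to be computed elsewhere, and updating the `large_tasks` list is left out of the sketch.

```bash
# Hypothetical helper: decide the next step of the refinement loop.
# $1 = number of tasks still above 5 story points, $2 = state file
refinement_next_step() {
  remaining=$1
  state_file=$2
  iteration=$(sed -n 's/^iteration:[[:space:]]*//p' "$state_file" | head -n 1)
  max=$(sed -n 's/^max_iterations:[[:space:]]*//p' "$state_file" | head -n 1)

  if [ "$remaining" -gt 0 ] && [ "$iteration" -lt "$max" ]; then
    # Rewrite the iteration line through a temp file (portable, no sed -i).
    next=$((iteration + 1))
    sed "s/^iteration:.*/iteration: $next/" "$state_file" > "$state_file.tmp" &&
      mv "$state_file.tmp" "$state_file"
    echo "continue"   # return to 8.3
  else
    rm -f "$state_file"   # clean up state file before the summary
    echo "done"
  fi
}
```

With the default `max_iterations: 3`, a feature that still has large tasks after the third pass ends with `done`, and the remaining tasks are reported in the summary rather than refined further.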

### 8.5: Summary

> "## Task Refinement Complete
>
> **Results:**
> - Original large tasks: X
> - Successfully refined: Y
> - Could not refine: Z
>
> **New task count:** N (was M)"

**Output completion marker:**
`<planning-refine-complete>`

Clean up:
```bash
rm -f .claude/shipspec-planning-refine.local.md
```

---

## Completion Summary

After all phases complete, provide: