**pilot/skills/create-skill/orchestrator.md** — 11 additions, 0 deletions

# /create-skill — Skill Creator
**Create a reusable skill.** Provide a topic or workflow description, and this command explores the codebase, gathers relevant patterns, and builds a well-structured skill interactively with you. If no topic is given, it evaluates the current session for extractable knowledge.
## Editing existing skills
For NEW skills, Step 6 runs the with-skill vs baseline subagent comparison — no skill ships without it.
For EDITS, classify the change first:
- **Behavioural** — adds/removes a step, changes a rule, reorders critical sections, edits the description, changes Iron Laws / red flags / rationalization tables, modifies trigger keywords. **Re-run Step 6 (or write 2–3 prompts if none exist).** These changes shift trigger accuracy and step compliance.
- **Cosmetic** — typo, prose polish, link fix, formatting, example clarification with no semantic shift. Skip the test re-run.
When uncertain, treat as behavioural. Skill changes that go unverified are how skill quality drifts.
---

**pilot/skills/create-skill/steps/07-anti-patterns.md** — 10 additions, 0 deletions

| Anti-pattern | Symptom | Fix |
| --- | --- | --- |
| **Overtriggering** | Skill loads for irrelevant queries, users disabling it | Add negative triggers ("Do NOT use for..."), be more specific |
| **Instructions not followed** | Claude loads skill but ignores steps | Instructions too verbose (condense), buried (move critical to top), or ambiguous (use exact commands) |
| **Large context issues** | Skill seems slow or responses degraded | Move detailed docs to `references/`, keep SKILL.md under 5,000 words |
For skills that enforce discipline (TDD, verification, root-cause-before-fix), use baseline testing (Step 6.2) to capture verbatim agent rationalizations and counter them with three patterns:
- **Excuse → Reality table** — each rationalization the agent produced in baseline, paired with the technical counter.
- **Iron Laws + Red Flags list** — short numbered laws and a symptom-checklist that points back to "STOP, return to step N".
- **Closed loopholes** — don't just state the rule, forbid specific workarounds (e.g. "Don't keep code as 'reference'. Delete means delete.").
Skip these for technique/reference skills — they don't have rationalization pressure to defend against.
---

Look for the commit that introduced the bug. If recent, read that diff. If nothing obvious, skip.
**Bisect when `git log` doesn't reveal it.** If the bug appeared between two known-good and known-bad states and the suspect commit isn't obvious, run `git bisect start <bad> <good>` then `git bisect run <test-cmd>` against the reproducing test you'll write in Step 2 — this pinpoints the introducing commit automatically. Skip when the surface area is small enough that a single read finds it.
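A minimal sketch of that bisect loop, using a throwaway repo in place of yours (all paths, commit messages, and the `BUG` marker are invented for illustration — your run command is the reproducing test from Step 2):

```shell
# Throwaway repo where commit 3 of 5 introduces the "bug" (a BUG line),
# so `git bisect run` can pinpoint it automatically.
set -e
dir=$(mktemp -d); cd "$dir"
git init -q
git config user.email bisect@example.com
git config user.name bisect
for i in 1 2 3 4 5; do
  echo "change $i" >> app.txt
  if [ "$i" = 3 ]; then echo "BUG" >> app.txt; fi
  git add app.txt
  git commit -qm "commit $i"
done
git bisect start HEAD HEAD~4            # <bad> <good>
# The run command must exit 0 on good commits and non-zero on bad ones;
# here "bad" means the BUG marker is present.
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null
first_bad=$(git show -s --format=%s refs/bisect/bad)
echo "first bad: $first_bad"
```

The exit-code contract is the whole trick: any reproducing test that exits non-zero on failure — a unit test, a `curl` check, a grep — can drive the bisect.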
### 1.3 Trace to root cause
**Start with `codegraph_context(task="<bug description>")`** — single call, returns entry points and related symbols. Read it carefully.
For bugs that don't surface clearly through stack traces or static reading — U…
**Mark every temporary log with `SPEC-DEBUG:`** (e.g. `console.log("SPEC-DEBUG: filters=", filters)`). Step 3.5 greps for this marker — any unremoved match fails the diff sanity check.
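That grep can be sketched as a one-liner that fails (non-zero exit) when any marker survives — the `src/` tree and file contents below are illustrative stand-ins, not a fixed convention:

```shell
# Illustrative tree: one file that is clean after log cleanup.
mkdir -p src
echo 'console.log("filters ok")' > src/example.js

# Fails (non-zero exit) if any SPEC-DEBUG marker survived cleanup.
! grep -rn "SPEC-DEBUG:" src/
```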
**Unknown caller?** If the bug is in shared code reached from many sites and you don't know which caller triggers it, capture the call chain inline:
```js
// SPEC-DEBUG: who is calling this with bad input?
// (sketch — `input` stands in for whatever value arrives malformed)
console.trace("SPEC-DEBUG: caller of this function, input =", input);
```
Run, read the stack, identify the offending caller, then trace upward to the original trigger.
**Performance regressions are different.** Value-logging is the wrong tool — replace it with baseline measurement: timing harness, `performance.now()`, profiler, or query plan / `EXPLAIN`. Establish current vs expected timing first, then bisect against that signal (Step 1.2). Measure first, fix second.
Skip 1.4 when the stack trace already names the failing line, or when a static read of the file is enough to see the bug. Skip is the default.
### 1.5 State the root cause
Out loud, in one sentence to the user, before writing any test:
If confidence is Low: bail out. Don't guess in the quick lane.
### 1.6 Lock in a fast signal before Step 2
Your reproducing signal — what runs in <30s and definitively shows fail/pass — must be **fast and deterministic** before you write the fix. For most bugs that's the unit test you're about to write in Step 2. For UI / integration / async bugs the unit-test seam may be wrong, and your real signal is a `curl`, CLI invocation, or headless-browser command (the same one you'll run in Step 4 E2E).
Whichever it is, sharpen it now:
- **Slow loop (>30s)?** Narrow the test scope, skip unrelated setup, cache fixtures. A flaky 30s loop is the slowest path to a fix.
- **Flaky?** Pin time, seed RNG, isolate filesystem, freeze network. For non-deterministic bugs, raise the reproduction rate (loop the trigger 50–100×, parallelise) until it's debuggable — a 1% flake is not.
- **Wrong symptom?** The signal must fail with the **user's** reported symptom, not a different failure that happens to be nearby. Wrong bug = wrong fix.
If you cannot get a fast deterministic signal after one pass of sharpening, that's a bail-out trigger — tell the user to use `/spec`. Don't hypothesise into a flaky loop.
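Raising the reproduction rate can be as simple as looping the trigger. A sketch — `flaky()` is a deterministic stand-in that simulates a failure on every 50th call; in practice `trigger` is whatever operation reproduces the bug:

```javascript
// Loop a flaky trigger until the failure rate is high enough to debug.
// `trigger` is any function that throws when the bug reproduces.
function failureCount(trigger, runs = 100) {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    try { trigger(); } catch { failures++; }
  }
  return failures;
}

// Deterministic stand-in: "fails" on every 50th call.
let calls = 0;
const flaky = () => { if (++calls % 50 === 0) throw new Error("flake"); };
console.log(failureCount(flaky)); // 2 — reproduced twice in 100 runs
```

Once the count is reliably non-zero per batch, you have a signal tight enough to bisect or assert against.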
### Red flags — STOP and re-investigate
- "Quick fix for now, investigate later" → STOP, this IS the quick lane.
---

**pilot/skills/fix/steps/03-fix.md** — 1 addition, 1 deletion

If the buggy data flows from upstream, fix upstream. Red flags in the diff:
- Early return that hides wrong state from the caller.
- Renamed/suppressed log lines that previously surfaced the bug.
**Quick-lane fixes are single-site.** If you find yourself adding validation at 2+ layers (entry + business + storage, etc.) to make the bug "structurally impossible," you're in `/spec` territory — bail out. Defense-in-depth is a feature of `/spec`, not `/fix`. The one exception: a single boundary check that prevents the same root cause from re-entering through one other obvious path — document it in the Step 6 summary.
---

- [ ] Diff is small and every changed line traces to the bug (lineage rule).
- [ ] Worktree mode: single bundled `fix:` commit. Non-worktree: changes ready, no commit yet.
If any box is unchecked, do not write the report and do not ask for approval — fix the gap first.
### 6.5 Report
```
Bugfix complete — <bug>.
…
```

Run /clear before starting new work — this resets context while keeping projec…
The `E2E:` line is **mandatory** — it documents that the actual program was exercised, not just the unit tests.
### 6.6 Post-mortem flag (optional, one line)
Ask once, now that you have more information than when you started: **what would have prevented this bug?** If the answer is architectural — no clean test seam, hidden coupling between modules, validation absent at the boundary the bad data crossed, repeated near-miss in the same area — name it as a `/spec` follow-up candidate in one line:
```
Follow-up (architectural): <one-line description> — candidate for /spec.
```
Skip when the answer is "nothing structural, it was a one-line typo / off-by-one / wrong default." Don't manufacture follow-ups.
---

**pilot/skills/prd/orchestrator.md** — 6 additions, 0 deletions

# /prd - Generate Product Requirements Documents
<HARD-GATE>
Do NOT invoke `/spec`, `/spec-plan`, `/spec-implement`, write any code, scaffold any project, or take any implementation action until you have written a PRD and the user has approved it. This applies to EVERY idea regardless of perceived simplicity.
`/prd`'s output is a written PRD at the path determined in Step 6 (write-prd). The terminal state is offering hand-off to `/spec` and waiting for the user. The skill does not invoke implementation skills directly.
</HARD-GATE>
**Strategic thought partner and brainstorming surface** — turns vague ideas into concrete Product Requirements Documents (PRDs) through one-on-one conversation, with optional research. Produces a PRD that can be handed off directly to `/spec` for implementation.
---

**pilot/skills/prd/steps/03-clarify.md** — 2 additions, 0 deletions

**⛔ Skip obvious questions.** Do not ask anything already answered by the one-line idea, the codebase exploration in Step 1, or earlier answers in this conversation. The goal is to surface what the user hasn't thought about yet, not to collect a standard intake form.
**⛔ Code-first rule.** Before each question, ask "can the codebase answer this?" If yes — read the code. Use `codegraph_context`, `codegraph_search`, `codegraph_explore`, or Probe to resolve "how does X currently work / where does Y live / what's the existing pattern for Z". Only ask the user about things code can't tell you: purpose, priorities, audience, constraints, scope boundaries, behavioural expectations not yet encoded. Asking the user about facts the code already encodes wastes their time and signals you didn't explore.
Coverage areas (ask only where genuinely unclear):
- **Purpose** — what's the core outcome the user wants?
- **Users** — who benefits from this? What's their workflow today?
---

**pilot/skills/setup-rules/steps/01-reference.md** — 1 addition, 0 deletions

### Guidelines
6
6
7
7
- **Always use AskUserQuestion** when asking the user anything
- **Explainer-first prompts** — when prompting the user to choose between options, lead with one short paragraph that names what the term means, why these skills need it, and what changes if they pick differently. Then show the choices and the recommended default. Assume the user does not know the term — never present `paths` frontmatter, MCP scoping, or rule-vs-skill distinctions as a multiple-choice without context. Walk them through one section at a time, not all sections at once.
- **Read before writing** — check existing rules before creating
- **Write concise rules** — every word costs tokens in context
---

2. AskUserQuestion:
   - "Yes, proceed" — Approve as-is and start spec-implement
   - "No, I have feedback" — I've annotated in the Console or edited the plan file; process my feedback
The user can pause at this prompt, annotate in the Console's Specifications tab (auto-saves), or edit the plan file directly, then pick option 2. No "ready" handshake required.
3. **Yes:** Set `Approved: Yes`, invoke `Skill(skill='spec-implement', args='<plan-path>')`
**No, I have feedback:** Re-run Step 6 (process Console annotations), re-read the plan file (in case the user edited it), then return to Step 7 and ask again.
**Other free-text feedback:** Incorporate the changes into the plan, then re-ask with a fresh AskUserQuestion.