22 changes: 21 additions & 1 deletion course/units/en/_toctree.yml
@@ -32,7 +32,27 @@
- title: "2. Using Skills with Coding Agents"
sections:
- local: unit2/introduction
-      title: Skills Across Agent Platforms
+      title: Using Skills with Claude Code
- local: unit2/skills-in-claude-code
title: Skills in Claude Code
- local: unit2/hands-on-debugging-activation
title: "Hands-on: Did Your Skill Fire?"
- local: unit2/inside-the-box
title: "Inside the Box: Load, Select, Compose"
- local: unit2/the-real-sdk
title: The Real SDK
- local: unit2/hands-on-build-space
title: "Hands-on: Build the Space"
- local: unit2/hands-on-iterate
title: "Hands-on: Fix It Mid-Session"
- local: unit2/skill-creator-install
title: "The skill-creator Skill: Install and Tour"
- local: unit2/skill-creator-loop
title: "The skill-creator Loop"
- local: unit2/hands-on-optimize-description
title: "Hands-on: Optimize the Description"
- local: unit2/publish-and-conclusion
title: Publish and Conclusion

- title: "3. Sharing Skills"
sections:
Binary file added course/units/en/unit2/assets/baseline.png
Binary file added course/units/en/unit2/assets/brutalist.png
Binary file added course/units/en/unit2/assets/hfbrand.png
93 changes: 93 additions & 0 deletions course/units/en/unit2/hands-on-build-space.mdx
@@ -0,0 +1,93 @@
# Hands-on: Build the Space

You fine-tuned a model in Unit 1. Time to give it a demo people can actually click on.

## The Setup

Clone the starter and install the brand skill:

```bash
git clone https://github.com/huggingface/skills-course-unit2-starter
cd skills-course-unit2-starter
cp -r skills/hf-brand .claude/skills/
```

> [!NOTE]
> If you didn't complete the Unit 1 fine-tune (it requires HF Pro), that's fine — the scaffold has a mock `predict()` function. You can swap in your real model later, or use any Hub model.
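For reference, the scaffold's keyword-based mock might look something like this — an illustrative sketch, since `app_scaffold.py` isn't reproduced here and its actual `predict()` may differ:

```python
def predict(text: str) -> dict:
    """Mock sentiment inference: counts positive vs. negative keywords.
    Stands in for the Unit 1 fine-tuned model until you swap it in."""
    positive = {"great", "love", "excellent", "good", "amazing"}
    negative = {"bad", "terrible", "awful", "hate", "broken"}
    words = set(text.lower().split())
    pos, neg = len(words & positive), len(words & negative)
    if pos > neg:
        return {"label": "POSITIVE", "score": 0.9}
    if neg > pos:
        return {"label": "NEGATIVE", "score": 0.9}
    return {"label": "NEUTRAL", "score": 0.5}
```

Anything shaped like `text in → label out` works here; the UI Claude builds around it doesn't care whether the model is real.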

## What the Brand Skill Does

Open `.claude/skills/hf-brand/SKILL.md`. It's about 60 lines encoding Hugging Face's visual identity:

- Primary yellow `#FFD21E`, orange `#FF9D00` hover
- The 🤗 emoji as the brand mark
- Source Sans Pro font, soft rounded corners
- Warm, welcoming copy — "Try it out!" not "ENTER TEXT"
- The Gradio 6 theme config, pre-written

It's a **brand guideline in skill form**. Claude reads it and builds accordingly.
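A skill like this is just frontmatter plus guidance. The top of `hf-brand/SKILL.md` might read roughly like this — the colors and fonts are from the skill as described above, but the exact wording is illustrative:

```markdown
---
name: hf-brand
description: Apply Hugging Face brand styling to Gradio demos and Spaces.
  Use when building a Gradio app, demo Space, or model UI.
---

# Hugging Face Brand

- Primary: `#FFD21E` (yellow); hover: `#FF9D00` (orange)
- Brand mark: the 🤗 emoji in titles
- Font: Source Sans Pro; soft rounded corners
- Copy tone: warm and welcoming ("Try it out!", never "ENTER TEXT")
```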

## Build It

Open Claude Code in the starter directory:

```bash
claude
```

Ask:

```
Build a Gradio demo for the sentiment classifier in app_scaffold.py.
Keep the mock predict() function — just build a proper UI around it.
Three example inputs. Make it ready to deploy as a Space.
```

You didn't say "use the HF brand skill." Claude matched the request ("Gradio", "Space") to the skill's description and loaded it.

## What You'll See

With the skill loaded, the app Claude builds looks like this:

![HF-branded demo](./assets/hfbrand.png)

🤗 in the title. Yellow button with dark text. Soft rounded cards. Copy like *"Paste any review and see what the model thinks."*

<details>
<summary>What it looks like without the skill</summary>

Same prompt, same model, no brand skill:

![Baseline demo](./assets/baseline.png)

It works. It's fine. It also looks like every other Gradio app on the Hub — generic blue, random emoji, default theme.

</details>

## Run It

```bash
pip install gradio
python app.py
```

Open `http://localhost:7860`. Click around.

## Try a Different Aesthetic

The starter also includes `skills/brutalist/` — monochrome, monospace, hard edges, ALL CAPS labels. Swap it in:

```bash
rm -r .claude/skills/hf-brand
cp -r skills/brutalist .claude/skills/
```

`/reload-plugins` in Claude Code, then ask for a rebuild. Same app, different personality:

![Brutalist demo](./assets/brutalist.png)

`SENTIMENT CLASSIFIER` in heavy black caps. `PASTE A REVIEW. RECEIVE A VERDICT.` Zero rounded corners.

Same 60-odd lines of skill. Completely different output.

Next: what to do when it's *almost* right.
92 changes: 92 additions & 0 deletions course/units/en/unit2/hands-on-debugging-activation.mdx
@@ -0,0 +1,92 @@
# Hands-on: Did Your Skill Fire?

You built `dataset-publisher` in Unit 1. Let's find out if it works.

## Step 1: Install It

Copy your skill into Claude Code's skills directory:

```bash
mkdir -p ~/.claude/skills/dataset-publisher
cp path/to/your/unit1/dataset-publisher/SKILL.md ~/.claude/skills/dataset-publisher/
```

If you have the helper script and template from Unit 1, copy those too — they go in the same directory.

## Step 2: Ask Without Naming It

Open Claude Code in an empty directory:

```bash
mkdir ~/skill-test && cd ~/skill-test
claude
```

Now ask something your skill *should* handle — but don't say the skill's name:

```
I want to share about 20 Python coding questions and answers with my team
in a structured way. What's a good approach?
```

Watch the response.

## Step 3: Diagnose

Three things can happen.

### It activated

You'll see Claude follow your skill's workflow — initialize repo, define structure, add data, write card. The response references the specific steps you wrote. **This is success.** Your description was clear enough.

### It didn't activate

Generic advice. "You could use a CSV file, or a shared doc, or..." No mention of the Hub, no structured workflow.

The problem is almost always the description. Unit 1 gave you:

```yaml
description: Create, populate, and publish datasets on the Hugging Face Hub
```

That's a *definition*, not a *trigger*. It says what the skill does, not when to use it. Claude reads "share Q&A with my team" and doesn't see a match.

Rewrite it to include trigger phrases:

```yaml
description: Create and publish datasets to the Hugging Face Hub. Use when
the user wants to share structured data, publish a dataset, upload examples
for training, or create a dataset card. Triggers on "dataset", "publish to
Hub", "share training data", "Q&A pairs".
```

Save the file. In Claude Code, reload:

```
/reload-plugins
```

Ask again. Different response.

### You can't tell

Ask directly:

```
Which skills did you consider or load for that last response?
```

Claude will tell you. If `dataset-publisher` isn't mentioned, it didn't fire.

You can also run `/context` to see what's loaded in the context window.

## What You Just Learned

The skill you wrote in Unit 1 was *correct* — the format was right, the instructions were good. But it probably didn't activate on a natural question, because the description was written for humans reading a catalog, not for an agent matching intent.

This is the single most common reason skills don't work. And now you know how to fix it.

> [!TIP]
> **A good description has two parts:** what the skill does (one sentence) and when to use it (trigger phrases, situations, keywords). Unit 1 taught the first part. The second part is what makes it fire.

Next: build something with a skill that's already tuned to activate well.
84 changes: 84 additions & 0 deletions course/units/en/unit2/hands-on-iterate.mdx
@@ -0,0 +1,84 @@
# Hands-on: Fix It Mid-Session

The Space is close. But something's off.

## The Annoyance

Look at what Claude generated. Somewhere in there — probably at the bottom — there's something like:

```python
gr.Markdown(
"⚠️ **Note:** This demo uses keyword-based mock inference for "
"illustration purposes. In a production system, replace this with "
"a real NLP model (e.g., a fine-tuned BERT or RoBERTa classifier)."
)
```

You didn't ask for that. Claude adds disclaimers by default because it's being careful. Sometimes you want that. Right now you don't.

You could say "remove the disclaimer." That fixes it once. Next time you ask Claude to edit the app, the disclaimer might come back.

## Write the Skill — Right Now

Don't close Claude Code. In a second terminal:

```bash
mkdir -p .claude/skills/no-fluff
```

Create `.claude/skills/no-fluff/SKILL.md`:

```markdown
---
name: no-fluff
description: Strip disclaimers and filler from demos. Use when building
Gradio apps, demo Spaces, or example code.
---

# No Fluff

When generating demo code:

- No "⚠️ Note: this is a demo" footers
- No "In production, you would..." disclaimers
- No TODO placeholders unless asked
- Trust the user knows it's a demo

The user is a developer. They understand the difference between a demo
and production. Let the code speak.
```

Twelve lines. Thirty seconds.

## Reload and Continue

Back in Claude Code — the *same* session, same conversation:

```
/reload-plugins
```

Then:

```
Regenerate app.py. Same requirements.
```

The disclaimer is gone. And it stays gone, because the skill loads every time Claude touches this project.

## What Just Happened

This is how skills actually get written. Not in advance, from a spec. When Claude does the annoying thing for the third time and you decide you're done explaining.

The pattern:

1. Claude does X
2. You say "don't do X"
3. Next session, Claude does X again
4. You write a skill: "don't do X"
5. It stops

> [!TIP]
> Skills you write this way tend to be small and specific. That's good. A skill that fixes one thing reliably is more useful than a skill that tries to fix everything and sometimes misfires. You'll accumulate many small ones over time.

Now let's ship it.
65 changes: 65 additions & 0 deletions course/units/en/unit2/hands-on-optimize-description.mdx
@@ -0,0 +1,65 @@
# Hands-on: Optimize the Description

Your skill works when it's loaded. The question is whether Claude loads it when it should.

## Why This Step Exists

The `description` field in your frontmatter is the only thing Claude sees when deciding whether to consult a skill. Not the body, not the scripts — just the description. If it's vague, the skill never fires. If it's too broad, it fires when it shouldn't.

Unit 1 gave you descriptions like *"Create, populate, and publish datasets on the Hugging Face Hub"* — accurate but unhelpful for matching. A real user asking "help me share some Q&A with my team" won't trigger that. The description says what the skill *is*, not when to *use* it.

skill-creator has an automated loop to fix this.

## Generate Trigger Queries

In Claude Code:

```
Let's optimize the description for triggering.
```

Claude generates 20 eval queries — half should trigger, half shouldn't. The tricky part: the should-not-trigger cases need to be *near-misses*, not obviously irrelevant. "Write a fibonacci function" as a negative for a PDF skill is too easy. "Show me the text on page 3" (simple enough to handle without a skill) is a real test.

## Review the Queries

Claude opens an HTML form in your browser. You can edit queries, flip should/shouldn't, add or remove. When you're done, click **Export Eval Set** — it downloads `eval_set.json` to `~/Downloads/`.

This step matters. Bad eval queries → bad optimized description. Spend a minute here.
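The export's exact schema isn't documented here, but assuming a shape along these lines — one entry per query, flagged with the expected outcome — editing by hand is also an option:

```json
[
  { "query": "I want to share 20 Q&A pairs with my team", "should_trigger": true },
  { "query": "Show me the text on page 3", "should_trigger": false }
]
```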

## Run the Loop

```bash
python -m scripts.run_loop \
--eval-set ~/Downloads/eval_set.json \
--skill-path ./your-skill \
--model claude-sonnet-4-6 \
--max-iterations 5 \
--verbose
```

This runs in the background. What it does:

1. Splits your eval set 60/40 into train and held-out test
2. Evaluates the current description — running each query 3 times and measuring trigger rate
3. Asks Claude to rewrite the description based on what failed
4. Re-evaluates the new description on both train *and* test
5. Repeats up to 5 times
6. Picks the best by **test** score, not train — to avoid overfitting
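The selection logic boils down to a small loop. A sketch under stated assumptions — the real `scripts.run_loop` internals aren't shown here, so every name and signature below is illustrative:

```python
import random

def optimize_description(initial, eval_set, rewrite, trigger_rate,
                         max_iterations=5, seed=0):
    """Illustrative sketch of the optimization loop — not the real
    scripts.run_loop API. `rewrite(desc, queries)` proposes a new
    description; `trigger_rate(desc, queries)` scores one."""
    rng = random.Random(seed)
    queries = list(eval_set)
    rng.shuffle(queries)
    cut = int(len(queries) * 0.6)               # 60/40 train/test split
    train, test = queries[:cut], queries[cut:]

    best_desc, best_test = initial, trigger_rate(initial, test)
    desc = initial
    for _ in range(max_iterations):
        desc = rewrite(desc, train)             # rewrite from train failures
        score = trigger_rate(desc, test)
        if score > best_test:                   # select on held-out score,
            best_desc, best_test = desc, score  # not train, to avoid overfitting
    return best_desc, best_test
```

The key design choice is in the last comparison: a description that climbs on train but slips on test is discarded, which is exactly the iteration-3 outcome in the report below.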

It opens an HTML report when done. You'll see something like:

```
Iteration 0 (original): train 65% test 60%
Iteration 1: train 80% test 75%
Iteration 2: train 85% test 80% ← best
Iteration 3: train 90% test 75% (overfit — discarded)
```

## Apply It

The output JSON has a `best_description` field. Claude updates your `SKILL.md` frontmatter with it. Ask Claude to show you the before/after so you can see what changed.

> [!TIP]
> **What a good optimized description looks like:** It's pushier than you'd expect. Not just "helps with X" but "Use this whenever the user mentions X, Y, or Z, even if they don't explicitly ask." Claude undertriggers by default — the description has to compensate.

Your skill now fires when it should. Time to package it.