fix: improve skill routing for Claude 4.5 and reduce test cost #1080

Merged
kvenkatrajan merged 30 commits into main from fix/skill-invocation-claude45-v2
Mar 3, 2026

@kvenkatrajan
Collaborator

Problem

Skill invocation rate dropped for Claude 4.5. The model bypasses skills by running commands directly, routes to azure-prepare instead of azure-deploy, or investigates rather than acting.

Root Cause

Claude 4.5 treats prohibitions (NEVER, FORBIDDEN) as suggestions and optimizes for the shortest path to the goal. Descriptions using negative constraints don't reliably steer tool selection.

Changes

Skill Descriptions (capability-claim framing)

  • azure-deploy/SKILL.md: Reframed description as capability claim ("This skill runs azd up, azd deploy, terraform apply..."), narrowed triggers to already-prepared apps, added scope rule
  • azure-prepare/SKILL.md: Added "Preparation ONLY" scope boundary, rule 8 for scope enforcement, handoff marker in Phase 2
  • azure-validate/SKILL.md: Added handoff language directing to azure-deploy after validation passes

Test Prompts

  • Updated 2 deploy skill-invocation prompts with anti-routing signals ("already has azure.yaml", "infrastructure already set up")

Test Cost

  • Reduced `RUNS_PER_PROMPT` from 5 to 1 in both azure-deploy and azure-prepare integration tests. CI runs 3x/day across scheduled runs, providing statistical signal over time without redundant per-job repetition.
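
The trade-off above can be made concrete with a little arithmetic. This is a minimal sketch, not code from this PR: `RunResult`, `samplesPerPrompt`, and `invocationRate` are hypothetical names, and the outcomes are made up for illustration. Only the values 1 (runs per prompt) and 3 (scheduled runs per day) come from the description.

```typescript
// Hypothetical record of one run of one prompt in a scheduled CI job.
interface RunResult {
  prompt: string;
  skillInvoked: boolean;
}

const RUNS_PER_PROMPT = 1; // value after this PR (previously 5)
const SCHEDULED_RUNS_PER_DAY = 3; // from the PR description

// Number of samples collected per prompt over a window of days.
function samplesPerPrompt(days: number): number {
  return RUNS_PER_PROMPT * SCHEDULED_RUNS_PER_DAY * days;
}

// Fraction of runs in which the expected skill was invoked,
// aggregated across jobs rather than within a single job.
function invocationRate(results: RunResult[]): number {
  if (results.length === 0) return 0;
  const hits = results.filter((r) => r.skillInvoked).length;
  return hits / results.length;
}

// One week of made-up outcomes: every seventh run misses the skill.
const week: RunResult[] = Array.from({ length: samplesPerPrompt(7) }, (_, i) => ({
  prompt: "deploy my app",
  skillInvoked: i % 7 !== 0,
}));

console.log(samplesPerPrompt(7)); // 21
console.log(invocationRate(week).toFixed(2)); // 0.86
```

With one run per job, a week of scheduled runs still yields 21 samples per prompt, which is the "signal over time" the description relies on.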

Snapshots

  • Updated trigger test snapshots for all 3 skills to match new descriptions

Test Results

  • 103/103 unit + trigger tests pass
  • All skill-invocation tests pass locally

- azure-deploy: capability-claim description ('This skill runs azd up...'),
  narrowed triggers to already-prepared apps, added scope rule
- azure-prepare: added 'Preparation ONLY' scope boundary, rule 8 for
  scope enforcement, handoff marker in Phase 2
- azure-validate: added handoff language to azure-deploy in description
- Tests: updated 2 deploy prompts with anti-routing signals
- Updated snapshots for all 3 skills
CI runs 3x/day across scheduled runs, providing statistical signal over time.
Running each prompt 5x per CI job was redundant cost with no added reliability.
Copilot AI review requested due to automatic review settings February 28, 2026 02:36
@github-actions
Contributor

github-actions bot commented Feb 28, 2026

🔍 Token Analysis Report

@github-copilot-for-azure/scripts@1.0.0 tokens
node --import tsx src/tokens/cli.ts compare --base origin/main --head HEAD --markdown

📊 Token Change Report

Comparing origin/main → HEAD

Summary

| Metric | Value |
| --- | --- |
| 📈 Total Change | +387 tokens (+5%) |
| Before | 8,348 tokens |
| After | 8,735 tokens |
| Files Changed | 4 |

Changed Files

| File | Before | After | Change |
| --- | --- | --- | --- |
| plugin/skills/azure-deploy/SKILL.md | 1,142 | 1,355 | +213 (+19%) |
| plugin/skills/azure-prepare/SKILL.md | 1,897 | 2,067 | +170 (+9%) |
| tests/README.md | 3,976 | 3,978 | +2 (0%) |
| tests/azure-prepare/eval/README.md | 1,333 | 1,335 | +2 (0%) |

@github-copilot-for-azure/scripts@1.0.0 tokens
node --import tsx src/tokens/cli.ts check --markdown

📊 Token Limit Check Report

Checked: 448 files
Exceeded: 133 files

⚠️ Files Exceeding Token Limits

File Tokens Limit Over By
.github/skills/file-test-bug/SKILL.md 628 500 +128
.github/skills/sensei/README.md 3530 1000 +2530
.github/skills/sensei/SKILL.md 2382 500 +1882
.github/skills/sensei/references/EXAMPLES.md 3707 1000 +2707
.github/skills/sensei/references/LOOP.md 4181 1000 +3181
.github/skills/sensei/references/SCORING.md 3927 1000 +2927
.github/skills/sensei/references/TOKEN-INTEGRATION.md 1094 1000 +94
.github/skills/skill-authoring/SKILL.md 817 500 +317
plugin/skills/appinsights-instrumentation/SKILL.md 965 500 +465
plugin/skills/azure-ai/SKILL.md 846 500 +346
plugin/skills/azure-ai/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-aigateway/SKILL.md 1294 500 +794
plugin/skills/azure-aigateway/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-aigateway/references/patterns.md 1696 1000 +696
plugin/skills/azure-aigateway/references/policies.md 2342 1000 +1342
plugin/skills/azure-aigateway/references/troubleshooting.md 1971 1000 +971
plugin/skills/azure-cloud-migrate/references/services/functions/assessment.md 1601 1000 +601
plugin/skills/azure-cloud-migrate/references/services/functions/code-migration.md 1515 1000 +515
plugin/skills/azure-cloud-migrate/references/services/functions/lambda-to-functions.md 2600 1000 +1600
plugin/skills/azure-cloud-migrate/references/services/functions/runtimes/csharp.md 1403 1000 +403
plugin/skills/azure-cloud-migrate/references/services/functions/runtimes/java.md 1638 1000 +638
plugin/skills/azure-cloud-migrate/references/services/functions/runtimes/javascript.md 2181 1000 +1181
plugin/skills/azure-cloud-migrate/references/services/functions/runtimes/powershell.md 1261 1000 +261
plugin/skills/azure-cloud-migrate/references/services/functions/runtimes/python.md 1632 1000 +632
plugin/skills/azure-compliance/SKILL.md 1250 500 +750
plugin/skills/azure-compliance/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-compliance/references/azqr-recommendations.md 1447 1000 +447
plugin/skills/azure-compliance/references/azqr-remediation-patterns.md 1987 1000 +987
plugin/skills/azure-compliance/references/azure-keyvault-expiration-audit.md 1286 1000 +286
plugin/skills/azure-compliance/references/azure-quick-review.md 1268 1000 +268
plugin/skills/azure-compute/SKILL.md 2631 500 +2131
plugin/skills/azure-compute/references/retail-prices-api.md 1609 1000 +609
plugin/skills/azure-compute/references/vm-families.md 1234 1000 +234
plugin/skills/azure-compute/references/vmss-guide.md 1621 1000 +621
plugin/skills/azure-cost-optimization/SKILL.md 3468 500 +2968
plugin/skills/azure-cost-optimization/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-deploy/SKILL.md 1355 500 +855
plugin/skills/azure-deploy/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-deploy/references/pre-deploy-checklist.md 1195 1000 +195
plugin/skills/azure-deploy/references/recipes/azd/ef-migrations.md 1318 1000 +318
plugin/skills/azure-deploy/references/recipes/azd/errors.md 1212 1000 +212
plugin/skills/azure-deploy/references/recipes/azd/sql-managed-identity.md 1190 1000 +190
plugin/skills/azure-deploy/references/troubleshooting.md 1527 1000 +527
plugin/skills/azure-diagnostics/SKILL.md 1077 500 +577
plugin/skills/azure-hosted-copilot-sdk/SKILL.md 671 500 +171
plugin/skills/azure-hosted-copilot-sdk/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-hosted-copilot-sdk/references/azure-model-config.md 1151 1000 +151
plugin/skills/azure-kusto/SKILL.md 2175 500 +1675
plugin/skills/azure-messaging/SKILL.md 867 500 +367
plugin/skills/azure-messaging/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-messaging/references/service-troubleshooting.md 1044 1000 +44
plugin/skills/azure-observability/SKILL.md 1048 500 +548
plugin/skills/azure-observability/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-prepare/SKILL.md 2067 500 +1567
plugin/skills/azure-prepare/references/analyze.md 1038 1000 +38
plugin/skills/azure-prepare/references/apim.md 1453 1000 +453
plugin/skills/azure-prepare/references/aspire.md 2735 1000 +1735
plugin/skills/azure-prepare/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-prepare/references/azure-context.md 1019 1000 +19
plugin/skills/azure-prepare/references/plan-template.md 1063 1000 +63
plugin/skills/azure-prepare/references/recipes/azd/aspire.md 1584 1000 +584
plugin/skills/azure-prepare/references/recipes/azd/azure-yaml.md 1803 1000 +803
plugin/skills/azure-prepare/references/recipes/azd/terraform.md 2924 1000 +1924
plugin/skills/azure-prepare/references/research.md 1784 1000 +784
plugin/skills/azure-prepare/references/runtimes/nodejs.md 1508 1000 +508
plugin/skills/azure-prepare/references/security.md 2092 1000 +1092
plugin/skills/azure-prepare/references/services/functions/bicep.md 2184 1000 +1184
plugin/skills/azure-prepare/references/services/functions/templates/SPEC-composable-templates.md 6187 1000 +5187
plugin/skills/azure-prepare/references/services/functions/templates/recipes/README.md 1354 1000 +354
plugin/skills/azure-prepare/references/services/functions/templates/recipes/common/nodejs-entry-point.md 1034 1000 +34
plugin/skills/azure-prepare/references/services/functions/templates/recipes/common/uami-bindings.md 1223 1000 +223
plugin/skills/azure-prepare/references/services/functions/templates/recipes/composition.md 4564 1000 +3564
plugin/skills/azure-prepare/references/services/functions/templates/recipes/cosmosdb/README.md 1467 1000 +467
plugin/skills/azure-prepare/references/services/functions/templates/recipes/durable/README.md 1149 1000 +149
plugin/skills/azure-prepare/references/services/functions/templates/recipes/eventhubs/README.md 1403 1000 +403
plugin/skills/azure-prepare/references/services/functions/templates/recipes/mcp/source/java.md 1312 1000 +312
plugin/skills/azure-prepare/references/services/functions/templates/recipes/mcp/source/python.md 1207 1000 +207
plugin/skills/azure-prepare/references/services/functions/templates/recipes/mcp/source/typescript.md 1138 1000 +138
plugin/skills/azure-prepare/references/services/functions/templates/recipes/servicebus/README.md 1171 1000 +171
plugin/skills/azure-prepare/references/services/functions/templates/recipes/servicebus/source/dotnet.md 1280 1000 +280
plugin/skills/azure-prepare/references/services/functions/templates/recipes/servicebus/source/java.md 1016 1000 +16
plugin/skills/azure-prepare/references/services/functions/templates/recipes/sql/source/java.md 1009 1000 +9
plugin/skills/azure-prepare/references/services/functions/templates/recipes/sql/source/python.md 1080 1000 +80
plugin/skills/azure-prepare/references/services/functions/terraform.md 2545 1000 +1545
plugin/skills/azure-prepare/references/services/service-bus/patterns.md 1122 1000 +122
plugin/skills/azure-resource-lookup/SKILL.md 1389 500 +889
plugin/skills/azure-resource-lookup/references/azure-resource-graph.md 1307 1000 +307
plugin/skills/azure-resource-visualizer/SKILL.md 2105 500 +1605
plugin/skills/azure-storage/SKILL.md 1180 500 +680
plugin/skills/azure-storage/references/auth-best-practices.md 1543 1000 +543
plugin/skills/azure-storage/references/sdk-usage.md 1135 1000 +135
plugin/skills/azure-validate/SKILL.md 761 500 +261
plugin/skills/azure-validate/references/recipes/azd/README.md 1191 1000 +191
plugin/skills/entra-app-registration/SKILL.md 2068 500 +1568
plugin/skills/entra-app-registration/references/api-permissions.md 2545 1000 +1545
plugin/skills/entra-app-registration/references/auth-best-practices.md 1543 1000 +543
plugin/skills/entra-app-registration/references/cli-commands.md 2211 1000 +1211
plugin/skills/entra-app-registration/references/console-app-example.md 2752 1000 +1752
plugin/skills/entra-app-registration/references/first-app-registration.md 1846 1000 +846
plugin/skills/entra-app-registration/references/oauth-flows.md 2375 1000 +1375
plugin/skills/entra-app-registration/references/troubleshooting.md 1896 1000 +896
plugin/skills/microsoft-foundry/SKILL.md 1948 500 +1448
plugin/skills/microsoft-foundry/foundry-agent/create/create.md 3016 1000 +2016
plugin/skills/microsoft-foundry/foundry-agent/create/references/agentframework.md 1300 1000 +300
plugin/skills/microsoft-foundry/foundry-agent/create/references/tool-memory.md 1204 1000 +204
plugin/skills/microsoft-foundry/foundry-agent/deploy/deploy.md 4005 1000 +3005
plugin/skills/microsoft-foundry/foundry-agent/invoke/invoke.md 1273 1000 +273
plugin/skills/microsoft-foundry/foundry-agent/trace/references/kql-templates.md 1913 1000 +913
plugin/skills/microsoft-foundry/foundry-agent/trace/references/search-traces.md 1366 1000 +366
plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md 1265 1000 +265
plugin/skills/microsoft-foundry/foundry-agent/troubleshoot/troubleshoot.md 1299 1000 +299
plugin/skills/microsoft-foundry/models/deploy-model/SKILL.md 1640 500 +1140
plugin/skills/microsoft-foundry/models/deploy-model/capacity/SKILL.md 1739 500 +1239
plugin/skills/microsoft-foundry/models/deploy-model/customize/EXAMPLES.md 1091 1000 +91
plugin/skills/microsoft-foundry/models/deploy-model/customize/SKILL.md 2235 500 +1735
plugin/skills/microsoft-foundry/models/deploy-model/customize/references/customize-workflow.md 3335 1000 +2335
plugin/skills/microsoft-foundry/models/deploy-model/preset/SKILL.md 1226 500 +726
plugin/skills/microsoft-foundry/models/deploy-model/preset/references/preset-workflow.md 5534 1000 +4534
plugin/skills/microsoft-foundry/models/deploy-model/preset/references/workflow.md 1315 1000 +315
plugin/skills/microsoft-foundry/project/create/create-foundry-project.md 1346 1000 +346
plugin/skills/microsoft-foundry/quota/quota.md 2129 1000 +1129
plugin/skills/microsoft-foundry/quota/references/capacity-planning.md 1968 1000 +968
plugin/skills/microsoft-foundry/quota/references/error-resolution.md 1141 1000 +141
plugin/skills/microsoft-foundry/quota/references/optimization.md 1846 1000 +846
plugin/skills/microsoft-foundry/quota/references/ptu-guide.md 1473 1000 +473
plugin/skills/microsoft-foundry/quota/references/troubleshooting.md 1807 1000 +807
plugin/skills/microsoft-foundry/quota/references/workflows.md 1614 1000 +614
plugin/skills/microsoft-foundry/rbac/rbac.md 1752 1000 +752
plugin/skills/microsoft-foundry/references/auth-best-practices.md 1543 1000 +543
plugin/skills/microsoft-foundry/references/sdk/foundry-sdk-py.md 2060 1000 +1060
plugin/skills/microsoft-foundry/resource/create/create-foundry-resource.md 1489 1000 +489
plugin/skills/microsoft-foundry/resource/create/references/workflows.md 1637 1000 +637
.github/agents/SkillCreator.agent.md 1044 1000 +44

Consider moving content to references/ subdirectories.


Automated token analysis. See skill authoring guidelines for best practices.
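
For readers skimming the table above, the "Over By" column is consistent with a simple rule: files named SKILL.md are held to 500 tokens and every other listed file to 1000. The sketch below illustrates that rule only. It is an inference from the reported numbers, not the implementation in `src/tokens/cli.ts`, and `limitFor` and `overBy` are hypothetical names.

```typescript
// Assumption inferred from the report: SKILL.md files have a 500-token
// limit; all other files in the table show a 1000-token limit.
function limitFor(path: string): number {
  const fileName = path.split("/").pop() ?? path;
  return fileName === "SKILL.md" ? 500 : 1000;
}

// Tokens above the limit, or 0 when the file is within its limit.
function overBy(path: string, tokens: number): number {
  return Math.max(0, tokens - limitFor(path));
}

// Two rows taken from the report above.
console.log(overBy("plugin/skills/azure-deploy/SKILL.md", 1355)); // 855
console.log(overBy("plugin/skills/azure-prepare/references/analyze.md", 1038)); // 38
```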

Contributor

Copilot AI left a comment

Pull request overview

This PR aims to improve Azure skill routing (especially for Claude 4.5) by reframing skill descriptions toward capability/hand-off language, tightening deploy prompts to reduce misrouting, and cutting integration-test cost by reducing repeated runs per prompt.

Changes:

  • Updated azure-prepare, azure-validate, and azure-deploy skill frontmatter/descriptions and added stronger scope/hand-off guidance.
  • Adjusted deploy integration prompts to include “already prepared” anti-routing signals; reduced RUNS_PER_PROMPT from 5 → 1.
  • Updated trigger keyword snapshots to reflect the new descriptions.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| plugin/skills/azure-deploy/SKILL.md | Reframed description + added scope/rules clarifications for deployment execution. |
| plugin/skills/azure-prepare/SKILL.md | Added “preparation only” scope boundary and explicit hand-off messaging. |
| plugin/skills/azure-validate/SKILL.md | Added explicit hand-off guidance to azure-deploy after validation. |
| tests/azure-deploy/integration.test.ts | Reduced runs per prompt and updated 2 prompts to signal “already prepared”. |
| tests/azure-prepare/integration.test.ts | Reduced runs per prompt for lower integration test cost. |
| tests/azure-deploy/__snapshots__/triggers.test.ts.snap | Snapshot update for new azure-deploy description/keywords. |
| tests/azure-prepare/__snapshots__/triggers.test.ts.snap | Snapshot update for new azure-prepare description/keywords. |
| tests/azure-validate/__snapshots__/triggers.test.ts.snap | Snapshot update for new azure-validate description/keywords. |

saikoumudi
saikoumudi previously approved these changes Feb 28, 2026
Set systemMessage to { mode: 'append' } by default instead of undefined,
ensuring the Copilot CLI built-in system prompt is always included in
integration test sessions.
Appends 'When a relevant skill is available, prefer using it instead of doing the task manually.' to the CLI system prompt via mode: append, nudging the model to invoke skills rather than executing tasks directly.
Copilot AI review requested due to automatic review settings March 1, 2026 19:24
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

tests/utils/agent-runner.ts:729

  • systemMessage default is duplicated here and in useAgentRunner(). Consider extracting the default prompt object into a shared constant (or helper) so future edits don’t accidentally diverge between the two session creation paths.
      systemMessage: config.systemPrompt ?? { mode: "append", content: "When a relevant skill is available, prefer using it instead of doing the task manually." }
    });
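
The suggestion in this comment could look roughly like the following. It is a sketch under assumptions, not the code in `tests/utils/agent-runner.ts`: the `SystemMessage` and `AgentConfig` shapes are inferred from the quoted snippet, and `DEFAULT_SYSTEM_MESSAGE` and `resolveSystemMessage` are hypothetical names.

```typescript
// Shape inferred from the snippet quoted in the review comment;
// the real SDK type may have more modes or fields.
interface SystemMessage {
  mode: "append";
  content: string;
}

// Hypothetical config type; only systemPrompt appears in the source.
interface AgentConfig {
  systemPrompt?: SystemMessage;
}

// Single definition of the default, so both session-creation paths
// read the same value instead of repeating the literal.
const DEFAULT_SYSTEM_MESSAGE: SystemMessage = {
  mode: "append",
  content:
    "When a relevant skill is available, prefer using it instead of doing the task manually.",
};

// Both call sites would use this helper instead of an inline `??` fallback.
function resolveSystemMessage(config: AgentConfig): SystemMessage {
  return config.systemPrompt ?? DEFAULT_SYSTEM_MESSAGE;
}

console.log(resolveSystemMessage({}).mode); // append
```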

- azure-prepare: Remove handoff sentence ('After preparation, hand off to
  azure-validate then azure-deploy') and 'MUST be invoked FIRST' sentence
  that injected azure-deploy/azure-validate as competing keywords.
  Add explicit WHEN triggers for Terraform, App Service, Container Apps,
  Static Web Apps, and brownfield patterns.
- azure-deploy: Remove 'provision infrastructure' competing trigger,
  add DO NOT USE WHEN guidance for create/build patterns.
Copilot AI review requested due to automatic review settings March 2, 2026 13:28
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Copilot AI review requested due to automatic review settings March 2, 2026 15:44
Contributor

Copilot AI commented Mar 2, 2026

@kvenkatrajan I've opened a new pull request, #1096, to work on those changes. Once the pull request is ready, I'll request review from you.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 2, 2026 22:16
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI and others added 2 commits March 2, 2026 14:18
* Initial plan

* fix: replace ⛔ emoji with ⚠️ in azure-prepare SKILL.md per authoring guidelines

Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>
… table (#1096)

* Initial plan

* fix: add azure-cloud-migrate to SKILL.md Step 0 routing table

Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

tests/utils/agent-runner.ts:412

  • This change makes all agent sessions implicitly append a "prefer using a skill" system message whenever config.systemPrompt is omitted. Since useAgentRunner() is used across many integration suites, this globally alters test behavior and can mask regressions where the model would otherwise bypass skills. Consider making this opt-in (e.g., a preferSkills flag) or only applying it in the specific tests that are measuring routing, so other suites keep their prior baseline unless they explicitly request the extra steering.
        systemMessage: config.systemPrompt ?? {
          mode: "append",
          content: "When a relevant skill is available, prefer using it instead of doing the task manually."
        }
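
One way to read the opt-in suggestion above is sketched below. The `preferSkills` name comes from the comment itself. Everything else (`RunnerConfig`, `SteeringMessage`, `pickSystemMessage`) is hypothetical and does not reflect the merged code, which applies the message by default.

```typescript
// Shape inferred from the quoted snippet; the real type may differ.
interface SteeringMessage {
  mode: "append";
  content: string;
}

// Hypothetical config: preferSkills is the opt-in flag named in the review.
interface RunnerConfig {
  systemPrompt?: SteeringMessage;
  preferSkills?: boolean;
}

const PREFER_SKILLS_MESSAGE: SteeringMessage = {
  mode: "append",
  content:
    "When a relevant skill is available, prefer using it instead of doing the task manually.",
};

// An explicit systemPrompt always wins. The extra steering is added only
// when a suite asks for it, so other suites keep their prior baseline.
function pickSystemMessage(config: RunnerConfig): SteeringMessage | undefined {
  if (config.systemPrompt !== undefined) return config.systemPrompt;
  return config.preferSkills ? PREFER_SKILLS_MESSAGE : undefined;
}

console.log(pickSystemMessage({})); // undefined
console.log(pickSystemMessage({ preferSkills: true })?.mode); // append
```

Under this design, only the routing tests would pass `preferSkills: true`, which addresses the concern that global steering can mask regressions elsewhere.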

Copilot AI review requested due to automatic review settings March 3, 2026 00:03
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Contributor

Copilot AI commented Mar 3, 2026

@kvenkatrajan I've opened a new pull request, #1103, to work on those changes. Once the pull request is ready, I'll request review from you.

Contributor

Copilot AI commented Mar 3, 2026

@kvenkatrajan I've opened a new pull request, #1104, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Replace disallowed ⛔ emoji with ❌ in azure-prepare SKILL.md

Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>
@kvenkatrajan
Collaborator Author

kvenkatrajan merged commit 9847048 into main Mar 3, 2026
11 checks passed
kvenkatrajan deleted the fix/skill-invocation-claude45-v2 branch March 3, 2026 00:54

8 participants