Add azure-cost-query and azure-cost-forecast skills #1221
taylorak wants to merge 6 commits into microsoft:main
Pull request overview
Adds two new Azure Cost Management agent skills—azure-cost-query (historical cost querying) and azure-cost-forecast (future cost forecasting)—along with supporting reference documentation and a full test suite to validate metadata, triggering behavior, and integration behavior.
Changes:
- Added SKILL.md + reference docs for azure-cost-query and azure-cost-forecast (schemas, guardrails, examples, error handling).
- Added unit, trigger (with snapshots), and integration tests for both skills.
- Added fixtures to support prompt/examples and sample payloads.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/azure-cost-query/unit.test.ts | Unit tests validating azure-cost-query metadata and required SKILL sections |
| tests/azure-cost-query/triggers.test.ts | Triggering/negative-triggering tests + keyword snapshots for azure-cost-query |
| tests/azure-cost-query/integration.test.ts | End-to-end invocation tests for azure-cost-query using the agent runner |
| tests/azure-cost-query/fixtures/sample.json | Sample prompts + sample request body fixture for azure-cost-query |
| tests/azure-cost-query/snapshots/triggers.test.ts.snap | Snapshot outputs for azure-cost-query trigger keyword extraction |
| tests/azure-cost-forecast/unit.test.ts | Unit tests validating azure-cost-forecast metadata and required SKILL sections |
| tests/azure-cost-forecast/triggers.test.ts | Triggering/negative-triggering tests + keyword snapshots for azure-cost-forecast |
| tests/azure-cost-forecast/integration.test.ts | End-to-end invocation tests for azure-cost-forecast using the agent runner |
| tests/azure-cost-forecast/fixtures/sample.json | Sample prompts + sample request body fixture for azure-cost-forecast |
| tests/azure-cost-forecast/snapshots/triggers.test.ts.snap | Snapshot outputs for azure-cost-forecast trigger keyword extraction |
| plugin/skills/azure-cost-query/SKILL.md | Main skill instructions/workflow for historical cost query construction |
| plugin/skills/azure-cost-query/references/request-body-schema.md | Query API request/response schema reference |
| plugin/skills/azure-cost-query/references/guardrails.md | Query-specific validation rules and constraints |
| plugin/skills/azure-cost-query/references/examples.md | Common Query API request body examples |
| plugin/skills/azure-cost-query/references/error-handling.md | Query API error handling reference |
| plugin/skills/azure-cost-query/references/dimensions-by-scope.md | Dimension availability matrix by scope/agreement type |
| plugin/skills/azure-cost-forecast/SKILL.md | Main skill instructions/workflow for forecast request construction |
| plugin/skills/azure-cost-forecast/references/request-body-schema.md | Forecast API request/response schema reference |
| plugin/skills/azure-cost-forecast/references/guardrails.md | Forecast-specific guardrails (future-to, training data, row limits, etc.) |
| plugin/skills/azure-cost-forecast/references/examples.md | Common Forecast API request body examples + scope URL reference |
| plugin/skills/azure-cost-forecast/references/error-handling.md | Forecast API error handling reference |
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
> ⚠️ **Warning:** Key time period guardrails:
> - **Daily granularity**: max **31 days**
> - **Monthly/None granularity**: max **12 months**
> - `Custom` timeframe **requires** a `timePeriod` object with `from` and `to` dates
> - Future dates are not allowed for historical queries
>
> See [guardrails.md](./references/guardrails.md) for the complete set of validation rules.
The SKILL.md says “Future dates are not allowed for historical queries”, but the guardrails reference explicitly documents silent future-date adjustments (shifting to last year, or clamping `to` to today). These two sources conflict and could lead the agent to give incorrect guidance. Align the SKILL.md warning with the guardrails behavior: either describe the adjustment behavior or state that the agent must pre-normalize dates to avoid silent shifts.
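To make the pre-normalization option concrete, here is a minimal TypeScript sketch that checks a request against the guardrails quoted in the warning before it is sent. The names `QueryWindow` and `checkTimePeriod` are hypothetical and not part of the PR, and two details are assumptions: day counts are treated as inclusive, and the 12-month limit is measured in calendar months spanned.

```typescript
type Granularity = "Daily" | "Monthly" | "None";

interface TimePeriod {
  from: string; // ISO 8601 date, e.g. "2024-01-01"
  to: string;
}

interface QueryWindow {
  timeframe: string; // e.g. "Custom" or "MonthToDate"
  granularity: Granularity;
  timePeriod?: TimePeriod;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns guardrail violations (empty array if none).
function checkTimePeriod(q: QueryWindow, today: Date = new Date()): string[] {
  const problems: string[] = [];
  // The range checks below only apply to explicit Custom ranges.
  if (q.timeframe !== "Custom") return problems;
  if (!q.timePeriod) {
    problems.push("Custom timeframe requires a timePeriod with from and to");
    return problems;
  }
  const from = new Date(q.timePeriod.from);
  const to = new Date(q.timePeriod.to);
  if (Number.isNaN(from.getTime()) || Number.isNaN(to.getTime())) {
    problems.push("from/to must be parseable dates");
    return problems;
  }
  if (from.getTime() > to.getTime()) problems.push("from must not be after to");
  // Flag future dates up front instead of relying on a silent adjustment.
  if (to.getTime() > today.getTime()) problems.push("to is in the future");

  // Assumption: both endpoints count toward the range length.
  const days = Math.floor((to.getTime() - from.getTime()) / MS_PER_DAY) + 1;
  if (q.granularity === "Daily" && days > 31) {
    problems.push("Daily granularity allows at most 31 days");
  }
  // Assumption: the monthly limit counts calendar months spanned.
  const months =
    (to.getUTCFullYear() - from.getUTCFullYear()) * 12 +
    (to.getUTCMonth() - from.getUTCMonth()) + 1;
  if (q.granularity !== "Daily" && months > 12) {
    problems.push("Monthly/None granularity allows at most 12 months");
  }
  return problems;
}
```

Returning a list of violations rather than throwing lets the agent report every problem at once and decide whether to adjust the dates or ask the user.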
    # Query using REST API (more reliable than az costmanagement query)
    az rest --method post `
      --url "<scope>/providers/Microsoft.CostManagement/query?api-version=2023-11-01" `
      --body '@temp/cost-query.json'
The execute example uses `--url "<scope>/providers/..."`, while elsewhere this skill (and other skills such as azure-cost-optimization) uses either a leading `/subscriptions/...` path or a full `https://management.azure.com/...` URL. Using `<scope>` is ambiguous (it omits the leading `/` or the ARM host) and is easy to copy and paste into a non-working command. Consider standardizing on one format (preferably `https://management.azure.com{scope}/providers/...`, or `{scope}` explicitly documented as a path starting with `/`).
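One way to remove the ambiguity is to build the URL in a single place and reject scopes that are not absolute paths. The sketch below follows the preferred `https://management.azure.com{scope}/providers/...` form; `buildQueryUrl` is a hypothetical helper name, and the default API version is simply the one used in the snippet under review.

```typescript
const ARM_HOST = "https://management.azure.com";

// Hypothetical helper: builds a full Cost Management Query URL from a scope path.
// The scope must be a path that starts with "/", e.g. "/subscriptions/<id>".
function buildQueryUrl(scope: string, apiVersion: string = "2023-11-01"): string {
  // Drop surrounding whitespace and trailing slashes so the join is clean.
  const trimmed = scope.trim().replace(/\/+$/, "");
  if (!trimmed.startsWith("/")) {
    throw new Error(`scope must start with "/": received "${scope}"`);
  }
  return `${ARM_HOST}${trimmed}/providers/Microsoft.CostManagement/query?api-version=${apiVersion}`;
}
```

Failing fast on a scope without a leading `/` surfaces the copy-and-paste mistake before any request is issued.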
### Filter

Filter expressions restrict which cost records are included. Filters support logical operators (`And`, `Or`, `Not`) and comparison operators on dimensions or tags.

#### Filter Expression Structure

```json
"filter": {
  "And": [
    {
      "Dimensions": {
        "Name": "ResourceGroupName",
        "Operator": "In",
        "Values": ["rg-prod", "rg-staging"]
      }
    },
    {
      "Not": {
        "Tags": {
          "Name": "Environment",
          "Operator": "Equal",
          "Values": ["dev"]
        }
      }
    }
  ]
}
```
This skill references filter fields as `dimensions` in SKILL.md, but the schema/examples here use `Dimensions`/`Tags` with `Name`/`Operator`/`Values`. The casing/shape should be consistent across the skill and match the actual Cost Management Query API schema; otherwise agents may generate invalid request bodies. Please verify the correct JSON shape from the official docs and update SKILL.md + all references/examples to use the same field names.
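A small consistency check can catch this kind of drift between SKILL.md and the reference examples. The sketch below walks a filter object and reports whether its keys are all PascalCase, all camelCase, or mixed. It deliberately does not assert which casing the API expects, since that still has to be confirmed against the official docs; `collectKeys` and `keyCasing` are hypothetical names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively gathers every object key that appears in a JSON value.
function collectKeys(node: Json, keys: string[] = []): string[] {
  if (Array.isArray(node)) {
    for (const item of node) collectKeys(item, keys);
  } else if (node !== null && typeof node === "object") {
    for (const key of Object.keys(node)) {
      keys.push(key);
      collectKeys(node[key], keys);
    }
  }
  return keys;
}

// Classifies the key casing of a filter: uniform casing is what the review asks for.
function keyCasing(filter: Json): "pascal" | "camel" | "mixed" | "empty" {
  const keys = collectKeys(filter);
  if (keys.length === 0) return "empty";
  const upper = keys.filter((k) => /^[A-Z]/.test(k)).length;
  if (upper === keys.length) return "pascal";
  if (upper === 0) return "camel";
  return "mixed";
}
```

Run over every example body in the references, a check like this would flag any file whose casing differs from the rest once the correct shape has been settled.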
    describe("Should NOT Trigger", () => {
      const shouldNotTriggerPrompts: string[] = [
        // Forecast skill (should not trigger cost-query)
        "Predict the budget for next quarter",
        "What will the projected budget look like next quarter?",
        // Optimization skill (should not trigger cost-query)
        "Find orphaned resources and rightsize VMs",
        "Reduce waste and optimize cloud expenses",
        // Deployment (different skill)
        "Deploy a new VM to Azure",
        // Wrong cloud provider
        "Set up an AWS budget",
        // Unrelated
        "Write a Python script",
        "Help me write a poem",
      ];
PR description says trigger tests include cross-skill shouldNotTrigger checks in both directions between cost-query, cost-forecast, and cost-optimization. These new trigger tests cover negative prompts for forecast/optimization, but the existing azure-cost-optimization trigger tests don’t include reciprocal negative prompts for cost-query/cost-forecast routing. Either update the PR description to match what’s implemented, or add the missing reciprocal shouldNotTrigger prompts to the azure-cost-optimization trigger tests to complete the ↔ coverage.
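The gap described here can be stated as a simple matrix check: every skill's negative-trigger list should contain prompts from each sibling. The sketch below is illustrative only. `missingReciprocalPairs` is a hypothetical helper, and the sample input is an assumption modeled on the review comment rather than data read from the repository.

```typescript
// Maps each skill to the sibling skills whose prompts appear in its shouldNotTrigger list.
type NegativeCoverage = Record<string, string[]>;

// Returns "skill -> sibling" for each direction that lacks negative-trigger coverage.
function missingReciprocalPairs(coverage: NegativeCoverage): string[] {
  const skills = Object.keys(coverage);
  const missing: string[] = [];
  for (const skill of skills) {
    const covered = coverage[skill] ?? [];
    for (const sibling of skills) {
      if (skill !== sibling && covered.indexOf(sibling) === -1) {
        missing.push(`${skill} -> ${sibling}`);
      }
    }
  }
  return missing;
}

// Illustrative input (assumed): the two new skills cover their siblings,
// while the existing optimization tests have no reciprocal negatives.
const exampleCoverage: NegativeCoverage = {
  "azure-cost-query": ["azure-cost-forecast", "azure-cost-optimization"],
  "azure-cost-forecast": ["azure-cost-query", "azure-cost-optimization"],
  "azure-cost-optimization": [],
};

console.log(missingReciprocalPairs(exampleCoverage));
```

With this input the helper lists the two missing directions out of azure-cost-optimization, which corresponds to the prompts that would need to be added there to complete the coverage.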
@taylorak With the two new skills, the total char count of the skill descriptions exceeds the char count budget of Copilot CLI. @saikoumudi Could you help consolidate the new content with the existing azure-cost-optimization skill? Maybe rebrand the azure-cost-optimization skill to cover a broader range of cost-related capabilities.
@JasonYeMSFT Can you explain in more detail how to get around the char count budget, and what the char count budget is for?
@taylorak The skill description char count is computed by concatenating the "description" section of each skill's SKILL.md and counting the characters after formatting: https://github.com/microsoft/GitHub-Copilot-for-Azure/blob/main/scripts/src/copilot-cli-char-budget.ts The best way to mitigate the limitation is to reuse existing skills. If a new skill has to be added, its description should be as concise as possible.
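The calculation described above can be sketched in a few lines of TypeScript. This is not the linked script: `SkillMeta` and `checkDescriptionBudget` are hypothetical names, the formatting step is approximated by trimming whitespace, and the budget is passed in as a parameter because its actual value is not stated in this thread.

```typescript
interface SkillMeta {
  name: string;
  description: string; // the "description" section of the skill's SKILL.md
}

interface BudgetReport {
  total: number;
  remaining: number;
  withinBudget: boolean;
}

// Sums description lengths across all skills and compares them with a budget.
// Assumption: trimming stands in for whatever formatting the real script applies.
function checkDescriptionBudget(skills: SkillMeta[], budget: number): BudgetReport {
  const total = skills.reduce((sum, s) => sum + s.description.trim().length, 0);
  return { total, remaining: budget - total, withinBudget: total <= budget };
}
```

Because the total is a sum over every skill, adding a skill consumes budget shared by all of them, which is why consolidating into an existing skill is cheaper than adding a new description.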
@JasonYeMSFT what is stopping us from increasing the limit when adding new skills? The existing skills are already very close to the limit. It leaves almost nothing for new skills. Is the plan to keep only the existing set of skills and not expand much more? |
@taylorak The plan is to make sure the skill descriptions of existing skills fit within the budget. If no existing skill can be reused as is, we accept consolidating new content with existing skills via rebranding. For example, the existing azure-cost-optimization skill may be rebranded to azure-cost, which covers both what azure-cost-optimization has and what you are going to add. This would allow it to support more scenarios without introducing significantly more chars in skill descriptions.
Makes sense, @JasonYeMSFT. Thank you for the explanation. So the goal would be to reduce the description size by combining the skills for querying historical costs, forecasting future costs, and doing cost optimization into a single Azure cost skill. You said @saikoumudi is the contact for cost optimization and she can help me merge the skills? There's also a limit on the size of the skill itself, right? So combining might cause issues with that?
@taylorak @saikoumudi is the author of the existing azure-cost-optimization skill. She can help you consolidate the content to make sure the new skill covers both old and new capabilities. Unlike the skill description, there is no hard limit on the size of skill content. Although having too much content for a skill in the same context window will hurt agent performance, it's hard to tell where the tipping point is. Keeping the skill content concise is more of a recommendation than a requirement. On the other hand, if the skill descriptions exceed the budget, the agent will truncate some skills, so they are not presented to the LLM and do not work at all.
Introduces two new skills for Azure Cost Management:
azure-cost-query
Guides the agent through querying historical Azure cost data via the Cost Management Query API. Supports breakdowns by service, resource, location, tag, and other dimensions across subscriptions, resource groups, and billing accounts.
azure-cost-forecast
Guides the agent through constructing and executing Azure Cost Management Forecast API requests to project future spending, with built-in time-period guardrails and training-data validation.
What's included
false positives
Cross-skill design
Both descriptions include WHEN: and DO NOT USE FOR: clauses to help the agent route between the three cost skills (query, forecast, optimization). Trigger tests include negative prompts from sibling skills to validate correct routing.