Foundry observability skills update for creating dataset from traces and more. #1219
Merged
JasonYeMSFT merged 11 commits into microsoft:main on Mar 13, 2026
…OPILOT_GITHUB_TOKEN` (#6)

* Initial plan
* Fix issue triage workflow: add github.token fallback for COPILOT_GITHUB_TOKEN

The Issue Triage workflow was failing at the secret validation step because COPILOT_GITHUB_TOKEN was not configured. This adds github.token as a fallback in all 4 places where COPILOT_GITHUB_TOKEN is used for authentication:
- agent job: validate-secret step and Execute step
- detection job: validate-secret step and Execute step

This is consistent with the existing fallback patterns in the workflow (e.g., secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN || secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN)

Co-authored-by: XOEEst <18523445+XOEEst@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: XOEEst <18523445+XOEEst@users.noreply.github.com>
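The fallback described in this commit lives in workflow expressions (an `||` chain over secrets), which are not reproduced here. As a rough illustration of the same "first configured token wins" idea, the sketch below resolves a token from an ordered list of names. The helper `resolveToken`, the `Env` type, and the example token values are hypothetical, and treating whitespace-only values as unset is an assumption of this sketch rather than documented workflow behavior.

```typescript
// Hypothetical helper (not part of the workflow): return the first
// non-empty value found when checking candidate names in order.
type Env = Record<string, string | undefined>;

function resolveToken(env: Env, candidates: string[]): string | undefined {
  for (const name of candidates) {
    const value = env[name];
    // Assumption: a blank or whitespace-only value counts as "not configured".
    if (value !== undefined && value.trim() !== "") {
      return value;
    }
  }
  return undefined;
}

// Preferred secret first, then the fallback.
const order = ["COPILOT_GITHUB_TOKEN", "GITHUB_TOKEN"];

// Only the fallback is configured, so the fallback is returned.
const fromFallback = resolveToken({ GITHUB_TOKEN: "ghs_example" }, order);

// Both are configured, so the preferred secret is returned.
const fromPrimary = resolveToken(
  { COPILOT_GITHUB_TOKEN: "copilot_example", GITHUB_TOKEN: "ghs_example" },
  order
);
```

Keeping the order in a single list mirrors the workflow's existing chained-fallback pattern and makes the precedence easy to audit.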
* Fix broken auto-create evaluators step in deploy/observe loop The 'Auto-create evaluators & evaluation dataset' step was being skipped when the monolithic agent-observability-loop skill was split into separate deploy and observe skills. Neither skill owned the auto-create step, causing post-deploy users to jump directly to evaluation. Changes: - deploy.md: Replace generic 'set up evaluation?' prompt with automatic 6-step evaluator & dataset creation matching the reference behavior - observe.md: Add Loop Overview, fix entry points to route post-deploy users through auto-setup, add evaluator existence check - deploy-and-setup.md: Make auto-create primary content, demote deploy section to prerequisites Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add content tests for observe/deploy loop logic Tests verify: - observe.md has Loop Overview, post-deploy entry points, evaluator existence checks, behavioral rules, and all reference files - deploy.md has auto-create evaluators section that is automatic (not optional), includes evaluator categories, LLM-judge, artifact persistence, and routes to observe skill Step 2 - deploy-and-setup.md has auto-create as primary content with proper evaluator selection, dataset generation, and user prompt 49 tests total (29 observe + 20 deploy), all passing. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: trigger CI checks * Fix * add local dataset gen enforcement * Merge * feat: prefer monitor_resource_log_query and local datasets - Replace azure-kusto delegation with monitor_resource_log_query for App Insights KQL queries in trace.md and troubleshoot.md - Mark evaluation_dataset_create as not available (MCP upload not ready) - Replace server-side dataset sections with local JSONL workflow - Update mcp-gap-analysis.md to reflect practical tool availability Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: make dataset upload restriction more agent-proof - Add Do NOT section at top of trace-to-dataset.md (before Overview) - Add behavioral rule #7 to eval-datasets.md: never upload to cloud - Remove Option A/B structure; Step 4 is now local JSONL only - Eliminates subtle strikethrough formatting that agents miss Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix link * fix: make auto-create evaluators an explicit numbered step - Hosted workflow: add Step 10 after Step 9 with DO NOT stop gate - Prompt workflow: add Step 5 after Step 4 with DO NOT stop gate - Both link to existing After Deployment section as implementation - Prevents agents from treating evaluator setup as optional appendix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: add dataset update loop with optimization guardrails - Add Dataset Update Loop (eval→compare→analyze→optimize→re-eval) to dataset-versioning.md after Creating a New Version - Add guardrails: never remove dataset rows or weaken evaluators to recover scores after dataset expansion - Add same guardrail to observe optimize-deploy.md Step 6 - Add behavioral rule #8 to eval-datasets.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add subscription parameter warning to trace-related skills Always pass subscription explicitly to Azure MCP tools like 
monitor_resource_log_query — they don't extract it from resource IDs. Added to trace.md, troubleshoot.md, and trace-to-dataset.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: make customEvents-to-traces eval correlation more obvious - Add Key Concept section to trace-to-dataset.md explaining that eval results live in customEvents (not dependencies) and the join key is gen_ai.response.id - Add table showing dependencies vs customEvents join pattern - Cross-reference trace skill's eval-correlation.md from both trace-to-dataset.md and eval-datasets.md Related Skills Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: improve cross-references and add KQL parse_json warning 1. Add parse_json(customDimensions) warning to Do NOT section 2. Add Related References section with skill-root paths 3. Add skill-root path hints to all cross-skill links 4. Add observe + trace to SKILL.md sub-skill routing table Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: improve hosted agent KQL patterns and content extraction - Add Hosted Agent Harvest template (requests→dependencies join) - Fix Hosted Agent Attributes: appear on both requests and traces - Add gen_ai.agent.name duality callout (Foundry name vs class name) - Remove incorrect azure.ai.agentserver.agent_name fallback from dependencies queries - Document gen_ai.input.messages/gen_ai.output.messages as content source - Add operation_ParentId join example to Span Correlation section - Update search-traces.md hosted agent query to use requests entry point Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: improve trace sub-skills for hosted agent KQL patterns - search-traces: fix hosted agent query to group by operation_ParentId - conversation-detail: add content extraction from invoke_agent spans (gen_ai.input.messages / gen_ai.output.messages) - analyze-failures: add hosted agent gen_ai.agent.name duality warning and hosted 
agent variant query using requests→dependencies join - analyze-latency: same hosted agent warning and variant query - kql-templates: expand requests table description as preferred entry point; add gen_ai.input/output.messages to attributes table - trace.md: reword rule 6 to clarify hosted vs prompt agent filtering Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: restore routing keywords and update trigger snapshots - Add back critical routing keywords to SKILL.md description (578→779 chars): role assignment, permissions, capacity, region, deployment failure, AI Services, Cognitive Services, provision, knowledge index, monitoring, customize, onboard, availability - Update trigger test snapshots for new keyword set (24 snapshots) - Fix deploy trigger test: Docker IS our capability (remove false negative) - Fix customize-deployment tests: ensure prompts have ≥2 keyword matches - Fix deploy-model-optimal-region tests: use longer prompts for HA/PTU Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add 'create AI Services' to description for resource/create test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: bump microsoft-foundry version to 1.0.2 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(eval-datasets): enable Foundry dataset sync via MCP tools - Add Step 5 (Sync to Foundry) to trace-to-dataset pipeline using evaluation_dataset_create with connectionName and project_connection tools - Add server-side version discovery via evaluation_dataset_versions_get - Add dual experiment types to dataset-comparison (agent vs dataset comparison) - Update mcp-gap-analysis: mark resolved tools, update workarounds - Add AzureBlob to project connections reference - Bump microsoft-foundry version to 1.0.3 - Fix upstream section heading changes in unit tests - Update trigger snapshots for upstream keyword changes Co-authored-by: Copilot 
<223556219+Copilot@users.noreply.github.com> * refactor(dataset-comparison): focus on dataset-version comparison only Remove agent comparison experiment type from dataset-comparison flow. Agent comparison belongs in the observe/eval loop, not the dataset skill. Update all examples to use dataset versions as baseline/treatment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove Playwright MCP server until skills require it (microsoft#1200) * Collapse token analysis comment (microsoft#1147) * update region-availability in prepare/validate/deploy skill (microsoft#1083) * update region-availability in prepare/deploy skill * update * update * fix * update date * Update plugin/skills/azure-deploy/references/region-availability.md * fix ci failure * bump version * build(deps): bump @github/copilot and @github/copilot-sdk in /tests (microsoft#1201) Bumps [@github/copilot](https://github.com/github/copilot-cli) to 1.0.2 and updates ancestor dependency [@github/copilot-sdk](https://github.com/github/copilot-sdk). These dependencies need to be updated together. Updates `@github/copilot` from 0.0.414 to 1.0.2 - [Release notes](https://github.com/github/copilot-cli/releases) - [Changelog](https://github.com/github/copilot-cli/blob/main/changelog.md) - [Commits](github/copilot-cli@v0.0.414...v1.0.2) Updates `@github/copilot-sdk` from 0.1.26 to 0.1.32 - [Release notes](https://github.com/github/copilot-sdk/releases) - [Changelog](https://github.com/github/copilot-sdk/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/copilot-sdk/commits/v0.1.32) --- updated-dependencies: - dependency-name: "@github/copilot" dependency-version: 1.0.2 dependency-type: indirect - dependency-name: "@github/copilot-sdk" dependency-version: 0.1.32 dependency-type: direct:development ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * specify application path in prompt (microsoft#1204) * Add AVM (Azure Verified Modules) integration tests (microsoft#1171) * Add AVM (Azure Verified Modules) integration tests Add 3 integration tests validating the AVM module selection hierarchy for Bicep infrastructure generation: - avm-module-priority: Verifies AVM modules prioritized over non-AVM - avm-fallback-behavior: Verifies fallback stays within AVM ecosystem - avm-azd-pattern-preference: Verifies AZD pattern modules preferred Tests validate that the azure-deploy skill enforces the mandatory AVM selection order: Pattern modules > Resource modules > Utility modules, and never falls back to non-AVM alternatives. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add output assertions to AVM integration tests Address Copilot review feedback: add keyword-based output assertions using getAllAssistantMessages/getAllToolText to verify agent responses contain AVM hierarchy terms, not just skill invocation. Includes non-AVM fallback negative check. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Strengthen AVM test output assertions per Copilot review - Split keyword checks into critical-term + context assertions - Add resource-before-utility ordering assertion for fallback test - Expand non-AVM negative check to use regex patterns - Require core keywords (avm+pattern, azd+pattern) explicitly Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address Copilot round 3 — ordering assertions and context-aware non-AVM check - Add hierarchy ordering assertion to test 1 (pattern before resource/utility) - Make non-AVM detection context-aware: skip matches preceded by negation words (e.g., 'never fall back to non-AVM' is correct behavior, not a false positive) - Add pattern-before-resource ordering assertion to test 3 (AZD pattern preference) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: move AVM integration tests to avm/ subdirectory Move tests/azure-deploy/avm-integration.test.ts to tests/azure-deploy/avm/integration.test.ts so the file matches the **/integration.test.ts glob used by the custom ESLint rule (integration-test-name) and follows the subdirectory convention established by tests/microsoft-foundry/ (e.g. foundry-agent/). Import paths updated from ../utils/ to ../../utils/ to reflect the new depth. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address round 4 Copilot review feedback - Add 'fall back'/'fall-back' keyword variants for resilience - Extend non-AVM negation check to also scan following context - Use regex for AZD ordering assertion to match plural/prefixed variants Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update github workflows to use best practices (microsoft#1149) * Early terminate azure-deploy, azure-validate tests (microsoft#1205) Add comment for early termination to help AI grader * Replace script inline parameters with env var (microsoft#1209) * Early terminate azure-deploy tests on deploy link (microsoft#1208) * Early terminate azure-deploy tests on deploy link * Fix lint issue * Reduce char count of existing skills (microsoft#1210) * Reduce char count of existing skills * Update ci tests and snapshots * Enhance benchmark ci run script (microsoft#1176) * Add msbench_benchmarks repo clone to get model definition * Remove unused vars * Use mcp-pr repo before MI has access to msbench-benchmarks repo * Address copilot feedback * Change back to msbench-benchmarks repo * Get ADO token for repo clone * Fix line continuation character * Add run for all interested models * Extract run IDs * Fix yaml format issue * Schedule it to run nightly * Address copilot feedbacks * formalize .foundry and multi-environment support * fix * Feature/azure quotas (microsoft#1137) * update for using azure-quotas in skill * test update * unit test update * path update * add skill in skills.json * skills.json update * reduce the text * version update * skill version * skill description update * reduce text size * 1.0.4 for next prepare version * upload snap shot * update version * test update --------- Co-authored-by: Yinghui Dong <yinghuidong@microsoft.com> * build(deps-dev): bump simple-git from 3.30.0 to 3.32.3 in /tests 
(microsoft#1213) Bumps [simple-git](https://github.com/steveukx/git-js/tree/HEAD/simple-git) from 3.30.0 to 3.32.3. - [Release notes](https://github.com/steveukx/git-js/releases) - [Changelog](https://github.com/steveukx/git-js/blob/main/simple-git/CHANGELOG.md) - [Commits](https://github.com/steveukx/git-js/commits/simple-git@3.32.3/simple-git) --- updated-dependencies: - dependency-name: simple-git dependency-version: 3.32.3 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Improve azure-compliance invocation rate (microsoft#1214) * Improve azure-compliance invocation rate * Race condition free report writing * Fix debug logging for report location * Bump skill version * Fix suffix base value * fix * llm judge model and eval group improvement --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Chris Harris <charris@microsoft.com> Co-authored-by: JasonYeMSFT <chuye@microsoft.com> Co-authored-by: xfz11 <81600993+xfz11@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Juan Ospina <70209456+jeo02@users.noreply.github.com> Co-authored-by: Jon Gallant <2163001+jongio@users.noreply.github.com> Co-authored-by: Wes Haggard <weshaggard@users.noreply.github.com> Co-authored-by: Fan Yang <52458914+fanyang-mono@users.noreply.github.com> Co-authored-by: rakal-dyh <33503911+rakal-dyh@users.noreply.github.com> Co-authored-by: Yinghui Dong <yinghuidong@microsoft.com>
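The AVM test commits above describe a context-aware non-AVM check that skips matches preceded by negation words, so that output such as 'never fall back to non-AVM' is not flagged. The actual test code is not shown in this conversation; the following is a minimal sketch of that idea only. The function name, the negation word list, and the 40-character look-behind window are all assumptions, and the later "scan following context" extension is not covered.

```typescript
// Assumed negation vocabulary; the real tests may use a different list.
const NEGATIONS = /\b(never|not|no|don't|avoid|without)\b/i;

// Returns true when "non-AVM" appears without a negation word in the
// text immediately before it (within `windowChars` characters).
function hasUnnegatedNonAvmMention(text: string, windowChars: number = 40): boolean {
  const pattern = /non-avm/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const start = match.index;
    const before = text.slice(Math.max(0, start - windowChars), start);
    if (!NEGATIONS.test(before)) {
      return true; // a mention that is not negated nearby
    }
  }
  return false;
}

// Negated mention: correct agent behavior, so nothing is flagged.
const negated = hasUnnegatedNonAvmMention("We never fall back to non-AVM modules.");

// Plain mention: flagged as a possible non-AVM fallback.
const plain = hasUnnegatedNonAvmMention("Here we use a non-AVM module instead.");
```

A fixed character window is a blunt heuristic: a negation in an unrelated earlier clause can mask a real match, which is one reason a check like this benefits from also inspecting the text that follows.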
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

# Conflicts:
#   tests/microsoft-foundry/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/create/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/deploy/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/invoke/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/observe/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/trace/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/foundry-agent/troubleshoot/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/models/deploy/capacity/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/models/deploy/customize-deployment/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/models/deploy/deploy-model-optimal-region/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/models/deploy/deploy-model/__snapshots__/triggers.test.ts.snap
#   tests/microsoft-foundry/resource/create/__snapshots__/triggers.test.ts.snap
Contributor
Pull request overview
Updates the microsoft-foundry skill documentation and test suite to formalize a .foundry/ workspace standard, expand the observability/evaluation loop, and add an eval-datasets sub-skill for trace-to-dataset workflows (including optional dataset sync back to Foundry).
Changes:
- Introduces/standardizes `.foundry/agent-metadata.yaml` + `.foundry/{datasets,evaluators,results}/` as the local cache + configuration contract across deploy/observe/trace/eval-datasets docs.
- Adds the `eval-datasets` sub-skill content and accompanying unit/trigger/integration tests.
- Updates trigger prompts/snapshots and unit assertions to reflect new routing keywords (prompt optimization) and new/renamed sections.
Reviewed changes
Copilot reviewed 44 out of 44 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/microsoft-foundry/unit.test.ts | Updates unit assertions for new SKILL.md headings, routing keywords, and .foundry workspace contract references. |
| tests/microsoft-foundry/triggers.test.ts | Adds prompt-optimization trigger prompts for the root skill. |
| `tests/microsoft-foundry/resource/create/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/models/deploy/deploy-model/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/models/deploy/deploy-model-optimal-region/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/models/deploy/customize-deployment/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/models/deploy/capacity/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/foundry-agent/troubleshoot/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/foundry-agent/trace/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| tests/microsoft-foundry/foundry-agent/observe/unit.test.ts | Aligns observe tests with .foundry cache paths and new guardrails (evalId vs evaluationId, judge deployment lookup). |
| `tests/microsoft-foundry/foundry-agent/observe/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/foundry-agent/invoke/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| tests/microsoft-foundry/foundry-agent/eval-datasets/unit.test.ts | Adds unit tests validating eval-datasets content structure and reference file presence. |
| tests/microsoft-foundry/foundry-agent/eval-datasets/triggers.test.ts | Adds trigger tests and snapshots for eval-datasets prompts/keywords. |
| tests/microsoft-foundry/foundry-agent/eval-datasets/integration.test.ts | Adds integration tests ensuring eval-datasets prompts invoke the microsoft-foundry skill. |
| `tests/microsoft-foundry/foundry-agent/eval-datasets/__snapshots__/triggers.test.ts.snap` | Adds new snapshots for eval-datasets trigger keywords/description parsing. |
| tests/microsoft-foundry/foundry-agent/deploy/unit.test.ts | Updates deploy tests to require judge deployment discovery guidance and .foundry persistence expectations. |
| tests/microsoft-foundry/foundry-agent/deploy/triggers.test.ts | Tweaks a non-trigger prompt to reduce overlap with Foundry monitoring scenarios. |
| `tests/microsoft-foundry/foundry-agent/deploy/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| tests/microsoft-foundry/foundry-agent/create/unit.test.ts | Minor formatting change (blank line). |
| `tests/microsoft-foundry/foundry-agent/create/__snapshots__/triggers.test.ts.snap` | Updates snapshots to reflect updated skill description keywords. |
| `tests/microsoft-foundry/__snapshots__/triggers.test.ts.snap` | Updates root snapshots to reflect updated skill description keywords. |
| plugin/skills/microsoft-foundry/references/agent-metadata-contract.md | Adds a canonical schema and workflow rules for .foundry/agent-metadata.yaml. |
| plugin/skills/microsoft-foundry/project/connections.md | Documents AzureBlob as a connection type for dataset sync scenarios. |
| plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md | Refactors trace skill to be environment/agent-root scoped via .foundry metadata and updates behavioral rules. |
| plugin/skills/microsoft-foundry/foundry-agent/trace/references/search-traces.md | Updates prerequisites/reminders to persist App Insights connection info into .foundry/agent-metadata.yaml. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/references/evaluate-step.md | Adds env/test-case prerequisites, evalId vs evaluationId guardrails, and judge deployment resolution guidance. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/references/deploy-and-setup.md | Updates auto-setup to reuse/refresh .foundry cache and persist test cases into agent metadata. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/references/compare-iterate.md | Adds parameter-switch reminder and eval-group immutability guidance. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/references/cicd-monitoring.md | Updates CI/CD guidance to read test cases and artifacts from .foundry. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/references/analyze-results.md | Updates results persistence path to .foundry/results/<environment>/... and prioritization guidance. |
| plugin/skills/microsoft-foundry/foundry-agent/observe/observe.md | Refactors observe workflow to be .foundry-aware (env selection, cache reuse, exact parameter guardrails). |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md | Adds optional “sync to Foundry” workflow using AzureBlob storage connection + dataset registration. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/mcp-gap-analysis.md | Updates MCP capability notes (dataset create + version listing now available) and remaining gaps. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-trending.md | Adds evalId/evaluationId guardrail and immutability warning for trending. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-regression.md | Updates manifest path to .foundry/datasets/manifest.json. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-lineage.md | Updates lineage guidance for .foundry and clarifies evalId vs evaluationId usage. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-versioning.md | Updates naming conventions to include environment and adds server-side version discovery guidance. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-organization.md | Updates example paths to .foundry/datasets/.... |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-curation.md | Updates candidate/versioning paths and manifest references to .foundry. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-comparison.md | Reframes comparison around dataset versions (pinned agent) and adds immutability/parameter guardrails. |
| plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/eval-datasets.md | Adds the eval-datasets skill doc with .foundry cache rules and Foundry sync entry point. |
| plugin/skills/microsoft-foundry/foundry-agent/deploy/deploy.md | Updates deploy guidance to persist deployment context into .foundry/agent-metadata.yaml and reuse cache. |
| plugin/skills/microsoft-foundry/SKILL.md | Bumps version, adds eval-datasets sub-skill, updates lifecycle routing, and documents .foundry workspace standard. |
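Several of the files above center on a local JSONL dataset workflow under `.foundry/datasets/`. The dataset schema itself is not shown in this conversation, so the sketch below only illustrates the JSONL convention (one JSON object per line). The `DatasetRow` shape and its field names are hypothetical; `responseId` stands in for the `gen_ai.response.id` join key mentioned in the commit messages.

```typescript
// Hypothetical row shape; field names are assumptions for illustration.
interface DatasetRow {
  responseId: string; // e.g. the gen_ai.response.id join key
  query: string;
  response: string;
}

// Serialize rows as JSONL: one JSON object per line, newline-terminated.
function toJsonl(rows: DatasetRow[]): string {
  if (rows.length === 0) {
    return "";
  }
  return rows.map((row) => JSON.stringify(row)).join("\n") + "\n";
}

// Parse JSONL back into rows, skipping blank lines.
function fromJsonl(text: string): DatasetRow[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as DatasetRow);
}

const rows: DatasetRow[] = [
  { responseId: "resp-1", query: "What is the refund policy?", response: "Refunds take 5 days." },
  { responseId: "resp-2", query: "Summarize this", response: "Line one.\nLine two." },
];

const jsonl = toJsonl(rows);
const roundTripped = fromJsonl(jsonl);
```

Because `JSON.stringify` escapes embedded newlines, a multi-line response still occupies exactly one line in the file, which is what keeps line-based tooling safe to use on the dataset.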
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md
Outdated
JasonYeMSFT reviewed on Mar 12, 2026
tests/microsoft-foundry/foundry-agent/eval-datasets/integration.test.ts
Outdated
JasonYeMSFT (Member) requested changes on Mar 12, 2026 and left a comment:
Please see my comment and resolve the merge conflicts.
# Conflicts: # plugin/skills/microsoft-foundry/foundry-agent/observe/observe.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Pull request overview
Copilot reviewed 44 out of 44 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
plugin/skills/microsoft-foundry/SKILL.md:7
`SKILL.md` was modified but `metadata.version` was not bumped. Repo guidelines require incrementing `metadata.version` (semver) in the same PR whenever a skill's `SKILL.md` changes.
    metadata:
      author: Microsoft
      version: "1.0.4"
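The guideline quoted above (increment `metadata.version` whenever `SKILL.md` changes) comes down to a semver comparison between the base and head versions. The repository's real check is not shown here; the following is a minimal sketch under the assumption that versions are plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffix. Both function names are hypothetical.

```typescript
// Parse "MAJOR.MINOR.PATCH" into numbers; returns null for anything else.
function parseVersion(v: string): [number, number, number] | null {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// True only when `current` is strictly greater than `previous`.
function isVersionBumped(previous: string, current: string): boolean {
  const a = parseVersion(previous);
  const b = parseVersion(current);
  if (a === null || b === null) {
    return false; // unparseable versions are treated as "not bumped"
  }
  for (let i = 0; i < 3; i++) {
    if (b[i] !== a[i]) {
      return b[i] > a[i];
    }
  }
  return false; // identical versions
}

const bumped = isVersionBumped("1.0.3", "1.0.4");
const unchanged = isVersionBumped("1.0.4", "1.0.4");
```

Comparing numeric components rather than raw strings matters once a component reaches two digits: `"1.0.10"` is newer than `"1.0.9"` even though it sorts earlier lexically.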
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/mcp-gap-analysis.md
Outdated
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/eval-datasets.md
Member
Please bump the version of SKILL.md.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-versioning.md
JasonYeMSFT approved these changes on Mar 13, 2026
This pull request focuses on improving GitHub Actions workflows, with a major emphasis on updating action versions, enhancing security and reliability, and refactoring the token analysis process for pull requests. The most significant changes are grouped below by theme.

Workflow Action Version Updates and Security Improvements:
- Updated `actions/checkout` across workflows to version `v6` for improved security and consistency. This affects files such as `.github/workflows/pr.yml`, `.github/workflows/eval.yml`, `.github/workflows/publish-to-microsoft-skills.yml`, `.github/workflows/publish-to-microsoft-azure-skills.yml`, `.github/workflows/skill-factory.yml`, `.github/workflows/test-all-integration.yml`, `.github/workflows/codeql.yml`, and `.github/workflows/info-needed-closer.yml`.
- Falls back to `github.token` if `COPILOT_GITHUB_TOKEN` is not set, improving robustness in `.github/workflows/issue-triage.lock.yml`.

Token Analysis Refactor and Workflow Separation:
- Renamed `token-check` to `token-analysis`.
- Added `.github/workflows/pr-comment.yml`, which is triggered by `workflow_run` and posts the token analysis comment from the main branch to ensure security and integrity.

Workflow Robustness and Maintainability Enhancements:
- Validates `BASE_REF` more strictly to prevent unsafe characters in token comparison and skill version check steps.

CodeQL and Language Matrix Improvements:
- Added `actions` to the language matrix for CodeQL analysis, expanding security coverage to GitHub Actions workflows.

These changes collectively modernize the CI/CD workflows, improve security, and provide a more reliable and maintainable process for token analysis and skill validation.
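The summary mentions stricter `BASE_REF` validation to keep unsafe characters out of later steps, but does not say which characters the workflow accepts. The sketch below shows one plausible allow-list approach; the permitted character set, the extra `..` and leading `-` rules, and the function name `isSafeRef` are all assumptions for illustration, not the workflow's actual rules.

```typescript
// Assumed allow-list: ASCII letters, digits, and . _ / - only.
const SAFE_REF = /^[A-Za-z0-9._\/-]+$/;

// Accept a ref only if every character is allow-listed, it contains no
// ".." sequence, and it cannot be mistaken for a command-line option.
function isSafeRef(ref: string): boolean {
  return (
    SAFE_REF.test(ref) &&
    ref.indexOf("..") === -1 &&
    ref.charAt(0) !== "-"
  );
}

const okBranch = isSafeRef("release/1.0.4");
const injection = isSafeRef("main; rm -rf /");
```

An allow-list rejects anything unexpected by default, which is generally easier to reason about than trying to enumerate every dangerous shell character in a deny-list.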