Foundry observability skills update for creating dataset from traces and more. by XOEEst · Pull Request #1219 · microsoft/GitHub-Copilot-for-Azure

XOEEst · 2026-03-11T06:18:46Z

This pull request focuses on improving GitHub Actions workflows, with a major emphasis on updating action versions, enhancing security and reliability, and refactoring the token analysis process for pull requests. The most significant changes are grouped below by theme.

Workflow Action Version Updates and Security Improvements:

Updated all usages of actions/checkout across workflows to version v6 for improved security and consistency. This affects files such as .github/workflows/pr.yml, .github/workflows/eval.yml, .github/workflows/publish-to-microsoft-skills.yml, .github/workflows/publish-to-microsoft-azure-skills.yml, .github/workflows/skill-factory.yml, .github/workflows/test-all-integration.yml, .github/workflows/codeql.yml, and .github/workflows/info-needed-closer.yml. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
Modified environment variable handling to allow fallback to github.token if COPILOT_GITHUB_TOKEN is not set, improving robustness in .github/workflows/issue-triage.lock.yml. [1] [2] [3] [4]

Token Analysis Refactor and Workflow Separation:

Refactored the token analysis workflow for PRs by:
- Renaming the job from token-check to token-analysis.
- Removing direct commenting from the PR workflow and instead uploading the token analysis report as an artifact.
- Introducing a new workflow, .github/workflows/pr-comment.yml, which is triggered by workflow_run and posts the token analysis comment from the main branch to ensure security and integrity. [1] [2] [3]

Workflow Robustness and Maintainability Enhancements:

Improved shell script safety by validating BASE_REF more strictly to prevent unsafe characters in token comparison and skill version check steps. [1] [2]
Refactored environment variable usage and script logic for better maintainability, including passing changed file lists via environment variables and adjusting file path handling in validation steps. [1] [2] [3]
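
The stricter `BASE_REF` validation described above could look roughly like the following. This is a minimal sketch, not the workflow's actual script: the allow-list pattern and the helper name `validate_base_ref` are assumptions chosen for illustration.

```python
import re

# Hypothetical allow-list: letters, digits, and the punctuation commonly found
# in branch names. Anything else (spaces, quotes, `$`, `;`, backticks) is
# rejected before the value is interpolated into a shell command.
_SAFE_REF = re.compile(r"[A-Za-z0-9._/-]+")


def validate_base_ref(base_ref: str) -> str:
    """Return base_ref unchanged if it looks safe, otherwise raise ValueError."""
    if not _SAFE_REF.fullmatch(base_ref):
        raise ValueError(f"unsafe characters in BASE_REF: {base_ref!r}")
    # Also reject traversal-style refs and values that look like CLI options.
    if ".." in base_ref or base_ref.startswith("-"):
        raise ValueError(f"suspicious BASE_REF: {base_ref!r}")
    return base_ref
```

Rejecting by allow-list rather than stripping bad characters keeps the check easy to audit: a ref either passes unchanged or the step fails.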

CodeQL and Language Matrix Improvements:

Added actions to the language matrix for CodeQL analysis, expanding security coverage to GitHub Actions workflows.

These changes collectively modernize the CI/CD workflows, improve security, and provide a more reliable and maintainable process for token analysis and skill validation.

…OPILOT_GITHUB_TOKEN` (#6)

* Initial plan

* Fix issue triage workflow: add github.token fallback for COPILOT_GITHUB_TOKEN

The Issue Triage workflow was failing at the secret validation step because COPILOT_GITHUB_TOKEN was not configured. This adds github.token as a fallback in all 4 places where COPILOT_GITHUB_TOKEN is used for authentication:
- agent job: validate-secret step and Execute step
- detection job: validate-secret step and Execute step

This is consistent with the existing fallback patterns in the workflow (e.g., secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN || secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN)

Co-authored-by: XOEEst <18523445+XOEEst@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: XOEEst <18523445+XOEEst@users.noreply.github.com>

* Fix broken auto-create evaluators step in deploy/observe loop

The 'Auto-create evaluators & evaluation dataset' step was being skipped when the monolithic agent-observability-loop skill was split into separate deploy and observe skills. Neither skill owned the auto-create step, causing post-deploy users to jump directly to evaluation.

Changes:
- deploy.md: Replace generic 'set up evaluation?' prompt with automatic 6-step evaluator & dataset creation matching the reference behavior
- observe.md: Add Loop Overview, fix entry points to route post-deploy users through auto-setup, add evaluator existence check
- deploy-and-setup.md: Make auto-create primary content, demote deploy section to prerequisites

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add content tests for observe/deploy loop logic

Tests verify:
- observe.md has Loop Overview, post-deploy entry points, evaluator existence checks, behavioral rules, and all reference files
- deploy.md has auto-create evaluators section that is automatic (not optional), includes evaluator categories, LLM-judge, artifact persistence, and routes to observe skill Step 2
- deploy-and-setup.md has auto-create as primary content with proper evaluator selection, dataset generation, and user prompt

49 tests total (29 observe + 20 deploy), all passing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: trigger CI checks
* Fix
* add local dataset gen enforcement
* Merge

* feat: prefer monitor_resource_log_query and local datasets
- Replace azure-kusto delegation with monitor_resource_log_query for App Insights KQL queries in trace.md and troubleshoot.md
- Mark evaluation_dataset_create as not available (MCP upload not ready)
- Replace server-side dataset sections with local JSONL workflow
- Update mcp-gap-analysis.md to reflect practical tool availability

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make dataset upload restriction more agent-proof
- Add Do NOT section at top of trace-to-dataset.md (before Overview)
- Add behavioral rule #7 to eval-datasets.md: never upload to cloud
- Remove Option A/B structure; Step 4 is now local JSONL only
- Eliminates subtle strikethrough formatting that agents miss

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix link

* fix: make auto-create evaluators an explicit numbered step
- Hosted workflow: add Step 10 after Step 9 with DO NOT stop gate
- Prompt workflow: add Step 5 after Step 4 with DO NOT stop gate
- Both link to existing After Deployment section as implementation
- Prevents agents from treating evaluator setup as optional appendix

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add dataset update loop with optimization guardrails
- Add Dataset Update Loop (eval→compare→analyze→optimize→re-eval) to dataset-versioning.md after Creating a New Version
- Add guardrails: never remove dataset rows or weaken evaluators to recover scores after dataset expansion
- Add same guardrail to observe optimize-deploy.md Step 6
- Add behavioral rule #8 to eval-datasets.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add subscription parameter warning to trace-related skills

Always pass subscription explicitly to Azure MCP tools like monitor_resource_log_query; they don't extract it from resource IDs. Added to trace.md, troubleshoot.md, and trace-to-dataset.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make customEvents-to-traces eval correlation more obvious
- Add Key Concept section to trace-to-dataset.md explaining that eval results live in customEvents (not dependencies) and the join key is gen_ai.response.id
- Add table showing dependencies vs customEvents join pattern
- Cross-reference trace skill's eval-correlation.md from both trace-to-dataset.md and eval-datasets.md Related Skills

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve cross-references and add KQL parse_json warning
1. Add parse_json(customDimensions) warning to Do NOT section
2. Add Related References section with skill-root paths
3. Add skill-root path hints to all cross-skill links
4. Add observe + trace to SKILL.md sub-skill routing table

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve hosted agent KQL patterns and content extraction
- Add Hosted Agent Harvest template (requests→dependencies join)
- Fix Hosted Agent Attributes: appear on both requests and traces
- Add gen_ai.agent.name duality callout (Foundry name vs class name)
- Remove incorrect azure.ai.agentserver.agent_name fallback from dependencies queries
- Document gen_ai.input.messages/gen_ai.output.messages as content source
- Add operation_ParentId join example to Span Correlation section
- Update search-traces.md hosted agent query to use requests entry point

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: improve trace sub-skills for hosted agent KQL patterns
- search-traces: fix hosted agent query to group by operation_ParentId
- conversation-detail: add content extraction from invoke_agent spans (gen_ai.input.messages / gen_ai.output.messages)
- analyze-failures: add hosted agent gen_ai.agent.name duality warning and hosted agent variant query using requests→dependencies join
- analyze-latency: same hosted agent warning and variant query
- kql-templates: expand requests table description as preferred entry point; add gen_ai.input/output.messages to attributes table
- trace.md: reword rule 6 to clarify hosted vs prompt agent filtering

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore routing keywords and update trigger snapshots
- Add back critical routing keywords to SKILL.md description (578→779 chars): role assignment, permissions, capacity, region, deployment failure, AI Services, Cognitive Services, provision, knowledge index, monitoring, customize, onboard, availability
- Update trigger test snapshots for new keyword set (24 snapshots)
- Fix deploy trigger test: Docker IS our capability (remove false negative)
- Fix customize-deployment tests: ensure prompts have ≥2 keyword matches
- Fix deploy-model-optimal-region tests: use longer prompts for HA/PTU

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add 'create AI Services' to description for resource/create test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: bump microsoft-foundry version to 1.0.2

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(eval-datasets): enable Foundry dataset sync via MCP tools
- Add Step 5 (Sync to Foundry) to trace-to-dataset pipeline using evaluation_dataset_create with connectionName and project_connection tools
- Add server-side version discovery via evaluation_dataset_versions_get
- Add dual experiment types to dataset-comparison (agent vs dataset comparison)
- Update mcp-gap-analysis: mark resolved tools, update workarounds
- Add AzureBlob to project connections reference
- Bump microsoft-foundry version to 1.0.3
- Fix upstream section heading changes in unit tests
- Update trigger snapshots for upstream keyword changes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(dataset-comparison): focus on dataset-version comparison only

Remove agent comparison experiment type from dataset-comparison flow. Agent comparison belongs in the observe/eval loop, not the dataset skill. Update all examples to use dataset versions as baseline/treatment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove Playwright MCP server until skills require it (microsoft#1200)
* Collapse token analysis comment (microsoft#1147)
* update region-availability in prepare/validate/deploy skill (microsoft#1083)
* update region-availability in prepare/deploy skill
* update
* update
* fix
* update date
* Update plugin/skills/azure-deploy/references/region-availability.md
* fix ci failure
* bump version

* build(deps): bump @github/copilot and @github/copilot-sdk in /tests (microsoft#1201)

Bumps [@github/copilot](https://github.com/github/copilot-cli) to 1.0.2 and updates ancestor dependency [@github/copilot-sdk](https://github.com/github/copilot-sdk). These dependencies need to be updated together.

Updates `@github/copilot` from 0.0.414 to 1.0.2
- [Release notes](https://github.com/github/copilot-cli/releases)
- [Changelog](https://github.com/github/copilot-cli/blob/main/changelog.md)
- [Commits](github/copilot-cli@v0.0.414...v1.0.2)

Updates `@github/copilot-sdk` from 0.1.26 to 0.1.32
- [Release notes](https://github.com/github/copilot-sdk/releases)
- [Changelog](https://github.com/github/copilot-sdk/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/copilot-sdk/commits/v0.1.32)

---
updated-dependencies:
- dependency-name: "@github/copilot"
  dependency-version: 1.0.2
  dependency-type: indirect
- dependency-name: "@github/copilot-sdk"
  dependency-version: 0.1.32
  dependency-type: direct:development
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* specify application path in prompt (microsoft#1204)
* Add AVM (Azure Verified Modules) integration tests (microsoft#1171)

* Add AVM (Azure Verified Modules) integration tests

Add 3 integration tests validating the AVM module selection hierarchy for Bicep infrastructure generation:
- avm-module-priority: Verifies AVM modules prioritized over non-AVM
- avm-fallback-behavior: Verifies fallback stays within AVM ecosystem
- avm-azd-pattern-preference: Verifies AZD pattern modules preferred

Tests validate that the azure-deploy skill enforces the mandatory AVM selection order: Pattern modules > Resource modules > Utility modules, and never falls back to non-AVM alternatives.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add output assertions to AVM integration tests

Address Copilot review feedback: add keyword-based output assertions using getAllAssistantMessages/getAllToolText to verify agent responses contain AVM hierarchy terms, not just skill invocation. Includes non-AVM fallback negative check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Strengthen AVM test output assertions per Copilot review
- Split keyword checks into critical-term + context assertions
- Add resource-before-utility ordering assertion for fallback test
- Expand non-AVM negative check to use regex patterns
- Require core keywords (avm+pattern, azd+pattern) explicitly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address Copilot round 3 - ordering assertions and context-aware non-AVM check
- Add hierarchy ordering assertion to test 1 (pattern before resource/utility)
- Make non-AVM detection context-aware: skip matches preceded by negation words (e.g., 'never fall back to non-AVM' is correct behavior, not a false positive)
- Add pattern-before-resource ordering assertion to test 3 (AZD pattern preference)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move AVM integration tests to avm/ subdirectory

Move tests/azure-deploy/avm-integration.test.ts to tests/azure-deploy/avm/integration.test.ts so the file matches the **/integration.test.ts glob used by the custom ESLint rule (integration-test-name) and follows the subdirectory convention established by tests/microsoft-foundry/ (e.g. foundry-agent/). Import paths updated from ../utils/ to ../../utils/ to reflect the new depth.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address round 4 Copilot review feedback
- Add 'fall back'/'fall-back' keyword variants for resilience
- Extend non-AVM negation check to also scan following context
- Use regex for AZD ordering assertion to match plural/prefixed variants

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update github workflows to use best practices (microsoft#1149)

* Early terminate azure-deploy, azure-validate tests (microsoft#1205)

Add comment for early termination to help AI grader

* Replace script inline parameters with env var (microsoft#1209)
* Early terminate azure-deploy tests on deploy link (microsoft#1208)
* Early terminate azure-deploy tests on deploy link
* Fix lint issue
* Reduce char count of existing skills (microsoft#1210)
* Reduce char count of existing skills
* Update ci tests and snapshots
* Enhance benchmark ci run script (microsoft#1176)
* Add msbench_benchmarks repo clone to get model definition
* Remove unused vars
* Use mcp-pr repo before MI has access to msbench-benchmarks repo
* Address copilot feedback
* Change back to msbench-benchmarks repo
* Get ADO token for repo clone
* Fix line continuation character
* Add run for all interested models
* Extract run IDs
* Fix yaml format issue
* Schedule it to run nightly
* Address copilot feedbacks
* formalize .foundry and multi-environment support
* fix
* Feature/azure quotas (microsoft#1137)
* update for using azure-quotas in skill
* test update
* unit test update
* path update
* add skill in skills.json
* skills.json update
* reduce the text
* version update
* skill version
* skill description update
* reduce text size
* 1.0.4 for next prepare version
* upload snap shot
* update version
* test update

---------

Co-authored-by: Yinghui Dong <yinghuidong@microsoft.com>

* build(deps-dev): bump simple-git from 3.30.0 to 3.32.3 in /tests (microsoft#1213)

Bumps [simple-git](https://github.com/steveukx/git-js/tree/HEAD/simple-git) from 3.30.0 to 3.32.3.
- [Release notes](https://github.com/steveukx/git-js/releases)
- [Changelog](https://github.com/steveukx/git-js/blob/main/simple-git/CHANGELOG.md)
- [Commits](https://github.com/steveukx/git-js/commits/simple-git@3.32.3/simple-git)

---
updated-dependencies:
- dependency-name: simple-git
  dependency-version: 3.32.3
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Improve azure-compliance invocation rate (microsoft#1214)
* Improve azure-compliance invocation rate
* Race condition free report writing
* Fix debug logging for report location
* Bump skill version
* Fix suffix base value
* fix
* llm judge model and eval group improvement

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Chris Harris <charris@microsoft.com>
Co-authored-by: JasonYeMSFT <chuye@microsoft.com>
Co-authored-by: xfz11 <81600993+xfz11@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Juan Ospina <70209456+jeo02@users.noreply.github.com>
Co-authored-by: Jon Gallant <2163001+jongio@users.noreply.github.com>
Co-authored-by: Wes Haggard <weshaggard@users.noreply.github.com>
Co-authored-by: Fan Yang <52458914+fanyang-mono@users.noreply.github.com>
Co-authored-by: rakal-dyh <33503911+rakal-dyh@users.noreply.github.com>
Co-authored-by: Yinghui Dong <yinghuidong@microsoft.com>
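
The commit notes above state that evaluation results live in `customEvents` while agent spans live in `dependencies`, that the join key is `gen_ai.response.id`, and that datasets are written as local JSONL. A rough in-memory sketch of that correlation follows. The row shapes, the `query`/`response`/`name`/`score` field names, and both function names are assumptions for illustration; real rows come from App Insights queries and will look different.

```python
import json
from pathlib import Path

JOIN_KEY = "gen_ai.response.id"  # join key named in the commit notes


def join_spans_with_evals(dependencies, custom_events):
    """Attach eval results (customEvents rows) to spans (dependencies rows)."""
    evals_by_id = {}
    for event in custom_events:
        response_id = event.get(JOIN_KEY)
        if response_id is not None:
            evals_by_id.setdefault(response_id, []).append(event)

    rows = []
    for span in dependencies:
        response_id = span.get(JOIN_KEY)
        if response_id is None:
            continue  # no response id on this span, so nothing to correlate
        rows.append({
            "response_id": response_id,
            "query": span.get("query"),        # hypothetical field name
            "response": span.get("response"),  # hypothetical field name
            "evals": [
                {"name": e.get("name"), "score": e.get("score")}
                for e in evals_by_id.get(response_id, [])
            ],
        })
    return rows


def write_jsonl(rows, path):
    """Write one JSON object per line and return the number of rows written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)
```

In the layout this PR describes, the output file would sit under `.foundry/datasets/`; the exact file naming is defined in the dataset-versioning reference and is not reproduced here.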

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

# Conflicts:
# tests/microsoft-foundry/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/create/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/deploy/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/invoke/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/observe/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/trace/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/foundry-agent/troubleshoot/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/models/deploy/capacity/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/models/deploy/customize-deployment/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/models/deploy/deploy-model-optimal-region/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/models/deploy/deploy-model/__snapshots__/triggers.test.ts.snap
# tests/microsoft-foundry/resource/create/__snapshots__/triggers.test.ts.snap

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Updates the microsoft-foundry skill documentation and test suite to formalize a .foundry/ workspace standard, expand the observability/evaluation loop, and add an eval-datasets sub-skill for trace-to-dataset workflows (including optional dataset sync back to Foundry).

Changes:

Introduces/standardizes .foundry/agent-metadata.yaml + .foundry/{datasets,evaluators,results}/ as the local cache + configuration contract across deploy/observe/trace/eval-datasets docs.
Adds the eval-datasets sub-skill content and accompanying unit/trigger/integration tests.
Updates trigger prompts/snapshots and unit assertions to reflect new routing keywords (prompt optimization) and new/renamed sections.
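
To make the `.foundry/` layout concrete, the sketch below creates the directories named in the summary above along with an empty metadata file. Only the paths `.foundry/agent-metadata.yaml` and `.foundry/{datasets,evaluators,results}/` come from this PR; the function name is hypothetical, and the metadata file is left empty because its schema is not described in this overview.

```python
from pathlib import Path

FOUNDRY_SUBDIRS = ("datasets", "evaluators", "results")


def scaffold_foundry_workspace(agent_root):
    """Create the .foundry/ cache layout under agent_root if it is missing."""
    foundry = Path(agent_root) / ".foundry"
    for name in FOUNDRY_SUBDIRS:
        (foundry / name).mkdir(parents=True, exist_ok=True)
    # The metadata schema is out of scope for this sketch, so the file is
    # only created if absent and never overwritten.
    metadata = foundry / "agent-metadata.yaml"
    metadata.touch(exist_ok=True)
    return foundry
```

Because every call uses `exist_ok=True`, running the helper again on an existing workspace leaves cached datasets and results untouched.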

Reviewed changes

Copilot reviewed 44 out of 44 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/microsoft-foundry/unit.test.ts	Updates unit assertions for new SKILL.md headings, routing keywords, and `.foundry` workspace contract references.
tests/microsoft-foundry/triggers.test.ts	Adds prompt-optimization trigger prompts for the root skill.
tests/microsoft-foundry/resource/create/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/models/deploy/deploy-model/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/models/deploy/deploy-model-optimal-region/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/models/deploy/customize-deployment/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/models/deploy/capacity/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/troubleshoot/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/trace/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/observe/unit.test.ts	Aligns observe tests with `.foundry` cache paths and new guardrails (evalId vs evaluationId, judge deployment lookup).
tests/microsoft-foundry/foundry-agent/observe/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/invoke/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/eval-datasets/unit.test.ts	Adds unit tests validating eval-datasets content structure and reference file presence.
tests/microsoft-foundry/foundry-agent/eval-datasets/triggers.test.ts	Adds trigger tests and snapshots for eval-datasets prompts/keywords.
tests/microsoft-foundry/foundry-agent/eval-datasets/integration.test.ts	Adds integration tests ensuring eval-datasets prompts invoke the microsoft-foundry skill.
tests/microsoft-foundry/foundry-agent/eval-datasets/__snapshots__/triggers.test.ts.snap	Adds new snapshots for eval-datasets trigger keywords/description parsing.
tests/microsoft-foundry/foundry-agent/deploy/unit.test.ts	Updates deploy tests to require judge deployment discovery guidance and `.foundry` persistence expectations.
tests/microsoft-foundry/foundry-agent/deploy/triggers.test.ts	Tweaks a non-trigger prompt to reduce overlap with Foundry monitoring scenarios.
tests/microsoft-foundry/foundry-agent/deploy/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/foundry-agent/create/unit.test.ts	Minor formatting change (blank line).
tests/microsoft-foundry/foundry-agent/create/__snapshots__/triggers.test.ts.snap	Updates snapshots to reflect updated skill description keywords.
tests/microsoft-foundry/__snapshots__/triggers.test.ts.snap	Updates root snapshots to reflect updated skill description keywords.
plugin/skills/microsoft-foundry/references/agent-metadata-contract.md	Adds a canonical schema and workflow rules for `.foundry/agent-metadata.yaml`.
plugin/skills/microsoft-foundry/project/connections.md	Documents `AzureBlob` as a connection type for dataset sync scenarios.
plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md	Refactors trace skill to be environment/agent-root scoped via `.foundry` metadata and updates behavioral rules.
plugin/skills/microsoft-foundry/foundry-agent/trace/references/search-traces.md	Updates prerequisites/reminders to persist App Insights connection info into `.foundry/agent-metadata.yaml`.
plugin/skills/microsoft-foundry/foundry-agent/observe/references/evaluate-step.md	Adds env/test-case prerequisites, evalId vs evaluationId guardrails, and judge deployment resolution guidance.
plugin/skills/microsoft-foundry/foundry-agent/observe/references/deploy-and-setup.md	Updates auto-setup to reuse/refresh `.foundry` cache and persist test cases into agent metadata.
plugin/skills/microsoft-foundry/foundry-agent/observe/references/compare-iterate.md	Adds parameter-switch reminder and eval-group immutability guidance.
plugin/skills/microsoft-foundry/foundry-agent/observe/references/cicd-monitoring.md	Updates CI/CD guidance to read test cases and artifacts from `.foundry`.
plugin/skills/microsoft-foundry/foundry-agent/observe/references/analyze-results.md	Updates results persistence path to `.foundry/results/<environment>/...` and prioritization guidance.
plugin/skills/microsoft-foundry/foundry-agent/observe/observe.md	Refactors observe workflow to be `.foundry`-aware (env selection, cache reuse, exact parameter guardrails).
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md	Adds optional “sync to Foundry” workflow using AzureBlob storage connection + dataset registration.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/mcp-gap-analysis.md	Updates MCP capability notes (dataset create + version listing now available) and remaining gaps.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-trending.md	Adds evalId/evaluationId guardrail and immutability warning for trending.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-regression.md	Updates manifest path to `.foundry/datasets/manifest.json`.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/eval-lineage.md	Updates lineage guidance for `.foundry` and clarifies evalId vs evaluationId usage.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-versioning.md	Updates naming conventions to include environment and adds server-side version discovery guidance.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-organization.md	Updates example paths to `.foundry/datasets/...`.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-curation.md	Updates candidate/versioning paths and manifest references to `.foundry`.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-comparison.md	Reframes comparison around dataset versions (pinned agent) and adds immutability/parameter guardrails.
plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/eval-datasets.md	Adds the eval-datasets skill doc with `.foundry` cache rules and Foundry sync entry point.
plugin/skills/microsoft-foundry/foundry-agent/deploy/deploy.md	Updates deploy guidance to persist deployment context into `.foundry/agent-metadata.yaml` and reuse cache.
plugin/skills/microsoft-foundry/SKILL.md	Bumps version, adds eval-datasets sub-skill, updates lifecycle routing, and documents `.foundry` workspace standard.

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md

plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md

tests/microsoft-foundry/foundry-agent/eval-datasets/integration.test.ts

JasonYeMSFT

Please see my comment and resolve the merge conflicts.

# Conflicts: # plugin/skills/microsoft-foundry/foundry-agent/observe/observe.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 44 out of 44 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

plugin/skills/microsoft-foundry/SKILL.md:7

SKILL.md was modified but metadata.version was not bumped. Repo guidelines require incrementing metadata.version (semver) in the same PR whenever a skill’s SKILL.md changes.

metadata:
  author: Microsoft
  version: "1.0.4"
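
The guideline quoted above asks for a semver increment of `metadata.version` whenever SKILL.md changes. As an illustration only, a patch-level bump can be computed as below; `bump_patch` is a hypothetical helper, not a script from the repository, and it handles plain `MAJOR.MINOR.PATCH` strings only.

```python
def bump_patch(version: str) -> str:
    """Return version with its patch component incremented by one."""
    parts = version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(part) for part in parts)
    return f"{major}.{minor}.{patch + 1}"
```

For the snippet shown above, `bump_patch("1.0.4")` returns `"1.0.5"`.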

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/mcp-gap-analysis.md

plugin/skills/microsoft-foundry/SKILL.md

plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/eval-datasets.md

JasonYeMSFT · 2026-03-13T00:01:13Z

Please bump the version in SKILL.md.

Copilot

Pull request overview

Copilot reviewed 46 out of 46 changed files in this pull request and generated 4 comments.

plugin/skills/microsoft-foundry/SKILL.md

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/dataset-versioning.md

tests/microsoft-foundry/foundry-agent/deploy/triggers.test.ts

Copilot AI and others added 3 commits March 10, 2026 23:35

XOEEst force-pushed the main branch from e9cd275 to 92f37a4 March 11, 2026 06:39

XOEEst and others added 3 commits March 10, 2026 23:43

chore: drop issue triage token fallback

7e19dd0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

docs: simplify deploy P0 test case guidance

b45bda1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

chore: align microsoft-foundry version

67d153c

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

XOEEst marked this pull request as ready for review March 11, 2026 07:04

Copilot AI review requested due to automatic review settings March 11, 2026 07:04

Copilot started reviewing on behalf of XOEEst March 11, 2026 07:05

Copilot AI reviewed Mar 11, 2026

plugin/skills/microsoft-foundry/foundry-agent/eval-datasets/references/trace-to-dataset.md (outdated)

plugin/skills/microsoft-foundry/foundry-agent/trace/trace.md (outdated)

Fix comments

4dcf65a

github-actions bot mentioned this pull request Mar 12, 2026

[repo-status] Weekly Repo Status — Mar 6 – Mar 12, 2026 #1240

Open

JasonYeMSFT reviewed Mar 12, 2026

tests/microsoft-foundry/foundry-agent/eval-datasets/integration.test.ts (outdated)

JasonYeMSFT requested changes Mar 12, 2026

XOEEst and others added 2 commits March 12, 2026 15:39

Merge remote-tracking branch 'upstream/main'

abaa1a3

# Conflicts: # plugin/skills/microsoft-foundry/foundry-agent/observe/observe.md

test: move eval-datasets invocation tests

6862366

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 12, 2026 22:41

Copilot started reviewing on behalf of XOEEst March 12, 2026 22:42

XOEEst requested a review from JasonYeMSFT March 12, 2026 22:43

Copilot AI reviewed Mar 12, 2026

Fix comments

85eefe5

bump version

c6dc5b4

Copilot AI review requested due to automatic review settings March 13, 2026 00:06

Copilot started reviewing on behalf of XOEEst March 13, 2026 00:06

Copilot AI reviewed Mar 13, 2026

JasonYeMSFT approved these changes Mar 13, 2026

JasonYeMSFT merged commit fef6de1 into microsoft:main Mar 13, 2026
15 of 16 checks passed

JasonYeMSFT merged 11 commits into microsoft:main from XOEEst:main

4 participants
