Enable metric collection on long running cluster test pipeline. #4204
Pull request overview
Integrates persistent cluster metrics emission into the swiftv2 long-running cluster test pipeline to improve observability of per-stage test outcomes.
Changes:
- Adds a `BuildMetricsBinary` job that builds and publishes the `persistent-cluster-metrics` binary as a pipeline artifact.
- Wraps each datapath test stage (`create`, `connectivity`, `private endpoint`, `scale`, `delete`) to emit success/failure metrics after `go test`.
- Switches the pipeline default deployment region from `eastus2` to `westus2`.
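The wrapping described in the second bullet can be sketched as follows. This is a minimal illustration in Python, not the pipeline's actual script: the `emit` callback is a hypothetical stand-in for invoking the `persistent-cluster-metrics` binary, and returning the original exit code (so a failing test could still fail the job) is an assumption rather than something the PR states.

```python
import subprocess
import sys
from typing import Callable, Sequence

# Metric names as listed in the PR description.
SUCCEEDED = "PersistentClusterTestsSucceeded"
FAILED = "PersistentClusterTestsFailed"


def metric_for_exit_code(exit_code: int) -> str:
    """Map a test command's exit code to the metric name to emit."""
    return SUCCEEDED if exit_code == 0 else FAILED


def run_and_report(test_cmd: Sequence[str], emit: Callable[[str], None]) -> int:
    """Run the test command, emit exactly one metric, return the exit code.

    `emit` is a hypothetical hook standing in for the metrics binary.
    """
    # check=False: a non-zero exit is an expected outcome, not an exception.
    result = subprocess.run(list(test_cmd), check=False)
    emit(metric_for_exit_code(result.returncode))
    return result.returncode
```

In a real stage, `test_cmd` would be the `go test` invocation; for a quick local check, a command such as `[sys.executable, "-c", "raise SystemExit(1)"]` stands in for a failing test run.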
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `.pipelines/swiftv2-long-running/template/datapath-tests-stage.yaml` | Builds/downloads metrics binary and emits success/failure metrics per test job; updates storage RBAC script paths for multi-repo layout. |
| `.pipelines/swiftv2-long-running/pipeline.yaml` | Changes default location parameter to `westus2`. |
/azp run Azure Container Networking PR
Azure Pipelines successfully started running 1 pipeline(s).
* Enable metric collection on long running cluster test pipeline. Move scheduled runs from eastus2 to westus2 for TIP session availability.
* Refactor: extract metrics setup steps into reusable template
* remove redundant dependencies.
---------
Co-authored-by: sivakami <[email protected]>
Reason for Change:
Add observability to the swiftv2 long running cluster test pipeline by integrating metrics collection.
Issue Fixed:
This PR integrates the persistent-cluster-metrics tool to report test outcomes (success/failure) for each test stage to our telemetry system.
test run on long running cluster pipeline
Changes:
- Build the metrics binary from the Networking-Aquarius repo during pipeline execution.
- After each test (pod creation, connectivity, private endpoint, scale, cleanup), report the result as a metric.
- Switch the default region to westus2 (temporary; eastus2 TIP sessions are unavailable).
Metrics emitted:
https://amg-persistent-metrics-ccawb9g7hecucsh2.wus2.grafana.azure.com/goto/a0zNbRHDR?orgId=1
- `PersistentClusterTestsSucceeded`: emitted when a test passes
- `PersistentClusterTestsFailed`: emitted when a test fails
Each metric includes the ADO run ID, resource group, workload type, and test scenario name.
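To make the metric shape concrete, here is a rough sketch of one record carrying the four dimensions listed above. The field names (`ado_run_id`, `resource_group`, `workload_type`, `test_scenario`) and the dictionary layout are illustrative assumptions; the actual schema used by the persistent-cluster-metrics tool is not shown in this PR.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OutcomeMetric:
    """One test-outcome metric; field names are hypothetical."""
    name: str
    ado_run_id: str
    resource_group: str
    workload_type: str
    test_scenario: str


def build_metric(passed: bool, ado_run_id: str, resource_group: str,
                 workload_type: str, test_scenario: str) -> dict:
    """Pick the metric name from the test outcome and attach the dimensions."""
    name = ("PersistentClusterTestsSucceeded" if passed
            else "PersistentClusterTestsFailed")
    return asdict(OutcomeMetric(name, ado_run_id, resource_group,
                                workload_type, test_scenario))
```

For example, `build_metric(False, "12345", "rg-example", "example-workload", "connectivity")` yields a record named `PersistentClusterTestsFailed` with those four placeholder dimension values.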