feat: new test-health page #28398

Draft
wants to merge 2 commits into
base: master
41 changes: 23 additions & 18 deletions config/_default/menus/main.en.yaml
@@ -4622,96 +4622,101 @@ menu:
   parent: tests_flaky_test_management
   identifier: tests_auto_test_retries
   weight: 702
+- name: Test Health
+  url: tests/test_health
+  parent: tests
+  identifier: test_health
+  weight: 8
 - name: Test Impact Analysis
   url: tests/test_impact_analysis
   parent: tests
   identifier: test_impact_analysis
-  weight: 8
+  weight: 9
 - name: Setup
   url: tests/test_impact_analysis/setup/
   parent: test_impact_analysis
   identifier: test_impact_analysis_setup
-  weight: 801
+  weight: 901
 - name: .NET
   url: tests/test_impact_analysis/setup/dotnet/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_dotnet
-  weight: 8101
+  weight: 9101
 - name: JavaScript and TypeScript
   url: tests/test_impact_analysis/setup/javascript/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_javascript
-  weight: 8102
+  weight: 9102
 - name: Python
   url: tests/test_impact_analysis/setup/python/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_python
-  weight: 8103
+  weight: 9103
 - name: Swift
   url: tests/test_impact_analysis/setup/swift/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_swift
-  weight: 8104
+  weight: 9104
 - name: Java
   url: tests/test_impact_analysis/setup/java/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_java
-  weight: 8105
+  weight: 9105
 - name: Ruby
   url: tests/test_impact_analysis/setup/ruby/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_ruby
-  weight: 8106
+  weight: 9106
 - name: Go
   url: tests/test_impact_analysis/setup/go/
   parent: test_impact_analysis_setup
   identifier: test_impact_analysis_setup_go
-  weight: 8107
+  weight: 9107
 - name: How It Works
   url: tests/test_impact_analysis/how_it_works/
   parent: test_impact_analysis
   identifier: test_impact_analysis_how_it_works
-  weight: 802
+  weight: 902
 - name: Troubleshooting
   url: tests/test_impact_analysis/troubleshooting/
   parent: test_impact_analysis
   identifier: test_impact_analysis_troubleshooting
-  weight: 803
+  weight: 903
 - name: Developer Workflows
   url: tests/developer_workflows
   parent: tests
   identifier: tests_developer_workflows
-  weight: 9
+  weight: 10
 - name: Code Coverage
   url: tests/code_coverage
   parent: tests
   identifier: tests_code_coverage
-  weight: 10
+  weight: 11
 - name: Instrument Browser Tests with RUM
   url: tests/browser_tests
   parent: tests
   identifier: tests_browser_tests
-  weight: 11
+  weight: 12
 - name: Instrument Swift Tests with RUM
   url: tests/swift_tests
   parent: tests
   identifier: tests_swift_tests
-  weight: 12
+  weight: 13
 - name: Correlate Logs and Tests
   url: tests/correlate_logs_and_tests
   parent: tests
   identifier: tests_correlate_logs_and_tests
-  weight: 13
+  weight: 14
 - name: Guides
   url: tests/guides/
   parent: tests
   identifier: tests_guides
-  weight: 14
+  weight: 15
 - name: Troubleshooting
   url: tests/troubleshooting/
   parent: tests
   identifier: tests_troubleshooting
-  weight: 15
+  weight: 16
 - name: Quality Gates
   url: quality_gates/
   pre: ci
2 changes: 1 addition & 1 deletion content/en/tests/repositories.md
@@ -59,7 +59,7 @@ The [Test Optimization settings][3] page gives you an overview of the features e
 - **[GitHub Comments][4]**: Show summaries of your test results directly in pull requests.
 - **[Auto Test Retries][5]**: Retry failing tests to avoid failing your build due to flaky tests.
 - **[Early Flake Detection][6]**: Identify flaky tests early in the development cycle.
-- **[Test Impact Analysis][7]**: Automatically select and run only the relevant tests for a given commit based on the code being changed.
+- **[Test Impact Analysis][8]**: Automatically select and run only the relevant tests for a given commit based on the code being changed.

#### Overrides for test services

65 changes: 65 additions & 0 deletions content/en/tests/test_health/_index.md
@@ -0,0 +1,65 @@
---
title: Test Health
description: Understand and quantify the impact of your tests
further_reading:
- link: "/continuous_integration/tests/"
tag: "Documentation"
text: "Learn about Test Optimization"
---

{{< site-region region="gov" >}}
<div class="alert alert-warning">Test Optimization is not available in the selected site ({{< region-param key="dd_site_name" >}}) at this time.</div>
{{< /site-region >}}

## Overview

Effective test optimization involves not only identifying flaky tests but also providing insights that help Developer Experience teams (such as DevOps or Platform teams) clearly demonstrate value and manage their CI pipelines and test sessions more efficiently. The impact of failing jobs extends across multiple dimensions, including increased infrastructure costs from rerunning test jobs, decreased deployment frequency, and lost developer focus time.

Test Health provides insights at the repository and test service levels, focusing on pipelines that failed due to flaky tests, time wasted in CI due to flaky tests, pipelines prevented from failing by Test Optimization, and time saved in CI thanks to the Test Optimization product.

## Key Metrics

### Pipelines Failed
- **Failures due to non-flaky tests**: Count of CI pipelines with failed test sessions containing non-flaky test failures.
- **Failures due to flaky tests**: Count of CI pipelines with failed test sessions where all test failures are flaky (`@test.is_known_flaky` or `@test.is_new_flaky`).
- **Percentage of failures due to flaky tests**: `(Flaky Failures / Total Failures) × 100`.
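
The classification above can be sketched in a few lines of Python. This is only an illustration of the definitions, not how Test Health computes them: the `FailedTest` record, the function names, the one-session-per-pipeline simplification, and the reading of "Total Failures" as flaky plus non-flaky failures are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailedTest:
    """Hypothetical record for one failed test in a failed test session."""
    is_known_flaky: bool = False  # mirrors @test.is_known_flaky
    is_new_flaky: bool = False    # mirrors @test.is_new_flaky

    @property
    def is_flaky(self) -> bool:
        return self.is_known_flaky or self.is_new_flaky


def classify_failed_session(failures: List[FailedTest]) -> str:
    """Return "flaky" only when every failure in the session is flaky."""
    if not failures:
        raise ValueError("a failed session needs at least one failed test")
    return "flaky" if all(f.is_flaky for f in failures) else "non_flaky"


def flaky_failure_percentage(failed_sessions: List[List[FailedTest]]) -> float:
    """Percentage of failed sessions whose failures are all flaky."""
    if not failed_sessions:
        return 0.0
    flaky = sum(1 for s in failed_sessions if classify_failed_session(s) == "flaky")
    return 100.0 * flaky / len(failed_sessions)
```

A session with one flaky and one non-flaky failure counts as a non-flaky failure, because a retry alone would not have made it pass.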

### Time Wasted in CI
- **Lost duration due to failed tests**: Total duration of failed test sessions containing test failures.
- **Lost duration due to flaky tests**: Total duration of failed test sessions where all test failures are flaky (`@test.is_known_flaky` or `@test.is_new_flaky`).

### Pipelines Saved

- **Number of pipelines saved due to auto-retries**: Number of CI pipelines with passed test sessions containing tests with `@test.is_retry:true` and `@test.is_new:false`.
- **Percentage of pipelines saved due to auto-retries**: `(Number of CI pipelines saved due to auto-retries / Total number of CI pipelines with tests) × 100`.
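
As a rough sketch of the rule above, the following Python treats a pipeline as saved when its test session passed and at least one test in it was retried without being a new test. The `RunRecord` and `Session` types and the function names are hypothetical, and each pipeline is assumed to have a single test session.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunRecord:
    """Hypothetical record for one test execution."""
    is_retry: bool  # mirrors @test.is_retry
    is_new: bool    # mirrors @test.is_new


@dataclass
class Session:
    """Hypothetical test session, assumed to map to one CI pipeline."""
    passed: bool
    tests: List[RunRecord] = field(default_factory=list)


def saved_by_auto_retries(session: Session) -> bool:
    """True for a passed session containing a retried test that is not new."""
    return session.passed and any(t.is_retry and not t.is_new for t in session.tests)


def saved_percentage(sessions: List[Session]) -> float:
    """Percentage of sessions with tests that were saved by auto-retries."""
    with_tests = [s for s in sessions if s.tests]
    if not with_tests:
        return 0.0
    saved = sum(1 for s in with_tests if saved_by_auto_retries(s))
    return 100.0 * saved / len(with_tests)
```

The `is_new` check keeps retries of new tests out of the count, which presumably separates Auto Test Retries from retries performed for other reasons, such as Early Flake Detection.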

### Time Saved in CI
- **Saved time due to auto-retries**: Total duration of passed test sessions in which some tests initially failed but later passed due to the [Auto Test Retries][1] feature. These tests are tagged with `@test.is_retry:true` and `@test.is_new:false`.
- **Saved time due to Test Impact Analysis**: Total duration indicated by `@test_session.itr.time_saved`.
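
The two totals above are plain sums over test sessions. The sketch below assumes each session is a dictionary with hypothetical keys (`passed`, `saved_by_retry`, `duration_s`, `itr_time_saved_s`) and that durations are in seconds. None of these names come from the product, and `itr_time_saved_s` only stands in for `@test_session.itr.time_saved`.

```python
from typing import Dict, Iterable


def time_saved_seconds(sessions: Iterable[dict]) -> Dict[str, float]:
    """Sum the time saved by auto-retries and by Test Impact Analysis."""
    totals = {"auto_retries": 0.0, "test_impact_analysis": 0.0}
    for session in sessions:
        # A passed session rescued by a retry counts with its full duration.
        if session["passed"] and session.get("saved_by_retry", False):
            totals["auto_retries"] += session["duration_s"]
        # Time saved by Test Impact Analysis is reported per session.
        totals["test_impact_analysis"] += session.get("itr_time_saved_s", 0.0)
    return totals
```

The full session duration is counted for auto-retries because, without the retry, the session would have failed and the time spent on it would have been lost.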

## Common Use Cases

Platform and DevOps teams conduct thorough evaluations within strict budgets, and their decisions focus heavily on measurable value relative to cost. During evaluations or proofs of concept (POCs), it's critical for these teams to quickly demonstrate the quantifiable impact of flaky tests and improvements in overall test reliability.

Test Health provides these teams with metrics and insights necessary to quantify the potential impact of Test Optimization in their repositories.

### Reduce Team Frustration from Unreliable Test Pipelines
Test Health quantifies the **Developer Experience** by comparing the number of failures caused by legitimate regressions with the number caused by unreliable tests. The **Pipelines Failed** section shows how critical flaky tests are (the percentage of pipeline failures due to flaky tests).

Test Optimization provides features to alleviate these issues:
- **[Auto Test Retries][1]** reduce the risk of flaky tests ruining entire test sessions.
- **[Early Flake Detection][2]**, combined with **[Quality Gates][3]**, prevents flaky tests from entering your main branches.
- **[Test Impact Analysis][4]** reduces exposure to flaky tests by running only the tests relevant to a code change, as determined by code coverage.

### Reduce Lost Time in Your Pipelines
In most cases, tests serve as gating checks for pipelines (if a test fails, the pipeline fails and becomes blocked). **Test Health** provides insights to understand the impact of failing tests in your Continuous Integration provider.

Test Optimization provides features to alleviate these issues:
- **[Auto Test Retries][1]**: If a single flaky test fails during your session, the entire duration of the CI job is lost. Auto Test Retries allow flaky tests to be easily rerun, increasing the likelihood of passing.
- **[Test Impact Analysis][4]**: By running only tests relevant to your code changes, you reduce the overall duration of the test session.


[1]: /tests/flaky_test_management/auto_test_retries/
[2]: /tests/flaky_test_management/early_flake_detection/
[3]: /quality_gates/
[4]: /tests/test_impact_analysis/