ci: enable performance quality gates #5571
igoragoli wants to merge 15 commits into augusto/add-perf-quality-gate-dd-octo-sts-policy
Add macrobenchmarks-gates and macrobenchmarks-notify stages. Include check-slo-breaches and notify-slo-breaches templates from benchmarking-platform-tools. Add placeholder check-slo-breaches job that depends on all 8 macrobenchmark jobs. Temporarily set macrobenchmarks to auto-trigger on all branches to collect baseline artifacts for SLO threshold generation.
Adds a quality gate that fails on microbenchmark regressions exceeding 20%. Uses bp-runner fail_on_regression step from benchmarking-platform. Runs after microbenchmarks with when: always to catch failures too. Set to allow_failure: true until thresholds are validated.
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
Benchmarks
Benchmark execution time: 2026-04-10 11:41:04
Comparing candidate commit d1e7605 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.
Replace check-slo-breaches placeholder with real fail_on_breach implementation. Add notify-slo-breaches job to alert on apm-dcs-performance-alerts. Generate 209 SLO thresholds across 42 scenarios using tight strategy (T=5%). Revert macrobenchmarks to manual trigger on non-master branches.
Move microbenchmarks before macrobenchmarks so macro gates and notify stages are adjacent. Restrict check-slo-breaches and notify-slo-breaches to master only since non-master branches use manual macrobenchmarks.
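The stage ordering described in the commit above could look roughly like the fragment below. This is a sketch under assumptions: the stage names `microbenchmarks` and `macrobenchmarks` are guesses, and only `macrobenchmarks-gates` and `macrobenchmarks-notify` are named in the commit messages.

```yaml
# Sketch only: the first two stage names are assumed, not taken from the PR.
stages:
  - microbenchmarks          # assumed stage name; moved ahead of macrobenchmarks
  - macrobenchmarks          # assumed stage name
  - macrobenchmarks-gates    # check-slo-breaches runs here
  - macrobenchmarks-notify   # notify-slo-breaches runs here
```

Keeping the gates and notify stages directly after the macrobenchmarks stage means the SLO check and its alert run back to back once the macrobenchmark artifacts exist.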
2e12e39 to efb574d
Drop rules: block from check-slo-breaches and notify-slo-breaches. GitLab ignores top-level when: when rules: is present. Follow dd-trace-py pattern: use when: always with no rules.
Use rules: with when: always on master, default on_success on branches. Remove conflicting top-level when: always which GitLab ignores when rules: is present.
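A minimal sketch of the pattern this commit describes, assuming the job lives in the `macrobenchmarks-gates` stage. The `script` line is a placeholder, not the real `fail_on_breach` step, which comes from the included benchmarking-platform-tools template.

```yaml
# Sketch only: the script is a hypothetical placeholder.
check-slo-breaches:
  stage: macrobenchmarks-gates
  rules:
    # On master, run the gate even if an earlier job failed.
    - if: '$CI_COMMIT_BRANCH == "master"'
      when: always
    # On every other branch, keep the default on_success behaviour.
    - when: on_success
  script:
    - echo "placeholder for the fail_on_breach step"
```

Putting `when:` inside each rule keeps the run condition in one place instead of splitting it between the job level and the `rules:` block.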
0568f25 to c866a4b
Remove baseline scenarios (not actionable). Keep only:
- normal_operation: agg_http_req_duration p50/p99
- high_load: throughput
- utilization monitors: cpu_usage_percentage, rss
Drop data_received, data_sent, dropped_iterations, http_req_duration. Reduces from 209 to 66 thresholds across 36 scenarios.
c866a4b to c3caecc
Fix macrobenchmarks-notify-slo-breaches referencing wrong job name. Move when: always into rules for microbenchmarks-check-big-regressions since GitLab ignores top-level when: when rules: is present.
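For the microbenchmark gate, moving `when: always` into `rules:` might look like the fragment below. It is a sketch under assumptions: the stage name, the upstream job name `microbenchmarks`, and the placeholder `script` are hypothetical, and the real job runs the bp-runner `fail_on_regression` step.

```yaml
# Sketch only: stage, upstream job name, and script are assumptions.
microbenchmarks-check-big-regressions:
  stage: microbenchmarks        # assumed stage name
  needs:
    - job: microbenchmarks      # assumed name of the upstream benchmark job
  allow_failure: true           # non-blocking until thresholds are validated
  rules:
    - when: always              # run even when the benchmark job failed
  script:
    - echo "placeholder for the bp-runner fail_on_regression step"
```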
Single-run SLO generation produced a tight RSS threshold (2.73 GB) that doesn't account for cross-run variance. Bump to 3.25 GB based on observed values across multiple runs.

What does this PR do?
Enables pre-release performance quality gates on dd-trace-rb.
- microbenchmarks-check-big-regressions job (20% threshold via fail_on_regression)
- macrobenchmarks-check-slo-breaches + macrobenchmarks-notify-slo-breaches jobs with SLO thresholds via fail_on_breach
Motivation:
Catch performance regressions before release. Aligns dd-trace-rb with dd-trace-go and dd-trace-py.
Change log entry
None.
Additional Notes:
SLO generation:
- benchmark_analyzer generate slos --strategy tight --significant-impact-threshold 0.10 (T=10%)
- RSS threshold raised (high_load--profiling-and-tracing-and-appsec--puma-utilization: 2.73 GB → 3.25 GB) due to cross-run variance
Quality gates setup:
- allow_failure: true until thresholds are validated
- Alerts go to apm-dcs-performance-alerts (TODO: switch to #guild-dd-ruby)
- tracing-and-appsec macrobenchmark produced no k6 results, so it has no SLO thresholds yet
How to test the change?
The CI pipeline validates that the gate jobs run correctly after the benchmarks complete.