Reproduce flaky: shutdown 'executes only once' #5584

Draft
p-datadog wants to merge 1 commit into master from
reproduce-flaky-shutdown-executes-only-once

@p-datadog

What does this PR do?

Reproducer for flaky test spec/datadog/tracing/integration_spec.rb:592 — the shutdown "executes only once" integration test. Forces the suspected race condition to validate the hypothesis in CI. Expected result: the test fails deterministically.

Do not merge — this PR exists for CI validation only.

Motivation:

Flaky test reported in PR #5581. The test intermittently fails with traces_flushed: 0 under CI load.

Hypothesis: The test fails when the HTTP round-trip to the agent takes longer than DEFAULT_SHUTDOWN_TIMEOUT (1 second). The worker thread is mid-HTTP-call when shutdown! calls join(1), the join times out, and traces_flushed is still 0 when the assertion runs.
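The suspected race can be sketched with plain Ruby threads. This is only an illustration under assumptions, not the tracer's code: `SHUTDOWN_TIMEOUT`, `SLOW_ROUND_TRIP`, and the `stats` hash are hypothetical stand-ins, and the durations are scaled down from the 1-second timeout so the snippet runs quickly.

```ruby
# Hypothetical stand-ins: the real timeout is DEFAULT_SHUTDOWN_TIMEOUT (1 second).
SHUTDOWN_TIMEOUT = 0.1 # seconds the shutdown path waits for the worker
SLOW_ROUND_TRIP  = 0.5 # seconds the simulated HTTP call takes

stats = { traces_flushed: 0 }

worker = Thread.new do
  sleep(SLOW_ROUND_TRIP)      # simulated HTTP round-trip to the agent
  stats[:traces_flushed] += 1 # counter changes only after the call returns
end

# Thread#join with a limit returns nil if the thread is still running
# when the limit expires, and the thread itself otherwise.
joined = worker.join(SHUTDOWN_TIMEOUT)

# This is the value an assertion would see right after the timed-out join.
flushed_at_assertion = stats[:traces_flushed]

# Waiting without a limit shows the flush does complete, just too late.
worker.join
flushed_eventually = stats[:traces_flushed]
```

If the hypothesis holds, the flush is not lost; it simply lands after the point where the test reads the counter.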

Reproducer technique: Stubs tracer.writer.transport.send_traces with and_wrap_original to add a 2-second sleep before the real HTTP call. This exceeds the 1-second shutdown timeout, forcing the join to time out deterministically.
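The PR's diff is not shown here, so the following is a plain-Ruby analogue of the wrap-and-delay idea rather than the actual stub. In RSpec the same effect is usually written as `allow(obj).to receive(:send_traces).and_wrap_original { |original, *args| sleep(2); original.call(*args) }`. `FakeTransport` and `wrap_with_delay` are hypothetical names introduced for illustration, and the delay is shortened.

```ruby
# Hypothetical transport; the real one performs an HTTP call to the agent.
class FakeTransport
  def send_traces(traces)
    traces.size
  end
end

# Replace a method on a single object with a version that sleeps first
# and then delegates to the original implementation.
def wrap_with_delay(obj, method_name, delay)
  original = obj.method(method_name) # bound Method captured before the override
  obj.define_singleton_method(method_name) do |*args, &block|
    sleep(delay)
    original.call(*args, &block)
  end
end

transport = FakeTransport.new
delay = 0.05 # the PR description uses 2 seconds against a 1-second timeout
wrap_with_delay(transport, :send_traces, delay)

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
result  = transport.send_traces([:trace_a, :trace_b])
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
```

Because the wrapper still calls the original, the real behavior is preserved and only the timing changes, which is what makes the reproducer a test of the timing hypothesis specifically.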

Change log entry

None.

How to test the change?

CI should show the "executes only once" / "flushed trace" test failing deterministically with traces_flushed: 0. If it doesn't, the hypothesis is wrong.

Hypothesis: the test fails when the HTTP round-trip to the agent takes
longer than DEFAULT_SHUTDOWN_TIMEOUT (1 second). The worker thread is
mid-HTTP-call when shutdown! calls join(1), the join times out, and
traces_flushed is still 0 when the assertion runs.

This commit forces the race condition by stubbing the transport's
send_traces to add a 2-second sleep, exceeding the 1-second shutdown
timeout. The test should fail deterministically in CI.

Co-Authored-By: Claude <noreply@anthropic.com>
@p-datadog added the "AI Generated" label (Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos) Apr 11, 2026
@github-actions bot added the "dev/testing" label (Involves testing processes, e.g. RSpec) Apr 11, 2026
datadog-datadog-prod-us1 bot commented Apr 11, 2026

⚠️ Tests

⚠️ Other Violations

🧪 2 Tests failed

Tracer integration tests shutdown executes only once behaves like flushed trace is expected to include {:traces_flushed => 1} from rspec
expected {:traces_flushed => 0, :transport => #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ffbccdcee78 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>} to include {:traces_flushed => 1}
Diff:
@@ -1 +1,2 @@
-:traces_flushed => 1,
+:traces_flushed => 0,
+:transport => #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ffbccdcee78 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>,

Failure/Error: expect(stats).to include(traces_flushed: 1)

  expected {:traces_flushed => 0, :transport => #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ffbccdcee78 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>} to include {:traces_flushed => 1}
...
Tracer integration tests shutdown executes only once behaves like flushed trace is expected to include {traces_flushed: 1} from rspec
expected {traces_flushed: 0, transport: #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ff59f2cdf70 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>} to include {traces_flushed: 1}
Diff:
@@ -1 +1,2 @@
-:traces_flushed => 1,
+:traces_flushed => 0,
+:transport => #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ff59f2cdf70 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>,

Failure/Error: expect(stats).to include(traces_flushed: 1)

  expected {traces_flushed: 0, transport: #<Datadog::Tracing::Transport::Statistics::Counts:0x00007ff59f2cdf70 @success=0, @client_error=0, @server_error=0, @internal_error=0, @consecutive_errors=0>} to include {traces_flushed: 1}
...

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c69ca6e
