
feat(tracing): add DD_TRACE_SECURE_RANDOM option for guaranteed ID uniqueness#5739

Draft
litianningdatadog wants to merge 4 commits into master from tianning.li/dd-trace-secure-random

Conversation

litianningdatadog commented May 11, 2026

Tech Doc

Summary

  • Utils.next_id uses a module-level Random.new (Mersenne Twister) that is seeded from OS entropy once, at construction. From then on, its output is fully determined by the PRNG state. In environments where process memory is cloned, every copy starts from the same PRNG state and produces identical trace and span ID sequences. The existing after_fork! guard reseeds after fork(2) by comparing PIDs, but it cannot detect other forms of memory duplication.
  • When DD_TRACE_SECURE_RANDOM=true, next_id delegates to SecureRandom.random_number instead. SecureRandom reads from the OS entropy pool on every call and holds no userspace state.
  • SecureRandom is Ruby stdlib — no new gem dependency.
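
The two code paths described above can be sketched roughly as follows. This is a simplified illustration, not the PR's diff: the module name `IdSketch`, the `MAX_ID` bound, and reading the flag directly from `ENV` are assumptions made for the example, and the fork guard is omitted.

```ruby
require 'securerandom'

# Hypothetical stand-in for Utils; names and bounds are assumptions.
module IdSketch
  MAX_ID = (1 << 62) - 1 # assumed upper bound, for illustration only
  ID_RANGE = (1..MAX_ID).freeze

  # Reads the flag once and caches the result, including a `false` result.
  def self.secure_random?
    return @secure_random unless @secure_random.nil?

    @secure_random = ENV['DD_TRACE_SECURE_RANDOM'] == 'true'
  end

  # Default path: a module-level PRNG seeded once at construction.
  def self.id_rng
    @id_rng ||= Random.new
  end

  def self.next_id
    if secure_random?
      SecureRandom.random_number(ID_RANGE) # opt-in path
    else
      id_rng.rand(ID_RANGE)                # existing path
    end
  end

  # Clears cached state so the flag is re-read on the next call.
  def self.reset!
    @secure_random = nil
    @id_rng = nil
  end
end

puts IdSketch.next_id
```

Both branches draw from the same range, so callers would see no difference in the shape of the IDs; only the source of randomness changes.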

Test plan

  • With DD_TRACE_SECURE_RANDOM=true: IDs fall within the valid range, more than 90 of 100 calls return unique values, and SecureRandom.random_number is invoked
  • Without the flag: the existing Random.new path is used and SecureRandom is not invoked
  • Existing Utils specs are unaffected

🤖 Generated with Claude Code

…iqueness

Utils.next_id uses a module-level Random.new (Mersenne Twister) seeded from
OS entropy at construction. After that point the PRNG state is fully
deterministic. In environments where process memory is cloned (e.g. VM
snapshots, certain fork patterns), all clones share the same PRNG state and
produce identical trace and span ID sequences. The existing after_fork! guard
reseeds on fork(2) via PID comparison but cannot detect other forms of memory
duplication.

When DD_TRACE_SECURE_RANDOM=true, next_id delegates to
SecureRandom.random_number instead. SecureRandom reads from the OS entropy
pool on every call and holds no userspace state, ensuring IDs remain unique
regardless of how the process image was created or duplicated.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
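
The failure mode described in this commit message can be reproduced in miniature by copying a `Random` object, which duplicates its internal state much as a memory snapshot would. This is only an analogy for process-image cloning, and `MAX_ID` is an assumed bound chosen for the example.

```ruby
require 'securerandom'

MAX_ID = (1 << 62) - 1 # assumed bound, for illustration only

original = Random.new   # seeded once from OS entropy
clone = original.dup    # copies the PRNG state, like a memory snapshot

seq_a = Array.new(5) { original.rand(1..MAX_ID) }
seq_b = Array.new(5) { clone.rand(1..MAX_ID) }
puts seq_a == seq_b     # true: both copies emit the same IDs

# SecureRandom has no Ruby-level generator object to copy,
# so two independent batches are expected to differ.
secure_a = Array.new(5) { SecureRandom.random_number(1..MAX_ID) }
secure_b = Array.new(5) { SecureRandom.random_number(1..MAX_ID) }
puts secure_a == secure_b
```

The PID check in after_fork! cannot catch the first case when the copy is not created through fork(2), because the PID is unchanged.
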
dd-octo-sts Bot added the tracing label May 11, 2026
Contributor

dd-octo-sts Bot commented May 11, 2026

👋 Hey @DataDog/ruby-guild, please fill in the "Change log entry" section in the pull request description.

If changes need to be present in CHANGELOG.md you can state it this way

**Change log entry**

Yes. A brief summary to be placed into the CHANGELOG.md

(possible answers Yes/Yep/Yeah)

Or you can opt out like this

**Change log entry**

None.

(possible answers No/Nope/None)

Visited at: 2026-05-12 00:09:01 UTC

Contributor

datadog-datadog-prod-us1 Bot commented May 11, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 97.15% (-0.00%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 964e877

pr-commenter Bot commented May 11, 2026

Benchmarks

Benchmark execution time: 2026-05-13 02:34:22

Comparing candidate commit 964e877 in PR branch tianning.li/dd-trace-secure-random with baseline commit 75b62c2 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metric.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because changes are often small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
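
The significance rule in the two diagrams can be expressed as a small check. This is a sketch under stated assumptions: `classify` is a hypothetical helper, the threshold is assumed to apply symmetrically around 0%, and the 1% threshold is borrowed from the second diagram for both calls. The benchmarking platform's actual logic is not shown in this thread.

```ruby
# Hypothetical helper; all values are relative differences in percent.
# For an execution time metric, a higher value means slower.
def classify(ci_lower, ci_upper, threshold)
  return :significantly_worse if ci_lower > threshold    # CI entirely above +threshold
  return :significantly_better if ci_upper < -threshold  # CI entirely below -threshold

  :no_significant_change                                 # CI overlaps the threshold band
end

puts classify(-0.6, 1.2, 1.0) # first diagram: no_significant_change
puts classify(1.3, 3.1, 1.0)  # second diagram: significantly_worse
```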

…RANDOM

- Use DATADOG_ENV instead of ENV for DD_TRACE_SECURE_RANDOM access
- Register DD_TRACE_SECURE_RANDOM in supported-configurations.json
- Add RBS type declarations for secure_random? and @secure_random
- Fix memoization bug: ||= does not cache false values; use nil? guard
- Add test coverage for explicit false and memoization correctness

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
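
The memoization fix listed in this commit can be illustrated in isolation. In this sketch, `FlagSketch`, `lookup`, and the call counter are hypothetical names, used only to show why `||=` re-evaluates a `false` result while a `nil?` guard does not.

```ruby
# Hypothetical illustration; not the PR's code.
class FlagSketch
  attr_reader :lookups

  def initialize(value)
    @value = value
    @lookups = 0
  end

  # Buggy: `||=` assigns whenever the ivar is falsy, so `false` is never cached.
  def enabled_with_or_equals?
    @or_equals ||= lookup
  end

  # Fixed: compute only while nothing has been stored yet.
  def enabled_with_nil_guard?
    return @nil_guard unless @nil_guard.nil?

    @nil_guard = lookup
  end

  private

  # Stands in for an expensive or side-effecting configuration read.
  def lookup
    @lookups += 1
    @value
  end
end

buggy = FlagSketch.new(false)
3.times { buggy.enabled_with_or_equals? }
puts buggy.lookups # 3: the lookup ran on every call

fixed = FlagSketch.new(false)
3.times { fixed.enabled_with_nil_guard? }
puts fixed.lookups # 1: the false result was cached
```
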
dd-octo-sts Bot added the core (Involves Datadog core libraries) label May 12, 2026
Contributor

dd-octo-sts Bot commented May 12, 2026

Typing analysis

Note: Ignored files are excluded from the next sections.

Untyped methods

This PR introduces 7 untyped methods and 1 partially typed method, and clears 7 untyped methods and 1 partially typed method. It increases the percentage of typed methods from 62.24% to 62.26% (+0.02%).

Untyped methods (+7-7)
Introduced:
sig/datadog/tracing/utils.rbs:13
└── def self.next_id: () -> untyped
sig/datadog/tracing/utils.rbs:17
└── def self.id_rng: () -> untyped
sig/datadog/tracing/utils.rbs:19
└── def self.reset!: () -> untyped
sig/datadog/tracing/utils.rbs:25
└── def self?.next_id: () -> untyped
sig/datadog/tracing/utils.rbs:27
└── def self?.to_high_order: (untyped trace_id) -> untyped
sig/datadog/tracing/utils.rbs:29
└── def self?.to_low_order: (untyped trace_id) -> untyped
sig/datadog/tracing/utils.rbs:31
└── def self?.concatenate: (untyped high_order, untyped low_order) -> untyped
Cleared:
sig/datadog/tracing/utils.rbs:12
└── def self.next_id: () -> untyped
sig/datadog/tracing/utils.rbs:14
└── def self.id_rng: () -> untyped
sig/datadog/tracing/utils.rbs:16
└── def self.reset!: () -> untyped
sig/datadog/tracing/utils.rbs:22
└── def self?.next_id: () -> untyped
sig/datadog/tracing/utils.rbs:24
└── def self?.to_high_order: (untyped trace_id) -> untyped
sig/datadog/tracing/utils.rbs:26
└── def self?.to_low_order: (untyped trace_id) -> untyped
sig/datadog/tracing/utils.rbs:28
└── def self?.concatenate: (untyped high_order, untyped low_order) -> untyped
Partially typed methods (+1-1)
Introduced:
sig/datadog/tracing/utils.rbs:21
└── def self.serialize_attribute: (untyped key, untyped value) -> (untyped | ::Array[::Array[untyped]])
Cleared:
sig/datadog/tracing/utils.rbs:18
└── def self.serialize_attribute: (untyped key, untyped value) -> (untyped | ::Array[::Array[untyped]])

Untyped other declarations

This PR introduces 1 untyped other declaration, and clears 1 untyped other declaration. It increases the percentage of typed other declarations from 78.4% to 78.41% (+0.01%).

Untyped other declarations (+1-1)
Introduced:
sig/datadog/tracing/utils.rbs:24
└── MAX: untyped
Cleared:
sig/datadog/tracing/utils.rbs:21
└── MAX: untyped

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept on the line before the definition to remove it from the stats.

litianningdatadog and others added 2 commits May 12, 2026 00:58
Bare `DATADOG_ENV` reference caused NameError since the constant is
defined as `Datadog::DATADOG_ENV` and is not resolvable in the RSpec
top-level scope.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Use `!!` to coerce `bool?` ivar to `bool` so Steep accepts the early
return as the declared `() -> bool` return type.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
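
Both fixes in these two commits come down to ordinary Ruby behavior, which the following sketch reproduces. The names `Sketch` and `CONFIG_ENV` are hypothetical stand-ins (the real constant is `Datadog::DATADOG_ENV`), and the hash is an assumed substitute for the real environment accessor.

```ruby
# Hypothetical names; `Sketch` stands in for the real namespace.
module Sketch
  CONFIG_ENV = { 'DD_TRACE_SECURE_RANDOM' => 'true' }.freeze

  def self.secure_random?
    # `!!` turns a possibly-nil ivar into a strict true/false for the type checker.
    return !!@secure_random unless @secure_random.nil?

    @secure_random = CONFIG_ENV['DD_TRACE_SECURE_RANDOM'] == 'true'
  end
end

# A bare reference from the top level does not search inside the module.
begin
  CONFIG_ENV.fetch('DD_TRACE_SECURE_RANDOM')
rescue NameError => e
  puts "NameError: #{e.message}"
end

# The fully qualified name resolves from any scope.
puts Sketch::CONFIG_ENV.fetch('DD_TRACE_SECURE_RANDOM')
puts Sketch.secure_random?
```
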

Labels

core (Involves Datadog core libraries), tracing

1 participant