SymDB: Component / Remote / Uploader integration#5717
Conversation
Typing analysisNote: Ignored files are excluded from the next sections.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5a83782a05
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Class-level state: tracks whether any Component instance in this process | ||
| # has performed an extract+upload. Survives Component replacement during | ||
| # Datadog reconfiguration so duplicate uploads are prevented. | ||
| @uploaded_this_process = false |
There was a problem hiding this comment.
Reset SymDB upload state after fork
In prefork servers, if the parent process performs the Symbol Database upload before forking (for example via force upload or remote config during boot), this class-level flag is inherited by every child as true, so start_upload immediately returns and the workers can never upload their own symbol database. I checked Components.after_fork, and it does not reset or reinitialize Symbol Database state, so a Puma/Passenger-style child that loads additional code after fork, or where the parent never serves traffic, will have missing DI autocomplete until the process is restarted.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
thought: thanks for flagging — going to be addressed in #5697.
The inherited flag itself goes away there: @uploaded_this_process and the surrounding @upload_done_mutex / @upload_done_cv machinery are all removed to support the hot-load hook. The broader fork hygiene (orphan mutexes, dead scheduler thread, dead TracePoint) is handled by a new Component#after_fork! wired from Components#after_fork — reinitializes mutexes/CVs, clears scheduler state, and re-registers the deferred upload in force-upload mode.
One nuance worth surfacing: under the canonical Rails + Puma preload_app! + eager_load=true config, child workers actually hold identical code to the parent. So our guiding principle is "if code in child is identical, nothing should be uploaded" — which is the existing class-of-issue tracked as Fork/child dedup. The full reasoning and the matrix of when child code does or doesn't diverge lives in projects/symdb/design/fork-architecture.md.
Fix three issues identified in di-pr-review: 1. extractor_spec.rb:1809 — class fixture name mismatch. `class ExtractAllInjectableTest` was the pre-rename name; the test references `ExtractAllTargetableTest` post-rename (#5692). The mismatch was introduced by `git merge --squash -X theirs`, which favored the branch's pre-rename version on conflict while the surrounding references use master's renamed version. Tests were unrunnable as written. Renamed the fixture class to match. 2. integration_spec.rb:90, 115 — code comments still said "Injectable lines" while referencing `targetable_lines` / `targetable_lines?` methods. Updated comments. 3. component.rb:52, 56 — added YARD docs for the class methods `uploaded_this_process?` and `mark_uploaded` for consistency with the rest of the file's documentation style. Not addressed (intentional): - Sleep usage in component_spec.rb — used for negative-assertion tests ("verify nothing happens given X seconds"). Comments justify duration; the cleanest deterministic alternative would require adding a "scheduler decision made" hook to production code, which is scope expansion beyond this PR. - Missing telemetry in component.rb / remote.rb rescue blocks — matches the established SymDB direction (#5693 explicitly removed telemetry from Extractor). Adding telemetry here would contradict the recent project decision.
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: b2257fd | Docs | Datadog PR Page | Give us feedback! |
BenchmarksBenchmark execution time: 2026-05-14 00:27:03 Comparing candidate commit b2257fd in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.
|
1a95405 to
4473fe6
Compare
Fix three issues identified in di-pr-review: 1. extractor_spec.rb:1809 — class fixture name mismatch. `class ExtractAllInjectableTest` was the pre-rename name; the test references `ExtractAllTargetableTest` post-rename (#5692). The mismatch was introduced by `git merge --squash -X theirs`, which favored the branch's pre-rename version on conflict while the surrounding references use master's renamed version. Tests were unrunnable as written. Renamed the fixture class to match. 2. integration_spec.rb:90, 115 — code comments still said "Injectable lines" while referencing `targetable_lines` / `targetable_lines?` methods. Updated comments. 3. component.rb:52, 56 — added YARD docs for the class methods `uploaded_this_process?` and `mark_uploaded` for consistency with the rest of the file's documentation style. Not addressed (intentional): - Sleep usage in component_spec.rb — used for negative-assertion tests ("verify nothing happens given X seconds"). Comments justify duration; the cleanest deterministic alternative would require adding a "scheduler decision made" hook to production code, which is scope expansion beyond this PR. - Missing telemetry in component.rb / remote.rb rescue blocks — matches the established SymDB direction (#5693 explicitly removed telemetry from Extractor). Adding telemetry here would contradict the recent project decision.
Removes the `telemetry` parameter from `Datadog::SymbolDatabase::ScopeBatcher` and the three `@telemetry&.report(...)` calls in its rescue blocks (`add_scope`, `timer_loop`, `perform_upload`). The rescue blocks still log at debug level. Constructor change: `ScopeBatcher.new(uploader, logger:, telemetry: nil, on_upload: nil, timer_enabled: true)` becomes `ScopeBatcher.new(uploader, logger:, on_upload: nil, timer_enabled: true)`. Master has no non-test callers of `ScopeBatcher.new` — only its spec instantiates it directly — so this is contained to the tests in this PR. The three "logs and reports telemetry without propagating" / "logs and reports telemetry without crashing" rescue-path tests are rewritten to assert on the log message only (the dropped telemetry assertion). Motivation: matches the SymDB direction set by DataDog#5693 (telemetry removal from Extractor). Telemetry reporting for internal SymDB lifecycle errors is being replaced by debug-level logging; only the upload boundary (Uploader) retains telemetry reporting. Extracted from DataDog#5717. Verification: - bundle exec rspec spec/datadog/symbol_database/scope_batcher_spec.rb → 30 examples, 0 failures - bundle exec steep check lib/datadog/symbol_database/scope_batcher.rb → No type error detected - bundle exec standardrb lib/datadog/symbol_database/scope_batcher.rb spec/datadog/symbol_database/scope_batcher_spec.rb → clean Co-Authored-By: Claude <noreply@anthropic.com>
Adds a single telemetry call to `Datadog::SymbolDatabase::Uploader#upload_scopes`:
@telemetry&.distribution('tracers', 'symbol_database.payload_size',
compressed_data.bytesize)
emitted unconditionally after GZIP compression and before the
MAX_PAYLOAD_SIZE check. The metric records the compressed payload size in
bytes for every upload attempt, including ones that get dropped for being
too large (so the oversized case is observable instead of silently lost).
Four new specs cover:
- successful upload: distribution called with the compressed bytesize
- oversized-skip path: distribution still called before the skip
- serialization-raises path: distribution not called
- compression-raises path: distribution not called
No constructor or API changes — the `telemetry:` parameter already exists
on Uploader.
Motivation: Uploader is the externally-observable boundary of the SymDB
subsystem. The payload_size distribution gives ops/customer visibility into
the size of uploaded symbol databases.
Extracted from DataDog#5717 to land independently.
Verification:
- bundle exec rspec spec/datadog/symbol_database/uploader_spec.rb
→ 21 examples, 0 failures
- bundle exec steep check lib/datadog/symbol_database/uploader.rb
→ No type error detected
- bundle exec standardrb lib/datadog/symbol_database/uploader.rb
spec/datadog/symbol_database/uploader_spec.rb → clean
Co-Authored-By: Claude <noreply@anthropic.com>
Squashed rebase of #5431 (symbol-database-upload, 185 commits) onto current master, capturing only the unique work that hasn't been split out and merged via individual sub-PRs (#5618, #5670, #5692, Adds the wiring layer that ties together the data models, extractor, batcher, uploader, and transport that were merged separately: - Component class for symbol_database lifecycle management - Remote module for remote config integration - Component-initialize / integration / remote_config_integration tests - Top-level wiring in core/configuration/components.rb and core/remote/client/capabilities.rb - Settings registration in supported-configurations.json - Documentation updates in DynamicInstrumentation.md and GettingStarted.md Branch #5431 is left intact for reference; this is a fresh single-commit PR for review.
Remove "yet" qualifiers from DynamicInstrumentation.md statements about unsupported features (the qualifier implies an active deferral that isn't being committed to). Remove cross-language gap bullets that name internal mechanics (FIELD symbols, LOCAL scopes, fork deduplication, injectable line information) without customer-facing impact, and the class-methods note that described an internal extraction-vs-upload detail. Rename "Deferred features" to "Limitations". Co-Authored-By: Claude <noreply@anthropic.com>
Fix three issues identified in di-pr-review: 1. extractor_spec.rb:1809 — class fixture name mismatch. `class ExtractAllInjectableTest` was the pre-rename name; the test references `ExtractAllTargetableTest` post-rename (#5692). The mismatch was introduced by `git merge --squash -X theirs`, which favored the branch's pre-rename version on conflict while the surrounding references use master's renamed version. Tests were unrunnable as written. Renamed the fixture class to match. 2. integration_spec.rb:90, 115 — code comments still said "Injectable lines" while referencing `targetable_lines` / `targetable_lines?` methods. Updated comments. 3. component.rb:52, 56 — added YARD docs for the class methods `uploaded_this_process?` and `mark_uploaded` for consistency with the rest of the file's documentation style. Not addressed (intentional): - Sleep usage in component_spec.rb — used for negative-assertion tests ("verify nothing happens given X seconds"). Comments justify duration; the cleanest deterministic alternative would require adding a "scheduler decision made" hook to production code, which is scope expansion beyond this PR. - Missing telemetry in component.rb / remote.rb rescue blocks — matches the established SymDB direction (#5693 explicitly removed telemetry from Extractor). Adding telemetry here would contradict the recent project decision.
Previous commit eba8297 removed these bullets entirely; the user's quote scoped removal to the parenthetical and explanation only. Co-Authored-By: Claude <noreply@anthropic.com>
Replace `untyped` with concrete types where Steep can verify: component.rbs: - @settings, @agent_settings → Core::Configuration::Settings / AgentSettings - attr_reader settings → Core::Configuration::Settings - self.build / initialize signatures → typed settings/agent_settings params remote.rbs: - capabilities return → Array[Integer] (capability flags) - receivers telemetry → Core::Telemetry::Component - receiver block params → Repository + ::Array[untyped] - enable_upload / parse_config content → Core::Remote::Configuration::Content @logger and the Repository::Change union remain `untyped`: - @logger: callers pass Core::Logger but downstream collaborators (ScopeBatcher) expect SymbolDatabase::Logger, which has additional methods like #trace. Resolving the cascade is out of scope for a type-fillout commit. Matches the convention in lib/datadog/di/ component.rbs. - Change union (Inserted | Updated | Deleted): Steep cannot narrow `case change.type` to the corresponding variant, so concrete typing produces 13 false positives on `change.content` / `change.previous`. Verified with `bundle exec steep check` — no errors.
…json
DD_INTERNAL_FORCE_SYMBOL_DATABASE_UPLOAD and DD_SYMBOL_DATABASE_UPLOAD_ENABLED
were already added in alphabetically-sorted positions by the earlier SymDB
configuration settings change. The squash rebase reintroduced them mid-file
(out of order, before DD_ENV). Remove the duplicates; the originals at the
sorted positions remain authoritative.
Verification:
- ruby -rjson -e 'JSON.parse(File.read("supported-configurations.json"))' → valid
- Each key appears once (line 415 and line 688)
Co-Authored-By: Claude <noreply@anthropic.com>
The directory-based skip in spec_helper centralizes the platform guard for the symbol_database suite (which requires MRI Ruby 2.6+). Expand the comment to show how to opt a spec out of the wholesale skip via the symdb_supported_platforms metadata tag, including describe-level and example-level usage examples — needed when a spec validates the platform guard itself and therefore must run on the otherwise-skipped platforms. Verification: - ruby -c spec/spec_helper.rb → syntax OK - Comment-only change; no behavior modified. Co-Authored-By: Claude <noreply@anthropic.com>
The squash rebase replaced per-line @type var annotations with three steep:ignore:start/:end blocks in scope_batcher.rb, and dropped :: prefixes from concrete-type references in scope_batcher.rbs. Both changes inflated the diff against the merged ScopeBatcher PR's form without changing behavior. Restore the original style: - add_scope/flush/shutdown: `# @type var scopes_to_upload: ::Array[Scope]?` re-added on each local declaration; the surrounding steep:ignore blocks are removed (the annotation does the same job in two lines instead of six). - shutdown's guard widened to `unless scopes_to_upload.nil? || .empty?` so Steep accepts `.empty?` on the now-nullable-annotated local. - scope_batcher.rbs: `::` restored on MAX_SCOPES, INACTIVITY_TIMEOUT, MAX_FILES, @on_upload, @scopes, @Mutex, @timer_cv, @timer_thread, @file_count, @uploaded_modules, perform_upload arg, size return. Verification: - bundle exec steep check lib/datadog/symbol_database/scope_batcher.rb → No type error detected. - bundle exec rspec spec/datadog/symbol_database/scope_batcher_spec.rb → 26 examples, 0 failures. Co-Authored-By: Claude <noreply@anthropic.com>
Six small changes flagged in review on the Component class: - build: `unless environment_supported?(symdb_logger) ... return nil end` collapsed to guard-clause `return nil unless environment_supported?(...)`, with a note that the helper logs the specific reason internally. - build: compound `unless settings.remote&.enabled || ... force_upload` flipped to `if !A && !B` (DeMorgan) — easier to read. - shutdown!: removed standalone `# @api private` marker before `private` keyword (redundant; the keyword already enforces visibility). - scheduler_loop: added a comment explaining what `should_fire = true` means (debounce deadline elapsed; extract_and_upload runs once after the mutex is released). - scheduler_loop: removed the steep:ignore:start/:end block around the mutex section. NOTE: removal is unverified locally because the libdatadog version mismatch blocks Steep from running here; CI will surface a regression if removal isn't safe. - extract_and_upload: moved `extraction_duration` and `targetable_count` calculations inside the `@logger.debug` block so the work is skipped when debug logging is off (the latter iterates the scope tree). Co-Authored-By: Claude <noreply@anthropic.com>
dda2762 to
05794a2
Compare
Replaces remaining `untyped` declarations introduced by this PR with concrete types where the type is unambiguous, and removes phantom sig declarations that referenced methods/fields not present in the production code. - symbol_database.rbs: remove phantom `@mutex`, `@component`, `self.component`, `self.set_component`, `self.enabled?` — none exist in `lib/datadog/symbol_database.rb`. The sig had been declaring an API that never landed. - component.rbs: `@logger`, `initialize`'s `logger` parameter, and `environment_supported?`'s `logger` parameter typed as `SymbolDatabase::Logger` (the wrapper instantiated inside `build`). `build`'s `logger` parameter stays `untyped` — it's the raw core logger passed in from the configuration layer (duck-typed, matches existing `untyped` convention for cross-component logger handoff). - remote.rbs: `process_change`'s `change` parameter and `receiver` block's `changes` parameter typed as `Datadog::Core::Remote::Configuration::Repository::change` (a type alias defined in repository.rbs as `Change::Deleted | Change::Inserted | Change::Updated`). The `parse_config` return value's inner `untyped` stays — the JSON-decoded hash is genuinely heterogeneous. Verification: - ruby -rrbs parser run on all three files: OK - bundle exec steep check could not run locally (libdatadog version mismatch blocks the dev env here); CI will surface any regression. Co-Authored-By: Claude <noreply@anthropic.com>
CI flagged 10 Steep errors after the previous type-fill commit. Three distinct issues: 1. Component @logger (typed as SymbolDatabase::Logger) passed to Uploader whose sig said `Datadog::Core::Logger`. Runtime reality: the wrapped Logger is what's passed (and forwarded through to Transport::HTTP). Fix: update Uploader and Transport::HTTP sigs to declare `SymbolDatabase::Logger` — matches the actual call path. 2. Remote.process_change's `change` parameter typed as the union `Change::Deleted | Change::Inserted | Change::Updated` — Steep cannot narrow the union via the `case change.type` symbol-dispatch the code uses. Add `# @type var change: <ConcreteClass>` annotations inside each `when` branch — narrows for Steep without changing the dispatch (preserves test patterns using `instance_double('Change', type: ...)`). The `else` and rescue branches use respond_to? introspection which Steep cannot narrow either — silenced with per-line steep:ignore. 3. scheduler_loop's `@scheduled_at - Time.get_time` flagged because Steep does not narrow nullable instance variables across an `if @scheduled_at.nil?` check. Copy to local variable so Steep narrows in the else branch. Verification: - bundle exec steep check → No type error detected. - bundle exec rspec spec/datadog/symbol_database/{remote,component,uploader}_spec.rb → 71 examples, 0 failures. Co-Authored-By: Claude <noreply@anthropic.com>
ce69f31 to
b3bfac4
Compare
* origin/symdb-hot-load-hook: Add Component#after_fork! for post-fork mutex hygiene + scheduler reset Add TracePoint :class hot-load hook for incremental class extraction Fix Steep failures from over-tight type-fill + reinstate scheduler_loop ignore Fill in concrete types for remaining SymDB sig declarations Address review findings on component.rb Restore @type var annotations and :: prefixes in ScopeBatcher Document symbol_database platform-skip opt-out usage in spec_helper Remove duplicate SymDB env var entries from supported-configurations.json Fill in concrete types for SymDB Component and Remote sig files Restore Instance/Local variable extraction bullet lead-ins Address review findings on #5717 SymDB docs: trim DI documentation cross-language disclaimers SymDB: Add Component, Remote, Uploader integration on top of master
* origin/base-5733: SymDB: add extraction performance benchmark Add Component#after_fork! for post-fork mutex hygiene + scheduler reset Fix Steep failures with type annotations, not untyped reverts Add TracePoint :class hot-load hook for incremental class extraction Fix Steep failures from over-tight type-fill + reinstate scheduler_loop ignore Fill in concrete types for remaining SymDB sig declarations Address review findings on component.rb Restore @type var annotations and :: prefixes in ScopeBatcher Document symbol_database platform-skip opt-out usage in spec_helper Remove duplicate SymDB env var entries from supported-configurations.json Fill in concrete types for SymDB Component and Remote sig files Restore Instance/Local variable extraction bullet lead-ins Address review findings on #5717 SymDB docs: trim DI documentation cross-language disclaimers SymDB: Add Component, Remote, Uploader integration on top of master
* origin/symdb-hot-load-hook: Add Component#after_fork! for post-fork mutex hygiene + scheduler reset Add TracePoint :class hot-load hook for incremental class extraction Fix Steep failures from over-tight type-fill + reinstate scheduler_loop ignore Fill in concrete types for remaining SymDB sig declarations Address review findings on component.rb Restore @type var annotations and :: prefixes in ScopeBatcher Document symbol_database platform-skip opt-out usage in spec_helper Remove duplicate SymDB env var entries from supported-configurations.json Fill in concrete types for SymDB Component and Remote sig files Restore Instance/Local variable extraction bullet lead-ins Address review findings on #5717 SymDB docs: trim DI documentation cross-language disclaimers SymDB: Add Component, Remote, Uploader integration on top of master
* origin/base-5733: Add Rails integration test for symbol database extraction SymDB: add extraction performance benchmark Add Component#after_fork! for post-fork mutex hygiene + scheduler reset Fix Steep failures with type annotations, not untyped reverts Add TracePoint :class hot-load hook for incremental class extraction Fix Steep failures from over-tight type-fill + reinstate scheduler_loop ignore Fill in concrete types for remaining SymDB sig declarations Address review findings on component.rb Restore @type var annotations and :: prefixes in ScopeBatcher Document symbol_database platform-skip opt-out usage in spec_helper Remove duplicate SymDB env var entries from supported-configurations.json Fill in concrete types for SymDB Component and Remote sig files Restore Instance/Local variable extraction bullet lead-ins Address review findings on #5717 SymDB docs: trim DI documentation cross-language disclaimers SymDB: Add Component, Remote, Uploader integration on top of master
`DD_INTERNAL_FORCE_SYMBOL_DATABASE_UPLOAD` is an internal testing knob, not a customer-facing setting. Moves the snippet out of DynamicInstrumentation.md (customer guide) into DevelopmentGuide.md (contributor guide) under a new "Testing the Symbol Database without Remote Configuration" subsection.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f0c6427b3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| def receivers(_telemetry) | ||
| receiver do |repository, changes| | ||
| component = begin | ||
| Datadog.send(:components)&.symbol_database |
There was a problem hiding this comment.
Avoid returning from the stored receiver proc
When Dynamic Instrumentation is enabled on a runtime where the Symbol Database component is not built (for example JRuby or Ruby 2.5, which Component.build explicitly skips), this branch executes inside the receiver block after .receivers has already returned. Dispatcher::Receiver stores and later calls that non-lambda block, so return unless component raises LocalJumpError: unexpected return instead of just ignoring the SymDB config, killing remote-config processing for that poll. Use a non-local-control-flow-free guard (for example an if component ... end) here.
Useful? React with 👍 / 👎.
Three CRITICAL items from PR review: 1. Component built without telemetry — fixed by threading telemetry from Components#initialize → Component.build → Component.new → Uploader.new. Uploader's `telemetry: nil` default is now provided by the wiring layer. 2. Rescue blocks missing telemetry — added `@telemetry&.report` at: - component.rb start_upload, scheduler_loop, extract_and_upload - remote.rb component-lookup-in-receiver, process_change Telemetry in the Remote module is looked up once per receiver invocation via Datadog.send(:components)&.telemetry (component-boundary exception). parse_config no longer rescues JSON::ParserError internally — the error propagates to process_change's rescue which logs + reports telemetry. 3. Sleep in tests — removed all 8 sleeps. Analysis shows they were guarding against scheduler-thread races that the production code already handles deterministically: - start_upload short-circuits on uploaded_this_process?/@shutdown before starting a scheduler thread — assert @scheduler_thread.nil? instead - shutdown! signals the CV and joins the thread before returning — the thread cannot fire after shutdown! returns - the `after { component.shutdown! }` hook joins any live scheduler thread before RSpec verifies `not_to receive` expectations 318/318 symdb specs pass; Steep clean on changed files.
What does this PR do?
Squashed rebase of #5431 (
symbol-database-upload, 185 commits) onto current master, capturing only the unique work that hasn't been split out and merged via individual sub-PRs.Adds the wiring layer that ties together the data models, extractor, batcher, uploader, and transport that were merged separately:
lib/datadog/symbol_database/component.rb— Component class for symbol_database lifecycle managementlib/datadog/symbol_database/remote.rb— Remote module for remote config integrationspec/datadog/symbol_database/component_initialize_spec.rb,component_spec.rb,integration_spec.rb,remote_config_integration_spec.rb,remote_spec.rb— coverage for the new componentlib/datadog/core/configuration/components.rb,core/remote/client/capabilities.rb— top-level wiring to register symbol_database with the tracer and remote configsupported-configurations.json— env var registrationdocs/DynamicInstrumentation.md,docs/GettingStarted.md— documentationMotivation:
#5431 has been the umbrella branch for the SymbolDatabase feature. Several pieces have been split out and merged independently (#5618, #5670, #5692, #5693, #5694). #5431 itself accumulated 185 commits and rebasing it directly produces conflicts on most commits because master now contains the post-squash final state of work that the branch held in intermediate form.
This PR captures the integration / wiring layer that hasn't been split out yet, on top of current master, in a single commit. #5431 is left intact for reference.
Change log entry
Yes. Dynamic Instrumentation: Symbol Database upload is now wired up — Ruby services will populate the Live Debugger UI's autocomplete (class names, method names, parameter names) when probes are being created.
How to test the change?
The new specs (
component_spec.rb,integration_spec.rb,remote_config_integration_spec.rb,remote_spec.rb) exercise the wiring; CI runs them as part of the symbol_database test suite.