Skip to content

Add Symbol Database Upload for Ruby Tracer#5431

Draft
p-datadog wants to merge 202 commits intomasterfrom
symbol-database-upload
Draft

Add Symbol Database Upload for Ruby Tracer#5431
p-datadog wants to merge 202 commits intomasterfrom
symbol-database-upload

Conversation

@p-datadog
Copy link
Copy Markdown
Member

@p-datadog p-datadog commented Mar 9, 2026

What does this PR do?

Implements symbol database upload for dd-trace-rb. Extracts class, method, and variable information from running Ruby applications and uploads it to the Datadog backend to support Dynamic Instrumentation auto-completion and search.

Motivation:

Brings Ruby tracer to parity with Java and Python for symbol database support. Enables IDE-like auto-completion when creating DI probes — users see class fields, method arguments, and local variables as they type.

Change log entry

Yes. Add symbol database upload support for Dynamic Instrumentation. When enabled, the tracer extracts and uploads symbol information (class names, method signatures, variable names) to enable auto-completion and search features in the DI UI.

Implementation:

  • Datadog::SymbolDatabase namespace (peer to DI, not under it)
  • 8 components: data models, extractor, batching, uploader, remote config, configuration, lifecycle, logging
  • Filters to user code only (excludes Datadog::*, stdlib, gems)
  • Multipart upload to /symdb/v1/input via Core::Transport::HTTP
  • 122 tests passing (unit + integration + telemetry verification)
  • Validated end-to-end with a real Rails app (demo-ruby)

Ruby version differences:

On Ruby 2.6, Module#const_source_location is unavailable (added in Ruby 2.7). This means namespace-only modules (e.g., module ApplicationCable; class Channel...; end; end where the module has no methods) and classes with only generated methods (e.g., ActiveRecord models with only associations) may not appear in the symbol database. On Ruby 2.7+, const_source_location locates these via their constant declarations. This is a graceful degradation — fewer symbols in the DI auto-completion UI on 2.6, no errors.

How to test the change?

# Run symbol database tests
bundle exec rspec spec/datadog/symbol_database/

# Run with a real Rails app
cd /path/to/demo-ruby
DD_TRACER=/path/to/dd-trace-rb bundle exec bin/extract_symbols

@p-datadog p-datadog added the AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos label Mar 9, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 9, 2026

Thank you for updating Change log entry section 👏

Visited at: 2026-03-17 18:37:08 UTC

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 9, 2026

Typing analysis

Note: Ignored files are excluded from the next sections.

steep:ignore comments

This PR introduces 20 steep:ignore comments, and clears 1 steep:ignore comment.

steep:ignore comments (+20-1)Introduced:
lib/datadog/symbol_database/component.rb:224
lib/datadog/symbol_database/configuration/settings.rb:28
lib/datadog/symbol_database/extractor.rb:225
lib/datadog/symbol_database/extractor.rb:237
lib/datadog/symbol_database/extractor.rb:251
lib/datadog/symbol_database/extractor.rb:388
lib/datadog/symbol_database/extractor.rb:561
lib/datadog/symbol_database/extractor.rb:591
lib/datadog/symbol_database/extractor.rb:676
lib/datadog/symbol_database/extractor.rb:737
lib/datadog/symbol_database/extractor.rb:738
lib/datadog/symbol_database/extractor.rb:816
lib/datadog/symbol_database/extractor.rb:864
lib/datadog/symbol_database/remote.rb:52
lib/datadog/symbol_database/scope.rb:85
lib/datadog/symbol_database/scope_context.rb:98
lib/datadog/symbol_database/scope_context.rb:125
lib/datadog/symbol_database/scope_context.rb:142
lib/datadog/symbol_database/scope_context.rb:237
lib/datadog/symbol_database/transport.rb:44
Cleared:
lib/datadog/symbol_database/scope.rb:77

Untyped methods

This PR introduces 4 untyped methods and 32 partially typed methods, and clears 3 partially typed methods. It increases the percentage of typed methods from 61.33% to 61.41% (+0.08%).

Untyped methods (+4-0)Introduced:
sig/datadog/symbol_database/component.rbs:23
└── def initialize: (untyped settings, untyped agent_settings, untyped logger, ?telemetry: untyped?) -> void
sig/datadog/symbol_database/extractor.rbs:11
└── def initialize: (logger: untyped, settings: untyped, ?telemetry: untyped?) -> void
sig/datadog/symbol_database/transport/http/endpoint.rbs:13
└── def call: (untyped env) ?{ (untyped) -> untyped } -> untyped
sig/datadog/symbol_database/uploader.rbs:22
└── def initialize: (untyped config, untyped agent_settings, logger: untyped, ?telemetry: untyped) -> void
Partially typed methods (+32-3)Introduced:
sig/datadog/symbol_database/component.rbs:21
└── def self.build: (untyped settings, untyped agent_settings, untyped logger, ?telemetry: untyped?) -> Component?
sig/datadog/symbol_database/component.rbs:42
└── def self.environment_supported?: (untyped logger) -> bool
sig/datadog/symbol_database/configuration/settings.rbs:5
└── def self.extended: (untyped base) -> void
sig/datadog/symbol_database/configuration/settings.rbs:7
└── def self.add_settings!: (untyped base) -> void
sig/datadog/symbol_database/extractor.rbs:35
└── def build_class_language_specifics: (Class klass) -> Hash[::Symbol, untyped]
sig/datadog/symbol_database/extractor.rbs:53
└── def collect_extractable_modules: () -> Hash[String, untyped]
sig/datadog/symbol_database/extractor.rbs:55
└── def group_methods_by_file: (Module mod) -> Hash[String, Array[untyped]]
sig/datadog/symbol_database/extractor.rbs:57
└── def build_file_trees: (Hash[String, untyped] entries) -> Hash[String, untyped]
sig/datadog/symbol_database/extractor.rbs:59
└── def place_in_tree: (untyped root, Array[String] name_parts, Module mod, Array[untyped] methods, String file_path) -> void
sig/datadog/symbol_database/extractor.rbs:63
└── def convert_trees_to_scopes: (Hash[String, untyped] file_trees) -> Array[Scope]
sig/datadog/symbol_database/extractor.rbs:65
└── def convert_node_to_scope: (untyped node) -> Scope
sig/datadog/symbol_database/file_hash.rbs:4
└── def self.compute: (String? file_path, logger: untyped) -> String?
sig/datadog/symbol_database/file_hash.rbs:6
└── def compute: (String? file_path, logger: untyped) -> String?
sig/datadog/symbol_database/logger.rbs:10
└── def initialize: (untyped settings, ::Logger target) -> void
sig/datadog/symbol_database/logger.rbs:17
└── def debug: (?untyped? message) ?{ () -> untyped } -> void
sig/datadog/symbol_database/logger.rbs:18
└── def warn: (?untyped? message) ?{ () -> untyped } -> void
sig/datadog/symbol_database/logger.rbs:19
└── def trace: () { () -> untyped } -> void
sig/datadog/symbol_database/remote.rbs:8
└── def self.capabilities: () -> Array[untyped]
sig/datadog/symbol_database/remote.rbs:10
└── def self.receivers: (untyped telemetry) -> Array[Core::Remote::Dispatcher::Receiver]
sig/datadog/symbol_database/remote.rbs:12
└── def self.receiver: (?Array[String] products) { (untyped repository, untyped changes) -> void } -> Array[Core::Remote::Dispatcher::Receiver]
sig/datadog/symbol_database/remote.rbs:14
└── def self.process_change: (Component component, untyped change, ?telemetry: untyped?) -> void
sig/datadog/symbol_database/remote.rbs:16
└── def self.enable_upload: (Component component, untyped content) -> void
sig/datadog/symbol_database/remote.rbs:20
└── def self.parse_config: (untyped content) -> Hash[String, untyped]?
sig/datadog/symbol_database/scope.rbs:22
└── def initialize: (
        scope_type: String,
        ?name: String?,
        ?source_file: String?,
        ?start_line: Integer?,
        ?end_line: Integer?,
        ?injectible_lines: Array[Hash[::Symbol, Integer]]?,
        ?language_specifics: Hash[::Symbol, untyped]?,
        ?symbols: Array[Datadog::SymbolDatabase::Symbol]?,
        ?scopes: Array[Scope]?
      ) -> void
sig/datadog/symbol_database/scope.rbs:54
└── def to_h: () -> Hash[::Symbol, untyped]
sig/datadog/symbol_database/scope.rbs:56
└── def to_json: (?untyped _state) -> String
sig/datadog/symbol_database/scope_context.rbs:36
└── def initialize: (Uploader uploader, logger: Logger, ?telemetry: untyped, ?on_upload: Proc?, ?timer_enabled: bool) -> void
sig/datadog/symbol_database/transport/http/endpoint.rbs:11
└── def initialize: (String path, untyped encoder) -> void
sig/datadog/symbol_database/transport/http.rbs:7
└── def self.build: (
          agent_settings: ::Datadog::Core::Configuration::AgentSettings,
          logger: untyped,
          ?headers: Hash[String, String]?
        ) ?{ (untyped) -> void } -> SymbolDatabase::Transport::Transport
sig/datadog/symbol_database/transport.rbs:9
└── def initialize: (Hash[untyped, untyped] form) -> void
sig/datadog/symbol_database/transport.rbs:17
└── def send_symdb_payload: (Hash[untyped, untyped] form) -> Core::Transport::Response
sig/datadog/symbol_database/uploader.rbs:48
└── def handle_response: (untyped response, Integer scope_count) -> bool
Cleared:
sig/datadog/symbol_database/scope.rbs:24
└── def initialize: (
        scope_type: String,
        ?name: String?,
        ?source_file: String?,
        ?start_line: Integer?,
        ?end_line: Integer?,
        ?has_injectible_lines: bool,
        ?injectible_lines: Array[Hash[::Symbol, Integer]]?,
        ?language_specifics: Hash[::Symbol, untyped]?,
        ?symbols: Array[Datadog::SymbolDatabase::Symbol]?,
        ?scopes: Array[Scope]?
      ) -> void
sig/datadog/symbol_database/scope.rbs:57
└── def to_h: () -> Hash[::Symbol, untyped]
sig/datadog/symbol_database/scope.rbs:59
└── def to_json: (?untyped _state) -> String

Untyped other declarations

This PR introduces 20 untyped other declarations and 4 partially typed other declarations, and clears 2 partially typed other declarations. It decreases the percentage of typed other declarations from 77.75% to 77.24% (-0.51%).

Untyped other declarations (+20-0)Introduced:
sig/datadog/symbol_database/component.rbs:7
└── @settings: untyped
sig/datadog/symbol_database/component.rbs:8
└── @agent_settings: untyped
sig/datadog/symbol_database/component.rbs:9
└── @logger: untyped
sig/datadog/symbol_database/component.rbs:10
└── @telemetry: untyped?
sig/datadog/symbol_database/component.rbs:25
└── attr_reader settings: untyped
sig/datadog/symbol_database/extractor.rbs:7
└── @logger: untyped
sig/datadog/symbol_database/extractor.rbs:8
└── @settings: untyped
sig/datadog/symbol_database/extractor.rbs:9
└── @telemetry: untyped?
sig/datadog/symbol_database/logger.rbs:6
└── @settings: untyped
sig/datadog/symbol_database/logger.rbs:12
└── attr_reader settings: untyped
sig/datadog/symbol_database/scope_context.rbs:14
└── @telemetry: untyped
sig/datadog/symbol_database/transport/http/endpoint.rbs:7
└── @encoder: untyped
sig/datadog/symbol_database/transport/http/endpoint.rbs:9
└── attr_reader encoder: untyped
sig/datadog/symbol_database/uploader.rbs:12
└── @config: untyped
sig/datadog/symbol_database/uploader.rbs:14
└── @agent_settings: untyped
sig/datadog/symbol_database/uploader.rbs:16
└── @logger: untyped
sig/datadog/symbol_database/uploader.rbs:18
└── @telemetry: untyped
sig/datadog/symbol_database/uploader.rbs:20
└── @transport: untyped
sig/datadog/symbol_database.rbs:7
└── @mutex: untyped
sig/datadog/symbol_database.rbs:9
└── @component: untyped
Partially typed other declarations (+4-2)Introduced:
sig/datadog/symbol_database/scope.rbs:16
└── @language_specifics: Hash[::Symbol, untyped]
sig/datadog/symbol_database/scope.rbs:48
└── attr_reader language_specifics: Hash[::Symbol, untyped]
sig/datadog/symbol_database/transport.rbs:5
└── @form: Hash[untyped, untyped]
sig/datadog/symbol_database/transport.rbs:7
└── attr_reader form: Hash[untyped, untyped]
Cleared:
sig/datadog/symbol_database/scope.rbs:18
└── @language_specifics: Hash[::Symbol, untyped]
sig/datadog/symbol_database/scope.rbs:51
└── attr_reader language_specifics: Hash[::Symbol, untyped]

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept on the line before the definition to remove it from the stats.

@github-actions github-actions Bot added the core Involves Datadog core libraries label Mar 9, 2026
@datadog-prod-us1-5
Copy link
Copy Markdown

datadog-prod-us1-5 Bot commented Mar 9, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 97.00%
Overall Coverage: 97.19% (-0.08%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: f9f4092 | Docs | Datadog PR Page | Give us feedback!

Comment thread lib/datadog/core/configuration/components.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
p-datadog pushed a commit that referenced this pull request Mar 9, 2026
Motivation:
PR #5431 received review feedback on code quality and patterns. Addressing
all code change requests to match Ruby tracer conventions.

Technical Details:

1. Error logging pattern (20 instances changed):
   Changed: #{e.message} → #{e.class}: #{e}
   Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb
   Reason: Provides exception class for better debugging
   Pattern: "SymDB: Failed to X: #{e.class}: #{e}"

2. Time provider (2 instances changed):
   Changed: Time.now → Datadog::Core::Utils::Time.now
   File: component.rb (lines 56, 81)
   Added: require '../core/utils/time'
   Reason: Tracer uses its own time utilities (testable, mockable)

3. Constant for magic number (1 instance):
   Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant
   File: component.rb
   Added: UPLOAD_COOLDOWN = 60  # seconds
   Reason: Self-documenting, easier to test/modify

Remaining PR feedback items (questions - will respond separately):
- Q: Why require DI to be enabled? (research Java/Python behavior)
- Q: Dependency warning pattern (analyzed - logger.warn is standard)
- Q: Explain force upload (will add code comment)
- Q: Concurrency safety of start_upload (will analyze and respond)
- Q: In-progress uploads during shutdown (will analyze and respond)

Testing:
Changes validated by re-running tests after modifications.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Comment thread lib/datadog/symbol_database/file_hash.rb
Comment thread lib/datadog/symbol_database/file_hash.rb
@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented Mar 9, 2026

Benchmarks

Benchmark execution time: 2026-04-26 02:45:55

Comparing candidate commit c92737e in PR branch symbol-database-upload with baseline commit 617c212 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

Comment thread Steepfile Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/component.rb Outdated
Comment thread lib/datadog/symbol_database/scope_context.rb
Comment thread lib/datadog/symbol_database/scope_context.rb
Comment thread lib/datadog/symbol_database/scope_context.rb Outdated
Comment thread lib/datadog/symbol_database/service_version.rb Outdated
Comment thread lib/datadog/symbol_database/symbol.rb Outdated
Comment thread lib/datadog/symbol_database/uploader.rb
Comment thread lib/datadog/symbol_database/uploader.rb Outdated
@p-datadog p-datadog force-pushed the symbol-database-upload branch 2 times, most recently from b7bb240 to 3c3c7e3 Compare March 11, 2026 23:07
Comment thread lib/datadog/symbol_database/extractor.rb
Comment thread lib/datadog/symbol_database/extractor.rb Outdated
Comment thread lib/datadog/symbol_database/extractor.rb
Comment thread lib/datadog/symbol_database/extractor.rb Outdated
Comment thread lib/datadog/symbol_database/extractor.rb
Comment thread lib/datadog/symbol_database/scope_context.rb Outdated
Comment thread lib/datadog/symbol_database/uploader.rb Outdated
Comment thread lib/datadog/symbol_database/uploader.rb Outdated
Comment thread lib/datadog/symbol_database/uploader.rb Outdated
Copy link
Copy Markdown
Member Author

@p-datadog p-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the thorough review! All feedback has been addressed.

@p-datadog
Copy link
Copy Markdown
Member Author

Backend dependency: DataDog/debugger-backend#1974 adds RUBY to the storage-level Language enum. Without it, Ruby symbol database uploads are silently dropped by the extractor — the tracer and agent work correctly, but data never appears in the symdb API.

@p-datadog p-datadog force-pushed the symbol-database-upload branch from ab32a30 to f9ef632 Compare March 22, 2026 20:26
p-datadog pushed a commit that referenced this pull request Mar 28, 2026
* origin/symbol-database-upload: (167 commits)
  Add /test/ path exclusion; fix DI docs on generated method filtering
  Lowercase language field: 'RUBY' → 'ruby'
  Extract empty classes (AR models, Forwardable-only) via const_source_location
  Remove synthetic self ARG from instance method symbols
  Log full scope tree at trace level during extraction
  Remove upload retries — single attempt, matching Python behavior
  Fix NoMethodError when transport returns InternalErrorResponse
  Add diagnostic logging: extraction summary and per-scope trace
  Revert JAVA workaround: use RUBY language and ruby ddsource
  Replace non-running test files with working RC integration test
  Fix Steep type errors in SymbolDatabase RBS signatures
  Add YARD docs to Logger#trace
  Remove workarounds: logger defaults, respond_to guard, Steepfile ignores
  Ignore symdb files in Steep typecheck
  Refactor Extractor from static class methods to component instance
  Add missing RBS signature for SymbolDatabase::Logger
  Fix StandardRB Style/KeywordParametersOrder violation
  Move param_name nil log to trace level, document introspection limitation
  Add symdb diagnostics: logger facade, trace level, prefix normalization
  Fix bare .filter_map breaking Ruby 2.6 in extract_all
  ...
p-datadog pushed a commit that referenced this pull request Mar 28, 2026
* origin/symbol-database-upload: (167 commits)
  Add /test/ path exclusion; fix DI docs on generated method filtering
  Lowercase language field: 'RUBY' → 'ruby'
  Extract empty classes (AR models, Forwardable-only) via const_source_location
  Remove synthetic self ARG from instance method symbols
  Log full scope tree at trace level during extraction
  Remove upload retries — single attempt, matching Python behavior
  Fix NoMethodError when transport returns InternalErrorResponse
  Add diagnostic logging: extraction summary and per-scope trace
  Revert JAVA workaround: use RUBY language and ruby ddsource
  Replace non-running test files with working RC integration test
  Fix Steep type errors in SymbolDatabase RBS signatures
  Add YARD docs to Logger#trace
  Remove workarounds: logger defaults, respond_to guard, Steepfile ignores
  Ignore symdb files in Steep typecheck
  Refactor Extractor from static class methods to component instance
  Add missing RBS signature for SymbolDatabase::Logger
  Fix StandardRB Style/KeywordParametersOrder violation
  Move param_name nil log to trace level, document introspection limitation
  Add symdb diagnostics: logger facade, trace level, prefix normalization
  Fix bare .filter_map breaking Ruby 2.6 in extract_all
  ...
p-datadog pushed a commit that referenced this pull request Apr 1, 2026
Motivation:
PR #5431 received review feedback on code quality and patterns. Addressing
all code change requests to match Ruby tracer conventions.

Technical Details:

1. Error logging pattern (20 instances changed):
   Changed: #{e.message} → #{e.class}: #{e}
   Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb
   Reason: Provides exception class for better debugging
   Pattern: "SymDB: Failed to X: #{e.class}: #{e}"

2. Time provider (2 instances changed):
   Changed: Time.now → Datadog::Core::Utils::Time.now
   File: component.rb (lines 56, 81)
   Added: require '../core/utils/time'
   Reason: Tracer uses its own time utilities (testable, mockable)

3. Constant for magic number (1 instance):
   Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant
   File: component.rb
   Added: UPLOAD_COOLDOWN = 60  # seconds
   Reason: Self-documenting, easier to test/modify

Remaining PR feedback items (questions - will respond separately):
- Q: Why require DI to be enabled? (research Java/Python behavior)
- Q: Dependency warning pattern (analyzed - logger.warn is standard)
- Q: Explain force upload (will add code comment)
- Q: Concurrency safety of start_upload (will analyze and respond)
- Q: In-progress uploads during shutdown (will analyze and respond)

Testing:
Changes validated by re-running tests after modifications.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 68442c1 to a89c77c Compare April 1, 2026 19:26
p-datadog pushed a commit that referenced this pull request Apr 3, 2026
Motivation:
PR #5431 received review feedback on code quality and patterns. Addressing
all code change requests to match Ruby tracer conventions.

Technical Details:

1. Error logging pattern (20 instances changed):
   Changed: #{e.message} → #{e.class}: #{e}
   Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb
   Reason: Provides exception class for better debugging
   Pattern: "SymDB: Failed to X: #{e.class}: #{e}"

2. Time provider (2 instances changed):
   Changed: Time.now → Datadog::Core::Utils::Time.now
   File: component.rb (lines 56, 81)
   Added: require '../core/utils/time'
   Reason: Tracer uses its own time utilities (testable, mockable)

3. Constant for magic number (1 instance):
   Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant
   File: component.rb
   Added: UPLOAD_COOLDOWN = 60  # seconds
   Reason: Self-documenting, easier to test/modify

Remaining PR feedback items (questions - will respond separately):
- Q: Why require DI to be enabled? (research Java/Python behavior)
- Q: Dependency warning pattern (analyzed - logger.warn is standard)
- Q: Explain force upload (will add code comment)
- Q: Concurrency safety of start_upload (will analyze and respond)
- Q: In-progress uploads during shutdown (will analyze and respond)

Testing:
Changes validated by re-running tests after modifications.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 03720f2 to fa30bee Compare April 3, 2026 01:53
p-datadog pushed a commit that referenced this pull request Apr 8, 2026
Motivation:
PR #5431 received review feedback on code quality and patterns. Addressing
all code change requests to match Ruby tracer conventions.

Technical Details:

1. Error logging pattern (20 instances changed):
   Changed: #{e.message} → #{e.class}: #{e}
   Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb
   Reason: Provides exception class for better debugging
   Pattern: "SymDB: Failed to X: #{e.class}: #{e}"

2. Time provider (2 instances changed):
   Changed: Time.now → Datadog::Core::Utils::Time.now
   File: component.rb (lines 56, 81)
   Added: require '../core/utils/time'
   Reason: Tracer uses its own time utilities (testable, mockable)

3. Constant for magic number (1 instance):
   Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant
   File: component.rb
   Added: UPLOAD_COOLDOWN = 60  # seconds
   Reason: Self-documenting, easier to test/modify

Remaining PR feedback items (questions - will respond separately):
- Q: Why require DI to be enabled? (research Java/Python behavior)
- Q: Dependency warning pattern (analyzed - logger.warn is standard)
- Q: Explain force upload (will add code comment)
- Q: Concurrency safety of start_upload (will analyze and respond)
- Q: In-progress uploads during shutdown (will analyze and respond)

Testing:
Changes validated by re-running tests after modifications.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 33d3996 to d702cea Compare April 8, 2026 19:03
Comment on lines +69 to +81
option :upload_class_methods do |o|
o.type :bool
o.default false
end

# Which RubyVM::InstructionSequence#trace_points event types
# to include when computing injectable lines on METHOD scopes.
# Default: [:line, :return]. :call excluded because method
# entry is handled by method probes, not line probes.
# Changeable at runtime — takes effect on next extraction.
option :injectable_line_events do |o|
o.default [:line, :return]
end
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add an environment variable to these two configs ?

p-datadog pushed a commit that referenced this pull request Apr 20, 2026
Motivation:
PR #5431 received review feedback on code quality and patterns. Addressing
all code change requests to match Ruby tracer conventions.

Technical Details:

1. Error logging pattern (20 instances changed):
   Changed: #{e.message} → #{e.class}: #{e}
   Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb
   Reason: Provides exception class for better debugging
   Pattern: "SymDB: Failed to X: #{e.class}: #{e}"

2. Time provider (2 instances changed):
   Changed: Time.now → Datadog::Core::Utils::Time.now
   File: component.rb (lines 56, 81)
   Added: require '../core/utils/time'
   Reason: Tracer uses its own time utilities (testable, mockable)

3. Constant for magic number (1 instance):
   Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant
   File: component.rb
   Added: UPLOAD_COOLDOWN = 60  # seconds
   Reason: Self-documenting, easier to test/modify

Remaining PR feedback items (questions - will respond separately):
- Q: Why require DI to be enabled? (research Java/Python behavior)
- Q: Dependency warning pattern (analyzed - logger.warn is standard)
- Q: Explain force upload (will add code comment)
- Q: Concurrency safety of start_upload (will analyze and respond)
- Q: In-progress uploads during shutdown (will analyze and respond)

Testing:
Changes validated by re-running tests after modifications.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
p-ddsign and others added 16 commits April 21, 2026 17:12
- extract_injectable_lines: log iseq-nil skip (C extension), log
  range count and line span for successful extraction
- component summary: include count of methods with injectable lines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comment explaining the behavioral difference on Ruby 2.6 where
const_source_location is unavailable: namespace-only modules and
classes with only generated methods are silently omitted from extraction.

Co-Authored-By: Claude <noreply@anthropic.com>
…mponent pattern

Review fixes:
- Remove Steepfile ignores for component.rb and extractor.rb (use per-line
  steep:ignore directives instead of hiding entire files)
- Inject logger into FileHash.compute instead of using Datadog.logger global
- Add telemetry to Remote.process_change rescue block
- Capture exception in Component.schedule_deferred_upload bare rescue and log it
- Add YARD docstrings to log_scope_tree, Logger#initialize,
  Endpoint#initialize, Settings.extended
- Remove stale TODO comment from uploader.rb (backend workaround already reverted)
- Remove useless comments: "# Build components", "# Get component from global
  state", "# Re-enable with new config", "# Delete change has 'previous'...",
  "# Parse JSON string to Hash", "# Validate it's actually a Hash",
  "# Only call errored()...", "# Handle both content and previous"
- Remove stale process_changes method from remote.rbs
- Update FileHash RBS signature and specs for new logger: parameter
- Document Ruby 2.6 graceful degradation for const_source_location fallback

All 266 symbol_database specs pass. Steep clean.

Co-Authored-By: Claude <noreply@anthropic.com>
…g docs

On Ruby 2.6, namespace-only modules and classes with only generated
methods may not appear in DI auto-completion because
Module#const_source_location is unavailable. Document this in the
Symbol Database section of DynamicInstrumentation.md.

Co-Authored-By: Claude <noreply@anthropic.com>
Document the DD_SYMBOL_DATABASE_UPLOAD_ENABLED env var and
c.symbol_database.enabled setting in customer-facing docs,
per CLAUDE.md requirement to update GettingStarted.md when
adding new settings/env vars. Links to DynamicInstrumentation.md
for full details.

Co-Authored-By: Claude <noreply@anthropic.com>
Four customer-facing DI env vars (DD_DYNAMIC_INSTRUMENTATION_ENABLED,
DD_DYNAMIC_INSTRUMENTATION_REDACTED_IDENTIFIERS,
DD_DYNAMIC_INSTRUMENTATION_REDACTION_EXCLUDED_IDENTIFIERS,
DD_DYNAMIC_INSTRUMENTATION_REDACTED_TYPES) were missing from
GettingStarted.md. Add a proper Dynamic Instrumentation section with
all env vars and programmatic settings, with Symbol Database as a
subsection underneath.

Co-Authored-By: Claude <noreply@anthropic.com>
…y doc

- Add telemetry to top-level extract() and extract_all() rescue blocks
  so extraction failures are tracked (symbol_database.extract_error,
  symbol_database.extract_all_error)
- Capture => e and add debug logging to three method-level bare rescues:
  find_source_file, calculate_class_line_range, build_class_language_specifics
- Remove dead code (empty if block in extract_singleton_method_parameters)
- Add class-level comment documenting the three-layer rescue strategy:
  inner per-item (bare, skip one item), method-level (log, return safe
  default), top-level (log + telemetry, error boundary)
- Restore Steepfile ignores for component.rb and extractor.rb with
  explanation (RBS only covers public API; 30+ private methods undeclared)

All 266 specs pass. Steep clean.

Co-Authored-By: Claude <noreply@anthropic.com>
…tractor

- Remove Steepfile ignores for component.rb and extractor.rb
- Rewrite extractor.rbs: change from stale class methods to instance
  methods matching actual implementation, add all 30+ private methods
- Update component.rbs: add missing FORCE_UPLOAD_ONCE constant,
  @extractor/@shutdown instance vars, schedule_deferred_upload,
  shutdown?, environment_supported?, log_scope_tree, count_injectable_methods
- Update logger.rbs: debug/warn accept both positional message and block
  (matching ::Logger delegation behavior)
- Add per-line steep:ignore for Steep false positives: array slice
  nilability, RubyVM::InstructionSequence.of missing from stdlib RBS,
  Hash.new block type, Module vs Class narrowing

Steep clean. All 266 specs pass.

Co-Authored-By: Claude <noreply@anthropic.com>
The skip 'git not available' guard is unnecessary — git is always
present in CI (the repo is checked out via git, and release_gem_spec.rb
already calls git ls-files without a skip guard). Removing the skip
ensures the test hard-fails if the Git blob algorithm compatibility
breaks, rather than silently passing.

Also improved test description to explain what the test validates:
that FileHash.compute produces identical output to git hash-object,
confirming Git blob hashing compatibility for backend commit inference.

Co-Authored-By: Claude <noreply@anthropic.com>
…test

The test at extractor_spec.rb:643 used `skip 'requires Ruby 2.7+'` but
every other const_source_location test in the file uses inline
`if Module.method_defined?(:const_source_location)` branching with
explicit expectations for both paths. Align this test with the existing
pattern so it runs on all Ruby versions: verifies file_hash on 2.7+,
verifies nil (graceful degradation) on < 2.7.

Co-Authored-By: Claude <noreply@anthropic.com>
…s tests, wire format

Gap 1: Replace 9 bare rescues in extractor.rb with => e + debug logging.
Gap 4: Add DI-independence test (symdb works with DI disabled).
Gap 5: Add 10 injectable lines test cases: build_injectable_ranges
  (5 input/output cases), C-extension iseq nil, dedup, initialize
  method, custom event filters (2 cases).
Gap 6: Add wire format integration test verifying multipart form
  structure, event.json metadata, and gzip-compressed payload.

278 tests, 0 failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…scues

Scope serialization tests (scope_spec.rb):
- METHOD with injectable lines: has_injectible_lines + injectible_lines present
- METHOD without: has_injectible_lines false, no injectible_lines key
- CLASS/MODULE/FILE: neither field present

extract() path tests (extractor_spec.rb):
- Instance method injectable lines via extract()
- Singleton method injectable lines via extract() with upload_class_methods

Bare rescues (extractor.rb):
- All 9 bare rescues replaced with => e + @logger.debug

263 tests, 0 failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… early

The `describe '#to_h'` block was closed prematurely at line 220, leaving
the injectable lines tests (lines 222-284) orphaned outside any describe
block with an extra `end` at line 285. This caused a SyntaxError at
load time (`unexpected 'end'` at line 330), which produced "1 error
occurred outside of examples" in all 30 failing CI jobs.

Fix: remove the premature `end` so the injectable lines tests remain
inside `describe '#to_h'`.

Verified: `ruby -c` passes, 22 examples 0 failures locally.

Co-Authored-By: Claude <noreply@anthropic.com>
Style/RedundantAssignment: `result` was assigned from filter_map then
immediately returned. Return the filter_map result directly.

Co-Authored-By: Claude <noreply@anthropic.com>
Expose enabled, last_upload_time, and upload_in_progress as
attr_readers for diagnostic tools (gobo demo pages, health checks).
These are @api private read-only accessors — no new public API.

Thread-safe: ivars set inside mutex, Ruby reference reads are atomic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Test enabled, last_upload_time, upload_in_progress at each state
transition. Also update ported Java test to use public accessor
instead of instance_variable_get.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 21ae446 to f447025 Compare April 21, 2026 21:12
p-ddsign and others added 3 commits April 21, 2026 17:18
Rebase conflict resolution incorrectly kept older branch versions,
reverting changes from merged PR #5618:
- compact! (in-place, fewer allocations) back to .compact
- Trailing commas removed
- Comment accuracy (FILE scope hierarchy)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rebase replayed both the original and the #5618-merged versions of
these 5 tests, producing duplicates. Keep the first set (with trailing
commas matching project style).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the has_injectible_lines parameter from Scope#initialize and
derive it from injectible_lines instead. The boolean carried no
independent information — all 4 extractor call sites already passed
`has_injectible_lines: !injectable_lines.nil?`.

Addresses review feedback from Strech on #5618.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 2196512 to 26107cf Compare April 22, 2026 12:03
nil (not computed), [] (no executable lines), non-empty (has ranges)
all affect has_injectible_lines and to_h serialization. This was
undocumented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the symbol-database-upload branch from 26107cf to 3216674 Compare April 22, 2026 12:05
p-ddsign and others added 7 commits April 22, 2026 08:08
Ruby predicate convention: question mark suffix, no has_ prefix.
Wire format key (:has_injectible_lines) unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove extra empty line at block body end in scope_spec.rb.

Co-Authored-By: Claude <noreply@anthropic.com>
Steep no longer flags these blocks, so the ignore comments are
redundant and cause CI failure (redundant ignore comment warning).

Co-Authored-By: Claude <noreply@anthropic.com>
No realistic need to change event types at runtime — Java and Go
hardcode their equivalent behavior. Move from an internal setting
to Extractor::INJECTABLE_LINE_EVENTS constant. Remove two tests
that tested the configurable behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The instance_double rule and Troubleshooting section were bundled
into a find_source_file bug fix commit. They don't belong in this PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ruby DI cannot instrument class methods (singleton class), so
extracting them is pointless. Remove upload_class_methods setting,
extract_singleton_method_scope, extract_singleton_method_parameters,
build_singleton_method_scope, and all associated tests. Delete
docs/class_methods_di_design.md. Class method extraction is deferred
until DI gains singleton class instrumentation support.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove extra empty line at block body end (Layout/EmptyLinesAroundBlockBody).

Co-Authored-By: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos core Involves Datadog core libraries

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants