Add Symbol Database Upload for Ruby Tracer#5431
Conversation
|
Thank you for updating Change log entry section 👏 Visited at: 2026-03-17 18:37:08 UTC |
Typing analysisNote: Ignored files are excluded from the next sections.
|
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: f9f4092 | Docs | Datadog PR Page | Give us feedback! |
Motivation: PR #5431 received review feedback on code quality and patterns. Addressing all code change requests to match Ruby tracer conventions. Technical Details: 1. Error logging pattern (20 instances changed): Changed: #{e.message} → #{e.class}: #{e} Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb Reason: Provides exception class for better debugging Pattern: "SymDB: Failed to X: #{e.class}: #{e}" 2. Time provider (2 instances changed): Changed: Time.now → Datadog::Core::Utils::Time.now File: component.rb (lines 56, 81) Added: require '../core/utils/time' Reason: Tracer uses its own time utilities (testable, mockable) 3. Constant for magic number (1 instance): Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant File: component.rb Added: UPLOAD_COOLDOWN = 60 # seconds Reason: Self-documenting, easier to test/modify Remaining PR feedback items (questions - will respond separately): - Q: Why require DI to be enabled? (research Java/Python behavior) - Q: Dependency warning pattern (analyzed - logger.warn is standard) - Q: Explain force upload (will add code comment) - Q: Concurrency safety of start_upload (will analyze and respond) - Q: In-progress uploads during shutdown (will analyze and respond) Testing: Changes validated by re-running tests after modifications. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
BenchmarksBenchmark execution time: 2026-04-26 02:45:55 Comparing candidate commit c92737e in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.
|
b7bb240 to
3c3c7e3
Compare
p-datadog
left a comment
There was a problem hiding this comment.
Thank you for the thorough review! All feedback has been addressed.
|
Backend dependency: DataDog/debugger-backend#1974 adds |
ab32a30 to
f9ef632
Compare
* origin/symbol-database-upload: (167 commits) Add /test/ path exclusion; fix DI docs on generated method filtering Lowercase language field: 'RUBY' → 'ruby' Extract empty classes (AR models, Forwardable-only) via const_source_location Remove synthetic self ARG from instance method symbols Log full scope tree at trace level during extraction Remove upload retries — single attempt, matching Python behavior Fix NoMethodError when transport returns InternalErrorResponse Add diagnostic logging: extraction summary and per-scope trace Revert JAVA workaround: use RUBY language and ruby ddsource Replace non-running test files with working RC integration test Fix Steep type errors in SymbolDatabase RBS signatures Add YARD docs to Logger#trace Remove workarounds: logger defaults, respond_to guard, Steepfile ignores Ignore symdb files in Steep typecheck Refactor Extractor from static class methods to component instance Add missing RBS signature for SymbolDatabase::Logger Fix StandardRB Style/KeywordParametersOrder violation Move param_name nil log to trace level, document introspection limitation Add symdb diagnostics: logger facade, trace level, prefix normalization Fix bare .filter_map breaking Ruby 2.6 in extract_all ...
* origin/symbol-database-upload: (167 commits) Add /test/ path exclusion; fix DI docs on generated method filtering Lowercase language field: 'RUBY' → 'ruby' Extract empty classes (AR models, Forwardable-only) via const_source_location Remove synthetic self ARG from instance method symbols Log full scope tree at trace level during extraction Remove upload retries — single attempt, matching Python behavior Fix NoMethodError when transport returns InternalErrorResponse Add diagnostic logging: extraction summary and per-scope trace Revert JAVA workaround: use RUBY language and ruby ddsource Replace non-running test files with working RC integration test Fix Steep type errors in SymbolDatabase RBS signatures Add YARD docs to Logger#trace Remove workarounds: logger defaults, respond_to guard, Steepfile ignores Ignore symdb files in Steep typecheck Refactor Extractor from static class methods to component instance Add missing RBS signature for SymbolDatabase::Logger Fix StandardRB Style/KeywordParametersOrder violation Move param_name nil log to trace level, document introspection limitation Add symdb diagnostics: logger facade, trace level, prefix normalization Fix bare .filter_map breaking Ruby 2.6 in extract_all ...
Motivation: PR #5431 received review feedback on code quality and patterns. Addressing all code change requests to match Ruby tracer conventions. Technical Details: 1. Error logging pattern (20 instances changed): Changed: #{e.message} → #{e.class}: #{e} Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb Reason: Provides exception class for better debugging Pattern: "SymDB: Failed to X: #{e.class}: #{e}" 2. Time provider (2 instances changed): Changed: Time.now → Datadog::Core::Utils::Time.now File: component.rb (lines 56, 81) Added: require '../core/utils/time' Reason: Tracer uses its own time utilities (testable, mockable) 3. Constant for magic number (1 instance): Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant File: component.rb Added: UPLOAD_COOLDOWN = 60 # seconds Reason: Self-documenting, easier to test/modify Remaining PR feedback items (questions - will respond separately): - Q: Why require DI to be enabled? (research Java/Python behavior) - Q: Dependency warning pattern (analyzed - logger.warn is standard) - Q: Explain force upload (will add code comment) - Q: Concurrency safety of start_upload (will analyze and respond) - Q: In-progress uploads during shutdown (will analyze and respond) Testing: Changes validated by re-running tests after modifications. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
68442c1 to
a89c77c
Compare
Motivation: PR #5431 received review feedback on code quality and patterns. Addressing all code change requests to match Ruby tracer conventions. Technical Details: 1. Error logging pattern (20 instances changed): Changed: #{e.message} → #{e.class}: #{e} Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb Reason: Provides exception class for better debugging Pattern: "SymDB: Failed to X: #{e.class}: #{e}" 2. Time provider (2 instances changed): Changed: Time.now → Datadog::Core::Utils::Time.now File: component.rb (lines 56, 81) Added: require '../core/utils/time' Reason: Tracer uses its own time utilities (testable, mockable) 3. Constant for magic number (1 instance): Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant File: component.rb Added: UPLOAD_COOLDOWN = 60 # seconds Reason: Self-documenting, easier to test/modify Remaining PR feedback items (questions - will respond separately): - Q: Why require DI to be enabled? (research Java/Python behavior) - Q: Dependency warning pattern (analyzed - logger.warn is standard) - Q: Explain force upload (will add code comment) - Q: Concurrency safety of start_upload (will analyze and respond) - Q: In-progress uploads during shutdown (will analyze and respond) Testing: Changes validated by re-running tests after modifications. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
03720f2 to
fa30bee
Compare
Motivation: PR #5431 received review feedback on code quality and patterns. Addressing all code change requests to match Ruby tracer conventions. Technical Details: 1. Error logging pattern (20 instances changed): Changed: #{e.message} → #{e.class}: #{e} Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb Reason: Provides exception class for better debugging Pattern: "SymDB: Failed to X: #{e.class}: #{e}" 2. Time provider (2 instances changed): Changed: Time.now → Datadog::Core::Utils::Time.now File: component.rb (lines 56, 81) Added: require '../core/utils/time' Reason: Tracer uses its own time utilities (testable, mockable) 3. Constant for magic number (1 instance): Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant File: component.rb Added: UPLOAD_COOLDOWN = 60 # seconds Reason: Self-documenting, easier to test/modify Remaining PR feedback items (questions - will respond separately): - Q: Why require DI to be enabled? (research Java/Python behavior) - Q: Dependency warning pattern (analyzed - logger.warn is standard) - Q: Explain force upload (will add code comment) - Q: Concurrency safety of start_upload (will analyze and respond) - Q: In-progress uploads during shutdown (will analyze and respond) Testing: Changes validated by re-running tests after modifications. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
33d3996 to
d702cea
Compare
| option :upload_class_methods do |o| | ||
| o.type :bool | ||
| o.default false | ||
| end | ||
|
|
||
| # Which RubyVM::InstructionSequence#trace_points event types | ||
| # to include when computing injectable lines on METHOD scopes. | ||
| # Default: [:line, :return]. :call excluded because method | ||
| # entry is handled by method probes, not line probes. | ||
| # Changeable at runtime — takes effect on next extraction. | ||
| option :injectable_line_events do |o| | ||
| o.default [:line, :return] | ||
| end |
There was a problem hiding this comment.
Is it possible to add an environment variable to these two configs ?
Motivation: PR #5431 received review feedback on code quality and patterns. Addressing all code change requests to match Ruby tracer conventions. Technical Details: 1. Error logging pattern (20 instances changed): Changed: #{e.message} → #{e.class}: #{e} Files: component.rb, extractor.rb, file_hash.rb, remote.rb, scope_context.rb, uploader.rb Reason: Provides exception class for better debugging Pattern: "SymDB: Failed to X: #{e.class}: #{e}" 2. Time provider (2 instances changed): Changed: Time.now → Datadog::Core::Utils::Time.now File: component.rb (lines 56, 81) Added: require '../core/utils/time' Reason: Tracer uses its own time utilities (testable, mockable) 3. Constant for magic number (1 instance): Changed: Hardcoded 60 → UPLOAD_COOLDOWN constant File: component.rb Added: UPLOAD_COOLDOWN = 60 # seconds Reason: Self-documenting, easier to test/modify Remaining PR feedback items (questions - will respond separately): - Q: Why require DI to be enabled? (research Java/Python behavior) - Q: Dependency warning pattern (analyzed - logger.warn is standard) - Q: Explain force upload (will add code comment) - Q: Concurrency safety of start_upload (will analyze and respond) - Q: In-progress uploads during shutdown (will analyze and respond) Testing: Changes validated by re-running tests after modifications. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- extract_injectable_lines: log iseq-nil skip (C extension), log range count and line span for successful extraction - component summary: include count of methods with injectable lines Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comment explaining the behavioral difference on Ruby 2.6 where const_source_location is unavailable: namespace-only modules and classes with only generated methods are silently omitted from extraction. Co-Authored-By: Claude <noreply@anthropic.com>
…mponent pattern Review fixes: - Remove Steepfile ignores for component.rb and extractor.rb (use per-line steep:ignore directives instead of hiding entire files) - Inject logger into FileHash.compute instead of using Datadog.logger global - Add telemetry to Remote.process_change rescue block - Capture exception in Component.schedule_deferred_upload bare rescue and log it - Add YARD docstrings to log_scope_tree, Logger#initialize, Endpoint#initialize, Settings.extended - Remove stale TODO comment from uploader.rb (backend workaround already reverted) - Remove useless comments: "# Build components", "# Get component from global state", "# Re-enable with new config", "# Delete change has 'previous'...", "# Parse JSON string to Hash", "# Validate it's actually a Hash", "# Only call errored()...", "# Handle both content and previous" - Remove stale process_changes method from remote.rbs - Update FileHash RBS signature and specs for new logger: parameter - Document Ruby 2.6 graceful degradation for const_source_location fallback All 266 symbol_database specs pass. Steep clean. Co-Authored-By: Claude <noreply@anthropic.com>
…g docs On Ruby 2.6, namespace-only modules and classes with only generated methods may not appear in DI auto-completion because Module#const_source_location is unavailable. Document this in the Symbol Database section of DynamicInstrumentation.md. Co-Authored-By: Claude <noreply@anthropic.com>
Document the DD_SYMBOL_DATABASE_UPLOAD_ENABLED env var and c.symbol_database.enabled setting in customer-facing docs, per CLAUDE.md requirement to update GettingStarted.md when adding new settings/env vars. Links to DynamicInstrumentation.md for full details. Co-Authored-By: Claude <noreply@anthropic.com>
Four customer-facing DI env vars (DD_DYNAMIC_INSTRUMENTATION_ENABLED, DD_DYNAMIC_INSTRUMENTATION_REDACTED_IDENTIFIERS, DD_DYNAMIC_INSTRUMENTATION_REDACTION_EXCLUDED_IDENTIFIERS, DD_DYNAMIC_INSTRUMENTATION_REDACTED_TYPES) were missing from GettingStarted.md. Add a proper Dynamic Instrumentation section with all env vars and programmatic settings, with Symbol Database as a subsection underneath. Co-Authored-By: Claude <noreply@anthropic.com>
…y doc - Add telemetry to top-level extract() and extract_all() rescue blocks so extraction failures are tracked (symbol_database.extract_error, symbol_database.extract_all_error) - Capture => e and add debug logging to three method-level bare rescues: find_source_file, calculate_class_line_range, build_class_language_specifics - Remove dead code (empty if block in extract_singleton_method_parameters) - Add class-level comment documenting the three-layer rescue strategy: inner per-item (bare, skip one item), method-level (log, return safe default), top-level (log + telemetry, error boundary) - Restore Steepfile ignores for component.rb and extractor.rb with explanation (RBS only covers public API; 30+ private methods undeclared) All 266 specs pass. Steep clean. Co-Authored-By: Claude <noreply@anthropic.com>
…tractor - Remove Steepfile ignores for component.rb and extractor.rb - Rewrite extractor.rbs: change from stale class methods to instance methods matching actual implementation, add all 30+ private methods - Update component.rbs: add missing FORCE_UPLOAD_ONCE constant, @extractor/@shutdown instance vars, schedule_deferred_upload, shutdown?, environment_supported?, log_scope_tree, count_injectable_methods - Update logger.rbs: debug/warn accept both positional message and block (matching ::Logger delegation behavior) - Add per-line steep:ignore for Steep false positives: array slice nilability, RubyVM::InstructionSequence.of missing from stdlib RBS, Hash.new block type, Module vs Class narrowing Steep clean. All 266 specs pass. Co-Authored-By: Claude <noreply@anthropic.com>
The skip 'git not available' guard is unnecessary — git is always present in CI (the repo is checked out via git, and release_gem_spec.rb already calls git ls-files without a skip guard). Removing the skip ensures the test hard-fails if the Git blob algorithm compatibility breaks, rather than silently passing. Also improved test description to explain what the test validates: that FileHash.compute produces identical output to git hash-object, confirming Git blob hashing compatibility for backend commit inference. Co-Authored-By: Claude <noreply@anthropic.com>
…test The test at extractor_spec.rb:643 used `skip 'requires Ruby 2.7+'` but every other const_source_location test in the file uses inline `if Module.method_defined?(:const_source_location)` branching with explicit expectations for both paths. Align this test with the existing pattern so it runs on all Ruby versions: verifies file_hash on 2.7+, verifies nil (graceful degradation) on < 2.7. Co-Authored-By: Claude <noreply@anthropic.com>
…s tests, wire format Gap 1: Replace 9 bare rescues in extractor.rb with => e + debug logging. Gap 4: Add DI-independence test (symdb works with DI disabled). Gap 5: Add 10 injectable lines test cases: build_injectable_ranges (5 input/output cases), C-extension iseq nil, dedup, initialize method, custom event filters (2 cases). Gap 6: Add wire format integration test verifying multipart form structure, event.json metadata, and gzip-compressed payload. 278 tests, 0 failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…scues Scope serialization tests (scope_spec.rb): - METHOD with injectable lines: has_injectible_lines + injectible_lines present - METHOD without: has_injectible_lines false, no injectible_lines key - CLASS/MODULE/FILE: neither field present extract() path tests (extractor_spec.rb): - Instance method injectable lines via extract() - Singleton method injectable lines via extract() with upload_class_methods Bare rescues (extractor.rb): - All 9 bare rescues replaced with => e + @logger.debug 263 tests, 0 failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… early The `describe '#to_h'` block was closed prematurely at line 220, leaving the injectable lines tests (lines 222-284) orphaned outside any describe block with an extra `end` at line 285. This caused a SyntaxError at load time (`unexpected 'end'` at line 330), which produced "1 error occurred outside of examples" in all 30 failing CI jobs. Fix: remove the premature `end` so the injectable lines tests remain inside `describe '#to_h'`. Verified: `ruby -c` passes, 22 examples 0 failures locally. Co-Authored-By: Claude <noreply@anthropic.com>
Style/RedundantAssignment: `result` was assigned from filter_map then immediately returned. Return the filter_map result directly. Co-Authored-By: Claude <noreply@anthropic.com>
Expose enabled, last_upload_time, and upload_in_progress as attr_readers for diagnostic tools (gobo demo pages, health checks). These are @api private read-only accessors — no new public API. Thread-safe: ivars set inside mutex, Ruby reference reads are atomic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Test enabled, last_upload_time, upload_in_progress at each state transition. Also update ported Java test to use public accessor instead of instance_variable_get. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
21ae446 to
f447025
Compare
Rebase conflict resolution incorrectly kept older branch versions, reverting changes from merged PR #5618: - compact! (in-place, fewer allocations) back to .compact - Trailing commas removed - Comment accuracy (FILE scope hierarchy) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rebase replayed both the original and the #5618-merged versions of these 5 tests, producing duplicates. Keep the first set (with trailing commas matching project style). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the has_injectible_lines parameter from Scope#initialize and derive it from injectible_lines instead. The boolean carried no independent information — all 4 extractor call sites already passed `has_injectible_lines: !injectable_lines.nil?`. Addresses review feedback from Strech on #5618. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2196512 to
26107cf
Compare
nil (not computed), [] (no executable lines), non-empty (has ranges) all affect has_injectible_lines and to_h serialization. This was undocumented. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
26107cf to
3216674
Compare
Ruby predicate convention: question mark suffix, no has_ prefix. Wire format key (:has_injectible_lines) unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove extra empty line at block body end in scope_spec.rb. Co-Authored-By: Claude <noreply@anthropic.com>
Steep no longer flags these blocks, so the ignore comments are redundant and cause CI failure (redundant ignore comment warning). Co-Authored-By: Claude <noreply@anthropic.com>
No realistic need to change event types at runtime — Java and Go hardcode their equivalent behavior. Move from an internal setting to Extractor::INJECTABLE_LINE_EVENTS constant. Remove two tests that tested the configurable behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The instance_double rule and Troubleshooting section were bundled into a find_source_file bug fix commit. They don't belong in this PR. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ruby DI cannot instrument class methods (singleton class), so extracting them is pointless. Remove upload_class_methods setting, extract_singleton_method_scope, extract_singleton_method_parameters, build_singleton_method_scope, and all associated tests. Delete docs/class_methods_di_design.md. Class method extraction is deferred until DI gains singleton class instrumentation support. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove extra empty line at block body end (Layout/EmptyLinesAroundBlockBody). Co-Authored-By: Claude <noreply@anthropic.com>
What does this PR do?
Implements symbol database upload for dd-trace-rb. Extracts class, method, and variable information from running Ruby applications and uploads it to the Datadog backend to support Dynamic Instrumentation auto-completion and search.
Motivation:
Brings Ruby tracer to parity with Java and Python for symbol database support. Enables IDE-like auto-completion when creating DI probes — users see class fields, method arguments, and local variables as they type.
Change log entry
Yes. Add symbol database upload support for Dynamic Instrumentation. When enabled, the tracer extracts and uploads symbol information (class names, method signatures, variable names) to enable auto-completion and search features in the DI UI.
Implementation:
Datadog::SymbolDatabasenamespace (peer to DI, not under it)Datadog::*, stdlib, gems)/symdb/v1/inputviaCore::Transport::HTTPRuby version differences:
On Ruby 2.6,
Module#const_source_locationis unavailable (added in Ruby 2.7). This means namespace-only modules (e.g.,module ApplicationCable; class Channel...; end; endwhere the module has no methods) and classes with only generated methods (e.g., ActiveRecord models with only associations) may not appear in the symbol database. On Ruby 2.7+,const_source_locationlocates these via their constant declarations. This is a graceful degradation — fewer symbols in the DI auto-completion UI on 2.6, no errors.How to test the change?