feat(opentelemetry sink): add automatic native log and trace to OTLP conversion #24621

Open
szibis wants to merge 59 commits into vectordotdev:master from szibis:feat/otlp-native-auto-conversion

Conversation

szibis (Contributor) commented Feb 9, 2026

Summary

Add automatic conversion from Vector's native flat log and trace formats to OTLP (OpenTelemetry Protocol) format in the opentelemetry sink's otlp codec.

Problem: Users currently need 50+ lines of complex VRL to manually build the nested OTLP structure (resourceLogs → scopeLogs → logRecords, with KeyValue arrays), and trace events fail entirely when sent to the OTLP sink without passthrough mode.

Solution: The OTLP encoder now automatically detects native log and trace events and converts them to valid OTLP protobuf. Pre-formatted OTLP events continue to use passthrough encoding (backward compatible).
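
As a rough illustration of the detection step described above, the sketch below picks an encode path from an event's top-level field names. The `Fields` alias, the `EncodePath` enum, and `choose_encode_path` are hypothetical names made up for this example, and the real encoder works on Vector's event types rather than a string map; only the three root keys come from this PR description.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an event: top-level field names mapped to raw values.
type Fields = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
enum EncodePath {
    /// Event already carries the nested OTLP structure.
    Passthrough,
    /// Flat native event that needs conversion first.
    AutoConvert,
}

/// Root keys that mark an event as pre-formatted OTLP.
const OTLP_ROOT_KEYS: [&str; 3] = ["resourceLogs", "resourceSpans", "resourceMetrics"];

/// Choose passthrough when any OTLP root key is present, otherwise auto-convert.
fn choose_encode_path(fields: &Fields) -> EncodePath {
    if OTLP_ROOT_KEYS.iter().any(|key| fields.contains_key(*key)) {
        EncodePath::Passthrough
    } else {
        EncodePath::AutoConvert
    }
}
```

Under this assumption, an event with only `message` and `timestamp` would take the auto-convert path, while one with a `resourceLogs` root would be passed through unchanged.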

Scope

| Signal | Native → OTLP | Passthrough (use_otlp_decoding: true) |
| --- | --- | --- |
| Logs | ✅ Auto-converts | ✅ Passthrough |
| Traces | ✅ Auto-converts | ✅ Passthrough |
| Metrics | ✅ Auto-converts (see #24897) | ✅ Passthrough |

Performance Impact (Logs)

| Scenario | OLD (VRL + encode) | NEW (auto-convert) | Improvement |
| --- | --- | --- | --- |
| Single event | 374 µs | 340 µs | 9% faster |
| Batch 100 | 2,886 µs | 620 µs | 4.7x faster |
| Batch throughput | 58 MiB/s | 269 MiB/s | 4.7x higher |
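
The improvement column can be re-derived from the raw timings. The two helpers below are illustrative only (the names are not from the benchmark code) and simply restate the arithmetic.

```rust
/// Percentage by which the new latency beats the old one.
fn percent_faster(old_us: f64, new_us: f64) -> f64 {
    (old_us - new_us) / old_us * 100.0
}

/// Ratio of old to new (for latencies) or new to old (for throughput, swap the arguments).
fn ratio(numerator: f64, denominator: f64) -> f64 {
    numerator / denominator
}
```

With the table's figures, `percent_faster(374.0, 340.0)` is about 9.1, `ratio(2886.0, 620.0)` is about 4.65, and `ratio(269.0, 58.0)` is about 4.64, so the reported factors agree with the measurements to within rounding.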

Architecture Comparison

%%{init: {'theme': 'base', 'themeVariables': { 'lineColor': '#000000', 'primaryTextColor': '#000000'}}}%%
flowchart TB
    subgraph OLD["OLD: 50 lines VRL"]
        direction LR
        O1[Source] ==> O2[VRL Transform]
        O2 ==> O3[OTLP Encoder]
        O3 ==> O4[Collector]
    end

    subgraph NEW["NEW: Zero VRL"]
        direction LR
        N1[Source] ==> N2[OTLP Encoder]
        N2 ==> N3[Collector]
    end

    OLD ==> NEW

    style O1 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style O2 fill:#cccccc,stroke:#000000,stroke-width:3px,color:#000000
    style O3 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style O4 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style N1 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style N2 fill:#999999,stroke:#000000,stroke-width:3px,color:#000000
    style N3 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000

    linkStyle default stroke:#000000,stroke-width:2px

Before vs After Comparison

| Capability | Before PR | After PR |
| --- | --- | --- |
| **Native logs → OTLP sink** | | |
| use_otlp_decoding: false + codec: otlp | ❌ encoding error | ✅ auto-converts |
| Non-OTLP source (file/syslog) + codec: otlp | ❌ encoding error | ✅ auto-converts |
| use_otlp_decoding: true + codec: otlp | ✅ passthrough | ✅ passthrough |
| Manual VRL rebuild needed | ❌ 50+ lines required | ✅ not needed |
| **Native traces → OTLP sink** | | |
| Trace events + codec: otlp | ❌ encoding error | ✅ auto-converts |
| use_otlp_decoding: true + codec: otlp | ✅ passthrough | ✅ passthrough |
| **Metrics → OTLP sink** | | |
| use_otlp_decoding: true + codec: otlp | ✅ passthrough | ✅ passthrough |
| Native metrics + codec: otlp | ❌ not supported | ✅ auto-converts (see #24897) |
| **Auto-conversion field mapping (logs)** | | |
| .message → body.stringValue | ❌ manual VRL | ✅ automatic |
| .severity_text → severityText | ❌ manual VRL | ✅ automatic |
| .severity_number → severityNumber | ❌ manual VRL | ✅ automatic |
| .attributes.* → logRecords[].attributes[] | ❌ manual VRL | ✅ automatic |
| .resources.* → resource.attributes[] | ❌ manual VRL | ✅ automatic |
| .trace_id → traceId | ❌ manual VRL | ✅ automatic |
| .span_id → spanId | ❌ manual VRL | ✅ automatic |
| .timestamp → timeUnixNano | ❌ manual VRL | ✅ automatic |
| Severity inferred from text | | |
| Hex validation on trace/span IDs | | |
| Multiple timestamp formats | | |
| **Auto-conversion field mapping (traces)** | | |
| .trace_id → traceId (16 bytes) | ❌ encoding error | ✅ automatic |
| .span_id → spanId (8 bytes) | ❌ encoding error | ✅ automatic |
| .parent_span_id → parentSpanId | ❌ encoding error | ✅ automatic |
| .name → name | ❌ encoding error | ✅ automatic |
| .kind → kind | ❌ encoding error | ✅ automatic |
| .start_time_unix_nano / .end_time_unix_nano | ❌ encoding error | ✅ automatic |
| .attributes.* → attributes[] | ❌ encoding error | ✅ automatic |
| .resources.* → resource.attributes[] | ❌ encoding error | ✅ automatic |
| .events → events[] (span events) | ❌ encoding error | ✅ automatic |
| .links → links[] (span links) | ❌ encoding error | ✅ automatic |
| .status → status (message, code) | ❌ encoding error | ✅ automatic |
| **Data integrity (native → OTLP)** | | |
| String body | ❌ fails | |
| Structured body (kvlist/array) | ❌ fails | ⚠️ stringified |
| String attributes | ❌ fails | |
| Complex attributes (array/nested) | ❌ fails | ⚠️ best-effort |
| ResourceLogs batching | ❌ fails | ✅ 1:1 per event |
| ScopeLogs grouping | ❌ fails | ✅ 1:1 per event |
| **Wire efficiency** | | |
| Batching structure | ❌ destroyed (1 record per plog.Logs) | ✅ preserved (all records stay together) |
| Wire payload size | ❌ N × (resource + scope + 1 record) | ✅ 1 × (resource + scope + N records) |
| **Error handling** | | |
| Missing fields | ❌ crash/drop | ✅ graceful defaults |
| Invalid trace_id hex | ❌ crash/drop | ✅ warning, field omitted |
| Unknown timestamp format | ❌ crash/drop | ✅ warning, uses 0 |
| **Performance** | | |
| Single event throughput | N/A (failed) | ✅ 9% faster than VRL |
| Batch throughput (100 events) | N/A (failed) | ✅ 4.7× faster than VRL |
| Throughput (MiB/s) | 58 (VRL workaround) | 269 |
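
To make the "Wire payload size" row concrete, here is a back-of-the-envelope model of the two layouts. The function names and the byte sizes used in the usage note are invented for illustration, and the model ignores protobuf framing overhead.

```rust
/// Each record ships with its own copy of the resource and scope:
/// N x (resource + scope + 1 record).
fn unbatched_bytes(resource: usize, scope: usize, record: usize, n: usize) -> usize {
    n * (resource + scope + record)
}

/// Resource and scope are sent once and shared by all records:
/// 1 x (resource + scope + N records).
fn batched_bytes(resource: usize, scope: usize, record: usize, n: usize) -> usize {
    resource + scope + n * record
}
```

Assuming a 200-byte resource, a 50-byte scope, and 100-byte records, 100 records cost 35,000 bytes unbatched versus 10,250 bytes batched in this model.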

Vector configuration

Before (Complex VRL Required)

50+ lines of VRL transformation
sources:
  otel_source:
    type: opentelemetry
    grpc:
      address: 0.0.0.0:4317

transforms:
  build_otlp_structure:
    type: remap
    inputs: ["otel_source.logs"]
    source: |
      resource_attrs = []
      if exists(.resources) {
        for_each(object!(.resources)) -> |k, v| {
          resource_attrs = push(resource_attrs, {
            "key": k,
            "value": { "stringValue": to_string(v) ?? "" }
          })
        }
      }

      log_attrs = []
      if exists(.attributes) {
        for_each(object!(.attributes)) -> |k, v| {
          attr_value = if is_boolean(v) {
            { "boolValue": v }
          } else if is_integer(v) {
            { "intValue": to_string!(v) }
          } else if is_float(v) {
            { "doubleValue": v }
          } else {
            { "stringValue": to_string(v) ?? "" }
          }
          log_attrs = push(log_attrs, { "key": k, "value": attr_value })
        }
      }

      .resource_logs = [{
        "resource": { "attributes": resource_attrs },
        "scopeLogs": [{
          "scope": {
            "name": .scope.name ?? "",
            "version": .scope.version ?? ""
          },
          "logRecords": [{
            "timeUnixNano": to_string(to_unix_timestamp(.timestamp, unit: "nanoseconds")),
            "severityText": .severity_text ?? "INFO",
            "severityNumber": .severity_number ?? 9,
            "body": { "stringValue": .message ?? "" },
            "attributes": log_attrs,
            "traceId": .trace_id ?? "",
            "spanId": .span_id ?? ""
          }]
        }]
      }]

sinks:
  otel_out:
    type: opentelemetry
    inputs: ["build_otlp_structure"]
    endpoint: http://collector:4317
    encoding:
      codec: otlp

After (Zero VRL)

sources:
  otel_source:
    type: opentelemetry
    grpc:
      address: 0.0.0.0:4317
    # use_otlp_decoding: false (default)

sinks:
  otel_out:
    type: opentelemetry
    inputs: ["otel_source.logs"]
    endpoint: http://collector:4317
    encoding:
      codec: otlp  # Automatic conversion!

Traces (Zero VRL)

sources:
  otel_source:
    type: opentelemetry
    grpc:
      address: 0.0.0.0:4317

sinks:
  otel_out:
    type: opentelemetry
    inputs: ["otel_source.traces"]
    endpoint: http://collector:4317
    encoding:
      codec: otlp  # Native traces auto-converted to OTLP protobuf

Native Metrics (Auto-Convert via #24897)

sources:
  host_metrics:
    type: host_metrics

sinks:
  otel_out:
    type: opentelemetry
    inputs: ["host_metrics"]
    endpoint: http://collector:4317
    encoding:
      codec: otlp  # Native metrics auto-converted to OTLP protobuf (see #24897)

With Optional Enrichment

sources:
  otel_source:
    type: opentelemetry
    grpc:
      address: 0.0.0.0:4317

transforms:
  enrich:
    type: remap
    inputs: ["otel_source.logs"]
    source: |
      # Simple flat field access - no nested structure needed
      .attributes.processed_by = "vector"
      .resources."deployment.region" = "us-west-2"

sinks:
  otel_out:
    type: opentelemetry
    inputs: ["enrich"]
    endpoint: http://collector:4317
    encoding:
      codec: otlp

Supported Native Log Format

{
  "message": "User login successful",
  "timestamp": "2024-01-15T10:30:00Z",
  "severity_text": "INFO",
  "severity_number": 9,
  "trace_id": "0123456789abcdef0123456789abcdef",
  "span_id": "fedcba9876543210",
  "attributes": {
    "user_id": "user-12345",
    "duration_ms": 42.5
  },
  "resources": {
    "service.name": "auth-service"
  },
  "scope": {
    "name": "auth-module",
    "version": "1.0.0"
  }
}

Supported Native Trace Format

{
  "trace_id": "0123456789abcdef0123456789abcdef",
  "span_id": "fedcba9876543210",
  "parent_span_id": "abcdef0123456789",
  "name": "HTTP GET /api/users",
  "kind": 2,
  "start_time_unix_nano": 1705312200000000000,
  "end_time_unix_nano": 1705312200042000000,
  "attributes": {
    "http.method": "GET",
    "http.status_code": 200
  },
  "resources": {
    "service.name": "api-gateway"
  },
  "status": {
    "code": 1,
    "message": "OK"
  },
  "events": [
    {
      "name": "request.start",
      "time_unix_nano": 1705312200000000000,
      "attributes": { "component": "handler" }
    }
  ],
  "links": []
}

Log Field Mapping

| Native Field | OTLP Field | Notes |
| --- | --- | --- |
| .message / .body / .msg | body.stringValue | Auto-detected |
| .timestamp | timeUnixNano | chrono, epoch, RFC3339 |
| .severity_text | severityText | |
| .severity_number | severityNumber | Auto-inferred from text if missing |
| .trace_id | traceId | Hex string → 16 bytes |
| .span_id | spanId | Hex string → 8 bytes |
| .attributes.* | attributes[] | Object → KeyValue array |
| .resources.* | resource.attributes[] | Object → KeyValue array |
| .scope.name | scope.name | Instrumentation scope |
| .scope.version | scope.version | Instrumentation scope |
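
The "Object → KeyValue array" rows can be pictured with the simplified model below. `NativeValue`, `AnyValue`, and `KeyValue` are pared-down stand-ins written for this example, not the actual Vector `Value` or generated protobuf types, and the handling of arrays and nested objects shown here is an assumption rather than the PR's exact behavior.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a native event value.
#[derive(Debug, Clone, PartialEq)]
enum NativeValue {
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
    Array(Vec<NativeValue>),
    Object(BTreeMap<String, NativeValue>),
}

/// Simplified stand-in for the OTLP AnyValue oneof.
#[derive(Debug, Clone, PartialEq)]
enum AnyValue {
    BoolValue(bool),
    IntValue(i64),
    DoubleValue(f64),
    StringValue(String),
    ArrayValue(Vec<AnyValue>),
    KvlistValue(Vec<KeyValue>),
}

#[derive(Debug, Clone, PartialEq)]
struct KeyValue {
    key: String,
    value: AnyValue,
}

/// Map one native value to its OTLP counterpart, recursing into containers.
fn to_any_value(value: &NativeValue) -> AnyValue {
    match value {
        NativeValue::Bool(b) => AnyValue::BoolValue(*b),
        NativeValue::Int(i) => AnyValue::IntValue(*i),
        NativeValue::Float(f) => AnyValue::DoubleValue(*f),
        NativeValue::Str(s) => AnyValue::StringValue(s.clone()),
        NativeValue::Array(items) => AnyValue::ArrayValue(items.iter().map(to_any_value).collect()),
        NativeValue::Object(map) => AnyValue::KvlistValue(object_to_key_values(map)),
    }
}

/// Flatten an attributes object into a KeyValue array (keys in sorted order).
fn object_to_key_values(map: &BTreeMap<String, NativeValue>) -> Vec<KeyValue> {
    map.iter()
        .map(|(key, value)| KeyValue { key: key.clone(), value: to_any_value(value) })
        .collect()
}
```

Applied to the `attributes` object from the native log example, this would yield two entries: `duration_ms` as a double value and `user_id` as a string value.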

Trace Field Mapping

| Native Field | OTLP Field | Notes |
| --- | --- | --- |
| .trace_id | traceId | Hex string → 16 bytes |
| .span_id | spanId | Hex string → 8 bytes |
| .parent_span_id | parentSpanId | Hex string → 8 bytes |
| .name | name | Span operation name |
| .kind | kind | SpanKind enum (0-5) |
| .start_time_unix_nano | startTimeUnixNano | Nanosecond timestamp |
| .end_time_unix_nano | endTimeUnixNano | Nanosecond timestamp |
| .trace_state | traceState | W3C trace state string |
| .attributes.* | attributes[] | Object → KeyValue array |
| .resources.* | resource.attributes[] | Object → KeyValue array |
| .events[] | events[] | Span events (name, time, attributes) |
| .links[] | links[] | Span links (trace_id, span_id, attributes) |
| .status.code | status.code | StatusCode enum |
| .status.message | status.message | Status description |
| .dropped_attributes_count | droppedAttributesCount | |
| .dropped_events_count | droppedEventsCount | |
| .dropped_links_count | droppedLinksCount | |
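
The "Hex string → N bytes" conversions in the table can be sketched as follows. `decode_hex_id` is a hypothetical helper name; returning `None` here models the "warning, field omitted" behavior described earlier, without the logging.

```rust
/// Decode a hex ID into exactly `expected_len` bytes.
/// Returns None for wrong-length input or any non-hex character.
fn decode_hex_id(hex: &str, expected_len: usize) -> Option<Vec<u8>> {
    // Validate up front so the byte-offset slicing below is always safe.
    if hex.len() != expected_len * 2 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..expected_len)
        .map(|i| u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok())
        .collect()
}
```

A trace ID would be decoded with `decode_hex_id(id, 16)` and a span or parent span ID with `decode_hex_id(id, 8)`.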

How did you test this PR?

Unit Tests

cargo test -p opentelemetry-proto  # 72+ tests (log + trace conversion)
cargo test -p codecs --test otlp   # 28 tests

Tests cover:

  • Basic encoding, error handling, type conversion, safe timestamp overflow handling
  • Logs: Timestamp formats, severity inference, message fallbacks, hex validation
  • Traces: Basic fields, timestamps, parent_span_id, trace_state, attributes, resources, status, events, links, dropped counts, invalid/wrong-length IDs, mixed valid/invalid fields
  • Roundtrip encode/decode for both signals
  • Validation edge cases (negative timestamps, severity clamping, ID length)
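
For readers unfamiliar with severity inference, the sketch below shows one plausible shape for it. The numeric values are the base values of the OTLP SeverityNumber enum; the function name and the exact set of accepted spellings (for example `WARNING`) are assumptions and may differ from what the PR implements.

```rust
/// Infer an OTLP severity number from free-form severity text.
/// Returns 0 (unspecified) when the text is not recognized.
fn infer_severity_number(text: &str) -> i32 {
    match text.trim().to_ascii_uppercase().as_str() {
        "TRACE" => 1,
        "DEBUG" => 5,
        "INFO" => 9,
        "WARN" | "WARNING" => 13,
        "ERROR" => 17,
        "FATAL" => 21,
        _ => 0,
    }
}
```

This matches the native log example above, where `severity_text` "INFO" is paired with `severity_number` 9.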

E2E Tests

cargo vdev test e2e opentelemetry-native

Docker Compose with telemetrygen validates:

  • Native logs convert to valid OTLP
  • Field preservation through pipeline
  • Custom attributes via VRL transforms
  • Correct event counting metrics

Benchmarks

cargo bench --bench codecs --features codecs-benches -- otlp_encoding

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Pre-formatted OTLP events (with resourceLogs/resourceSpans/resourceMetrics fields) continue to use the existing passthrough path. Native metric events return an explicit error with a clear message (the same behavior as before for unsupported types).

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

| Issue/PR | Relationship |
| --- | --- |
| #22789 | Directly related: our codec-level approach provides the same outcome with a simpler implementation |
| #24515 | Solves: user requesting an OTLP transform helper; this eliminates the need |
| #23971 | Solves root cause: OTLP encoding failing; our encoder handles all formats |
| #22550 | Complementary: adds metrics support; our patterns can extend to metrics |
| #24897 | Companion PR: adds native metric → OTLP conversion (Counter, Gauge, Histogram, Summary, Distribution, Set) |

Add conversion from Vector's native flat log format to OTLP protobuf:

- Value → PBValue converters (inverse of existing PBValue → Value)
- native_log_to_otlp_request() for full event conversion
- Safe extraction helpers with graceful error handling
- Hex validation for trace_id (16 bytes) and span_id (8 bytes)
- Severity inference from severity_text when number missing
- Support for multiple timestamp formats (chrono, epoch, RFC3339)
- Pre-allocation and inline hints for performance

Detect the native log format and convert it to OTLP automatically:
- Applies when the event does not contain a 'resourceLogs' field (that is, it is not pre-formatted OTLP)
- Works with any Vector source (file, socket, otlp with flat decoding)

Maintains backward compatibility:
- Pre-formatted OTLP events (use_otlp_decoding: true) encode via passthrough
- Native events get automatic conversion to valid OTLP protobuf

This eliminates the need for 50+ lines of complex VRL transformation.

Add integration and E2E tests:

Unit/Integration tests (lib/codecs/tests/otlp.rs):
- Basic encoding functionality
- Error handling (invalid types, missing fields, malformed hex)
- Source compatibility (file, syslog, modified OTLP)
- Timestamp handling (seconds, nanos, RFC3339, chrono)
- Severity inference from text
- Message field fallbacks (.message, .body, .msg, .log)
- Roundtrip encode/decode verification

E2E tests (tests/e2e/opentelemetry/native/):
- Native logs convert to valid OTLP
- Service name preservation through conversion
- Log body, severity, timestamps preserved
- Custom attributes via VRL transforms
- Correct event counting metrics

Add comprehensive benchmarks comparing encoding approaches:

1. NEW: Native → auto-convert → encode (this PR)
2. OLD: VRL transform simulation → encode (what users had before)
3. OLD: Passthrough only (pre-formatted OTLP)

Results show 4.7x throughput improvement for batch operations:
- NEW batch: 288 MiB/s
- OLD VRL: 61 MiB/s

Single event is 7.4% faster than VRL approach.

- Changelog fragment for release notes
- Comprehensive documentation with mermaid diagrams
- Before/after configuration examples
- Field mapping reference
- Performance comparison tables
@szibis szibis requested a review from a team as a code owner February 9, 2026 16:36
@szibis szibis requested a review from a team February 9, 2026 16:36
@szibis szibis requested a review from a team as a code owner February 9, 2026 16:36
@szibis szibis changed the title Feat/otlp native auto conversion feat(opentelemetry): add automatic native log to OTLP conversion in opentelemetry sink Feb 9, 2026
github-actions bot (Contributor) commented Feb 9, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@szibis szibis changed the title feat(opentelemetry): add automatic native log to OTLP conversion in opentelemetry sink feat(opentelemetry sink): add automatic native log to OTLP conversion Feb 9, 2026
szibis (Contributor, Author) commented Feb 9, 2026

I have read the CLA Document and I hereby sign the CLA

Fix check-spelling CI failure by adding two domain-specific terms:
- kvlist: OpenTelemetry KeyValueList type
- xychart: Mermaid diagram chart type
@github-actions github-actions bot added the domain: ci Anything related to Vector's CI environment label Feb 10, 2026
@thomasqueirozb thomasqueirozb added domain: codecs Anything related to Vector's codecs (encoding/decoding) sink: opentelemetry Anything `opentelemetry` sink related labels Feb 11, 2026
maycmlee (Contributor) left a comment

Hi, thanks for opening this PR! I'm the docs reviewer. Just some small doc suggestions and also a question about using version numbers.

szibis and others added 2 commits February 17, 2026 21:16
Co-authored-by: May Lee <may.lee@datadoghq.com>
Co-authored-by: May Lee <may.lee@datadoghq.com>
@github-actions github-actions bot removed the domain: codecs Anything related to Vector's codecs (encoding/decoding) label Feb 17, 2026
Co-authored-by: May Lee <may.lee@datadoghq.com>
@szibis szibis force-pushed the feat/otlp-native-auto-conversion branch from 6891644 to 9ba4ed1 Compare March 2, 2026 23:51
@github-actions github-actions bot removed domain: topology Anything related to Vector's topology code domain: sources Anything related to the Vector's sources domain: transforms Anything related to Vector's transform components domain: sinks Anything related to the Vector's sinks domain: core Anything related to core crates i.e. vector-core, core-common, etc domain: vdev Anything related to the vdev tooling labels Mar 2, 2026
@szibis szibis requested a review from pront March 2, 2026 23:54
szibis (Contributor, Author) commented Mar 11, 2026

@pront PR #24897 handles metrics, using this PR as a reference. Together they should close out the whole problem once both PRs are merged.

pront (Member) left a comment

Hi @szibis, I will prioritize reviewing this PR.

szibis added a commit to szibis/vector that referenced this pull request Mar 11, 2026
… encode

Apply review feedback patterns from PR vectordotdev#24621:
- Replace `as u64` with `u64::try_from().ok()` for timestamp conversion
- Replace `as u64`/`as f64` with `u64::from()`/`f64::from()` for sample.rate
- Remove unwrap() in Distribution bucket overflow guard, use
  saturating index clamping instead
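
The first bullet above is easiest to see side by side with the cast it replaces. The helper name below is hypothetical; `u64::try_from` on an `i64` is standard-library behavior.

```rust
/// Convert a signed nanosecond timestamp to u64, rejecting negative values
/// instead of letting an `as` cast wrap them around.
fn timestamp_nanos_checked(nanos: i64) -> Option<u64> {
    u64::try_from(nanos).ok()
}

/// What the bare cast does with the same input: -1 silently becomes u64::MAX.
fn timestamp_nanos_wrapping(nanos: i64) -> u64 {
    nanos as u64
}
```

A caller can then pick an explicit fallback, for example `timestamp_nanos_checked(ts).unwrap_or(0)`, rather than emitting a wrapped value.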
szibis added 8 commits March 11, 2026 23:49
…ecode path

Replace bare 'u64 as i64' casts with i64::try_from().ok() in timestamp
conversions for logs and spans decode paths. Values above i64::MAX (year
2262+) now gracefully fall back to current time or Value::Null instead of
silently wrapping to negative timestamps.

Also guards log record dropped_attributes_count with > 0 check to avoid
inserting zero values, matching the scope dropped_attributes_count pattern.

Fixes internal_log_rate_secs to internal_log_rate_limit (Vector convention).

kv_list_into_value was dropping KeyValue entries where kv.value was None
(outer AnyValue wrapper missing). Now all entries are preserved as Null.
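
A reduced model of that fix is sketched below. The types are simplified stand-ins (the real function works on the generated protobuf `KeyValue` and Vector's `Value`), and `kv_list_into_map` is a name chosen for this example only.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a decoded value.
#[derive(Debug, PartialEq)]
enum Value {
    Null,
    Str(String),
}

/// Simplified stand-in for an OTLP KeyValue whose value wrapper may be absent.
struct KeyValue {
    key: String,
    value: Option<String>,
}

/// Keep every key; a missing value becomes Null instead of dropping the entry.
fn kv_list_into_map(entries: Vec<KeyValue>) -> BTreeMap<String, Value> {
    entries
        .into_iter()
        .map(|kv| (kv.key, kv.value.map(Value::Str).unwrap_or(Value::Null)))
        .collect()
}
```

The earlier behavior would correspond to filtering out entries whose `value` is `None`, which loses the key entirely.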
…elds in log conversion

Add namespace-aware field extraction that checks both event root (Legacy
namespace) and %metadata.opentelemetry.* (Vector namespace), ensuring
round-trip compatibility for logs decoded with Vector namespace.

Collect unrecognized event fields (e.g. user_id, request_id, hostname)
into OTLP attributes instead of silently dropping them during native
log-to-OTLP conversion.
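
The remaining-fields behavior can be approximated as below. The known-field list is illustrative and certainly incomplete, the helper name is hypothetical, and how collected fields are merged with explicit `.attributes` is left out because this description does not spell out the precedence.

```rust
use std::collections::BTreeMap;

/// Illustrative subset of fields that already have a dedicated OTLP mapping
/// (or are Vector operational metadata) and so must not become attributes.
const KNOWN_LOG_FIELDS: &[&str] = &[
    "message", "timestamp", "severity_text", "severity_number", "trace_id", "span_id",
    "attributes", "resources", "scope", "source_type", "ingest_timestamp",
];

/// Collect unrecognized, non-null top-level fields as attribute pairs.
/// `None` models a null field, which is filtered out.
fn collect_remaining_fields(event: &BTreeMap<String, Option<String>>) -> Vec<(String, String)> {
    event
        .iter()
        .filter(|(key, _)| !KNOWN_LOG_FIELDS.contains(&key.as_str()))
        .filter_map(|(key, value)| value.as_ref().map(|v| (key.clone(), v.clone())))
        .collect()
}
```

For an event with `message`, `source_type`, a null `hostname`, and `user_id = "u1"`, only `user_id` would be collected.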
…OTLP conversion

Add 19 new tests covering:
- Full OTLP field mapping (all fields set simultaneously)
- Attribute value types (int, float, bool, array, nested object)
- Body field priority (message > body > msg > log)
- Structured object body → KvlistValue
- Observed timestamp, flags, dropped_attributes_count
- Scope with attributes
- Remaining field dedup with explicit attributes
- Null field filtering
- All severity inference levels + case insensitivity
- RFC3339 string and float timestamp parsing
- Resource via alternative field names
- Many custom fields from JSON/k8s sources
- Vector namespace full metadata roundtrip
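
The body field priority listed above (message > body > msg > log) amounts to a first-match lookup. This is a minimal sketch with a hypothetical function name and a plain string map in place of a real event.

```rust
use std::collections::BTreeMap;

/// Candidate body fields, highest priority first.
const BODY_FIELD_PRIORITY: [&str; 4] = ["message", "body", "msg", "log"];

/// Return the value of the first candidate field present on the event.
fn pick_body(event: &BTreeMap<String, String>) -> Option<&str> {
    BODY_FIELD_PRIORITY
        .iter()
        .find_map(|field| event.get(*field).map(|value| value.as_str()))
}
```

An event carrying both `msg` and `body` would therefore use `body`, and an event with none of the four fields yields no body.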

Mirror the log fix: collect unknown trace event fields (deployment_id,
tenant, environment, etc.) as span attributes to prevent silent data
loss during native→OTLP conversion.

Add KNOWN_OTLP_SPAN_FIELDS list and collect_trace_remaining_fields
helper. Include ingest_timestamp as known to avoid re-encoding the
decode-path timestamp.

Add 6 tests: unknown fields collected, known fields excluded, merge
with explicit attributes, null filtering, type preservation, and
ingest_timestamp exclusion.
…avior

Fix scope.dropped_attributes_count: read from event/metadata instead of
hard-coding 0, preserving round-trip fidelity.

Add source_type and ingest_timestamp to known OTLP log fields to prevent
Vector operational metadata from spilling into OTLP attributes.

Document the automatic remaining-fields-to-attributes behavior in both
the OtlpSerializer doc comments and the sink how_it_works section.
szibis (Contributor, Author) commented Mar 12, 2026

@pront I think this should go first. Companion decode-side fix: #24905 adds decoding for the missing scope, schema_url, and resource.dropped_attributes_count fields across logs, traces, and metrics. This ensures that the fields handled by this PR's encode path are actually populated by the decode side.

Extract scope.schema_url, resource schema_url, resource_dropped_attributes_count,
and scope.dropped_attributes_count in the native-to-OTLP encode path. These fields
are produced by the decode fix in vectordotdev#24905 — the encode now reads them when present
and falls back to defaults (empty/0) when absent, ensuring full round-trip fidelity
once vectordotdev#24905 merges while remaining backward-compatible before it does.

Also fixes schema_url mapping: root "schema_url" now correctly maps to
ResourceLogs/ResourceSpans.schema_url (resource level), while "scope.schema_url"
maps to ScopeLogs/ScopeSpans.schema_url (scope level).
@szibis szibis requested a review from pront March 12, 2026 17:20
Labels

domain: ci Anything related to Vector's CI environment domain: external docs Anything related to Vector's external, public documentation editorial review sink: opentelemetry Anything `opentelemetry` sink related

Development

Successfully merging this pull request may close these issues.

Support OTLP transform for opentelemetry sink

5 participants