feat(opentelemetry sink): add automatic native log and trace to OTLP conversion #24621
Open
szibis wants to merge 59 commits into vectordotdev:master
Add conversion from Vector's native flat log format to OTLP protobuf:
- Value → PBValue converters (inverse of existing PBValue → Value)
- native_log_to_otlp_request() for full event conversion
- Safe extraction helpers with graceful error handling
- Hex validation for trace_id (16 bytes) and span_id (8 bytes)
- Severity inference from severity_text when number missing
- Support for multiple timestamp formats (chrono, epoch, RFC3339)
- Pre-allocation and inline hints for performance
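The hex validation step can be sketched as follows. This is a minimal illustration using only the standard library; `decode_hex_id`, `parse_trace_id`, and `parse_span_id` are hypothetical names, not the helpers added by this PR.

```rust
/// Decode a hex string into exactly `N` bytes, or return `None`.
/// Hypothetical helper for illustration only.
fn decode_hex_id<const N: usize>(hex: &str) -> Option<[u8; N]> {
    let bytes = hex.as_bytes();
    // Two hex digits encode one byte, so the length must match exactly.
    if bytes.len() != N * 2 {
        return None;
    }
    let mut out = [0u8; N];
    for (i, pair) in bytes.chunks_exact(2).enumerate() {
        // `to_digit(16)` rejects anything that is not 0-9, a-f, or A-F.
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = u8::try_from(hi * 16 + lo).ok()?;
    }
    Some(out)
}

/// OTLP trace IDs are 16 bytes (32 hex characters).
fn parse_trace_id(hex: &str) -> Option<[u8; 16]> {
    decode_hex_id::<16>(hex)
}

/// OTLP span IDs are 8 bytes (16 hex characters).
fn parse_span_id(hex: &str) -> Option<[u8; 8]> {
    decode_hex_id::<8>(hex)
}
```

Returning `None` instead of panicking lets the caller decide whether a malformed ID should be dropped or reported, which is one plausible reading of "graceful error handling" above.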
Detect native log format and automatically convert to OTLP when:
- Event does not contain 'resourceLogs' field (pre-formatted OTLP)
- Works with any Vector source (file, socket, otlp with flat decoding)

Maintains backward compatibility:
- Pre-formatted OTLP events (use_otlp_decoding: true) encode via passthrough
- Native events get automatic conversion to valid OTLP protobuf

This eliminates the need for 50+ lines of complex VRL transformation.
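A rough sketch of the detection rule for logs, assuming the event can be viewed as a map keyed by top-level field name. `EncodePath` and `choose_log_encode_path` are hypothetical names for illustration; the actual encoder works on Vector's event types.

```rust
use std::collections::BTreeMap;

/// Which encode path an event takes (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum EncodePath {
    /// Event already has the nested OTLP shape; encode it as-is.
    Passthrough,
    /// Event is a native flat log; convert it to OTLP first.
    AutoConvert,
}

/// Pick the path by checking for the top-level `resourceLogs` field.
fn choose_log_encode_path<V>(event: &BTreeMap<String, V>) -> EncodePath {
    if event.contains_key("resourceLogs") {
        EncodePath::Passthrough
    } else {
        EncodePath::AutoConvert
    }
}
```

Keying the decision on a single top-level field keeps the check cheap and means events from any source fall through to conversion unless they are already OTLP-shaped.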
Add integration and E2E tests:

Unit/Integration tests (lib/codecs/tests/otlp.rs):
- Basic encoding functionality
- Error handling (invalid types, missing fields, malformed hex)
- Source compatibility (file, syslog, modified OTLP)
- Timestamp handling (seconds, nanos, RFC3339, chrono)
- Severity inference from text
- Message field fallbacks (.message, .body, .msg, .log)
- Roundtrip encode/decode verification

E2E tests (tests/e2e/opentelemetry/native/):
- Native logs convert to valid OTLP
- Service name preservation through conversion
- Log body, severity, timestamps preserved
- Custom attributes via VRL transforms
- Correct event counting metrics
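Severity inference from text might look roughly like the sketch below. The numeric values are the base values of the OpenTelemetry severity ranges (TRACE 1, DEBUG 5, INFO 9, WARN 13, ERROR 17, FATAL 21); the function name and the exact set of accepted spellings (for example `WARNING`) are assumptions, not taken from the PR.

```rust
/// Infer an OTLP severity number from free-form severity text.
/// Hypothetical helper; the accepted aliases are an assumption.
fn infer_severity_number(text: &str) -> Option<i32> {
    // Normalize so that "info", "Info", and " INFO " all match.
    match text.trim().to_ascii_uppercase().as_str() {
        "TRACE" => Some(1),
        "DEBUG" => Some(5),
        "INFO" => Some(9),
        "WARN" | "WARNING" => Some(13),
        "ERROR" => Some(17),
        "FATAL" => Some(21),
        // Unknown text: leave the severity number unset.
        _ => None,
    }
}
```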
Add comprehensive benchmarks comparing encoding approaches:
1. NEW: Native → auto-convert → encode (this PR)
2. OLD: VRL transform simulation → encode (what users had before)
3. OLD: Passthrough only (pre-formatted OTLP)

Results show 4.7x throughput improvement for batch operations:
- NEW batch: 288 MiB/s
- OLD VRL: 61 MiB/s

Single event is 7.4% faster than VRL approach.
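The 4.7x figure is simply the ratio of the two batch throughput numbers quoted in this commit message:

```latex
\frac{288\ \text{MiB/s}}{61\ \text{MiB/s}} \approx 4.72
```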
- Changelog fragment for release notes
- Comprehensive documentation with mermaid diagrams
- Before/after configuration examples
- Field mapping reference
- Performance comparison tables
Fix check-spelling CI failure by adding two domain-specific terms:
- kvlist: OpenTelemetry KeyValueList type
- xychart: Mermaid diagram chart type
maycmlee reviewed Feb 17, 2026
Co-authored-by: May Lee <may.lee@datadoghq.com>
6891644 to 9ba4ed1
szibis added a commit to szibis/vector that referenced this pull request Mar 11, 2026
… encode

Apply review feedback patterns from PR vectordotdev#24621:
- Replace `as u64` with `u64::try_from().ok()` for timestamp conversion
- Replace `as u64`/`as f64` with `u64::from()`/`f64::from()` for sample.rate
- Remove unwrap() in Distribution bucket overflow guard, use saturating index clamping instead
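The difference between the bare casts and the checked conversions can be illustrated with a small standard-library sketch. The function names, the `u32` type for the sample rate, and the shape of the clamping helper are assumptions for illustration; only the `try_from`/`from` patterns come from the commit message.

```rust
/// Convert a signed nanosecond timestamp to the unsigned OTLP field.
/// Negative inputs yield `None` instead of wrapping as `as u64` would.
fn nanos_to_u64(nanos: i64) -> Option<u64> {
    u64::try_from(nanos).ok()
}

/// Widening conversions are infallible, so `From` documents that no
/// precision or range is lost (input type assumed to be `u32`).
fn rate_to_f64(rate: u32) -> f64 {
    f64::from(rate)
}

/// Clamp a bucket index to the last valid slot instead of unwrapping.
/// Returns `None` only when there are no buckets at all.
fn clamp_bucket_index(index: usize, bucket_count: usize) -> Option<usize> {
    bucket_count.checked_sub(1).map(|last| index.min(last))
}
```

For comparison, `(-1i64) as u64` evaluates to `u64::MAX`, which is the silent wrap the checked conversion avoids.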
…ecode path Replace bare 'u64 as i64' casts with i64::try_from().ok() in timestamp conversions for logs and spans decode paths. Values above i64::MAX (year 2262+) now gracefully fall back to current time or Value::Null instead of silently wrapping to negative timestamps. Also guards log record dropped_attributes_count with > 0 check to avoid inserting zero values, matching the scope dropped_attributes_count pattern. Fixes internal_log_rate_secs to internal_log_rate_limit (Vector convention).
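A minimal sketch of the decode-side guard, assuming the fallback value is supplied by the caller (the real code falls back to the current time or `Value::Null`). Both function names are hypothetical.

```rust
/// Convert an OTLP `u64` nanosecond timestamp to `i64`, using the
/// caller-supplied fallback when the value exceeds `i64::MAX`.
fn timestamp_or_fallback(raw: u64, fallback_nanos: i64) -> i64 {
    i64::try_from(raw).unwrap_or(fallback_nanos)
}

/// Only surface `dropped_attributes_count` when it is non-zero,
/// so zero values are not inserted into the event.
fn dropped_count_field(count: u32) -> Option<u32> {
    (count > 0).then_some(count)
}
```

With a bare cast, `u64::MAX as i64` is `-1`, which is how an out-of-range timestamp would silently become negative.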
kv_list_into_value was dropping KeyValue entries where kv.value was None (outer AnyValue wrapper missing). Now all entries are preserved as Null.
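The fix can be pictured with simplified stand-in types. `Value` and `KeyValue` below are reduced, hypothetical versions of the real types, and `kv_list_into_map` is an illustrative name rather than the actual `kv_list_into_value` signature.

```rust
use std::collections::BTreeMap;

/// Reduced stand-in for Vector's value type (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
}

/// Reduced stand-in for the OTLP KeyValue message: the value wrapper
/// is optional because protobuf message fields may be absent.
struct KeyValue {
    key: String,
    value: Option<Value>,
}

/// Keep every entry; a missing value becomes `Value::Null`
/// instead of causing the whole entry to be dropped.
fn kv_list_into_map(entries: Vec<KeyValue>) -> BTreeMap<String, Value> {
    entries
        .into_iter()
        .map(|kv| (kv.key, kv.value.unwrap_or(Value::Null)))
        .collect()
}
```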
…elds in log conversion Add namespace-aware field extraction that checks both event root (Legacy namespace) and %metadata.opentelemetry.* (Vector namespace), ensuring round-trip compatibility for logs decoded with Vector namespace. Collect unrecognized event fields (e.g. user_id, request_id, hostname) into OTLP attributes instead of silently dropping them during native log-to-OTLP conversion.
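Collecting unrecognized fields into attributes could be sketched as below. The known-field list is illustrative, string values stand in for Vector's value type, and the rule that explicit attributes win on a key collision is an assumption; `collect_remaining_fields` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Illustrative list of root fields that have a dedicated OTLP mapping.
const KNOWN_LOG_FIELDS: &[&str] = &[
    "message", "timestamp", "severity_text", "severity_number", "trace_id",
    "span_id", "attributes", "resources", "scope", "source_type", "ingest_timestamp",
];

/// Merge unrecognized root fields into the attribute map.
/// Assumption: an explicit attribute wins when both define the same key.
fn collect_remaining_fields(
    event: &BTreeMap<String, String>,
    explicit_attributes: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut attributes = explicit_attributes.clone();
    for (key, value) in event {
        // Fields with a dedicated mapping are handled elsewhere.
        if KNOWN_LOG_FIELDS.contains(&key.as_str()) {
            continue;
        }
        attributes.entry(key.clone()).or_insert_with(|| value.clone());
    }
    attributes
}
```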
…OTLP conversion

Add 19 new tests covering:
- Full OTLP field mapping (all fields set simultaneously)
- Attribute value types (int, float, bool, array, nested object)
- Body field priority (message > body > msg > log)
- Structured object body → KvlistValue
- Observed timestamp, flags, dropped_attributes_count
- Scope with attributes
- Remaining field dedup with explicit attributes
- Null field filtering
- All severity inference levels + case insensitivity
- RFC3339 string and float timestamp parsing
- Resource via alternative field names
- Many custom fields from JSON/k8s sources
- Vector namespace full metadata roundtrip
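The body field priority (message > body > msg > log) amounts to a first-match lookup over an ordered list. A minimal sketch, assuming string-valued fields and a hypothetical `pick_body` helper:

```rust
use std::collections::BTreeMap;

/// Candidate body fields, highest priority first.
const BODY_FIELD_PRIORITY: [&str; 4] = ["message", "body", "msg", "log"];

/// Return the first body candidate present on the event, if any.
fn pick_body(event: &BTreeMap<String, String>) -> Option<&str> {
    BODY_FIELD_PRIORITY
        .iter()
        .find_map(|field| event.get(*field).map(String::as_str))
}
```

Keeping the priority in one ordered constant makes the fallback order easy to test and to change.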
Mirror the log fix: collect unknown trace event fields (deployment_id, tenant, environment, etc.) as span attributes to prevent silent data loss during native→OTLP conversion. Add KNOWN_OTLP_SPAN_FIELDS list and collect_trace_remaining_fields helper. Include ingest_timestamp as known to avoid re-encoding the decode-path timestamp. Add 6 tests: unknown fields collected, known fields excluded, merge with explicit attributes, null filtering, type preservation, and ingest_timestamp exclusion.
…avior Fix scope.dropped_attributes_count: read from event/metadata instead of hard-coding 0, preserving round-trip fidelity. Add source_type and ingest_timestamp to known OTLP log fields to prevent Vector operational metadata from spilling into OTLP attributes. Document the automatic remaining-fields-to-attributes behavior in both the OtlpSerializer doc comments and the sink how_it_works section.
Extract scope.schema_url, resource schema_url, resource_dropped_attributes_count, and scope.dropped_attributes_count in the native-to-OTLP encode path. These fields are produced by the decode fix in vectordotdev#24905 — the encode now reads them when present and falls back to defaults (empty/0) when absent, ensuring full round-trip fidelity once vectordotdev#24905 merges while remaining backward-compatible before it does. Also fixes schema_url mapping: root "schema_url" now correctly maps to ResourceLogs/ResourceSpans.schema_url (resource level), while "scope.schema_url" maps to ScopeLogs/ScopeSpans.schema_url (scope level).
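The two-level schema_url mapping with defaults can be sketched as follows. This assumes the event is flattened to string keys such as `scope.schema_url`, which is a simplification; `SchemaUrls` and `extract_schema_urls` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Where each schema URL ends up in the OTLP message (illustration only).
#[derive(Debug, Default, PartialEq)]
struct SchemaUrls {
    /// Goes to ResourceLogs/ResourceSpans.schema_url.
    resource: String,
    /// Goes to ScopeLogs/ScopeSpans.schema_url.
    scope: String,
}

/// Read both URLs when present; fall back to empty strings when absent.
fn extract_schema_urls(event: &BTreeMap<String, String>) -> SchemaUrls {
    SchemaUrls {
        resource: event.get("schema_url").cloned().unwrap_or_default(),
        scope: event.get("scope.schema_url").cloned().unwrap_or_default(),
    }
}
```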
Summary
Add automatic conversion from Vector's native flat log and trace formats to OTLP (OpenTelemetry Protocol) format in the `opentelemetry` sink's `otlp` codec.

Problem: Users currently need 50+ lines of complex VRL to manually build the nested OTLP structure (`resourceLogs` → `scopeLogs` → `logRecords` with `KeyValue` arrays), and trace events fail entirely when sent to the OTLP sink without passthrough mode.

Solution: The OTLP encoder now automatically detects native log and trace events and converts them to valid OTLP protobuf. Pre-formatted OTLP events continue to use passthrough encoding (backward compatible).
Scope
use_otlp_decoding: true)
Performance Impact (Logs)
Architecture Comparison
%%{init: {'theme': 'base', 'themeVariables': { 'lineColor': '#000000', 'primaryTextColor': '#000000'}}}%%
flowchart TB
    subgraph OLD["OLD: 50 lines VRL"]
        direction LR
        O1[Source] ==> O2[VRL Transform]
        O2 ==> O3[OTLP Encoder]
        O3 ==> O4[Collector]
    end
    subgraph NEW["NEW: Zero VRL"]
        direction LR
        N1[Source] ==> N2[OTLP Encoder]
        N2 ==> N3[Collector]
    end
    OLD ==> NEW
    style O1 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style O2 fill:#cccccc,stroke:#000000,stroke-width:3px,color:#000000
    style O3 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style O4 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style N1 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    style N2 fill:#999999,stroke:#000000,stroke-width:3px,color:#000000
    style N3 fill:#ffffff,stroke:#000000,stroke-width:2px,color:#000000
    linkStyle default stroke:#000000,stroke-width:2px
Before vs After Comparison
use_otlp_decoding: false + codec: otlp
codec: otlp
use_otlp_decoding: true + codec: otlp
codec: otlp
use_otlp_decoding: true + codec: otlp
use_otlp_decoding: true + codec: otlp
codec: otlp

- .message → body.stringValue
- .severity_text → severityText
- .severity_number → severityNumber
- .attributes.* → logRecords[].attributes[]
- .resources.* → resource.attributes[]
- .trace_id → traceId
- .span_id → spanId
- .timestamp → timeUnixNano

- .trace_id → traceId (16 bytes)
- .span_id → spanId (8 bytes)
- .parent_span_id → parentSpanId
- .name → name
- .kind → kind
- .start_time_unix_nano / .end_time_unix_nano
- .attributes.* → attributes[]
- .resources.* → resource.attributes[]
- .events → events[] (span events)
- .links → links[] (span links)
- .status → status (message, code)
Vector configuration
Before (Complex VRL Required)
50+ lines of VRL transformation
After (Zero VRL)
Traces (Zero VRL)
Native Metrics (Auto-Convert via #24897)
With Optional Enrichment
Supported Native Log Format
{
  "message": "User login successful",
  "timestamp": "2024-01-15T10:30:00Z",
  "severity_text": "INFO",
  "severity_number": 9,
  "trace_id": "0123456789abcdef0123456789abcdef",
  "span_id": "fedcba9876543210",
  "attributes": {
    "user_id": "user-12345",
    "duration_ms": 42.5
  },
  "resources": {
    "service.name": "auth-service"
  },
  "scope": {
    "name": "auth-module",
    "version": "1.0.0"
  }
}
Supported Native Trace Format
{
  "trace_id": "0123456789abcdef0123456789abcdef",
  "span_id": "fedcba9876543210",
  "parent_span_id": "abcdef0123456789",
  "name": "HTTP GET /api/users",
  "kind": 2,
  "start_time_unix_nano": 1705312200000000000,
  "end_time_unix_nano": 1705312200042000000,
  "attributes": {
    "http.method": "GET",
    "http.status_code": 200
  },
  "resources": {
    "service.name": "api-gateway"
  },
  "status": {
    "code": 1,
    "message": "OK"
  },
  "events": [
    {
      "name": "request.start",
      "time_unix_nano": 1705312200000000000,
      "attributes": {
        "component": "handler"
      }
    }
  ],
  "links": []
}
Log Field Mapping
- .message / .body / .msg → body.stringValue
- .timestamp → timeUnixNano
- .severity_text → severityText
- .severity_number → severityNumber
- .trace_id → traceId
- .span_id → spanId
- .attributes.* → attributes[]
- .resources.* → resource.attributes[]
- .scope.name → scope.name
- .scope.version → scope.version
Trace Field Mapping
- .trace_id → traceId
- .span_id → spanId
- .parent_span_id → parentSpanId
- .name → name
- .kind → kind
- .start_time_unix_nano → startTimeUnixNano
- .end_time_unix_nano → endTimeUnixNano
- .trace_state → traceState
- .attributes.* → attributes[]
- .resources.* → resource.attributes[]
- .events[] → events[]
- .links[] → links[]
- .status.code → status.code
- .status.message → status.message
- .dropped_attributes_count → droppedAttributesCount
- .dropped_events_count → droppedEventsCount
- .dropped_links_count → droppedLinksCount
How did you test this PR?
Unit Tests
Tests cover:
E2E Tests
cargo vdev test e2e opentelemetry-native
Docker Compose with telemetrygen validates:
Benchmarks
Change Type
Is this a breaking change?
Pre-formatted OTLP events (with `resourceLogs`/`resourceSpans`/`resourceMetrics` fields) continue using the existing passthrough path. Native metric events return an explicit error with a clear message (same behavior as before for unsupported types).
Does this PR include user facing changes?
no-changelog label to this PR.
References