
fix(opentelemetry lib): decode missing scope, schema_url, and resource fields#24905

Open
szibis wants to merge 10 commits into vectordotdev:master from szibis:fix/otlp-decode-missing-fields


@szibis
Contributor

@szibis szibis commented Mar 12, 2026

Summary

The OTLP decode path silently drops several protobuf fields during conversion to Vector events, causing data loss and breaking round-trip fidelity (OTLP → Vector → OTLP).

Before this PR

| Field | Logs | Traces | Metrics |
|---|---|---|---|
| scope.name | | | ✅ (tag) |
| scope.version | | | ✅ (tag) |
| scope.attributes | | | ✅ (tags) |
| scope.dropped_attributes_count | | | |
| ScopeX.schema_url | | | |
| ResourceX.schema_url | | | |
| resource.dropped_attributes_count | | | |

After this PR

| Field | Logs | Traces | Metrics |
|---|---|---|---|
| scope.name | | | ✅ (tag) |
| scope.version | | | ✅ (tag) |
| scope.attributes | | | ✅ (tags) |
| scope.dropped_attributes_count | | | ✅ (tag) |
| ScopeX.schema_url | | | ✅ (tag) |
| ResourceX.schema_url | | | ✅ (tag) |
| resource.dropped_attributes_count | | | ✅ (tag) |

Field mapping

Logs (Legacy / Vector namespace):

  • ScopeLogs.schema_url → scope.schema_url / %opentelemetry.scope.schema_url
  • ResourceLogs.schema_url → schema_url / %opentelemetry.resources.schema_url
  • Resource.dropped_attributes_count → resource_dropped_attributes_count / %opentelemetry.resources.dropped_attributes_count

Traces (always at event root):

  • ScopeSpans.scope.* → scope.name, scope.version, scope.attributes, scope.dropped_attributes_count
  • ScopeSpans.schema_url → scope.schema_url
  • ResourceSpans.schema_url → schema_url
  • Resource.dropped_attributes_count → resource_dropped_attributes_count

Metrics (as tags, following existing resource.* / scope.* pattern):

  • scope.dropped_attributes_count, scope.schema_url, resource.schema_url, resource.dropped_attributes_count

Related

Test plan

  • 31 new unit tests covering all signal types (12 logs, 11 traces, 8 metrics)
  • Tests verify both presence of new fields and absence when empty/zero
  • Tests cover Legacy and Vector namespace for logs
  • Combined tests verify all new fields work together with existing fields
  • Integration test with OTLP collector for round-trip verification
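
The field mapping and the "absence when empty/zero" rule from the test plan could be sketched roughly as follows. This is only an illustration, not the code in this PR: `ResourceInfo`, `decode_resource_fields`, and the flat `BTreeMap<String, String>` event are hypothetical stand-ins for the real protobuf and Vector event types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the resource-level data carried by an OTLP message.
pub struct ResourceInfo {
    pub schema_url: String,
    pub dropped_attributes_count: u32,
}

/// Copies resource-level fields into a flat event map, skipping protobuf
/// defaults (empty string, zero) so that absent fields stay absent.
pub fn decode_resource_fields(resource: &ResourceInfo, event: &mut BTreeMap<String, String>) {
    if !resource.schema_url.is_empty() {
        event.insert("schema_url".to_string(), resource.schema_url.clone());
    }
    if resource.dropped_attributes_count > 0 {
        event.insert(
            "resource_dropped_attributes_count".to_string(),
            resource.dropped_attributes_count.to_string(),
        );
    }
}
```

Skipping default values keeps events unchanged for senders that never set these fields, which is what the "absence" tests check for.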

…elds

The OTLP decode path drops several protobuf fields during conversion to
Vector events. This causes silent data loss and breaks round-trip fidelity
when events are later re-encoded to OTLP format.

Fields now decoded:

Logs:
- ScopeLogs.schema_url → scope.schema_url / %opentelemetry.scope.schema_url
- ResourceLogs.schema_url → schema_url / %opentelemetry.resources.schema_url
- Resource.dropped_attributes_count → resource_dropped_attributes_count

Traces:
- ScopeSpans.scope (name, version, attributes, dropped_attributes_count)
- ScopeSpans.schema_url → scope.schema_url
- ResourceSpans.schema_url → schema_url
- Resource.dropped_attributes_count → resource_dropped_attributes_count

Metrics:
- scope.dropped_attributes_count → tag
- ScopeMetrics.schema_url → scope.schema_url tag
- ResourceMetrics.schema_url → resource.schema_url tag
- Resource.dropped_attributes_count → resource.dropped_attributes_count tag

Closes vectordotdev#24904
Relates to vectordotdev#15500
- Add changelog fragment for vectordotdev#24905
- Document new log output fields in source CUE: scope.schema_url,
  schema_url (resource-level), resource_dropped_attributes_count
- Add comprehensive trace output field documentation to source CUE,
  including all span fields, scope fields, schema_url, and
  resource_dropped_attributes_count (previously undocumented)
@szibis szibis requested a review from a team as a code owner March 12, 2026 08:14
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Mar 12, 2026
…ions

Remove redundant .clone() calls in metrics tag building (format! only
borrows), eliminate Value clone for observed_timestamp by keeping it as
DateTime<Utc> (Copy), and remove unnecessary resource.clone() where self
is already consumed by value. Add inline documentation for intentional
Legacy vs Vector namespace path asymmetry on schema_url and
resource_dropped_attributes_count fields.
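
The remark that `format!` only borrows its arguments can be illustrated with a small sketch. The helper names `resource_tag_key` and `key_and_tag` are hypothetical and do not come from the Vector codebase.

```rust
/// Builds a tag key such as "resource.service.name" from an attribute key.
/// `format!` takes its arguments by reference, so no `.clone()` is needed.
pub fn resource_tag_key(key: &str) -> String {
    format!("resource.{}", key)
}

/// Shows that an owned `String` is still usable after `format!` has read it.
pub fn key_and_tag(key: String) -> (String, String) {
    let tag = format!("resource.{}", key); // borrows `key`, does not move it
    (key, tag) // `key` is still owned here and can be returned
}
```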
szibis added a commit to szibis/vector that referenced this pull request Mar 12, 2026


Extract scope.schema_url, resource schema_url, resource_dropped_attributes_count,
and scope.dropped_attributes_count in the native-to-OTLP encode path. These fields
are produced by the decode fix in vectordotdev#24905 — the encode now reads them when present
and falls back to defaults (empty/0) when absent, ensuring full round-trip fidelity
once vectordotdev#24905 merges while remaining backward-compatible before it does.

Also fixes schema_url mapping: root "schema_url" now correctly maps to
ResourceLogs/ResourceSpans.schema_url (resource level), while "scope.schema_url"
maps to ScopeLogs/ScopeSpans.schema_url (scope level).
szibis added a commit to szibis/vector that referenced this pull request Mar 12, 2026
… tags

Update decompose_metric_tags to handle 4 special tags as proto-level
structural fields rather than generic attributes:
- resource.dropped_attributes_count → Resource.dropped_attributes_count
- resource.schema_url → ResourceMetrics.schema_url
- scope.dropped_attributes_count → InstrumentationScope.dropped_attributes_count
- scope.schema_url → ScopeMetrics.schema_url

This ensures round-trip fidelity with fix/otlp-decode-missing-fields
(vectordotdev#24905) once merged, while remaining backward-compatible (graceful
defaults of 0 / empty string) before that PR merges.
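
As a rough sketch of the idea in this commit message, the four special tags can be pulled out of the tag map before the rest are treated as generic attributes. `StructuralFields` and `take_structural_fields` are hypothetical names chosen for illustration; the real `decompose_metric_tags` signature is not shown in this thread, and the tag names used here are the dot-separated ones listed in this commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical container for the proto-level fields pulled out of metric tags.
#[derive(Debug, Default, PartialEq)]
pub struct StructuralFields {
    pub resource_dropped_attributes_count: u32,
    pub resource_schema_url: String,
    pub scope_dropped_attributes_count: u32,
    pub scope_schema_url: String,
}

/// Removes a numeric tag; a missing or unparsable value falls back to 0.
fn take_count(tags: &mut BTreeMap<String, String>, key: &str) -> u32 {
    tags.remove(key).and_then(|v| v.parse().ok()).unwrap_or(0)
}

/// Removes a string tag; a missing value falls back to the empty string.
fn take_string(tags: &mut BTreeMap<String, String>, key: &str) -> String {
    tags.remove(key).unwrap_or_default()
}

/// Extracts the four special tags. Whatever remains in `tags` afterwards
/// can be encoded as ordinary attributes.
pub fn take_structural_fields(tags: &mut BTreeMap<String, String>) -> StructuralFields {
    StructuralFields {
        resource_dropped_attributes_count: take_count(tags, "resource.dropped_attributes_count"),
        resource_schema_url: take_string(tags, "resource.schema_url"),
        scope_dropped_attributes_count: take_count(tags, "scope.dropped_attributes_count"),
        scope_schema_url: take_string(tags, "scope.schema_url"),
    }
}
```

Falling back to 0 and the empty string matches the protobuf defaults, which is what lets the encoder work the same way whether or not the decode side has produced these tags.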
@szibis szibis changed the title from "fix(opentelemetry): decode missing scope, schema_url, and resource fields" to "fix(opentelemetry lib): decode missing scope, schema_url, and resource fields" Mar 12, 2026
@pront
Member

pront commented Mar 12, 2026

Hi @szibis, you have quite a few OTEL PRs open: https://github.com/vectordotdev/vector/pulls?q=sort%3Aupdated-desc+is%3Apr+is%3Aopen+author%3Aszibis+

Can you please list the order in which you want me to review them here? Even better, I would mark all but one as draft so I can keep filtering them without need to exchange comments here.

@szibis
Contributor Author

szibis commented Mar 12, 2026

> Hi @szibis, you have quite a few OTEL PRs open: https://github.com/vectordotdev/vector/pulls?q=sort%3Aupdated-desc+is%3Apr+is%3Aopen+author%3Aszibis+
>
> Can you please list the order in which you want me to review them here? Even better, I would mark all but one as draft so I can keep filtering them without need to exchange comments here.

@pront Sorry about that. I only just discovered these gaps and wanted to avoid one big add-on PR.

  1. fix(opentelemetry lib): decode missing scope, schema_url, and resource fields #24905 - this PR, which decodes the missing scope fields and provides the full OTLP format baseline for all later PRs
  2. feat(opentelemetry sink): add automatic native log and trace to OTLP conversion #24621 - automatic conversion in the sink for logs and traces
  3. feat(opentelemetry sink): add native metric to OTLP conversion #24897 - builds on the native log and trace conversion to add automatic metric conversion

szibis added 2 commits March 12, 2026 19:37
… resources overwrite

In Vector namespace, the resources insert (kv_list_into_value for
attributes) overwrites the entire "resources" metadata key. Moving
resource_schema_url insert after the resources insert ensures it is
not lost.

Also:
- Add Vector namespace combined test to verify schema_url survives
  alongside resource attributes
- Reformat changelog to 80-100 char lines
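
The ordering problem described in this commit can be reproduced with plain nested maps. This is a simplified sketch, not Vector's metadata API: the `Object` alias and both function names are hypothetical, and values are reduced to strings.

```rust
use std::collections::BTreeMap;

type Object = BTreeMap<String, String>;

/// Order that loses data: `schema_url` is nested under "resources" first,
/// then the whole "resources" key is replaced by the attribute map.
pub fn build_metadata_lossy(attributes: Object, schema_url: &str) -> BTreeMap<String, Object> {
    let mut metadata: BTreeMap<String, Object> = BTreeMap::new();
    metadata
        .entry("resources".to_string())
        .or_default()
        .insert("schema_url".to_string(), schema_url.to_string());
    // Whole-key insert: replaces the entry written just above.
    metadata.insert("resources".to_string(), attributes);
    metadata
}

/// Order used by the fix: write the whole "resources" object first,
/// then add `schema_url` to it.
pub fn build_metadata_fixed(attributes: Object, schema_url: &str) -> BTreeMap<String, Object> {
    let mut metadata: BTreeMap<String, Object> = BTreeMap::new();
    metadata.insert("resources".to_string(), attributes);
    metadata
        .entry("resources".to_string())
        .or_default()
        .insert("schema_url".to_string(), schema_url.to_string());
    metadata
}
```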
…nt passing

Replace the repeated 5-argument pattern (resource, scope, metric_name,
scope_schema_url, resource_schema_url) across all convert_* methods
with a single MetricContext struct. This also removes the per-function
clone boilerplate since ctx is moved into each closure directly.
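
A context struct of this kind might look roughly like the sketch below. The field names follow the five arguments listed in the commit message, but the field types (plain strings) and the `convert_gauge` function are simplifications assumed for illustration only.

```rust
/// Hypothetical context bundling the values that were previously passed as
/// five separate arguments. Field types are simplified to strings here.
#[derive(Clone, Debug)]
pub struct MetricContext {
    pub resource: String,
    pub scope: String,
    pub metric_name: String,
    pub scope_schema_url: String,
    pub resource_schema_url: String,
}

/// Converts data points into (name, value) pairs. `ctx` is moved into the
/// closure once, so no per-field clones are needed before the `map` call.
pub fn convert_gauge(points: Vec<f64>, ctx: MetricContext) -> Vec<(String, f64)> {
    points
        .into_iter()
        .map(move |value| (format!("{}:{}", ctx.scope, ctx.metric_name), value))
        .collect()
}
```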
@szibis szibis requested a review from pront March 12, 2026 18:44
szibis added 2 commits March 13, 2026 09:16
Per OTLP spec, time_unix_nano == 0 means the timestamp is unset/unknown.
Previously all 5 metric types (Sum, Gauge, Histogram, ExponentialHistogram,
Summary) converted 0 to Some(epoch), which is semantically incorrect.
Now returns None when time_unix_nano is 0, consistent with the existing
log decode behavior.
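
The zero-means-unset rule can be sketched as below. `nanos_to_timestamp` is a hypothetical helper, and `std::time::SystemTime` is used here only to keep the example dependency-free; it stands in for whatever timestamp type the real decode path uses.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Treats `time_unix_nano == 0` as "unset", as the commit message describes,
/// instead of mapping it to the Unix epoch.
pub fn nanos_to_timestamp(time_unix_nano: u64) -> Option<SystemTime> {
    if time_unix_nano == 0 {
        None
    } else {
        Some(UNIX_EPOCH + Duration::from_nanos(time_unix_nano))
    }
}
```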
Member

@pront pront left a comment


I think the new resources.schema_url / resources.dropped_attributes_count field handling introduces a backwards-compatibility issue with Vector namespace logs: they’re written into the same resources object that already contains arbitrary OTLP resource attributes, so valid incoming attributes can now be silently overwritten.

From my local test, with these incoming resource attributes:

{
  "service.name": "checkout",
  "schema_url": "tenant-defined-value",
  "dropped_attributes_count": "user-payload"
}

With resource metadata:

{
  "schema_url": "https://resource.schema",
  "dropped_attributes_count": 7
}

The emitted event was:

{
  "otel_resources": {
    "service.name": "checkout",
    "schema_url": "https://resource.schema",
    "dropped_attributes_count": 7
  }
}

So the original resource attributes schema_url = "tenant-defined-value" and dropped_attributes_count = "user-payload" were lost.
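
The overwrite can be reproduced in isolation with a plain map. This is a simplified sketch of the problem rather than the code under review: `merge_into_resources` is a hypothetical function, and all values are reduced to strings.

```rust
use std::collections::BTreeMap;

/// Reproduces the collision: derived metadata is written into the same map
/// that holds user-supplied resource attributes, replacing same-named keys.
pub fn merge_into_resources(
    mut attributes: BTreeMap<String, String>,
    schema_url: &str,
    dropped_attributes_count: u32,
) -> BTreeMap<String, String> {
    attributes.insert("schema_url".to_string(), schema_url.to_string());
    attributes.insert(
        "dropped_attributes_count".to_string(),
        dropped_attributes_count.to_string(),
    );
    attributes
}
```

Because `insert` replaces an existing value for the same key, the user-supplied `schema_url` and `dropped_attributes_count` attributes do not survive the merge.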

Repro config:

data_dir: "/tmp/vector-pr-24905-data"

sources:
  otel:
    type: opentelemetry
    use_otlp_decoding: false
    log_namespace: true
    grpc:
      address: "127.0.0.1:43171"
    http:
      address: "127.0.0.1:43181"

transforms:
  expose_meta:
    type: remap
    inputs:
      - otel.logs
    source: |
      .otel_resources = %opentelemetry.resources
      .otel_scope = %opentelemetry.scope
      .otel_timestamp = %opentelemetry.timestamp

sinks:
  out:
    type: console
    inputs:
      - expose_meta
    target: stdout
    encoding:
      codec: json

We should preserve %opentelemetry.resources as the raw user-supplied resource attributes.

Also, we have the same type of issue with Metrics: resource.* / scope.* tag collisions.

Generally, if a field was not literally present as a user attribute, it should not be inserted into the raw attribute map. We should not place synthetic or derived metadata into a namespace that is also used for raw user payload.

szibis added 2 commits March 13, 2026 21:53
Move resource_schema_url and resource_dropped_attributes_count to
their own metadata paths instead of nesting them under the "resources"
namespace which holds user-supplied resource attributes.

For logs (Vector namespace): metadata now stored at flat paths like
%opentelemetry.resource_schema_url instead of
%opentelemetry.resources.schema_url, preventing collision when users
have resource attributes named "schema_url" or
"dropped_attributes_count".

For metrics: metadata tags now use underscore-separated names
(resource_schema_url, scope_dropped_attributes_count) instead of
dot-separated (resource.schema_url, scope.dropped_attributes_count)
to avoid colliding with user attribute tags that follow the
"resource.{key}" / "scope.{key}" format.

Also simplifies test section comment separators per review feedback.
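
The naming change for metrics can be sketched as follows. `build_tags` is a hypothetical function written for illustration; it assumes, as the commit message states, that user attribute tags follow the "resource.{key}" format while derived metadata uses an underscore-separated name.

```rust
use std::collections::BTreeMap;

/// Builds metric tags from user-supplied resource attributes plus the
/// resource-level schema URL. User attributes use the "resource.{key}" form;
/// the derived metadata uses an underscore-separated name, so a user
/// attribute named "schema_url" lands on a different tag key.
pub fn build_tags(
    attributes: &[(&str, &str)],
    resource_schema_url: &str,
) -> BTreeMap<String, String> {
    let mut tags = BTreeMap::new();
    for (key, value) in attributes {
        tags.insert(format!("resource.{}", key), value.to_string());
    }
    if !resource_schema_url.is_empty() {
        tags.insert("resource_schema_url".to_string(), resource_schema_url.to_string());
    }
    tags
}
```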
@szibis
Contributor Author

szibis commented Mar 13, 2026

> We should preserve %opentelemetry.resources as the raw user-supplied resource attributes.
>
> Also, we have the same type of issue with Metrics: resource.* / scope.* tag collisions.
>
> Generally, if a field was not literally present as a user attribute, it should not be inserted into the raw attribute map. We should not place synthetic or derived metadata into a namespace that is also used for raw user payload.

@pront All fixed

@szibis szibis requested review from cswatt and pront March 13, 2026 21:03

Labels

domain: external docs (Anything related to Vector's external, public documentation); domain: opentelemetry


Development

Successfully merging this pull request may close these issues.

OpenTelemetry source: trace decode drops scope, schema_url fields

3 participants