fix(langchain): use default EmbeddingInvocation operation_name instead of hardcoded value#7
fix(langchain): use default EmbeddingInvocation operation_name instead of hardcoded value#7
Conversation
…d of hardcoded value The langchain instrumentation was setting operation_name="embedding" (singular) which overrode the correct semantic convention default of "embeddings" (plural) defined in EmbeddingInvocation types.py. Remove the explicit override so the dataclass default from GenAiOperationNameValues.EMBEDDINGS is used. Add embedding tests for both packages: - util-genai: default operation_name, semconv attributes, span name, error path - langchain: VCR integration tests for success and error paths Co-authored-by: Cursor <cursoragent@cursor.com>
Review Summary by QodoFix embedding operation_name and add comprehensive test coverage
WalkthroughsDescription• Remove hardcoded "embedding" operation_name override in langchain instrumentation - Allows default "embeddings" semantic convention value to be used • Add comprehensive embedding tests for util-genai package - Default operation_name, semantic convention attributes, span naming, error handling • Add VCR integration tests for langchain embedding instrumentation - Success and error paths with recorded API cassettes Diagramflowchart LR
A["LangChain Instrumentation"] -->|Remove hardcoded operation_name| B["Use EmbeddingInvocation Default"]
B -->|operation_name='embeddings'| C["Semantic Convention Compliance"]
D["util-genai Tests"] -->|Default operation_name| E["Span Attributes"]
D -->|Error handling| E
D -->|Span naming| E
F["LangChain Integration Tests"] -->|VCR cassettes| G["Success Path"]
F -->|VCR cassettes| H["Error Path"]
File Changes1. instrumentation-genai/opentelemetry-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/__init__.py
|
📝 WalkthroughWalkthroughA minor parameter removal from EmbeddingInvocation construction in LangChain instrumentation, paired with comprehensive test coverage including two new VCR cassettes for embedding API calls (success and error scenarios), a new LangChain embedding test module, and extended utility tests for EmbeddingInvocation telemetry lifecycle. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~18 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
util/opentelemetry-util-genai/tests/test_embedding_invocation.py (2)
67-69: Silent no-op when internal API changes.The
if span_emitters:guard means that ifemitters_for("span")returns an empty list (e.g., due to an internal refactor), the tracer override is silently skipped and the tests will either pass vacuously or fail with a confusing "0 spans" message. Consider failing fast instead:- if span_emitters: - span_emitters[0]._tracer = tracer_provider.get_tracer(__name__) + assert span_emitters, "Expected at least one span emitter from telemetry handler" + span_emitters[0]._tracer = tracer_provider.get_tracer(__name__)Also applies to: 98-100, 123-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/opentelemetry-util-genai/tests/test_embedding_invocation.py` around lines 67 - 69, The test currently silently skips overriding the tracer when handler._emitter.emitters_for("span") returns an empty list (span_emitters), which masks API changes; replace the permissive if-check with a fail-fast assertion or explicit raise so the test errors immediately if no span emitter is found, then proceed to set span_emitters[0]._tracer = tracer_provider.get_tracer(__name__); apply the same change to the other occurrences that use emitters_for("span") (the blocks around lines 98-100 and 123-125) to ensure tests fail loudly on internal API changes.
62-69: Consider extracting the repeated tracer-setup boilerplate into a fixture.The span exporter + tracer provider + internal
_emitterwiring is duplicated across three tests (test_embedding_invocation_span_attributes,test_embedding_invocation_span_name,test_embedding_invocation_with_error). A shared pytest fixture would reduce duplication and make it easier to update if the internal API changes.♻️ Example fixture sketch
`@pytest.fixture` def telemetry_setup(): span_exporter = InMemorySpanExporter() tracer_provider = TracerProvider() tracer_provider.add_span_processor(SimpleSpanProcessor(span_exporter)) handler = get_telemetry_handler() span_emitters = list(handler._emitter.emitters_for("span")) if span_emitters: span_emitters[0]._tracer = tracer_provider.get_tracer(__name__) return handler, span_exporterAlso applies to: 93-100, 118-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/opentelemetry-util-genai/tests/test_embedding_invocation.py` around lines 62 - 69, The three tests (test_embedding_invocation_span_attributes, test_embedding_invocation_span_name, test_embedding_invocation_with_error) repeat tracer setup boilerplate; extract that into a pytest fixture (e.g., telemetry_setup) that creates InMemorySpanExporter, TracerProvider, adds SimpleSpanProcessor, obtains handler via get_telemetry_handler(), wires handler._emitter.emitters_for("span")[0]._tracer with tracer_provider.get_tracer(__name__) if present, and returns (handler, span_exporter); then update the three tests to accept the telemetry_setup fixture and use the returned handler and span_exporter instead of duplicating the setup.instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py (1)
63-74: Extract duplicate span-search logic into a helper.The span-finding loop is duplicated verbatim between both tests. A small helper would improve readability and reduce maintenance burden.
♻️ Example helper
def _find_span_by_operation(spans, operation_name): for s in spans: attrs = getattr(s, "attributes", None) if attrs and attrs.get(gen_ai_attributes.GEN_AI_OPERATION_NAME) == operation_name: return s return NoneUsage:
embedding_span = _find_span_by_operation(spans, EMBEDDINGS) assert embedding_span is not None, "No embeddings operation span found"Also applies to: 132-143
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py` around lines 63 - 74, Extract the duplicated span-search loop into a small helper function (e.g., _find_span_by_operation) that accepts spans and an operation_name and returns the matching span or None; replace both verbatim loops (the one using spans, attrs_obj/getattr and gen_ai_attributes.GEN_AI_OPERATION_NAME to compare against EMBEDDINGS) with calls to this helper (e.g., embedding_span = _find_span_by_operation(spans, EMBEDDINGS)) and add an assertion like assert embedding_span is not None where needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py`:
- Around line 63-74: Extract the duplicated span-search loop into a small helper
function (e.g., _find_span_by_operation) that accepts spans and an
operation_name and returns the matching span or None; replace both verbatim
loops (the one using spans, attrs_obj/getattr and
gen_ai_attributes.GEN_AI_OPERATION_NAME to compare against EMBEDDINGS) with
calls to this helper (e.g., embedding_span = _find_span_by_operation(spans,
EMBEDDINGS)) and add an assertion like assert embedding_span is not None where
needed.
In `@util/opentelemetry-util-genai/tests/test_embedding_invocation.py`:
- Around line 67-69: The test currently silently skips overriding the tracer
when handler._emitter.emitters_for("span") returns an empty list
(span_emitters), which masks API changes; replace the permissive if-check with a
fail-fast assertion or explicit raise so the test errors immediately if no span
emitter is found, then proceed to set span_emitters[0]._tracer =
tracer_provider.get_tracer(__name__); apply the same change to the other
occurrences that use emitters_for("span") (the blocks around lines 98-100 and
123-125) to ensure tests fail loudly on internal API changes.
- Around line 62-69: The three tests (test_embedding_invocation_span_attributes,
test_embedding_invocation_span_name, test_embedding_invocation_with_error)
repeat tracer setup boilerplate; extract that into a pytest fixture (e.g.,
telemetry_setup) that creates InMemorySpanExporter, TracerProvider, adds
SimpleSpanProcessor, obtains handler via get_telemetry_handler(), wires
handler._emitter.emitters_for("span")[0]._tracer with
tracer_provider.get_tracer(__name__) if present, and returns (handler,
span_exporter); then update the three tests to accept the telemetry_setup
fixture and use the returned handler and span_exporter instead of duplicating
the setup.
Code Review by Qodo
1. Changelog missing embeddings rename
|
| # Create embedding invocation | ||
| embedding = UtilEmbeddingInvocation( | ||
| operation_name="embedding", | ||
| request_model=request_model, | ||
| input_texts=texts if isinstance(texts, list) else [texts], | ||
| provider=provider, |
There was a problem hiding this comment.
1. Changelog missing embeddings rename 📘 Rule violation ✧ Quality
The PR changes emitted GenAI operation/span naming for LangChain embeddings by removing the hardcoded operation_name, but the package changelog was not updated to document this user-visible telemetry behavior change. This can surprise users who rely on span names/attributes for dashboards and alerts.
Agent Prompt
## Issue description
The LangChain instrumentation behavior changed (embeddings operation/span naming now uses the default `embeddings` operation name), but the package changelog was not updated.
## Issue Context
This is a user-visible observability change that can affect dashboards/alerts and should be documented per the compliance checklist.
## Fix Focus Areas
- instrumentation-genai/opentelemetry-instrumentation-langchain/CHANGELOG.md[1-30]
- instrumentation-genai/opentelemetry-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/__init__.py[205-211]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
The langchain instrumentation was setting operation_name="embedding" (singular) which overrode the correct semantic convention default of "embeddings" (plural) defined in EmbeddingInvocation types.py. Remove the explicit override so the dataclass default from GenAiOperationNameValues.EMBEDDINGS is used.
Add embedding tests for both packages:
Summary by CodeRabbit
Tests
Chores