fix(genai-util): propagate workflow context to downstream spans and m…#6
fix(genai-util): propagate workflow context to downstream spans and m…#6
Conversation
…etrics Add workflow context inheritance in TelemetryHandler and include gen_ai.workflow.name in downstream LLM/tool/embedding/retrieval metric attributes so nested operations consistently carry workflow identity. Co-authored-by: Cursor <cursoragent@cursor.com>
📝 WalkthroughWalkthroughThe pull request introduces workflow context propagation throughout the OpenTelemetry GenAI utilities. A new workflow context stack in the TelemetryHandler manages workflow lifecycle, propagating workflow names into invocation attributes and metrics across LLM, embedding, retrieval, and tool call invocations. Changes
Sequence DiagramsequenceDiagram
participant Client as Client Code
participant Handler as TelemetryHandler
participant Stack as Workflow<br/>Context Stack
participant Span as Active Span
participant Metrics as Metrics Emitter
Client->>Handler: start_workflow(workflow)
activate Handler
Handler->>Stack: push(workflow.name)
deactivate Handler
Client->>Handler: start_llm(invocation)
activate Handler
Handler->>Handler: _inherit_parent_context_attributes()
Handler->>Span: get current span attributes
Span-->>Handler: span attributes (including GEN_AI_WORKFLOW_NAME)
Handler->>invocation: set GEN_AI_WORKFLOW_NAME from context
deactivate Handler
Client->>Handler: stop_llm(invocation)
activate Handler
Handler->>Metrics: emit_token_metrics(invocation)
Metrics->>invocation: get GEN_AI_WORKFLOW_NAME
invocation-->>Metrics: workflow_name
Metrics->>Metrics: add workflow_name to metric attributes
deactivate Handler
Client->>Handler: stop_workflow(workflow)
activate Handler
Handler->>Stack: pop(workflow.name)
deactivate Handler
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoPropagate workflow context to downstream spans and metrics
WalkthroughsDescription• Propagate workflow context to downstream spans and metrics - Add workflow name inheritance in TelemetryHandler - Extract workflow name from parent span attributes - Include gen_ai.workflow.name in LLM/tool/embedding/retrieval metrics • Implement parent context attribute extraction from active spans - Add _get_current_span_attribute() method for best-effort span attribute access - Add _inherit_parent_context_attributes() to propagate agent/workflow identity • Manage workflow context stack lifecycle - Push workflow name on start_workflow() - Pop workflow name on stop_workflow() and fail_workflow() • Add comprehensive test for cross-handler context propagation Diagramflowchart LR
A["Workflow/Agent Start"] -->|Push context| B["Workflow Context Stack"]
B -->|Inherit on nested ops| C["LLM/Tool/Embedding/Retrieval"]
C -->|Extract from parent span| D["Parent Span Attributes"]
D -->|Propagate to metrics| E["Metric Attributes"]
F["Workflow/Agent Stop"] -->|Pop context| B
File Changes1. util/opentelemetry-util-genai/src/opentelemetry/util/genai/emitters/metrics.py
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
util/opentelemetry-util-genai/tests/test_metrics.py (1)
59-64:⚠️ Potential issue | 🟡 MinorDuplicate cleanup block in
setUp.Lines 60-61 and Lines 63-64 are identical
hasattr/delattrpairs. This is a copy-paste leftover.🧹 Remove the duplicate
# Reset handler singleton if hasattr(get_telemetry_handler, "_default_handler"): delattr(get_telemetry_handler, "_default_handler") - # Reset handler singleton - if hasattr(get_telemetry_handler, "_default_handler"): - delattr(get_telemetry_handler, "_default_handler")
🧹 Nitpick comments (6)
util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py (4)
262-263: Workflow context stack lacks a unique identifier — out-of-order completions can orphan entries.The
_agent_context_stacktracks(name, run_id)tuples and pops only when both match (Lines 965-967), making it resilient to out-of-order stop/fail. In contrast,_workflow_context_stackstores only the name string. If a nested workflow (B) is still active when an outer workflow (A) completes out of order, the pop check ([-1] == workflow.name) will silently skip the pop and leaveApermanently on the stack.Consider storing
(name, run_id)tuples (mirroring the agent stack) and matching on both fields when popping.♻️ Suggested change
- # Active workflow name stack for implicit propagation to nested operations - self._workflow_context_stack: list[str] = [] + # Active workflow name stack for implicit propagation to nested operations + self._workflow_context_stack: list[tuple[str, str]] = [] # (name, run_id)Then adjust push (Line 731):
- self._workflow_context_stack.append(workflow.name) + self._workflow_context_stack.append((workflow.name, str(workflow.run_id)))Pop sites (Lines 873-878, 897-902) become:
- and self._workflow_context_stack[-1] == workflow.name + and self._workflow_context_stack[-1] == (workflow.name, str(workflow.run_id))Read sites extract
[0]for the name:- self._workflow_context_stack[-1] + self._workflow_context_stack[-1][0]
350-367: Accessing_attributesis fragile — document the SDK coupling.
_attributesis a private attribute ofReadableSpanin the OTel SDK. The publicattributesproperty exists onReadableSpan(SDK), but not on the baseSpaninterface (API). The fallback chain (attributes→_attributes) is reasonable defensive coding, but a brief inline comment noting the SDK coupling would help future maintainers.Also,
get_current_span()returnsINVALID_SPAN(notNone) when no span is active. Theis Noneguard on Line 357 will never beTruefor the default implementation. This is harmless becauseINVALID_SPANhas noattributes/_attributes, so execution falls through correctly — but it could be slightly more precise.Minor precision improvement
try: current_span = _trace_mod.get_current_span() except Exception: return None - if current_span is None: + if current_span is None or not current_span.is_recording(): return None
369-419: Dual propagation paths (stack + parent span) may conflict or double-apply.
start_llm(and siblings) first checks_workflow_context_stack(Lines 412-418), then calls_inherit_parent_context_attributes(Line 419), which also tries to setGEN_AI_WORKFLOW_NAMEfrom the parent span (Lines 383-393). When the active span already carries the workflow name (set bystart_workflow→on_start), both paths would agree. However, if the stack and parent span disagree (e.g., stale stack entry), the stack wins because thein invocation.attributesguard in_inherit_parent_context_attributessees it's already set.This is probably intentional (stack = authoritative source, parent span = fallback for cross-handler scenarios), but it's worth a brief comment explaining the precedence. Also, in
_inherit_parent_context_attributes, the workflow check usesin invocation.attributeswhile the agent checks usenot invocation.agent_name— the inconsistency is minor but could confuse readers.
412-419: Repeated workflow-context injection block — consider extracting a helper.The same 7-line pattern (check attributes → check stack → assign) is duplicated verbatim in
start_llm,start_embedding,start_retrieval,start_tool_call, andstart_agent. A small private helper would reduce duplication and make future changes (e.g., addingrun_idto the stack) a single-point edit.♻️ Proposed helper
def _apply_workflow_context(self, invocation: GenAI) -> None: """Set workflow name from context stack if not already present.""" if ( GEN_AI_WORKFLOW_NAME not in invocation.attributes and self._workflow_context_stack ): invocation.attributes[GEN_AI_WORKFLOW_NAME] = ( self._workflow_context_stack[-1] )Then each
start_*call becomes:self._apply_workflow_context(invocation) self._inherit_parent_context_attributes(invocation)Also applies to: 538-545, 612-619, 682-689, 913-920
util/opentelemetry-util-genai/src/opentelemetry/util/genai/emitters/metrics.py (2)
105-111: Consider extracting the repeated workflow-name extraction into a helper.The same block:
workflow_name = ( invocation.attributes.get(GEN_AI_WORKFLOW_NAME) if invocation.attributes else None ) if workflow_name: metric_attrs[GEN_AI_WORKFLOW_NAME] = workflow_nameappears 8 times across
on_end,on_error, and_record_retrieval_metrics. Combined with the similarly repeatedagent_name/agent_idextraction, there's an opportunity to consolidate all context-attribute propagation into a single helper that enrichesmetric_attrsfrom invocation attributes in one pass.♻️ Proposed helper
def _enrich_metric_attrs_with_context( metric_attrs: dict[str, Any], invocation: Any ) -> None: """Copy agent and workflow context from invocation into metric attributes.""" if getattr(invocation, "agent_name", None): metric_attrs[GenAI.GEN_AI_AGENT_NAME] = invocation.agent_name if getattr(invocation, "agent_id", None): metric_attrs[GenAI.GEN_AI_AGENT_ID] = invocation.agent_id attrs = getattr(invocation, "attributes", None) if attrs: wf = attrs.get(GEN_AI_WORKFLOW_NAME) if wf: metric_attrs[GEN_AI_WORKFLOW_NAME] = wfEach call site then becomes a single line:
_enrich_metric_attrs_with_context(metric_attrs, llm_invocation)Also applies to: 143-149, 181-187, 228-234, 260-266, 299-305, 408-414
127-160: MissingreturnafterToolCallbranch inon_end—EmbeddingInvocationandRetrievalInvocationblocks always execute for tool calls.Line 127 uses
if isinstance(obj, ToolCall)but does notreturnat the end of the block (Line 159). SinceToolCallis not anEmbeddingInvocationorRetrievalInvocation, the subsequentifchecks on Lines 161 and 196 evaluate toFalseand are harmless — so there's no runtime bug. However, this differs from theLLMInvocationbranch which doesreturnon Line 126. The inconsistency could cause issues if type hierarchies change.This is pre-existing code (not introduced in this PR), so flagging it only as a minor observation.
Code Review by Qodo
1. Workflow stack not context-safe
|
| # Active workflow name stack for implicit propagation to nested operations | ||
| self._workflow_context_stack: list[str] = [] |
There was a problem hiding this comment.
1. Workflow stack not context-safe 🐞 Bug ⛯ Reliability
TelemetryHandler stores workflow identity in a mutable instance stack that is shared by all callers of the singleton handler, which can mis-attribute workflow name in concurrent/overlapping workflows (threads/async tasks) and pollute downstream span/metric attributes.
Agent Prompt
### Issue description
`TelemetryHandler` tracks workflow identity in a mutable list on the handler instance. Because `get_telemetry_handler()` returns a singleton, that stack becomes shared mutable state across concurrent requests/tasks, which can cause `gen_ai.workflow.name` to be applied to the wrong nested spans/metrics.
### Issue Context
The PR already adds `_inherit_parent_context_attributes()` which can derive workflow/agent identity from the current active span. Prefer context-local propagation over a shared stack.
### Fix Focus Areas
- util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py[260-264]
- util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py[722-732]
- util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py[851-905]
- util/opentelemetry-util-genai/src/opentelemetry/util/genai/handler.py[395-420]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
…etrics
Add workflow context inheritance in TelemetryHandler and include gen_ai.workflow.name in downstream LLM/tool/embedding/retrieval metric attributes so nested operations consistently carry workflow identity.
Summary by CodeRabbit
Release Notes
New Features
Tests