Skip to content

Refactor loongsuite instrumentation for langchain#133

Open
Cirilla-zmh wants to merge 13 commits intoalibaba:mainfrom
Cirilla-zmh:minghui/refactor_langchain
Open

Refactor loongsuite instrumentation for langchain#133
Cirilla-zmh wants to merge 13 commits intoalibaba:mainfrom
Cirilla-zmh:minghui/refactor_langchain

Conversation

@Cirilla-zmh
Copy link
Collaborator

@Cirilla-zmh Cirilla-zmh commented Mar 6, 2026

Description

Summary

Rewrites loongsuite-instrumentation-langchain from scratch, replacing the legacy wrapt-based function wrapping with a BaseTracer callback approach and integrating opentelemetry-util-genai for standardized GenAI semantic convention compliance.

Design

Tracer architecture — LoongsuiteTracer extends langchain_core.tracers.base.BaseTracer and hooks into the fine-grained _on_*_start / _on_*_end / _on_*_error callbacks. This lets us extract telemetry at the point where LangChain's Run objects are fully populated, rather than monkey-patching individual methods.

Operation type mapping — Each LangChain run type is mapped to the appropriate util-genai handler method:

  • LLM / chat_model → start_llm / stop_llm / fail_llm
  • Chain (Agent) → start_invoke_agent / stop_invoke_agent / fail_invoke_agent
  • Chain (generic) → direct span creation with CHAIN span kind
  • Tool → start_execute_tool / stop_execute_tool / fail_execute_tool
  • Retriever → start_retrieve / stop_retrieve / fail_retrieve
    Agent runs are distinguished from generic chains by run.name (e.g. AgentExecutor, MRKLChain).

Context propagation — Follows the Robin/OpenLLMetry pattern: parent-child span relationships are established by passing Context explicitly to start_span / handler.start_*, avoiding hazardous attach/detach in a callback system where exceptions between callbacks can leak context. The sole exception is generic Chain spans, which do attach/detach so that non-LangChain child operations (e.g. HTTP calls) nest correctly.

Content capture gating — Chain input.value / output.value attributes are gated behind util-genai's is_experimental_mode() and get_content_capturing_mode(), consistent with how LLM/Tool/Retriever content is already controlled.

Thread safety — All access to the per-run bookkeeping dict is protected by RLock.

TTFTon_llm_new_token records the monotonic first-token timestamp; util-genai computes gen_ai.response.time_to_first_token on span finalization.

Changes to util-genai

  • All start_* methods in handler.py and extended_handler.py accept an optional context parameter, forwarded to start_span for explicit parent-child linking.
  • _safe_detach uses _RUNTIME_CONTEXT.detach directly to avoid the noisy ERROR log from the OTel SDK's context_api.detach wrapper.
  • Fixed undefined otel_context reference in _multimodal_processing.py.

##Test plan

  • 98 tests covering: instrumentor lifecycle, LLM/Chain/Agent/Tool/Retriever span creation, input/output content capture with enable/disable, error spans, data extraction utilities, and multi-step chain composition.
  • tox-loongsuite.ini configured with oldest / latest dependency matrices.
  • Zero ERROR-level log output during test execution.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Add unit tests

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@Cirilla-zmh Cirilla-zmh added enhancement New feature or request instrumentaion The instrumentation label represents issues related to instrumentation. genai The genai label represents issues related to generative AI. labels Mar 6, 2026
@Cirilla-zmh
Copy link
Collaborator Author

PR #133 代码审查结果

符合规范的项目

  1. pyproject.toml 配置正确 - 与其他同类 instrumentation 一致
  2. 包含 version 文件 - 内容为 0.2.0.dev,符合开发版本规范
  3. 包含 package 文件 - 正确设置了 langchain_core 依赖版本
  4. LICENSE 文件头完整 - 所有文件都包含正确的 Apache 2.0 LICENSE 头
  5. CHANGELOG 已更新 - 包含了 0.1.0 版本的初始化记录
  6. README.rst 已更新 - 提供了完整的安装和使用说明
  7. GitHub Workflow 文件已生成 - loongsuite_lint_0.ymlloongsuite_test_0.yml 已创建

⚠️ 需要修复的问题

1. tox-loongsuite.ini 中 langchain 测试环境被注释

tox-loongsuite.ini 文件中,langchain 相关的测试环境被注释掉了:

; ; loongsuite-instrumentation-langchain
; py3{9,10,11,12,13}-test-loongsuite-instrumentation-langchain
; lint-loongsuite-instrumentation-langchain

这会导致 GitHub Actions 中的测试不会运行,需要取消注释。

2. util/opentelemetry-util-genai/handler.py 缺少 LoongSuite Extension 注释

根据审查规范,修改 util/opentelemetry-util-handler 中的特定文件(如 handler.py)时,需要在改动处添加 "LoongSuite Extension" 注释。但当前 handler.py 文件中没有看到相关注释。

📋 建议

  1. 取消 tox-loongsuite.ini 中 langchain 测试环境的注释
  2. 在 util/opentelemetry-util-genai/handler.py 的修改处添加 "LoongSuite Extension" 注释

🎯 总体评估

PR #133 基本符合 LoongSuite instrumentation 规范,主要问题在于测试配置和注释缺失。建议作者修复上述问题后再合并。

审查结论:需要修改后重新提交

Copy link
Collaborator Author

@Cirilla-zmh Cirilla-zmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #133 代码审查结果

符合规范的项目

  1. pyproject.toml 配置正确 - 与其他同类 instrumentation 一致
  2. 包含 version 文件 - 内容为 0.2.0.dev,符合开发版本规范
  3. 包含 package 文件 - 正确设置了 langchain_core 依赖版本
  4. LICENSE 文件头完整 - 所有文件都包含正确的 Apache 2.0 LICENSE 头
  5. CHANGELOG 已更新 - 包含了 0.1.0 版本的初始化记录
  6. README.md 已更新 - 提供了完整的安装和使用说明
  7. GitHub Workflow 文件已生成 - loongsuite_lint_0.ymlloongsuite_test_0.yml 已创建

⚠️ 需要修复的问题

1. tox-loongsuite.ini 中 langchain 测试环境被注释

tox-loongsuite.ini 文件中,langchain 相关的测试环境被注释掉了(行 39-41):

; ; loongsuite-instrumentation-langchain
; py3{9,10,11,12,13}-test-loongsuite-instrumentation-langchain
; lint-loongsuite-instrumentation-langchain

同时相关的依赖配置(行 100)和命令配置(行 150-151)也存在但被注释。

这会导致 GitHub Actions 中的测试不会运行,需要取消注释。

2. util/opentelemetry-util-genai/handler.py 缺少 LoongSuite Extension 注释

根据审查规范,修改 util/opentelemetry-util-handler 中的特定文件(如 handler.py)时,需要在改动处添加 "LoongSuite Extension" 注释。但当前 handler.py 文件中没有看到相关注释。

3. 确认没有多余的 LICENSE 文件

需要确认整个 PR 中没有在 instrumentation 目录下添加不必要的 LICENSE 文件。

🔍 潜在资源泄漏风险检查

从代码结构来看:

  • 使用了 wrapt.wrap_function_wrapper 进行函数包装,这是安全的做法
  • CallbackManager 的处理逻辑看起来合理,通过 add_handler 添加 tracer
  • 没有明显的上下文泄漏或内存泄漏风险

📋 建议

  1. 取消 tox-loongsuite.ini 中 langchain 测试环境的注释(行 39-41, 100, 150-151)
  2. 在 util/opentelemetry-util-genai/handler.py 的修改处添加 "LoongSuite Extension" 注释
  3. 确认没有多余的 LICENSE 文件

🎯 总体评估

PR #133 基本符合 LoongSuite instrumentation 规范,主要问题在于测试配置和注释缺失。建议作者修复上述问题后再合并。

审查结论:需要修改后重新提交

@Cirilla-zmh
Copy link
Collaborator Author

详细审查反馈 - 具体修改位置

🔧 需要修复的具体问题

1. tox-loongsuite.ini 中 langchain 测试环境被注释

文件:
行号: 39-41, 100, 150-151
问题: langchain 相关的测试环境配置被注释掉了,需要取消注释以确保 CI 能正常运行测试。

需要取消注释以下行:

  • 行 39-41: 环境声明部分
  • 行 100: 依赖配置
  • 行 150-151: 命令配置

2. util/opentelemetry-util-genai/handler.py 缺少 LoongSuite Extension 注释

文件:
问题: 根据项目规范,修改 util 目录下的文件时需要在改动处添加 "LoongSuite Extension" 注释。请在相关修改位置添加此注释。

3. 确认没有多余的 LICENSE 文件

请检查整个 PR 变更,确保没有在 instrumentation 目录中添加不必要的 LICENSE 文件。


其他方面都符合规范! 修复以上问题后就可以批准合并了。

Change-Id: Id2d3e8f987ae2cf91c1c1f0bc1187124b473a8a9
Co-developed-by: Cursor <noreply@cursor.com>
…ration

- Added `opentelemetry-util-genai` as a dependency.
- Integrated `ExtendedTelemetryHandler` into `LangChainInstrumentor` for improved telemetry handling.
- Updated `LoongsuiteTracer` to utilize the new handler for span creation and context management.
- Refactored utility functions for better data extraction and base64 image filtering.
- Added tests for agent span detection and base64 filtering functionality.

This update enhances the data extraction capabilities and improves the overall telemetry experience in LangChain.

Change-Id: Ib45f3cc60e6169eed5ed8c47683be4f8deb3daec
Co-developed-by: Cursor <noreply@cursor.com>
…ure improvements

- Updated `LoongsuiteTracer` to explicitly manage context propagation for parent-child span relationships.
- Introduced new utility functions for checking content capture settings in chain spans.
- Enhanced tests to verify input/output content capture for chains, LLMs, tools, and retrievers.
- Added support for capturing message content in span attributes based on configuration.

These changes improve the accuracy and reliability of telemetry data in LangChain, ensuring better tracking of operations and their contexts.

Change-Id: Ib45f3cc60e6169eed5ed8c47683be4f8deb3daec
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I83b3e6d0d09b3dd083fcec97c75afa0d89d5c792
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I55af3603876d0fe8ee96f3fb4cab6ad92b654818
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I34b48596cff260f4701373edf619626359c54319
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I1d31136adc5e0509e0572eaf4dbda4014f8ad0ce
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ibf2cddc6b0f2c2ef224c2749ae29729bc60bb590
Co-developed-by: Cursor <noreply@cursor.com>
@@ -36,9 +36,9 @@ envlist =
; py3{9,10,11,12,13}-test-loongsuite-instrumentation-dify
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注释格式不正确。应该使用 而不是 。请修正注释格式以保持一致性。

loongsuite-langchain: {[testenv]test_deps}
loongsuite-langchain: -r {toxinidir}/instrumentation-loongsuite/loongsuite-instrumentation-langchain/test-requirements.txt
langchain-oldest: -r {toxinidir}/instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/requirements.oldest.txt
langchain-latest: {[testenv]test_deps}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

依赖项配置中的环境名称应该保持一致性。建议使用 'loongsuite-langchain-oldest' 和 'loongsuite-langchain-latest' 而不是 'langchain-oldest' 和 'langchain-latest',以与其他 instrumentation 保持一致的命名约定。

Copy link
Collaborator Author

@Cirilla-zmh Cirilla-zmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

感谢提交 PR!我已经添加了两个行内评论,主要涉及:

  1. tox-loongsuite.ini 注释格式问题:第 39 行的注释格式应该与其他 instrumentation 保持一致
  2. tox-loongsuite.ini 依赖项命名一致性:第 100 行的环境名称应该使用 loongsuite-langchain-* 前缀以保持一致性

请修复这些问题后,我会重新审查并批准 PR。

Change-Id: Ic63cc0128f87670baa5455dbf785f41dcca69494
Co-developed-by: Cursor <noreply@cursor.com>
…r id

Change-Id: Id5acdec8ad24cef90250fbd82149773856c9569f
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I51457e772e64f87fe7aa3bd816dc805a519b7170
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ib5de526650719ff7a7867b55217c5c6f71e85740
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Idc634b3e9c750437e0716387cad4e99cf3157f5d
Co-developed-by: Cursor <noreply@cursor.com>
@Cirilla-zmh Cirilla-zmh force-pushed the minghui/refactor_langchain branch from 8630acb to bd608f5 Compare March 9, 2026 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request genai The genai label represents issues related to generative AI. instrumentaion The instrumentation label represents issues related to instrumentation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants