Refactor loongsuite instrumentation for langchain#133
Refactor loongsuite instrumentation for langchain#133Cirilla-zmh wants to merge 13 commits intoalibaba:mainfrom
Conversation
PR #133 代码审查结果✅ 符合规范的项目
|
Cirilla-zmh
left a comment
There was a problem hiding this comment.
PR #133 代码审查结果
✅ 符合规范的项目
- pyproject.toml 配置正确 - 与其他同类 instrumentation 一致
- 包含 version 文件 - 内容为
0.2.0.dev,符合开发版本规范 - 包含 package 文件 - 正确设置了 langchain_core 依赖版本
- LICENSE 文件头完整 - 所有文件都包含正确的 Apache 2.0 LICENSE 头
- CHANGELOG 已更新 - 包含了 0.1.0 版本的初始化记录
- README.md 已更新 - 提供了完整的安装和使用说明
- GitHub Workflow 文件已生成 -
loongsuite_lint_0.yml和loongsuite_test_0.yml已创建
⚠️ 需要修复的问题
1. tox-loongsuite.ini 中 langchain 测试环境被注释
在 tox-loongsuite.ini 文件中,langchain 相关的测试环境被注释掉了(行 39-41):
; ; loongsuite-instrumentation-langchain
; py3{9,10,11,12,13}-test-loongsuite-instrumentation-langchain
; lint-loongsuite-instrumentation-langchain同时相关的依赖配置(行 100)和命令配置(行 150-151)也存在但被注释。
这会导致 GitHub Actions 中的测试不会运行,需要取消注释。
2. util/opentelemetry-util-genai/handler.py 缺少 LoongSuite Extension 注释
根据审查规范,修改 util/opentelemetry-util-handler 中的特定文件(如 handler.py)时,需要在改动处添加 "LoongSuite Extension" 注释。但当前 handler.py 文件中没有看到相关注释。
3. 确认没有多余的 LICENSE 文件
需要确认整个 PR 中没有在 instrumentation 目录下添加不必要的 LICENSE 文件。
🔍 潜在资源泄漏风险检查
从代码结构来看:
- 使用了
wrapt.wrap_function_wrapper进行函数包装,这是安全的做法 - CallbackManager 的处理逻辑看起来合理,通过
add_handler添加 tracer - 没有明显的上下文泄漏或内存泄漏风险
📋 建议
- 取消 tox-loongsuite.ini 中 langchain 测试环境的注释(行 39-41, 100, 150-151)
- 在 util/opentelemetry-util-genai/handler.py 的修改处添加 "LoongSuite Extension" 注释
- 确认没有多余的 LICENSE 文件
🎯 总体评估
PR #133 基本符合 LoongSuite instrumentation 规范,主要问题在于测试配置和注释缺失。建议作者修复上述问题后再合并。
审查结论:需要修改后重新提交
详细审查反馈 - 具体修改位置🔧 需要修复的具体问题1. tox-loongsuite.ini 中 langchain 测试环境被注释文件: 需要取消注释以下行:
2. util/opentelemetry-util-genai/handler.py 缺少 LoongSuite Extension 注释文件: 3. 确认没有多余的 LICENSE 文件请检查整个 PR 变更,确保没有在 instrumentation 目录中添加不必要的 LICENSE 文件。 其他方面都符合规范! 修复以上问题后就可以批准合并了。 |
Change-Id: Id2d3e8f987ae2cf91c1c1f0bc1187124b473a8a9 Co-developed-by: Cursor <noreply@cursor.com>
…ration - Added `opentelemetry-util-genai` as a dependency. - Integrated `ExtendedTelemetryHandler` into `LangChainInstrumentor` for improved telemetry handling. - Updated `LoongsuiteTracer` to utilize the new handler for span creation and context management. - Refactored utility functions for better data extraction and base64 image filtering. - Added tests for agent span detection and base64 filtering functionality. This update enhances the data extraction capabilities and improves the overall telemetry experience in LangChain. Change-Id: Ib45f3cc60e6169eed5ed8c47683be4f8deb3daec Co-developed-by: Cursor <noreply@cursor.com>
…ure improvements - Updated `LoongsuiteTracer` to explicitly manage context propagation for parent-child span relationships. - Introduced new utility functions for checking content capture settings in chain spans. - Enhanced tests to verify input/output content capture for chains, LLMs, tools, and retrievers. - Added support for capturing message content in span attributes based on configuration. These changes improve the accuracy and reliability of telemetry data in LangChain, ensuring better tracking of operations and their contexts. Change-Id: Ib45f3cc60e6169eed5ed8c47683be4f8deb3daec Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I83b3e6d0d09b3dd083fcec97c75afa0d89d5c792 Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I34b48596cff260f4701373edf619626359c54319 Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I1d31136adc5e0509e0572eaf4dbda4014f8ad0ce Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ibf2cddc6b0f2c2ef224c2749ae29729bc60bb590 Co-developed-by: Cursor <noreply@cursor.com>
| @@ -36,9 +36,9 @@ envlist = | |||
| ; py3{9,10,11,12,13}-test-loongsuite-instrumentation-dify | |||
There was a problem hiding this comment.
注释格式不正确。应该使用 而不是 。请修正注释格式以保持一致性。
| loongsuite-langchain: {[testenv]test_deps} | ||
| loongsuite-langchain: -r {toxinidir}/instrumentation-loongsuite/loongsuite-instrumentation-langchain/test-requirements.txt | ||
| langchain-oldest: -r {toxinidir}/instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/requirements.oldest.txt | ||
| langchain-latest: {[testenv]test_deps} |
There was a problem hiding this comment.
依赖项配置中的环境名称应该保持一致性。建议使用 'loongsuite-langchain-oldest' 和 'loongsuite-langchain-latest' 而不是 'langchain-oldest' 和 'langchain-latest',以与其他 instrumentation 保持一致的命名约定。
Cirilla-zmh
left a comment
There was a problem hiding this comment.
感谢提交 PR!我已经添加了两个行内评论,主要涉及:
- tox-loongsuite.ini 注释格式问题:第 39 行的注释格式应该与其他 instrumentation 保持一致
- tox-loongsuite.ini 依赖项命名一致性:第 100 行的环境名称应该使用
loongsuite-langchain-*前缀以保持一致性
请修复这些问题后,我会重新审查并批准 PR。
Change-Id: Ic63cc0128f87670baa5455dbf785f41dcca69494 Co-developed-by: Cursor <noreply@cursor.com>
…r id Change-Id: Id5acdec8ad24cef90250fbd82149773856c9569f Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I51457e772e64f87fe7aa3bd816dc805a519b7170 Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ib5de526650719ff7a7867b55217c5c6f71e85740 Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Idc634b3e9c750437e0716387cad4e99cf3157f5d Co-developed-by: Cursor <noreply@cursor.com>
8630acb to
bd608f5
Compare
Description
Summary
Rewrites
loongsuite-instrumentation-langchainfrom scratch, replacing the legacywrapt-based function wrapping with a BaseTracer callback approach and integratingopentelemetry-util-genaifor standardized GenAI semantic convention compliance.Design
Tracer architecture —
LoongsuiteTracerextendslangchain_core.tracers.base.BaseTracerand hooks into the fine-grained_on_*_start/_on_*_end/_on_*_errorcallbacks. This lets us extract telemetry at the point where LangChain's Run objects are fully populated, rather than monkey-patching individual methods.Operation type mapping — Each LangChain run type is mapped to the appropriate
util-genaihandler method:start_llm/stop_llm/fail_llmstart_invoke_agent/stop_invoke_agent/fail_invoke_agentCHAINspan kindstart_execute_tool/stop_execute_tool/fail_execute_toolstart_retrieve/stop_retrieve/fail_retrieveAgent runs are distinguished from generic chains by
run.name(e.g.AgentExecutor,MRKLChain).Context propagation — Follows the Robin/OpenLLMetry pattern: parent-child span relationships are established by passing Context explicitly to
start_span/handler.start_*, avoiding hazardousattach/detachin a callback system where exceptions between callbacks can leak context. The sole exception is generic Chain spans, which doattach/detachso that non-LangChain child operations (e.g. HTTP calls) nest correctly.Content capture gating — Chain
input.value/output.valueattributes are gated behindutil-genai'sis_experimental_mode()andget_content_capturing_mode(), consistent with how LLM/Tool/Retriever content is already controlled.Thread safety — All access to the per-run bookkeeping dict is protected by
RLock.TTFT —
on_llm_new_tokenrecords the monotonic first-token timestamp;util-genaicomputesgen_ai.response.time_to_first_tokenon span finalization.Changes to util-genai
start_*methods inhandler.pyandextended_handler.pyaccept an optionalcontextparameter, forwarded tostart_spanfor explicit parent-child linking._safe_detachuses_RUNTIME_CONTEXT.detachdirectly to avoid the noisy ERROR log from the OTel SDK'scontext_api.detachwrapper.otel_contextreference in_multimodal_processing.py.##Test plan
tox-loongsuite.iniconfigured witholdest/latestdependency matrices.ERROR-level log output during test execution.Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.