feat: add exposeRetrievalMetadata and clean recall output #113

Open
Disaster-Terminator wants to merge 3 commits into CortexReach:main from Disaster-Terminator:recovery/expose-retrieval-metadata-clean

@Disaster-Terminator Disaster-Terminator commented Mar 8, 2026

Background

autoRecall injection and memory_recall currently include retrieval metadata in their default output, for example:

[category:scope] text (80%, vector+BM25+reranked)

That information can be useful while debugging retrieval behavior, but it is noisy in normal operation:

  • it adds unnecessary token cost to LLM-visible context;
  • it exposes low-level retrieval details in user-facing output;
  • it is not currently controlled by an explicit config flag.

This PR is a follow-up to #107. PR #107 was closed because unrelated upstream commits were accidentally mixed in during squash. This revision reapplies only the intended retrieval-metadata change on top of the current main.

What this PR changes

1. Hide retrieval metadata by default

  • remove (score, sources) suffixes from autoRecall injected lines;
  • remove retrieval score/source annotations from the human/LLM-visible text returned by memory_recall;
  • stop exposing score and sources in details.memories by default.
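
As a rough illustration of the text cleanup described above, the sketch below formats one recall line with and without the `(score, sources)` suffix. The `RetrievedMemory` shape and the `formatRecallLine` helper are hypothetical names used only for this example; they are not taken from the plugin's source.

```typescript
// Hypothetical shape of a retrieved memory; the field names are assumptions.
interface RetrievedMemory {
  category: string;
  scope: string;
  text: string;
  score: number;     // relevance in [0, 1]
  sources: string[]; // e.g. ["vector", "BM25", "reranked"]
}

// Build the LLM-visible line. The metadata suffix is appended only on request.
function formatRecallLine(m: RetrievedMemory, withMetadata = false): string {
  const base = `[${m.category}:${m.scope}] ${m.text}`;
  if (!withMetadata) return base;
  const pct = Math.round(m.score * 100);
  return `${base} (${pct}%, ${m.sources.join("+")})`;
}

const example: RetrievedMemory = {
  category: "preference",
  scope: "global",
  text: "User prefers dark mode",
  score: 0.8,
  sources: ["vector", "BM25", "reranked"],
};

console.log(formatRecallLine(example));       // clean default line
console.log(formatRecallLine(example, true)); // old-style line with suffix
```

The clean form is shorter on every injected line, which is where the token saving comes from.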

2. Add an explicit debug switch

Introduce a new exposeRetrievalMetadata config option, defaulting to false.

When enabled, retrieval metadata is exposed separately via details.debug, instead of being mixed into the default text output.
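
A minimal sketch of how the flag could gate the payload. The type and function names follow the commit notes in this PR (`SerializedMemory`, `DebugMemory`, `SerializedMemoryPayload`, `buildSanitizedMemoryPayload`), but the field layouts and the signature shown here are assumptions, not the actual code in src/tools.ts.

```typescript
// Input shape is an assumption made for this example.
interface ScoredMemory {
  id: string;
  text: string;
  score: number;
  sources: string[];
}

// Names come from the commit notes; the fields are assumed.
interface SerializedMemory { id: string; text: string; }
interface DebugMemory { id: string; score: number; sources: string[]; }
interface SerializedMemoryPayload {
  memories: SerializedMemory[];
  debug?: DebugMemory[]; // present only when the flag is on
}

function buildSanitizedMemoryPayload(
  results: ScoredMemory[],
  exposeRetrievalMetadata = false, // mirrors the config default
): SerializedMemoryPayload {
  const payload: SerializedMemoryPayload = {
    memories: results.map(({ id, text }) => ({ id, text })),
  };
  if (exposeRetrievalMetadata) {
    // Metadata lives in its own array instead of being mixed into memories.
    payload.debug = results.map(({ id, score, sources }) => ({ id, score, sources }));
  }
  return payload;
}

const sample: ScoredMemory[] = [
  { id: "m1", text: "User prefers dark mode", score: 0.8, sources: ["vector", "BM25"] },
];

console.log(JSON.stringify(buildSanitizedMemoryPayload(sample)));
console.log(JSON.stringify(buildSanitizedMemoryPayload(sample, true)));
```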

3. Preserve existing candidate response structure

For the candidate flows in memory_forget and memory_update, keep the existing details.candidates structure and attach details.debug only when needed. This avoids introducing an unnecessary response-shape change.
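
The intent can be sketched as follows: the top-level key stays `candidates`, and `debug` appears next to it only when the flag is on. `ScoredCandidate` and `buildCandidateDetails` are hypothetical names for illustration, and the candidate fields are assumed.

```typescript
// Assumed shape of a matched candidate before serialization.
interface ScoredCandidate {
  id: string;
  text: string;
  score: number;
  sources: string[];
}

interface CandidateDetails {
  candidates: { id: string; text: string }[];
  debug?: { id: string; score: number; sources: string[] }[];
}

function buildCandidateDetails(
  matches: ScoredCandidate[],
  exposeRetrievalMetadata: boolean,
): CandidateDetails {
  const candidates = matches.map(({ id, text }) => ({ id, text }));
  // Flag off: the response has exactly the key callers already expect.
  if (!exposeRetrievalMetadata) return { candidates };
  const debug = matches.map(({ id, score, sources }) => ({ id, score, sources }));
  return { candidates, debug };
}

const matches: ScoredCandidate[] = [
  { id: "c1", text: "Old address", score: 0.72, sources: ["vector"] },
  { id: "c2", text: "New address", score: 0.65, sources: ["BM25"] },
];

console.log(Object.keys(buildCandidateDetails(matches, false)));
console.log(Object.keys(buildCandidateDetails(matches, true)));
```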

Files changed

| File | Purpose |
| --- | --- |
| index.ts | clean up autoRecall output and pass exposeRetrievalMetadata into tool context |
| src/tools.ts | separate default output from debug metadata in recall / forget / update flows |
| openclaw.plugin.json | add the exposeRetrievalMetadata config option |
| test/memory-recall-metadata.test.mjs | add regression coverage for metadata visibility and candidate-structure compatibility |
| package.json | include the new test in the test script |

Validation

npm ci
npm test

Result: 41/41 passing.

Scope

This revision is intentionally limited to:

  • removing retrieval metadata from default recall output;
  • adding an explicit debug toggle;
  • preserving compatibility for existing candidate response payloads.

- Rename sanitizeMemoryForSerialization to buildSanitizedMemoryPayload
- Preserve details.candidates for forget/update tools instead of replacing with memories
- Add explicit types for SerializedMemory, DebugMemory, SerializedMemoryPayload
- Extend test coverage to verify candidates contract and clean text output
@Disaster-Terminator Disaster-Terminator changed the title from "feat: add exposeRetrievalMetadata and clean recall output" to "feat: add exposeRetrievalMetadata config, hide retrieval metadata in recall output by default" Mar 8, 2026
@Disaster-Terminator Disaster-Terminator changed the title from "feat: add exposeRetrievalMetadata config, hide retrieval metadata in recall output by default" to "feat: add exposeRetrievalMetadata and clean recall output" Mar 8, 2026
@Disaster-Terminator Disaster-Terminator marked this pull request as ready for review March 8, 2026 10:10

@rwmjhb rwmjhb left a comment

Thanks for the clean resubmission from #107.

Two things to address before merging:

1. Branch needs rebase — vllm-provider test will be dropped

Your branch is based on a commit before PR #76 landed. The current main has test/vllm-provider.test.mjs in the test script, but your package.json change overwrites that line without it:

# current main
"test": "node test/vllm-provider.test.mjs && node test/embedder-error-hints.test.mjs && ..."

# this PR
"test": "node test/embedder-error-hints.test.mjs && ... && node --test test/memory-recall-metadata.test.mjs ..."

Please rebase onto latest main and include vllm-provider.test.mjs in the test script.

2. details payload shape change is a silent breaking change

Removing score and sources from details.memories[*] / details.candidates[*] by default changes the response schema. Any downstream consumer reading result.details.memories[0].score will get undefined after this change.

I understand the motivation (reducing token noise in LLM context), but this should be treated as an explicit breaking change rather than an internal cleanup. Suggestions:

  • Option A: Keep score/sources in details.memories (structured data is not LLM-visible token cost), only clean the content[].text string. This avoids any schema breakage.
  • Option B: If removing from details is intentional, document it as a breaking change and bump accordingly.
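
To make Option A concrete, here is a minimal sketch in which only the `content[].text` string is cleaned while `details.memories` keeps `score` and `sources`. The `ToolResult` shape (including `type: "text"`) and the `buildRecallResultOptionA` helper are assumptions for illustration, not the plugin's actual types.

```typescript
// Assumed shapes; only content[].text and details.memories are named in this thread.
interface ScoredMemory {
  text: string;
  score: number;
  sources: string[];
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  details: { memories: ScoredMemory[] };
}

function buildRecallResultOptionA(results: ScoredMemory[]): ToolResult {
  // LLM-visible text: no score or source annotations.
  const text = results.map((m, i) => `${i + 1}. ${m.text}`).join("\n");
  // Structured details: unchanged, so existing consumers keep working.
  return { content: [{ type: "text", text }], details: { memories: results } };
}

const result = buildRecallResultOptionA([
  { text: "User prefers dark mode", score: 0.82, sources: ["vector", "BM25", "reranked"] },
]);

// A downstream consumer that reads the score still gets a number, not undefined.
console.log(result.details.memories[0]?.score);
console.log(result.content[0]?.text);
```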

The content[].text cleanup (removing suffixes such as "(82%, vector+BM25+reranked)" from the human-readable string) is unambiguously good and non-breaking. The concern is the change to the details object.

win4r commented Mar 10, 2026

@claude

claude bot commented Mar 10, 2026

Claude Code is working…

I'll analyze this and get back to you.

3 participants