feat(server): add memory health statistics API endpoints #706
mvanhorn wants to merge 5 commits into volcengine:main
Add two new API endpoints for querying aggregate memory health:
- GET /stats/memories - global memory stats (counts by category,
hotness distribution, staleness metrics)
- GET /stats/sessions/{id} - per-session extraction statistics
The StatsAggregator reads from existing VikingDB indexes and the
hotness_score function without introducing new storage.
Includes unit tests with mocked VikingDB backend.
Replace the audio parser stub with a working implementation that:
- Extracts metadata (duration, sample rate, channels, bitrate) via mutagen
- Transcribes speech via Whisper API with timestamped segments
- Builds structured ResourceNode tree with L0/L1/L2 content tiers
- Falls back to metadata-only output when Whisper is unavailable
- Adds mutagen as optional dependency under [audio] extra
- Adds audio_summary prompt template for semantic indexing
- Includes unit tests with mocked Whisper API and mutagen
qin-ctx left a comment
[Bug] (blocking) This PR bundles two unrelated features — memory health stats API (commit 1) and a complete audio parser rewrite with Whisper transcription (commit 2). These have no code dependency and should be separate PRs. Mixing them makes review, rollback, and changelog tracking harder.
Additional findings:
- [Design] (non-blocking) PR description says `GET /sessions/{session_id}/stats` but the actual route is `GET /api/v1/stats/sessions/{session_id}`. Please update the description to match.
- [Design] (non-blocking) `_asr_transcribe` and `_asr_transcribe_with_timestamps` duplicate the OpenAI client creation code (`get_openviking_config()` + `openai.AsyncOpenAI(...)`). Extract to a shared helper. Also, `config.llm.api_key` may not be the correct credential for OpenAI Whisper if the project is configured for a different LLM provider.
- [Suggestion] (non-blocking) `audio_summary.yaml` prompt template is added but never referenced in any code path (dead code).
- [Suggestion] (non-blocking) `_generate_semantic_info` accepts a `viking_fs` parameter that is never used in the method body.
- [Suggestion] (non-blocking) CI `lint / lint` check is failing.
    total_vectors = 0

    for cat in categories:
        records = await self._query_memories_by_category(ctx, cat)
[Bug] (blocking) N+1 query: _query_memories_by_category executes the same Eq("context_type", "memory") query with limit=10000 for each of the 8 categories, then filters by URI prefix in Python. This means 8 identical DB round-trips, each returning up to 10,000 records.
Fetch once and group in memory instead:
    from collections import defaultdict

    all_records = await self._query_all_memories(ctx)
    by_cat = defaultdict(list)
    for r in all_records:
        uri = r.get("uri", "")
        for cat in categories:
            if f"/{cat}/" in uri:
                by_cat[cat].append(r)
                break

        "by_category": by_category,
        "hotness_distribution": hotness_dist,
        "staleness": staleness,
        "total_vectors": total_vectors,
[Bug] (blocking) total_vectors is always identical to total_memories — both are sum(by_category.values()). The PR description shows them as different numbers (1247 vs 8941), implying total_vectors should represent the actual vector embedding count (a memory can have multiple vectors). The current implementation is misleading.
Either compute the real vector count from VikingDB index stats, or remove this field until it can report a meaningful value.
openviking/server/routers/stats.py
Outdated
    try:
        result = await aggregator.get_session_extraction_stats(session_id, service, _ctx)
        return Response(status="ok", result=result)
    except Exception as e:
[Bug] (blocking) Catching bare Exception and returning NOT_FOUND swallows all error types — DB timeouts, permission errors, serialization failures, etc. are all misreported as "session not found".
Distinguish session-not-found from other failures:
    try:
        result = await aggregator.get_session_extraction_stats(session_id, service, _ctx)
        return Response(status="ok", result=result)
    except KeyError:
        return Response(
            status="error",
            error=ErrorInfo(code="NOT_FOUND", message=f"Session not found: {session_id}"),
        )
    except Exception as e:
        logger.error("Failed to get session stats for %s: %s", session_id, e)
        return Response(
            status="error",
            error=ErrorInfo(code="INTERNAL", message="Internal error retrieving session stats"),
        )

(Adjust the specific exception type to match what session.load() actually raises for missing sessions.)
The audio parser feature is unrelated to memory health stats and belongs in its own PR (volcengine#707). Reverts audio.py to pre-rewrite state, removes the unused audio_summary.yaml template and audio parser tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or handling
- Replace per-category _query_memories_by_category with single _query_all_memories call, grouping by category in Python (1 DB round-trip instead of 8)
- Remove misleading total_vectors field (was identical to total_memories). Will add real vector count from VikingDB index stats in a follow-up
- Distinguish KeyError (session not found) from other failures in stats.py endpoint, returning INTERNAL_ERROR for unexpected exceptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed all feedback in 898cc8e and c0d13ad:
Addressed all blocking feedback in 898cc8e and c0d13ad:
Problem Statement
OpenViking has the infrastructure for memory observability (`hotness_score`, `retrieval_stats`, eval recorder) but no API to query aggregate memory health. Operators can't answer basic questions without digging into the database directly:
Proposed Solution
Add two API endpoints:
`GET /stats/memories` - Global memory statistics

    {
      "total_memories": 1247,
      "by_category": {
        "profile": 3,
        "preferences": 42,
        "entities": 186,
        "events": 89,
        "cases": 412,
        "patterns": 67,
        "tools": 298,
        "skills": 150
      },
      "hotness_distribution": { "cold": 312, "warm": 687, "hot": 248 },
      "staleness": {
        "not_accessed_7d": 89,
        "not_accessed_30d": 312,
        "oldest_memory_age_days": 45
      },
      "total_vectors": 8941
    }

`GET /sessions/{session_id}/stats` - Per-session extraction stats

    {
      "session_id": "abc123",
      "total_turns": 5,
      "memories_extracted": 3,
      "contexts_used": 2,
      "skills_used": 1
    }

Supports `?category=cases` query parameter to filter by a single memory category.
Alternatives Considered
Extending the TUI only (#664) - but the TUI isn't programmatically accessible, and automated monitoring needs an API.
Implementation
- `openviking/storage/stats_aggregator.py` - Core `StatsAggregator` class that queries VikingDB for category counts, hotness distribution (cold <0.2, warm 0.2-0.6, hot >0.6), and staleness metrics. Uses the existing `hotness_score()` function from `memory_lifecycle.py`.
- `openviking/server/routers/stats.py` - FastAPI router with two endpoints, following the pattern in `routers/sessions.py`.
- `app.py` and `routers/__init__.py` following existing conventions.
Evidence
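Referring back to the hotness thresholds in the Implementation notes (cold <0.2, warm 0.2-0.6, hot >0.6), the bucketing could look roughly like the sketch below. It is not the PR's code: `bucket_hotness` and `hotness_distribution` are hypothetical names, and counting the exact boundary values 0.2 and 0.6 as "warm" is an assumption, since the description does not say which side they fall on.

```python
from collections import Counter
from typing import Dict, Iterable

COLD_MAX = 0.2  # scores strictly below this are "cold"
HOT_MIN = 0.6   # scores strictly above this are "hot"


def bucket_hotness(score: float) -> str:
    """Map a hotness score to a bucket name (hypothetical helper).

    Assumption: the boundary values 0.2 and 0.6 count as "warm".
    """
    if score < COLD_MAX:
        return "cold"
    if score > HOT_MIN:
        return "hot"
    return "warm"


def hotness_distribution(scores: Iterable[float]) -> Dict[str, int]:
    """Count scores per bucket, always reporting all three buckets."""
    counts = Counter(bucket_hotness(s) for s in scores)
    return {name: counts.get(name, 0) for name in ("cold", "warm", "hot")}
```

Always emitting all three keys keeps the response shape stable for an empty store, which matches the `hotness_distribution` object shown in the example response.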
Test Plan
- `StatsAggregator` with mocked VikingDB (empty store, category counts, hotness buckets, staleness, error handling)
- `_parse_datetime` helper tested for None, datetime objects, ISO strings, invalid input
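For context on the four `_parse_datetime` cases listed above, here is a minimal sketch of a helper with that behavior and how it could feed a staleness calculation. It is not the PR's implementation: `parse_datetime_sketch` and `days_since` are hypothetical names, and treating naive timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def parse_datetime_sketch(value: Any) -> Optional[datetime]:
    """Best-effort conversion to an aware UTC datetime; None if unusable."""
    if value is None:
        return None
    if isinstance(value, datetime):
        dt = value
    elif isinstance(value, str):
        # Older Python versions reject a trailing "Z", so normalise it first.
        text = value[:-1] + "+00:00" if value.endswith("Z") else value
        try:
            dt = datetime.fromisoformat(text)
        except ValueError:
            return None
    else:
        return None
    # Assumption: naive timestamps are treated as UTC.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def days_since(value: Any, now: datetime) -> Optional[float]:
    """Age in days of a timestamp relative to `now`; None if unparseable."""
    dt = parse_datetime_sketch(value)
    if dt is None:
        return None
    return (now - dt).total_seconds() / 86400
```

Returning `None` instead of raising lets a staleness pass skip records with missing or malformed timestamps rather than failing the whole request.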