- New `src/embedding.ts`: Voyage AI client, cosine similarity, `vectorSearch`
  - `embed()` calls POST `https://api.voyageai.com/v1/embeddings`
  - `cosineSimilarity()` pure-JS dot product / magnitude
  - `vectorSearch()` brute-force over knowledge BLOBs (<100 entries)
  - `embedKnowledgeEntry()` fire-and-forget (errors logged, never thrown)
  - `backfillEmbeddings()` batch-embeds entries missing embeddings
  - `checkConfigChange()` detects model/dimension changes and clears stale embeddings for re-embedding on next backfill
- Schema migration v8:
  - `ADD COLUMN embedding BLOB` to knowledge table
  - `CREATE TABLE kv_meta` (key-value store for plugin state)
- Config: `search.embeddings` section (`enabled`, `model`, `dimensions`)
  - Default: disabled, `voyage-code-3`, 1024 dims
  - Requires `VOYAGE_API_KEY` env var
- Hook embedding into `ltm.create()` and `ltm.update()`
  - Fire-and-forget after sync DB write
  - Re-embeds on content change
- Add vector search as additional RRF list in recall tool
  - Same `k:` key prefix as BM25 knowledge — RRF merges, not duplicates
  - Entries found by both BM25 and vector get boosted score
- Startup backfill when embeddings first enabled
- Migration strategy: on startup, compare model+dimensions config fingerprint against stored value — if changed, clear all embeddings and re-embed in background
- 18 new tests: cosine similarity, BLOB round-trip, `vectorSearch`, `isAvailable`, config schema, config change detection
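A minimal sketch of the pure-JS similarity path listed above (cosine as dot product over magnitudes, brute-force `vectorSearch` over decoded BLOBs). The `KnowledgeRow` shape and the exact signatures are illustrative assumptions, not the module's real API:

```typescript
// Row shape assumed for illustration; the real module's types may differ.
interface KnowledgeRow {
  id: string;
  embedding: Float32Array; // decoded from the `embedding BLOB` column
}

// Plain dot-product / magnitude cosine similarity, no SQLite extension needed.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force scan: fine for <100 entries, so no vector index is required.
function vectorSearch(
  query: Float32Array,
  rows: KnowledgeRow[],
  topK = 10,
): { id: string; score: number }[] {
  return rows
    .map((row) => ({ id: row.id, score: cosineSimilarity(query, row.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```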
BYK added a commit that referenced this pull request on Mar 24, 2026:
…embedding BLOBs (#52)

## Problem

Both transform hooks in `src/index.ts` — `experimental.chat.system.transform` and `experimental.chat.messages.transform` — had no try-catch wrapping. Any SQLite error (corruption, busy timeout, schema mismatch) propagated through OpenCode's Plugin.trigger mechanism and surfaced as a 500 "Internal server error", halting the user's session.

Additionally, after adding the `embedding BLOB` column (schema v8), all `SELECT *` queries in `ltm.ts` were unnecessarily loading 4KB of Float32Array data per knowledge entry (~200KB per `forSession()` call) that was immediately discarded.

## Investigation: Embedding/vector search link

The embedding/vector code is **not in the transform hook call path** — `forSession()` uses only FTS5 BM25, not embeddings. The 500 errors were a latent bug (unprotected hooks) that predated the embedding feature. The temporal correlation with the Voyage AI rollout was coincidental — it coincided with the search overhaul (PRs #46-#50).

## Changes

### Error handling (`src/index.ts`, `src/gradient.ts`)

- **system.transform**: Wrap knowledge injection in try-catch. On error: log via `log.error()`, reset `setLtmTokens(0)`, push fallback note directing the LLM to use the recall tool. Track degraded sessions to avoid busting the provider's read-token cache on recovery — if the conversation is longer than the LTM content, keep the fallback note.
- **messages.transform**: Wrap entire transform path in try-catch. On error: log and leave `output.messages` unmodified (layer 0 passthrough).
- Export `getLastTransformEstimate()` from gradient.ts for the cache trade-off calculation.

### Performance (`src/ltm.ts`)

- Define `KNOWLEDGE_COLS` / `KNOWLEDGE_COLS_K` constants listing exactly the 11 columns in `KnowledgeEntry`, excluding `embedding`.
- Replace all 10 `SELECT *` / `SELECT k.*` queries across 8 functions.

### Tests (`test/index.test.ts`)

4 new tests:

1. system.transform survives DB error → fallback note + `getLtmTokens() === 0`
2. messages.transform survives DB error → messages unchanged
3. LTM recovery skipped on long session (preserves prompt cache)
4. LTM recovery proceeds on short session (cheap cache bust)
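The error-handling shape described for `system.transform` can be sketched as below. `loadKnowledge`, `log`, and `setLtmTokens` are simplified stand-ins for the plugin's real helpers, not its actual API; the point is only that a DB failure degrades the hook instead of surfacing a 500:

```typescript
type SystemOutput = { system: string[] };

// Stand-ins for the plugin's logger and token accounting (assumed names).
const log = { error: (msg: string, err: unknown) => console.error(msg, err) };
let ltmTokens = 0;
const setLtmTokens = (n: number) => { ltmTokens = n; };

// Simulated knowledge loader that hits a SQLite error.
function loadKnowledge(): string {
  throw new Error("SQLITE_BUSY");
}

function systemTransform(output: SystemOutput): void {
  try {
    output.system.push(loadKnowledge());
  } catch (err) {
    // Layer 0 behavior: log, zero the token estimate, and leave a fallback
    // note so the model knows to reach for the recall tool instead.
    log.error("ltm: knowledge injection failed", err);
    setLtmTokens(0);
    output.system.push("(long-term memory unavailable; use the recall tool)");
  }
}
```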
# Phase 5: Vector embedding search (depends on #49)
Adds semantic vector search using Voyage AI's `voyage-code-3` model, layered on top of the existing BM25 + RRF fusion pipeline.

### How it works
### Why Voyage AI?

- `voyage-code-3` is code-optimized — best-in-class for technical/code text retrieval
- Auth via the `VOYAGE_API_KEY` env var, OpenAI-compatible response format

### Why pure-JS cosine, not libSQL/sqlite-vec?
Tested both. `bun:sqlite` is standard SQLite — no `vector_distance_cos()` or DiskANN. The `@libsql/client-wasm` WASM package also lacks vector functions. Native `libsql` works but adds a ~15MB native dependency — overkill for <100 entries where brute-force cosine takes microseconds.

### Graceful degradation
- Requires `search.embeddings.enabled` (default: false) + `VOYAGE_API_KEY` env var
- `embedKnowledgeEntry()` is fire-and-forget — embedding failures are logged, never thrown

### RRF integration
Vector results use the same `k:` key prefix as BM25 knowledge results → RRF merges rather than duplicates. An entry found by both BM25 and vector search gets a higher combined RRF score.

### Config
```json
{ "search": { "embeddings": { "enabled": true, "model": "voyage-code-3", "dimensions": 1024 } } }
```

### Test coverage
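The PR's test list includes a BLOB round-trip case. As an illustration only (the section here is cut off), a self-contained sketch of such a round-trip, assuming embeddings are stored as the raw bytes of a `Float32Array` — an assumption about the codec, not the plugin's actual one:

```typescript
// Encode: copy the array's backing bytes so the BLOB is independent of the
// source Float32Array (assumed storage format for the `embedding BLOB` column).
function encodeEmbedding(vec: Float32Array): Uint8Array {
  return new Uint8Array(vec.buffer, vec.byteOffset, vec.byteLength).slice();
}

// Decode: copy first so the view never depends on the BLOB's alignment/offset.
function decodeEmbedding(blob: Uint8Array): Float32Array {
  const copy = blob.slice();
  return new Float32Array(copy.buffer, copy.byteOffset, copy.byteLength / 4);
}
```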