
feat: BYOK embedding provider support (OpenAI, Voyage, Gemini)#39

Merged
iamvirul merged 6 commits into VecGrep:main from Kavirubc:feat/byok-embedding-providers on Mar 4, 2026
Conversation

@Kavirubc (Contributor) commented Mar 4, 2026

Type of Change

  • ✨ New feature

Description

Closes #8.

Adds Bring-Your-Own-Key (BYOK) support for cloud embedding providers so users can index and search codebases with OpenAI, Voyage AI, or Google Gemini embeddings in addition to the default local model.

Key design decisions (from issue discussion):

  • API keys via environment variables only — no config files, no key params
  • Per-project provider lock — once a project is indexed with a provider, switching requires force=True
  • watch=True is blocked for cloud providers to prevent unbounded API costs
  • Cloud SDKs are optional dependencies — base install stays lightweight

Provider Table

Provider          Model                              Dims   Env Var
local (default)   all-MiniLM-L6-v2-code-search-512   384    None
openai            text-embedding-3-small             1536   VECGREP_OPENAI_KEY
voyage            voyage-code-2                      1024   VECGREP_VOYAGE_KEY
gemini            gemini-embedding-001               3072   VECGREP_GEMINI_KEY

Note: gemini-embedding-001 returns 3072 dims (verified against live API). The google-genai SDK is used (the older google-generativeai package is deprecated).
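
For reference, a dimension check along these lines reproduces the 3072-dim result. This is a hedged sketch using the google-genai client, not code from the repo; the input string is arbitrary and VECGREP_GEMINI_KEY is assumed to be set:

# Sketch only: verifies gemini-embedding-001 dims via the google-genai SDK.
import os
from google import genai

client = genai.Client(api_key=os.environ["VECGREP_GEMINI_KEY"])
resp = client.models.embed_content(
    model="gemini-embedding-001",
    contents="def authenticate(user): ...",
)
print(len(resp.embeddings[0].values))  # expected: 3072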

How to Use

Install with cloud extras:

pip install 'vecgrep[openai]'    # OpenAI only
pip install 'vecgrep[voyage]'    # Voyage AI only
pip install 'vecgrep[gemini]'    # Gemini only
pip install 'vecgrep[cloud]'     # All three

Set the API key env var, then index:

# OpenAI
export VECGREP_OPENAI_KEY=sk-...
index_codebase(path="/your/project", provider="openai")

# Voyage AI
export VECGREP_VOYAGE_KEY=pa-...
index_codebase(path="/your/project", provider="voyage")

# Gemini
export VECGREP_GEMINI_KEY=AIza...
index_codebase(path="/your/project", provider="gemini")

Search uses the stored provider automatically — no need to specify it again:

search_code(query="authentication logic", path="/your/project")

Switching providers requires a full re-index with force=True:

index_codebase(path="/your/project", provider="openai", force=True)

Live watch is local-only (cloud providers will return a clear error):

# This is blocked — would cause unbounded API costs
index_codebase(path="/your/project", provider="openai", watch=True)
# → Error: watch=True is not supported with provider 'openai'...

Changes Made

  • src/vecgrep/embedder.py — Strategy pattern refactor: EmbeddingProvider ABC + LocalProvider, OpenAIProvider, VoyageProvider, GeminiProvider implementations; PROVIDER_REGISTRY + get_provider() factory; backward-compatible embed() free function preserved (see the sketch after this list)
  • src/vecgrep/store.py — Dynamic vector dims: _chunks_schema(dims) function, VectorStore(dims=384), _get_meta/_set_meta helpers, set_provider_meta/get_provider_meta, drop_and_recreate_chunks, status() now includes provider/model/dims
  • src/vecgrep/server.py — Provider wiring: _resolve_provider() enforces per-project lock; _do_index gains provider param; cloud+watch guard; search_code reads stored provider for query embedding; get_index_status shows provider/model/dims; LiveSyncHandler skips non-local providers
  • pyproject.toml — Optional dep groups: openai, voyage, gemini, cloud
  • uv.lock — Updated to include all optional dep trees
  • tests/test_providers.py (new) — Registry, LocalProvider shape/norm, cloud providers raise without keys/packages, mocked API calls for all 3 providers
  • tests/test_store.py — Dynamic dims, drop_and_recreate_chunks, provider meta persistence, meta helpers
  • tests/test_server.py — Provider locking, cloud+watch guard, status fields, LiveSync skip for cloud
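
A minimal sketch of how the embedder.py strategy pattern could be laid out, as referenced in the first change above. EmbeddingProvider, LocalProvider, PROVIDER_REGISTRY, and get_provider() are names from this PR; the attribute names, signatures, and error text are illustrative assumptions:

# Hedged sketch of the embedder.py layout; exact signatures may differ from the PR.
from abc import ABC, abstractmethod

import numpy as np


class EmbeddingProvider(ABC):
    name: str    # registry key, e.g. "local", "openai"
    model: str   # model identifier persisted in index metadata
    dims: int    # embedding dimensionality, used to size the vector store

    @abstractmethod
    def embed(self, texts: list[str]) -> np.ndarray:
        """Return a (len(texts), dims) array of L2-normalized embeddings."""


class LocalProvider(EmbeddingProvider):
    name, model, dims = "local", "all-MiniLM-L6-v2-code-search-512", 384

    def embed(self, texts: list[str]) -> np.ndarray:
        ...  # existing ONNX/torch path, wrapped as instance state


PROVIDER_REGISTRY: dict[str, type[EmbeddingProvider]] = {
    "local": LocalProvider,
    # "openai": OpenAIProvider, "voyage": VoyageProvider, "gemini": GeminiProvider
}


def get_provider(name: str) -> EmbeddingProvider:
    try:
        return PROVIDER_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"Unknown provider '{name}'; valid: {sorted(PROVIDER_REGISTRY)}")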

Testing

  • 191 tests pass (pytest tests/)
  • Lint clean (ruff check src/ tests/)
  • Pre-commit hook passes
  • CI pipeline simulated locally (uv sync --extra dev → lint → pytest)
  • Gemini live API verified: gemini-embedding-001 returns shape (1, 3072), norm 1.0000
  • index_codebase(path, provider="local") works as before (backward-compatible)
  • index_codebase(path, provider="openai") without key → clear RuntimeError
  • index_codebase(path, provider="openai", watch=True) → blocked with message
  • Switch provider without force=True → clear error about provider lock

Kavirubc added 6 commits March 4, 2026 15:07
Add EmbeddingProvider ABC and four concrete implementations:
- LocalProvider: wraps existing ONNX/torch logic as instance state
- OpenAIProvider: text-embedding-3-small (1536 dims), VECGREP_OPENAI_KEY
- VoyageProvider: voyage-code-2 (1024 dims), VECGREP_VOYAGE_KEY
- GeminiProvider: gemini-embedding-001 (768 dims), VECGREP_GEMINI_KEY

Add PROVIDER_REGISTRY dict and get_provider(name) factory.
Keep backward-compatible embed() free function (delegates to LocalProvider).
All cloud providers lazy-import their SDK and raise RuntimeError with
install instructions when the package or API key is missing.
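
A rough illustration of that lazy-import plus missing-key behaviour for one provider. The model, dims, and env var match the table in the PR description; the message wording and normalization details are assumptions, and the EmbeddingProvider base class is the one sketched earlier:

# Sketch only: error text and normalization are illustrative, not the repo's code.
import os

import numpy as np


class OpenAIProvider(EmbeddingProvider):
    name, model, dims = "openai", "text-embedding-3-small", 1536

    def __init__(self) -> None:
        key = os.environ.get("VECGREP_OPENAI_KEY")
        if not key:
            raise RuntimeError(
                "Provider 'openai' requires the VECGREP_OPENAI_KEY environment variable."
            )
        try:
            from openai import OpenAI  # lazy import: only needed for this provider
        except ImportError as exc:
            raise RuntimeError(
                "Provider 'openai' needs the optional dependency: pip install 'vecgrep[openai]'"
            ) from exc
        self._client = OpenAI(api_key=key)

    def embed(self, texts: list[str]) -> np.ndarray:
        resp = self._client.embeddings.create(model=self.model, input=texts)
        vecs = np.asarray([item.embedding for item in resp.data], dtype=np.float32)
        return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # L2-normalize
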
- Convert hardcoded schema to _chunks_schema(dims) function; keep
  module-level schema = _chunks_schema(384) for backward compat
- VectorStore.__init__ now accepts dims: int = 384; reads stored dims
  from meta table on open and uses them over the param
- Add _get_meta(key) / _set_meta(key, value) helpers
- Add set_provider_meta(provider, model, dims) / get_provider_meta()
- Add drop_and_recreate_chunks(dims) for force re-index with dim change
- Refactor touch_last_indexed() to use _set_meta
- status() now includes provider, model, dims fields
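
Schematically, the meta helpers described above might compose like this. This is a toy sketch: an in-memory dict stands in for the real on-disk meta table, and the key names and return shape are assumptions:

# Illustrative sketch only; the actual storage backend and key names may differ.
class VectorStore:
    def __init__(self, path: str, dims: int = 384) -> None:
        self._meta: dict[str, str] = {}              # stand-in for the persisted meta table
        stored = self._get_meta("dims")
        self.dims = int(stored) if stored else dims  # stored dims win over the param

    def _get_meta(self, key: str) -> str | None:
        return self._meta.get(key)

    def _set_meta(self, key: str, value: str) -> None:
        self._meta[key] = value

    def set_provider_meta(self, provider: str, model: str, dims: int) -> None:
        self._set_meta("provider", provider)
        self._set_meta("model", model)
        self._set_meta("dims", str(dims))

    def get_provider_meta(self) -> dict[str, object]:
        dims = self._get_meta("dims")
        return {
            "provider": self._get_meta("provider"),
            "model": self._get_meta("model"),
            "dims": int(dims) if dims else None,
        }
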
- Import get_provider, EmbeddingProvider from embedder
- Add _resolve_provider(path, requested, force) helper that reads stored
  provider from meta, enforces per-project lock, and errors on mismatch
  without force=True
- _get_store(path, dims) now passes dims through to VectorStore
- _do_index gains provider: str | None = None param:
  - Resolves provider early before acquiring index lock
  - Blocks watch=True with non-local provider (clear error message)
  - Passes emb_provider.dims to _get_store
  - Drops/recreates chunks table when force=True + dims change
  - Calls emb_provider.embed() instead of free-function embed()
  - Persists provider meta after successful index
- index_codebase MCP tool exposes provider param with full docstring
- search_code reads stored provider and uses matching embedder for query
- get_index_status shows provider, model, dims in output
- LiveSyncHandler._process_file reads stored provider and skips if non-local
- _merkle_sync passes provider=None to _do_index (auto-reads stored)
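
A condensed sketch of the provider lock and the cloud+watch guard described above. The helper names, error wording, and the stand-in meta reader are assumptions, not quotes from server.py:

# Sketch only: _get_stored_provider and _guard_watch are hypothetical stand-ins.
def _get_stored_provider(path: str) -> str | None:
    """Stand-in for reading the stored 'provider' key from the index meta table."""
    return None


def _resolve_provider(path: str, requested: str | None, force: bool) -> str:
    stored = _get_stored_provider(path)     # None for a fresh or pre-BYOK index
    if requested is None:
        return stored or "local"            # auto-read the stored provider
    if stored is not None and requested != stored and not force:
        raise RuntimeError(
            f"Index at {path} was built with provider '{stored}'; "
            f"pass force=True to re-index with '{requested}'."
        )
    return requested


def _guard_watch(provider: str, watch: bool) -> None:
    # Cloud providers are blocked from live watch to avoid unbounded API costs.
    if watch and provider != "local":
        raise RuntimeError(f"watch=True is not supported with provider '{provider}'")
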
Add openai, voyage, gemini, and cloud extras so users can install only
what they need: pip install 'vecgrep[openai]', 'vecgrep[voyage]',
'vecgrep[gemini]', or 'vecgrep[cloud]' for all three at once.
Base install remains lightweight with no cloud SDK dependencies.
- Switch GeminiProvider from deprecated google-generativeai to google-genai SDK
- Correct gemini-embedding-001 dims from 768 → 3072 (actual API output)
- Update optional dep: google-generativeai → google-genai>=1.0
- Add tests/test_providers.py: registry, LocalProvider embed/norm,
  cloud providers raise without keys, mocked API calls for all 3 providers
- Extend tests/test_store.py: dynamic dims, drop_and_recreate_chunks,
  provider meta persistence, _get_meta/_set_meta helpers
- Extend tests/test_server.py: provider locking (switch without force=error),
  cloud+watch guard, get_index_status provider fields, LiveSync skip for
  cloud providers, force=True allows provider switch
- Fix embedder: catch ValueError when custom model already registered
- Fix _resolve_provider: treat raw_stored=None (fresh index) as unset
  so any provider is allowed without force
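
One of those cases, sketched as a standalone pytest test. The test body is illustrative and not taken from tests/test_providers.py; only the import path and the missing-key behaviour come from this PR:

# Sketch only: assumes get_provider("openai") raises RuntimeError when the key is absent.
import pytest

from vecgrep.embedder import get_provider


def test_openai_provider_requires_key(monkeypatch):
    # Without VECGREP_OPENAI_KEY set, constructing the provider should fail loudly.
    monkeypatch.delenv("VECGREP_OPENAI_KEY", raising=False)
    with pytest.raises(RuntimeError):
        get_provider("openai")
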
…tional deps

- Remove unused `embed` import from server.py (ruff F401)
- Fix _resolve_provider: read raw meta key directly so a fresh/pre-BYOK
  index (stored provider = None) allows any provider without requiring
  force=True; only enforce lock when provider was explicitly stored
- Regenerate uv.lock to include google-genai, openai, voyageai optional
  dependency trees so cloud extras install reproducibly
@Kavirubc Kavirubc requested a review from iamvirul as a code owner March 4, 2026 10:11

codecov bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 97.77778% with 6 lines in your changes missing coverage. Please review.

Files with missing lines    Patch %   Lines
src/vecgrep/embedder.py     97.86%    4 Missing ⚠️
src/vecgrep/server.py       95.74%    2 Missing ⚠️


@iamvirul iamvirul merged commit ddae6a1 into VecGrep:main Mar 4, 2026
2 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature] Implementing BYOK for Embedding model and add support for OpenAI and Gemini Models