Add Gemini provider, custom URLs, local providers, and extra headers #14

Open
dpage wants to merge 12 commits into main from feature/llm-provider-enhancements

@dpage dpage commented Apr 9, 2026

Summary

  • Shared provider infrastructure: Extract duplicated code from OpenAI/Voyage into provider_common.c (curl, JSON, API key loading, response parsing). Slims existing providers from ~516 lines each to ~163-187 lines.
  • Google Gemini provider: New provider_gemini.c with x-goog-api-key auth, model-in-URL-path, and native batch support via batchEmbedContents.
  • Custom base URLs: api_url GUC now defaults to empty; each provider supplies its own default URL. Enables pointing any provider at a custom endpoint.
  • OpenAI-compatible local providers: API key is optional when a custom URL is set, enabling LM Studio, Docker Model Runner, llama.cpp, and EXO.
  • Extra HTTP headers: New extra_headers GUC (semicolon-separated key: value pairs) for proxy servers like Portkey.
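The extra_headers format described above (semicolon-separated key: value pairs) could be split along these lines. This is a minimal sketch rather than the code in this PR: the function name `split_extra_headers_sketch` is hypothetical, it uses `malloc` where an extension would normally use PostgreSQL memory contexts, and it returns plain strings rather than appending to a curl header list.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PR): split "K1: v1; K2: v2" into
 * individual, whitespace-trimmed header lines. Stores up to max_out
 * malloc'd strings in out[] and returns how many were stored.
 * Empty entries and entries without a ':' are skipped.
 */
int
split_extra_headers_sketch(const char *spec, char **out, int max_out)
{
    int         count = 0;
    const char *p = spec;

    assert(spec != NULL && out != NULL);

    while (*p != '\0' && count < max_out)
    {
        const char *end = strchr(p, ';');
        const char *start = p;
        const char *stop;
        size_t      len;

        if (end == NULL)
            end = p + strlen(p);    /* last entry has no trailing ';' */
        stop = end;

        /* Trim leading and trailing whitespace around the entry. */
        while (start < stop && isspace((unsigned char) *start))
            start++;
        while (stop > start && isspace((unsigned char) stop[-1]))
            stop--;

        len = (size_t) (stop - start);
        if (len > 0 && memchr(start, ':', len) != NULL)
        {
            char *hdr = malloc(len + 1);

            if (hdr == NULL)
                break;              /* out of memory: return what we have */
            memcpy(hdr, start, len);
            hdr[len] = '\0';
            out[count++] = hdr;
        }

        /* Step past the separator, or stop at the terminating NUL. */
        p = (*end == ';') ? end + 1 : end;
    }

    return count;
}
```

One consequence of a semicolon separator is that a header value containing a literal `;` would be split in two by this sketch; whether the PR handles that case is not stated here.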

Test plan

  • make clean && make builds cleanly with zero warnings
  • make install && make installcheck passes all existing tests plus new provider/GUC test cases
  • Verify Gemini embeddings work with a real API key
  • Verify OpenAI-compatible local provider (e.g., LM Studio) works without an API key
  • Verify extra headers are sent correctly (e.g., via Portkey or request inspection)

🤖 Generated with Claude Code

dpage and others added 9 commits April 9, 2026 08:54
Extract common utilities used across embedding providers into
provider_common.h/.c: curl write callback, API key loading with
tilde expansion and permission checks, JSON string escaping,
OpenAI-format request building, shared HTTP POST via curl, and
OpenAI-format embedding response parsing. Also adds the
pgedge_vectorizer_extra_headers variable stub so the new code
compiles ahead of the full GUC definition in a later task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
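For readers unfamiliar with what the JSON string escaping utility mentioned in this commit has to cover, here is a minimal sketch. The function name `json_escape_sketch` is hypothetical and `malloc` stands in for whatever allocator provider_common.c actually uses; the real helper may differ in signature and in which characters it handles.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: return a malloc'd copy of 'in' that is safe to
 * place between double quotes in a JSON document. Returns NULL on
 * allocation failure. The caller frees the result.
 */
char *
json_escape_sketch(const char *in)
{
    const unsigned char *p;
    char       *out;
    char       *w;

    assert(in != NULL);

    /* Worst case: every byte expands to a 6-byte \u00XX sequence. */
    out = malloc(strlen(in) * 6 + 1);
    if (out == NULL)
        return NULL;

    w = out;
    for (p = (const unsigned char *) in; *p != '\0'; p++)
    {
        switch (*p)
        {
            case '"':  *w++ = '\\'; *w++ = '"';  break;
            case '\\': *w++ = '\\'; *w++ = '\\'; break;
            case '\n': *w++ = '\\'; *w++ = 'n';  break;
            case '\r': *w++ = '\\'; *w++ = 'r';  break;
            case '\t': *w++ = '\\'; *w++ = 't';  break;
            default:
                if (*p < 0x20)
                    /* Remaining control characters use the \u form. */
                    w += sprintf(w, "\\u%04x", (unsigned int) *p);
                else
                    *w++ = (char) *p;   /* includes UTF-8 bytes >= 0x80 */
                break;
        }
    }
    *w = '\0';

    return out;
}
```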
… provider list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace gemini_generate() with a simple delegation to
gemini_generate_batch() with count=1, matching the pattern used
by the OpenAI and Voyage providers. Remove the now-unused
parse_gemini_embedding_response() function and its forward
declaration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
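The delegation pattern this commit describes (a single-item call forwarding to the batch call with count=1) might look roughly like the following. The names `demo_generate` and `demo_generate_batch`, their signatures, and the stub batch body are all assumptions made for illustration; the actual gemini_generate() and gemini_generate_batch() signatures are not shown in this conversation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DIM 3              /* hypothetical embedding dimension */

/*
 * Stand-in for a provider's batch entry point. A real provider would
 * issue one HTTP request here; this stub just fills each vector with
 * the length of the corresponding text.
 */
float **
demo_generate_batch(const char **texts, int count, int *dim_out)
{
    float     **vecs;
    int         i;
    int         j;

    assert(texts != NULL && count > 0 && dim_out != NULL);

    vecs = malloc(sizeof(float *) * (size_t) count);
    if (vecs == NULL)
        return NULL;

    for (i = 0; i < count; i++)
    {
        vecs[i] = malloc(sizeof(float) * DEMO_DIM);
        if (vecs[i] == NULL)
        {
            while (i-- > 0)
                free(vecs[i]);
            free(vecs);
            return NULL;
        }
        for (j = 0; j < DEMO_DIM; j++)
            vecs[i][j] = (float) strlen(texts[i]);
    }

    *dim_out = DEMO_DIM;
    return vecs;
}

/* Single-item entry point: delegate to the batch path with count = 1. */
float *
demo_generate(const char *text, int *dim_out)
{
    const char *texts[1];
    float     **batch;
    float      *result;

    texts[0] = text;
    batch = demo_generate_batch(texts, 1, dim_out);
    if (batch == NULL)
        return NULL;

    result = batch[0];          /* keep the only vector */
    free(batch);                /* release just the outer array */
    return result;
}
```

Routing both entry points through one code path means there is only one request builder and one response parser to maintain, which is consistent with the removal of the separate single-embedding parser noted in the commit message.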
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…custom URLs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
codacy-production bot commented Apr 9, 2026

Up to standards ✅

🟢 Issues 2 high · 12 medium

Results:
14 new issues

Category: Results
  • ErrorProne: 2 high (2 false positives)
  • Complexity: 12 medium

🟢 Metrics 110 complexity · -16 duplication

Metric: Results
  • Complexity: 110
  • Duplication: -16

dpage and others added 2 commits April 9, 2026 12:55
Column header widths for Gemini URL, extra_headers, and local
provider URL were too wide in the expected output file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract provider_count_array_dimensions() for counting JSON array elements
- Extract provider_parse_float_array() for parsing JSON float arrays
- Extract provider_append_extra_headers() from provider_do_curl_request()
- Extract provider_free_embeddings() for error cleanup
- Add nosemgrep/flawfinder annotations for false positive suppressions
- Refactor all three response parsers to use shared helpers
- Simplify provider_do_curl_request() and provider_load_api_key()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
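As an illustration of the kind of work a helper such as provider_parse_float_array() has to do, here is a minimal two-pass sketch that sizes and then parses a flat JSON array of numbers. The function name `parse_float_array_sketch` and its signature are hypothetical, not the PR's; it relies on `strtod`, which is locale-dependent and more lenient than the JSON number grammar (it accepts forms like `inf` or hex floats).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse a flat JSON array such as "[0.1, -2, 3e4]"
 * into a malloc'd float array. Stores the element count in *n_out.
 * Returns NULL on malformed input or allocation failure.
 */
float *
parse_float_array_sketch(const char *json, int *n_out)
{
    const char *p;
    char       *end;
    float      *vals;
    int         cap = 1;
    int         n = 0;

    assert(json != NULL && n_out != NULL);

    if (*json != '[')
        return NULL;

    /* Pass 1: commas before the closing bracket bound the element count. */
    for (p = json + 1; *p != '\0' && *p != ']'; p++)
    {
        if (*p == ',')
            cap++;
    }
    if (*p != ']')
        return NULL;            /* unterminated array */

    vals = malloc(sizeof(float) * (size_t) cap);
    if (vals == NULL)
        return NULL;

    /* Pass 2: convert each element. */
    p = json + 1;
    while (isspace((unsigned char) *p))
        p++;

    if (*p != ']')              /* non-empty array */
    {
        for (;;)
        {
            double      d = strtod(p, &end);

            if (end == p || n >= cap)
            {
                free(vals);     /* not a number, or more items than sized */
                return NULL;
            }
            vals[n++] = (float) d;

            p = end;
            while (isspace((unsigned char) *p))
                p++;
            if (*p != ',')
                break;
            p++;                /* skip the comma and parse the next item */
        }
        if (*p != ']')
        {
            free(vals);
            return NULL;
        }
    }

    *n_out = n;
    return vals;
}
```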
dpage commented Apr 9, 2026

Note that the macOS CI failure is a separate, unrelated issue.

@dpage dpage marked this pull request as ready for review April 9, 2026 18:55
@dpage dpage requested a review from bonesmoses April 9, 2026 18:56
coderabbitai bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 247624e2-df0b-43b3-8cce-c844d34d12fb

📥 Commits

Reviewing files that changed from the base of the PR and between 125396c and 3f3d5e8.

📒 Files selected for processing (1)
  • TODO.md
💤 Files with no reviewable changes (1)
  • TODO.md

📝 Walkthrough

Adds a new Gemini embedding provider and registers it in the provider registry. Introduces shared provider utilities via src/provider_common.c and src/provider_common.h, and refactors OpenAI, Voyage, and Ollama providers to use those helpers. Adds pgedge_vectorizer.extra_headers GUC and changes pgedge_vectorizer.api_url default to empty. Declares pgedge_vectorizer_extra_headers and GeminiProvider in headers, updates Makefile build objects, expands tests (test/sql/providers.sql), updates docs for providers and configuration, and removes the LLM section from TODO.md.
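With pgedge_vectorizer.api_url now defaulting to empty, each provider has to fall back to its own base URL, and the PR summary notes that the API key becomes optional once a custom URL is set. A minimal sketch of that decision logic follows, together with a Gemini-style URL that carries the model in the path. All function names are hypothetical, the example URLs in the usage note are placeholders, and the `/models/<model>:batchEmbedContents` layout is an assumption inferred from the PR description rather than copied from provider_gemini.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Use the configured URL if one is set, else the provider's default. */
const char *
effective_api_url_sketch(const char *configured_url,
                         const char *provider_default)
{
    assert(provider_default != NULL);

    if (configured_url == NULL || configured_url[0] == '\0')
        return provider_default;
    return configured_url;
}

/*
 * Assumed rule, per the PR summary: an API key is only mandatory when
 * no custom URL has been configured.
 */
bool
api_key_required_sketch(const char *configured_url)
{
    return configured_url == NULL || configured_url[0] == '\0';
}

/*
 * Build a batch-embedding URL with the model name in the path.
 * Returns 0 on success, -1 if the result would not fit in buf.
 */
int
build_gemini_batch_url_sketch(char *buf, size_t buflen,
                              const char *base_url, const char *model)
{
    int         n;

    assert(buf != NULL && base_url != NULL && model != NULL);

    n = snprintf(buf, buflen, "%s/models/%s:batchEmbedContents",
                 base_url, model);
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

For example, `effective_api_url_sketch("", "https://default.example/v1")` yields the provider default, while passing `"http://localhost:1234/v1"` as the configured URL yields that local endpoint and makes `api_key_required_sketch` return false.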

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name: Status. Explanation
  • Title check: ✅ Passed. The title clearly summarizes the main changes: addition of Gemini provider, custom URL support, local provider compatibility, and extra headers functionality.
  • Description check: ✅ Passed. The description is well-structured and directly related to the changeset, detailing shared provider infrastructure refactoring, the new Gemini provider, custom base URLs, local provider support, and extra headers, all of which are reflected in the code changes.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
