
feat(gcp): enable GCP auth support — Redis cache URL + DB credentials#786

Merged
lilyz-ai merged 6 commits into main from lilyzhu/gcp-auth-support on Mar 27, 2026

Conversation


@lilyz-ai lilyz-ai commented Mar 26, 2026

Summary

  • Adds cache_redis_gcp_url field to HostedModelInferenceServiceConfig for GCP Memorystore — fixes cache_redis_url falling through to AWS path on GCP
  • Adds GCP branch in get_engine_url (db/base.py) — reads DB credentials from env vars (DB_HOST, DB_USER, DB_PASSWORD, DB_PORT, DB_NAME) via K8s secret, fixing NoCredentialsError during alembic migrations on GCP
  • Updates cloud-matrix.md docs to cover both fixes
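A minimal sketch of the resolution order the summary describes (the field and branch names come from the PR; the surrounding class, defaults, and the AWS/Azure handling are simplified stand-ins, not the real implementation):

```python
# Hypothetical, simplified sketch of cache_redis_url resolution:
# on-prem override first, then a cloud_provider dispatch with the new GCP branch.
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostedModelInferenceServiceConfig:
    cloud_provider: str
    cache_redis_onprem_url: Optional[str] = None
    cache_redis_aws_url: Optional[str] = None
    cache_redis_gcp_url: Optional[str] = None  # new field added by this PR

    @property
    def cache_redis_url(self) -> str:
        # On-prem override wins regardless of provider (existing behavior).
        if self.cache_redis_onprem_url:
            return self.cache_redis_onprem_url
        if self.cloud_provider == "gcp":
            # New GCP branch: require the dedicated Memorystore URL instead of
            # falling through to the AWS path below.
            assert self.cache_redis_gcp_url, (
                "cache_redis_gcp_url must be set when cloud_provider is gcp"
            )
            return self.cache_redis_gcp_url
        # AWS/Azure paths elided in this sketch; previously GCP fell through here.
        assert self.cache_redis_aws_url, "cache_redis_aws_url must be set"
        return self.cache_redis_aws_url


config = HostedModelInferenceServiceConfig(
    cloud_provider="gcp",
    cache_redis_gcp_url="redis://10.0.0.5:6379/0",
)
print(config.cache_redis_url)  # → redis://10.0.0.5:6379/0
```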

Context

GCP support was ~95% complete (GCS storage, GAR registry, Redis queues all implemented). Two gaps remained:

  1. Redis cache: cache_redis_url had no GCP branch — operators had to use cache_redis_onprem_url as a workaround
  2. DB migrations: get_engine_url fell through to the AWS Secrets Manager path on GCP, causing NoCredentialsError during alembic DB migrations (e.g. botocore.exceptions.NoCredentialsError: Unable to locate credentials)
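The second gap can be sketched as follows (the env-var names are from the PR; the function shape, defaults, and the read-only fallback are simplified assumptions, not the real get_engine_url):

```python
# Hypothetical sketch: on GCP, build the engine URL from env vars injected by a
# Kubernetes secret instead of calling AWS Secrets Manager.
import os


def get_engine_url(cloud_provider: str, read_only: bool = False) -> str:
    if cloud_provider == "gcp":
        # Fall back to DB_HOST when DB_HOST_RO is unset (mirroring the on-prem
        # branch), so read-only callers never end up with host=None.
        host = (
            os.environ.get("DB_HOST_RO") or os.environ.get("DB_HOST", "localhost")
            if read_only
            else os.environ.get("DB_HOST", "localhost")
        )
        user = os.environ.get("DB_USER", "postgres")
        password = os.environ.get("DB_PASSWORD", "")
        port = os.environ.get("DB_PORT", "5432")
        name = os.environ.get("DB_NAME", "postgres")
        return f"postgresql://{user}:{password}@{host}:{port}/{name}"
    raise NotImplementedError("AWS/Azure/on-prem paths elided in this sketch")


os.environ["DB_HOST"] = "10.1.2.3"
os.environ["DB_USER"] = "llm_engine"
os.environ["DB_PASSWORD"] = "s3cret"
os.environ["DB_PORT"] = "5432"
os.environ["DB_NAME"] = "llm_engine"
print(get_engine_url("gcp"))
# → postgresql://llm_engine:s3cret@10.1.2.3:5432/llm_engine
```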

Test plan

  • test_gcp_cache_redis_url_returns_gcp_url — verifies URL returned correctly when cache_redis_gcp_url is set
  • test_gcp_cache_redis_url_raises_when_not_set — verifies clear AssertionError with helpful message when field is missing on GCP
  • Existing test_gcp_provider_selects_gcp_implementations continues to pass
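The first two tests might be shaped roughly like this (test names are from the PR; the config object here is a self-contained stand-in that only models the behavior under test, not the real HostedModelInferenceServiceConfig):

```python
# Hypothetical, self-contained shape of the two new tests.
class FakeGcpConfig:
    def __init__(self, cache_redis_gcp_url=None):
        self.cloud_provider = "gcp"
        self.cache_redis_gcp_url = cache_redis_gcp_url

    @property
    def cache_redis_url(self):
        assert self.cache_redis_gcp_url, (
            "cache_redis_gcp_url must be set when cloud_provider is gcp"
        )
        return self.cache_redis_gcp_url


def test_gcp_cache_redis_url_returns_gcp_url():
    config = FakeGcpConfig(cache_redis_gcp_url="redis://10.0.0.5:6379/0")
    assert config.cache_redis_url == "redis://10.0.0.5:6379/0"


def test_gcp_cache_redis_url_raises_when_not_set():
    # Expect a clear AssertionError naming the missing field.
    try:
        _ = FakeGcpConfig().cache_redis_url
    except AssertionError as e:
        assert "cache_redis_gcp_url" in str(e)
    else:
        raise RuntimeError("expected AssertionError")


test_gcp_cache_redis_url_returns_gcp_url()
test_gcp_cache_redis_url_raises_when_not_set()
```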

🤖 Generated with Claude Code

Greptile Summary

Closes two GCP support gaps: adds a dedicated cache_redis_gcp_url config field for GCP Memorystore and adds a GCP branch in get_engine_url to read DB credentials from env vars (via K8s secrets) instead of falling through to AWS Secrets Manager. Also includes a minor shell script improvement to allow version-only Docker image tags.

  • config.py: New cache_redis_gcp_url field + GCP branch in cache_redis_url property — logic placement is correct (after on-prem, before AWS)
  • db/base.py: GCP branch reads DB credentials from env vars, mirroring the on-prem approach. Note: no unit tests cover the new get_engine_url GCP branch
  • build_and_upload_image.sh: Allows empty USER_TAG via ${VAR:+-$VAR} for clean version-only image tags
  • Tests cover the Redis config changes; existing GCP gateway test continues to pass
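The `${VAR:+-$VAR}` expansion mentioned above can be illustrated in isolation (variable names here are illustrative; the real build script's tagging logic is more involved):

```shell
#!/bin/sh
# ${VAR:+word} expands to word only when VAR is set and non-empty, so the
# "-$USER_TAG" suffix disappears entirely when USER_TAG is empty.
VERSION="0.17.0"

USER_TAG=""
echo "tag: ${VERSION}${USER_TAG:+-$USER_TAG}"   # → tag: 0.17.0

USER_TAG="dev"
echo "tag: ${VERSION}${USER_TAG:+-$USER_TAG}"   # → tag: 0.17.0-dev
```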

Confidence Score: 4/5

This PR is safe to merge: the changes are additive (new GCP branches in existing if/elif chains) and do not alter the existing AWS/Azure/on-prem paths. Tests cover the Redis config changes. The known DB_HOST_RO fallback concern (already flagged in a prior thread) is the main remaining item.

model-engine/model_engine_server/db/base.py — the GCP read_only host fallback behavior (already flagged) remains the key concern

Important Files Changed

  • model-engine/model_engine_server/common/config.py — Adds cache_redis_gcp_url field and GCP branch in cache_redis_url property. Logic is clean: the GCP check is placed correctly after the on-prem early exits and before the AWS checks.
  • model-engine/model_engine_server/db/base.py — Adds GCP branch in get_engine_url reading DB credentials from env vars. The DB_HOST_RO fallback issue (already flagged in a prior review thread) remains the key concern.
  • model-engine/model_engine_server/inference/vllm/build_and_upload_image.sh — Allows empty USER_TAG for version-only image tags using the ${VAR:+-$VAR} pattern. Clean and correct bash change.
  • model-engine/tests/unit/api/test_dependencies.py — Adds two new tests for GCP Redis cache URL behavior: happy path and missing-config assertion. Tests are well-structured and cover the new code paths.
  • docs/internal/cloud-matrix.md — Updates GCP docs to reflect the cache_redis_gcp_url field and adds a Cloud SQL section documenting K8s secret-based DB credentials.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[cache_redis_url property called] --> B{cache_redis_onprem_url set?}
    B -- Yes --> C[Return onprem URL]
    B -- No --> D{cloud_provider}
    D -- onprem --> E{cache_redis_aws_url set?}
    E -- Yes --> F[Return AWS URL with warning]
    E -- No --> G[Return redis://REDIS_HOST:PORT/0]
    D -- gcp --> H{cache_redis_gcp_url set?}
    H -- Yes --> I[Return GCP Memorystore URL]
    H -- No --> J[AssertionError]
    D -- aws --> K[AWS Redis path]
    D -- azure --> L[Azure Redis path]

    style H fill:#90EE90
    style I fill:#90EE90
    style J fill:#FFB6C1

Reviews (6): Last reviewed commit: "Merge branch 'main' into lilyzhu/gcp-aut..."

lilyz-ai and others added 4 commits March 26, 2026 00:22
Add `cache_redis_gcp_url` field to `HostedModelInferenceServiceConfig`
and a GCP branch in `cache_redis_url` property. Previously GCP operators
had to use `cache_redis_onprem_url` as a workaround; this closes the gap
cleanly. Also updates cloud-matrix.md docs to reflect the proper field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pin inferenceFramework.vllm tag to "0.17.0" (was "latest" → 0.11.1)
  to support qwen3_5 architecture (Qwen3.5 0.8B, etc.)
- Allow empty USER_TAG in build_and_upload_image.sh to produce
  clean version-only image tags (e.g. "0.17.0" instead of "0.17.0-")

Build command for the new image:
  AWS_PROFILE=ml-admin ./build_and_upload_image.sh "" vllm --account=692474966980

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On GCP, DB credentials are injected as env vars (DB_HOST, DB_USER,
DB_PASSWORD, DB_PORT, DB_NAME) via a Kubernetes secret. Previously
get_engine_url fell through to the AWS Secrets Manager path, causing
NoCredentialsError on GCP deployments running alembic migrations.

Pattern mirrors the existing onprem branch. Also documents the required
K8s secret keys in cloud-matrix.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines +75 to +79
host = (
    os.environ.get("DB_HOST_RO")
    if read_only
    else os.environ.get("DB_HOST", "localhost")
)

P2 host will be None when DB_HOST_RO is unset

When read_only=True and the DB_HOST_RO environment variable is not set, os.environ.get("DB_HOST_RO") returns None, so host will be None. This produces a broken connection string that will fail at connection time.

The on-prem branch on line 65 handles this correctly with an or fallback:

host = os.environ.get("DB_HOST_RO") or os.environ.get("DB_HOST", "localhost")

The GCP branch should use the same fallback pattern to avoid None hosts:

Suggested change:

- host = (
-     os.environ.get("DB_HOST_RO")
-     if read_only
-     else os.environ.get("DB_HOST", "localhost")
- )
+ host = (
+     os.environ.get("DB_HOST_RO")
+     or os.environ.get("DB_HOST", "localhost")
+     if read_only
+     else os.environ.get("DB_HOST", "localhost")
+ )

Or more simply, match the on-prem pattern exactly if the read_only distinction is not required.


@lilyz-ai lilyz-ai changed the title from "feat(gcp): add cache_redis_gcp_url for GCP Memorystore support" to "feat(gcp): enable GCP auth support — Redis cache URL + DB credentials" Mar 26, 2026
@lilyz-ai lilyz-ai requested review from a team and arniechops March 26, 2026 00:53
# Configurable inference framework image tags
inferenceFramework:
-  vllm: "latest"
+  vllm: "0.17.0"

In general, we should keep this "latest" and update the individual deployments, since not all deployments may have this specific tag.

Individual deployments should pin specific versions; not all deployments
may have the 0.17.0 tag available.
@lilyz-ai lilyz-ai requested a review from dmchoiboi March 26, 2026 01:25
@lilyz-ai lilyz-ai enabled auto-merge (squash) March 27, 2026 00:18
@lilyz-ai lilyz-ai merged commit accc030 into main Mar 27, 2026
8 checks passed
@lilyz-ai lilyz-ai deleted the lilyzhu/gcp-auth-support branch March 27, 2026 00:18
3 participants