feat(gcp): enable GCP auth support — Redis cache URL + DB credentials#786
Merged
feat(gcp): enable GCP auth support — Redis cache URL + DB credentials#786
Conversation
Add `cache_redis_gcp_url` field to `HostedModelInferenceServiceConfig` and a GCP branch in `cache_redis_url` property. Previously GCP operators had to use `cache_redis_onprem_url` as a workaround; this closes the gap cleanly. Also updates cloud-matrix.md docs to reflect the proper field. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pin inferenceFramework.vllm tag to "0.17.0" (was "latest" → 0.11.1) to support qwen3_5 architecture (Qwen3.5 0.8B, etc.) - Allow empty USER_TAG in build_and_upload_image.sh to produce clean version-only image tags (e.g. "0.17.0" instead of "0.17.0-") Build command for the new image: AWS_PROFILE=ml-admin ./build_and_upload_image.sh "" vllm --account=692474966980 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On GCP, DB credentials are injected as env vars (DB_HOST, DB_USER, DB_PASSWORD, DB_PORT, DB_NAME) via a Kubernetes secret. Previously get_engine_url fell through to the AWS Secrets Manager path, causing NoCredentialsError on GCP deployments running alembic migrations. Pattern mirrors the existing onprem branch. Also documents the required K8s secret keys in cloud-matrix.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines
+75
to
+79
| host = ( | ||
| os.environ.get("DB_HOST_RO") | ||
| if read_only | ||
| else os.environ.get("DB_HOST", "localhost") | ||
| ) |
There was a problem hiding this comment.
host will be None when DB_HOST_RO is unset
When read_only=True and the DB_HOST_RO environment variable is not set, os.environ.get("DB_HOST_RO") returns None, so host will be None. This produces a broken connection string that will fail at connection time.
The on-prem branch on line 65 handles this correctly with an or fallback:
host = os.environ.get("DB_HOST_RO") or os.environ.get("DB_HOST", "localhost")The GCP branch should use the same fallback pattern to avoid None hosts:
Suggested change
| host = ( | |
| os.environ.get("DB_HOST_RO") | |
| if read_only | |
| else os.environ.get("DB_HOST", "localhost") | |
| ) | |
| host = ( | |
| os.environ.get("DB_HOST_RO") | |
| or os.environ.get("DB_HOST", "localhost") | |
| if read_only | |
| else os.environ.get("DB_HOST", "localhost") | |
| ) |
Or more simply, match the on-prem pattern exactly if the read_only distinction is not required.
Prompt To Fix With AI
This is a comment left during a code review.
Path: model-engine/model_engine_server/db/base.py
Line: 75-79
Comment:
**`host` will be `None` when `DB_HOST_RO` is unset**
When `read_only=True` and the `DB_HOST_RO` environment variable is not set, `os.environ.get("DB_HOST_RO")` returns `None`, so `host` will be `None`. This produces a broken connection string that will fail at connection time.
The on-prem branch on line 65 handles this correctly with an `or` fallback:
```python
host = os.environ.get("DB_HOST_RO") or os.environ.get("DB_HOST", "localhost")
```
The GCP branch should use the same fallback pattern to avoid `None` hosts:
```suggestion
host = (
os.environ.get("DB_HOST_RO")
or os.environ.get("DB_HOST", "localhost")
if read_only
else os.environ.get("DB_HOST", "localhost")
)
```
Or more simply, match the on-prem pattern exactly if the `read_only` distinction is not required.
How can I resolve this? If you propose a fix, please make it concise.
dmchoiboi
reviewed
Mar 26, 2026
charts/model-engine/values.yaml
Outdated
| # Configurable inference framework image tags | ||
| inferenceFramework: | ||
| vllm: "latest" | ||
| vllm: "0.17.0" |
Collaborator
There was a problem hiding this comment.
in general, we should keep this "latest", and update the individual deployments since not all deployments may have this specific tag
Individual deployments should pin specific versions; not all deployments may have the 0.17.0 tag available.
arniechops
approved these changes
Mar 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
cache_redis_gcp_urlfield toHostedModelInferenceServiceConfigfor GCP Memorystore — fixescache_redis_urlfalling through to AWS path on GCPget_engine_url(db/base.py) — reads DB credentials from env vars (DB_HOST,DB_USER,DB_PASSWORD,DB_PORT,DB_NAME) via K8s secret, fixingNoCredentialsErrorduring alembic migrations on GCPcloud-matrix.mddocs to cover both fixesContext
GCP support was ~95% complete (GCS storage, GAR registry, Redis queues all implemented). Two gaps remained:
cache_redis_urlhad no GCP branch — operators had to usecache_redis_onprem_urlas a workaroundget_engine_urlfell through to the AWS Secrets Manager path on GCP, causingNoCredentialsErrorduring alembic DB migrations (e.g.botocore.exceptions.NoCredentialsError: Unable to locate credentials)Test plan
test_gcp_cache_redis_url_returns_gcp_url— verifies URL returned correctly whencache_redis_gcp_urlis settest_gcp_cache_redis_url_raises_when_not_set— verifies clearAssertionErrorwith helpful message when field is missing on GCPtest_gcp_provider_selects_gcp_implementationscontinues to pass🤖 Generated with Claude Code
Greptile Summary
Closes two GCP support gaps: adds a dedicated
cache_redis_gcp_urlconfig field for GCP Memorystore and adds a GCP branch inget_engine_urlto read DB credentials from env vars (via K8s secrets) instead of falling through to AWS Secrets Manager. Also includes a minor shell script improvement to allow version-only Docker image tags.config.py: Newcache_redis_gcp_urlfield + GCP branch incache_redis_urlproperty — logic placement is correct (after on-prem, before AWS)db/base.py: GCP branch reads DB credentials from env vars, mirroring the on-prem approach. Note: no unit tests cover the newget_engine_urlGCP branchbuild_and_upload_image.sh: Allows emptyUSER_TAGvia${VAR:+-$VAR}for clean version-only image tagsConfidence Score: 4/5
This PR is safe to merge — it adds new cloud provider branches without modifying existing AWS/Azure/on-prem paths
The changes are additive (new GCP branches in existing if/elif chains) and don't alter existing cloud provider paths. Tests cover the Redis config changes. The known DB_HOST_RO fallback concern (already flagged in prior thread) is the main remaining item.
model-engine/model_engine_server/db/base.py — the GCP read_only host fallback behavior (already flagged) remains the key concern
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[cache_redis_url property called] --> B{cache_redis_onprem_url set?} B -- Yes --> C[Return onprem URL] B -- No --> D{cloud_provider} D -- onprem --> E{cache_redis_aws_url set?} E -- Yes --> F[Return AWS URL with warning] E -- No --> G[Return redis://REDIS_HOST:PORT/0] D -- gcp --> H{cache_redis_gcp_url set?} H -- Yes --> I[Return GCP Memorystore URL] H -- No --> J[AssertionError] D -- aws --> K[AWS Redis path] D -- azure --> L[Azure Redis path] style H fill:#90EE90 style I fill:#90EE90 style J fill:#FFB6C1Reviews (6): Last reviewed commit: "Merge branch 'main' into lilyzhu/gcp-aut..." | Re-trigger Greptile