
chore(llm): bump LLMSpy to v3.0.33-obol.1 #163

Open
bussyjd wants to merge 59 commits into main from fix/openclaw-llmspy-provider

Conversation


@bussyjd (Collaborator) commented Feb 16, 2026

Summary

  • Bump LLMSpy image from 3.0.32-obol.1-rc.4 to 3.0.33-obol.1 (both init container and main container)
  • Syncs upstream ServiceStack/llms v3.0.33 changes: version bump, nostore feature, response_format fix, provider updates
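In chart-values form, the bump looks roughly like this (a minimal sketch assuming a conventional image/tag layout, not the actual llm.yaml keys):

```yaml
# Sketch only; field names assumed
image:
  repository: ghcr.io/obolnetwork/llms
  tag: 3.0.33-obol.1                                # was 3.0.32-obol.1-rc.4
initContainers:
  - name: merge-config                              # init container reuses the llmspy image
    image: ghcr.io/obolnetwork/llms:3.0.33-obol.1   # keep in lockstep with the main container
```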

Test plan

  • Verify ghcr.io/obolnetwork/llms:3.0.33-obol.1 image is available
  • Deploy to k3s cluster and confirm LLMSpy pod starts successfully
  • Validate smart routing and streaming SSE passthrough still work

bussyjd and others added 30 commits January 12, 2026 12:26
Update dependency versions to latest stable releases:
- kubectl: 1.31.0 → 1.35.0
- helm: 3.19.1 → 3.19.4
- helmfile: 1.2.2 → 1.2.3
- k9s: 0.32.5 → 0.50.18
- helm-diff: 3.9.11 → 3.14.1

k3d remains at 5.8.3 (already current).
Replace nginx-ingress controller with Traefik 38.0.2 using Kubernetes
Gateway API for routing. This addresses the nginx-ingress deprecation
(end of maintenance March 2026).

Changes:
- Remove --disable=traefik from k3d config to use k3s built-in Traefik
- Replace nginx-ingress helm release with Traefik 38.0.2 in infrastructure
- Configure Gateway API provider with cross-namespace routing support
- Add GatewayClass and Gateway resources via Traefik helm chart
- Convert all Ingress resources to HTTPRoute format (one route sketched after this message):
  - eRPC: /rpc path routing
  - obol-frontend: / path routing
  - ethereum: /execution and /beacon path routing with URL rewrite
  - aztec: namespace-based path routing with URL rewrite
  - helios: namespace-based path routing with URL rewrite
- Disable legacy Ingress in service helm values

Closes #125
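As a sketch of the conversion, the eRPC route might look like this; the Gateway name, namespaces, and backend port are assumptions, not the actual manifests:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: erpc
  namespace: erpc               # namespace assumed
spec:
  parentRefs:
    - name: traefik-gateway     # Gateway created by the Traefik chart (name assumed)
      namespace: traefik        # cross-namespace routing must be permitted by the Gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /rpc
      backendRefs:
        - name: erpc            # backing Service
          port: 4000            # port assumed
```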
Add Cloudflare Tunnel integration to expose obol-stack services publicly
without port forwarding or static IPs. Uses quick tunnel mode for MVP.

Changes:
- Add cloudflared Helm chart (internal/embed/infrastructure/cloudflared/)
- Add tunnel management package (internal/tunnel/)
- Add CLI commands: obol tunnel status/restart/logs
- Integrate cloudflared into infrastructure helmfile

The tunnel deploys automatically with `obol stack up` and provides a
random trycloudflare.com URL accessible via `obol tunnel status`.

Future: Named tunnel support for persistent URLs (obol tunnel login)
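A rough helmfile entry for the quick tunnel (release layout and values keys assumed; the embedded chart may differ):

```yaml
releases:
  - name: cloudflared
    namespace: cloudflared            # namespace assumed
    chart: ./cloudflared              # internal/embed/infrastructure/cloudflared/
    values:
      - tunnel:
          mode: quick                 # quick tunnel: no account, random trycloudflare.com URL
```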
Update documentation to reflect the upgraded dependency versions
in obolup.sh. This keeps the documentation in sync with the actual
pinned versions used by the bootstrap installer.
# Conflicts:
#	internal/embed/infrastructure/helmfile.yaml
Introduce the inference marketplace foundation: an x402-enabled reverse
proxy that wraps any OpenAI-compatible inference service with USDC
micropayments via the x402 protocol.

Components:
- internal/inference/gateway.go: net/http reverse proxy with x402 middleware
- cmd/inference-gateway/: standalone binary for containerisation
- cmd/obol/inference.go: `obol inference serve` CLI command
- internal/embed/networks/inference/: helmfile network template deploying
  Ollama + gateway + HTTPRoute (auto-discovered by existing CLI)
- Dockerfile.inference-gateway: distroless multi-stage build

Provider: obol network install inference --wallet-address 0x... --model llama3.2:3b
Consumer: POST /v1/chat/completions with X-PAYMENT header (USDC on Base)
feat(inference): add x402 pay-per-inference gateway (Phase 1)
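The network template might deploy roughly the following pair of releases; chart paths, values keys, and the upstream port are illustrative assumptions:

```yaml
releases:
  - name: ollama
    namespace: inference                # namespace assumed
    chart: ./charts/ollama              # chart path assumed
  - name: inference-gateway
    namespace: inference
    chart: ./charts/inference-gateway   # chart path assumed
    values:
      - walletAddress: "0x..."          # USDC recipient on Base (from --wallet-address)
        upstream: http://ollama.inference.svc.cluster.local:11434   # OpenAI-compatible backend
```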
- Remove unused $publicDomain variable from helmfile.yaml (caused
  Helmfile v1 gotmpl pre-processing to fail on .Values.* references)
- Fix eRPC secretEnv: chart expects plain strings, not secretKeyRef
  maps; move OBOL_OAUTH_TOKEN to extraEnv with valueFrom (sketched below)
- Fix obol-frontend escaped quotes in gotmpl (invalid \\" in operand)
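The eRPC fix in values form, assuming the chart's extraEnv follows the standard Kubernetes env schema (secret name and key are assumptions):

```yaml
# secretEnv entries must be plain strings, so the secret-backed token moves here:
extraEnv:
  - name: OBOL_OAUTH_TOKEN
    valueFrom:
      secretKeyRef:
        name: obol-oauth        # secret name assumed
        key: token              # key assumed
```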
Replace the in-cluster Ollama Deployment/PVC/Service with an
ExternalName Service that routes ollama.llm.svc.cluster.local to the
host machine's Ollama server. LLMSpy and all consumers use the stable
cluster-internal DNS name; the ExternalName target is resolved during
stack init via the {{OLLAMA_HOST}} placeholder:

  k3d  → host.k3d.internal
  k3s  → node gateway IP (future)

This avoids duplicating the model cache inside the cluster and
leverages the host's GPU/VRAM for inference.

Also updates CopyDefaults to accept a replacements map, following
the same pattern used for k3d.yaml placeholder resolution.
refactor(llm): proxy to host Ollama via ExternalName Service
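A minimal sketch of the ExternalName Service, with the placeholder resolved at stack init (manifest details assumed):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: llm
spec:
  type: ExternalName
  externalName: "{{OLLAMA_HOST}}"   # k3d: host.k3d.internal (host.docker.internal on macOS, per a later fix)
```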
The obol-agent deployment in the agent namespace fails with
ImagePullBackOff because its container image is not publicly
accessible. Wrap the template in a Helm conditional
(obolAgent.enabled) defaulting to false so it no longer deploys
automatically. The manifest is preserved for future use — set
obolAgent.enabled=true in the base chart values to re-enable.
fix(infra): disable obol-agent from default stack deployment
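The guard in values form (key name taken from the commit message; the Deployment template itself is wrapped in `{{- if .Values.obolAgent.enabled }}` ... `{{- end }}`):

```yaml
# base chart values: flip to true to re-enable the deployment
obolAgent:
  enabled: false
```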
Add GitHub Actions workflow to build and publish the OpenClaw container
image to ghcr.io/obolnetwork/openclaw from the upstream openclaw/openclaw
repo at a pinned version. Renovate watches for new upstream releases and
auto-opens PRs to bump the version file.

Closes #142
Add integration-okr-1 and feat/openclaw-ci to push triggers for testing.
Remove after verifying the workflow runs successfully — limit to main only.
The pinned SHAs from charon-dkg-sidecar were stale and caused the
security-scan job to fail at setup.
bussyjd and others added 29 commits February 10, 2026 14:52
ci(openclaw): Docker image build workflow with Renovate auto-bump
* feat(openclaw): add OpenClaw CLI and embedded chart for Obol Stack

Adds `obol openclaw` subcommands to deploy and manage OpenClaw AI agent
instances on the local k3d cluster. The chart is embedded via go:embed
for development use; the canonical chart lives in ObolNetwork/helm-charts.

CLI commands:
  openclaw up      - Create and deploy an instance
  openclaw sync    - Re-deploy / update an existing instance
  openclaw token   - Retrieve the gateway token
  openclaw list    - List deployed instances
  openclaw delete  - Remove an instance
  openclaw skills  - Sync skills from a local directory

The embedded Helm chart supports:
  - Pluggable model providers (Anthropic, OpenAI, Ollama)
  - Chat channels (Telegram, Discord, Slack)
  - Skills injection via ConfigMap + init container
  - RBAC, Gateway API HTTPRoute, values schema validation

* feat(openclaw): integrate OpenClaw into stack setup with config import

OpenClaw is now deployed automatically as a default instance during
`obol stack up`. Adds ~/.openclaw/openclaw.json detection and import,
interactive provider selection for direct CLI usage, and idempotent
re-sync behavior for the default instance.

* fix: resolve CRD conflicts, OpenClaw command, HTTPRoute spec, and KUBECONFIG propagation

- Remove gateway-api-crds presync hook; Traefik v38+ manages its own CRDs
- Fix Ethereum HTTPRoute: use single PathPrefix match (Gateway API spec)
- Fix OpenClaw chart command to match upstream Dockerfile (node openclaw.mjs)
- Update OpenClaw image tag to match GHCR published format (no v prefix)
- Add KUBECONFIG env to helmfile subprocess in stack.go (aligns with all other packages)

* feat(openclaw): detect and import existing ~/.openclaw workspace + bump to v2026.2.9

Auto-detect existing OpenClaw installations during `obol stack up` and
`obol openclaw up`. When ~/.openclaw/ contains a workspace directory with
personality files (SOUL.md, AGENTS.md, etc.), they are copied into the
pod's PVC after deployment. Interactive provider prompts are skipped
automatically when an existing config with providers is detected.

Also bumps the chart image to v2026.2.9 to match the CI-published image.

* feat(openclaw): add setup wizard and dashboard commands

Add `obol openclaw setup <id>` which port-forwards to the deployed
gateway and runs the native OpenClaw onboard wizard via PTY. The wizard
provides the full onboarding experience (personality, channels, skills,
providers) against the running k8s instance.

Add `obol openclaw dashboard <id>` which port-forwards and opens the
web dashboard in the browser with auto-injected gateway token.

Implementation details:
- Port-forward lifecycle manager with auto-port selection
- PTY-based wizard with raw terminal mode for @clack/prompts support
- Sliding-window marker detection to exit cleanly when wizard completes
- Proper PTY shutdown sequence (close master -> kill -> wait) to avoid
  hang caused by stdin copy goroutine blocking cmd.Wait()
- Refactored Token() into reusable getToken() helper
- findOpenClawBinary() searches PATH then cfg.BinDir with install hints
- obolup.sh gains install_openclaw() for npm-based binary management

* feat(llm,openclaw): llmspy universal proxy + openclaw CLI passthrough

Route all cloud API traffic through llmspy as a universal gateway:
- Add Anthropic/OpenAI providers to llm.yaml (ConfigMap + Secret + envFrom)
- New `internal/llm` package with ConfigureLLMSpy() for imperative patching
- New `obol llm configure` command for standalone provider setup
- OpenClaw overlay routes through llmspy:8000/v1 instead of direct cloud APIs
- Bump llmspy image to obol fork rc.2 (fixes SQLite startup race)

Add `obol openclaw cli <id> -- <args>` passthrough:
- Remote-capable commands (gateway, acp, browser, logs) via port-forward
- Local-only commands (doctor, models, config) via kubectl exec
- Replace PTY-based setup wizard with non-interactive helmfile sync flow
- Remove creack/pty and golang.org/x/term dependencies

* fix(openclaw): rename up→onboard, fix api field and macOS host resolution

- Rename `obol openclaw up` to `obol openclaw onboard`
- Set api: "openai-completions" in llmspy-routed overlay (fixes
  "No API provider registered for api: undefined" in OpenClaw)
- Use host.docker.internal on macOS for Ollama ExternalName service
  (host.k3d.internal doesn't resolve on Docker Desktop)

* feat(openclaw): detect Ollama availability before offering it in setup wizard

SetupDefault() now probes the host Ollama endpoint before deploying with
Ollama defaults — skips gracefully when unreachable so users without
Ollama can configure a cloud provider later via `obol openclaw setup`.
interactiveSetup() dynamically shows a 3-option menu (Ollama/OpenAI/
Anthropic) when Ollama is detected, or a 2-option menu (OpenAI/Anthropic)
when it isn't.

* docs: add LLM configuration architecture to CLAUDE.md

Document the two-tier model: global llmspy gateway (cluster-wide keys
and provider routing) vs per-instance OpenClaw config (overlay values
pointing at llmspy or directly at cloud APIs). Includes data flow
diagram, summary table, and key source files reference.
Update Anthropic models to include Opus 4.6, replace retiring GPT-4o
with GPT-5.2, add next-step guidance to NOTES.txt, and clarify gateway
token and skills injection comments per CTO review feedback.
Sync _helpers.tpl, validate.yaml, and values.yaml comments to match
the helm-charts repo. Key changes:
- Remove randAlphaNum gateway token fallback (require explicit value)
- Add validation: gateway token required for token auth mode (guard sketched below)
- Add validation: RBAC requires serviceAccount.name when create=false
- Add validation: initJob requires persistence.enabled=true
- Align provider and gateway token comments
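The gateway-token validation could be expressed as a fail-fast template guard of this shape; the value paths are assumptions, not the actual validate.yaml contents:

```yaml
{{- if and (eq .Values.gateway.authMode "token") (not .Values.gateway.token) }}
{{- fail "gateway.token is required when gateway.authMode is 'token'" }}
{{- end }}
```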
Add a local dnsmasq-based DNS resolver that enables wildcard hostname
resolution for per-instance routing (e.g., openclaw-myid.obol.stack)
without manual /etc/hosts entries.

- New internal/dns package: manages dnsmasq Docker container on port 5553
- macOS: auto-configures /etc/resolver/obol.stack (requires sudo once; file sketched below)
- Linux: prints manual DNS configuration instructions
- stack up: starts DNS resolver (idempotent, non-fatal on failure)
- stack purge: stops DNS resolver and removes system resolver config
- stack down: leaves DNS resolver running (cheap, persists across restarts)

Closes #150
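On macOS the generated resolver file would contain something like this (exact contents assumed; standard resolver(5) format):

```
# /etc/resolver/obol.stack
nameserver 127.0.0.1
port 5553
```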
DNS resolver: add systemd-resolved integration for Linux. On Linux,
dnsmasq binds to 127.0.0.2:53 (avoids systemd-resolved's stub on
127.0.0.53:53) and a resolved.conf.d drop-in forwards *.obol.stack
queries. On macOS, behavior is unchanged (port 5553 + /etc/resolver).

Also fixes dnsmasq startup with --conf-file=/dev/null to ignore
Alpine's default config which enables local-service (rejects queries
from Docker bridge network).

Fix llmspy image tag: 3.0.32-obol.1-rc.2 does not exist on GHCR,
corrected to 3.0.32-obol.1-rc.1.
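The resolved.conf.d drop-in described above would look roughly like this (path and filename assumed):

```
# /etc/systemd/resolved.conf.d/obol-stack.conf
# dnsmasq is bound to 127.0.0.2:53; route only *.obol.stack queries to it
[Resolve]
DNS=127.0.0.2
Domains=~obol.stack
```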
refactor(openclaw): replace embedded chart with remote obol/openclaw Helm repo (#145)

Switch from bundling the OpenClaw Helm chart in the Go binary via
//go:embed to referencing obol/openclaw from the published Helm repo,
matching the pattern used by Helios and Aztec networks.

Changes:
- generateHelmfile() now emits chart: obol/openclaw with version pin (shape sketched below)
- Remove copyEmbeddedChart() and all chart/values.yaml copy logic
- Remove //go:embed directive, chartFS variable, and embed/io/fs imports
- Delete internal/openclaw/chart/ (chart lives in helm-charts repo)
- Deployment directory simplified to helmfile.yaml + values-obol.yaml
- Setup() regenerates helmfile on each run to pick up version bumps

Depends on helm-charts PR #183 being merged and chart published.
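The generated helmfile then has roughly this shape; the repository URL and version are illustrative:

```yaml
repositories:
  - name: obol
    url: https://obolnetwork.github.io/helm-charts   # repo URL assumed
releases:
  - name: openclaw
    namespace: openclaw                              # per-instance namespace in practice
    chart: obol/openclaw
    version: 0.1.0                                   # pinned by generateHelmfile(); value illustrative
    values:
      - values-obol.yaml
```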
Helios is no longer part of the Obol Stack network lineup.
Remove the embedded network definition, frontend env var,
and all documentation references.
Add comprehensive unit tests for the OpenClaw config import pipeline
(25 test cases covering DetectExistingConfig, TranslateToOverlayYAML,
workspace detection, and helper functions). Refactor DetectExistingConfig
for testability by extracting detectExistingConfigAt(home).

Fix silent failures: warn when env-var API keys are skipped, when
unknown API types are sanitized, when workspace has no marker files,
and when DetectExistingConfig returns an error.
…153)

OpenClaw's control UI rejects WebSocket connections with "1008: control ui
requires HTTPS or localhost (secure context)" when running behind Traefik
over HTTP. This adds:

- Chart values and _helpers.tpl rendering for controlUi.allowInsecureAuth
  and controlUi.dangerouslyDisableDeviceAuth gateway settings
- trustedProxies chart value for reverse proxy IP allowlisting
- Overlay generation injects controlUi settings for both imported and
  fresh install paths
- RBAC ClusterRole/ClusterRoleBinding for frontend OpenClaw instance
  discovery (namespaces, pods, configmaps, secrets)
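In chart-values form, the settings above might read as follows (key paths inferred from the commit message; the CIDR is illustrative):

```yaml
gateway:
  controlUi:
    allowInsecureAuth: true                # accept token-only auth without device identity
    dangerouslyDisableDeviceAuth: true     # later dropped as redundant (see the follow-up below)
trustedProxies:
  - 10.42.0.0/16                           # illustrative: k3s default pod CIDR for Traefik
```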

OpenClaw requires provider/model format (e.g. "llmspy/claude-sonnet-4-5-20250929")
for model resolution. Without a provider prefix, it hardcodes a fallback to the
"anthropic" provider — which is disabled in the llmspy-routed overlay, causing
chat requests to fail silently.

This renames the virtual provider used for cloud model routing from "ollama" to
"llmspy", adds the proper provider prefix to AgentModel, and disables the default
"ollama" provider when a cloud provider is selected. The default Ollama-only path
is unchanged since it genuinely routes Ollama models.
fix(openclaw): rename virtual provider to llmspy for cloud model routing
…lowInsecureAuth

The dangerouslyDisableDeviceAuth flag is completely redundant when running
behind Traefik over HTTP: the browser's crypto.subtle API is unavailable
in non-secure contexts (non-localhost HTTP), so the Control UI never sends
device identity at all. Setting dangerouslyDisableDeviceAuth only matters
when the browser IS in a secure context but you want to skip device auth —
which doesn't apply to our Traefik proxy case.

allowInsecureAuth alone is sufficient: it allows the gateway to accept
token-only authentication when device identity is absent. Token auth
remains fully enforced — connections without a valid gateway token are
still rejected.

Security analysis:
- Token/password auth: still enforced (timing-safe comparison)
- Origin check: still enforced (same-origin validation)
- Device identity: naturally skipped (browser can't provide it on HTTP)
- Risk in localhost k3d context: Low (no external attack surface)
- OpenClaw security audit classification: critical (general), but
  acceptable for local-only dev stack

Refs: plans/security-audit-controlui.md, plans/trustedproxies-analysis.md
Includes smart routing, streaming SSE passthrough, and db writer
startup race fix.
Resolve modify/delete conflicts on embedded OpenClaw chart files:
- internal/openclaw/chart/templates/_helpers.tpl
- internal/openclaw/chart/values.yaml

Accept deletions — chart was replaced with remote Helm repo in
ca835f5 (refactor(openclaw): replace embedded chart with remote
obol/openclaw Helm repo #145).
feat(dns): add wildcard DNS resolver for *.obol.stack
Remove the p.APIKey value from the env-var reference log message in
DetectExistingConfig(). Although the code path only reaches here when
the value is an env-var reference (e.g. ${ANTHROPIC_API_KEY}), CodeQL
correctly flags it as clear-text logging of a sensitive field (go/
clear-text-logging). Omitting the value is a defense-in-depth fix that
prevents accidental exposure if the guard condition ever changes.
Resolve conflict in obol-frontend values: accept pinned v0.1.4 tag from main.
Replace the nodecore RPC upstream with Obol's internal rate-limited
eRPC gateway (erpc.gcp.obol.tech). The upstream supports mainnet and
hoodi only, so sepolia is removed from all eRPC and ethereum network
configurations.

Basic Auth credential is intentionally embedded per CTO approval —
the endpoint is rate-limited and serves as a convenience proxy for
local stack users. Credential is extracted to a template variable
with gitleaks:allow suppression.
feat(erpc): switch upstream to erpc.gcp.obol.tech
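A sketch of the upstream swap in the eRPC config; the structure and variable name are assumed:

```yaml
upstreams:
  - id: obol-gateway
    endpoint: https://{{ $erpcCredential }}@erpc.gcp.obol.tech   # gitleaks:allow (credential via template variable)
# sepolia entries removed: the upstream serves mainnet and hoodi only
```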
Resolve conflict: keep rc.4 LLMSpy image tag.
chore(llm): bump LLMSpy to Obol fork rc.4
Replace all references to glm-4.7-flash with Ollama's cloud model
gpt-oss:120b-cloud. Cloud models run on Ollama's cloud service,
eliminating OOM risk on local machines.
…ibility

The remote OpenClaw Helm chart only iterates hardcoded provider names
(ollama, anthropic, openai). Using "llmspy" as the virtual provider
name caused it to be silently dropped from the rendered config,
breaking the Anthropic inference waterfall.

Revert to using "ollama" as the provider name — it still points at
llmspy's URL (http://llmspy.llm.svc.cluster.local:8000/v1) with
api: openai-completions, so all routing works correctly.

Found during pre-production validation.
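The resulting overlay provider entry, per the two commits above (key names assumed):

```yaml
providers:
  ollama:                                                   # provider name the chart iterates; actually points at llmspy
    baseUrl: http://llmspy.llm.svc.cluster.local:8000/v1
    api: openai-completions                                 # avoids "No API provider registered for api: undefined"
```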
Replace busybox init container with the llmspy image itself, using a
Python merge script that:
1. Copies llms.json from ConfigMap (controls enabled/disabled state)
2. Loads the full providers.json from the llmspy package (has model
   definitions and npm package refs for Anthropic/OpenAI)
3. Merges ConfigMap overrides (Ollama endpoint, API key refs)

Also remove "models": {} and "all_models": true from cloud providers
in the ConfigMap — these crash llmspy since only Ollama has a
load_models() implementation. Add "npm" field for Anthropic/OpenAI.

Found during pre-production Anthropic integration validation.
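The init container wiring might look like this; volume and script names are assumptions:

```yaml
initContainers:
  - name: merge-config
    image: ghcr.io/obolnetwork/llms:3.0.33-obol.1    # reuse llmspy so the packaged providers.json is on disk
    command: ["python", "/config/merge.py"]           # merge script shipped via the ConfigMap (name assumed)
    volumeMounts:
      - name: config                                  # ConfigMap: llms.json overrides + merge.py
        mountPath: /config
      - name: data                                    # shared emptyDir read by the main container
        mountPath: /data
```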
When Docker is installed but the daemon isn't running, obolup now
attempts to start it automatically:
1. Try systemd (apt/yum installs): sudo systemctl start docker
2. Try snap: sudo snap start docker

If auto-start fails, the error message now shows both systemd and
snap commands instead of only systemctl.

Fixes Docker startup on Ubuntu with snap-installed Docker where
systemctl start docker fails with "Unit docker.service not found".
Syncs upstream v3.0.33 (nohistory, nostore, provider updates,
response_format fix) with Obol smart routing extension.