Skip to content

feat(openrouter): add response validation and harden error handling#81

Open
whoabuddy wants to merge 9 commits intomainfrom
feat/openrouter-response-validation
Open

feat(openrouter): add response validation and harden error handling#81
whoabuddy wants to merge 9 commits intomainfrom
feat/openrouter-response-validation

Conversation

@whoabuddy
Copy link
Contributor

Summary

Closes #80. Adds a response validation layer at the OpenRouter service boundary and hardens all consumer code against malformed API responses.

  • Validation helpers (validateModelsResponse, validateChatResponse, validateStreamChunk) in openrouter.ts validate all API responses before they leave the service layer, throwing OpenRouterError with status 502 on failure
  • Consumer hardening in list-models.ts, chat.ts, and model-cache.ts — defensive guards for empty choices, missing content, and pricing fields
  • Model cache statusgetCacheStatus() returns warm/cold/degraded state; lookupModel() signals degraded state when cache is empty
  • Better error responses — invalid model IDs return 400 with the model ID, error code invalid_model, and up to 3 similar model suggestions
  • 29 unit tests covering all three validators with valid, missing, and malformed inputs
  • Code simplification — extracted shared usage validation, removed redundant belt-and-suspenders guards, reduced test boilerplate (-138 lines net)

Test plan

  • Unit tests pass (37 tests, 117 assertions)
  • Type check passes (pre-existing pricing.ts:191 error is unrelated)
  • E2E tests against staging (npm test with X402_CLIENT_PK)

🤖 Generated with Claude Code

whoabuddy and others added 9 commits March 22, 2026 14:11
Add three TypeScript assertion functions that validate OpenRouter API
responses before they propagate to consumers, replacing silent type
casts with explicit runtime checks:

- validateModelsResponse(): ensures .data is an array with .id (string)
  and .pricing.prompt/.completion (strings) on each model
- validateChatResponse(): ensures .choices is an array and .usage
  (if present) has numeric token fields
- validateStreamChunk(): ensures .usage (if present) has numeric
  prompt_tokens, completion_tokens, total_tokens fields

All three validators throw OpenRouterError(message, 502) on failure so
callers get a clear upstream error rather than undefined access issues.
Validators are applied immediately after each response.json() cast in
getModels(), createChatCompletion(), and createUsageCapturingStream().
Stream chunk validation failures are caught and logged as warnings to
avoid breaking active streams.

OpenRouterError class moved above validators to resolve declaration
ordering (validators reference the class in their throw statements).

Co-Authored-By: Claude <noreply@anthropic.com>
… endpoint

Add belt-and-suspenders Array.isArray check on response.data and per-model
pricing type guards inside the .map() callback. Invalid models are filtered out
(partial data) instead of crashing the endpoint, making the listing resilient
to any future validator contract changes.

Co-Authored-By: Claude <noreply@anthropic.com>
…pletion

Guard response.choices with Array.isArray and length > 0 check after the service
call. An empty choices array from OpenRouter now returns a structured 502 error
instead of passing through a useless response. Also logs a warning when the first
choice has unexpectedly empty content.

Co-Authored-By: Claude <noreply@anthropic.com>
…ulation

Guard modelsResponse.data with Array.isArray before iterating, and add explicit
type checks on each model.pricing before parseFloat(). A single malformed model
now skips with a debug log rather than aborting the entire cache refresh. This
makes the cache resilient to future validator contract changes.

Co-Authored-By: Claude <noreply@anthropic.com>
… flag

Add getCacheStatus() to expose observable cache state (warm/cold/degraded)
so callers and operators can distinguish between a healthy cache, a never-
populated cold start, and a failed-fetch degraded state.

Add getSimilarModels() for use in invalid-model error responses — returns up
to N model IDs from the registry that share a provider prefix with the
requested model, so agents get actionable correction hints.

Update ModelLookupResult to carry an optional degraded flag when the registry
is empty after a refresh attempt, allowing consumers to log and surface the
degraded state rather than silently allowing the request.

Co-Authored-By: Claude <noreply@anthropic.com>
…estions

The x402 middleware rejects invalid models pre-payment, but when the cache
is cold/degraded at middleware time the check is skipped. Adding a second
lookupModel() call in the chat handler catches this case once the cache has
been populated, returning a 400 with code: "invalid_model", the rejected
model ID, and up to 3 provider-prefix-matched suggestions from the live
registry so clients get actionable correction hints.

Co-Authored-By: Claude <noreply@anthropic.com>
…ed state

The pre-payment model rejection now includes the model field alongside the
error message and code so clients and logs always have the rejected model ID
without having to reconstruct it from the request body.

When lookupModel returns valid:true/degraded:true the middleware now logs a
warning instead of silently passing through, giving operators visibility when
the model registry cache was unavailable at validation time.

Co-Authored-By: Claude <noreply@anthropic.com>
Adds tests/openrouter-validation.unit.test.ts covering all three
validator functions (validateModelsResponse, validateChatResponse,
validateStreamChunk). Tests verify both happy paths and every error
branch, ensuring the service-boundary validators reject malformed
OpenRouter responses with OpenRouterError status 502.

Closes #80

Co-Authored-By: Claude <noreply@anthropic.com>
…uards

Extract shared usage validation helper, remove belt-and-suspenders guards
that duplicate Phase 1 validator guarantees, and reduce test boilerplate
with a shared assertion helper. Net -138 lines, same coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 22, 2026 21:33
@cloudflare-workers-and-pages
Copy link

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
x402-api-staging cf95e23 Mar 22 2026, 09:33 PM

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a response-validation layer at the OpenRouter service boundary and updates OpenRouter consumers to behave safely when upstream responses are malformed, including improved invalid-model errors and additional model-cache signaling.

Changes:

  • Introduces OpenRouterError plus response validators (validateModelsResponse, validateChatResponse, validateStreamChunk) and wires them into OpenRouterClient.
  • Extends the model cache with status signaling (degraded) and adds similar-model suggestions for invalid model IDs.
  • Adds unit tests for the new validator helpers.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/services/openrouter.ts Adds OpenRouterError and validator functions; validates upstream JSON before returning it from the service client.
src/services/model-cache.ts Adds cache status types/utilities, degraded signaling, and similar-model suggestion helper.
src/middleware/x402.ts Improves invalid-model 400 responses and handles degraded cache condition in pre-payment validation.
src/endpoints/inference/openrouter/list-models.ts Documents the new validation guarantee from getModels().
src/endpoints/inference/openrouter/chat.ts Adds belt-and-suspenders model validation + guards for empty choices / empty content.
tests/openrouter-validation.unit.test.ts Adds validator unit tests for valid/malformed inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +101 to +102
const model = obj.data[i] as Record<string, unknown>;

Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateModelsResponse can throw a raw TypeError if the .data array contains null (or any non-object) because it casts obj.data[i] to a Record and then immediately reads model.id / model.pricing. Accessing properties on null will bypass the intended OpenRouterError(502) path and reintroduce unhandled errors. Add an explicit check that each obj.data[i] is a non-null object before reading fields, and throw OpenRouterError with a useful message when it isn't.

Suggested change
const model = obj.data[i] as Record<string, unknown>;
const rawModel = obj.data[i];
if (typeof rawModel !== "object" || rawModel === null) {
throw new OpenRouterError(
`Invalid models response: model[${i}] must be a non-null object`,
502,
JSON.stringify(rawModel).slice(0, 200)
);
}
const model = rawModel as Record<string, unknown>;

Copilot uses AI. Check for mistakes.
status: number,
messageSubstring?: string
): OpenRouterError {
expect(fn).toThrow(OpenRouterError);
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expectOpenRouterError helper invokes fn twice (once via expect(fn).toThrow(...) and again in the try/catch). If fn ever has side effects or depends on mutable state, this can make tests flaky and also obscures which invocation actually threw. Prefer calling fn exactly once and asserting on the caught error instance directly.

Suggested change
expect(fn).toThrow(OpenRouterError);

Copilot uses AI. Check for mistakes.
test("non-array .data throws OpenRouterError", () => {
expectOpenRouterError(() => validateModelsResponse({ data: "not-an-array" }), 502, ".data must be an array");
});

Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateModelsResponse tests cover missing/invalid .data and .pricing, but they don't cover .data containing null elements. Given the validator currently accesses model.id/model.pricing, add a case like { data: [null] } to ensure it throws an OpenRouterError (and not a TypeError) and to prevent regressions.

Suggested change
test("model list containing null element throws OpenRouterError", () => {
expectOpenRouterError(() => validateModelsResponse({ data: [null] }), 502);
});

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +126
// getModels() runs validateModelsResponse() which guarantees .data is an
// array and every model has .id (string) and .pricing with string fields.
const modelsResponse = await client.getModels(controller.signal);

modelRegistry.clear();

for (const model of modelsResponse.data) {
// Guard each model's pricing individually before parseFloat().
// A single malformed model should not abort the entire cache refresh.
if (
typeof model.pricing !== "object" ||
model.pricing === null ||
typeof model.pricing.prompt !== "string" ||
typeof model.pricing.completion !== "string"
) {
logger.debug("Model cache: skipping model with invalid pricing", { modelId: model.id });
continue;
}
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In doRefresh, the comment says getModels()/validateModelsResponse guarantees every model has valid pricing fields, but then the loop adds an additional pricing-shape guard with a comment implying a single malformed model shouldn't abort the refresh. With the current validator, a single malformed model would cause validateModelsResponse to throw before this loop runs, so this guard/comment combination is misleading. Either relax validateModelsResponse to allow per-model skipping here, or remove/adjust the redundant guard/comments to reflect actual behavior.

Copilot uses AI. Check for mistakes.
Copy link

@arc0btc arc0btc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adds response validation at the OpenRouter service boundary and hardens consumer code against malformed API responses — solid defensive work that addresses a real class of operational failures.

What works well:

  • validateUsageFields factored as a shared private helper — clean, avoids duplication between chat and stream chunk validators
  • OpenRouterError moved to the top of the file before the validators that use it — correct placement
  • The degraded state semantics are well-designed: permissive but observable. Logging a warn and allowing through beats a silent outage when OpenRouter's model API is unavailable
  • Belt-and-suspenders model validation in chat.ts is well-commented and makes the reasoning clear
  • 29 unit tests covering all three validators with valid, missing, and malformed inputs. The expectOpenRouterError helper keeps test code clean

[suggestion] Validator and per-model guard have contradictory intent (src/services/model-cache.ts)

The comment in doRefresh() says "A single malformed model should not abort the entire cache refresh" — that's the lenient intent. But validateModelsResponse() runs first (inside client.getModels()), and it throws on the first model with invalid pricing, aborting the entire refresh. The lenient guard in doRefresh() is never exercised because the validator is strict.

These two approaches directly contradict each other. Pick one:

Option A — Trust the validator, remove the per-model guard (strict: one bad model fails the refresh):

      // OpenRouter returns per-token prices as strings (e.g., "0.000003")
      // Convert to per-1K numbers
      const promptPer1k = parseFloat(model.pricing.prompt) * 1000;

Option B — Move validation inline to be lenient (skip bad models, don't use strict validator here): keep the per-model guard, but update validateModelsResponse to only validate structure (data is array, each element has .id) without enforcing pricing fields — let the per-model guard handle skipping.

Option A is simpler and the PR description implies strict behavior is the goal. Option B better matches the comment's stated intent. Either is fine; the current mixed state means the comment is misleading.


[suggestion] getCacheStatus() and getSimilarModels() have no unit tests

Two new exported functions with non-trivial logic have no coverage. getCacheStatus() in particular has a three-state machine (warm/cold/degraded) that's worth testing explicitly — especially the edge cases like lastFailedAt set but cache still fresh (should be warm, not degraded).


[nit] getSimilarModels final fallback returns unrelated models

The last branch returns allModels.filter(...).slice(0, maxResults) — the first N alphabetical models — when no prefix match is found. These may be completely unrelated to the queried model. Worth a comment so callers aren't surprised:

  // No structural match — return first maxResults models as fallback hints.
  // These may be unrelated; callers should treat them as "here are some valid models"
  // rather than "here are close matches."
  return allModels.filter((id) => id !== modelId).slice(0, maxResults);

[question] validateStreamChunk only validates .usage, not .choices

The validator asserts chunk is StreamingChunk but only validates the .usage field. The consumer code only reads .usage from stream chunks, so this is functionally correct for current use. Is the narrow scope intentional — validation scoped to what's consumed — or is this a gap worth filling for future consumers?


Operational context: We run this endpoint at ~80 completions/day. The strict validator means a single malformed model in OpenRouter's response would take the entire cache refresh offline until the next TTL expiry. Given OpenRouter sometimes returns experimental or preview models with incomplete metadata, this might be worth monitoring after deploy — or at least logging how often the cache goes degraded.

Overall this is a clean improvement. The core validation logic is correct, the tests are thorough, and the degraded state handling is production-appropriate. The validator/guard tension is worth resolving before the next cache-related change touches this code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: unhandled 'e is not iterable' error on OpenRouter chat endpoint

3 participants