feat(ai-proxy): add defaults field for fallback model options #12895
sihyeonn wants to merge 2 commits into apache:master
Conversation
Baoyuantop left a comment
- Supplement documentation: update `ai-proxy.md` and `ai-proxy-multi.md`, adding documentation for the `defaults` field
- Supplement ai-proxy plugin tests: add test cases to `ai-proxy.openai-compatible.t` or relevant test files
- Link to related issues or supplement requirement context: specify the source of use cases for this feature
Thanks for the review! I've addressed your feedback:
Hi @sihyeonn, since this PR is not associated with any issue and is a new feature, we suggest that we first communicate the detailed solution for this feature in an issue, and then push the code forward after the community agrees.
Pull request overview
Adds support for a defaults configuration block to the AI proxy plugins so operators can provide fallback model parameters that apply only when the client request omits them (while still allowing options to always override).
Changes:
- Introduces `defaults` in the `ai-proxy`/`ai-proxy-multi` plugin schemas and documents the new field (EN/ZH).
- Passes `defaults` through the proxy base into the OpenAI-compatible driver layer.
- Adds test coverage for default-vs-user-vs-options precedence.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `apisix/plugins/ai-drivers/openai-base.lua` | Applies `model_defaults` into the outgoing request body before applying `model_options`. |
| `apisix/plugins/ai-proxy/base.lua` | Plumbs `defaults` into `extra_opts` for drivers. |
| `apisix/plugins/ai-proxy/schema.lua` | Adds `defaults` field to single and multi instance schemas. |
| `t/plugin/ai-proxy.openai-compatible.t` | Adds tests validating precedence of defaults vs request vs options. |
| `t/plugin/ai-proxy-multi.openai-compatible.t` | Adds equivalent precedence tests for ai-proxy-multi. |
| `docs/en/latest/plugins/ai-proxy.md` | Documents new `defaults` field for ai-proxy. |
| `docs/en/latest/plugins/ai-proxy-multi.md` | Documents new `instances.defaults` field for ai-proxy-multi. |
| `docs/zh/latest/plugins/ai-proxy.md` | Chinese documentation update for `defaults`. |
| `docs/zh/latest/plugins/ai-proxy-multi.md` | Chinese documentation update for `instances.defaults`. |
```
test-type: defaults
--- response_body
{"max_tokens":512,"model":"server-model","temperature":0.7}
```
These assertions compare the raw JSON string exactly, but the response is produced via cjson.safe.encode on a Lua table with string keys, where key order is not guaranteed. This can make the test flaky across Lua/cjson builds. Prefer asserting via response_body_like/response_body eval (regexes) or decode the JSON and compare fields rather than matching the exact serialized key order.
Already switched to response_body_like with regex in the previous commit to handle non-deterministic key ordering.
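For reference, a regex-based assertion of the kind suggested might look like the following sketch in the Test::Nginx section format these `.t` files use (the pattern is illustrative, not the exact one from the commit; it checks one key at a time so JSON key order no longer matters):

```
--- response_body_like eval
qr/"model":"server-model"/
```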
```
test-type: defaults
--- response_body
{"max_tokens":100,"model":"server-model","temperature":0.5}
```
Same concern as above: the exact JSON string match is flaky under non-deterministic key ordering; prefer response_body_like or decoding the JSON.
```
--- more_headers
test-type: defaults
--- response_body
{"max_tokens":100,"model":"server-model","temperature":0.7}
```
Same concern as above: the exact JSON string match is flaky under non-deterministic key ordering; prefer response_body_like or decoding the JSON.
```
test-type: defaults
--- response_body
{"max_tokens":512,"model":"server-model","temperature":0.7}
```
Same concern as above: the exact JSON string match is flaky under non-deterministic key ordering; prefer response_body_like or decoding the JSON.
```
--- more_headers
test-type: defaults
--- response_body
{"max_tokens":100,"model":"server-model","temperature":0.5}
```
Same concern as above: the exact JSON string match is flaky under non-deterministic key ordering; prefer response_body_like or decoding the JSON.
| { "messages": [ { "role": "user", "content": "hello" } ], "model": "user-model", "max_tokens": 100 } | ||
| --- more_headers | ||
| test-type: defaults | ||
| --- response_body | ||
| {"max_tokens":100,"model":"server-model","temperature":0.7} |
Same concern as above: the exact JSON string match is flaky under non-deterministic key ordering; prefer response_body_like or decoding the JSON.
```lua
-- defaults: apply only when not set in request
if extra_opts.model_defaults then
    for opt, val in pairs(extra_opts.model_defaults) do
        if request_table[opt] == nil then
            request_table[opt] = val
        end
    end
end
```
Applying model_defaults here mutates request_table after apisix/plugins/ai-proxy/base.lua has already derived ctx.var.request_type and stream_options from the original request body. If defaults (or options) sets fields that affect control flow (e.g., stream), APISIX can end up sending a streamed request while still treating it as non-streaming for logging/metrics and without setting stream_options.include_usage. Consider applying defaults/options earlier (before the request_body.stream branch in before_proxy) or re-evaluating stream-related ctx vars after merging defaults/options.
Moved defaults application to before the stream detection block so ctx vars are set correctly. This was addressed in the previous commit.
```lua
local extra_opts = {
    name = ai_instance.name,
    endpoint = core.table.try_read_attr(ai_instance, "override", "endpoint"),
    model_options = ai_instance.options,
    model_defaults = ai_instance.defaults,
    conf = ai_instance.provider_conf or {},
```
Now that ai_instance.defaults is passed through as extra_opts.model_defaults, fields like defaults.model can affect the effective model sent upstream, but before_proxy currently sets ctx.var.llm_model based only on options.model or request_body.model. This means logging/metrics may miss the actual model when it comes solely from defaults. Consider including ai_instance.defaults.model in the model selection logic (with the intended precedence) so ctx vars reflect the request actually sent.
Added defaults.model as a fallback in the llm_model assignment. Also removed the now-unnecessary model_defaults passthrough in extra_opts since defaults are applied directly to request_body in before_proxy.
moonming left a comment
Hi @sihyeonn, thank you for the ai-proxy defaults field proposal!
The concept of having a defaults field for fallback model options is interesting. A few points to discuss:
1. Naming clarity: `defaults` vs `fallback` — the name `defaults` implies these are baseline values that can be overridden, while `fallback` implies they're used when something is missing. Could you clarify the intended semantics? If these are values used when the request doesn't specify them, `defaults` is the right name.
2. Interaction with `options`: how does `defaults` interact with the existing `options` field? We need clear documentation on the precedence: request values > options (override) > defaults (fallback).
3. Schema documentation: please ensure the schema includes clear descriptions for each field in `defaults`.
Looking forward to the design discussion! Thank you.
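The precedence question discussed above can be modeled language-agnostically. Below is a small Python sketch of the merge semantics the PR ultimately settles on (`options` > client request > `defaults`); the function name and parameter values are illustrative, not APISIX's actual API:

```python
def effective_params(defaults: dict, request: dict, options: dict) -> dict:
    """Merge model parameters with precedence: options > request > defaults.

    - defaults fill in only keys the client request omitted (fallback)
    - options always win, even over explicit request values (enforcement)
    """
    merged = dict(defaults)   # lowest precedence: fallback values
    merged.update(request)    # client request overrides defaults
    merged.update(options)    # options enforce values over everything
    return merged


params = effective_params(
    defaults={"temperature": 0.7, "max_tokens": 512},
    request={"model": "user-model", "max_tokens": 100},
    options={"model": "server-model"},
)
# temperature comes from defaults, max_tokens from the request,
# and model is enforced by options
print(params)
```

This mirrors the scenarios exercised by the new tests: a field set in `options` replaces the client's value, while a field set only in `defaults` appears just when the client leaves it out.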
Addressed all review feedback. Thanks for the reviews!

@Baoyuantop Addressed all the review feedback — tests switched to regex matching, defaults applied before stream detection, and llm_model now includes defaults.model fallback. Ready for review.

Hi @sihyeonn, before we push this PR further, I hope you can answer these design questions first: #12895 (review)
Hi @sihyeonn, please fix the code lint error and merge the master branch.
Force-pushed a8a92cf to c343d20.
Separate options and defaults behavior:

- options: always override user request values
- defaults: apply only when not set in user request

This allows more flexible configuration where administrators can enforce certain values (via options) while providing sensible defaults for optional parameters.

Priority order: options > client request > defaults

Closes apache#13149

Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
Force-pushed cc62f9e to ced79c5.
…erty

Add explicit `model` property to model_defaults_schema for consistency with model_options_schema, and improve the description to clarify the fallback semantics (only applied when not set in request).

Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
Hi @moonming and @Baoyuantop, apologies for the very late response — I've been on an extended vacation and kept missing the review notifications. Sorry for the repeated delays!

1. Naming: I'd like to keep the name `defaults`.
2. Priority order: I believe the correct precedence is `options` > client request > `defaults`.
3. Schema documentation: good catch — I've added an explicit `model` property to model_defaults_schema with a clearer description.
Summary
Add a `defaults` field to the `ai-proxy` and `ai-proxy-multi` plugins for fallback model parameters that apply only when the client request omits them.

This is complementary to the existing `options` field:

- `options`: always overrides request values (enforcement)
- `defaults`: applied only when the client does not set the field (fallback)

Priority order: `options` > client request > `defaults`

Closes #13149
Changes
- Add `defaults` field to schema for `ai-proxy` and `ai-proxy-multi`
- Apply defaults in `base.lua` before stream detection
- Include `defaults.model` in `llm_model` variable fallback
- Update `ai-proxy.md` and `ai-proxy-multi.md`
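For illustration, a plugin configuration combining both fields might look like the following sketch. This is hedged: the exact ai-proxy schema layout (provider/auth fields) should be taken from the plugin docs, and the values here are invented; only the `options`/`defaults` split reflects this PR.

```json
{
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer <token>"
        }
      },
      "options": {
        "model": "server-model"
      },
      "defaults": {
        "temperature": 0.7,
        "max_tokens": 512
      }
    }
  }
}
```

With this configuration, a client request specifying `max_tokens` keeps its own value, `temperature` falls back to 0.7 when omitted, and `model` is always rewritten to `server-model`.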