Merged
# Conflicts:
#	verifiers/scripts/eval.py
#	verifiers/types.py
Cursor Bugbot has reviewed your changes and found 2 potential issues.
This was referenced Feb 14, 2026
snimu added a commit that referenced this pull request on Feb 14, 2026:
Adapt rlm_env.py to fully use the provider-agnostic types introduced in #897 (unified client interface):

- Use flat ToolCall attributes (name, arguments) instead of nested function object dance
- Return ToolMessage objects from _call_sub_tool instead of raw dicts
- Use Client type annotation instead of Any for client parameters
- Pass tool_defs directly to get_model_response instead of via state
- Use typed AssistantMessage access in no_tools_called stop condition
- Simplify _extract_tokens_from_response (remove dead dict code paths)
- Fix SubLLMResult final_content type narrowing for MessageContent union

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
snimu added a commit that referenced this pull request on Feb 15, 2026:
* migrate rlm_env to unified client types

Adapt rlm_env.py to fully use the provider-agnostic types introduced in #897 (unified client interface):

- Use flat ToolCall attributes (name, arguments) instead of nested function object dance
- Return ToolMessage objects from _call_sub_tool instead of raw dicts
- Use Client type annotation instead of Any for client parameters
- Pass tool_defs directly to get_model_response instead of via state
- Use typed AssistantMessage access in no_tools_called stop condition
- Simplify _extract_tokens_from_response (remove dead dict code paths)
- Fix SubLLMResult final_content type narrowing for MessageContent union

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* restore prompt_state tool_defs as safety measure

Restore setting prompt_state["tool_defs"] in _call_sub_llm_api alongside the new direct tool_defs kwarg pass. While both paths resolve equivalently through resolve_optional_args, keeping the state key is safer for any code that may read it downstream.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Description
Based on PR #788, which added provider-agnostic types for the message/client/tool/response flow.
This PR replaces direct calls to native clients (e.g. `AsyncOpenAI`) and types (e.g. OAI-style tool message and response types) with a provider-agnostic `vf.Client` adapter that converts `Messages` and `Tool` definitions to native provider requests and normalizes outputs into a unified `vf.Response` (including usage, tool calls, and optional reasoning content). It adds first-class Anthropic support via `AnthropicMessagesClient`, and integrates interleaved thinking support into the default `OpenAIChatCompletionsClient`.

Client Interface
`vf.Client` is the adapter layer that wraps a native SDK client and standardizes everything into a `vf.Response`. Each client implementation defines four core methods:

- `to_native_prompt` — Convert `vf.Messages` into provider-native prompt format.
- `get_native_response` — Execute the provider-native API call.
- `raise_from_native_response` (optional) — Map/raise provider-specific errors (e.g. overlong prompt).
- `from_native_response` — Convert provider-native output into a unified `vf.Response`.

We intentionally moved to custom unified types (instead of continuing to normalize to OpenAI-only types) because some provider features do not map cleanly to OAI schemas. The current type system is provider-agnostic and supports multimodal/reasoning/tool patterns across clients. We currently implement the following clients:
- `OpenAICompletionsClient`
- `OpenAIChatCompletionsClient`
- `OpenAIChatCompletionsTokenClient`
- `AnthropicMessagesClient`

This architecture is extensible to additional providers/API surfaces (including future OAI Responses-style adapters).
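The four core methods above might be wired together like this. This is a minimal sketch with toy stand-in types (`Response`, `EchoClient` are hypothetical); the real `vf.Client` signatures, async behavior, and field names may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any

@dataclass
class Response:  # toy stand-in for vf.Response
    content: str
    usage: dict

class Client(ABC):
    """Adapter wrapping a native SDK client (shape assumed for illustration)."""

    @abstractmethod
    def to_native_prompt(self, messages: list[dict]) -> Any: ...

    @abstractmethod
    def get_native_response(self, native_prompt: Any) -> Any: ...

    def raise_from_native_response(self, native_response: Any) -> None:
        # Optional hook: map/raise provider-specific errors
        # (e.g. an overlong-prompt error).
        pass

    @abstractmethod
    def from_native_response(self, native_response: Any) -> Response: ...

    def get_response(self, messages: list[dict]) -> Response:
        # The unified entry point composes the four core methods.
        native_prompt = self.to_native_prompt(messages)
        native_response = self.get_native_response(native_prompt)
        self.raise_from_native_response(native_response)
        return self.from_native_response(native_response)

class EchoClient(Client):
    """Toy 'provider' that upper-cases the last user message."""

    def to_native_prompt(self, messages):
        return messages[-1]["content"]

    def get_native_response(self, native_prompt):
        return {"text": native_prompt.upper(), "tokens": len(native_prompt)}

    def from_native_response(self, native_response):
        return Response(content=native_response["text"],
                        usage={"total_tokens": native_response["tokens"]})

resp = EchoClient().get_response([{"role": "user", "content": "hi"}])
print(resp.content)  # → HI
```

The point of the split is that environments only ever touch `get_response` and the unified `Response`; all provider quirks live in the four overridable hooks.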
Tests
`gpt-4.1-mini` via PI inference and OAI chat completions client

```
uv run vf-eval continuation-quality -n1 -r1 -d -v
uv run vf-eval gsm8k -n1 -r1 -d -v
uv run vf-eval wiki-search -n1 -r1 -d -v
```

`glm-4.7` via PI inference and OAI chat completions client

```
uv run vf-eval continuation-quality -n1 -r1 -d -v -m glm-4.7
uv run vf-eval gsm8k -n1 -r1 -d -v -m glm-4.7
uv run vf-eval wiki-search -n1 -r1 -d -v -m glm-4.7
```

Variety of models via native API

```
uv run vf-eval wiki-search -n1 -r1 -d -v -m deepseek-reasoner -b https://api.deepseek.com/v1 -k DEEPSEEK_API_KEY
uv run vf-eval wiki-search -n1 -r1 -d -v -m kimi-k2.5 -b https://api.moonshot.ai/v1 -k MOONSHOT_API_KEY
```

Against vLLM server

```
uv run inference --model.name Qwen/Qwen3-4B-Thinking-2507 --tensor-parallel-size 2 \
  --tool-call-parser hermes \
  --reasoning-parser deepseek_r1 \
  --enable-auto-tool-choice
uv run vf-eval wiki-search -n1 -r1 -d -v -m Qwen/Qwen3-4B-Thinking-2507 -b http://localhost:8000/v1
```

Type of Change
Testing
`uv run pytest` locally.

Checklist
Note
High Risk
Broad, breaking interface refactor across client invocation, tool schemas, and response types; mistakes can affect all model calls and endpoint/CLI configuration across providers.
Overview
Moves generation to a provider-agnostic client adapter: environments now call `Client.get_response()` and receive a unified `vf.Response`, with typed tool definitions (`tool_defs`/`Tool`) replacing OpenAI-specific `oai_tools` and legacy native response types.

Adds first-class provider selection via `ClientConfig.client_type`/`ClientType`, updates the endpoint registry and CLI to carry `api_client_type` (with a `type` shorthand in registries), and switches built-in Anthropic model aliases to direct `api.anthropic.com` with `ANTHROPIC_API_KEY` while adding DeepSeek endpoints.
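The provider-selection flow described above might look roughly like this. Everything here is a hypothetical sketch (the `ClientType` values, `ClientConfig` fields, and registry layout are assumptions, not the actual verifiers definitions).

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical shapes for illustration; the real ClientType/ClientConfig
# in verifiers may define different members and fields.
class ClientType(str, Enum):
    OPENAI_CHAT = "openai-chat"
    ANTHROPIC_MESSAGES = "anthropic-messages"

@dataclass
class ClientConfig:
    model: str
    client_type: ClientType = ClientType.OPENAI_CHAT

# A registry entry carrying the provider choice via the `type` shorthand:
ENDPOINTS = {
    "claude-opus": {
        "base_url": "https://api.anthropic.com",
        "key_var": "ANTHROPIC_API_KEY",
        "type": ClientType.ANTHROPIC_MESSAGES,
    },
}

# The CLI/registry would resolve the alias into a typed config:
cfg = ClientConfig(model="claude-opus",
                   client_type=ENDPOINTS["claude-opus"]["type"])
```

Dispatching on an enum rather than on string model prefixes keeps the endpoint registry the single source of truth for which adapter handles a given alias.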
Messages, replaces OpenAI-specific test mocking with a newMockClient, and adds focused tests for auth/overlong-prompt error handling, multimodal prompt conversions, interception serialization, and message normalization; docs/reference and eval/development/test docs are updated, andanthropicis added as a dependency.Written by Cursor Bugbot for commit 7ecbd40. This will update automatically on new commits. Configure here.