98 changes: 69 additions & 29 deletions docs/providers/anthropic.md
Original file line number Diff line number Diff line change
@@ -117,63 +117,100 @@ When multiple tool results are returned, Prism automatically applies caching to

Please ensure you read Anthropic's [prompt caching documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching), which covers some important information on e.g. minimum cacheable tokens and message order consistency.

## Extended thinking
## Thinking

Claude Sonnet 3.7 supports an optional extended thinking mode, where it will reason before returning its answer. Please ensure your consider [Anthropic's own extended thinking documentation](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) before using extended thinking with caching and/or tools, as there are some important limitations and behaviours to be aware of.
Anthropic models support extended thinking, where the model reasons before returning its answer. Claude 4.6+ models (Opus 4.6, Sonnet 4.6) use **adaptive thinking** (recommended), where Claude dynamically determines when and how much to think. Older models use manual thinking with a fixed token budget.

### Enabling extended thinking and setting budget
Prism supports thinking mode for text and structured with the same API:
Please refer to Anthropic's [adaptive thinking](https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking) and [extended thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) documentation for important limitations and behaviours when using thinking with caching and/or tools.

### Adaptive thinking (recommended for Claude 4.6+)

Adaptive thinking lets Claude decide when and how much to think based on the complexity of each request:

```php
use Prism\Prism\Enums\Provider;
use Prism\Prism\Facades\Prism;

Prism::text()
->using('anthropic', 'claude-sonnet-4-6')
->withPrompt('What is the meaning of life, the universe and everything in popular fiction?')
->withProviderOptions(['thinking' => ['type' => 'adaptive']])
->asText();
```

Use the `effort` parameter to guide how much thinking Claude does:

```php
Prism::text()
->using('anthropic', 'claude-sonnet-4-6')
->withPrompt('Analyze the trade-offs between microservices and monolithic architectures')
->withProviderOptions([
'thinking' => ['type' => 'adaptive'],
'effort' => 'high',
])
->asText();
```

Available effort levels:

| Level | Description |
|----------|--------------------------------------------------------------------------|
| `max` | Maximum capability with no constraints on thinking depth. Opus 4.6 only. |
| `high` | Deep reasoning on complex tasks. This is the default. |
| `medium` | Balanced approach with moderate token savings. |
| `low` | Most efficient. Significant token savings with some capability reduction.|

> [!NOTE]
> Setting `effort` to `"high"` produces the same behavior as omitting the parameter entirely.

> [!TIP]
> The `effort` parameter can also be used without thinking enabled, in which case it controls overall token spend for text responses and tool calls.

Works identically with `Prism::structured()`.
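
Judging by the handler changes in this PR, the `thinking` and `effort` options appear to be forwarded to the request payload as `thinking` and `output_config.effort` respectively. The following is a minimal sketch of that mapping using a plain array in place of Prism's request object; `buildThinkingPayloadSketch()` is a hypothetical name and not part of Prism.

```php
/**
 * Hypothetical illustration (not Prism's API): maps provider options onto
 * the payload keys that the handlers in this PR appear to send.
 *
 * @param array<string, mixed> $providerOptions
 * @return array<string, mixed>
 */
function buildThinkingPayloadSketch(array $providerOptions): array
{
    $payload = [];

    // Adaptive thinking is sent as a bare type with no token budget.
    if (($providerOptions['thinking']['type'] ?? null) === 'adaptive') {
        $payload['thinking'] = ['type' => 'adaptive'];
    }

    // Effort is nested under output_config, independently of thinking.
    if (isset($providerOptions['effort'])) {
        $payload['output_config'] = ['effort' => $providerOptions['effort']];
    }

    return $payload;
}

print_r(buildThinkingPayloadSketch([
    'thinking' => ['type' => 'adaptive'],
    'effort' => 'high',
]));
```

Because the two keys are resolved independently in this sketch, passing only `effort` still yields an `output_config` entry, which matches the tip above about using `effort` without thinking.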

### Manual thinking (legacy)

> [!WARNING]
> Manual thinking with `enabled` and `budgetTokens` is deprecated on Claude 4.6+ models. Use adaptive thinking instead. Manual thinking is still required for older models (Sonnet 4.5, Opus 4.5, Sonnet 3.7, etc.).

```php
Prism::text()
->using('anthropic', 'claude-3-7-sonnet-latest')
->withPrompt('What is the meaning of life, the universe and everything in popular fiction?')
// enable thinking
->withProviderOptions(['thinking' => ['enabled' => true]])
->withProviderOptions(['thinking' => ['enabled' => true]])
->asText();
```

By default Prism will set the thinking budget to the value set in config, or where that isn't set, the minimum allowed (1024).

You can overide the config (or its default) using `withProviderOptions`:
You can override the config (or its default) using `withProviderOptions`:

```php
use Prism\Prism\Enums\Provider;
use Prism\Prism\Facades\Prism;

Prism::text()
->using('anthropic', 'claude-3-7-sonnet-latest')
->withPrompt('What is the meaning of life, the universe and everything in popular fiction?')
// Enable thinking and set a budget
->withProviderOptions([
'thinking' => [
'enabled' => true,
'budgetTokens' => 2048
]
'enabled' => true,
'budgetTokens' => 2048,
],
]);
```
Note that thinking tokens count towards output tokens, so you will be billed for them and your token budget must be less than the max tokens you have set for the request.

If you expect a long response, you should ensure there's enough tokens left for the response - i.e. does (maxTokens - thinkingBudget) leave a sufficient remainder.
Note that thinking tokens count towards output tokens, so you will be billed for them, and your thinking budget must be less than the max tokens you have set for the request. If you expect a long response, ensure enough tokens are left for it, i.e. check that (maxTokens - thinkingBudget) leaves a sufficient remainder.
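
The budget arithmetic can be checked before a request is sent. The following is a minimal sketch in plain PHP; `remainingResponseTokens()` is a hypothetical helper, not part of Prism, and the 1024 floor is the minimum budget mentioned above.

```php
/**
 * Hypothetical helper (not part of Prism): returns how many output tokens
 * remain for the visible response once the thinking budget is reserved.
 *
 * @throws InvalidArgumentException when the budget is below 1024 or is not
 *                                  smaller than the max tokens value.
 */
function remainingResponseTokens(int $maxTokens, int $thinkingBudget): int
{
    if ($thinkingBudget < 1024) {
        throw new InvalidArgumentException('Thinking budget must be at least 1024 tokens.');
    }

    if ($thinkingBudget >= $maxTokens) {
        throw new InvalidArgumentException('Thinking budget must be less than max tokens.');
    }

    return $maxTokens - $thinkingBudget;
}

// With 4096 max tokens and a 2048 budget, half the output is left for the answer.
echo remainingResponseTokens(4096, 2048), PHP_EOL;
```

A check like this could run before building the request, so that a budget which leaves too little room for the answer is caught locally rather than by the API.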

### Inspecting the thinking block

Anthropic returns the thinking block with its response.
Anthropic returns the thinking block with its response. This works identically for both adaptive and manual thinking modes.

You can access it via the additionalContent property on either the Response or the relevant step.

On the Response (easiest if not using tools):

```php
use Prism\Prism\Enums\Provider;
use Prism\Prism\Facades\Prism;

Prism::text()
->using('anthropic', 'claude-3-7-sonnet-latest')
$response = Prism::text()
->using('anthropic', 'claude-sonnet-4-6')
->withPrompt('What is the meaning of life, the universe and everything in popular fiction?')
->withProviderOptions(['thinking' => ['enabled' => true']])
->withProviderOptions(['thinking' => ['type' => 'adaptive']])
->asText();

$response->additionalContent['thinking'];
@@ -185,19 +222,22 @@ On the Step (necessary if using tools, as Anthropic returns the thinking block o
$tools = [...];

$response = Prism::text()
->using('anthropic', 'claude-3-7-sonnet-latest')
->using('anthropic', 'claude-sonnet-4-6')
->withTools($tools)
->withMaxSteps(3)
->withPrompt('What time is the tigers game today and should I wear a coat?')
->withProviderOptions(['thinking' => ['enabled' => true]])
->withProviderOptions(['thinking' => ['type' => 'adaptive']])
->asText();

$response->steps->first()->additionalContent['thinking'];
```

> [!NOTE]
> With adaptive thinking, Claude may skip thinking for simple queries, in which case no thinking block is returned.
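
Since the thinking block may be missing, it can help to read it defensively. The following is a minimal sketch that assumes `additionalContent` is an array keyed by `thinking`, as in the Response example above; `thinkingOrNull()` is a hypothetical helper, and whether the key is absent or empty when thinking is skipped is an assumption.

```php
/**
 * Hypothetical helper (not part of Prism): reads the thinking text from an
 * additionalContent array, returning null when no thinking block is present.
 *
 * @param array<string, mixed> $additionalContent
 */
function thinkingOrNull(array $additionalContent): ?string
{
    $thinking = $additionalContent['thinking'] ?? null;

    // Treat a missing key, a non-string value, and an empty string alike.
    return is_string($thinking) && $thinking !== '' ? $thinking : null;
}

// Stand-in arrays shaped like $response->additionalContent.
$withThinking = ['thinking' => 'Considering the question...'];
$withoutThinking = [];

echo thinkingOrNull($withThinking) ?? 'no thinking returned', PHP_EOL;
echo thinkingOrNull($withoutThinking) ?? 'no thinking returned', PHP_EOL;
```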

### Extended output mode

Claude Sonnet 3.7 also brings extended output mode which increase the output limit to 128k tokens.
Claude Sonnet 3.7 also brings extended output mode, which increases the output limit to 128k tokens.

This feature is currently in beta, so you will need to enable it by adding `output-128k-2025-02-19` to your Anthropic `anthropic_beta` config (see [Configuration](#configuration) above).

@@ -219,9 +259,9 @@ return Prism::text()
->asEventStreamResponse();
```

### Streaming with Extended Thinking
### Streaming with Thinking

When using extended thinking, the reasoning process streams separately from the final answer:
When using thinking (adaptive or manual), the reasoning process streams separately from the final answer:

```php
use Prism\Prism\Enums\StreamEventType;
3 changes: 2 additions & 1 deletion src/Providers/Anthropic/Concerns/ExtractsThinking.php
Original file line number Diff line number Diff line change
@@ -14,7 +14,8 @@ trait ExtractsThinking
*/
protected function extractThinking(array $data): array
{
if ($this->request->providerOptions('thinking.enabled') !== true) {
if ($this->request->providerOptions('thinking.enabled') !== true
&& $this->request->providerOptions('thinking.type') !== 'adaptive') {
return [];
}

33 changes: 25 additions & 8 deletions src/Providers/Anthropic/Handlers/Structured.php
Original file line number Diff line number Diff line change
@@ -93,21 +93,17 @@ public static function buildHttpRequestPayload(PrismRequest $request): array
'model' => $request->model(),
'messages' => MessageMap::map($request->messages(), $request->providerOptions()),
'system' => MessageMap::mapSystemMessages($request->systemPrompts()) ?: null,
'thinking' => $request->providerOptions('thinking.enabled') === true
? [
'type' => 'enabled',
'budget_tokens' => is_int($request->providerOptions('thinking.budgetTokens'))
? $request->providerOptions('thinking.budgetTokens')
: config('prism.anthropic.default_thinking_budget', 1024),
]
: null,
'thinking' => static::resolveThinking($request),
'max_tokens' => $request->maxTokens() ?? 64000,
'temperature' => $request->temperature(),
'top_p' => $request->topP(),
'tools' => static::buildTools($request) ?: null,
'tool_choice' => ToolChoiceMap::map($request->toolChoice()),
'mcp_servers' => $request->providerOptions('mcp_servers'),
'cache_control' => $request->providerOptions('cache_control'),
'output_config' => $request->providerOptions('effort') !== null
? ['effort' => $request->providerOptions('effort')]
: null,
]);

return $structuredStrategy->mutatePayload($basePayload);
@@ -143,6 +139,27 @@ protected static function buildTools(StructuredRequest $request): array
return array_merge($providerTools, $tools);
}

/**
* @return array<string, mixed>|null
*/
protected static function resolveThinking(PrismRequest $request): ?array
{
if ($request->providerOptions('thinking.type') === 'adaptive') {
return ['type' => 'adaptive'];
}

if ($request->providerOptions('thinking.enabled') === true) {
return [
'type' => 'enabled',
'budget_tokens' => is_int($request->providerOptions('thinking.budgetTokens'))
? $request->providerOptions('thinking.budgetTokens')
: config('prism.anthropic.default_thinking_budget', 1024),
];
}

return null;
}

/**
* @param ToolCall[] $toolCalls
*/
Original file line number Diff line number Diff line change
@@ -20,12 +20,12 @@ public function mutatePayload(array $payload): array
{
$schemaArray = $this->request->schema()->toArray();

$payload['output_config'] = [
$payload['output_config'] = array_merge($payload['output_config'] ?? [], [
'format' => [
'type' => 'json_schema',
'schema' => $schemaArray,
],
];
]);

return $payload;
}
Original file line number Diff line number Diff line change
@@ -95,7 +95,8 @@ public function mutateResponse(HttpResponse $httpResponse, PrismResponse $prismR
protected function resolveToolChoice(): string|array|null
{
// Thinking mode doesn't support tool_choice (Anthropic restriction)
if ($this->request->providerOptions('thinking.enabled') === true) {
if ($this->request->providerOptions('thinking.enabled') === true
|| $this->request->providerOptions('thinking.type') === 'adaptive') {
return null;
}

33 changes: 25 additions & 8 deletions src/Providers/Anthropic/Handlers/Text.php
Original file line number Diff line number Diff line change
@@ -75,21 +75,17 @@ public static function buildHttpRequestPayload(PrismRequest $request): array
'model' => $request->model(),
'system' => MessageMap::mapSystemMessages($request->systemPrompts()) ?: null,
'messages' => MessageMap::map($request->messages(), $request->providerOptions()),
'thinking' => $request->providerOptions('thinking.enabled') === true
? [
'type' => 'enabled',
'budget_tokens' => is_int($request->providerOptions('thinking.budgetTokens'))
? $request->providerOptions('thinking.budgetTokens')
: config('prism.anthropic.default_thinking_budget', 1024),
]
: null,
'thinking' => static::resolveThinking($request),
'max_tokens' => $request->maxTokens() ?? 64000,
'temperature' => $request->temperature(),
'top_p' => $request->topP(),
'tools' => static::buildTools($request) ?: null,
'tool_choice' => ToolChoiceMap::map($request->toolChoice()),
'mcp_servers' => $request->providerOptions('mcp_servers'),
'cache_control' => $request->providerOptions('cache_control'),
'output_config' => $request->providerOptions('effort') !== null
? ['effort' => $request->providerOptions('effort')]
: null,
]);
}

@@ -198,6 +194,27 @@ protected static function buildTools(TextRequest $request): array
return array_merge($providerTools, $tools);
}

/**
* @return array<string, mixed>|null
*/
protected static function resolveThinking(PrismRequest $request): ?array
{
if ($request->providerOptions('thinking.type') === 'adaptive') {
return ['type' => 'adaptive'];
}

if ($request->providerOptions('thinking.enabled') === true) {
return [
'type' => 'enabled',
'budget_tokens' => is_int($request->providerOptions('thinking.budgetTokens'))
? $request->providerOptions('thinking.budgetTokens')
: config('prism.anthropic.default_thinking_budget', 1024),
];
}

return null;
}

/**
* @param array<string, mixed> $data
* @return ToolCall[]