
add anthropic caching to system prompt in agent #1655

Merged
tkattkat merged 1 commit into main from add-anthropic-caching-to-system-prompt-in-agent on Feb 6, 2026

Conversation

Collaborator

@tkattkat tkattkat commented Feb 4, 2026

Add Anthropic Prompt Caching for Agent System Prompts

Summary

Added Anthropic's ephemeral cache control to the system prompt in v3AgentHandler.ts. The system-prompt handling was refactored into a single prependSystemMessage() function used by both execute() and stream().
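
For illustration, a minimal sketch of what prependSystemMessage() might look like. The message shape and field names below are simplified stand-ins (the real implementation builds on the AI SDK's message types), so treat this as a sketch rather than the actual v3AgentHandler.ts code:

```typescript
// Minimal sketch, not the actual v3AgentHandler.ts code. A local message
// shape stands in for the AI SDK types to keep the example self-contained.
interface AgentMessage {
  role: "system" | "user" | "assistant";
  content: string;
  providerOptions?: Record<string, Record<string, unknown>>;
}

function prependSystemMessage(
  systemPrompt: string,
  messages: AgentMessage[],
): AgentMessage[] {
  // Wrap the system prompt in a system message carrying Anthropic's
  // ephemeral cache control; non-Anthropic providers ignore providerOptions.
  const systemMessage: AgentMessage = {
    role: "system",
    content: systemPrompt,
    providerOptions: {
      anthropic: { cacheControl: { type: "ephemeral" } },
    },
  };
  return [systemMessage, ...messages];
}
```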

Why

Reduces token costs and latency for Claude models by caching the system prompt server-side. The providerOptions are ignored by non-Anthropic models, so this is safe for all providers.

Limitations

Caching is currently applied to the system prompt only. Extending it to conversation messages would require restructuring message processing so that cache breakpoints land at stable boundaries.
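
As a rough illustration of that future direction (hypothetical, not part of this PR), a breakpoint could be attached to the last message of a prefix that is known not to change between requests, reusing the AgentMessage shape from the sketch above:

```typescript
// Hypothetical sketch only: marks the end of a stable message prefix with an
// ephemeral cache breakpoint. The hard part this PR defers is guaranteeing
// that stablePrefixLength points at a boundary that does not shift as the
// conversation is re-processed.
function markCacheBreakpoint(
  messages: AgentMessage[],
  stablePrefixLength: number,
): AgentMessage[] {
  return messages.map((message, index) =>
    index === stablePrefixLength - 1
      ? {
          ...message,
          providerOptions: {
            ...message.providerOptions,
            anthropic: { cacheControl: { type: "ephemeral" } },
          },
        }
      : message,
  );
}
```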

Test Plan

  • Verify the agent works with Claude and non-Anthropic models
  • Confirm cached_input_tokens appears in usage metrics with Claude

Summary by cubic

Added Anthropic ephemeral caching to the agent’s system prompt to lower Claude token usage and reduce latency. Safe for non-Anthropic models; the options are ignored.

  • New Features

    • System messages now include Anthropic providerOptions with cacheControl: ephemeral.
  • Refactors

    • Introduced prependSystemMessage() and used it in both text generation and streaming to centralize system prompt handling.

Written for commit 6e2eef6. Summary will update on new commits. Review in cubic


changeset-bot Bot commented Feb 4, 2026

⚠️ No Changeset found

Latest commit: 6e2eef6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Contributor

greptile-apps Bot commented Feb 4, 2026

Greptile Overview

Greptile Summary

Implemented Anthropic prompt caching for agent system prompts by creating a new prependSystemMessage() helper function that wraps the system prompt with ephemeral cache control in providerOptions.anthropic.cacheControl. Both execute() and stream() methods were refactored to use this function instead of passing the system prompt as a separate parameter.

Key Changes:

  • Created a prependSystemMessage() function that converts the system prompt into a system message with Anthropic cache control
  • Refactored the execute() method to use prependSystemMessage() instead of a separate system parameter
  • Refactored the stream() method the same way (a sketch of the resulting call shape follows the Benefits list below)
  • The providerOptions.anthropic.cacheControl field is ignored by non-Anthropic models, ensuring compatibility

Benefits:

  • Reduces token costs for Claude models by caching the system prompt server-side
  • Improves latency for subsequent requests with the same system prompt
  • No impact on non-Anthropic models since they ignore the providerOptions field
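
A hedged sketch of the resulting call shape, assuming an AI-SDK-style generateText call. The model id and surrounding function are illustrative (the real handler routes requests through LLMClient), but the message-level providerOptions.anthropic.cacheControl field matches what the PR describes:

```typescript
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

// Illustrative only: the system prompt now travels as the first entry of
// `messages` (carrying the cache control) instead of a separate `system`
// parameter.
async function runAgentTurn(systemPrompt: string, userInput: string) {
  const result = await generateText({
    model: anthropic("claude-3-5-sonnet-latest"),
    messages: [
      {
        role: "system",
        content: systemPrompt,
        providerOptions: {
          anthropic: { cacheControl: { type: "ephemeral" } },
        },
      },
      { role: "user", content: userInput },
    ],
  });
  return result.text;
}
```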

Confidence Score: 5/5

  • This PR is safe to merge with no concerns
  • The implementation is clean, well-documented, and follows best practices. The refactoring consolidates duplicate code into a reusable helper function while adding Anthropic prompt caching. The providerOptions field is safely ignored by non-Anthropic models, ensuring backward compatibility. The code correctly prepends the system message to the messages array as expected by the AI SDK.
  • No files require special attention

Important Files Changed

Filename: packages/core/lib/v3/handlers/v3AgentHandler.ts
Overview: Added a prependSystemMessage() helper that wraps the system prompt with Anthropic cache control; refactored both execute() and stream() to use it

Sequence Diagram

sequenceDiagram
    participant Client
    participant V3AgentHandler
    participant prependSystemMessage
    participant LLMClient
    participant AnthropicAPI

    Client->>V3AgentHandler: execute() or stream()
    V3AgentHandler->>V3AgentHandler: prepareAgent()
    V3AgentHandler->>V3AgentHandler: buildAgentSystemPrompt()
    
    alt execute() flow
        V3AgentHandler->>prependSystemMessage: prependSystemMessage(systemPrompt, messages)
        prependSystemMessage->>prependSystemMessage: Add system message with cache control
        prependSystemMessage-->>V3AgentHandler: messages with cached system prompt
        V3AgentHandler->>LLMClient: generateText({messages})
    else stream() flow
        V3AgentHandler->>prependSystemMessage: prependSystemMessage(systemPrompt, messages)
        prependSystemMessage->>prependSystemMessage: Add system message with cache control
        prependSystemMessage-->>V3AgentHandler: messages with cached system prompt
        V3AgentHandler->>LLMClient: streamText({messages})
    end
    
    LLMClient->>AnthropicAPI: Request with cached system prompt
    Note over AnthropicAPI: Caches system prompt<br/>with ephemeral cache control
    AnthropicAPI-->>LLMClient: Response with cached_input_tokens
    LLMClient-->>V3AgentHandler: Result with usage metrics
    V3AgentHandler-->>Client: AgentResult with cached token count

Contributor

@greptile-apps greptile-apps Bot left a comment


1 file reviewed, no comments


Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Architecture diagram

sequenceDiagram
    participant Agent as V3AgentHandler
    participant LLM as LLMClient
    participant Provider as Model Provider (Claude/Generic)

    Note over Agent,Provider: Text Generation or Streaming Flow

    Agent->>Agent: NEW: prependSystemMessage(systemPrompt, messages)
    Note right of Agent: Constructs message with<br/>providerOptions.anthropic.cacheControl: "ephemeral"

    Agent->>LLM: CHANGED: generateText() or streamText()
    Note right of Agent: System prompt is now prepended to 'messages' array<br/>instead of being passed as 'system' property.

    LLM->>Provider: Send API Request

    alt Provider is Anthropic (Claude)
        Provider->>Provider: Identify ephemeral cache breakpoint
        alt Cache Hit
            Provider-->>LLM: Response (with cached_input_tokens)
        else Cache Miss
            Provider->>Provider: Store system prompt in cache
            Provider-->>LLM: Response (standard usage)
        end
    else Other Provider (OpenAI/etc)
        Provider->>Provider: Ignore anthropic providerOptions
        Provider-->>LLM: Standard Response
    end

    LLM-->>Agent: Return Result / Stream Content

@tkattkat tkattkat merged commit 0e4a144 into main Feb 6, 2026
31 checks passed