Overview

AnthropicLLMService provides integration with Anthropic’s Claude models, supporting streaming responses, function calling, and prompt caching, along with specialized context handling for Anthropic’s message format.

Installation

To use Anthropic services, install the required dependency:

pip install "pipecat-ai[anthropic]"

You’ll also need to set up your Anthropic API key as an environment variable: ANTHROPIC_API_KEY.

Get your API key from the Anthropic Console.

Frames

Input

  • OpenAILLMContextFrame - Conversation context and history
  • LLMMessagesFrame - Direct message list
  • VisionImageRawFrame - Images for vision processing
  • LLMUpdateSettingsFrame - Runtime parameter updates (see the sketch after this list)
  • LLMEnablePromptCachingFrame - Toggle prompt caching
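
For illustration, runtime parameter updates can be sent by queueing an LLMUpdateSettingsFrame onto a running PipelineTask. A minimal sketch, assuming the frame takes a settings dictionary and that task is the PipelineTask created later on this page:

from pipecat.frames.frames import LLMUpdateSettingsFrame

# Lower the sampling temperature mid-session; the supported keys depend on the service.
await task.queue_frames([LLMUpdateSettingsFrame(settings={"temperature": 0.3})])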

Output

  • LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
  • LLMTextFrame - Streamed completion chunks (collected in the sketch after this list)
  • FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
  • ErrorFrame - API or processing errors
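
As an illustration of consuming these output frames, the sketch below defines a hypothetical ResponseLogger processor that collects LLMTextFrame chunks between the response boundary frames and forwards every frame downstream unchanged:

from pipecat.frames.frames import (
    Frame,
    LLMFullResponseEndFrame,
    LLMFullResponseStartFrame,
    LLMTextFrame,
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class ResponseLogger(FrameProcessor):
    """Collects streamed completion chunks between response boundary frames."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, LLMFullResponseStartFrame):
            self._chunks = []                 # a new response is starting
        elif isinstance(frame, LLMTextFrame):
            self._chunks.append(frame.text)   # accumulate streamed text
        elif isinstance(frame, LLMFullResponseEndFrame):
            print("Completed response:", "".join(self._chunks))

        # Always forward the frame so the rest of the pipeline keeps working.
        await self.push_frame(frame, direction)

A processor like this can be placed in a pipeline downstream of the LLM service.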

Function Calling

Function Calling Guide

Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide

Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.
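
For example, the same OpenAILLMContext used with this service can be seeded and inspected directly. A minimal sketch, assuming the add_message() and get_messages() helpers:

from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext

# Seed the context with a system prompt and an opening user turn.
context = OpenAILLMContext(
    messages=[
        {"role": "system", "content": "You are a concise voice assistant."},
        {"role": "user", "content": "Introduce yourself."},
    ]
)

# Append a new turn and inspect the accumulated history.
context.add_message({"role": "user", "content": "What can you help me with?"})
print(len(context.get_messages()))  # number of messages currently in the context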

Usage Example

import os
from pipecat.services.anthropic.llm import AnthropicLLMService
from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure the service
llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    model="claude-sonnet-4-20250514",
    params=AnthropicLLMService.InputParams(
        temperature=0.7,
        enable_prompt_caching_beta=True
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        }
    },
    required=["location"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context with a system message and an initial user turn
context = OpenAILLMContext(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather like?"}
    ],
    tools=tools
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Register function handler
async def get_weather(params):
    location = params.arguments["location"]
    await params.result_callback(f"Weather in {location}: 72°F and sunny")

llm.register_function("get_weather", get_weather)

# Use in a pipeline (transport, stt, and tts are created elsewhere)
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),    # Handles user messages
    llm,                          # Processes with Anthropic
    tts,
    transport.output(),
    context_aggregator.assistant() # Captures responses
])
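
To run the example, wrap the pipeline in a task and hand it to a runner. A minimal sketch (this must execute inside an async entry point):

from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Wrap the pipeline in a task and run it to completion.
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
runner = PipelineRunner()
await runner.run(task)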

Metrics

The service provides:

  • Time to First Byte (TTFB) - Latency from request to first response token
  • Processing Duration - Total request processing time
  • Token Usage - Prompt tokens, completion tokens, and total usage
  • Cache Metrics - Cache creation and read token usage

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True
    )
)
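
Once enabled, these measurements travel through the pipeline as metrics frames. A minimal sketch of a logging processor, assuming MetricsFrame carries its measurements in a data list:

from pipecat.frames.frames import Frame, MetricsFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class MetricsLogger(FrameProcessor):
    """Prints every metrics frame that flows through the pipeline."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, MetricsFrame):
            for item in frame.data:  # TTFB, processing time, and token usage entries
                print(type(item).__name__, item)
        await self.push_frame(frame, direction)

Place it anywhere after the llm processor in the pipeline to observe its measurements.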

Additional Notes

  • Streaming Responses: All responses are streamed for low latency
  • Context Persistence: Use context aggregators to maintain conversation history
  • Error Handling: Automatic retry logic for rate limits and transient errors
  • Message Format: Automatically converts between OpenAI and Anthropic message formats
  • Prompt Caching: Reduces costs and latency for repeated context patterns (a runtime toggle is sketched below)
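
A minimal sketch of the prompt caching toggle, assuming LLMEnablePromptCachingFrame takes an enable flag and that task is the PipelineTask from the Metrics section:

from pipecat.frames.frames import LLMEnablePromptCachingFrame

# Temporarily disable prompt caching, e.g. while the context is changing rapidly.
await task.queue_frames([LLMEnablePromptCachingFrame(enable=False)])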
