> ## Documentation Index
>
> Fetch the complete documentation index at: https://docs.pipecat.ai/llms.txt
>
> Use this file to discover all available pages before exploring further.

# Anthropic

> Large Language Model service implementation using Anthropic's Claude API

## Overview

`AnthropicLLMService` integrates Anthropic's Claude models. It supports streaming responses, function calling, and prompt caching, with specialized context handling for Anthropic's message format and support for extended thinking.

## Installation

To use Anthropic services, install the required dependency:

```bash
uv add "pipecat-ai[anthropic]"
```

## Prerequisites

### Anthropic Account Setup

Before using Anthropic LLM services, you need:

1. **Anthropic Account**: Sign up at [Anthropic Console](https://console.anthropic.com/)
2. **API Key**: Generate an API key from your console dashboard
3. **Model Selection**: Choose from available Claude models (Claude Sonnet 4.5, Claude Opus 4.6, etc.)

### Required Environment Variables

* `ANTHROPIC_API_KEY`: Your Anthropic API key for authentication

## Configuration

* `api_key`: Anthropic API key for authentication.
* `model`: Claude model name to use (e.g., `"claude-sonnet-4-5-20250929"`, `"claude-opus-4-6-20250929"`). *Deprecated in v0.0.105. Use `settings=AnthropicLLMService.Settings(...)` instead.*
* `settings`: Runtime-configurable model settings. See [Settings](#settings) below.
* `params`: Runtime-configurable model settings. See [Settings](#settings) below. *Deprecated in v0.0.105. Use `settings=AnthropicLLMService.Settings(...)` instead.*
* `client`: Optional custom Anthropic client instance. Useful for custom clients like `AsyncAnthropicBedrock` or `AsyncAnthropicVertex`.
* `retry_timeout_secs`: Request timeout in seconds. Used when `retry_on_timeout` is enabled to determine when to retry.
* `retry_on_timeout`: Whether to retry the request once if it times out. The retry attempt has no timeout limit.

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `AnthropicLLMService.Settings(...)`. These can be updated mid-conversation with `LLMUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter               | Type                      | Default     | Description                                                                                     |
| ----------------------- | ------------------------- | ----------- | ----------------------------------------------------------------------------------------------- |
| `model`                 | `str`                     | `None`      | Anthropic model identifier. *(Inherited from base settings.)*                                   |
| `system_instruction`    | `str`                     | `None`      | System instruction/prompt for the model. *(Inherited from base settings.)*                      |
| `max_tokens`            | `int`                     | `NOT_GIVEN` | Maximum tokens to generate.                                                                     |
| `temperature`           | `float`                   | `NOT_GIVEN` | Sampling temperature (0.0 to 1.0). Lower values are more focused, higher values more creative.  |
| `top_k`                 | `int`                     | `NOT_GIVEN` | Top-k sampling parameter. Limits tokens to the top k most likely.                               |
| `top_p`                 | `float`                   | `NOT_GIVEN` | Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output.                            |
| `enable_prompt_caching` | `bool`                    | `NOT_GIVEN` | Whether to enable Anthropic's prompt caching feature. Reduces costs for repeated context.       |
| `thinking`              | `AnthropicThinkingConfig` | `NOT_GIVEN` | Extended thinking configuration. See [AnthropicThinkingConfig](#anthropicthinkingconfig) below. |

`NOT_GIVEN` values are omitted from the API request entirely, letting the Anthropic API use its own defaults.

### AnthropicThinkingConfig

Configuration for Anthropic's extended thinking feature, which lets the model spend more time reasoning before responding.
| Parameter       | Type                        | Default | Description                                                                                                                                                       |
| --------------- | --------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`          | `"enabled"` or `"disabled"` |         | Whether extended thinking is enabled.                                                                                                                             |
| `budget_tokens` | `int` (optional)            | `None`  | Maximum number of tokens for thinking. Required when `type` is `"enabled"` (minimum 1024 with current models). Not allowed when `type` is `"disabled"`. |

When extended thinking is enabled, the service emits `LLMThoughtStartFrame`, `LLMThoughtTextFrame`, and `LLMThoughtEndFrame` during response generation.

## Usage

### Basic Setup

```python
import os

from pipecat.services.anthropic import AnthropicLLMService

# Note: `model=` is marked deprecated in v0.0.105 (see Configuration);
# prefer `settings=AnthropicLLMService.Settings(model=...)` in new code.
llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    model="claude-sonnet-4-5-20250929",
)
```

### With Custom Settings

```python
import os

from pipecat.services.anthropic import AnthropicLLMService

llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    settings=AnthropicLLMService.Settings(
        model="claude-sonnet-4-5-20250929",
        enable_prompt_caching=True,  # cache repeated context to reduce cost
        max_tokens=2048,
        temperature=0.7,
    ),
)
```

### With Extended Thinking

```python
import os

from pipecat.services.anthropic import AnthropicLLMService

llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    settings=AnthropicLLMService.Settings(
        model="claude-sonnet-4-5-20250929",
        max_tokens=16384,
        thinking=AnthropicLLMService.AnthropicThinkingConfig(
            type="enabled",
            budget_tokens=10000,  # required when enabled; minimum 1024
        ),
    ),
)
```

### Updating Settings at Runtime

Model settings can be changed mid-conversation using `LLMUpdateSettingsFrame`:

```python
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.anthropic.llm import AnthropicLLMSettings

# `worker` is the object your application uses to queue frames into the
# running pipeline (for example, a `PipelineTask`). Call this from async code.
await worker.queue_frame(
    LLMUpdateSettingsFrame(
        delta=AnthropicLLMSettings(
            temperature=0.3,
            max_tokens=1024,
        )
    )
)
```

The `InputParams` / `params=` pattern is deprecated as of v0.0.105.
Use `Settings` / `settings=` instead. See the [Service Settings guide](/pipecat/fundamentals/service-settings) for migration details.

## Notes

* **Prompt caching**: When `enable_prompt_caching` is enabled, Anthropic caches repeated context to reduce costs. Cache control markers are automatically added to the most recent user messages. This is most effective for conversations with large system prompts or long conversation histories.
* **Extended thinking**: Enabling thinking increases response quality for complex tasks but adds latency. When `type="enabled"`, you must provide a `budget_tokens` value (minimum 1024 with current models). Extended thinking is disabled by default.
* **Custom clients**: You can pass custom Anthropic client instances (e.g., `AsyncAnthropicBedrock` or `AsyncAnthropicVertex`) via the `client` parameter to use Anthropic models through other cloud providers.
* **Retry behavior**: When `retry_on_timeout=True`, the first attempt uses the `retry_timeout_secs` timeout. If it times out, a second attempt is made with no timeout limit.
* **System instruction precedence**: If both `system_instruction` (from the constructor) and a system message in the context are set, the constructor's `system_instruction` takes precedence and a warning is logged.

## Event Handlers

`AnthropicLLMService` supports the following event handlers, inherited from [LLMService](/api-reference/server/events/service-events):

| Event                       | Description                                                             |
| --------------------------- | ----------------------------------------------------------------------- |
| `on_completion_timeout`     | Called when an LLM completion request times out                         |
| `on_function_calls_started` | Called when function calls are received and execution is about to start |

```python
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")
```
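
The retry behavior described under Notes (a time-limited first attempt, then one retry with no limit) can be illustrated with a minimal sketch in plain `asyncio`. This is not Pipecat's implementation: `complete_with_retry`, `fake_request`, and `demo` are hypothetical names introduced only for illustration, and the branch for `retry_on_timeout=False` assumes the request is simply awaited without a timeout.

```python
import asyncio


async def complete_with_retry(make_request, retry_timeout_secs, retry_on_timeout=True):
    """Hypothetical helper mirroring the documented retry semantics."""
    if not retry_on_timeout:
        # Assumption: with retries disabled, no timeout is applied at all.
        return await make_request()
    try:
        # First attempt is bounded by retry_timeout_secs.
        return await asyncio.wait_for(make_request(), timeout=retry_timeout_secs)
    except asyncio.TimeoutError:
        # Second (and last) attempt runs with no timeout limit.
        return await make_request()


async def demo():
    attempts = []

    async def fake_request():
        # Stand-in for an LLM completion call; only the first call is slow.
        attempts.append(len(attempts) + 1)
        await asyncio.sleep(0.2 if len(attempts) == 1 else 0.01)
        return f"response from attempt {len(attempts)}"

    result = await complete_with_retry(fake_request, retry_timeout_secs=0.05)
    return result, attempts


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the first call is cancelled when the timeout elapses and the second call completes normally, so a caller sees a single, slower response rather than an error.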