# Anthropic

> Large Language Model service implementation using Anthropic's Claude API

## Overview

`AnthropicLLMService` integrates Anthropic's Claude models into a Pipecat pipeline. It supports streaming responses, function calling, prompt caching, and extended thinking, and it handles context in Anthropic's message format.

<CardGroup cols={2}>
  <Card title="Anthropic LLM API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.anthropic.llm.html">
    Pipecat's API methods for Anthropic Claude integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/function-calling/function-calling-anthropic.py">
    Complete example with function calling
  </Card>

  <Card title="Anthropic Documentation" icon="book" href="https://docs.anthropic.com/en/api/messages">
    Official Anthropic API documentation and features
  </Card>

  <Card title="Anthropic Console" icon="microphone" href="https://console.anthropic.com/">
    Access Claude models and API keys
  </Card>
</CardGroup>

## Installation

To use Anthropic services, install the required dependency:

```bash theme={null}
uv add "pipecat-ai[anthropic]"
```

## Prerequisites

### Anthropic Account Setup

Before using Anthropic LLM services, you need:

1. **Anthropic Account**: Sign up at [Anthropic Console](https://console.anthropic.com/)
2. **API Key**: Generate an API key from your console dashboard
3. **Model Selection**: Choose from available Claude models (Claude Sonnet 4.5, Claude Opus 4.6, etc.)

### Required Environment Variables

* `ANTHROPIC_API_KEY`: Your Anthropic API key for authentication

## Configuration

<ParamField path="api_key" type="str" required>
  Anthropic API key for authentication.
</ParamField>

<ParamField path="model" type="str" default="None" deprecated>
  Claude model name to use (e.g., `"claude-sonnet-4-5-20250929"`,
  `"claude-opus-4-6-20250929"`). *Deprecated in v0.0.105. Use
  `settings=AnthropicLLMService.Settings(...)` instead.*
</ParamField>

<ParamField path="settings" type="AnthropicLLMService.Settings" default="None">
  Runtime-configurable model settings. See [Settings](#settings) below.
</ParamField>

<ParamField path="params" type="InputParams" default="None" deprecated>
  Runtime-configurable model settings. See [Settings](#settings) below.
  *Deprecated in v0.0.105. Use `settings=AnthropicLLMService.Settings(...)`
  instead.*
</ParamField>

<ParamField path="client" type="AsyncAnthropic" default="None">
  Optional custom Anthropic client instance. Useful for custom clients like
  `AsyncAnthropicBedrock` or `AsyncAnthropicVertex`.
</ParamField>

<ParamField path="retry_timeout_secs" type="float" default="5.0">
  Timeout in seconds for the first request attempt. Applied only when
  `retry_on_timeout` is enabled; if it elapses, the request is retried once.
</ParamField>

<ParamField path="retry_on_timeout" type="bool" default="False">
  Whether to retry the request once if it times out. The retry attempt has no
  timeout limit.
</ParamField>
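
The interaction between `retry_timeout_secs` and `retry_on_timeout` can be sketched with plain `asyncio`. This is an illustration of the behavior described above, not Pipecat's implementation; `call_with_single_retry` and `make_request` are hypothetical names.

```python
import asyncio


async def call_with_single_retry(make_request, retry_timeout_secs=5.0, retry_on_timeout=False):
    """Await make_request(); optionally retry once, unbounded, after a timeout."""
    if not retry_on_timeout:
        # Retry disabled: run the request without applying the retry timeout.
        return await make_request()
    try:
        # First attempt is bounded by retry_timeout_secs.
        return await asyncio.wait_for(make_request(), timeout=retry_timeout_secs)
    except asyncio.TimeoutError:
        # Second (and last) attempt runs with no timeout limit.
        return await make_request()
```

Because the second attempt is unbounded, a slow request can still block for a long time; a lower `retry_timeout_secs` only makes the retry start sooner.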

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `AnthropicLLMService.Settings(...)`. These can be updated mid-conversation with `LLMUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter               | Type                      | Default     | Description                                                                                     |
| ----------------------- | ------------------------- | ----------- | ----------------------------------------------------------------------------------------------- |
| `model`                 | `str`                     | `None`      | Anthropic model identifier. *(Inherited from base settings.)*                                   |
| `system_instruction`    | `str`                     | `None`      | System instruction/prompt for the model. *(Inherited from base settings.)*                      |
| `max_tokens`            | `int`                     | `NOT_GIVEN` | Maximum tokens to generate.                                                                     |
| `temperature`           | `float`                   | `NOT_GIVEN` | Sampling temperature (0.0 to 1.0). Lower values are more focused, higher values more creative.  |
| `top_k`                 | `int`                     | `NOT_GIVEN` | Top-k sampling parameter. Limits tokens to the top k most likely.                               |
| `top_p`                 | `float`                   | `NOT_GIVEN` | Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output.                            |
| `enable_prompt_caching` | `bool`                    | `NOT_GIVEN` | Whether to enable Anthropic's prompt caching feature. Reduces costs for repeated context.       |
| `thinking`              | `AnthropicThinkingConfig` | `NOT_GIVEN` | Extended thinking configuration. See [AnthropicThinkingConfig](#anthropicthinkingconfig) below. |

<Note>
  `NOT_GIVEN` values are omitted from the API request entirely, letting the
  Anthropic API use its own defaults.
</Note>
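
To make the note above concrete, here is a small sketch of how a sentinel such as `NOT_GIVEN` lets unset values drop out of a request. The `_NotGiven` class and `build_request_kwargs` helper are stand-ins written for this example; they are not the objects Pipecat or the Anthropic SDK use internally.

```python
from typing import Any


class _NotGiven:
    """Stand-in sentinel meaning "the caller did not set this value"."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def build_request_kwargs(**settings: Any) -> dict[str, Any]:
    """Keep only the settings that were explicitly provided."""
    return {key: value for key, value in settings.items() if value is not NOT_GIVEN}


kwargs = build_request_kwargs(
    max_tokens=2048,
    temperature=NOT_GIVEN,
    top_k=NOT_GIVEN,
    top_p=0.9,
)
print(kwargs)  # {'max_tokens': 2048, 'top_p': 0.9}
```

A dedicated sentinel is used instead of `None` so that "not set" stays distinguishable from a value the caller deliberately passed.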

### AnthropicThinkingConfig

Configuration for Anthropic's extended thinking feature, which causes the model to spend more time reasoning before responding.

| Parameter       | Type                        | Default | Description                                                                                                                                      |
| --------------- | --------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| `type`          | `"enabled"` or `"disabled"` |         | Whether extended thinking is enabled.                                                                                                            |
| `budget_tokens` | `int` (optional)            | `None`  | Maximum number of tokens for thinking. Currently required when type is "enabled", minimum 1024 with today's models. Not allowed when "disabled". |

When extended thinking is enabled, the service emits `LLMThoughtStartFrame`, `LLMThoughtTextFrame`, and `LLMThoughtEndFrame` during response generation.
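
The constraints in the table can be checked before a configuration is built. The sketch below encodes only the rules stated above (a budget is required and must be at least 1024 when enabled, and is not allowed when disabled); `thinking_params` is a hypothetical helper, and the actual validation is performed by Pipecat and the Anthropic API.

```python
from typing import Optional

MIN_THINKING_BUDGET = 1024  # minimum stated above for current models


def thinking_params(thinking_type: str, budget_tokens: Optional[int] = None) -> dict:
    """Validate a thinking configuration and return it as a plain dict."""
    if thinking_type == "enabled":
        if budget_tokens is None:
            raise ValueError("budget_tokens is required when thinking is enabled")
        if budget_tokens < MIN_THINKING_BUDGET:
            raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
        return {"type": "enabled", "budget_tokens": budget_tokens}
    if thinking_type == "disabled":
        if budget_tokens is not None:
            raise ValueError("budget_tokens is not allowed when thinking is disabled")
        return {"type": "disabled"}
    raise ValueError(f"unknown thinking type: {thinking_type!r}")
```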

## Usage

### Basic Setup

```python theme={null}
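import os

from pipecat.services.anthropic import AnthropicLLMService

llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    # `model=` is deprecated as of v0.0.105; pass the model through settings.
    settings=AnthropicLLMService.Settings(
        model="claude-sonnet-4-5-20250929",
    ),
)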
```

### With Custom Settings

```python theme={null}
import os

from pipecat.services.anthropic import AnthropicLLMService

llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    settings=AnthropicLLMService.Settings(
        model="claude-sonnet-4-5-20250929",
        enable_prompt_caching=True,
        max_tokens=2048,
        temperature=0.7,
    ),
)
```

### With Extended Thinking

```python theme={null}
import os

from pipecat.services.anthropic import AnthropicLLMService

llm = AnthropicLLMService(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    settings=AnthropicLLMService.Settings(
        model="claude-sonnet-4-5-20250929",
        max_tokens=16384,
        thinking=AnthropicLLMService.AnthropicThinkingConfig(
            type="enabled",
            budget_tokens=10000,
        ),
    ),
)
```

### Updating Settings at Runtime

Model settings can be changed mid-conversation using `LLMUpdateSettingsFrame`:

```python theme={null}
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.anthropic.llm import AnthropicLLMSettings

# `task` is the running PipelineTask for your pipeline.
await task.queue_frame(
    LLMUpdateSettingsFrame(
        delta=AnthropicLLMSettings(
            temperature=0.3,
            max_tokens=1024,
        )
    )
)
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>

## Notes

* **Prompt caching**: When `enable_prompt_caching` is enabled, Anthropic caches repeated context to reduce costs. Cache control markers are automatically added to the most recent user messages. This is most effective for conversations with large system prompts or long conversation histories.
* **Extended thinking**: Enabling thinking can improve response quality on complex tasks, at the cost of added latency. When `type="enabled"`, you must provide a `budget_tokens` value (minimum 1024 with current models). Extended thinking is disabled by default.
* **Custom clients**: You can pass custom Anthropic client instances (e.g., `AsyncAnthropicBedrock` or `AsyncAnthropicVertex`) via the `client` parameter to use Anthropic models through other cloud providers.
* **Retry behavior**: When `retry_on_timeout=True`, the first attempt uses the `retry_timeout_secs` timeout. If it times out, a second attempt is made with no timeout limit.
* **System instruction precedence**: If both `system_instruction` (from the constructor) and a system message in the context are set, the constructor's `system_instruction` takes precedence and a warning is logged.
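
The prompt caching note above says cache control markers are added to the most recent user messages. The sketch below shows what that can look like on Anthropic-format messages, where a `cache_control` entry of type `"ephemeral"` is attached to a content block. The helper `mark_recent_user_messages` and its default `count=2` are assumptions made for illustration; Pipecat applies its own selection logic automatically.

```python
import copy


def mark_recent_user_messages(messages, count=2):
    """Return a copy of messages with cache_control on the last `count` user turns."""
    marked = copy.deepcopy(messages)
    remaining = count
    for message in reversed(marked):
        if remaining == 0:
            break
        if message["role"] != "user":
            continue
        content = message["content"]
        if isinstance(content, str):
            # Promote a plain string to a content block so it can carry the marker.
            content = [{"type": "text", "text": content}]
            message["content"] = content
        if content:
            # The marker goes on the last content block of the user turn.
            content[-1]["cache_control"] = {"type": "ephemeral"}
            remaining -= 1
    return marked
```

The input list is left untouched, so the same conversation history can be re-marked each turn as new user messages arrive.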

## Event Handlers

`AnthropicLLMService` supports the following event handlers, inherited from [LLMService](/api-reference/server/events/service-events):

| Event                       | Description                                                             |
| --------------------------- | ----------------------------------------------------------------------- |
| `on_completion_timeout`     | Called when an LLM completion request times out                         |
| `on_function_calls_started` | Called when function calls are received and execution is about to start |

```python theme={null}
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")
```
