# ElevenLabs

> Speech-to-text service implementations using ElevenLabs' file-based and real-time transcription APIs

## Overview

ElevenLabs provides two STT service implementations:

* **`ElevenLabsSTTService`** (HTTP) -- File-based transcription using ElevenLabs' Speech-to-Text API with segmented audio processing. Uploads audio files and receives transcription results directly.
* **`ElevenLabsRealtimeSTTService`** (WebSocket) -- Real-time streaming transcription with ultra-low latency, supporting both partial (interim) and committed (final) transcripts with manual or VAD-based commit strategies.

<CardGroup cols={2}>
  <Card title="ElevenLabs STT API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.elevenlabs.stt.html">
    Pipecat's API methods for ElevenLabs STT integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-elevenlabs-http.py">
    Complete example with ElevenLabs STT and TTS
  </Card>

  <Card title="ElevenLabs Documentation" icon="book" href="https://elevenlabs.io/docs/api-reference/speech-to-text/get">
    Official ElevenLabs STT API documentation
  </Card>

  <Card title="ElevenLabs Platform" icon="microphone" href="https://elevenlabs.io/">
    Access API keys and speech-to-text models
  </Card>
</CardGroup>

## Installation

To use ElevenLabs STT services, install the required dependencies:

```bash theme={null}
uv add "pipecat-ai[elevenlabs]"
```

## Prerequisites

### ElevenLabs Account Setup

Before using ElevenLabs STT services, you need:

1. **ElevenLabs Account**: Sign up at [ElevenLabs Platform](https://elevenlabs.io/)
2. **API Key**: Generate an API key from your account dashboard
3. **Model Access**: Ensure access to the Scribe v2 transcription model (default: `scribe_v2`)

### Required Environment Variables

* `ELEVENLABS_API_KEY`: Your ElevenLabs API key for authentication
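
A minimal sketch of reading the key at startup and failing early when it is missing. The helper name `load_elevenlabs_api_key` is hypothetical and not part of Pipecat; only the `ELEVENLABS_API_KEY` variable name comes from this page.

```python
import os
from typing import Mapping, Optional


def load_elevenlabs_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the ElevenLabs API key, raising if it is unset or blank.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return key
```

The returned string can be passed as `api_key=` to either service, which surfaces a missing key at startup rather than on the first transcription request.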

## ElevenLabsSTTService

<ParamField path="api_key" type="str" required>
  ElevenLabs API key for authentication.
</ParamField>

<ParamField path="aiohttp_session" type="aiohttp.ClientSession" required>
  An aiohttp session for HTTP requests. You must create and manage this
  yourself.
</ParamField>

<ParamField path="base_url" type="str" default="https://api.elevenlabs.io">
  Base URL for the ElevenLabs API.
</ParamField>

<ParamField path="model" type="str" default="scribe_v2" deprecated>
  Model ID for transcription. *Deprecated in v0.0.105. Use
  `settings=ElevenLabsSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz. When `None`, uses the pipeline's configured sample
  rate.
</ParamField>

<ParamField path="settings" type="ElevenLabsSTTService.Settings" default="None">
  Runtime-configurable settings for the STT service. See [Settings](#settings)
  below.
</ParamField>

<ParamField path="params" type="ElevenLabsSTTService.InputParams" default="None" deprecated>
  Configuration parameters for the STT service. *Deprecated in v0.0.105. Use
  `settings=ElevenLabsSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="ttfs_p99_latency" type="float" default="ELEVENLABS_TTFS_P99">
  P99 latency from speech end to final transcript in seconds. Override for your
  deployment.
</ParamField>
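
If you record your own measurements of the time from speech end to final transcript, a value for `ttfs_p99_latency` could be estimated as sketched below. This assumes a simple nearest-rank percentile over samples you collect yourself; the helper name `p99_latency` is hypothetical and not part of Pipecat.

```python
from typing import Sequence


def p99_latency(samples_secs: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of measured latencies, in seconds."""
    if not samples_secs:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_secs)
    # Integer ceiling of 0.99 * n gives the 1-based nearest-rank index.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

The result, in seconds, can then be passed as `ttfs_p99_latency=` when constructing the service.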

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `ElevenLabsSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter          | Type              | Default       | Description                                                              |
| ------------------ | ----------------- | ------------- | ------------------------------------------------------------------------ |
| `model`            | `str`             | `None`        | Model ID for transcription. *(Inherited from base STT settings.)*        |
| `language`         | `Language \| str` | `Language.EN` | Target language for transcription. *(Inherited from base STT settings.)* |
| `tag_audio_events` | `bool`            | `True`        | Include audio events like (laughter), (coughing) in transcription.       |

### Usage

```python theme={null}
import os

import aiohttp
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService

async with aiohttp.ClientSession() as session:
    stt = ElevenLabsSTTService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        aiohttp_session=session,
    )
```

#### With Language and Audio Events

```python theme={null}
import os

import aiohttp
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
from pipecat.transcriptions.language import Language

async with aiohttp.ClientSession() as session:
    stt = ElevenLabsSTTService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        aiohttp_session=session,
        settings=ElevenLabsSTTService.Settings(
            language=Language.ES,
            tag_audio_events=False,
        ),
    )
```

### Notes

* The HTTP service uploads complete audio segments and is best for VAD-segmented transcription.
* The HTTP service has no connection events because each transcription is a separate HTTP request.
* **Multilingual support**: ElevenLabs Scribe supports 99+ languages. The default is `Language.EN` (English). Set `language=None` in settings to enable automatic language detection, which will transcribe whatever language the user speaks.
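
To illustrate what a "complete audio segment" looks like as an uploadable file, the sketch below wraps raw PCM in a WAV container in memory. It is an illustration only, not Pipecat's internal implementation; the function name `pcm_to_wav_bytes` and the 16-bit mono format are assumptions.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)  # mono (assumed)
        wav.setsampwidth(2)  # 16-bit samples (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```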

## ElevenLabsRealtimeSTTService

<ParamField path="api_key" type="str" required>
  ElevenLabs API key for authentication.
</ParamField>

<ParamField path="base_url" type="str" default="api.elevenlabs.io">
  Base URL for the ElevenLabs WebSocket API.
</ParamField>

<ParamField path="model" type="str" default="scribe_v2_realtime" deprecated>
  Model ID for real-time transcription. *Deprecated in v0.0.105. Use
  `settings=ElevenLabsRealtimeSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz. When `None`, uses the pipeline's configured sample
  rate.
</ParamField>

<ParamField path="settings" type="ElevenLabsRealtimeSTTService.Settings" default="None">
  Runtime-configurable settings for the Realtime STT service. See
  [Settings](#settings-2) below.
</ParamField>

<ParamField path="commit_strategy" type="CommitStrategy" default="CommitStrategy.MANUAL">
  How to segment speech. `CommitStrategy.MANUAL` uses Pipecat's VAD to control
  when transcript segments are committed. `CommitStrategy.VAD` uses ElevenLabs'
  built-in VAD for segment boundaries.
</ParamField>

<ParamField path="include_timestamps" type="bool" default="False">
  Whether to include word-level timestamps in transcripts.
</ParamField>

<ParamField path="enable_logging" type="bool" default="False">
  Whether to enable logging on ElevenLabs' side.
</ParamField>

<ParamField path="include_language_detection" type="bool" default="False">
  Whether to include language detection in transcripts.
</ParamField>

<ParamField path="params" type="ElevenLabsRealtimeSTTService.InputParams" default="None" deprecated>
  Configuration parameters for the STT service. *Deprecated in v0.0.105. Use
  `settings=ElevenLabsRealtimeSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="ttfs_p99_latency" type="float" default="ELEVENLABS_REALTIME_TTFS_P99">
  P99 latency from speech end to final transcript in seconds. Override for your
  deployment.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `ElevenLabsRealtimeSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter                    | Type              | Default       | Description                                                                             |
| ---------------------------- | ----------------- | ------------- | --------------------------------------------------------------------------------------- |
| `model`                      | `str`             | `None`        | Model ID for transcription. *(Inherited from base STT settings.)*                       |
| `language`                   | `Language \| str` | `None`        | Language for speech recognition. *(Inherited from base STT settings.)*                  |
| `vad_silence_threshold_secs` | `float`           | `None`        | Seconds of silence before VAD commits (0.3-3.0). Only used with VAD commit strategy.    |
| `vad_threshold`              | `float`           | `None`        | VAD sensitivity (0.1-0.9, lower is more sensitive). Only used with VAD commit strategy. |
| `min_speech_duration_ms`     | `int`             | `None`        | Minimum speech duration for VAD (50-2000ms). Only used with VAD commit strategy.        |
| `min_silence_duration_ms`    | `int`             | `None`        | Minimum silence duration for VAD (50-2000ms). Only used with VAD commit strategy.       |

### Usage

```python theme={null}
import os

from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
)
```

#### With Timestamps and Custom Commit Strategy

```python theme={null}
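import os

from pipecat.services.elevenlabs.stt import CommitStrategy, ElevenLabsRealtimeSTTService
from pipecat.transcriptions.language import Language

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    commit_strategy=CommitStrategy.VAD,  # let ElevenLabs' VAD decide segment boundaries
    include_timestamps=True,
    settings=ElevenLabsRealtimeSTTService.Settings(
        language=Language.EN,  # restrict transcription to English
        vad_silence_threshold_secs=1.0,  # commit after 1 s of silence
    ),
)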
```

### Notes

* **Commit strategies**: The default is `CommitStrategy.MANUAL`, where Pipecat's VAD controls when transcription segments are committed. Set `commit_strategy=CommitStrategy.VAD` to let ElevenLabs handle segment boundaries. With `CommitStrategy.MANUAL`, transcription frames are marked as finalized (`TranscriptionFrame.finalized=True`).
* **Keepalive**: Sends silent audio chunks as keepalive to prevent idle disconnections (keepalive interval: 5s, timeout: 10s).
* **Auto-reconnect**: If the WebSocket connection has closed, the service reconnects automatically when new audio arrives.
* **Multilingual support**: ElevenLabs Scribe supports 99+ languages. The Realtime service defaults to automatic language detection (`language=None`). To restrict transcription to a specific language, set `language` in settings.
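
For intuition about the keepalive note, a silent audio chunk is just a run of zero-valued samples. The sketch below is not the service's internal code; the function name `silent_pcm_chunk`, the 16-bit mono PCM format, and the 100 ms chunk length are assumptions. This page only states the 5 s interval and 10 s timeout.

```python
def silent_pcm_chunk(sample_rate: int = 16000, duration_ms: int = 100) -> bytes:
    """Return `duration_ms` of silence as 16-bit mono PCM (all-zero samples)."""
    num_samples = sample_rate * duration_ms // 1000
    return b"\x00\x00" * num_samples  # two zero bytes per 16-bit sample
```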

### Event Handlers

Supports the standard [service connection events](/api-reference/server/events/service-events):

| Event             | Description                                         |
| ----------------- | --------------------------------------------------- |
| `on_connected`    | Connected to ElevenLabs Realtime STT WebSocket      |
| `on_disconnected` | Disconnected from ElevenLabs Realtime STT WebSocket |

```python theme={null}
@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to ElevenLabs Realtime STT")
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>
