ElevenLabs

Overview

ElevenLabs provides two STT service implementations:

ElevenLabsSTTService (HTTP) — File-based transcription using ElevenLabs’ Speech-to-Text API with segmented audio processing. Uploads audio files and receives transcription results directly.
ElevenLabsRealtimeSTTService (WebSocket) — Real-time streaming transcription with ultra-low latency, supporting both partial (interim) and committed (final) transcripts with manual or VAD-based commit strategies.

ElevenLabs STT API Reference

Pipecat’s API methods for ElevenLabs STT integration

Example Implementation

Complete example with ElevenLabs STT and TTS

ElevenLabs Documentation

Official ElevenLabs STT API documentation

ElevenLabs Platform

Access API keys and speech-to-text models

Installation

To use ElevenLabs STT services, install the required dependencies:

uv add "pipecat-ai[elevenlabs]"

Prerequisites

ElevenLabs Account Setup

Before using ElevenLabs STT services, you need:

ElevenLabs Account: Sign up at ElevenLabs Platform
API Key: Generate an API key from your account dashboard
Model Access: Ensure access to the Scribe v2 transcription model (default: scribe_v2)

Required Environment Variables

ELEVENLABS_API_KEY: Your ElevenLabs API key for authentication
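
A minimal sketch of reading the key once at startup, so that a missing variable fails immediately rather than at the first transcription request. The helper name `require_api_key` is hypothetical and is not part of Pipecat or ElevenLabs.

```python
import os


def require_api_key(name: str = "ELEVENLABS_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

The returned string can be passed as api_key= when constructing either service.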

ElevenLabsSTTService

api_key (str, required): ElevenLabs API key for authentication.

aiohttp_session (aiohttp.ClientSession, required): An aiohttp session for HTTP requests. You must create and close this session yourself.

base_url (str, default: "https://api.elevenlabs.io"): Base URL for the ElevenLabs API.

model (str, default: "scribe_v2", deprecated): Model ID for transcription. Deprecated in v0.0.105. Use settings=ElevenLabsSTTService.Settings(...) instead.

sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.

settings (ElevenLabsSTTService.Settings, default: None): Runtime-configurable settings for the STT service. See Settings below.

params (ElevenLabsSTTService.InputParams, default: None, deprecated): Configuration parameters for the STT service. Deprecated in v0.0.105. Use settings=ElevenLabsSTTService.Settings(...) instead.

ttfs_p99_latency (float, default: ELEVENLABS_TTFS_P99): P99 latency from speech end to final transcript, in seconds. Override it with a value measured in your own deployment.
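
One way to pick an override for the P99 latency value is to record the delay between speech end and final transcript in your own deployment and take the 99th percentile. The sketch below uses the nearest-rank method; `p99_latency` is a hypothetical helper, and how the samples are collected is left open.

```python
def p99_latency(samples_secs: list[float]) -> float:
    """Return the 99th-percentile latency using the nearest-rank method."""
    if not samples_secs:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_secs)
    # Nearest rank = ceil(0.99 * n), computed with integers to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

With only a handful of samples the result is simply the slowest observation, so the estimate is only as good as the number of measurements behind it.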

Settings

Runtime-configurable settings passed via the settings constructor argument using ElevenLabsSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`None`	Model ID for transcription. (Inherited from base STT settings.)
`language`	`Language | str`	`Language.EN`	Target language for transcription. (Inherited from base STT settings.)
`tag_audio_events`	`bool`	`True`	Include audio events like (laughter), (coughing) in transcription.
`keyterms`	`list[str]`	`None`	List of key terms or phrases to bias transcription towards.

Usage

import os

import aiohttp
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService

# The caller owns the session; the context manager closes it on exit.
async with aiohttp.ClientSession() as session:
    stt = ElevenLabsSTTService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        aiohttp_session=session,
    )

With Language and Audio Events

import os

import aiohttp
from pipecat.services.elevenlabs.stt import ElevenLabsSTTService
from pipecat.transcriptions.language import Language

async with aiohttp.ClientSession() as session:
    stt = ElevenLabsSTTService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        aiohttp_session=session,
        settings=ElevenLabsSTTService.Settings(
            language=Language.ES,    # transcribe Spanish instead of the default English
            tag_audio_events=False,  # omit markers such as (laughter)
        ),
    )

Notes

The HTTP service uploads complete audio segments and is best for VAD-segmented transcription.
The HTTP service does not emit connection events, because each segment is sent as a separate HTTP request.
Multilingual support: ElevenLabs Scribe supports 99+ languages. The default is Language.EN (English). Set language=None in settings to enable automatic language detection, which will transcribe whatever language the user speaks.

ElevenLabsRealtimeSTTService

api_key (str, required): ElevenLabs API key for authentication.

base_url (str, default: "api.elevenlabs.io"): Base URL for the ElevenLabs WebSocket API.

model (str, default: "scribe_v2_realtime", deprecated): Model ID for real-time transcription. Deprecated in v0.0.105. Use settings=ElevenLabsRealtimeSTTService.Settings(...) instead.

sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.

settings (ElevenLabsRealtimeSTTService.Settings, default: None): Runtime-configurable settings for the Realtime STT service. See Settings below.

commit_strategy (CommitStrategy, default: CommitStrategy.MANUAL): How to segment speech. CommitStrategy.MANUAL uses Pipecat’s VAD to control when transcript segments are committed. CommitStrategy.VAD uses ElevenLabs’ built-in VAD for segment boundaries.

include_timestamps (bool, default: False): Whether to include word-level timestamps in transcripts.

enable_logging (bool, default: False): Whether to enable logging on ElevenLabs’ side.

include_language_detection (bool, default: False): Whether to include language detection in transcripts.

params (ElevenLabsRealtimeSTTService.InputParams, default: None, deprecated): Configuration parameters for the STT service. Deprecated in v0.0.105. Use settings=ElevenLabsRealtimeSTTService.Settings(...) instead.

ttfs_p99_latency (float, default: ELEVENLABS_REALTIME_TTFS_P99): P99 latency from speech end to final transcript, in seconds. Override it with a value measured in your own deployment.

Settings

Runtime-configurable settings passed via the settings constructor argument using ElevenLabsRealtimeSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`None`	Model ID for transcription. (Inherited from base STT settings.)
`language`	`Language | str`	`None`	Language for speech recognition. (Inherited from base STT settings.)
`keyterms`	`list[str]`	`None`	List of key terms or phrases to bias transcription towards.
`vad_silence_threshold_secs`	`float`	`None`	Seconds of silence before VAD commits (0.3-3.0). Only used with VAD commit strategy.
`vad_threshold`	`float`	`None`	VAD sensitivity (0.1-0.9, lower is more sensitive). Only used with VAD commit strategy.
`min_speech_duration_ms`	`int`	`None`	Minimum speech duration for VAD (50-2000ms). Only used with VAD commit strategy.
`min_silence_duration_ms`	`int`	`None`	Minimum silence duration for VAD (50-2000ms). Only used with VAD commit strategy.
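
The ranges in the table can be checked before the values reach the service. The sketch below is a hypothetical helper, not part of Pipecat; it assumes the documented bounds are inclusive and treats None as "leave unset".

```python
# Documented ranges from the Settings table; bounds are assumed to be inclusive.
VAD_RANGES = {
    "vad_silence_threshold_secs": (0.3, 3.0),
    "vad_threshold": (0.1, 0.9),
    "min_speech_duration_ms": (50, 2000),
    "min_silence_duration_ms": (50, 2000),
}


def check_vad_settings(**values) -> dict:
    """Return only the values that are set, raising if one is out of range.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    checked = {}
    for name, value in values.items():
        if name not in VAD_RANGES:
            raise KeyError(f"unknown VAD setting: {name}")
        if value is None:
            continue  # None means "use the service default"
        low, high = VAD_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        checked[name] = value
    return checked
```

The returned dict could then be unpacked into ElevenLabsRealtimeSTTService.Settings(**checked). These values only take effect with the VAD commit strategy.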

Usage

import os

from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
)

With Timestamps and Custom Commit Strategy

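import os

from pipecat.services.elevenlabs.stt import ElevenLabsRealtimeSTTService, CommitStrategy
from pipecat.transcriptions.language import Language

stt = ElevenLabsRealtimeSTTService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    commit_strategy=CommitStrategy.VAD,  # let ElevenLabs' VAD decide segment boundaries
    include_timestamps=True,             # word-level timestamps
    settings=ElevenLabsRealtimeSTTService.Settings(
        language=Language.EN,            # language is a setting, not a constructor argument
        vad_silence_threshold_secs=1.0,  # only used with the VAD commit strategy
    ),
)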

Notes

Commit strategies: Defaults to manual commit strategy, where Pipecat’s VAD controls when transcription segments are committed. Set commit_strategy=CommitStrategy.VAD to let ElevenLabs handle segment boundaries. When using MANUAL commit strategy, transcription frames are marked as finalized (TranscriptionFrame.finalized=True).
Keepalive: Sends silent audio chunks as keepalive to prevent idle disconnections (keepalive interval: 5s, timeout: 10s).
Auto-reconnect: Automatically reconnects if the WebSocket connection is closed when new audio arrives.
Multilingual support: ElevenLabs Scribe supports 99+ languages. The Realtime service defaults to automatic language detection (language=None). To restrict transcription to a specific language, set language in settings.
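
To make the keepalive note concrete, here is a sketch of what a silent audio chunk looks like, assuming 16-bit mono PCM. The helper name, sample rate, and chunk duration are illustrative assumptions. The service sends its own keepalive audio, so application code does not need to do this.

```python
def silent_pcm_chunk(sample_rate: int, duration_ms: int) -> bytes:
    """Return duration_ms of silence as 16-bit mono PCM (two zero bytes per sample)."""
    if sample_rate <= 0 or duration_ms <= 0:
        raise ValueError("sample_rate and duration_ms must be positive")
    num_samples = sample_rate * duration_ms // 1000
    return b"\x00\x00" * num_samples
```

For example, 100 ms at 16 kHz is 1,600 samples, or 3,200 bytes of zeros.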

Event Handlers

Supports the standard service connection events:

Event	Description
`on_connected`	Connected to ElevenLabs Realtime STT WebSocket
`on_disconnected`	Disconnected from ElevenLabs Realtime STT WebSocket

@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to ElevenLabs Realtime STT")

The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.
