Overview

ElevenLabs provides high-quality text-to-speech synthesis with two service implementations:
  • ElevenLabsTTSService (WebSocket) — Real-time streaming with word-level timestamps, audio context management, and interruption handling. Recommended for interactive applications.
  • ElevenLabsHttpTTSService (HTTP) — Simpler batch-style synthesis. Suitable for non-interactive use cases or when WebSocket connections are not possible.

Installation

pip install "pipecat-ai[elevenlabs]"

Prerequisites

  1. ElevenLabs Account: Sign up at ElevenLabs
  2. API Key: Generate an API key from your account dashboard
  3. Voice Selection: Choose voice IDs from the voice library
Set the following environment variable:
export ELEVENLABS_API_KEY=your_api_key
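
The usage examples on this page read the key with os.getenv, which returns None when the variable is unset. The following is a minimal sketch of a fail-fast check. The helper name require_env is hypothetical and is not part of Pipecat.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration only; not part of Pipecat.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling require_env("ELEVENLABS_API_KEY") at startup surfaces a missing key before the service is constructed, rather than at the first synthesis request.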

Configuration

ElevenLabsTTSService

api_key
str
required
ElevenLabs API key.
voice_id
str
required
deprecated
Voice ID from the voice library. Deprecated in v0.0.105. Use settings=ElevenLabsTTSService.Settings(voice=...) instead.
model
str
default:"eleven_turbo_v2_5"
deprecated
ElevenLabs model ID. Use a multilingual model variant (e.g. eleven_multilingual_v2) if you need non-English language support. Deprecated in v0.0.105. Use settings=ElevenLabsTTSService.Settings(model=...) instead.
url
str
default:"wss://api.elevenlabs.io"
WebSocket endpoint URL. Override for custom or proxied deployments.
sample_rate
int
default:"None"
Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
text_aggregation_mode
TextAggregationMode
default:"TextAggregationMode.SENTENCE"
Controls how incoming text is aggregated before synthesis. SENTENCE (default) buffers text until sentence boundaries, producing more natural speech. TOKEN streams tokens directly for lower latency. Import from pipecat.services.tts_service.
aggregate_sentences
bool
default:"None"
deprecated
Deprecated in v0.0.104. Use text_aggregation_mode instead.
params
InputParams
default:"None"
deprecated
Deprecated in v0.0.105. Use settings=ElevenLabsTTSService.Settings(...) instead.
settings
ElevenLabsTTSService.Settings
default:"None"
Runtime-configurable settings. See Settings below.
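
The deprecation notes above map the old voice_id and model constructor arguments to the voice and model fields of Settings. As a rough sketch of that mapping, assuming existing code stores its constructor arguments in a dict, the hypothetical helper below (not part of Pipecat) separates the deprecated arguments from the rest. It does not attempt to translate the deprecated params argument.

```python
from typing import Any, Dict, Tuple

# Deprecated constructor argument -> Settings field, per the parameter list above.
LEGACY_TO_SETTINGS = {"voice_id": "voice", "model": "model"}


def split_legacy_kwargs(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split kwargs into (constructor kwargs, settings kwargs).

    Hypothetical migration helper for illustration only.
    """
    ctor_kwargs: Dict[str, Any] = {}
    settings_kwargs: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key in LEGACY_TO_SETTINGS:
            # Rename the deprecated argument to its Settings field.
            settings_kwargs[LEGACY_TO_SETTINGS[key]] = value
        else:
            ctor_kwargs[key] = value
    return ctor_kwargs, settings_kwargs
```

The two dicts could then be passed as ElevenLabsTTSService(**ctor_kwargs, settings=ElevenLabsTTSService.Settings(**settings_kwargs)).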

ElevenLabsHttpTTSService

The HTTP service accepts the same parameters as the WebSocket service, with these differences:
aiohttp_session
aiohttp.ClientSession
required
An aiohttp session for HTTP requests. You must create and manage this yourself.
base_url
str
default:"https://api.elevenlabs.io"
HTTP API base URL (instead of url for WebSocket).
The HTTP service uses ElevenLabsHttpTTSSettings, which also includes:
optimize_streaming_latency
int
default:"None"
Latency optimization level (0–4). Higher values reduce latency at the cost of quality.

Settings

Runtime-configurable settings passed via the settings constructor argument using ElevenLabsTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | None | ElevenLabs model identifier. (Inherited from base settings.) |
| voice | str | None | Voice identifier. (Inherited from base settings.) |
| language | Language \| str | None | Language code. Only effective with multilingual models. (Inherited from base settings.) |
| stability | float | NOT_GIVEN | Voice consistency (0.0–1.0). Lower values are more expressive, higher values are more consistent. |
| similarity_boost | float | NOT_GIVEN | Voice clarity and similarity to the original (0.0–1.0). |
| style | float | NOT_GIVEN | Style exaggeration (0.0–1.0). Higher values amplify the voice’s style. |
| use_speaker_boost | bool | NOT_GIVEN | Enhance clarity and target speaker similarity. |
| speed | float | NOT_GIVEN | Speech rate. WebSocket: 0.7–1.2. HTTP: 0.25–4.0. |
| apply_text_normalization | Literal | NOT_GIVEN | Text normalization: "auto", "on", or "off". |

NOT_GIVEN values use the ElevenLabs API defaults. See ElevenLabs voice settings for details on how these parameters interact.
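
Because the valid speed range differs between the WebSocket and HTTP services, it can help to check values before building a Settings object. The sketch below encodes only the ranges listed in the table above. The function name validate_voice_settings and the transport labels "websocket" and "http" are assumptions made for illustration, not Pipecat APIs.

```python
# Speed ranges as documented in the settings table.
SPEED_RANGES = {"websocket": (0.7, 1.2), "http": (0.25, 4.0)}

# Fields documented with a 0.0-1.0 range.
UNIT_RANGE_FIELDS = ("stability", "similarity_boost", "style")


def validate_voice_settings(transport: str, **values: float) -> None:
    """Raise ValueError if a value falls outside its documented range.

    Hypothetical helper; fields that are not passed are not checked.
    """
    for name in UNIT_RANGE_FIELDS:
        value = values.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0, got {value}")

    speed = values.get("speed")
    if speed is not None:
        low, high = SPEED_RANGES[transport]
        if not low <= speed <= high:
            raise ValueError(
                f"speed must be within {low}-{high} for {transport}, got {speed}"
            )
```

For example, validate_voice_settings("websocket", stability=0.7, speed=1.1) passes, while speed=2.0 is rejected for "websocket" but accepted for "http".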

Usage

Basic Setup

import os

from pipecat.services.elevenlabs import ElevenLabsTTSService

tts = ElevenLabsTTSService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    settings=ElevenLabsTTSService.Settings(
        voice="21m00Tcm4TlvDq8ikWAM",  # Rachel
    ),
)

With Voice Customization

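# Language is needed for the language= setting below.
from pipecat.transcriptions.language import Language

tts = ElevenLabsTTSService(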
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    settings=ElevenLabsTTSService.Settings(
        voice="21m00Tcm4TlvDq8ikWAM",
        model="eleven_multilingual_v2",
        language=Language.ES,
        stability=0.7,
        similarity_boost=0.8,
        speed=1.1,
    ),
)

Updating Settings at Runtime

Voice settings can be changed mid-conversation using TTSUpdateSettingsFrame:
from pipecat.frames.frames import TTSUpdateSettingsFrame
from pipecat.services.elevenlabs.tts import ElevenLabsTTSSettings

# `task` is assumed to be the running PipelineTask for this pipeline.
await task.queue_frame(
    TTSUpdateSettingsFrame(
        delta=ElevenLabsTTSSettings(
            stability=0.3,
            speed=1.1,
        )
    )
)
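
The delta argument suggests that only the fields you set are changed and everything else keeps its current value. The sketch below illustrates that merge behavior with plain dataclasses. It rests on that assumption and is not Pipecat's implementation: the VoiceSettings class, the NOT_GIVEN stand-in, and apply_delta are all hypothetical names.

```python
from dataclasses import dataclass, fields, replace
from typing import Any


class _NotGiven:
    """Sentinel meaning 'field was not provided' (stand-in for NOT_GIVEN)."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN: Any = _NotGiven()


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical stand-in for a settings object; not Pipecat's class."""

    stability: Any = NOT_GIVEN
    similarity_boost: Any = NOT_GIVEN
    speed: Any = NOT_GIVEN


def apply_delta(current: VoiceSettings, delta: VoiceSettings) -> VoiceSettings:
    """Return a copy of `current` with only the fields set in `delta` replaced."""
    changes = {
        f.name: getattr(delta, f.name)
        for f in fields(delta)
        if getattr(delta, f.name) is not NOT_GIVEN
    }
    return replace(current, **changes)
```

Under this model, applying a delta with stability=0.3 and speed=1.1 to settings that already carry similarity_boost=0.8 leaves similarity_boost unchanged.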

HTTP Service

import os

import aiohttp
from pipecat.services.elevenlabs import ElevenLabsHttpTTSService

async with aiohttp.ClientSession() as session:
    tts = ElevenLabsHttpTTSService(
        api_key=os.getenv("ELEVENLABS_API_KEY"),
        settings=ElevenLabsHttpTTSService.Settings(
            voice="21m00Tcm4TlvDq8ikWAM",
        ),
        aiohttp_session=session,
    )
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • Multilingual models required for language: Setting language with a non-multilingual model (e.g. eleven_turbo_v2_5) has no effect. Use eleven_multilingual_v2 or similar.
  • WebSocket vs HTTP: The WebSocket service supports word-level timestamps and interruption handling, making it significantly better for interactive conversations. The HTTP service is simpler but lacks these features.
  • Text aggregation: Sentence aggregation is enabled by default (text_aggregation_mode=TextAggregationMode.SENTENCE). Buffering until sentence boundaries produces more natural speech. Set text_aggregation_mode=TextAggregationMode.TOKEN to stream tokens directly for lower latency. When using TOKEN mode, you must also set auto_mode=False in settings.
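
To make the difference between the two aggregation modes concrete, here is a simplified illustration of sentence buffering versus token passthrough. It uses a naive punctuation-based boundary check and is only a sketch of the idea, not Pipecat's aggregator. The function names are hypothetical.

```python
import re
from typing import Iterable, Iterator

# Naive boundary: '.', '!' or '?' at the end of the buffered text.
_SENTENCE_END = re.compile(r"[.!?]\s*$")


def aggregate_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer tokens and yield text once a sentence boundary is reached."""
    buffer = ""
    for token in tokens:
        buffer += token
        if _SENTENCE_END.search(buffer):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        # Flush trailing text that never reached a terminator.
        yield buffer.strip()


def passthrough_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each token as soon as it arrives (token-style streaming)."""
    yield from tokens
```

With sentence buffering, synthesis receives whole sentences and can shape prosody across them. With passthrough, each fragment is sent as soon as it arrives, which trades some naturalness for lower latency.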

Event Handlers

ElevenLabs TTS supports the standard service connection events:
| Event | Description |
| --- | --- |
| on_connected | Connected to ElevenLabs WebSocket |
| on_disconnected | Disconnected from ElevenLabs WebSocket |
| on_connection_error | WebSocket connection error occurred |

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to ElevenLabs")