Cartesia

Overview

Cartesia provides two TTS service implementations:

CartesiaTTSService: WebSocket-based with streaming and word timestamps
CartesiaHttpTTSService: HTTP-based for simpler synthesis

CartesiaTTSService is recommended for real-time applications.

API Reference

Complete API documentation and method details

Cartesia Docs

Official Cartesia documentation and features

Example Code

Working example with interruption handling

Installation

To use Cartesia services, install the required dependencies:

pip install "pipecat-ai[cartesia]"

You’ll also need to set up your Cartesia API key as an environment variable: CARTESIA_API_KEY.

Get your API key by signing up at Cartesia.

Frames

Input

TextFrame - Text content to synthesize into speech
TTSSpeakFrame - Text that the TTS service should speak
TTSUpdateSettingsFrame - Runtime configuration updates (e.g., voice)
LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries

Output

TTSStartedFrame - Signals start of synthesis
TTSAudioRawFrame - Generated audio data chunks
TTSStoppedFrame - Signals completion of synthesis
ErrorFrame - Connection or processing errors

Service Comparison

Feature	CartesiaTTSService (WebSocket)	CartesiaHttpTTSService (HTTP)
Streaming	✅ Real-time chunks	❌ Single audio block
Word Timestamps	✅ Precise timing	❌ Not available
Interruption	✅ Advanced handling	⚠️ Basic support
Latency	🚀 Low	📈 Higher
Best For	Interactive apps	Batch processing

Language Support

Supports multiple languages through the Language enum:

Language Code	Description	Service Code
`Language.DE`	German	`de`
`Language.EN`	English	`en`
`Language.ES`	Spanish	`es`
`Language.FR`	French	`fr`
`Language.HI`	Hindi	`hi`
`Language.IT`	Italian	`it`
`Language.JA`	Japanese	`ja`
`Language.KO`	Korean	`ko`
`Language.NL`	Dutch	`nl`
`Language.PL`	Polish	`pl`
`Language.PT`	Portuguese	`pt`
`Language.RU`	Russian	`ru`
`Language.SV`	Swedish	`sv`
`Language.TR`	Turkish	`tr`
`Language.ZH`	Chinese (Mandarin)	`zh`

Usage Example

WebSocket Service (Recommended)

Initialize the WebSocket service with your API key and desired voice:

from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transcriptions.language import Language
import os

# Configure WebSocket service
tts = CartesiaTTSService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id="your-voice-id",
    model="sonic-2",
    params=CartesiaTTSService.InputParams(
        language=Language.EN,
        speed="normal"
    )
)

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    llm,
    tts,  # Word timestamps enable precise context updates
    transport.output()
])

HTTP Service

Initialize the HTTP service and use it in a pipeline:

# For simpler, non-streaming use cases
http_tts = CartesiaHttpTTSService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id="your-voice-id",
    model="sonic-2",
    params=CartesiaHttpTTSService.InputParams(
        language=Language.EN
    )
)

Dynamic Configuration

Make settings updates by pushing a TTSUpdateSettingsFrame for the CartesiaTTSService:

from pipecat.frames.frames import TTSUpdateSettingsFrame

await task.queue_frame(
    TTSUpdateSettingsFrame(settings={"voice": "your-new-voice-id"})
)

Metrics

Both services provide:

Time to First Byte (TTFB) - Latency from text input to first audio
Processing Duration - Total synthesis time
Usage Metrics - Character count and synthesis statistics

Learn how to enable Metrics in your Pipeline.

Additional Notes

WebSocket Recommended: Use CartesiaTTSService for low-latency streaming and accurate context updates with word timestamps
Connection Management: WebSocket lifecycle is handled automatically with reconnection support
Sample Rate: Set globally in PipelineParams rather than per-service for consistency

API Reference

Services

Utilities

Frameworks

Pipeline

Overview

API Reference

Cartesia Docs

Example Code

Installation

Frames

Input

Output

Service Comparison

Language Support

Usage Example

WebSocket Service (Recommended)

HTTP Service

Dynamic Configuration

Metrics

Additional Notes

API Reference

Services

Utilities

Frameworks

Pipeline

​Overview

API Reference

Cartesia Docs

Example Code

​Installation

​Frames

​Input

​Output

​Service Comparison

​Language Support

​Usage Example

​WebSocket Service (Recommended)

​HTTP Service

​Dynamic Configuration

​Metrics

​Additional Notes

Overview

Installation

Frames

Input

Output

Service Comparison

Language Support

Usage Example

WebSocket Service (Recommended)

HTTP Service

Dynamic Configuration

Metrics

Additional Notes