Overview

Rime AI’s text-to-speech capabilities are available through two service implementations:

  • RimeTTSService: WebSocket-based implementation with word-level timing and interruption support
  • RimeHttpTTSService: HTTP-based implementation for simpler use cases

You can obtain a Rime API key by signing up at Rime.
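
Once you have a key, a common pattern is to read it from the environment instead of hard-coding it. The sketch below is one way to do that; the variable name RIME_API_KEY and the helper load_rime_api_key are choices made for this example, not something Rime or Pipecat requires.

```python
import os


def load_rime_api_key(env_var: str = "RIME_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is missing.

    The environment variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your Rime API key before starting the bot.")
    return key
```

The returned string can then be passed as api_key to either service constructor.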

RimeTTSService (WebSocket Service)

Uses Rime’s WebSocket JSON API for real-time speech synthesis with word-level timing information.

Constructor Parameters

api_key
str
required

Rime API key

voice_id
str
required

Rime voice identifier

url
str
default:
"wss://users-ws.rime.ai/ws2"

Rime WebSocket API endpoint

model
str
default:
"mistv2"

Model ID to use for synthesis

sample_rate
int
default:
None

Output audio sample rate in Hz

params
InputParams
default:
InputParams()

Speech generation parameters

Features

  • Word-level timing information
  • Support for interruptions
  • Context tracking across multiple messages
  • Real-time audio streaming
  • Proper sentence aggregation
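
Word-level timing and interruption support work together: when playback is cut off, the timestamps show which words the listener had already heard. The sketch below illustrates that idea without Pipecat. The (word, start time) tuple shape and the spoken_before helper are assumptions for illustration and do not mirror the service's internal data structures.

```python
from typing import List, Tuple

# (word, start time in seconds); a hypothetical shape chosen for this example
WordTiming = Tuple[str, float]


def spoken_before(timings: List[WordTiming], interrupted_at: float) -> str:
    """Return the text made of words that started before the interruption."""
    heard = [word for word, start in timings if start < interrupted_at]
    return " ".join(heard)


timings = [("Hello", 0.0), ("there,", 0.35), ("how", 0.80), ("are", 1.00), ("you?", 1.15)]
print(spoken_before(timings, interrupted_at=0.9))  # Hello there, how
```

Keeping only the words that were actually spoken is what lets a conversation context stay accurate after the user interrupts.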

RimeHttpTTSService (HTTP Service)

Constructor Parameters

api_key
str
required

Rime API key

voice_id
str
default:
"eva"

Rime voice identifier. See Rime’s documentation for supported voices.

model
str
default:
"mist"

Choose mist for hyper-realistic conversational voices or v1 for Rime’s first-gen model.

sample_rate
int
default:
None

Output audio sample rate in Hz

params
InputParams
default:
InputParams()

Speech generation parameters

Output Frames

Both services generate the following frames:

Control Frames

TTSStartedFrame
Frame

Signals start of speech synthesis

TTSStoppedFrame
Frame

Signals completion of speech synthesis

Audio Frames

TTSAudioRawFrame
Frame

Contains generated audio data:

  • PCM audio format
  • Specified sample rate
  • Single channel (mono)
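
Because the audio is raw mono PCM at a known sample rate, its playback duration follows directly from the byte count. The sketch below assumes 16-bit samples (2 bytes each). This page does not state the sample width, so treat that default as an assumption and adjust it if your configuration differs.

```python
def pcm_duration_seconds(
    audio: bytes,
    sample_rate: int,
    channels: int = 1,
    bytes_per_sample: int = 2,  # assumption: 16-bit PCM
) -> float:
    """Return the playback duration of a raw PCM buffer in seconds."""
    frame_size = channels * bytes_per_sample
    if sample_rate <= 0 or frame_size <= 0:
        raise ValueError("sample_rate, channels and bytes_per_sample must be positive")
    return len(audio) / (sample_rate * frame_size)


one_second = bytes(24000 * 2)  # 24,000 silent 16-bit mono samples
print(pcm_duration_seconds(one_second, sample_rate=24000))  # 1.0
```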

Text Frames (WebSocket only)

TTSTextFrame
Frame

Contains word-level text with timing information

Error Frames

ErrorFrame
Frame

Contains Rime TTS error information
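
To show how the frame types above relate to one another, the sketch below walks a sequence of frames and summarizes a single synthesis. The dataclasses only borrow the frame names from this page. They are stand-ins, not Pipecat's classes, and their fields (audio, text, error) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


# Stand-ins that reuse the frame names from this page; not Pipecat's classes.
@dataclass
class TTSStartedFrame:
    pass


@dataclass
class TTSStoppedFrame:
    pass


@dataclass
class TTSAudioRawFrame:
    audio: bytes  # assumed field name


@dataclass
class TTSTextFrame:
    text: str  # assumed field name


@dataclass
class ErrorFrame:
    error: str  # assumed field name


def summarize(frames: Iterable[object]) -> dict:
    """Collect audio size, words, and errors for one start/stop cycle."""
    summary = {"audio_bytes": 0, "words": [], "errors": [], "complete": False}
    active = False
    for frame in frames:
        if isinstance(frame, TTSStartedFrame):
            active = True
        elif isinstance(frame, TTSStoppedFrame):
            active = False
            summary["complete"] = True
        elif isinstance(frame, ErrorFrame):
            summary["errors"].append(frame.error)
        elif active and isinstance(frame, TTSAudioRawFrame):
            summary["audio_bytes"] += len(frame.audio)
        elif active and isinstance(frame, TTSTextFrame):
            summary["words"].append(frame.text)
    return summary
```

Audio and text frames are counted only between the start and stop frames, which mirrors how the control frames bracket a synthesis.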

Usage Example

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.rime import RimeHttpTTSService, RimeTTSService
from pipecat.transcriptions.language import Language

# WebSocket Service
ws_tts = RimeTTSService(
    api_key="your-rime-api-key",
    voice_id="cove",
    model="mistv2",
    params=RimeTTSService.InputParams(
        language=Language.EN,
        speed_alpha=1.0,
    ),
)

# HTTP Service
http_tts = RimeHttpTTSService(
    api_key="your-rime-api-key",
    voice_id="eva",
    model="mist",
    params=RimeHttpTTSService.InputParams(
        speed_alpha=1.2,
        reduce_latency=True,
    ),
)

# Use in pipeline. `llm` and `transport` are created elsewhere in your bot;
# the leading `...` stands for the upstream processors.
pipeline = Pipeline([
    ...,
    llm,
    ws_tts,  # or http_tts
    transport.output(),
])

Metrics Support

Both services collect processing metrics:

  • Time to First Byte (TTFB)
  • Character usage statistics
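
To make the two metrics concrete, the sketch below measures them for a generic stream of audio chunks. It is not how Pipecat collects metrics internally. The measure_tts helper and the fake_synthesizer generator are hypothetical and only illustrate what "time to first byte" and "character usage" mean.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_tts(text: str, chunks: Iterable[bytes]) -> Tuple[float, int, int]:
    """Return (ttfb_seconds, characters, audio_bytes) for one synthesis request."""
    started = time.perf_counter()
    ttfb = None
    total_bytes = 0
    for chunk in chunks:
        if ttfb is None and chunk:
            # The first non-empty chunk marks the time to first byte.
            ttfb = time.perf_counter() - started
        total_bytes += len(chunk)
    if ttfb is None:
        raise ValueError("no audio was produced")
    return ttfb, len(text), total_bytes


def fake_synthesizer() -> Iterator[bytes]:
    """Hypothetical stand-in for a TTS stream: a short delay, then two chunks."""
    time.sleep(0.05)
    yield b"\x00" * 320
    yield b"\x00" * 320


ttfb, characters, audio_bytes = measure_tts("Hello", fake_synthesizer())
print(f"TTFB: {ttfb:.3f}s, characters: {characters}, audio bytes: {audio_bytes}")
```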

Service Comparison

Feature               WebSocket  HTTP
Word timing           ✓          -
Interruption support  ✓          -
Bracket-based pauses  -          ✓
Phoneme control       -          ✓
Inline speed control  -          ✓
Streaming audio       ✓          ✓
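
The choice between the two services can be reduced to a small rule of thumb. The sketch below encodes only what the overview states explicitly: word-level timing and interruption support come from the WebSocket service, and the HTTP service covers simpler use cases. The choose_rime_service helper is hypothetical and returns the class name as a string so the example runs without Pipecat installed.

```python
def choose_rime_service(need_word_timing: bool = False,
                        need_interruptions: bool = False) -> str:
    """Return the name of the service class that fits the stated requirements."""
    if need_word_timing or need_interruptions:
        return "RimeTTSService"  # WebSocket-based
    return "RimeHttpTTSService"  # HTTP-based, simpler use cases


print(choose_rime_service(need_interruptions=True))  # RimeTTSService
print(choose_rime_service())  # RimeHttpTTSService
```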