Overview

RimeHttpTTSService provides text-to-speech capabilities using Rime AI’s TTS service. It supports streaming audio output and various speech customization options.

You can obtain a Rime API key by signing up at Rime.

Configuration

Constructor Parameters

api_key
str
required

Rime API key

voice_id
str
default: "eva"

Rime voice identifier. See Rime’s documentation for supported voices.

model
str
default: "mist"

Choose mist for hyper-realistic conversational voices or v1 for Rime’s first-gen model.

sample_rate
int
default: 24000

Output audio sample rate in Hz. If provided, the value must be between 4000 and 44100.
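The range constraint on sample_rate can be checked before the service is constructed. The following is a minimal sketch, not part of the service itself: `resolve_sample_rate` and its constants are hypothetical names, and treating both bounds as inclusive is an assumption.

```python
from typing import Optional

# Bounds and default taken from the parameter description above.
# Treating the bounds as inclusive is an assumption.
MIN_SAMPLE_RATE = 4000
MAX_SAMPLE_RATE = 44100
DEFAULT_SAMPLE_RATE = 24000


def resolve_sample_rate(sample_rate: Optional[int] = None) -> int:
    """Hypothetical helper: return a sample rate within the documented range."""
    if sample_rate is None:
        # No value provided, so fall back to the documented default.
        return DEFAULT_SAMPLE_RATE
    if not MIN_SAMPLE_RATE <= sample_rate <= MAX_SAMPLE_RATE:
        raise ValueError(
            f"sample_rate must be between {MIN_SAMPLE_RATE} and "
            f"{MAX_SAMPLE_RATE}, got {sample_rate}"
        )
    return sample_rate
```

Validating up front like this surfaces a bad value as a clear local error rather than as a failed request later on.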

params
InputParams
default: "InputParams()"

Speech generation parameters

Output Frames

Control Frames

TTSStartedFrame
Frame

Signals start of speech synthesis

TTSStoppedFrame
Frame

Signals completion of speech synthesis

Audio Frames

TTSAudioRawFrame
Frame

Contains generated audio data with:

  • PCM audio format
  • Specified sample rate
  • Single channel (mono)
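Given these properties, the playback length of an audio chunk follows from its byte count. The sketch below is illustrative only: `pcm_duration_seconds` is a hypothetical helper, and the 16-bit (2 bytes per sample) width is an assumption, since the text above states only "PCM".

```python
def pcm_duration_seconds(
    audio: bytes,
    sample_rate: int = 24000,
    num_channels: int = 1,
    bytes_per_sample: int = 2,  # Assumption: 16-bit PCM samples
) -> float:
    """Hypothetical helper: duration in seconds of a raw PCM byte buffer."""
    # One sample frame holds one sample for every channel.
    frame_size = num_channels * bytes_per_sample
    return len(audio) / (frame_size * sample_rate)
```

Under these assumptions, 48000 bytes of mono audio at 24000 Hz corresponds to one second of speech.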

Error Frames

ErrorFrame
Frame

Contains Rime TTS error information

Usage Example

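from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.rime import RimeHttpTTSService

# Configure the service
tts_service = RimeHttpTTSService(
    api_key="your-rime-api-key",
    voice_id="eva",
    model="mist",
    params=RimeHttpTTSService.InputParams(
        speed_alpha=1.2,      # Speech speed setting
        reduce_latency=True,  # Latency optimization
    ),
)

# Use in a pipeline. text_input and audio_output are placeholders for
# processors defined elsewhere in your application.
pipeline = Pipeline([
    text_input,         # Produces text
    tts_service,        # Converts text to speech
    audio_output,       # Plays audio
])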

Frame Flow

Metrics Support

The service collects processing metrics:

  • Time to First Byte (TTFB)
  • Character usage statistics
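To make the two metrics concrete, the sketch below measures them around a streamed response. It does not use the service's own metrics machinery: `fake_tts_stream` is a stand-in generator, not Rime's API, and `measure` is a hypothetical helper.

```python
import asyncio
import time
from typing import AsyncIterator, Optional, Tuple


async def fake_tts_stream(text: str) -> AsyncIterator[bytes]:
    """Stand-in for a streaming TTS response (hypothetical, not Rime's API)."""
    for _ in range(3):
        await asyncio.sleep(0.01)  # Simulated network and synthesis delay
        yield b"\x00" * 480        # One chunk of silent audio bytes


async def measure(text: str) -> Tuple[Optional[float], int, int]:
    """Return (ttfb_seconds, characters_sent, audio_bytes_received)."""
    start = time.perf_counter()
    ttfb: Optional[float] = None
    total_bytes = 0
    async for chunk in fake_tts_stream(text):
        if ttfb is None:
            # TTFB: time elapsed until the first audio chunk arrives.
            ttfb = time.perf_counter() - start
        total_bytes += len(chunk)
    # Character usage: the number of characters submitted for synthesis.
    return ttfb, len(text), total_bytes
```

Calling `asyncio.run(measure("Hello"))` returns the elapsed time to the first chunk, a character count of 5, and the total number of audio bytes received.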

Notes

  • Supports streaming audio output
  • Configurable speech speed
  • Latency optimization options
  • Bracket-based text processing
  • Thread-safe processing
  • Automatic error handling
  • Chunked audio delivery