Overview

Rime AI provides two primary TTS service implementations: RimeTTSService (WebSocket-based), which supports word-level timing and interruptions, and RimeHttpTTSService (HTTP-based) for simpler use cases. A third service, RimeNonJsonTTSService, covers models such as Arcana that use plain-text WebSocket messages. RimeTTSService is recommended for real-time interactive applications.

Installation

To use Rime services, install the required dependencies:
pip install "pipecat-ai[rime]"

Prerequisites

Rime Account Setup

Before using Rime TTS services, you need:
  1. Rime Account: Sign up at Rime AI
  2. API Key: Generate an API key from your account dashboard
  3. Voice Selection: Choose from available voice models

Required Environment Variables

  • RIME_API_KEY: Your Rime API key for authentication
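
A minimal sketch of reading the key at startup and failing early when it is missing. The helper name `load_rime_api_key` is hypothetical; it is not part of Pipecat or Rime.

```python
import os


def load_rime_api_key(env=None) -> str:
    """Return RIME_API_KEY, raising a clear error when it is missing or empty.

    Hypothetical helper for illustration, not a Pipecat function.
    """
    # Allow a mapping to be injected so the helper is easy to test.
    env = os.environ if env is None else env
    key = env.get("RIME_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RIME_API_KEY is not set; export it before starting the bot.")
    return key
```

The returned value can then be passed as `api_key` to any of the services described below.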

Configuration

RimeTTSService

  • api_key (str, required): Rime API key for authentication.
  • voice_id (str, required): ID of the voice to use for synthesis.
  • url (str, default: "wss://users.rime.ai/ws2"): Rime WebSocket API endpoint.
  • model (str, default: "mistv2"): Model ID to use for synthesis.
  • sample_rate (int, default: None): Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • aggregate_sentences (bool, default: True): Buffer text until sentence boundaries before sending to Rime.
  • params (InputParams, default: None): Runtime-configurable voice and generation settings. See InputParams (WebSocket) below.

RimeHttpTTSService

  • api_key (str, required): Rime API key for authentication.
  • voice_id (str, required): ID of the voice to use for synthesis.
  • aiohttp_session (aiohttp.ClientSession, required): An aiohttp session for HTTP requests.
  • model (str, default: "mistv2"): Model ID to use for synthesis.
  • sample_rate (int, default: None): Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • params (InputParams, default: None): Runtime-configurable voice and generation settings. See InputParams (HTTP) below.

RimeNonJsonTTSService

A non-JSON WebSocket service for models like Arcana that use plain text messages.
  • api_key (str, required): Rime API key for authentication.
  • voice_id (str, required): ID of the voice to use for synthesis.
  • url (str, default: "wss://users.rime.ai/ws"): Rime WebSocket API endpoint.
  • model (str, default: "arcana"): Model ID to use for synthesis.
  • audio_format (str, default: "pcm"): Audio output format.
  • sample_rate (int, default: None): Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • aggregate_sentences (bool, default: True): Buffer text until sentence boundaries before sending.
  • params (InputParams, default: None): Runtime-configurable settings. See InputParams (Non-JSON) below.

InputParams (WebSocket)

Parameter                   Type      Default      Description
language                    Language  Language.EN  Language for synthesis.
speed_alpha                 float     1.0          Speech speed multiplier.
reduce_latency              bool      False        Whether to reduce latency at potential quality cost.
pause_between_brackets      bool      False        Whether to add pauses between bracketed content.
phonemize_between_brackets  bool      False        Whether to phonemize bracketed content.

InputParams (HTTP)

Parameter                   Type      Default      Description
language                    Language  Language.EN  Language for synthesis.
speed_alpha                 float     1.0          Speech speed multiplier.
reduce_latency              bool      False        Whether to reduce latency at potential quality cost.
pause_between_brackets      bool      False        Whether to add pauses between bracketed content.
phonemize_between_brackets  bool      False        Whether to phonemize bracketed content.
inline_speed_alpha          str       None         Inline speed control markup.

InputParams (Non-JSON)

Parameter           Type      Default  Description
language            Language  None     Language for synthesis.
segment             str       None     Text segmentation mode ("immediate", "bySentence", "never").
repetition_penalty  float     None     Token repetition penalty (1.0-2.0).
temperature         float     None     Sampling temperature (0.0-1.0).
top_p               float     None     Cumulative probability threshold (0.0-1.0).
extra               dict      None     Additional parameters to pass to the API.
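
The Non-JSON sampling parameters have documented ranges, so it can help to check values before building the params object. The following is a sketch only: `check_arcana_sampling` is a hypothetical helper, and whether the service itself validates these ranges is not stated here.

```python
from typing import Optional


def check_arcana_sampling(
    repetition_penalty: Optional[float] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
) -> dict:
    """Return the values that were set, after checking the documented ranges.

    Hypothetical helper for illustration, not part of Pipecat.
    """
    ranges = {
        "repetition_penalty": (repetition_penalty, 1.0, 2.0),
        "temperature": (temperature, 0.0, 1.0),
        "top_p": (top_p, 0.0, 1.0),
    }
    checked = {}
    for name, (value, low, high) in ranges.items():
        if value is None:
            continue  # Unset values are skipped, matching the None defaults.
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        checked[name] = value
    return checked
```

Assuming the field names match the table above, the result could be unpacked as `RimeNonJsonTTSService.InputParams(**check_arcana_sampling(temperature=0.7))`.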

Usage

Basic Setup (WebSocket)

import os

from pipecat.services.rime import RimeTTSService

# Minimal configuration: only the API key and a voice are required.
tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",
)

With Customization (WebSocket)

import os

from pipecat.services.rime import RimeTTSService
from pipecat.transcriptions.language import Language

tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",
    model="mistv2",
    params=RimeTTSService.InputParams(
        language=Language.ES,  # Synthesize Spanish
        speed_alpha=1.2,       # Speech speed multiplier (default 1.0)
        reduce_latency=True,   # Lower latency at potential quality cost
    ),
)

HTTP Service

import os

import aiohttp
from pipecat.services.rime import RimeHttpTTSService

# Create and use the service inside this block; the session closes when the block exits.
async with aiohttp.ClientSession() as session:
    tts = RimeHttpTTSService(
        api_key=os.getenv("RIME_API_KEY"),
        voice_id="cove",
        aiohttp_session=session,
    )

Non-JSON WebSocket (Arcana)

import os

from pipecat.services.rime import RimeNonJsonTTSService

tts = RimeNonJsonTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",
    model="arcana",
)

Customizing Speech

RimeTTSService provides a set of helper methods for implementing Rime-specific customizations, meant to be used as part of text transformers. These include methods for spelling out text, adjusting speech rate, and modifying pitch. See the Text Transformers for TTS section in the Text-to-Speech guide for usage examples.

SPELL(text: str) -> str:

Implements Rime’s spell function to spell out text character by character.
# Text transformers for TTS
# This will insert Rime's spell tags around the provided text.
async def spell_out_text(text: str, type: str) -> str:
    return RimeTTSService.SPELL(text)

tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",  # voice_id is required
    text_transforms=[
        ("phone_number", spell_out_text),
    ],
)

PAUSE_TAG(seconds: float) -> str:

Implements Rime’s custom pause functionality to generate a properly formatted pause tag you can insert into the text.
# Text transformers for TTS
# This will insert a one second pause after questions.
async def pause_after_questions(text: str, type: str) -> str:
    if text.endswith("?"):
        return f"{text}{RimeTTSService.PAUSE_TAG(1.0)}"
    return text

tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",  # voice_id is required
    text_transforms=[
        ("sentence", pause_after_questions),  # Only apply to sentence aggregations
    ],
)

PRONOUNCE(self, text: str, word: str, phoneme: str) -> str:

Convenience method to support Rime’s custom pronunciations feature. It takes a word and its desired phoneme representation, returning the text with the provided word replaced by the appropriate phoneme tag.
# Text transformers for TTS
# This will insert a phoneme in place of the word "potato" to define how it
# should be pronounced. `using_fancy_voice` is an application-defined flag.
async def maybe_say_potato_all_fancylike(text: str, type: str) -> str:
    if using_fancy_voice:
        return RimeTTSService.PRONOUNCE(text, "potato", "potato")
    else:
        return RimeTTSService.PRONOUNCE(text, "potato", "poteto")

tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",  # voice_id is required
    text_transforms=[
        ("*", maybe_say_potato_all_fancylike),  # Apply to all text
    ],
)

INLINE_SPEED(self, text: str, speed: float) -> str:

A convenience method to support Rime’s inline speed adjustment feature. It will wrap the provided text in the [] tags and add the provided speed to the inlineSpeedAlpha field in the request metadata.
# Text transformers for TTS
# This will make the word "slow" always be spoken more slowly.
async def slow_down_slow_words(text: str, type: str) -> str:
    return text.replace(
        "slow",
        RimeTTSService.INLINE_SPEED("slow", speed=0.5)
    )

tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    voice_id="cove",  # voice_id is required
    text_transforms=[
        ("*", slow_down_slow_words),  # Apply to all text
    ],
)
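
To make the mechanism described for INLINE_SPEED concrete, here is a standalone sketch of that behavior: the text is wrapped in brackets and the speed is tracked separately for the request metadata. This is not Pipecat's implementation. The class `InlineSpeedSketch` and the comma-separated encoding of `inlineSpeedAlpha` are assumptions made for illustration.

```python
class InlineSpeedSketch:
    """Illustrative stand-in: brackets mark the text, speeds are tracked separately."""

    def __init__(self) -> None:
        self.speeds = []

    def inline_speed(self, text: str, speed: float) -> str:
        # Record the speed in the order the bracketed spans appear.
        self.speeds.append(speed)
        return f"[{text}]"

    def request_metadata(self) -> dict:
        # Assumed encoding: one value per bracketed span, comma-separated.
        return {"inlineSpeedAlpha": ",".join(str(s) for s in self.speeds)}


sketch = InlineSpeedSketch()
text = "Please go " + sketch.inline_speed("slow", 0.5) + " here."
```

The point of the sketch is that the speed does not travel inside the text itself: the brackets only mark which span a speed applies to.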

Notes

  • Word-level timestamps: RimeTTSService provides word-level timing information, enabling synchronized text highlighting.
  • WebSocket vs HTTP: The WebSocket service supports word-level timestamps, interruption handling, and maintains context across messages within a turn. The HTTP service is simpler but lacks these features.
  • Non-JSON WebSocket: RimeNonJsonTTSService is for models like Arcana that use plain text messages instead of JSON. It does not support word-level timestamps.
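
The trade-offs in these notes can be summarized as a small decision helper. This is a sketch under stated assumptions: `pick_rime_service` is hypothetical, it returns class names as strings rather than importing Pipecat, and it treats "arcana" as the only plain-text model, which the notes do not guarantee.

```python
def pick_rime_service(model: str, needs_word_timestamps: bool) -> str:
    """Map the notes above to a service class name (hypothetical helper)."""
    if model == "arcana":
        # Assumption: Arcana is served through the plain-text WebSocket service.
        if needs_word_timestamps:
            raise ValueError(
                "RimeNonJsonTTSService does not support word-level timestamps"
            )
        return "RimeNonJsonTTSService"
    if needs_word_timestamps:
        # Only the JSON WebSocket service provides word-level timing.
        return "RimeTTSService"
    # Simplest option when timing information is not needed.
    return "RimeHttpTTSService"
```

For real-time interactive applications the WebSocket service remains the recommended choice even when timestamps are not strictly required, because it also handles interruptions.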

Event Handlers

Rime WebSocket TTS services support the standard service connection events:
Event                Description
on_connected         Connected to Rime WebSocket
on_disconnected      Disconnected from Rime WebSocket
on_connection_error  WebSocket connection error occurred

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Rime")
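
The same decorator pattern can cover all three events. Below is a minimal sketch that registers log-only handlers on any service exposing `event_handler`. The function name `register_connection_logging` is hypothetical, and the second `error` argument on `on_connection_error` is an assumption about the handler signature.

```python
import logging

logger = logging.getLogger("rime-tts")


def register_connection_logging(tts) -> None:
    """Attach log-only handlers for the three connection events listed above."""

    @tts.event_handler("on_connected")
    async def on_connected(service):
        logger.info("Connected to Rime")

    @tts.event_handler("on_disconnected")
    async def on_disconnected(service):
        logger.warning("Disconnected from Rime")

    @tts.event_handler("on_connection_error")
    async def on_connection_error(service, error):
        # Assumption: the error details arrive as a second argument.
        logger.error("Rime connection error: %s", error)
```

Call `register_connection_logging(tts)` once after constructing the service and before the pipeline starts.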