Overview
VoiceAiTTSService converts text to speech using Voice.ai’s
Multi-Context WebSocket API. It maintains a persistent WebSocket connection that
streams raw PCM audio (32kHz mono), with support for multiple languages, custom
voice selection, and temperature/top_p generation controls.
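Because the stream is headerless PCM, the consumer has to supply the format details itself. The following is a minimal sketch, assuming 16-bit signed samples; the sample width is not stated on this page, so check it against the Voice.ai API reference. `pcm_duration_seconds` and `pcm_to_wav_bytes` are hypothetical helpers and are not part of the integration.

```python
import io
import wave

SAMPLE_RATE_HZ = 32000   # Voice.ai's native rate, per this page
CHANNELS = 1             # mono, per this page
SAMPLE_WIDTH_BYTES = 2   # ASSUMPTION: 16-bit PCM; not confirmed by this page


def pcm_duration_seconds(pcm: bytes) -> float:
    """Return the playback duration of a raw PCM buffer."""
    bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES
    return len(pcm) / bytes_per_second


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container so ordinary players can open it."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Under these assumptions, one second of audio occupies 64,000 bytes, which is handy when sizing buffers or estimating playback time from the number of bytes received.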
Source Repository
Source code, examples, and issues for the Voice.ai integration
Voice.ai Website
Sign up and get a Voice.ai API key
API Documentation
Voice.ai Multi-Context WebSocket TTS API reference
Installation
This is a community-maintained package distributed separately from pipecat-ai.
It is not published to PyPI; install it from source:
Prerequisites
Voice.ai Account Setup
Before using the Voice.ai TTS service, you need:
- Voice.ai Account: Sign up at Voice.ai
- API Key: Obtain an API key (format: vk_*) from Voice.ai
Required Environment Variables
VOICEAI_API_KEY: Your Voice.ai API key for authentication
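One way to load the key at startup is to read the variable and fail fast if it is missing or malformed. This is a sketch only: `load_voiceai_api_key` is a hypothetical helper, and the prefix check relies solely on the `vk_*` format noted above.

```python
import os


def load_voiceai_api_key(env=os.environ) -> str:
    """Read VOICEAI_API_KEY and sanity-check the documented vk_ prefix."""
    api_key = env.get("VOICEAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("VOICEAI_API_KEY is not set")
    if not api_key.startswith("vk_"):
        raise RuntimeError("VOICEAI_API_KEY does not match the expected vk_* format")
    return api_key
```

Accepting the environment as a parameter keeps the helper easy to test with a plain dictionary.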
Configuration
Voice.ai API key for authentication (format: vk_*).
Voice identifier for synthesis. If not provided, uses the default built-in voice.
WebSocket URL for the Voice.ai multi-context TTS API.
Output audio sample rate. Defaults to 32000 Hz (Voice.ai’s native rate) when
not set.
Voice synthesis settings. See Input Parameters below.
Whether to aggregate text by sentences before TTS. When True, each sentence
is sent separately for lower latency; when False, larger text chunks are
batched for more natural flow at the cost of higher latency.
Input Parameters
Synthesis settings passed via the params constructor argument using
VoiceAiTTSService.InputParams(...).
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | Language | Language.EN | Target language. Supports en, ca, sv, es, fr, de, it, pt, pl, ru, nl. |
| model | str | None | TTS model. Auto-selected based on language when not provided. |
| audio_format | str | "pcm" | Audio output format (raw PCM). |
| temperature | float | 1.0 | Generation temperature (0.0–2.0). Higher values are more random. |
| top_p | float | 0.8 | Top-p sampling (0.0–1.0). Controls output diversity. |
Available parameters and defaults are defined by the integration and the
Voice.ai API. See the source repository for the authoritative, up-to-date
list.
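Checking the sampling settings against the documented ranges before constructing the service can surface mistakes early. The sketch below is not the integration's own validation: `check_sampling_settings` is a hypothetical helper, and the bounds are taken only from the table above.

```python
def check_sampling_settings(temperature: float = 1.0, top_p: float = 0.8) -> dict:
    """Validate temperature and top_p against the ranges listed in the table."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be within 0.0-2.0, got {temperature}")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError(f"top_p must be within 0.0-1.0, got {top_p}")
    return {"temperature": temperature, "top_p": top_p}


# Defaults mirror the table; a lower temperature gives less random output.
settings = check_sampling_settings(temperature=0.7, top_p=0.8)

# The validated values could then be unpacked into the settings object, e.g.
# VoiceAiTTSService.InputParams(**settings), assuming the field names match
# the table; confirm against the source repository before relying on this.
```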