Overview
Inworld provides high-quality, low-latency speech synthesis via two service implementations: InworldTTSService, for real-time, minimal-latency use cases over WebSockets, and InworldHttpTTSService, for streaming and non-streaming use cases over HTTP. Inworld TTS features support for 12+ languages, timestamps, custom pronunciation, and instant voice cloning.
Inworld TTS API Reference
Pipecat’s API methods for Inworld TTS integration
Example Implementation (Websockets)
Complete example with Inworld TTS
Inworld Documentation
Official Inworld TTS API documentation
Inworld Portal
Create and manage voice models
Installation
To use Inworld services, no additional dependencies are required beyond the base installation.
Prerequisites
Inworld Account Setup
Before using Inworld TTS services, you need:
- Inworld Account: Sign up at Inworld Studio
- API Key: Generate an API key from your account dashboard
- Voice Selection: Choose from available voice models
Required Environment Variables
INWORLD_API_KEY: Your Inworld API key for authentication
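A minimal sketch of reading the key at startup, so a missing variable fails early with a clear message instead of surfacing later as an authentication error. The helper name load_inworld_api_key is hypothetical and not part of Pipecat or Inworld; only the INWORLD_API_KEY variable name comes from this page.

```python
import os


def load_inworld_api_key(env=None):
    """Return the Inworld API key from INWORLD_API_KEY, or raise if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    This helper is illustrative only and is not provided by Pipecat.
    """
    env = os.environ if env is None else env
    key = env.get("INWORLD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INWORLD_API_KEY is not set; generate a key in your Inworld account.")
    return key
```

The returned string can then be passed as the API key argument of either service.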
Configuration
InworldTTSService
WebSocket-based service for lowest-latency streaming.
Inworld API key.
ID of the voice to use for synthesis.
ID of the model to use for synthesis.
URL of the Inworld WebSocket API.
Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
Audio encoding format.
Whether to aggregate sentences before synthesis.
Whether to append a trailing space to text before sending to TTS.
Runtime-configurable synthesis settings. See InworldTTSService InputParams below.
InworldTTSService InputParams
| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | None | Temperature for speech synthesis. |
| speaking_rate | float | None | Speaking rate for speech synthesis. |
| apply_text_normalization | str | None | Whether to apply text normalization. |
| max_buffer_delay_ms | int | None | Maximum buffer delay in milliseconds. Defaults to 3000 if not set. |
| buffer_char_threshold | int | None | Buffer character threshold. Defaults to 250 if not set. |
| auto_mode | bool | True | Server-controlled flushing for optimal latency and quality. Recommended when text is sent in full sentences/phrases. |
| timestamp_transport_strategy | Literal["ASYNC", "SYNC"] | None | Strategy for timestamp transport. |
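The table distinguishes between a parameter's default (None) and the value that applies when it is left unset (3000 ms and 250 characters for the two buffer settings). The sketch below mirrors the table in a plain dataclass to make that fallback explicit. InputParamsSketch and effective_buffer_settings are hypothetical names used only for illustration; this is not the Pipecat class, and where the fallback is actually applied (client or server) is not stated on this page.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class InputParamsSketch:
    """Illustrative mirror of the documented parameters; not the real InputParams."""

    temperature: Optional[float] = None
    speaking_rate: Optional[float] = None
    apply_text_normalization: Optional[str] = None
    max_buffer_delay_ms: Optional[int] = None
    buffer_char_threshold: Optional[int] = None
    auto_mode: bool = True
    timestamp_transport_strategy: Optional[Literal["ASYNC", "SYNC"]] = None


def effective_buffer_settings(params: InputParamsSketch) -> dict:
    """Apply the documented fallbacks for buffer settings left as None."""
    delay = params.max_buffer_delay_ms
    threshold = params.buffer_char_threshold
    return {
        "max_buffer_delay_ms": 3000 if delay is None else delay,
        "buffer_char_threshold": 250 if threshold is None else threshold,
    }
```

For example, effective_buffer_settings(InputParamsSketch(max_buffer_delay_ms=500)) keeps the explicit 500 ms delay and falls back to 250 for the character threshold.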
InworldHttpTTSService
HTTP-based service supporting both streaming and non-streaming modes.
Inworld API key.
aiohttp ClientSession for HTTP requests.
ID of the voice to use for synthesis.
ID of the model to use for synthesis.
Whether to use streaming mode.
Audio sample rate in Hz.
Audio encoding format.
Runtime-configurable synthesis settings. See InworldHttpTTSService InputParams below.
InworldHttpTTSService InputParams
| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | None | Temperature for speech synthesis. |
| speaking_rate | float | None | Speaking rate for speech synthesis. |
| timestamp_transport_strategy | Literal["ASYNC", "SYNC"] | None | Strategy for timestamp transport. |
Usage
Basic Setup (WebSocket)
With Custom Settings
HTTP Service
Notes
- WebSocket vs HTTP: The WebSocket service (InworldTTSService) provides the lowest latency with bidirectional streaming and supports multiple independent audio contexts per connection (max 5). The HTTP service supports both streaming and non-streaming modes via the streaming parameter.
- Word timestamps: Both services provide word-level timestamps for synchronized text display. Timestamps are tracked cumulatively across utterances within a turn.
- Auto mode: When auto_mode=True (default), the server controls flushing of buffered text for optimal latency and quality. This is recommended when text is sent in full sentences or phrases (i.e., when aggregate_sentences=True).
- Keepalive: The WebSocket service sends periodic keepalive messages every 60 seconds to maintain the connection.
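The note on word timestamps can be illustrated with a small sketch of cumulative tracking. It assumes each utterance reports word start times relative to its own start and that the running offset advances by the utterance's audio duration; both are assumptions made for illustration, and the class CumulativeWordTimes is hypothetical rather than part of Pipecat.

```python
class CumulativeWordTimes:
    """Shift per-utterance word times onto a single timeline for one turn."""

    def __init__(self):
        self._offset = 0.0  # seconds of audio already accounted for in this turn

    def add_utterance(self, words, duration):
        """Return (word, start) pairs shifted by the running offset.

        words: list of (word, start_seconds) relative to the utterance start.
        duration: length of the utterance audio in seconds (assumed to be known).
        """
        shifted = [(word, start + self._offset) for word, start in words]
        self._offset += duration
        return shifted

    def reset(self):
        """Start a new turn: timestamps begin again from zero."""
        self._offset = 0.0
```

With this scheme, a word that starts 0.25 s into the second utterance of a turn is reported at 1.25 s when the first utterance lasted 1.0 s, which keeps on-screen text aligned with continuous audio playback.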
Event Handlers
Inworld TTS supports the standard service connection events:
| Event | Description |
|---|---|
| on_connected | Connected to Inworld WebSocket |
| on_disconnected | Disconnected from Inworld WebSocket |
| on_connection_error | WebSocket connection error occurred |
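A rough sketch of how handlers for these events might be registered. It assumes the service exposes an event_handler decorator and passes itself as the first handler argument, as is the usual pattern for Pipecat services; the exact handler arguments are an assumption. StubService is a hypothetical stand-in so the snippet runs without a network connection or API key.

```python
import asyncio


class StubService:
    """Hypothetical stand-in for a TTS service; not the Pipecat class."""

    def __init__(self):
        self._handlers = {}

    def event_handler(self, name):
        """Decorator that registers an async handler for the named event."""
        def decorator(func):
            self._handlers.setdefault(name, []).append(func)
            return func
        return decorator

    async def emit(self, name, *args):
        """Call every handler registered for `name`, passing the service first."""
        for handler in self._handlers.get(name, []):
            await handler(self, *args)


tts = StubService()
log = []


@tts.event_handler("on_connected")
async def on_connected(service):
    log.append("connected")


@tts.event_handler("on_connection_error")
async def on_connection_error(service, error):
    log.append(f"error: {error}")


async def main():
    # Simulate the events the real service would fire on its own.
    await tts.emit("on_connected")
    await tts.emit("on_connection_error", "timeout")


asyncio.run(main())
```

In a real pipeline the service fires these events itself; the emit calls here exist only to exercise the handlers.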