# Inworld

> Text-to-speech service using Inworld AI's Realtime TTS-2 (and TTS-1.5) models

## Overview

Inworld provides high-quality, low-latency speech synthesis via two implementation types: `InworldTTSService` for real-time, minimal-latency use cases over WebSockets, and `InworldHttpTTSService` for streaming and non-streaming use cases over HTTP. Inworld supports 12+ languages, timestamps, custom pronunciation, and instant voice cloning.

## Models

The default model is **Realtime TTS-2** (`inworld-tts-2`). Realtime TTS-1.5-Max (`inworld-tts-1.5-max`) and Realtime TTS-1.5-Mini (`inworld-tts-1.5-mini`) remain available.

| Display name               | Model ID               |
| -------------------------- | ---------------------- |
| Realtime TTS-2 *(default)* | `inworld-tts-2`        |
| Realtime TTS-1.5-Max       | `inworld-tts-1.5-max`  |
| Realtime TTS-1.5-Mini      | `inworld-tts-1.5-mini` |

* Pipecat's API methods for Inworld TTS integration
* Complete example with Inworld TTS
* Official Inworld TTS API documentation
* Create and manage voice models

## Installation

Inworld services require no additional dependencies beyond the base installation:

```bash
uv add "pipecat-ai"
```

## Prerequisites

### Inworld Account Setup

Before using Inworld TTS services, you need:

1. **Inworld Account**: Sign up at [Inworld Studio](https://studio.inworld.ai/)
2. **API Key**: Generate an API key from your account dashboard
3. **Voice Selection**: Choose from available voice models

### Required Environment Variables

* `INWORLD_API_KEY`: Your Inworld API key for authentication

## Configuration

### InworldTTSService

WebSocket-based service for lowest-latency streaming.

* `api_key`: Inworld API key.
* `voice_id`: ID of the voice to use for synthesis. *Deprecated in v0.0.105. Use `settings=InworldTTSService.Settings(voice=...)` instead.*
* ID of the model to use for synthesis. *Deprecated in v0.0.105. Use `settings=InworldTTSService.Settings(model=...)` instead.*
* URL of the Inworld WebSocket API.
* Audio sample rate in Hz. When `None`, uses the pipeline's configured sample rate.
* Audio encoding format.
* `text_aggregation_mode`: Controls how incoming text is aggregated before synthesis. `SENTENCE` (default) buffers text until sentence boundaries, producing more natural speech. `TOKEN` streams tokens directly for lower latency. Import from `pipecat.services.tts_service`.
* *Deprecated in v0.0.104.* Use `text_aggregation_mode` instead.
* Whether to append a trailing space to text before sending to TTS.
* `params`: *Deprecated in v0.0.105. Use `settings=InworldTTSService.Settings(...)` instead.*
* `settings`: Runtime-configurable settings. See [InworldTTSService Settings](#inworldttsservice-settings) below.

#### InworldTTSService Settings

Runtime-configurable settings passed via the `settings` constructor argument using `InworldTTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter       | Type                                   | Default     | Description |
| --------------- | -------------------------------------- | ----------- | ----------- |
| `model`         | `str`                                  | `None`      | Model identifier. *(Inherited.)* |
| `voice`         | `str`                                  | `None`      | Voice identifier. *(Inherited.)* |
| `language`      | `Language \| str`                      | `None`      | Language for synthesis. *(Inherited.)* |
| `speaking_rate` | `float`                                | `NOT_GIVEN` | Speaking rate for speech synthesis. |
| `temperature`   | `float`                                | `NOT_GIVEN` | Temperature for speech synthesis. |
| `delivery_mode` | `"STABLE" \| "BALANCED" \| "CREATIVE"` | `NOT_GIVEN` | Controls the stability vs. creativity tradeoff. `"STABLE"` produces reliable, predictable speech. `"BALANCED"` is the default midpoint. `"CREATIVE"` produces more expressive, emotionally varied speech. Only supported by `inworld-tts-2`. |

### InworldHttpTTSService

HTTP-based service supporting both streaming and non-streaming modes.

* `api_key`: Inworld API key.
* `aiohttp_session`: aiohttp ClientSession for HTTP requests.
* `voice_id`: ID of the voice to use for synthesis. *Deprecated in v0.0.105. Use `settings=InworldHttpTTSService.Settings(voice=...)` instead.*
* ID of the model to use for synthesis. *Deprecated in v0.0.105. Use `settings=InworldHttpTTSService.Settings(model=...)` instead.*
* `streaming`: Whether to use streaming mode.
* Audio sample rate in Hz.
* Audio encoding format.
* `params`: *Deprecated in v0.0.105. Use `settings=InworldHttpTTSService.Settings(...)` instead.*
* `settings`: Runtime-configurable settings. See [InworldTTSService Settings](#inworldttsservice-settings) above.

## Usage

### Basic Setup (WebSocket)

```python
import os

from pipecat.services.inworld import InworldTTSService

tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
    ),
)
```

### With Custom Settings

```python
tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
        model="inworld-tts-2",
        temperature=0.8,
        speaking_rate=1.1,
    ),
)
```

### HTTP Service

```python
import os

import aiohttp
from pipecat.services.inworld import InworldHttpTTSService

async with aiohttp.ClientSession() as session:
    tts = InworldHttpTTSService(
        api_key=os.getenv("INWORLD_API_KEY"),
        aiohttp_session=session,
        # voice_id= is deprecated as of v0.0.105; pass the voice via Settings.
        settings=InworldHttpTTSService.Settings(
            voice="Ashley",
        ),
        streaming=True,
    )
```

The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use `Settings` / `settings=` instead. See the [Service Settings guide](/pipecat/fundamentals/service-settings) for migration details.
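
The settings table above notes that `delivery_mode` is only supported by `inworld-tts-2`. The following is a minimal sketch of how a caller might check that constraint before constructing settings. `build_settings_kwargs` is a hypothetical helper written for illustration and is not part of Pipecat; it only uses the model IDs and delivery modes listed on this page.

```python
from typing import Any, Optional

# Values taken from the tables on this page.
DELIVERY_MODES = {"STABLE", "BALANCED", "CREATIVE"}
DELIVERY_MODE_MODELS = {"inworld-tts-2"}


def build_settings_kwargs(
    voice: str,
    model: str = "inworld-tts-2",
    delivery_mode: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate and collect keyword arguments for Settings."""
    kwargs: dict[str, Any] = {"voice": voice, "model": model}
    if delivery_mode is not None:
        if delivery_mode not in DELIVERY_MODES:
            raise ValueError(f"Unknown delivery_mode: {delivery_mode!r}")
        if model not in DELIVERY_MODE_MODELS:
            raise ValueError(f"delivery_mode is not supported by model {model!r}")
        # Only include delivery_mode when it was explicitly requested.
        kwargs["delivery_mode"] = delivery_mode
    return kwargs
```

Assuming the result is unpacked into the settings class, usage would look like `InworldTTSService.Settings(**build_settings_kwargs("Ashley", delivery_mode="STABLE"))`.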
## Notes

* **WebSocket vs HTTP**: The WebSocket service (`InworldTTSService`) provides the lowest latency with bidirectional streaming and supports multiple independent audio contexts per connection (max 5). The HTTP service supports both streaming and non-streaming modes via the `streaming` parameter.
* **Word timestamps**: Both services provide word-level timestamps for synchronized text display. Timestamps are tracked cumulatively across utterances within a turn. When timestamps are not received from the service, a fallback mechanism ensures the full text is still committed to the LLM conversation context, even on interruption.
* **Auto mode**: When `auto_mode=True` (default), the server controls flushing of buffered text for optimal latency and quality. This is recommended when text is sent in full sentences or phrases (i.e., when using `text_aggregation_mode=TextAggregationMode.SENTENCE`).
* **Keepalive**: The WebSocket service sends periodic keepalive messages every 60 seconds to maintain the connection.

## Event Handlers

Inworld TTS supports the standard [service connection events](/api-reference/server/events/service-events):

| Event                 | Description                         |
| --------------------- | ----------------------------------- |
| `on_connected`        | Connected to Inworld WebSocket      |
| `on_disconnected`     | Disconnected from Inworld WebSocket |
| `on_connection_error` | WebSocket connection error occurred |

```python
@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Inworld")
```
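
The word-timestamps note above says timestamps are tracked cumulatively across utterances within a turn. As an illustration of that idea only, here is a plain-Python sketch that shifts per-utterance word times onto a single per-turn timeline. `TurnTimestampTracker` is a hypothetical class, and the assumption that each utterance reports word start times relative to its own start is ours; this is not Pipecat's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class TurnTimestampTracker:
    """Hypothetical helper: places word times from several utterances on one turn timeline."""

    offset: float = 0.0  # seconds of audio already accounted for in this turn
    words: list[tuple[str, float]] = field(default_factory=list)

    def add_utterance(self, word_times: list[tuple[str, float]], duration: float) -> None:
        # word_times holds (word, start_seconds) pairs relative to this utterance's start.
        for word, start in word_times:
            self.words.append((word, self.offset + start))
        # The next utterance begins where this one ended.
        self.offset += duration

    def reset(self) -> None:
        # Start a fresh timeline, for example when a new turn begins.
        self.offset = 0.0
        self.words.clear()
```

With this sketch, a second utterance whose first word starts at 0.0 seconds is recorded at the end time of the first utterance rather than at zero, which is what keeps displayed text in step with audio across a multi-sentence turn.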