# Inworld

> Text-to-speech service using Inworld AI's TTS APIs

## Overview

Inworld provides high-quality, low-latency speech synthesis through two services: `InworldTTSService`, which uses WebSockets for real-time, minimal-latency use cases, and `InworldHttpTTSService`, which supports streaming and non-streaming synthesis over HTTP. Inworld TTS supports 12+ languages, timestamps, custom pronunciation, and instant voice cloning.

<CardGroup cols={2}>
  <Card title="Inworld TTS API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.inworld.tts.html">
    Pipecat's API methods for Inworld TTS integration
  </Card>

  <Card title="Example Implementation (Websockets)" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-inworld.py">
    Complete example with Inworld TTS
  </Card>

  <Card title="Inworld Documentation" icon="book" href="https://docs.inworld.ai/docs/tts/tts">
    Official Inworld TTS API documentation
  </Card>

  <Card title="Inworld Portal" icon="microphone" href="https://platform.inworld.ai/">
    Create and manage voice models
  </Card>
</CardGroup>

## Installation

Inworld services require no additional dependencies beyond the base installation:

```bash theme={null}
uv add "pipecat-ai"
```

## Prerequisites

### Inworld Account Setup

Before using Inworld TTS services, you need:

1. **Inworld Account**: Sign up at [Inworld Studio](https://studio.inworld.ai/)
2. **API Key**: Generate an API key from your account dashboard
3. **Voice Selection**: Choose from available voice models

### Required Environment Variables

* `INWORLD_API_KEY`: Your Inworld API key for authentication
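
A missing key usually surfaces later as an authentication error, so it can help to fail fast at startup. The helper below is a minimal sketch using only the standard library; `require_env` is a hypothetical name for illustration and is not part of Pipecat.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Raising here gives a readable message before any service is built.
        raise RuntimeError(f"{name} is not set; export it before starting the bot")
    return value
```

With this in place, `api_key=require_env("INWORLD_API_KEY")` can replace `os.getenv("INWORLD_API_KEY")` in the usage examples, so an unset variable is reported immediately rather than being passed on as `None`.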

## Configuration

### InworldTTSService

WebSocket-based service for lowest-latency streaming.

<ParamField path="api_key" type="str" required>
  Inworld API key.
</ParamField>

<ParamField path="voice_id" type="str" default="Ashley" deprecated>
  ID of the voice to use for synthesis. *Deprecated in v0.0.105. Use
  `settings=InworldTTSService.Settings(voice=...)` instead.*
</ParamField>

<ParamField path="model" type="str" default="inworld-tts-1.5-max" deprecated>
  ID of the model to use for synthesis. *Deprecated in v0.0.105. Use
  `settings=InworldTTSService.Settings(model=...)` instead.*
</ParamField>

<ParamField path="url" type="str" default="wss://api.inworld.ai/tts/v1/voice:streamBidirectional">
  URL of the Inworld WebSocket API.
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz. When `None`, uses the pipeline's configured sample
  rate.
</ParamField>

<ParamField path="encoding" type="str" default="LINEAR16">
  Audio encoding format.
</ParamField>

<ParamField path="text_aggregation_mode" type="TextAggregationMode" default="TextAggregationMode.SENTENCE">
  Controls how incoming text is aggregated before synthesis. `SENTENCE`
  (default) buffers text until sentence boundaries, producing more natural
  speech. `TOKEN` streams tokens directly for lower latency. Import from
  `pipecat.services.tts_service`.
</ParamField>

<ParamField path="aggregate_sentences" type="bool" default="None" deprecated>
  *Deprecated in v0.0.104.* Use `text_aggregation_mode` instead.
</ParamField>

<ParamField path="append_trailing_space" type="bool" default="True">
  Whether to append a trailing space to text before sending to TTS.
</ParamField>

<ParamField path="params" type="InputParams" default="None" deprecated>
  *Deprecated in v0.0.105. Use `settings=InworldTTSService.Settings(...)`
  instead.*
</ParamField>

<ParamField path="settings" type="InworldTTSService.Settings" default="None">
  Runtime-configurable settings. See [InworldTTSService
  Settings](#inworldttsservice-settings) below.
</ParamField>
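
To make the difference between the two `text_aggregation_mode` values concrete, here is a toy sketch of sentence buffering versus token pass-through. It is a conceptual illustration only: the function names and the simple punctuation rule are assumptions and do not reflect how Pipecat implements aggregation.

```python
import re
from typing import Iterable, Iterator

# Sentence-ending punctuation followed by whitespace (a deliberately simple rule).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def aggregate_by_sentence(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer streamed tokens and yield one chunk per complete sentence."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a complete sentence.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream


def pass_through_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each token as soon as it arrives (lower latency, less context)."""
    yield from tokens
```

In the sentence-style sketch, nothing is sent for synthesis until a sentence boundary is seen, which is why it trades some latency for more natural prosody.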

#### InworldTTSService Settings

Runtime-configurable settings passed via the `settings` constructor argument using `InworldTTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter       | Type              | Default     | Description                            |
| --------------- | ----------------- | ----------- | -------------------------------------- |
| `model`         | `str`             | `None`      | Model identifier. *(Inherited.)*       |
| `voice`         | `str`             | `None`      | Voice identifier. *(Inherited.)*       |
| `language`      | `Language \| str` | `None`      | Language for synthesis. *(Inherited.)* |
| `speaking_rate` | `float`           | `NOT_GIVEN` | Speaking rate for speech synthesis.    |
| `temperature`   | `float`           | `NOT_GIVEN` | Temperature for speech synthesis.      |
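
The `NOT_GIVEN` defaults suggest a sentinel pattern, where an unset field is distinguished from an explicit value so that a partial update changes only what the caller specified. The sketch below illustrates that general pattern under stated assumptions: `SketchSettings`, `apply_update`, and the local `NOT_GIVEN` object are hypothetical stand-ins, and treating both `None` and `NOT_GIVEN` as "leave unchanged" is an assumption of this example, not documented Pipecat behavior.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


class _NotGiven:
    """Sentinel type meaning "the caller did not set this field"."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()  # hypothetical stand-in, not imported from Pipecat


@dataclass
class SketchSettings:
    """Hypothetical settings object reusing some field names from the table."""

    model: Optional[str] = None
    voice: Optional[str] = None
    speaking_rate: Any = NOT_GIVEN
    temperature: Any = NOT_GIVEN


def apply_update(current: dict, update: SketchSettings) -> dict:
    """Merge only the fields the caller actually set into the current values."""
    merged = dict(current)  # copy so the original mapping is not mutated
    for f in fields(update):
        value = getattr(update, f.name)
        if value is NOT_GIVEN or value is None:
            continue  # untouched field: keep the existing value
        merged[f.name] = value
    return merged
```

For example, applying `SketchSettings(speaking_rate=1.1)` to an existing configuration changes only the speaking rate and leaves the voice, model, and temperature as they were.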

### InworldHttpTTSService

HTTP-based service supporting both streaming and non-streaming modes.

<ParamField path="api_key" type="str" required>
  Inworld API key.
</ParamField>

<ParamField path="aiohttp_session" type="aiohttp.ClientSession" required>
  aiohttp ClientSession for HTTP requests.
</ParamField>

<ParamField path="voice_id" type="str" default="Ashley" deprecated>
  ID of the voice to use for synthesis. *Deprecated in v0.0.105. Use
  `settings=InworldHttpTTSService.Settings(voice=...)` instead.*
</ParamField>

<ParamField path="model" type="str" default="inworld-tts-1.5-max" deprecated>
  ID of the model to use for synthesis. *Deprecated in v0.0.105. Use
  `settings=InworldHttpTTSService.Settings(model=...)` instead.*
</ParamField>

<ParamField path="streaming" type="bool" default="True">
  Whether to use streaming mode.
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz.
</ParamField>

<ParamField path="encoding" type="str" default="LINEAR16">
  Audio encoding format.
</ParamField>

<ParamField path="params" type="InputParams" default="None" deprecated>
  *Deprecated in v0.0.105. Use `settings=InworldHttpTTSService.Settings(...)`
  instead.*
</ParamField>

<ParamField path="settings" type="InworldHttpTTSService.Settings" default="None">
  Runtime-configurable settings. See [InworldTTSService
  Settings](#inworldttsservice-settings) above.
</ParamField>

## Usage

### Basic Setup (WebSocket)

```python theme={null}
import os

from pipecat.services.inworld import InworldTTSService

tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
    ),
)
```

### With Custom Settings

```python theme={null}
import os

from pipecat.services.inworld import InworldTTSService

tts = InworldTTSService(
    api_key=os.getenv("INWORLD_API_KEY"),
    settings=InworldTTSService.Settings(
        voice="Ashley",
        model="inworld-tts-1.5-max",
        temperature=0.8,
        speaking_rate=1.1,
    ),
)
```

### HTTP Service

```python theme={null}
import os

import aiohttp
from pipecat.services.inworld import InworldHttpTTSService

# Run inside an async function so that `async with` is valid.
async with aiohttp.ClientSession() as session:
    tts = InworldHttpTTSService(
        api_key=os.getenv("INWORLD_API_KEY"),
        aiohttp_session=session,
        settings=InworldHttpTTSService.Settings(
            voice="Ashley",
        ),
        streaming=True,
    )
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>

## Notes

* **WebSocket vs HTTP**: The WebSocket service (`InworldTTSService`) provides the lowest latency with bidirectional streaming and supports multiple independent audio contexts per connection (max 5). The HTTP service supports both streaming and non-streaming modes via the `streaming` parameter.
* **Word timestamps**: Both services provide word-level timestamps for synchronized text display. Timestamps are tracked cumulatively across utterances within a turn. When timestamps are not received from the service, a fallback mechanism ensures the full text is still committed to the LLM conversation context, even on interruption.
* **Auto mode**: When `auto_mode=True` (default), the server controls flushing of buffered text for optimal latency and quality. This is recommended when text is sent in full sentences or phrases (i.e., when using `text_aggregation_mode=TextAggregationMode.SENTENCE`).
* **Keepalive**: The WebSocket service sends periodic keepalive messages every 60 seconds to maintain the connection.
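
The cumulative timestamp tracking described above can be pictured as shifting each utterance's word times by the total duration of the utterances before it. The sketch below is a simplified illustration, assuming word times are reported relative to the start of each utterance and that each utterance's duration is known; `make_cumulative` and its data shapes are hypothetical and are not Pipecat's internal implementation.

```python
from typing import List, Tuple

WordTime = Tuple[str, float]  # (word, start time in seconds)


def make_cumulative(utterances: List[Tuple[List[WordTime], float]]) -> List[WordTime]:
    """Shift per-utterance word times onto one timeline for the whole turn.

    Each item is (word_times, utterance_duration_seconds), where word times
    start from 0.0 within that utterance.
    """
    timeline: List[WordTime] = []
    offset = 0.0
    for word_times, duration in utterances:
        for word, start in word_times:
            timeline.append((word, offset + start))
        offset += duration  # the next utterance begins where this one ended
    return timeline
```

Keeping a single running offset per turn means words from later utterances never appear to start before words from earlier ones, which is what synchronized text display relies on.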

## Event Handlers

Inworld TTS supports the standard [service connection events](/api-reference/server/events/service-events):

| Event                 | Description                         |
| --------------------- | ----------------------------------- |
| `on_connected`        | Connected to Inworld WebSocket      |
| `on_disconnected`     | Disconnected from Inworld WebSocket |
| `on_connection_error` | WebSocket connection error occurred |

```python theme={null}
@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Inworld")
```
