# xAI

> Text-to-speech services using xAI's HTTP and WebSocket streaming APIs with support for 20 languages

## Overview

xAI provides two text-to-speech services:

* **XAIHttpTTSService**: Batch synthesis via HTTP API. Sends complete text and receives the full audio response.
* **XAITTSService**: Streaming synthesis via WebSocket. Streams text incrementally and receives audio chunks as they're synthesized, reducing latency.

Both support multiple languages and audio encoding formats.

<CardGroup cols={2}>
  <Card title="xAI TTS API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.xai.tts.html">
    Complete API reference for all parameters and methods
  </Card>

  <Card title="WebSocket Example" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-xai.py">
    Streaming WebSocket example with interruption handling
  </Card>

  <Card title="HTTP Example" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-xai-http.py">
    Batch HTTP example
  </Card>

  <Card title="xAI Documentation" icon="book" href="https://docs.x.ai/developers/rest-api-reference/inference/voice">
    Official xAI voice API documentation
  </Card>
</CardGroup>

## Installation

```bash theme={null}
uv add "pipecat-ai[xai]"
```

## Prerequisites

1. **xAI Account**: Sign up at [xAI](https://x.ai/)
2. **API Key**: Generate an API key from your account dashboard (also works with Grok API keys)

Set the following environment variable:

```bash theme={null}
export GROK_API_KEY=your_api_key
```
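
Because `os.getenv` returns `None` when a variable is unset, a missing key tends to surface later as an authentication failure rather than at startup. A minimal sketch of an early check follows; `load_xai_api_key` is a hypothetical helper name, not part of Pipecat.

```python
import os


def load_xai_api_key(env_var: str = "GROK_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; Pipecat does not ship this function.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before starting the pipeline."
        )
    return key
```

The returned string can then be passed as `api_key` to either service.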

## Configuration

### XAIHttpTTSService

<ParamField path="api_key" type="str" required>
  xAI API key for authentication.
</ParamField>

<ParamField path="base_url" type="str" default="https://api.x.ai/v1/tts">
  xAI TTS endpoint URL. Override for custom or proxied deployments.
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Output audio sample rate in Hz. When `None`, uses the pipeline's configured
  sample rate.
</ParamField>

<ParamField path="encoding" type="str" default="pcm">
  Output audio encoding format. Supported formats: `"pcm"`, `"mp3"`, `"wav"`,
  `"mulaw"`, `"alaw"`.
</ParamField>

<ParamField path="aiohttp_session" type="aiohttp.ClientSession" default="None">
  Optional shared aiohttp session for HTTP requests. If `None`, the service
  creates and manages its own session.
</ParamField>

<ParamField path="settings" type="XAIHttpTTSService.Settings" default="None">
  Runtime-configurable settings. See [Settings](#settings) below.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `XAIHttpTTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter  | Type              | Default       | Description                                         |
| ---------- | ----------------- | ------------- | --------------------------------------------------- |
| `model`    | `str`             | `None`        | Model identifier. *(Inherited from base settings.)* |
| `voice`    | `str`             | `"eve"`       | Voice identifier. *(Inherited from base settings.)* |
| `language` | `Language \| str` | `Language.EN` | Language code. *(Inherited from base settings.)*    |

### XAITTSService

<ParamField path="api_key" type="str" required>
  xAI API key for authentication.
</ParamField>

<ParamField path="base_url" type="str" default="wss://api.x.ai/v1/tts">
  xAI TTS WebSocket endpoint URL. Override for custom or proxied deployments.
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Output audio sample rate in Hz. When `None`, uses the pipeline's configured
  sample rate.
</ParamField>

<ParamField path="codec" type="str" default="pcm">
  Output audio codec. Supported codecs: `"pcm"`, `"wav"`, `"mulaw"`, `"alaw"`.
  Defaults to `"pcm"` so emitted `TTSAudioRawFrame` objects need no decoding
  downstream.
</ParamField>

<ParamField path="settings" type="XAITTSService.Settings" default="None">
  Runtime-configurable settings. Uses the same settings structure as
  `XAIHttpTTSService`. Changing voice or language settings at runtime reconnects
  the WebSocket with new query parameters.
</ParamField>
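
The two services document slightly different format lists: `mp3` appears under `encoding` for `XAIHttpTTSService` but not under `codec` for `XAITTSService`. The sketch below checks a format name against those lists before a service is constructed. `check_audio_format` and the two constants are hypothetical names introduced for illustration, and the sets are copied from the parameter descriptions above rather than read from the library.

```python
# Format lists copied from the parameter descriptions above (not from the library).
HTTP_ENCODINGS = frozenset({"pcm", "mp3", "wav", "mulaw", "alaw"})
WEBSOCKET_CODECS = frozenset({"pcm", "wav", "mulaw", "alaw"})


def check_audio_format(fmt: str, *, streaming: bool) -> str:
    """Hypothetical helper: validate a format name for the chosen service.

    `streaming=True` checks against the WebSocket `codec` list,
    `streaming=False` against the HTTP `encoding` list.
    """
    allowed = WEBSOCKET_CODECS if streaming else HTTP_ENCODINGS
    if fmt not in allowed:
        param = "codec" if streaming else "encoding"
        raise ValueError(
            f"{fmt!r} is not a documented {param} value; "
            f"expected one of {sorted(allowed)}"
        )
    return fmt
```

For example, `check_audio_format("mp3", streaming=False)` returns `"mp3"`, while the same call with `streaming=True` raises `ValueError`.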

## Supported Languages

xAI TTS supports 20 languages, counting regional variants separately. Use the `Language` enum from `pipecat.transcriptions.language`:

* Arabic (Egyptian, Saudi, UAE): `Language.AR`, `Language.AR_EG`, `Language.AR_SA`, `Language.AR_AE`
* Bengali: `Language.BN`
* Chinese: `Language.ZH`
* English: `Language.EN`
* French: `Language.FR`
* German: `Language.DE`
* Hindi: `Language.HI`
* Indonesian: `Language.ID`
* Italian: `Language.IT`
* Japanese: `Language.JA`
* Korean: `Language.KO`
* Portuguese (Brazil, Portugal): `Language.PT`, `Language.PT_BR`, `Language.PT_PT`
* Russian: `Language.RU`
* Spanish (Spain, Mexico): `Language.ES`, `Language.ES_ES`, `Language.ES_MX`
* Turkish: `Language.TR`
* Vietnamese: `Language.VI`
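
When a language code comes from user input or a client locale, it may not be one of the members listed above. The sketch below shows one way to fall back from an unlisted regional code to its base language. It works on plain strings and assumes the enum values follow the usual `"pt-BR"` style (so `Language.PT_BR` corresponds to `"pt-BR"`); both that mapping and the `resolve_language` helper are assumptions of this example, not documented Pipecat behavior.

```python
from typing import Optional

# String codes assumed to mirror the Language members listed above,
# e.g. Language.PT_BR -> "pt-BR". The exact values are an assumption.
SUPPORTED_CODES = frozenset({
    "ar", "ar-EG", "ar-SA", "ar-AE", "bn", "zh", "en", "fr", "de", "hi",
    "id", "it", "ja", "ko", "pt", "pt-BR", "pt-PT", "ru", "es", "es-ES",
    "es-MX", "tr", "vi",
})


def resolve_language(code: str) -> Optional[str]:
    """Hypothetical helper: return a listed code, else its base language, else None."""
    if code in SUPPORTED_CODES:
        return code
    base = code.split("-", 1)[0]
    return base if base in SUPPORTED_CODES else None
```

With this sketch, `"es-MX"` resolves to itself, `"fr-CA"` falls back to `"fr"`, and `"sv-SE"` resolves to `None`.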

## Usage

### WebSocket Streaming (XAITTSService)

#### Basic Setup

```python theme={null}
import os
from pipecat.services.xai.tts import XAITTSService

tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAITTSService.Settings(
        voice="eve",
    ),
)
```

#### With Custom Language

```python theme={null}
import os

from pipecat.services.xai.tts import XAITTSService
from pipecat.transcriptions.language import Language

tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAITTSService.Settings(
        voice="eve",
        language=Language.ES,
    ),
)
```

#### With Custom Sample Rate and Codec

```python theme={null}
import os

from pipecat.services.xai.tts import XAITTSService

tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    sample_rate=24000,
    codec="wav",
    settings=XAITTSService.Settings(
        voice="eve",
    ),
)
```

### HTTP Batch (XAIHttpTTSService)

#### Basic Setup

```python theme={null}
import os
from pipecat.services.xai.tts import XAIHttpTTSService

tts = XAIHttpTTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAIHttpTTSService.Settings(
        voice="eve",
    ),
)
```

#### With Custom Encoding

```python theme={null}
import os

from pipecat.services.xai.tts import XAIHttpTTSService

tts = XAIHttpTTSService(
    api_key=os.getenv("GROK_API_KEY"),
    encoding="mp3",
    settings=XAIHttpTTSService.Settings(
        voice="eve",
    ),
)
```

#### With Shared HTTP Session

```python theme={null}
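import os

import aiohttp

from pipecat.services.xai.tts import XAIHttpTTSService


async def main():
    # The caller owns the session, so keep it open while the service is in use.
    async with aiohttp.ClientSession() as session:
        tts = XAIHttpTTSService(
            api_key=os.getenv("GROK_API_KEY"),
            aiohttp_session=session,
            settings=XAIHttpTTSService.Settings(
                voice="eve",
            ),
        )
        # Build and run the pipeline here, inside the `async with` block.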
```

### Updating Settings at Runtime

Settings such as voice and language can be changed mid-conversation by queuing a `TTSUpdateSettingsFrame`. This works for both services:

```python theme={null}
from pipecat.frames.frames import TTSUpdateSettingsFrame
from pipecat.services.xai.tts import XAITTSSettings
from pipecat.transcriptions.language import Language

# `task` is the running PipelineTask for your pipeline.
await task.queue_frame(
    TTSUpdateSettingsFrame(
        delta=XAITTSSettings(
            language=Language.FR,
        )
    )
)
```

Note: For `XAITTSService`, changing voice or language settings reconnects the WebSocket with updated query parameters.
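
Since only some settings are tied to a reconnect, it can help to know in advance whether a given update will trigger one. The sketch below encodes the rule from the note above as a small predicate. `triggers_reconnect` and `RECONNECT_FIELDS` are hypothetical names, and the field set reflects only what this page states; the service itself decides when to reconnect.

```python
from typing import Iterable

# Settings this page says are sent as WebSocket query parameters.
RECONNECT_FIELDS = frozenset({"voice", "language"})


def triggers_reconnect(changed_fields: Iterable[str]) -> bool:
    """Hypothetical helper: True if any changed setting is tied to a reconnect."""
    return not RECONNECT_FIELDS.isdisjoint(changed_fields)
```

A check like this may be useful for deciding when to queue an update, for example between turns rather than in the middle of an utterance.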

## Notes

* **Service choice**:
  * Use `XAITTSService` (WebSocket) for lower-latency streaming synthesis, where audio begins playing before the full utterance has been synthesized.
  * Use `XAIHttpTTSService` (HTTP) for simpler batch synthesis or when WebSocket connections are not available.
* **Default audio format**: Both services default to raw PCM output, which matches Pipecat's downstream expectations without extra decoding.
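* **Encoding/codec options**: When using a non-PCM format (`mp3`, `wav`, `mulaw`, `alaw`), ensure your audio pipeline can handle it. Note that `mp3` is listed only for the `encoding` parameter of `XAIHttpTTSService`, not for the `codec` parameter of `XAITTSService`.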
* **Session management**:
  * `XAIHttpTTSService`: If you don't provide an `aiohttp_session`, the service creates and manages its own session lifecycle automatically.
  * `XAITTSService`: The WebSocket connection is managed automatically; settings changes that affect URL parameters (voice, language) trigger a reconnection.
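
The service-choice note above can be sketched as a small factory. This is an illustration under stated assumptions, not a recommended API: `pick_service_name` and `build_tts` are hypothetical helpers, and the constructor arguments are the ones shown in the usage examples on this page. The Pipecat import is done lazily so the selection logic can be exercised without the `xai` extra installed.

```python
import os


def pick_service_name(*, websocket_available: bool = True) -> str:
    """Hypothetical helper: prefer streaming, fall back to HTTP batch synthesis."""
    return "XAITTSService" if websocket_available else "XAIHttpTTSService"


def build_tts(*, websocket_available: bool = True, voice: str = "eve"):
    """Construct the chosen service using the arguments shown on this page."""
    # Imported lazily; requires `pipecat-ai[xai]` to be installed.
    import pipecat.services.xai.tts as xai_tts

    cls = getattr(xai_tts, pick_service_name(websocket_available=websocket_available))
    return cls(
        api_key=os.getenv("GROK_API_KEY"),
        settings=cls.Settings(voice=voice),
    )
```

Calling `build_tts(websocket_available=False)` would return an `XAIHttpTTSService` that manages its own `aiohttp` session, since no `aiohttp_session` is passed.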
