# OpenAI

> Text-to-speech service using OpenAI's TTS API

## Overview

`OpenAITTSService` provides text-to-speech synthesis through OpenAI's TTS API. It supports both the traditional TTS models and the newer GPT-based models, and streams 24 kHz PCM audio for real-time applications.

<CardGroup cols={2}>
  <Card title="OpenAI TTS API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.openai.tts.html">
    Pipecat's API methods for OpenAI TTS integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-openai.py">
    Complete example with voice customization
  </Card>

  <Card title="OpenAI Documentation" icon="book" href="https://platform.openai.com/docs/api-reference/audio/createSpeech">
    Official OpenAI TTS API documentation
  </Card>

  <Card title="Voice Samples" icon="microphone" href="https://platform.openai.com/docs/guides/text-to-speech/voice-options">
    Listen to available voice options
  </Card>
</CardGroup>

## Installation

To use OpenAI services, install the required dependencies:

```bash theme={null}
uv add "pipecat-ai[openai]"
```

## Prerequisites

### OpenAI Account Setup

Before using OpenAI TTS services, you need:

1. **OpenAI Account**: Sign up at [OpenAI Platform](https://platform.openai.com/)
2. **API Key**: Generate an API key from your [API keys page](https://platform.openai.com/api-keys)
3. **Voice Selection**: Choose from available voice options (alloy, ash, ballad, cedar, coral, echo, fable, marin, nova, onyx, sage, shimmer, verse)

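The voice names listed above can be checked locally before the service is constructed. This is a minimal sketch: `OPENAI_TTS_VOICES` and `validate_voice` are hypothetical helpers, not part of Pipecat, and the set simply mirrors the list on this page, so it may lag behind what OpenAI currently offers.

```python
# Voice names copied from the list above; they are not fetched from OpenAI.
OPENAI_TTS_VOICES = frozenset({
    "alloy", "ash", "ballad", "cedar", "coral", "echo", "fable",
    "marin", "nova", "onyx", "sage", "shimmer", "verse",
})


def validate_voice(voice: str) -> str:
    """Return the normalized voice name, or raise ValueError if it is unknown."""
    normalized = voice.strip().lower()
    if normalized not in OPENAI_TTS_VOICES:
        options = ", ".join(sorted(OPENAI_TTS_VOICES))
        raise ValueError(f"Unknown voice {voice!r}; expected one of: {options}")
    return normalized
```

A hard-coded set like this rejects any voice added after the list was written, so treat it as an early warning rather than a source of truth.
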
### Required Environment Variables

* `OPENAI_API_KEY`: Your OpenAI API key for authentication

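One way to surface a missing key early is to check the environment at startup instead of waiting for the first synthesis request to fail. The helper below is a sketch only; `require_openai_api_key` is a hypothetical name, not a Pipecat function.

```python
import os


def require_openai_api_key(env=os.environ) -> str:
    """Return the OpenAI API key from the environment, or raise if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or pass api_key= explicitly."
        )
    return key
```

The returned value can be passed as `api_key=` when constructing `OpenAITTSService`.
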
## Configuration

### OpenAITTSService

<ParamField path="api_key" type="str" default="None">
  OpenAI API key for authentication. If `None`, uses the `OPENAI_API_KEY`
  environment variable.
</ParamField>

<ParamField path="base_url" type="str" default="None">
  Custom base URL for OpenAI API. If `None`, uses the default OpenAI endpoint.
</ParamField>

<ParamField path="voice" type="str" default="alloy" deprecated>
  Voice ID to use for synthesis. Options: `alloy`, `ash`, `ballad`, `cedar`,
  `coral`, `echo`, `fable`, `marin`, `nova`, `onyx`, `sage`, `shimmer`, `verse`.

  *Deprecated in v0.0.105. Use `settings=OpenAITTSService.Settings(...)` instead.*
</ParamField>

<ParamField path="model" type="str" default="gpt-4o-mini-tts" deprecated>
  TTS model to use.

  *Deprecated in v0.0.105. Use `settings=OpenAITTSService.Settings(...)` instead.*
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Output audio sample rate in Hz. If `None`, uses 24000 (24 kHz), which is the
  only output rate OpenAI TTS supports.
</ParamField>

<ParamField path="params" type="InputParams" default="None" deprecated>
  Runtime-configurable voice and generation settings.

  *Deprecated in v0.0.105. Use `settings=OpenAITTSService.Settings(...)` instead.*
</ParamField>

<ParamField path="settings" type="OpenAITTSService.Settings" default="None">
  Runtime-configurable settings. See [Settings](#settings) below.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `OpenAITTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter      | Type              | Default     | Description                                                                 |
| -------------- | ----------------- | ----------- | --------------------------------------------------------------------------- |
| `model`        | `str`             | `None`      | TTS model identifier. *(Inherited from base settings.)*                     |
| `voice`        | `str`             | `None`      | Voice identifier. *(Inherited from base settings.)*                         |
| `language`     | `Language \| str` | `None`      | Language for synthesis. *(Inherited from base settings.)*                   |
| `instructions` | `str`             | `NOT_GIVEN` | Instructions to guide voice synthesis behavior (e.g. affect, tone, pacing). |
| `speed`        | `float`           | `NOT_GIVEN` | Voice speed control (0.25 to 4.0).                                          |

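Because `speed` is documented as accepting values from 0.25 to 4.0, user-supplied values may need to be bounded before they reach the settings object. The following is a minimal sketch; `clamp_speed` and the two constants are hypothetical names introduced for illustration, not Pipecat APIs.

```python
MIN_SPEED = 0.25  # lower bound from the table above
MAX_SPEED = 4.0   # upper bound from the table above


def clamp_speed(speed: float) -> float:
    """Clamp a requested speed into the documented 0.25 to 4.0 range."""
    return max(MIN_SPEED, min(MAX_SPEED, speed))
```

Clamping silently changes out-of-range input; raising a `ValueError` instead may be preferable when the value comes from configuration rather than from an end user.
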
## Usage

### Basic Setup

```python theme={null}
import os

from pipecat.services.openai.tts import OpenAITTSService

tts = OpenAITTSService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAITTSService.Settings(
        voice="nova",
    ),
)
```

### With Voice Customization

```python theme={null}
import os

from pipecat.services.openai.tts import OpenAITTSService

tts = OpenAITTSService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAITTSService.Settings(
        voice="coral",
        model="gpt-4o-mini-tts",
        instructions="Speak in a warm, friendly tone with moderate pacing.",
        speed=1.1,
    ),
)
```
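
Since `instructions` only takes effect on models that support it, it can help to build the settings arguments conditionally. The sketch below assumes that `tts-1` and `tts-1-hd` are the traditional models in question; that set, and the helper name `build_settings_kwargs`, are illustrative assumptions rather than part of Pipecat.

```python
# Assumption: these are the traditional models that do not support
# `instructions`. Adjust the set to match the models you actually use.
MODELS_WITHOUT_INSTRUCTIONS = {"tts-1", "tts-1-hd"}


def build_settings_kwargs(model, voice, instructions=None, speed=None):
    """Return keyword arguments intended for OpenAITTSService.Settings(...)."""
    kwargs = {"model": model, "voice": voice}
    # Only forward `instructions` to models assumed to support it.
    if instructions is not None and model not in MODELS_WITHOUT_INSTRUCTIONS:
        kwargs["instructions"] = instructions
    if speed is not None:
        kwargs["speed"] = speed
    return kwargs
```

The result would then be unpacked into the settings object, for example `OpenAITTSService.Settings(**build_settings_kwargs("gpt-4o-mini-tts", "coral", instructions="Speak warmly."))`.
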

### Updating Settings at Runtime

Voice settings can be changed mid-conversation using `TTSUpdateSettingsFrame`:

```python theme={null}
from pipecat.frames.frames import TTSUpdateSettingsFrame
from pipecat.services.openai.tts import OpenAITTSSettings

# `task` is assumed to be the running PipelineTask for this pipeline.
await task.queue_frame(
    TTSUpdateSettingsFrame(
        delta=OpenAITTSSettings(
            instructions="Now speak more formally.",
            speed=0.9,
        )
    )
)
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>

## Notes

* **Fixed sample rate**: OpenAI TTS always outputs audio at 24 kHz. Leave `sample_rate` unset or set it to `24000`; any other value may cause issues.
* **Model selection**: The `gpt-4o-mini-tts` model supports the `instructions` parameter for controlling voice affect and tone; the traditional TTS models (`tts-1`, `tts-1-hd`) do not.
* **HTTP-based service**: OpenAI TTS uses HTTP streaming, so it does not have WebSocket connection events.

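To make the fixed 24 kHz rate concrete, the playback duration of a raw PCM chunk can be estimated from its byte length. This is a sketch under stated assumptions: the sample rate comes from this page, while 16-bit samples and a single (mono) channel are assumptions about the PCM layout that should be verified against the audio you actually receive. `pcm_duration_seconds` is a hypothetical helper.

```python
SAMPLE_RATE_HZ = 24_000  # fixed output rate noted above
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM samples
NUM_CHANNELS = 1         # assumption: mono audio


def pcm_duration_seconds(num_bytes: int) -> float:
    """Estimate playback duration of a raw PCM buffer of the given size."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * NUM_CHANNELS
    return num_bytes / bytes_per_second
```

Under these assumptions one second of audio occupies 48,000 bytes, which is a useful figure when sizing buffers or estimating latency.
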