# Ultravox Realtime

> Real-time speech-to-speech service implementation using Ultravox's Realtime API

## Overview

`UltravoxRealtimeLLMService` provides real-time conversational AI using Ultravox's Realtime API. It supports both text and audio modalities and provides voice transcription, streaming responses, and tool calling.

<CardGroup cols={2}>
  <Card title="Ultravox Realtime API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.ultravox.llm.html">
    Pipecat's API methods for Ultravox Realtime integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/realtime/realtime-ultravox.py">
    Complete Ultravox Realtime conversation example
  </Card>

  <Card title="Ultravox Documentation" icon="book" href="https://docs.ultravox.ai/overview">
    Official Ultravox API documentation
  </Card>

  <Card title="Ultravox Console" icon="external-link" href="https://app.ultravox.ai/">
    Access Ultravox models and manage API keys
  </Card>
</CardGroup>

## Installation

To use Ultravox Realtime services, install the required dependencies:

```bash theme={null}
uv add "pipecat-ai[ultravox]"
```

## Prerequisites

### Ultravox Account Setup

Before using Ultravox Realtime services, you need:

1. **Ultravox Account**: Sign up at [Ultravox Console](https://app.ultravox.ai/)
2. **API Key**: Generate an Ultravox API key from your account dashboard
3. **Model Access**: Ensure access to Ultravox Realtime models
4. **Usage Limits**: Configure appropriate usage limits and billing

### Required Environment Variables

* `ULTRAVOX_API_KEY`: Your Ultravox API key for authentication
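
As a minimal sketch, the key can be read once at startup so that a missing value fails early instead of surfacing later as an authentication error. The helper name `require_env` is hypothetical and not part of Pipecat; it uses only the standard library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical usage: resolve the key once, then pass it as `api_key`.
# api_key = require_env("ULTRAVOX_API_KEY")
```

The returned string can then be passed as the `api_key` argument of `AgentInputParams` or `OneShotInputParams`.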

### Key Features

* **Audio-Native Model**: Ultravox is an audio-native model for natural voice interactions
* **Real-time Streaming**: Low-latency audio processing and streaming responses
* **Multiple Input Modes**: Support for Agent, One-Shot, and Join URL input parameters
* **Voice Transcription**: Built-in transcription with streaming output
* **Function Calling**: Support for tool integration and API calling
* **Configurable Duration**: Set maximum call duration limits

## Configuration

### UltravoxRealtimeLLMService

<ParamField path="params" type="AgentInputParams | OneShotInputParams | JoinUrlInputParams" required>
  Configuration parameters for connecting to Ultravox. One of three input
  parameter types must be provided. See [Input Parameter
  Types](#input-parameter-types) below.
</ParamField>

<ParamField path="one_shot_selected_tools" type="ToolsSchema" default="None">
  Tools to use with a one-shot call. May only be set when using
  `OneShotInputParams`.
</ParamField>

<ParamField path="settings" type="UltravoxRealtimeLLMService.Settings" default="None">
  Runtime-configurable settings. See [Settings](#settings) below.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `UltravoxRealtimeLLMService.Settings(...)`. These can be updated mid-conversation with `LLMUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter       | Type  | Default     | Description                                                     |
| --------------- | ----- | ----------- | --------------------------------------------------------------- |
| `model`         | `str` | `NOT_GIVEN` | Model identifier. *(Inherited from base settings.)*             |
| `output_medium` | `str` | `NOT_GIVEN` | Output medium: `"voice"` for audio or `"text"` for text output. |

<Note>
  `NOT_GIVEN` values are omitted, letting the service use its own defaults. Only
  parameters that are explicitly set are included.
</Note>
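
The omission rule in the note can be illustrated with a small stand-in. This is a sketch of the idea only: the `_NotGiven` class and the `given_fields` helper are hypothetical and are not Pipecat's implementation, which ships its own `NOT_GIVEN` sentinel.

```python
from typing import Any, Dict


class _NotGiven:
    """Stand-in sentinel for illustration; not Pipecat's own NOT_GIVEN."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def given_fields(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the entries that were explicitly set."""
    return {key: value for key, value in settings.items() if value is not NOT_GIVEN}


# Only `output_medium` was set, so only it survives.
delta = {"model": NOT_GIVEN, "output_medium": "text"}
print(given_fields(delta))  # {'output_medium': 'text'}
```

Note that an explicit `None` is still a given value in this sketch; only the sentinel marks a field as unset.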

### Input Parameter Types

Ultravox supports three different ways to create or join a call:

#### AgentInputParams

Use a pre-configured Ultravox Agent to handle calls consistently.

| Parameter          | Type             | Default  | Description                                                                                                              |
| ------------------ | ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------------ |
| `api_key`          | `str`            | required | Ultravox API key for authentication.                                                                                     |
| `agent_id`         | `UUID`           | required | The ID of the Ultravox agent. Create and edit agents in the [Ultravox Console](https://app.ultravox.ai/agents).          |
| `template_context` | `Dict[str, Any]` | `{}`     | Context variables for agent template instantiation.                                                                      |
| `metadata`         | `Dict[str, str]` | `{}`     | Metadata to attach to the call.                                                                                          |
| `max_duration`     | `timedelta`      | `None`   | Maximum call duration (10s to 1h). `None` uses the agent's default.                                                      |
| `extra`            | `Dict[str, Any]` | `{}`     | Extra parameters for the [agent call creation request](https://docs.ultravox.ai/api-reference/agents/agents-calls-post). |
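
Two of the fields above have constraints that are easy to check before a call is created: `agent_id` must parse as a UUID, and `max_duration` must fall within 10 seconds to 1 hour. The sketch below uses only the standard library. The helper names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
import uuid
from datetime import timedelta
from typing import Optional

MIN_DURATION = timedelta(seconds=10)
MAX_DURATION = timedelta(hours=1)


def parse_agent_id(raw: str) -> uuid.UUID:
    """Parse an agent ID string; uuid.UUID raises ValueError on malformed input."""
    return uuid.UUID(raw)


def check_max_duration(value: Optional[timedelta]) -> Optional[timedelta]:
    """Return the value unchanged if it is None or within 10 s to 1 h, else raise."""
    if value is None:
        return None  # None defers to the agent's default
    if not (MIN_DURATION <= value <= MAX_DURATION):
        raise ValueError(f"max_duration must be between 10s and 1h, got {value}")
    return value
```

For example, `check_max_duration(timedelta(minutes=15))` returns the value unchanged, while `check_max_duration(timedelta(seconds=5))` raises `ValueError`.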

#### OneShotInputParams

Create a one-off call with inline configuration.

| Parameter       | Type             | Default  | Description                                                                                                |
| --------------- | ---------------- | -------- | ---------------------------------------------------------------------------------------------------------- |
| `api_key`       | `str`            | required | Ultravox API key for authentication.                                                                       |
| `system_prompt` | `str`            | `None`   | System prompt to guide the model's behavior.                                                               |
| `temperature`   | `float`          | `0.0`    | Sampling temperature for response generation (0.0-1.0).                                                    |
| `model`         | `str`            | `None`   | Model identifier to use (e.g., `"fixie-ai/ultravox"`).                                                     |
| `voice`         | `UUID`           | `None`   | Voice identifier for speech generation.                                                                    |
| `metadata`      | `Dict[str, str]` | `{}`     | Metadata to attach to the call.                                                                            |
| `max_duration`  | `timedelta`      | `1 hour` | Maximum call duration (10s to 1h).                                                                         |
| `extra`         | `Dict[str, Any]` | `{}`     | Extra parameters for the [call creation request](https://docs.ultravox.ai/api-reference/calls/calls-post). |

#### JoinUrlInputParams

Join an existing Ultravox call using a join URL.

| Parameter  | Type  | Default  | Description                                           |
| ---------- | ----- | -------- | ----------------------------------------------------- |
| `join_url` | `str` | required | The join URL for the existing Ultravox Realtime call. |

## Usage

### Basic Setup with Agent

```python theme={null}
import os
import uuid

from pipecat.services.ultravox.llm import AgentInputParams, UltravoxRealtimeLLMService

llm = UltravoxRealtimeLLMService(
    params=AgentInputParams(
        api_key=os.getenv("ULTRAVOX_API_KEY"),
        # Replace with your agent's ID; uuid.UUID() raises ValueError on a non-UUID string
        agent_id=uuid.UUID("00000000-0000-0000-0000-000000000000"),
    ),
)
```

### One-Shot Call

```python theme={null}
import os

from pipecat.services.ultravox.llm import OneShotInputParams, UltravoxRealtimeLLMService

llm = UltravoxRealtimeLLMService(
    params=OneShotInputParams(
        api_key=os.getenv("ULTRAVOX_API_KEY"),
        system_prompt="You are a helpful assistant.",
        temperature=0.3,
        model="fixie-ai/ultravox",
    ),
)
```

### One-Shot with Tools

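```python theme={null}
import os

from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.services.ultravox.llm import OneShotInputParams, UltravoxRealtimeLLMService

# Describe the tool the model may call
weather_function = FunctionSchema(
    name="get_weather",
    description="Get the current weather for a location",
    properties={"location": {"type": "string", "description": "City name"}},
    required=["location"],
)
tools = ToolsSchema(standard_tools=[weather_function])

llm = UltravoxRealtimeLLMService(
    params=OneShotInputParams(
        api_key=os.getenv("ULTRAVOX_API_KEY"),
        system_prompt="You are a helpful assistant that can check the weather.",
    ),
    one_shot_selected_tools=tools,  # ToolsSchema instance
)

@llm.function("get_weather")
async def get_weather(function_name, tool_call_id, args, llm, context, result_callback):
    location = args.get("location", "unknown")
    await result_callback({"temperature": 72, "condition": "sunny", "location": location})
```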

### Join Existing Call

```python theme={null}
from pipecat.services.ultravox.llm import JoinUrlInputParams, UltravoxRealtimeLLMService

llm = UltravoxRealtimeLLMService(
    params=JoinUrlInputParams(
        # Replace with the join URL of an existing call
        join_url="wss://your-ultravox-join-url",
    ),
)
```

### Switching Output Medium at Runtime

```python theme={null}
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.ultravox.llm import UltravoxRealtimeLLMService

# `task` is the running PipelineTask for your pipeline; these calls must run
# inside an async function.

# Switch to text-only output
await task.queue_frame(
    LLMUpdateSettingsFrame(
        delta=UltravoxRealtimeLLMService.Settings(
            output_medium="text",
        )
    )
)

# Switch back to voice output
await task.queue_frame(
    LLMUpdateSettingsFrame(
        delta=UltravoxRealtimeLLMService.Settings(
            output_medium="voice",
        )
    )
)
```

## Notes

* **Audio-native model**: Ultravox processes audio directly rather than relying on a separate STT step. Voice transcriptions are provided for reference but may not always align with the model's understanding of user input.
* **Server-side context management**: Ultravox handles conversation context server-side. The LLM context in Pipecat is only used for passing function call results back to the service.
* **Audio sample rate**: The service uses a 48 kHz sample rate. Input audio at other sample rates is automatically resampled.
* **Output medium**: The service supports both `"voice"` and `"text"` output modes, switchable at runtime using `LLMUpdateSettingsFrame`.
* **Call duration limits**: When using `AgentInputParams` or `OneShotInputParams`, you can set a maximum call duration between 10 seconds and 1 hour.
* **Tools with agents**: When using `AgentInputParams`, tools are configured on the agent itself. Use `one_shot_selected_tools` only with `OneShotInputParams`.
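
To make the 48 kHz sample rate concrete, here is a small worked calculation of how many bytes a chunk of audio occupies. Only the sample rate comes from the notes above. The 16-bit mono PCM format and the 20 ms chunk length are assumptions chosen for illustration, not documented properties of the service.

```python
SAMPLE_RATE_HZ = 48_000  # from the note on audio sample rate
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
CHANNELS = 1             # assumption: mono


def chunk_size_bytes(duration_ms: int) -> int:
    """Size in bytes of an audio chunk of the given duration."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


# 48,000 samples/s * 0.020 s = 960 samples; 960 * 2 bytes = 1920 bytes
print(chunk_size_bytes(20))  # 1920
```

Under the same assumptions, one second of audio is 96,000 bytes, which is useful when sizing buffers on either side of the resampler.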
