# Azure

> Speech-to-text service using Azure Cognitive Services Speech SDK

## Overview

`AzureSTTService` provides real-time speech recognition using Azure's Cognitive Services Speech SDK with support for continuous recognition, extensive language support, and configurable audio processing for enterprise applications.

<CardGroup cols={2}>
  <Card title="Azure STT API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.azure.stt.html">
    Pipecat's API methods for Azure Speech integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-azure.py">
    Complete example with Azure services integration
  </Card>

  <Card title="Azure Speech Documentation" icon="book" href="https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-to-text">
    Official Azure Speech Service documentation and features
  </Card>

  <Card title="Azure Portal" icon="microphone" href="https://portal.azure.com/">
    Create Speech Services resource and get API keys
  </Card>
</CardGroup>

## Installation

To use Azure Speech services, install the required dependency:

```bash theme={null}
uv add "pipecat-ai[azure]"
```

## Prerequisites

### Azure Account Setup

Before using Azure STT services, you need:

1. **Azure Account**: Sign up at [Azure Portal](https://portal.azure.com/)
2. **Speech Services Resource**: Create a Speech Services resource in Azure
3. **API Credentials**: Get your API key and region from the resource

### Required Environment Variables

* `AZURE_SPEECH_API_KEY`: Your Azure Speech API key
* `AZURE_SPEECH_REGION`: Your Azure Speech region (required unless using `private_endpoint`)
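The sketch below shows one way to read these variables and fail fast when they are missing. The helper name `azure_stt_kwargs` is hypothetical and not part of Pipecat; it only builds a dictionary of constructor arguments, on the assumption that you pass either `region` or `private_endpoint` as described above.

```python
import os
from typing import Mapping, Optional


def azure_stt_kwargs(
    env: Mapping[str, str] = os.environ,
    private_endpoint: Optional[str] = None,
) -> dict:
    """Collect AzureSTTService constructor arguments (hypothetical helper)."""
    api_key = env.get("AZURE_SPEECH_API_KEY")
    if not api_key:
        raise ValueError("AZURE_SPEECH_API_KEY is not set")

    region = env.get("AZURE_SPEECH_REGION")
    if not region and not private_endpoint:
        # The region is only optional when a private endpoint is used.
        raise ValueError("Set AZURE_SPEECH_REGION or pass a private_endpoint")

    kwargs = {"api_key": api_key}
    if private_endpoint:
        # Pass only the private endpoint so the two options are never mixed.
        kwargs["private_endpoint"] = private_endpoint
    else:
        kwargs["region"] = region
    return kwargs
```

The result can then be unpacked into the constructor, for example `AzureSTTService(**azure_stt_kwargs())`.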

## Configuration

<ParamField path="api_key" type="str" required>
  Azure Cognitive Services subscription key.
</ParamField>

<ParamField path="region" type="str" default="None">
  Azure region for the Speech service (e.g., `"eastus"`, `"westus2"`). Required
  unless `private_endpoint` is provided.
</ParamField>

<ParamField path="language" type="Language" default="Language.EN_US" deprecated>
  Language for speech recognition. *Deprecated in v0.0.105. Use
  `settings=AzureSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz. When `None`, uses the pipeline's configured sample
  rate.
</ParamField>

<ParamField path="private_endpoint" type="str" default="None">
  Private endpoint for reaching the Speech service from behind a firewall or
  inside a private network. When provided, `region` becomes optional; if both
  are specified, `private_endpoint` takes priority. See [Azure Speech private link
  documentation](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-private-link?tabs=portal)
  for setup details.
</ParamField>

<ParamField path="endpoint_id" type="str" default="None">
  Custom model endpoint ID. Use this for custom speech models deployed in Azure.
</ParamField>

<ParamField path="settings" type="AzureSTTService.Settings" default="None">
  Runtime-configurable settings for the STT service. See [Settings](#settings)
  below.
</ParamField>

<ParamField path="ttfs_p99_latency" type="float" default="AZURE_TTFS_P99">
  P99 latency from speech end to final transcript in seconds. Override for your
  deployment.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `AzureSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter  | Type              | Default          | Description                                                            |
| ---------- | ----------------- | ---------------- | ---------------------------------------------------------------------- |
| `model`    | `str`             | `None`           | STT model identifier. *(Inherited from base STT settings.)*            |
| `language` | `Language \| str` | `Language.EN_US` | Language for speech recognition. *(Inherited from base STT settings.)* |

## Usage

### Basic Setup

```python theme={null}
import os

from pipecat.services.azure.stt import AzureSTTService

stt = AzureSTTService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region=os.getenv("AZURE_SPEECH_REGION"),
)
```

### With Custom Language

```python theme={null}
import os

from pipecat.services.azure.stt import AzureSTTService
from pipecat.transcriptions.language import Language

stt = AzureSTTService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region="westus2",
    settings=AzureSTTService.Settings(
        language=Language.FR,
    ),
)
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>

## Notes

* **SDK-based (not WebSocket)**: Unlike most other STT services in Pipecat, Azure STT uses the Azure Cognitive Services Speech SDK rather than a raw WebSocket connection. Recognition callbacks run on SDK-managed threads and are bridged to asyncio via `asyncio.run_coroutine_threadsafe`.
* **Continuous recognition**: The service uses Azure's `start_continuous_recognition_async` for always-on transcription. It provides both interim (`recognizing`) and final (`recognized`) results automatically.
* **Custom endpoints**: Use the `endpoint_id` parameter to point to a custom speech model deployed in your Azure subscription for domain-specific accuracy improvements.
* **Region vs private endpoint**: At least one of `region` or `private_endpoint` must be provided. If both are specified, `private_endpoint` takes priority and a warning is logged. If neither is provided, a `ValueError` is raised.
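
The thread-to-asyncio bridge mentioned in the first note can be illustrated with a standalone sketch. This is not Pipecat's implementation: the `sdk_callback` and `handle_result` names are hypothetical, and a plain `threading.Thread` stands in for an SDK-managed thread. It only shows how `asyncio.run_coroutine_threadsafe` hands interim (`recognizing`) and final (`recognized`) results from a foreign thread to the event loop.

```python
import asyncio
import threading


async def main() -> list:
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()

    async def handle_result(kind: str, text: str) -> None:
        # Runs on the event loop thread, so using asyncio objects is safe here.
        await queue.put((kind, text))

    def sdk_callback(kind: str, text: str) -> None:
        # Runs on a non-asyncio thread; schedule the coroutine on the loop
        # instead of awaiting it directly.
        asyncio.run_coroutine_threadsafe(handle_result(kind, text), loop)

    def fake_sdk_thread() -> None:
        # Stand-in for the SDK firing interim and final recognition events.
        sdk_callback("recognizing", "hello")
        sdk_callback("recognized", "hello world")

    worker = threading.Thread(target=fake_sdk_thread)
    worker.start()

    results = [await queue.get() for _ in range(2)]
    # Join off the event loop thread so the loop is never blocked.
    await asyncio.to_thread(worker.join)
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Scheduling without waiting on the returned future keeps the callback thread from blocking on the event loop, which matters when callbacks arrive frequently.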
