# Azure

> Speech-to-text service using Azure Cognitive Services Speech SDK

## Overview

`AzureSTTService` provides real-time speech recognition using Azure's Cognitive Services Speech SDK, with continuous recognition, broad language coverage, and configurable audio processing for enterprise applications.

## Installation

To use Azure Speech services, install the required dependency:

```bash
uv add "pipecat-ai[azure]"
```

## Prerequisites

### Azure Account Setup

Before using Azure STT services, you need:

1. **Azure Account**: Sign up at [Azure Portal](https://portal.azure.com/)
2. **Speech Services Resource**: Create a Speech Services resource in Azure
3. **API Credentials**: Get your API key and region from the resource

### Required Environment Variables

* `AZURE_SPEECH_API_KEY`: Your Azure Speech API key
* `AZURE_SPEECH_REGION`: Your Azure Speech region (required unless using `private_endpoint`)

## Configuration

* `api_key`: Azure Cognitive Services subscription key.
* `region`: Azure region for the Speech service (e.g., `"eastus"`, `"westus2"`). Required unless `private_endpoint` is provided.
* `language`: Language for speech recognition. *Deprecated in v0.0.105. Use `settings=AzureSTTService.Settings(...)` instead.*
* Audio sample rate in Hz. When `None`, uses the pipeline's configured sample rate.
* `private_endpoint`: Private endpoint for STT behind a firewall. Enables use in private networks. When provided, `region` becomes optional (`private_endpoint` takes priority if both are specified).
  See [Azure Speech private link documentation](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-private-link?tabs=portal) for setup details.
* `endpoint_id`: Custom model endpoint ID. Use this for custom speech models deployed in Azure.
* `settings`: Runtime-configurable settings for the STT service. See [Settings](#settings) below.
* P99 latency from speech end to final transcript, in seconds. Override for your deployment.

### Settings

Runtime-configurable settings are passed via the `settings` constructor argument using `AzureSTTService.Settings(...)`. They can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | `str` | `None` | STT model identifier. *(Inherited from base STT settings.)* |
| `language` | `Language \| str` | `Language.EN_US` | Language for speech recognition. *(Inherited from base STT settings.)* |
| `profanity` | `"raw" \| "masked" \| "removed"` | `None` | How Azure handles profanity in transcripts. `"raw"` returns text as recognized with no masking, `"masked"` replaces profane words with `****` (Azure default), and `"removed"` drops profane words. The default `None` keeps the Azure SDK default (`"masked"`). Use `"raw"` for non-English deployments where Azure's profanity filter over-eagerly masks ordinary words. |

## Usage

### Basic Setup

```python
import os

from pipecat.services.azure.stt import AzureSTTService

# Credentials are read from the environment variables listed above
stt = AzureSTTService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region=os.getenv("AZURE_SPEECH_REGION"),
)
```

### With Custom Language

```python
import os

from pipecat.services.azure.stt import AzureSTTService
from pipecat.transcriptions.language import Language

stt = AzureSTTService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region="westus2",
    settings=AzureSTTService.Settings(
        language=Language.FR,
    ),
)
```

### With Profanity Filtering

```python
import os

from pipecat.services.azure.stt import AzureSTTService

# Disable profanity masking for non-English deployments
stt = AzureSTTService(
    api_key=os.getenv("AZURE_SPEECH_API_KEY"),
    region="westus2",
    settings=AzureSTTService.Settings(
        profanity="raw",  # No masking
    ),
)
```

The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use `Settings` / `settings=` instead. See the [Service Settings guide](/pipecat/fundamentals/service-settings) for migration details.

## Notes

* **SDK-based (not WebSocket)**: Unlike most other STT services in Pipecat, Azure STT uses the Azure Cognitive Services Speech SDK rather than a raw WebSocket connection. Recognition callbacks run on SDK-managed threads and are bridged to asyncio via `asyncio.run_coroutine_threadsafe`.
* **Continuous recognition**: The service uses Azure's `start_continuous_recognition_async` for always-on transcription. It provides both interim (`recognizing`) and final (`recognized`) results automatically.
* **Finalized transcripts**: Azure's `RecognizedSpeech` events are marked as finalized (`TranscriptionFrame(finalized=True)`). This lets downstream user-turn stop strategies (e.g., `SpeechTimeoutUserTurnStop`) take their fast path instead of waiting for VAD events, which may not arrive on short utterances.
* **Custom endpoints**: Use the `endpoint_id` parameter to point to a custom speech model deployed in your Azure subscription for domain-specific accuracy improvements.
* **Region vs private endpoint**: Provide either `region` or `private_endpoint`; only one is needed. If both are specified, `private_endpoint` takes priority and a warning is logged. If neither is provided, a `ValueError` is raised.
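
The region and private endpoint rule from the notes above can be sketched as a small standalone function. This is an illustration only, not Pipecat's actual implementation: the helper name `resolve_speech_target`, the returned dictionary shape, and the warning text are all assumptions made for the example.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_speech_target(
    region: Optional[str], private_endpoint: Optional[str]
) -> dict:
    """Hypothetical helper mirroring the documented precedence rule."""
    if private_endpoint:
        if region:
            # Both were given: the private endpoint wins and a warning is logged.
            logger.warning(
                "Both region and private_endpoint were provided; "
                "using private_endpoint."
            )
        return {"endpoint": private_endpoint}
    if region:
        return {"region": region}
    # Neither was given: fail early, as the notes describe.
    raise ValueError("Either region or private_endpoint must be provided.")
```

Calling `resolve_speech_target("eastus", None)` selects the region, while passing both arguments selects the private endpoint and emits a warning through the standard `logging` module.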
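
The thread-to-asyncio bridging mentioned in the notes above can be illustrated with the standard library alone. The sketch below does not use the Azure SDK: a plain `threading.Thread` stands in for an SDK-managed callback thread, and the names `sdk_callback`, `handle_result`, and `run_demo` are hypothetical.

```python
import asyncio
import threading


async def main() -> list:
    loop = asyncio.get_running_loop()
    received: list = []
    done = asyncio.Event()

    async def handle_result(text: str) -> None:
        # Runs on the event loop thread, so touching asyncio objects is safe.
        received.append(text)
        done.set()

    def sdk_callback(text: str) -> None:
        # Runs on a non-asyncio thread, as an SDK recognition callback would.
        # run_coroutine_threadsafe schedules the coroutine on the loop.
        asyncio.run_coroutine_threadsafe(handle_result(text), loop)

    # Stand-in for the SDK invoking a callback from its own thread.
    worker = threading.Thread(target=sdk_callback, args=("hello world",))
    worker.start()
    await asyncio.wait_for(done.wait(), timeout=5)
    worker.join()
    return received


def run_demo() -> list:
    return asyncio.run(main())
```

`run_demo()` returns the list of texts delivered from the worker thread. The key point is that the callback never awaits or mutates loop state directly; it only hands a coroutine to the loop.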