Azure
Speech-to-text service using Azure Cognitive Services Speech SDK
Overview
AzureSTTService provides real-time speech recognition using Azure’s Cognitive Services Speech SDK, with support for continuous recognition, extensive language coverage, and configurable audio processing.
API Reference
Complete API documentation and method details
Azure Speech Docs
Official Azure Speech Service documentation and features
Example Code
Working example with Azure services integration
Installation
To use Azure Speech services, install the required dependency:
You’ll also need to set up your Azure credentials as environment variables:
- AZURE_API_KEY (or AZURE_SPEECH_API_KEY)
- AZURE_REGION (or AZURE_SPEECH_REGION)
Get your API key and region from the Azure Portal by creating a Speech Services resource.
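As a rough illustration of working with the variable names listed above, the following sketch resolves the key and region from the environment. The helper name `resolve_azure_credentials` is hypothetical, and checking the primary name before the alternate is an assumption made for this example, not documented behavior of the service.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_azure_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (api_key, region) read from environment-style variables.

    Hypothetical helper: the lookup order (primary name first, then the
    alternate) is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("AZURE_API_KEY") or env.get("AZURE_SPEECH_API_KEY")
    region = env.get("AZURE_REGION") or env.get("AZURE_SPEECH_REGION")

    # Report every missing value at once so setup problems are easy to spot.
    missing = [
        name
        for name, value in (("AZURE_API_KEY", api_key), ("AZURE_REGION", region))
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing Azure credentials: {', '.join(missing)}")
    return api_key, region
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real process environment.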
Frames
Input
- InputAudioRawFrame - Raw PCM audio data (configurable sample rate, mono or stereo)
- STTUpdateSettingsFrame - Runtime transcription configuration updates
- STTMuteFrame - Mute audio input for transcription
Output
- TranscriptionFrame - Final transcription results
- ErrorFrame - Connection or processing errors
Language Support
Azure Speech STT supports extensive language coverage with regional variants:
View All Supported Languages
Language Code | Description | Service Codes |
---|---|---|
Language.AF | Afrikaans | af-ZA |
Language.AM | Amharic | am-ET |
Language.AR | Arabic (UAE) | ar-AE |
Language.AR_SA | Arabic (Saudi Arabia) | ar-SA |
Language.AR_EG | Arabic (Egypt) | ar-EG |
Language.AS | Assamese | as-IN |
Language.AZ | Azerbaijani | az-AZ |
Language.BG | Bulgarian | bg-BG |
Language.BN | Bengali | bn-IN |
Language.BS | Bosnian | bs-BA |
Language.CA | Catalan | ca-ES |
Language.CS | Czech | cs-CZ |
Language.CY | Welsh | cy-GB |
Language.DA | Danish | da-DK |
Language.DE | German | de-DE |
Language.DE_AT | German (Austria) | de-AT |
Language.DE_CH | German (Switzerland) | de-CH |
Language.EL | Greek | el-GR |
Language.EN | English (US) | en-US |
Language.EN_AU | English (Australia) | en-AU |
Language.EN_CA | English (Canada) | en-CA |
Language.EN_GB | English (UK) | en-GB |
Language.EN_IN | English (India) | en-IN |
Language.ES | Spanish (Spain) | es-ES |
Language.ES_MX | Spanish (Mexico) | es-MX |
Language.ES_US | Spanish (US) | es-US |
Language.ET | Estonian | et-EE |
Language.EU | Basque | eu-ES |
Language.FA | Persian | fa-IR |
Language.FI | Finnish | fi-FI |
Language.FIL | Filipino | fil-PH |
Language.FR | French | fr-FR |
Language.FR_CA | French (Canada) | fr-CA |
Language.GA | Irish | ga-IE |
Language.GL | Galician | gl-ES |
Language.GU | Gujarati | gu-IN |
Language.HE | Hebrew | he-IL |
Language.HI | Hindi | hi-IN |
Language.HR | Croatian | hr-HR |
Language.HU | Hungarian | hu-HU |
Language.HY | Armenian | hy-AM |
Language.ID | Indonesian | id-ID |
Language.IS | Icelandic | is-IS |
Language.IT | Italian | it-IT |
Language.JA | Japanese | ja-JP |
Language.JV | Javanese | jv-ID |
Language.KA | Georgian | ka-GE |
Language.KK | Kazakh | kk-KZ |
Language.KM | Khmer | km-KH |
Language.KN | Kannada | kn-IN |
Language.KO | Korean | ko-KR |
Language.LO | Lao | lo-LA |
Language.LT | Lithuanian | lt-LT |
Language.LV | Latvian | lv-LV |
Language.MK | Macedonian | mk-MK |
Language.ML | Malayalam | ml-IN |
Language.MN | Mongolian | mn-MN |
Language.MR | Marathi | mr-IN |
Language.MS | Malay | ms-MY |
Language.MT | Maltese | mt-MT |
Language.MY | Burmese | my-MM |
Language.NB | Norwegian | nb-NO |
Language.NE | Nepali | ne-NP |
Language.NL | Dutch | nl-NL |
Language.OR | Odia | or-IN |
Language.PA | Punjabi | pa-IN |
Language.PL | Polish | pl-PL |
Language.PS | Pashto | ps-AF |
Language.PT | Portuguese | pt-PT |
Language.PT_BR | Portuguese (Brazil) | pt-BR |
Language.RO | Romanian | ro-RO |
Language.RU | Russian | ru-RU |
Language.SI | Sinhala | si-LK |
Language.SK | Slovak | sk-SK |
Language.SL | Slovenian | sl-SI |
Language.SO | Somali | so-SO |
Language.SQ | Albanian | sq-AL |
Language.SR | Serbian | sr-RS |
Language.SU | Sundanese | su-ID |
Language.SV | Swedish | sv-SE |
Language.SW | Swahili | sw-KE |
Language.TA | Tamil | ta-IN |
Language.TE | Telugu | te-IN |
Language.TH | Thai | th-TH |
Language.TR | Turkish | tr-TR |
Language.UK | Ukrainian | uk-UA |
Language.UR | Urdu | ur-IN |
Language.UZ | Uzbek | uz-UZ |
Language.VI | Vietnamese | vi-VN |
Language.ZH | Chinese (Mandarin) | zh-CN |
Language.ZH_HK | Chinese (Hong Kong) | zh-HK |
Language.ZH_TW | Chinese (Taiwan) | zh-TW |
Language.ZU | Zulu | zu-ZA |
Common languages:
- Language.EN_US - English (US) - en-US
- Language.ES - Spanish - es-ES
- Language.FR - French - fr-FR
- Language.DE - German - de-DE
- Language.IT - Italian - it-IT
- Language.JA - Japanese - ja-JP
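The mapping from language codes to Azure service codes can be pictured as a simple lookup. The sketch below uses a small excerpt of the table, keyed by member name as plain strings so it runs without any dependencies. The names `SERVICE_CODES` and `to_service_code` are hypothetical, and the fallback from a regional variant to its base language is an assumption for illustration, not a description of how the service resolves languages.

```python
from typing import Optional

# Excerpt of the language table above, keyed by the Language member name.
SERVICE_CODES = {
    "EN": "en-US",
    "EN_GB": "en-GB",
    "ES": "es-ES",
    "ES_MX": "es-MX",
    "FR": "fr-FR",
    "PT_BR": "pt-BR",
}


def to_service_code(member: str) -> Optional[str]:
    """Return the Azure locale for a member name, or None if nothing matches.

    Assumed behavior for this sketch: an unlisted regional variant such as
    "ES_AR" falls back to its base language ("ES").
    """
    if member in SERVICE_CODES:
        return SERVICE_CODES[member]
    base = member.split("_", 1)[0]
    return SERVICE_CODES.get(base)
```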
Usage Example
Basic Configuration
Initialize the AzureSTTService and use it in a pipeline:
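A minimal sketch of what this might look like is shown here. The import paths, the `api_key`, `region`, and `language` keyword arguments, and the `Pipeline` class are assumptions about the framework that provides AzureSTTService and may differ between versions, so check the API reference before relying on them. `build_stt_pipeline`, `audio_source`, and `downstream` are hypothetical names standing in for processors your application already has.

```python
import os


def build_stt_pipeline(audio_source, downstream):
    """Return a pipeline shaped as: audio_source -> AzureSTTService -> downstream.

    `audio_source` is a placeholder for whatever produces InputAudioRawFrame
    objects, and `downstream` is a list of processors that consume
    TranscriptionFrame objects. Both are hypothetical.
    """
    # Imported lazily so this module still loads where the framework is not
    # installed. The module paths below are assumptions; verify them against
    # the API reference for your installed version.
    from pipecat.pipeline.pipeline import Pipeline
    from pipecat.services.azure.stt import AzureSTTService
    from pipecat.transcriptions.language import Language

    stt = AzureSTTService(
        api_key=os.getenv("AZURE_API_KEY"),
        region=os.getenv("AZURE_REGION"),
        language=Language.EN_US,  # assumed keyword; see Language Support
    )
    return Pipeline([audio_source, stt, *downstream])
```

Placing the service directly after the audio source reflects the frame flow described under Frames: raw audio goes in, transcription frames come out for the processors that follow.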
Dynamic Configuration
Update settings at runtime by pushing an STTUpdateSettingsFrame to the AzureSTTService:
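The following sketch shows one way such an update might be queued. The import path, the `settings={"language": ...}` shape, and the `queue_frame` method on the task are assumptions that may not match your installed version. `switch_language` and `task` are hypothetical names, with `task` standing in for the running pipeline task.

```python
async def switch_language(task, language):
    """Queue a settings update asking the STT service to change language.

    `task` is a hypothetical handle to the running pipeline task, and
    `language` is a Language value such as Language.ES.
    """
    # Lazy import so this module loads without the framework installed.
    # The path and constructor argument are assumptions; check the API reference.
    from pipecat.frames.frames import STTUpdateSettingsFrame

    frame = STTUpdateSettingsFrame(settings={"language": language})
    await task.queue_frame(frame)
```

Because the update travels through the pipeline as a frame, it is applied in order with the surrounding audio rather than interrupting it.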
Metrics
The service provides:
- Time to First Byte (TTFB) - Latency from audio input to first transcription
- Processing Duration - Total time spent processing audio
Learn how to enable Metrics in your Pipeline.
Additional Notes
- Continuous Recognition: Uses Azure’s continuous recognition mode for real-time processing
- Audio Flexibility: Supports configurable sample rates and both mono/stereo input
- Resource Management: Automatic cleanup of Azure speech recognizer and audio streams
- Threading: Thread-safe operation with proper async event loop handling using asyncio.run_coroutine_threadsafe
- Regional Support: Requires Azure region specification for optimal performance and compliance
- Connection Management: Handles Azure SDK connection lifecycle with proper start/stop/cancel operations
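The Threading note refers to a general asyncio pattern: a callback running on another thread hands work to the event loop with asyncio.run_coroutine_threadsafe. The sketch below illustrates that pattern with the standard library only. It is not the service's actual implementation, and `sdk_callback` and `handle_result` are hypothetical stand-ins for an SDK callback and an async handler.

```python
import asyncio
import threading


async def main() -> list:
    loop = asyncio.get_running_loop()
    received: list = []

    async def handle_result(text: str) -> None:
        # Runs on the event loop thread, so it can safely touch async state.
        received.append(text)

    def sdk_callback(text: str) -> None:
        # Runs on a worker thread, standing in for an SDK callback thread.
        future = asyncio.run_coroutine_threadsafe(handle_result(text), loop)
        # Block this worker thread until the loop has run the coroutine.
        future.result(timeout=5)

    worker = threading.Thread(target=sdk_callback, args=("hello world",))
    worker.start()
    # Wait for the worker in an executor so the event loop stays free
    # to execute handle_result while we wait.
    await loop.run_in_executor(None, worker.join)
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The key point is that the worker thread never calls the coroutine directly; it only schedules it, and the event loop thread does the actual work.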