Text-to-speech service using the Azure Cognitive Services Speech SDK. Two implementations are available:

- AzureTTSService (WebSocket-based streaming)
- AzureHttpTTSService (HTTP-based batch synthesis)

AzureTTSService is recommended for real-time applications requiring low latency and streaming capabilities.

Environment variables:

- AZURE_API_KEY (or AZURE_SPEECH_API_KEY)
- AZURE_REGION (or AZURE_SPEECH_REGION)

Input frames:

- TextFrame - Text content to synthesize into speech
- TTSSpeakFrame - Text that should be spoken immediately
- TTSUpdateSettingsFrame - Runtime configuration updates
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries

Output frames:

- TTSStartedFrame - Signals start of synthesis
- TTSAudioRawFrame - Generated audio data (PCM format)
- TTSStoppedFrame - Signals completion of synthesis
- ErrorFrame - Azure API or processing errors

Feature | AzureTTSService (Streaming) | AzureHttpTTSService (HTTP) |
---|---|---|
Streaming | ✅ Real-time chunks | ❌ Single audio block |
Latency | 🚀 Low | 📈 Higher |
Complexity | ⚠️ WebSocket management | ✅ Simple HTTP |
Connection | WebSocket-based | HTTP-based |
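
Whichever service is chosen, it needs the Azure key and region named earlier on this page, each of which has a primary and an alternate variable name. The following is a minimal sketch of resolving them, assuming the primary name takes precedence when both are set (this page does not state the precedence). The helper names `first_set` and `load_azure_credentials` are hypothetical and not part of any SDK.

```python
import os
from typing import Mapping, Optional, Sequence, Tuple

# Primary name first, then the alternate name, in the order given on this page.
API_KEY_VARS = ("AZURE_API_KEY", "AZURE_SPEECH_API_KEY")
REGION_VARS = ("AZURE_REGION", "AZURE_SPEECH_REGION")


def first_set(names: Sequence[str], env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def load_azure_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Resolve (api_key, region); raise RuntimeError if either is missing.

    Assumption: the primary variable name wins over the alternate one.
    """
    source = os.environ if env is None else env
    api_key = first_set(API_KEY_VARS, source)
    region = first_set(REGION_VARS, source)
    missing = [
        " or ".join(names)
        for names, value in ((API_KEY_VARS, api_key), (REGION_VARS, region))
        if value is None
    ]
    if missing:
        raise RuntimeError("Missing Azure settings: " + "; ".join(missing))
    return api_key, region
```

Calling `load_azure_credentials()` with no argument reads from the process environment. Passing a plain dictionary instead makes the lookup easy to exercise in tests without touching real credentials.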
View All Supported Languages
Language Code | Description | Service Code |
---|---|---|
Language.BG | Bulgarian | bg-BG |
Language.CA | Catalan | ca-ES |
Language.ZH | Chinese (Simplified) | zh-CN |
Language.ZH_TW | Chinese (Traditional) | zh-TW |
Language.CS | Czech | cs-CZ |
Language.DA | Danish | da-DK |
Language.NL | Dutch (Netherlands) | nl-NL |
Language.NL_BE | Dutch (Belgium) | nl-BE |
Language.EN | English (US) | en-US |
Language.EN_US | English (US) | en-US |
Language.EN_AU | English (Australia) | en-AU |
Language.EN_GB | English (UK) | en-GB |
Language.EN_NZ | English (New Zealand) | en-NZ |
Language.EN_IN | English (India) | en-IN |
Language.ET | Estonian | et-EE |
Language.FI | Finnish | fi-FI |
Language.FR | French (France) | fr-FR |
Language.FR_CA | French (Canada) | fr-CA |
Language.DE | German (Germany) | de-DE |
Language.DE_CH | German (Switzerland) | de-CH |
Language.EL | Greek | el-GR |
Language.HI | Hindi | hi-IN |
Language.HU | Hungarian | hu-HU |
Language.ID | Indonesian | id-ID |
Language.IT | Italian | it-IT |
Language.JA | Japanese | ja-JP |
Language.KO | Korean | ko-KR |
Language.LV | Latvian | lv-LV |
Language.LT | Lithuanian | lt-LT |
Language.MS | Malay | ms-MY |
Language.NO | Norwegian | nb-NO |
Language.PL | Polish | pl-PL |
Language.PT | Portuguese (Portugal) | pt-PT |
Language.PT_BR | Portuguese (Brazil) | pt-BR |
Language.RO | Romanian | ro-RO |
Language.RU | Russian | ru-RU |
Language.SK | Slovak | sk-SK |
Language.ES | Spanish | es-ES |
Language.SV | Swedish | sv-SE |
Language.TH | Thai | th-TH |
Language.TR | Turkish | tr-TR |
Language.UK | Ukrainian | uk-UA |
Language.VI | Vietnamese | vi-VN |
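
The table above is a lookup from a Language member to an Azure locale code. As an illustration, the sketch below reproduces a few rows as a plain dictionary and resolves a name to its service code. The fallback from a regional variant to its base language is an assumption made for this example, not documented behavior, and `to_service_code` is a hypothetical helper.

```python
from typing import Dict

# A subset of the rows in the table above: Language member name -> service code.
SERVICE_CODES: Dict[str, str] = {
    "EN": "en-US",
    "EN_US": "en-US",
    "EN_GB": "en-GB",
    "FR": "fr-FR",
    "FR_CA": "fr-CA",
    "DE": "de-DE",
    "NO": "nb-NO",
    "PT": "pt-PT",
    "PT_BR": "pt-BR",
    "ZH": "zh-CN",
    "ZH_TW": "zh-TW",
}


def to_service_code(language: str) -> str:
    """Map a name such as 'Language.FR_CA' or 'fr-ca' to its service code.

    Assumption: an unlisted regional variant falls back to its base
    language (for example FR_BE -> FR). Raises KeyError if nothing matches.
    """
    # Drop an optional "Language." prefix and normalize to the member style.
    key = language.split(".")[-1].upper().replace("-", "_")
    if key in SERVICE_CODES:
        return SERVICE_CODES[key]
    base = key.split("_")[0]
    if base in SERVICE_CODES:
        return SERVICE_CODES[base]
    raise KeyError(f"No service code known for {language!r}")
```

Note that the member name and the service code do not always share a prefix: Language.NO maps to nb-NO, so the code cannot be derived from the name by string manipulation alone.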
Common languages:

- Language.EN_US - English (US)
- Language.EN_GB - English (UK)
- Language.FR - French
- Language.DE - German
- Language.ES - Spanish
- Language.IT - Italian

Audio output formats:

- Raw8Khz16BitMonoPcm
- Raw16Khz16BitMonoPcm
- Raw22050Hz16BitMonoPcm
- Raw24Khz16BitMonoPcm (default)
- Raw44100Hz16BitMonoPcm
- Raw48Khz16BitMonoPcm
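
Each format name above encodes a sample rate, and all of them are 16-bit mono PCM, so one sample occupies two bytes. The sketch below maps a sample rate to the matching format name and estimates the duration of a raw audio buffer. Rejecting unsupported rates with an error is a choice made for this example; how the service itself handles them is not stated here. Both function names are hypothetical.

```python
from typing import Dict, Optional

# Sample rate in Hz -> format name, taken from the list above.
OUTPUT_FORMATS: Dict[int, str] = {
    8000: "Raw8Khz16BitMonoPcm",
    16000: "Raw16Khz16BitMonoPcm",
    22050: "Raw22050Hz16BitMonoPcm",
    24000: "Raw24Khz16BitMonoPcm",  # marked as the default above
    44100: "Raw44100Hz16BitMonoPcm",
    48000: "Raw48Khz16BitMonoPcm",
}
DEFAULT_SAMPLE_RATE = 24000
BYTES_PER_SAMPLE = 2  # 16-bit samples, single channel


def format_for_sample_rate(sample_rate: Optional[int] = None) -> str:
    """Return the format name for a sample rate, or the default if None."""
    rate = DEFAULT_SAMPLE_RATE if sample_rate is None else sample_rate
    if rate not in OUTPUT_FORMATS:
        supported = ", ".join(str(r) for r in sorted(OUTPUT_FORMATS))
        raise ValueError(f"Unsupported sample rate {rate}; choose one of: {supported}")
    return OUTPUT_FORMATS[rate]


def pcm_duration_seconds(num_bytes: int, sample_rate: int) -> float:
    """Duration of a raw 16-bit mono PCM buffer of the given size."""
    format_for_sample_rate(sample_rate)  # validates the rate
    return num_bytes / (sample_rate * BYTES_PER_SAMPLE)
```

At the default rate of 24000 Hz, one second of audio is 48000 bytes, which is a useful figure when sizing buffers for the raw audio frames.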
For WebSocket-based streaming, initialize AzureTTSService and use it in a pipeline.
For HTTP-based batch synthesis, initialize AzureHttpTTSService and use it in a pipeline.
To update settings at runtime, send a TTSUpdateSettingsFrame for the AzureTTSService.
Use AzureTTSService for real-time applications requiring low latency.