Text-to-speech services using Async’s WebSocket and HTTP APIs
- AsyncAITTSService: WebSocket-based streaming TTS with interruption support
- AsyncAIHttpTTSService: HTTP-based streaming TTS service for simpler synthesis

AsyncAITTSService is recommended for real-time applications.

Set your API key in the ASYNCAI_API_KEY environment variable.
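As a minimal sketch, the key can be read from the environment and passed to the WebSocket service. The import path pipecat.services.asyncai.tts, the api_key and voice_id keyword arguments, and the helper names load_api_key and build_tts are assumptions not stated in this page; check them against your installed Pipecat version.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Async API key from ASYNCAI_API_KEY, failing early if unset."""
    key = env.get("ASYNCAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ASYNCAI_API_KEY is not set")
    return key


def build_tts(voice_id: str):
    """Construct the WebSocket TTS service (assumed import path and kwargs)."""
    # Imported lazily so this module loads even when Pipecat is not installed.
    from pipecat.services.asyncai.tts import AsyncAITTSService

    return AsyncAITTSService(api_key=load_api_key(), voice_id=voice_id)
```

Failing early on a missing key tends to give a clearer error than a rejected connection later in the pipeline.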
- TextFrame - Text content to synthesize into speech
- TTSSpeakFrame - Text that the TTS service should speak
- TTSUpdateSettingsFrame - Runtime configuration updates (e.g., voice)
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries
- TTSStartedFrame - Signals start of synthesis
- TTSAudioRawFrame - Generated audio data chunks
- TTSStoppedFrame - Signals completion of synthesis
- ErrorFrame - Connection or processing errors

Feature | AsyncAITTSService (WebSocket) | AsyncAIHttpTTSService (HTTP) |
---|---|---|
Streaming | ✅ Low-latency chunks | ✅ Response streaming |
Interruption | ✅ Advanced handling | ⚠️ Basic support |
Latency | 🚀 Low | 📈 Higher |
Connection | WebSocket persistent | HTTP per-request |

Language Code | Description | Service Code |
---|---|---|
Language.EN | English | en |
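
The mapping in the table can be sketched as a simple lookup. The Language enum below is a hypothetical stand-in defined only for illustration (Pipecat ships its own), and to_service_code is an illustrative helper name, not part of the library.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Hypothetical stand-in for Pipecat's Language enum, for illustration."""

    EN = "en"
    FR = "fr"  # included only to show the unsupported case


# Only the mapping listed in the table above.
SERVICE_CODES = {Language.EN: "en"}


def to_service_code(language: Language) -> Optional[str]:
    """Return the service code for a language, or None if it is not listed."""
    return SERVICE_CODES.get(language)
```

Returning None for unlisted languages leaves the fallback decision to the caller.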

Initialize AsyncAIHttpTTSService and use it in a pipeline:
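
A minimal sketch of that setup follows. It assumes the HTTP service accepts api_key, voice_id, and an aiohttp_session, and that the import paths shown match your Pipecat version; none of these are confirmed by this page. The transport_output argument is a hypothetical placeholder for whatever output processor your transport provides.

```python
import os


async def run_http_tts(voice_id: str, transport_output) -> None:
    """Speak one sentence through the HTTP TTS service, then shut down."""
    # Lazy imports keep this module importable without Pipecat or aiohttp.
    import aiohttp
    from pipecat.frames.frames import EndFrame, TTSSpeakFrame
    from pipecat.pipeline.pipeline import Pipeline
    from pipecat.pipeline.runner import PipelineRunner
    from pipecat.pipeline.task import PipelineTask
    from pipecat.services.asyncai.tts import AsyncAIHttpTTSService

    async with aiohttp.ClientSession() as session:
        # Assumed constructor arguments; the session is reused across requests.
        tts = AsyncAIHttpTTSService(
            api_key=os.environ["ASYNCAI_API_KEY"],
            voice_id=voice_id,
            aiohttp_session=session,
        )
        # TTS output flows into the (hypothetical) transport output processor.
        task = PipelineTask(Pipeline([tts, transport_output]))
        await task.queue_frames([TTSSpeakFrame("Hello from Async."), EndFrame()])
        await PipelineRunner().run(task)
```

A real application would typically place an LLM and context aggregators ahead of the TTS service; they are omitted here to keep the sketch short.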
Update settings at runtime by pushing a TTSUpdateSettingsFrame for either service:
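
A minimal sketch of a runtime voice change is shown here. The "voice" settings key and the settings keyword argument are assumptions based on common Pipecat usage, and voice_update and switch_voice are illustrative helper names; verify the exact frame signature against your installed version.

```python
from typing import Any, Dict


def voice_update(voice_id: str) -> Dict[str, Any]:
    """Build the settings mapping for a voice change (key name is assumed)."""
    return {"voice": voice_id}


async def switch_voice(task, voice_id: str) -> None:
    """Queue a TTSUpdateSettingsFrame on a running PipelineTask."""
    # Imported lazily so this module loads even when Pipecat is not installed.
    from pipecat.frames.frames import TTSUpdateSettingsFrame

    await task.queue_frame(TTSUpdateSettingsFrame(settings=voice_update(voice_id)))
```

The same frame should work with either service, since both consume it as an input frame.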
- Use AsyncAITTSService for low-latency streaming cases
- Configure shared settings in PipelineParams rather than per-service for consistency