Overview
PlayHT provides high-quality text-to-speech synthesis with two implementations:
- PlayHTTTSService: WebSocket-based with real-time streaming
- PlayHTHttpTTSService: HTTP-based for simpler synthesis
PlayHTTTSService is recommended for interactive applications requiring low latency.
- API Reference: Complete API documentation and method details
- PlayHT Docs: Official PlayHT WebSocket API documentation
- Example Code: Working example with voice cloning
Installation
To use PlayHT services, install the required dependencies and provide your PlayHT credentials through the following environment variables:
- PLAY_HT_USER_ID
- PLAY_HT_API_KEY
Get your credentials from the PlayHT Dashboard.
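As a minimal sketch of the credential setup described above, the helper below reads the two variables and fails early with a clear message if either is missing. The function name `load_playht_credentials` and its `env` parameter are hypothetical and not part of any library; only the two variable names come from this page.

```python
import os


def load_playht_credentials(env=None):
    """Return (user_id, api_key) read from the environment.

    Raises RuntimeError naming whichever variables are missing or empty.
    `env` defaults to os.environ and can be replaced with a dict for testing.
    """
    env = os.environ if env is None else env
    names = ("PLAY_HT_USER_ID", "PLAY_HT_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing PlayHT credentials: " + ", ".join(missing))
    return env["PLAY_HT_USER_ID"], env["PLAY_HT_API_KEY"]
```

Calling this once at startup surfaces a missing key before any synthesis request is attempted.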
Frames
Input
- TextFrame - Text content to synthesize into speech
- TTSSpeakFrame - Text that should be spoken immediately
- TTSUpdateSettingsFrame - Runtime configuration updates
- LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries
Output
- TTSStartedFrame - Signals start of synthesis
- TTSAudioRawFrame - Generated audio data (WAV format)
- TTSStoppedFrame - Signals completion of synthesis
- ErrorFrame - API or processing errors
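To illustrate how a downstream consumer might treat the output frames listed above, here is a rough, self-contained sketch. The four classes are hypothetical stand-ins that only borrow the frame names from this page. They are not the library's classes, and the `audio` and `error` field names are assumptions made for illustration.

```python
from dataclasses import dataclass


# Hypothetical stand-ins named after the output frames listed above.
@dataclass
class TTSStartedFrame:
    pass


@dataclass
class TTSAudioRawFrame:
    audio: bytes  # assumed field name


@dataclass
class TTSStoppedFrame:
    pass


@dataclass
class ErrorFrame:
    error: str  # assumed field name


def collect_audio(frames):
    """Return the audio bytes emitted between a start and a stop frame."""
    audio = bytearray()
    started = False
    for frame in frames:
        if isinstance(frame, ErrorFrame):
            # Surface API or processing errors instead of returning partial audio.
            raise RuntimeError(frame.error)
        if isinstance(frame, TTSStartedFrame):
            started = True
        elif isinstance(frame, TTSAudioRawFrame) and started:
            audio.extend(frame.audio)
        elif isinstance(frame, TTSStoppedFrame):
            break
    return bytes(audio)
```

The sketch stops at the first stop frame, so anything emitted afterwards belongs to the next utterance.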
Service Comparison
Feature | PlayHTTTSService (WebSocket) | PlayHTHttpTTSService (HTTP) |
---|---|---|
Streaming | ✅ Real-time chunks | ❌ Single audio block |
Latency | 🚀 Ultra-low | 📈 Higher |
Interruption | ✅ Advanced handling | ⚠️ Basic support |
Connection | WebSocket-based | HTTP-based |
Language Support
View All Supported Languages
Language Code | Description | Service Code |
---|---|---|
Language.AF | Afrikaans | afrikans |
Language.AM | Amharic | amharic |
Language.AR | Arabic | arabic |
Language.BN | Bengali | bengali |
Language.BG | Bulgarian | bulgarian |
Language.CA | Catalan | catalan |
Language.CS | Czech | czech |
Language.DA | Danish | danish |
Language.DE | German | german |
Language.EL | Greek | greek |
Language.EN | English | english |
Language.ES | Spanish | spanish |
Language.FR | French | french |
Language.GL | Galician | galician |
Language.HE | Hebrew | hebrew |
Language.HI | Hindi | hindi |
Language.HR | Croatian | croatian |
Language.HU | Hungarian | hungarian |
Language.ID | Indonesian | indonesian |
Language.IT | Italian | italian |
Language.JA | Japanese | japanese |
Language.KO | Korean | korean |
Language.MS | Malay | malay |
Language.NL | Dutch | dutch |
Language.PL | Polish | polish |
Language.PT | Portuguese | portuguese |
Language.RU | Russian | russian |
Language.SQ | Albanian | albanian |
Language.SR | Serbian | serbian |
Language.SV | Swedish | swedish |
Language.TH | Thai | thai |
Language.TL | Tagalog | tagalog |
Language.TR | Turkish | turkish |
Language.UK | Ukrainian | ukrainian |
Language.UR | Urdu | urdu |
Language.XH | Xhosa | xhosa |
Language.ZH | Mandarin | mandarin |
- Language.EN - English
- Language.ES - Spanish
- Language.FR - French
- Language.DE - German
- Language.IT - Italian
- Language.JA - Japanese
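The mapping from language codes to service codes shown in the table can be sketched as a plain lookup. This is an illustrative subset only: it uses string keys rather than the framework's `Language` enum so that it runs on its own. The helper name `to_service_code` and the fallback from a regional code such as `en-US` to its base language are assumptions, not documented behavior.

```python
# Subset of the table above; keys are Language member names as plain strings.
SERVICE_CODES = {
    "EN": "english",
    "ES": "spanish",
    "FR": "french",
    "DE": "german",
    "IT": "italian",
    "JA": "japanese",
    "ZH": "mandarin",
}


def to_service_code(language, default=None):
    """Map a code such as "en", "EN" or "en-US" to a service code.

    Regional variants fall back to their base language (an assumption).
    Unknown languages return `default`.
    """
    base = language.replace("_", "-").split("-")[0].upper()
    return SERVICE_CODES.get(base, default)
```

For example, `to_service_code("en-US")` and `to_service_code("EN")` both resolve to `"english"`, while an unlisted code returns `None`.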
Usage Example
WebSocket Service (Recommended)
Initialize the PlayHTTTSService and use it in a pipeline.
HTTP Service
Initialize the PlayHTHttpTTSService and use it in a pipeline.
Dynamic Voice Switching
Update settings at runtime by pushing a TTSUpdateSettingsFrame.
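The idea behind a settings-update frame can be modeled in a few lines. The sketch below is a toy model, not the library's implementation: `UpdateSettingsFrame` and `FakeTTSService` are hypothetical stand-ins, and the `voice` and `speed` keys are assumed setting names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateSettingsFrame:
    """Hypothetical stand-in for TTSUpdateSettingsFrame: carries a dict of changes."""
    settings: dict = field(default_factory=dict)


class FakeTTSService:
    """Hypothetical stand-in for a TTS service holding its current settings."""

    def __init__(self, **settings):
        self.settings = dict(settings)

    def process_frame(self, frame):
        # Merge only the keys the frame provides; everything else is unchanged.
        if isinstance(frame, UpdateSettingsFrame):
            self.settings.update(frame.settings)


tts = FakeTTSService(voice="voice-a", speed=1.0)
tts.process_frame(UpdateSettingsFrame(settings={"voice": "voice-b"}))
print(tts.settings)  # {'voice': 'voice-b', 'speed': 1.0}
```

Because the frame carries only the changed keys, a voice switch does not disturb other settings such as speed.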
Metrics
Both services provide comprehensive metrics:
- Time to First Byte (TTFB) - Latency from text input to first audio
- Processing Duration - Total synthesis time
- Character Usage - Text processed for billing
Learn how to enable Metrics in your Pipeline.
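To make the first two metrics concrete, the following sketch times any iterable of audio chunks. It is a generic illustration of how TTFB and processing duration relate, not how either service computes them. The function name `measure_synthesis` and the injectable `clock` parameter are hypothetical.

```python
import time


def measure_synthesis(chunks, clock=time.perf_counter):
    """Consume an iterable of audio chunks and time it.

    Returns (ttfb_seconds, total_seconds, total_bytes). ttfb_seconds is None
    if the iterable yields nothing.
    """
    start = clock()
    ttfb = None
    total_bytes = 0
    for chunk in chunks:
        if ttfb is None:
            # Time to first byte: delay until the first chunk arrives.
            ttfb = clock() - start
        total_bytes += len(chunk)
    # Processing duration: time until the last chunk has been consumed.
    return ttfb, clock() - start, total_bytes
```

With a streaming source, TTFB is typically much smaller than the total duration. With a source that returns a single block, the two values are nearly equal.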
Additional Notes
- Voice URLs: Use S3 URLs for both standard and cloned voices from PlayHT
- Engine Selection: Choose based on latency requirements and quality needs
- WebSocket Recommended: Use PlayHTTTSService for real-time interactive applications