AWS Polly
Text-to-speech service implementation using AWS Polly
Overview
AWSPollyTTSService provides text-to-speech capabilities using AWS’s Polly service. It supports multiple voices, languages, and speech customization options through SSML.
The older PollyTTSService class is still available but has been deprecated. Use AWSPollyTTSService instead.
Installation
To use AWSPollyTTSService, install the required dependencies:
You’ll also need to set up your AWS credentials as environment variables:
- AWS_SECRET_ACCESS_KEY
- AWS_ACCESS_KEY_ID
- AWS_SESSION_TOKEN (if using temporary credentials)
- AWS_REGION (defaults to “us-east-1”)
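As a minimal sketch of how these variables might be read, the snippet below collects them into a dictionary and applies the “us-east-1” default. The helper name `resolve_aws_settings` and the dictionary keys are hypothetical and chosen for illustration; they are not part of AWSPollyTTSService.

```python
import os


def resolve_aws_settings(env=None):
    """Collect AWS settings from environment variables.

    Hypothetical helper for illustration only; the service may resolve
    credentials differently.
    """
    env = os.environ if env is None else env
    settings = {
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
        # Only needed for temporary credentials, so it may be absent.
        "aws_session_token": env.get("AWS_SESSION_TOKEN"),
        # Falls back to the documented default region.
        "region_name": env.get("AWS_REGION", "us-east-1"),
    }
    required = ("aws_access_key_id", "aws_secret_access_key")
    missing = [name for name in required if not settings[name]]
    if missing:
        raise RuntimeError("Missing AWS credentials: " + ", ".join(missing))
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to exercise without touching the real environment.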
Configuration
Constructor Parameters
AWS secret access key (can also use environment variable)
AWS access key ID (can also use environment variable)
AWS session token for temporary credentials (can also use environment variable)
AWS region name (defaults to “us-east-1” if not provided)
AWS Polly voice identifier
Output audio sample rate in Hz (resampled from Polly’s 16kHz)
Modifies text provided to the TTS. Learn more about the available filters.
TTS configuration parameters
Input Parameters
Output Frames
Control Frames
Signals start of speech synthesis
Signals completion of speech synthesis
Audio Frames
Contains generated audio data with:
- PCM audio format
- Sample rate as specified (resampled from 16kHz)
- Single channel (mono)
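To make the resampling step concrete, here is a rough sketch that converts 16-bit little-endian mono PCM from 16kHz to another rate using plain linear interpolation. This is an assumption-laden illustration: the resampler the service actually uses is not described here, the 16-bit sample width is assumed, and `resample_pcm16_mono` is a hypothetical name. Linear interpolation has no anti-aliasing filter, so treat it as a teaching aid rather than a production approach.

```python
import array
import sys


def resample_pcm16_mono(pcm: bytes, src_rate: int = 16000, dst_rate: int = 24000) -> bytes:
    """Naive linear-interpolation resampler for 16-bit little-endian mono PCM."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # input is assumed to be little-endian
    n_src = len(samples)
    if n_src == 0 or src_rate == dst_rate:
        return pcm
    n_dst = n_src * dst_rate // src_rate
    out = array.array("h", bytes(2 * n_dst))
    for i in range(n_dst):
        pos = i * src_rate / dst_rate        # fractional position in the source
        left = int(pos)
        right = min(left + 1, n_src - 1)     # clamp at the last sample
        frac = pos - left
        out[i] = round(samples[left] * (1 - frac) + samples[right] * frac)
    if sys.byteorder == "big":
        out.byteswap()  # emit little-endian again
    return out.tobytes()
```

With these assumptions, 10 ms of 16kHz audio (160 samples) becomes 240 samples at 24kHz, and the channel count stays at one.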
Error Frames
Contains AWS Polly error information
Methods
See the TTS base class methods for additional functionality.
Language Support
Supports an extensive range of languages and regional variants:
Language Code | Description | Service Code |
---|---|---|
Language.AR | Arabic | arb |
Language.AR_AE | Arabic (UAE) | ar-AE |
Language.CA | Catalan | ca-ES |
Language.ZH | Chinese (Mandarin) | cmn-CN |
Language.YUE | Chinese (Cantonese) | yue-CN |
Language.YUE_CN | Chinese (Cantonese) | yue-CN |
Language.CS | Czech | cs-CZ |
Language.DA | Danish | da-DK |
Language.NL | Dutch | nl-NL |
Language.NL_BE | Dutch (Belgium) | nl-BE |
Language.EN | English (US) | en-US |
Language.EN_AU | English (Australia) | en-AU |
Language.EN_GB | English (UK) | en-GB |
Language.EN_IN | English (India) | en-IN |
Language.EN_NZ | English (New Zealand) | en-NZ |
Language.EN_US | English (US) | en-US |
Language.EN_ZA | English (South Africa) | en-ZA |
Language.FI | Finnish | fi-FI |
Language.FR | French | fr-FR |
Language.FR_BE | French (Belgium) | fr-BE |
Language.FR_CA | French (Canada) | fr-CA |
Language.DE | German | de-DE |
Language.DE_AT | German (Austria) | de-AT |
Language.DE_CH | German (Switzerland) | de-CH |
Language.HI | Hindi | hi-IN |
Language.IS | Icelandic | is-IS |
Language.IT | Italian | it-IT |
Language.JA | Japanese | ja-JP |
Language.KO | Korean | ko-KR |
Language.NO | Norwegian | nb-NO |
Language.NB | Norwegian (Bokmål) | nb-NO |
Language.NB_NO | Norwegian (Bokmål) | nb-NO |
Language.PL | Polish | pl-PL |
Language.PT | Portuguese | pt-PT |
Language.PT_BR | Portuguese (Brazil) | pt-BR |
Language.PT_PT | Portuguese (Portugal) | pt-PT |
Language.RO | Romanian | ro-RO |
Language.RU | Russian | ru-RU |
Language.ES | Spanish | es-ES |
Language.ES_MX | Spanish (Mexico) | es-MX |
Language.ES_US | Spanish (US) | es-US |
Language.SV | Swedish | sv-SE |
Language.TR | Turkish | tr-TR |
Language.CY | Welsh | cy-GB |
Language.CY_GB | Welsh | cy-GB |
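The table can be read as a lookup from a language identifier to the code sent to Polly. The sketch below reproduces a few rows with plain string keys standing in for the Language members (the string keys, the `to_polly_language` name, and the fallback to the base language are all assumptions made for illustration).

```python
from typing import Optional

# A few rows from the table above; keys are hypothetical string stand-ins
# for the Language enum members.
POLLY_LANGUAGE_CODES = {
    "ar": "arb",
    "en": "en-US",
    "en-GB": "en-GB",
    "no": "nb-NO",
    "nb": "nb-NO",
    "pt": "pt-PT",
    "pt-BR": "pt-BR",
    "yue": "yue-CN",
    "zh": "cmn-CN",
}


def to_polly_language(code: str) -> Optional[str]:
    """Return the Polly service code, or None if the language is unknown."""
    if code in POLLY_LANGUAGE_CODES:
        return POLLY_LANGUAGE_CODES[code]
    # Assumed behavior: fall back to the base language ("en-XX" -> "en").
    base = code.split("-")[0]
    return POLLY_LANGUAGE_CODES.get(base)
```

Note that several identifiers map to the same service code, as with Norwegian, where both "no" and "nb" resolve to nb-NO.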
Usage Example
SSML Support
The service automatically constructs SSML tags for advanced speech control:
Prosody tags (pitch, rate, volume) have different behaviors based on the engine:
- Standard engine: Supports all prosody tags
- Neural engine: Full prosody support
- Generative engine: Only rate is supported, with a different format (e.g., “1.1” for 10% faster)
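The engine-dependent rules above can be sketched as a small SSML builder. This is an illustrative approximation, not the service's own code: `build_ssml` and its parameters are hypothetical names, and the exact markup the service emits may differ.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, engine="neural", pitch=None, rate=None, volume=None):
    """Wrap text in <speak>, adding <prosody> according to the engine rules."""
    attrs = {}
    if engine == "generative":
        # Generative engine: only rate is honored (e.g. "1.1" for 10% faster).
        if rate is not None:
            attrs["rate"] = rate
    else:
        # Standard and neural engines: pass through all prosody settings.
        for name, value in (("pitch", pitch), ("rate", rate), ("volume", volume)):
            if value is not None:
                attrs[name] = value
    body = escape(text)  # protect &, <, > inside the spoken text
    if attrs:
        attr_text = " ".join(f"{k}={quoteattr(v)}" for k, v in attrs.items())
        body = f"<prosody {attr_text}>{body}</prosody>"
    return f"<speak>{body}</speak>"
```

For example, requesting a pitch change on the generative engine silently drops the pitch attribute and keeps only the rate, while the same call on the standard engine keeps both.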
Frame Flow
Metrics Support
The service collects processing metrics:
- Time to First Byte (TTFB)
- Processing duration
- Character usage
- API calls
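As a rough illustration of how the first three metrics relate to a streamed response, the sketch below wraps an iterable of audio chunks and records the time to the first chunk, the total duration, and the character count. The function name, the metric keys, and the injectable clock are hypothetical; the service's own metrics machinery is not shown here.

```python
import time


def stream_with_metrics(text, chunks, clock=time.monotonic):
    """Yield audio chunks unchanged while filling in a metrics dictionary."""
    start = clock()
    metrics = {"characters": len(text), "ttfb_s": None, "processing_s": None}

    def generator():
        for chunk in chunks:
            if metrics["ttfb_s"] is None:
                # Time to First Byte: delay until the first chunk arrives.
                metrics["ttfb_s"] = clock() - start
            yield chunk
        # Processing duration: recorded once the stream is exhausted.
        metrics["processing_s"] = clock() - start

    return generator(), metrics
```

The metrics dictionary is only complete after the returned generator has been fully consumed, since the duration is recorded at the end of the stream.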
Notes
- Supports all AWS Polly engines:
  - Standard (non-neural voices)
  - Neural (improved quality voices)
  - Generative (high-quality, natural-sounding voices)
- Automatic audio resampling from 16kHz to any desired rate
- Thread-safe processing
- Automatic error handling
- Manages AWS client lifecycle