Deepgram
Text-to-speech service implementation using Deepgram’s Aura API
Overview
DeepgramTTSService converts text to speech using Deepgram’s Aura API. It supports various voices and audio configurations.
Installation
To use DeepgramTTSService, install the required dependencies:
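A typical install command looks like the following; the `deepgram` extra name is an assumption based on Pipecat’s packaging conventions and may differ in your version:

```bash
pip install "pipecat-ai[deepgram]"
```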
You’ll also need to set your Deepgram API key in the DEEPGRAM_API_KEY environment variable:
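```bash
export DEEPGRAM_API_KEY=your-api-key
```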
Configuration
Constructor Parameters
- Your Deepgram API key
- Voice identifier to use for synthesis
- Output audio sample rate in Hz
- Audio encoding format
- Text filter that modifies text provided to the TTS. Learn more about the available filters.
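A constructor call might look like the sketch below. The module path, parameter names (`api_key`, `voice`, `sample_rate`, `encoding`), and default values are assumptions based on the descriptions above, not confirmed by this page:

```python
import os

from pipecat.services.deepgram import DeepgramTTSService

# Parameter names and values below are illustrative assumptions.
tts = DeepgramTTSService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),  # Your Deepgram API key
    voice="aura-helios-en",                 # Voice identifier to use for synthesis
    sample_rate=24000,                      # Output audio sample rate in Hz
    encoding="linear16",                    # Audio encoding format
)
```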
Input
The service accepts text input through its TTS pipeline.
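For example, text can be queued into a running pipeline as a speak request; the frame and method names here (`TTSSpeakFrame`, `queue_frame`) are assumptions based on common Pipecat usage:

```python
from pipecat.frames.frames import TTSSpeakFrame

# Queue a sentence for synthesis on an existing PipelineTask.
await task.queue_frame(TTSSpeakFrame("Hello there, how can I help you today?"))
```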
Output Frames
TTSStartedFrame
Signals the start of audio generation.
TTSAudioRawFrame
Contains generated audio data:
- Raw audio data chunk
- Audio sample rate (24 kHz default)
- Number of audio channels (1 for mono)
TTSStoppedFrame
Signals the completion of audio generation.
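A downstream processor can react to these frames as they stream through the pipeline. The sketch below assumes Pipecat’s standard frame classes and FrameProcessor base, which are not shown on this page:

```python
from pipecat.frames.frames import Frame, TTSAudioRawFrame, TTSStartedFrame, TTSStoppedFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class AudioCollector(FrameProcessor):
    """Collects raw TTS audio chunks as they stream through the pipeline."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, TTSStartedFrame):
            self._chunks = []                     # new utterance starting
        elif isinstance(frame, TTSAudioRawFrame):
            self._chunks.append(frame.audio)      # raw PCM bytes for this chunk
        elif isinstance(frame, TTSStoppedFrame):
            print(f"Received {len(self._chunks)} audio chunks")

        await self.push_frame(frame, direction)   # pass everything downstream
```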
Methods
See the TTS base class methods for additional functionality.
Language Support
Deepgram TTS supports the following languages and regional variants:
| Language Code | Description | Service Codes |
| --- | --- | --- |
| Language.EN | English | en |
Usage Example
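A minimal pipeline sketch, assuming Pipecat’s standard pipeline, task, and runner classes; module paths and parameter names are illustrative rather than taken from this page. In a real application the TTS service would normally be followed by an output transport that plays the generated audio:

```python
import asyncio
import os

from pipecat.frames.frames import EndFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.deepgram import DeepgramTTSService


async def main():
    # Create the TTS service (parameter names are assumptions; see Configuration).
    tts = DeepgramTTSService(
        api_key=os.getenv("DEEPGRAM_API_KEY"),
        voice="aura-helios-en",
    )

    # An output transport would normally follow the TTS service; it is
    # omitted here to keep the sketch self-contained.
    pipeline = Pipeline([tts])

    task = PipelineTask(pipeline)
    await task.queue_frames([TTSSpeakFrame("Hello from Deepgram!"), EndFrame()])

    await PipelineRunner().run(task)


if __name__ == "__main__":
    asyncio.run(main())
```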
Frame Flow
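Based on the output frames listed above, a single synthesis request typically produces: text input → TTSStartedFrame → one or more TTSAudioRawFrame chunks → TTSStoppedFrame.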
Metrics Support
The service supports metrics collection:
- Time to First Byte (TTFB)
- TTS usage metrics
- Processing duration
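One common way to have these metrics reported, assuming Pipecat’s pipeline parameters rather than anything stated on this page, is to enable them on the pipeline task:

```python
from pipecat.pipeline.task import PipelineParams, PipelineTask

# `pipeline` is the Pipeline built in the usage example above.
# Enables TTFB/processing metrics and usage metrics for services in the pipeline.
task = PipelineTask(
    pipeline,
    params=PipelineParams(enable_metrics=True, enable_usage_metrics=True),
)
```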
Audio Processing
- Streams audio in 8KB chunks
- Supports 16-bit PCM format
- Generates mono audio output
- Handles memory buffering
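Because the output is raw 16-bit mono PCM, collected chunks can be written to a standard WAV container for inspection. A small sketch using Python’s built-in wave module; the helper name and the default sample rate of 24000 Hz follow the values documented above:

```python
import wave


def save_wav(chunks: list[bytes], path: str, sample_rate: int = 24000) -> None:
    """Write raw 16-bit mono PCM chunks (e.g. TTSAudioRawFrame.audio) to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)         # mono output
        wf.setsampwidth(2)         # 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"".join(chunks))
```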
Error Handling
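A generic pattern, not specific to Deepgram, is to catch exceptions around the pipeline run and log them:

```python
import logging

try:
    await PipelineRunner().run(task)
except Exception as exc:
    # Log and handle synthesis or connection failures (e.g. an invalid API key).
    logging.exception("TTS pipeline failed: %s", exc)
```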
Notes
- Requires valid Deepgram API key
- Streams audio in chunks
- Supports various voices
- Provides metrics collection
- Handles memory efficiently
- Thread-safe processing