Groq
Text-to-speech service implementation using Groq’s TTS API
Overview
GroqTTSService converts text to speech using Groq’s TTS API. It supports real-time audio generation with multiple voices.
Installation
To use GroqTTSService, install the required dependencies and set your Groq API key in the GROQ_API_KEY environment variable.
You can obtain a Groq Cloud API key by signing up at Groq.
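As a minimal sketch of the environment-variable setup, the snippet below reads GROQ_API_KEY and fails early with a clear message if it is missing. The helper name load_groq_api_key is hypothetical and not part of the service; only the variable name comes from this page.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Groq API key from GROQ_API_KEY, or raise if it is missing.

    `env` defaults to the process environment; passing a mapping makes the
    helper easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; export it before starting the service"
        )
    return key
```

The returned string can then be passed as the API key constructor parameter described under Configuration.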
Configuration
Constructor Parameters
Your Groq API key
Audio output format
Configuration parameters for speech generation
TTS model to use. See the Groq Cloud docs for available models.
Voice identifier to use for synthesis
Input Parameters
Language for speech synthesis
Speech rate multiplier (higher values produce faster speech)
Random seed for reproducible audio generation
Input
The service accepts text input through the pipeline, including streaming text from an LLM service.
Output Frames
TTSStartedFrame
Signals the start of audio generation.
TTSAudioRawFrame
Contains generated audio data:
Raw audio data chunk
Audio sample rate, based on the constructor setting
Number of audio channels (1 for mono)
TTSStoppedFrame
Signals the completion of audio generation.
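The ordering of these frames can be illustrated with a small, self-contained sketch. The dataclasses below are local stand-ins that only mirror the frame names and fields described above; they are not the library's classes, the field names (audio, sample_rate, num_channels) are assumptions, and frames_for is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class TTSStartedFrame:
    """Stand-in: marks the start of audio generation."""


@dataclass
class TTSAudioRawFrame:
    """Stand-in: one chunk of raw audio plus its format."""

    audio: bytes
    sample_rate: int
    num_channels: int = 1  # mono


@dataclass
class TTSStoppedFrame:
    """Stand-in: marks the completion of audio generation."""


Frame = Union[TTSStartedFrame, TTSAudioRawFrame, TTSStoppedFrame]


def frames_for(chunks: Iterable[bytes], sample_rate: int) -> Iterator[Frame]:
    """Wrap a stream of audio chunks in start/stop frames, skipping empty chunks."""
    yield TTSStartedFrame()
    for chunk in chunks:
        if chunk:
            yield TTSAudioRawFrame(audio=chunk, sample_rate=sample_rate)
    yield TTSStoppedFrame()
```

A downstream consumer can rely on every audio frame arriving between exactly one started frame and one stopped frame. The sample rate passed in is whatever the service was configured with; the value is arbitrary in this sketch.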
Methods
See the TTS base class methods for additional functionality.
Language Support
GroqTTSService supports the following languages:
Language Code | Description | Service Codes |
---|---|---|
Language.EN | English | en |
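The table can be read as a lookup from a language value to the code sent to the service. The sketch below shows one way such a lookup might be written; the Language enum here is a local stand-in with an extra FR member added purely to demonstrate the unsupported case, and to_service_code is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Stand-in for the pipeline's language enum; only the members needed here."""

    EN = "en"
    FR = "fr"  # included only to show an unsupported language


# Supported languages and the service code used for each (from the table above).
SERVICE_CODES = {Language.EN: "en"}


def to_service_code(language: Language) -> Optional[str]:
    """Return the service code for a language, or None if it is not supported."""
    return SERVICE_CODES.get(language)
```

Returning None for an unsupported language lets the caller decide whether to fall back to English or report an error.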
Usage Example
Frame Flow
Metrics Support
The service supports metrics collection:
- Time to First Byte (TTFB)
- Processing duration
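To make the two metrics concrete, here is a rough sketch of how they could be measured over any stream of audio chunks. It does not use the service's own metrics machinery; measure_stream and its injectable clock parameter are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def measure_stream(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[List[bytes], Optional[float], float]:
    """Consume a chunk stream and return (chunks, ttfb_seconds, duration_seconds).

    TTFB is the time from the start of consumption until the first non-empty
    chunk arrives; it is None if no audio arrives at all.
    """
    start = clock()
    ttfb: Optional[float] = None
    received: List[bytes] = []
    for chunk in chunks:
        if ttfb is None and chunk:
            ttfb = clock() - start
        received.append(chunk)
    return received, ttfb, clock() - start
```

TTFB reflects how quickly the first audio becomes available for playback, while processing duration covers the whole request, so the two can diverge noticeably for long utterances.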
Audio Processing
- Streams audio in chunks
- Outputs mono audio at the defined sample rate
- Handles WAV header removal automatically
- Supports WAV format by default
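The service strips WAV headers for you, but the idea is easy to see with Python's standard wave module. The sketch below is not the service's implementation: it assumes a complete WAV file held in memory (a streamed response may need incremental handling), and wav_to_pcm and make_wav are hypothetical helpers.

```python
import io
import wave
from typing import Tuple


def wav_to_pcm(wav_bytes: bytes) -> Tuple[bytes, int, int]:
    """Strip the WAV container and return (raw_pcm, sample_rate, num_channels)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        pcm = wav.readframes(wav.getnframes())
        return pcm, wav.getframerate(), wav.getnchannels()


def make_wav(pcm: bytes, sample_rate: int) -> bytes:
    """Build a mono 16-bit WAV file in memory (used here to create test input)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)  # mono
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

For example, wav_to_pcm(make_wav(pcm, 24000)) returns the original pcm bytes together with the sample rate and channel count read from the header, which is the information an audio frame needs to carry.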
Notes
- Requires a Groq Cloud API key
- Streams audio in chunks for efficient processing
- Automatically handles WAV headers in the response
- Provides metrics collection
- Supports configurable speech parameters