NVIDIA Parakeet
Speech-to-text service implementation using NVIDIA’s Parakeet speech recognition model
Overview
ParakeetSTTService provides real-time speech-to-text using NVIDIA’s Riva Parakeet model. It supports interim results and configurable recognition parameters such as language, automatic punctuation, and word boosting.
Installation
To use ParakeetSTTService, install the required dependencies.
You’ll also need to set your NVIDIA API key in the NVIDIA_API_KEY environment variable.
You can obtain an NVIDIA API key by signing up through NVIDIA’s developer portal.
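A minimal sketch of reading the key at startup so that a missing variable fails early with a clear message. The helper name `load_nvidia_api_key` is hypothetical and not part of the service; only the variable name NVIDIA_API_KEY comes from this page.

```python
import os


def load_nvidia_api_key(env=os.environ):
    """Return the NVIDIA API key from the given mapping, or raise a clear error.

    `env` defaults to the process environment; passing a plain dict makes
    the helper easy to exercise without touching real settings.
    """
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NVIDIA_API_KEY is not set; export it before starting the service"
        )
    return key
```

The returned string can then be passed as the API key when constructing the service.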
Configuration
Constructor Parameters
Your NVIDIA API key
NVIDIA Riva server address
NVIDIA function identifier for the STT service
Audio sample rate in Hz
Additional configuration parameters
InputParams
The language for speech recognition
Input
The service processes audio frames containing:
- Raw PCM audio data
- 16-bit depth
- Single channel (mono)
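As a rough illustration of the input format above, the sketch below computes the byte size of a chunk of 16-bit mono PCM and converts float samples into PCM bytes. The helper names are hypothetical, little-endian byte order is an assumption, and none of this is part of the service's API.

```python
import struct

BYTES_PER_SAMPLE = 2  # 16-bit depth
CHANNELS = 1          # single channel (mono)


def chunk_size_bytes(sample_rate_hz: int, duration_ms: int) -> int:
    """Number of bytes in `duration_ms` of 16-bit mono PCM audio."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)
```

For example, assuming a 16000 Hz sample rate, `chunk_size_bytes(16000, 20)` gives the size of a 20 ms chunk: 320 samples at 2 bytes each.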
Output Frames
TranscriptionFrame
Generated for final transcriptions, containing:
- Transcribed text
- User identifier
- ISO 8601 formatted timestamp
- Language used for transcription
InterimTranscriptionFrame
Generated during ongoing speech, containing the same fields as TranscriptionFrame but with preliminary results that may change before the final transcription arrives.
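To make the shape of these frames concrete, here is a minimal stand-in that carries the four fields listed above. `TranscriptSketch` and `make_transcript` are hypothetical names for illustration; they are not the real frame classes, and the `final` flag is an assumption used here in place of two separate frame types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TranscriptSketch:
    """Illustrative stand-in for a transcription frame (not the real class)."""
    text: str        # transcribed text
    user_id: str     # user identifier
    timestamp: str   # ISO 8601 formatted timestamp
    language: str    # language used for transcription
    final: bool      # False for interim (preliminary) results


def make_transcript(text, user_id, language, final):
    """Build a sketch frame stamped with the current UTC time in ISO 8601."""
    ts = datetime.now(timezone.utc).isoformat()
    return TranscriptSketch(text, user_id, ts, language, final)
```

A consumer would typically show interim text as it arrives and replace it once a final result with the same fields is received.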
Methods
See the STT base class methods for additional functionality.
Usage Example
Language Support
Parakeet STT primarily supports English with various regional accents:
| Language Code | Description | Service Codes |
|---|---|---|
| Language.EN_US | English (US) | en-US |
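The table can be read as a lookup from a language constant to the code sent to the service. The sketch below mirrors that idea with a hypothetical stand-in enum; the `Language` class defined here is not the library's own, and it contains only the entry shown in the table.

```python
from enum import Enum


class Language(Enum):
    """Hypothetical stand-in holding only the language listed in the table."""
    EN_US = "en-US"


# Mapping from language constant to service code, as in the table above.
SERVICE_CODES = {Language.EN_US: "en-US"}


def to_service_code(language):
    """Return the service code for a language, or raise ValueError if unsupported."""
    try:
        return SERVICE_CODES[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None
```

Failing loudly on an unlisted language avoids silently sending a code the service does not recognize.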
Frame Flow
Advanced Configuration
The service supports several advanced configuration options:
- Filter profanity from transcription
- Automatically add punctuation
- Whether to disable verbatim transcripts
- List of words to boost in the language model
- Score applied to boosted words
Example with Advanced Configuration
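A minimal sketch of how the advanced options described above might be grouped and validated before being handed to the service. Every field name, default value, and the `as_request_options` helper are assumptions made for illustration; check the service's actual parameter names before relying on them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdvancedOptionsSketch:
    """Hypothetical container for the advanced options (names are assumed)."""
    profanity_filter: bool = False         # filter profanity from transcription
    automatic_punctuation: bool = True     # automatically add punctuation
    no_verbatim_transcripts: bool = False  # disable verbatim transcripts
    boosted_words: List[str] = field(default_factory=list)  # words to boost
    boosted_score: float = 4.0             # score applied to boosted words

    def as_request_options(self) -> dict:
        """Flatten the options into a dict, adding boosting only when words exist."""
        opts = {
            "profanity_filter": self.profanity_filter,
            "automatic_punctuation": self.automatic_punctuation,
            "no_verbatim_transcripts": self.no_verbatim_transcripts,
        }
        if self.boosted_words:
            opts["word_boost"] = {
                "words": list(self.boosted_words),
                "score": self.boosted_score,
            }
        return opts
```

For instance, `AdvancedOptionsSketch(boosted_words=["Parakeet", "Riva"]).as_request_options()` includes a `word_boost` entry, while the default instance omits it.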
Notes
- Uses NVIDIA’s Riva AI Services platform
- Handles streaming audio input
- Provides real-time transcription results
- Manages connection lifecycle
- Uses asyncio for asynchronous processing
- Automatically cleans up resources on stop/cancel
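The asyncio and cleanup notes above can be pictured with a small, self-contained sketch: a worker task consumes queued audio chunks and releases its resources in a `finally` block, which runs both on a normal stop and on cancellation. `StreamSketch` and `demo` are hypothetical and do not reflect the service's internals.

```python
import asyncio


class StreamSketch:
    """Illustrative consumer of streamed audio chunks (not the real service)."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.closed = False
        self.received = 0  # total bytes consumed

    async def run(self):
        try:
            while True:
                chunk = await self.queue.get()
                if chunk is None:  # sentinel meaning "stop"
                    break
                self.received += len(chunk)
        finally:
            # Runs on normal stop and when the task is cancelled.
            self.closed = True


async def demo():
    stream = StreamSketch()
    task = asyncio.create_task(stream.run())
    await stream.queue.put(b"\x00\x00" * 160)  # 160 silent 16-bit samples
    await asyncio.sleep(0)                     # let the worker consume the chunk
    task.cancel()                              # simulate a cancel request
    try:
        await task
    except asyncio.CancelledError:
        pass
    return stream
```

Running `asyncio.run(demo())` returns a stream whose `closed` flag is set even though the worker was cancelled mid-wait.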