SambaNova (Whisper)
Speech-to-text service implementation using SambaNova’s Whisper API
Overview
SambaNovaSTTService
provides speech-to-text capabilities using SambaNova’s hosted Whisper API with Voice Activity Detection (VAD) for optimized processing. It uses Voice Activity Detection (VAD) to process speech segments efficiently to create speech segments to send to the API.
API Reference
Complete API documentation and method details
SambaNova Docs
Official SambaNova API documentation and features
Example Code
Working example with function calling
Installation
To use SambaNova services, install the required dependency:
You’ll also need to set up your SambaNova API key as an environment variable: SAMBANOVA_API_KEY
.
Get your SambaNova API key from SambaNova Cloud.
Frames
Input
InputAudioRawFrame
- Raw PCM audio data (16-bit, mono)UserStartedSpeakingFrame
- VAD start signal (begins audio buffering)UserStoppedSpeakingFrame
- VAD stop signal (triggers transcription)
Output
TranscriptionFrame
- Final transcription resultsErrorFrame
- API or processing errors
Models
SambaNova currently offers one Whisper model:
Model | Description | Features |
---|---|---|
Whisper-Large-v3 | OpenAI’s Whisper Large v3 | High accuracy, 99+ languages, robust to noise |
Language Support
SambaNova’s Whisper implementation supports 99+ languages with automatic language detection:
View All Supported Languages
View All Supported Languages
Language Code | Description | Whisper Code |
---|---|---|
Language.AF | Afrikaans | af |
Language.AR | Arabic | ar |
Language.HY | Armenian | hy |
Language.AZ | Azerbaijani | az |
Language.BE | Belarusian | be |
Language.BS | Bosnian | bs |
Language.BG | Bulgarian | bg |
Language.CA | Catalan | ca |
Language.ZH | Chinese | zh |
Language.HR | Croatian | hr |
Language.CS | Czech | cs |
Language.DA | Danish | da |
Language.NL | Dutch | nl |
Language.EN | English | en |
Language.ET | Estonian | et |
Language.FI | Finnish | fi |
Language.FR | French | fr |
Language.GL | Galician | gl |
Language.DE | German | de |
Language.EL | Greek | el |
Language.HE | Hebrew | he |
Language.HI | Hindi | hi |
Language.HU | Hungarian | hu |
Language.IS | Icelandic | is |
Language.ID | Indonesian | id |
Language.IT | Italian | it |
Language.JA | Japanese | ja |
Language.KN | Kannada | kn |
Language.KK | Kazakh | kk |
Language.KO | Korean | ko |
Language.LV | Latvian | lv |
Language.LT | Lithuanian | lt |
Language.MK | Macedonian | mk |
Language.MS | Malay | ms |
Language.MR | Marathi | mr |
Language.MI | Maori | mi |
Language.NE | Nepali | ne |
Language.NO | Norwegian | no |
Language.FA | Persian | fa |
Language.PL | Polish | pl |
Language.PT | Portuguese | pt |
Language.RO | Romanian | ro |
Language.RU | Russian | ru |
Language.SR | Serbian | sr |
Language.SK | Slovak | sk |
Language.SL | Slovenian | sl |
Language.ES | Spanish | es |
Language.SW | Swahili | sw |
Language.SV | Swedish | sv |
Language.TL | Tagalog | tl |
Language.TA | Tamil | ta |
Language.TH | Thai | th |
Language.TR | Turkish | tr |
Language.UK | Ukrainian | uk |
Language.UR | Urdu | ur |
Language.VI | Vietnamese | vi |
Language.CY | Welsh | cy |
Common languages:
Language.EN
- English -en
Language.ES
- Spanish -es
Language.FR
- French -fr
Language.DE
- German -de
Language.IT
- Italian -it
Language.JA
- Japanese -ja
Language variants (like en-US
, fr-CA
) are automatically mapped to their
base language codes.
Usage Example
Basic Configuration
Initialize the SambaNovaSTTService
and use it in a pipeline:
Advanced Configuration
Dynamic Configuration
Make settings updates by pushing an STTUpdateSettingsFrame
for the SambaNovaSTTService
:
Metrics
The service provides comprehensive metrics:
- Time to First Byte (TTFB) - Latency from audio input to first transcription
- Processing Duration - Total time spent processing audio
Learn how to enable Metrics in your Pipeline.
Additional Notes
- VAD Requirement: Must use a transport with VAD analyzer for proper operation
- Segmented Processing: Transcribes complete utterances, not continuous streams
- OpenAI Compatibility: Uses OpenAI-compatible API interface
- Language Detection: Automatic language detection when no language is specified