SileroVADAnalyzer
Voice Activity Detection analyzer using the Silero VAD ONNX model
Overview
SileroVADAnalyzer is a Voice Activity Detection (VAD) analyzer that uses the Silero VAD ONNX model to detect speech in audio streams. It provides high-accuracy speech detection and runs inference efficiently through the ONNX runtime.
Installation
The Silero VAD analyzer requires additional dependencies, including the ONNX runtime used for inference. Install them before constructing the analyzer.
Constructor Parameters
sample_rate
int
default: 16000
Audio sample rate in Hz. Must be either 8000 or 16000.
params
VADParams
default: VADParams()
Voice Activity Detection parameters object.
Usage Example
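A minimal sketch of constructing the analyzer with the two constructor parameters documented above. The import paths assume the Pipecat package layout (`pipecat.audio.vad.silero` and `pipecat.audio.vad.vad_analyzer`), which this page does not state, so adjust them to match your installation. The helper name `build_analyzer` is hypothetical.

```python
def build_analyzer():
    """Build a SileroVADAnalyzer using the documented constructor parameters."""
    # Assumption: these module paths match the installed package layout.
    # The imports live inside the function so the optional dependency is
    # only required when an analyzer is actually built.
    from pipecat.audio.vad.silero import SileroVADAnalyzer
    from pipecat.audio.vad.vad_analyzer import VADParams

    return SileroVADAnalyzer(
        sample_rate=16000,   # must be either 8000 or 16000
        params=VADParams(),  # default Voice Activity Detection parameters
    )
```

The returned analyzer would then typically be handed to whatever component feeds it audio, such as a transport's input configuration.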
Technical Details
Sample Rate Requirements
The analyzer supports two sample rates:
- 8000 Hz (256 samples per frame)
- 16000 Hz (512 samples per frame)
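Both frame sizes cover the same span of audio: 256 / 8000 = 512 / 16000 = 0.032 s, or 32 ms per frame. The sketch below works through that arithmetic and shows one way to cut a PCM buffer into analyzer-sized frames. It assumes 16-bit mono PCM (2 bytes per sample), and the helper names are illustrative rather than part of the analyzer's API.

```python
# Samples per frame for each supported sample rate (from the list above).
FRAME_SAMPLES = {8000: 256, 16000: 512}


def frame_samples(sample_rate: int) -> int:
    """Return the frame size in samples, rejecting unsupported rates."""
    if sample_rate not in FRAME_SAMPLES:
        raise ValueError(f"sample_rate must be 8000 or 16000, got {sample_rate}")
    return FRAME_SAMPLES[sample_rate]


def frame_duration_ms(sample_rate: int) -> float:
    """Return the duration of one frame in milliseconds."""
    return 1000.0 * frame_samples(sample_rate) / sample_rate


def split_into_frames(pcm: bytes, sample_rate: int, bytes_per_sample: int = 2) -> list:
    """Split raw PCM into full frames; a trailing partial frame is dropped.

    Assumption: mono audio with `bytes_per_sample` bytes per sample
    (2 for 16-bit PCM).
    """
    frame_bytes = frame_samples(sample_rate) * bytes_per_sample
    full_frames = len(pcm) // frame_bytes
    return [pcm[i * frame_bytes:(i + 1) * frame_bytes] for i in range(full_frames)]
```

In a streaming setting the dropped remainder would normally be buffered and prepended to the next chunk instead of discarded.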
Model Management
- Uses ONNX runtime for efficient inference
- Automatically resets model state every 5 seconds to manage memory
- Runs on CPU by default for consistent performance
- Includes built-in model file
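The periodic state reset described above can be pictured as a simple elapsed-time check performed on each processed frame. The following is an illustrative sketch only, not the analyzer's actual implementation: the class name `PeriodicReset`, its `tick` method, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable


class PeriodicReset:
    """Call `reset_fn` whenever at least `interval_secs` have elapsed."""

    def __init__(
        self,
        reset_fn: Callable[[], None],
        interval_secs: float = 5.0,  # matches the 5-second interval noted above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._reset_fn = reset_fn
        self._interval = interval_secs
        self._clock = clock
        self._last_reset = clock()

    def tick(self) -> bool:
        """Check the clock once per frame; return True if a reset was triggered."""
        now = self._clock()
        if now - self._last_reset >= self._interval:
            self._reset_fn()
            self._last_reset = now
            return True
        return False
```

A monotonic clock is used so that wall-clock adjustments cannot trigger or suppress a reset, and passing the clock in makes the behavior easy to test without waiting.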
Notes
- High-accuracy speech detection
- Efficient ONNX-based processing
- Automatic memory management
- Thread-safe for pipeline processing
- Built-in model file included
- CPU-optimized inference
- Supports 8kHz and 16kHz audio