SileroVADAnalyzer
Voice Activity Detection analyzer using the Silero VAD ONNX model
Overview
SileroVADAnalyzer is a Voice Activity Detection (VAD) analyzer that uses the Silero VAD ONNX model to detect speech in audio streams. It provides high-accuracy speech detection with efficient processing using ONNX Runtime.
Installation
The Silero VAD analyzer requires additional dependencies:
Constructor Parameters
- Sample rate: audio sample rate in Hz. Must be either 8000 or 16000.
- VAD parameters: an object holding the Voice Activity Detection settings.
Usage Example
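A minimal sketch of constructing the analyzer, not a definitive reference. It assumes the class is importable from `pipecat.audio.vad.silero`, that `VADParams` lives in `pipecat.audio.vad.vad_analyzer`, and that the constructor accepts the keywords `sample_rate` and `params`; none of these names are stated on this page, so check them against your installed version. The helper `build_analyzer` and the `stop_secs` value are hypothetical, chosen only for illustration.

```python
def build_analyzer(sample_rate: int = 16000):
    """Create a SileroVADAnalyzer for 8 kHz or 16 kHz audio (illustrative helper)."""
    # The analyzer only supports these two rates, so fail early with a clear message.
    if sample_rate not in (8000, 16000):
        raise ValueError(f"sample_rate must be 8000 or 16000, got {sample_rate}")

    # Deferred imports: the optional Silero dependencies are only needed here.
    # ASSUMPTION: these module paths and keyword names may differ in your version.
    from pipecat.audio.vad.silero import SileroVADAnalyzer
    from pipecat.audio.vad.vad_analyzer import VADParams

    # ASSUMPTION: `stop_secs` is a VADParams field; the value is arbitrary.
    params = VADParams(stop_secs=0.5)
    return SileroVADAnalyzer(sample_rate=sample_rate, params=params)
```

The returned object would then typically be handed to whatever transport or pipeline component accepts a VAD analyzer.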
Technical Details
Sample Rate Requirements
The analyzer supports two sample rates:
- 8000 Hz (256 samples per frame)
- 16000 Hz (512 samples per frame)
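Both configurations correspond to the same frame duration: 256 / 8000 = 512 / 16000 = 0.032 s, or 32 ms per frame. The sketch below shows that arithmetic and one way to cut a raw buffer into analyzer-sized frames. It assumes 16-bit mono PCM input, which this page does not state; `FRAME_SAMPLES`, `frame_duration_ms`, and `split_frames` are hypothetical names, not part of the library.

```python
from typing import List

# Samples per frame for each supported rate, taken from the list above.
FRAME_SAMPLES = {8000: 256, 16000: 512}
BYTES_PER_SAMPLE = 2  # ASSUMPTION: 16-bit mono PCM


def frame_duration_ms(sample_rate: int) -> float:
    """Duration of one analysis frame in milliseconds."""
    return 1000.0 * FRAME_SAMPLES[sample_rate] / sample_rate


def split_frames(pcm: bytes, sample_rate: int) -> List[bytes]:
    """Split raw PCM into whole frames, dropping any trailing partial frame."""
    frame_bytes = FRAME_SAMPLES[sample_rate] * BYTES_PER_SAMPLE
    usable = len(pcm) - (len(pcm) % frame_bytes)
    return [pcm[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]
```

In a streaming setting the dropped remainder would normally be buffered and prepended to the next chunk rather than discarded.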
Model Management
- Uses ONNX Runtime for efficient inference
- Automatically resets model state every 5 seconds to manage memory
- Runs on CPU by default for consistent performance
- Includes a built-in model file
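The periodic state reset can be pictured with a generic time-based wrapper. This is a sketch of the general pattern only, not the analyzer's actual implementation, which this page does not describe. `PeriodicResetModel`, its `infer` and `reset` callables, and the injectable `clock` are all hypothetical; only the 5-second interval comes from the list above.

```python
import time
from typing import Any, Callable


class PeriodicResetModel:
    """Wrap a stateful model and clear its state once an interval has elapsed."""

    def __init__(
        self,
        infer: Callable[[bytes], Any],
        reset: Callable[[], None],
        reset_interval_secs: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._infer = infer
        self._reset = reset
        self._interval = reset_interval_secs
        self._clock = clock
        self._last_reset = clock()

    def __call__(self, frame: bytes) -> Any:
        # Run inference first so the current frame still sees the existing state.
        result = self._infer(frame)
        now = self._clock()
        if now - self._last_reset >= self._interval:
            # Drop accumulated recurrent state so it cannot grow or drift unbounded.
            self._reset()
            self._last_reset = now
        return result
```

Using a monotonic clock keeps the interval stable even if the system wall clock is adjusted while audio is being processed.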
Notes
- High-accuracy speech detection
- Efficient ONNX-based processing
- Automatic memory management
- Thread-safe for pipeline processing
- Built-in model file included
- CPU-optimized inference
- Supports 8kHz and 16kHz audio