> ## Documentation Index
> Fetch the complete documentation index at: https://docs.pipecat.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# SileroVADAnalyzer

> Voice Activity Detection analyzer using the Silero VAD ONNX model

## Overview

`SileroVADAnalyzer` is a Voice Activity Detection (VAD) analyzer that uses the Silero VAD ONNX model to detect speech in audio streams. It provides high-accuracy speech detection with efficient processing using ONNX runtime.

## Installation

The Silero VAD analyzer is now included as a core dependency:

```bash theme={null}
uv add "pipecat-ai"
```

## Constructor Parameters

<ParamField path="sample_rate" type="int" default="None">
  Audio sample rate in Hz. Must be either 8000 or 16000.
</ParamField>

<ParamField path="params" type="VADParams" default="VADParams()">
  Voice Activity Detection parameters object

  <Expandable title="properties">
    <ParamField path="confidence" type="float" default="0.7">
      Confidence threshold for speech detection. Higher values make detection more strict. Must be between 0 and 1.
    </ParamField>

    <ParamField path="start_secs" type="float" default="0.2">
      Time in seconds that speech must be detected before transitioning to SPEAKING state.
    </ParamField>

    <ParamField path="stop_secs" type="float" default="0.2">
      Time in seconds of silence required before transitioning back to QUIET state.
    </ParamField>

    <ParamField path="min_volume" type="float" default="0.6">
      Minimum audio volume threshold for speech detection. Must be between 0 and 1.
    </ParamField>
  </Expandable>
</ParamField>

## Usage Example

```python theme={null}
context = LLMContext(messages)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
    ),
)
```

## Technical Details

### Sample Rate Requirements

The analyzer supports two sample rates:

* 8000 Hz (256 samples per frame)
* 16000 Hz (512 samples per frame)

Model Management

* Uses ONNX runtime for efficient inference
* Automatically resets model state every 5 seconds to manage memory
* Runs on CPU by default for consistent performance
* Includes built-in model file

## Notes

* High-accuracy speech detection
* Efficient ONNX-based processing
* Automatic memory management
* Thread-safe for pipeline processing
* Built-in model file included
* CPU-optimized inference
* Supports 8kHz and 16kHz audio
