
# Krisp VIVA

> Learn how to integrate Krisp's VIVA voice isolation and turn detection into your Pipecat application

## Overview

Krisp's VIVA SDK provides four capabilities for Pipecat applications:

* **Voice Isolation** — Filter out background noise and voices from the user's audio input stream, yielding clearer audio for fewer false interruptions and better transcription.
* **Turn Detection** — Determine when a user has finished speaking using Krisp's streaming turn detection model, as an alternative to the [Smart Turn model](/api-reference/server/utilities/turn-detection/smart-turn-overview).
* **Interruption Prediction** — Distinguish genuine user interruptions from backchannels (e.g. "uh-huh", "yeah"), preventing the bot from being interrupted by brief acknowledgements.
* **Voice Activity Detection** — Detect speech in audio streams using Krisp's VAD model, supporting sample rates from 8kHz to 48kHz.

You can use any combination of these features together.

<CardGroup cols={2}>
  <Card title="KrispVivaFilter Reference" icon="code" href="/api-reference/server/utilities/audio/krisp-viva-filter">
    API reference for voice isolation
  </Card>

  <Card title="KrispVivaTurn Reference" icon="code" href="/api-reference/server/utilities/turn-detection/krisp-viva-turn">
    API reference for turn detection
  </Card>

  <Card title="KrispVivaIPUserTurnStartStrategy" icon="code" href="/api-reference/server/utilities/turn-management/user-turn-strategies#krispvivaipuserturnstartstrategy">
    API reference for interruption prediction
  </Card>

  <Card title="KrispVivaVadAnalyzer Reference" icon="code" href="/api-reference/server/utilities/audio/krisp-viva-vad-analyzer">
    API reference for voice activity detection
  </Card>

  <Card title="Krisp VIVA Example" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-krisp-viva.py">
    Complete example with Krisp features
  </Card>

  <Card title="Krisp Developers" icon="globe" href="https://krisp.ai/developers">
    Get the Krisp SDK and API key
  </Card>
</CardGroup>

## Prerequisites

To complete this setup, you need a Krisp developer account, where you can download the Python SDK and models and generate an API key.

<Tip>
  Get started on the [Krisp developers website](https://krisp.ai/developers).
</Tip>

## Setup

### Download the Python SDK and Models

1. Log in to the [Krisp developer portal](https://sdk.krisp.ai/)
2. Navigate to the `Server SDK Version` tab
3. Find the latest version of the Python SDK:
   * Download the SDK
   * Download the Voice Isolation models (for voice isolation)
   * Download the Turn Detection models (for turn detection)

### Install the Python wheel file

1. Unzip the SDK you downloaded in the previous step. The unzipped folder contains a `dist` folder with the Python wheel files.
2. Install the wheel file that matches your platform and Python version. For example, on macOS ARM64 with Python 3.12, run:

   ```bash theme={null}
   uv pip install /PATH_TO_DOWNLOADED_SDK/krisp-viva-uar-python-sdk-1.8.0/dist/krisp_audio-1.8.0-cp312-cp312-macosx_12_0_arm64.whl
   ```
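
To pick the matching wheel, it can help to print the tags your interpreter reports. The following is a minimal sketch using only the standard library. It assumes CPython (the `cp312`-style tag in the example filename above), and the helper names are illustrative rather than part of the Krisp SDK.

```python
import platform
import sys


def interpreter_tag() -> str:
    """Return the CPython tag used in wheel filenames, e.g. 'cp312'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def describe_platform() -> dict:
    """Collect the values you need to compare against the wheel filename."""
    return {
        "python_tag": interpreter_tag(),
        "system": platform.system(),    # e.g. 'Darwin', 'Linux', 'Windows'
        "machine": platform.machine(),  # e.g. 'arm64', 'x86_64'
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

Compare the printed values with the filenames in the `dist` folder and install the wheel whose tags match.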

### Generate an API key

1. In the [Krisp developer portal](https://sdk.krisp.ai/), generate an API key for your application.

<Note>
  The `KRISP_VIVA_API_KEY` is required for Krisp SDK v1.6.1 and later. For older
  SDK versions, this is not required.
</Note>

### Set up environment variables

1. Unzip the models you downloaded in the first step.

2. For voice isolation, choose a model:

   * `krisp-viva-pro`: Mobile, Desktop, Browser (WebRTC, up to 32kHz)
   * `krisp-viva-tel`: Telephony, Cellular, Landline, Mobile, Desktop, Browser (up to 16kHz)

   Note: the full model file name includes a version suffix and the `.kef` extension, for example `krisp-viva-tel-v2.kef`.

3. In your `.env` file, add the environment variables for the features you're using:

```bash theme={null}
# Krisp SDK API key (required for SDK v1.6.1+)
KRISP_VIVA_API_KEY=your_api_key_here

# Voice isolation model path
KRISP_VIVA_FILTER_MODEL_PATH=/PATH_TO_UNZIPPED_MODELS/krisp-viva-tel-v2.kef

# Turn detection model path
KRISP_VIVA_TURN_MODEL_PATH=/PATH_TO_UNZIPPED_MODELS/krisp-viva-tt-v2.kef

# Interruption prediction model path
KRISP_VIVA_IP_MODEL_PATH=/PATH_TO_UNZIPPED_MODELS/krisp-viva-ip-v3.kef

# Voice activity detection model path (optional)
KRISP_VIVA_VAD_MODEL_PATH=/PATH_TO_UNZIPPED_MODELS/krisp-viva-vad-v2.kef
```

<Note>
  Each feature uses a **different model**. Set `KRISP_VIVA_FILTER_MODEL_PATH`
  for voice isolation, `KRISP_VIVA_TURN_MODEL_PATH` for turn detection,
  `KRISP_VIVA_IP_MODEL_PATH` for interruption prediction, and
  `KRISP_VIVA_VAD_MODEL_PATH` for voice activity detection.
</Note>
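
A missing or mistyped path is easy to overlook, so you may want to check the variables before starting your bot. The following is a minimal sketch, assuming SDK v1.6.1 or later (so the API key is required) and that the variables are already loaded into the process environment. `check_krisp_env` and the feature labels are hypothetical names for illustration, not part of Pipecat or the Krisp SDK.

```python
import os
from pathlib import Path

# Variable names come from the .env example above. The dictionary keys are
# illustrative labels only, not Pipecat identifiers.
MODEL_ENV_VARS = {
    "voice_isolation": "KRISP_VIVA_FILTER_MODEL_PATH",
    "turn_detection": "KRISP_VIVA_TURN_MODEL_PATH",
    "interruption_prediction": "KRISP_VIVA_IP_MODEL_PATH",
    "vad": "KRISP_VIVA_VAD_MODEL_PATH",
}


def check_krisp_env(features, env=None):
    """Return a list of problems found; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("KRISP_VIVA_API_KEY"):
        problems.append("KRISP_VIVA_API_KEY is not set")
    for feature in features:
        var = MODEL_ENV_VARS[feature]
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not Path(value).is_file():
            problems.append(f"{var} does not point to a file: {value}")
    return problems


if __name__ == "__main__":
    # Check only the features this application uses.
    for problem in check_krisp_env(["voice_isolation", "turn_detection"]):
        print(problem)
```

Pass only the features your application enables, so unused model paths are not reported as missing.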

## Test the integration

You're ready to test the integration. Try running the [Krisp VIVA foundational example](https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-krisp-viva.py), which demonstrates voice isolation and turn detection together.

<Tip>
  Learn how to [run foundational
  examples](https://github.com/pipecat-ai/pipecat/blob/main/examples/README.md)
  in Pipecat.
</Tip>

## Voice Isolation

`KrispVivaFilter` isolates the user's voice by filtering out background noise and other voices in real-time audio streams. Add it to any transport via the `audio_in_filter` parameter.

```python theme={null}
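from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport

# `webrtc_connection` is the connection object your application creates for the session.
transport = SmallWebRTCTransport(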
    webrtc_connection=webrtc_connection,
    params=TransportParams(
        audio_in_enabled=True,
        audio_in_filter=KrispVivaFilter(),  # Enable Krisp voice isolation
        audio_out_enabled=True,
    ),
)
```

See the [KrispVivaFilter reference](/api-reference/server/utilities/audio/krisp-viva-filter) for configuration options.

## Turn Detection

`KrispVivaTurn` uses Krisp's streaming turn detection model to determine when a user has finished speaking. Unlike the [Smart Turn model](/api-reference/server/utilities/turn-detection/smart-turn-overview), which analyzes audio in batches, `KrispVivaTurn` processes each audio frame in real time.

Configure it as a user turn stop strategy:

```python theme={null}
from pipecat.audio.turn.krisp_viva_turn import KrispVivaTurn
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.turns.user_stop import TurnAnalyzerUserTurnStopStrategy
from pipecat.turns.user_turn_strategies import UserTurnStrategies

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_turn_strategies=UserTurnStrategies(
            stop=[TurnAnalyzerUserTurnStopStrategy(
                turn_analyzer=KrispVivaTurn()
            )]
        ),
        vad_analyzer=SileroVADAnalyzer(),
    ),
)
```
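
To make "processes each audio frame" concrete, the sketch below splits a PCM buffer into fixed-duration frames and hands each one to a callback as it arrives. It is a generic illustration only: the 20 ms frame length, the 16-bit mono format, and the `iter_frames` and `run_streaming` helpers are assumptions for the example, not how `KrispVivaTurn` is implemented.

```python
from typing import Callable, Iterator


def iter_frames(
    pcm: bytes, sample_rate: int, frame_ms: int = 20, sample_width: int = 2
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames of mono PCM audio.

    A trailing partial frame is dropped.
    """
    frame_bytes = sample_rate * frame_ms // 1000 * sample_width
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


def run_streaming(
    pcm: bytes, sample_rate: int, on_frame: Callable[[bytes], None]
) -> int:
    """Call on_frame once per frame and return the number of frames processed."""
    count = 0
    for frame in iter_frames(pcm, sample_rate):
        on_frame(frame)
        count += 1
    return count


if __name__ == "__main__":
    one_second = bytes(16000 * 2)  # 1 s of 16 kHz, 16-bit silence
    print(run_streaming(one_second, 16000, lambda frame: None))  # prints 50
```

A batch analyzer would instead receive the buffered segment in a single call, once enough audio has accumulated.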

See the [KrispVivaTurn reference](/api-reference/server/utilities/turn-detection/krisp-viva-turn) for configuration options.

## Interruption Prediction

`KrispVivaIPUserTurnStartStrategy` uses Krisp's Interruption Prediction (IP) model to distinguish genuine user interruptions from backchannels. When VAD detects user speech, the IP model analyzes the audio and outputs a probability indicating whether the speech is a real interruption or a brief acknowledgement (e.g., "uh-huh", "yeah").

This prevents the bot from being interrupted unnecessarily by short utterances. Configure it as a user turn start strategy:

```python theme={null}
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.turns.user_start import (
    KrispVivaIPUserTurnStartStrategy,
    TranscriptionUserTurnStartStrategy,
)
from pipecat.turns.user_turn_strategies import UserTurnStrategies

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_turn_strategies=UserTurnStrategies(
            start=[
                KrispVivaIPUserTurnStartStrategy(threshold=0.5),
                TranscriptionUserTurnStartStrategy(),  # Fallback
            ],
        ),
        vad_analyzer=SileroVADAnalyzer(),
    ),
)
```
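
To illustrate what the `threshold` argument controls, the toy function below compares a probability against a threshold. This is a conceptual sketch only: `classify_speech` is a hypothetical helper, and it assumes that a higher probability means a genuine interruption and that the comparison is inclusive. The exact logic inside `KrispVivaIPUserTurnStartStrategy` may differ.

```python
def classify_speech(interruption_probability: float, threshold: float = 0.5) -> str:
    """Label speech as an interruption or a backchannel (toy illustration)."""
    if not 0.0 <= interruption_probability <= 1.0:
        raise ValueError("interruption_probability must be between 0 and 1")
    if interruption_probability >= threshold:
        return "interruption"
    return "backchannel"


if __name__ == "__main__":
    print(classify_speech(0.9))                  # interruption
    print(classify_speech(0.2))                  # backchannel
    print(classify_speech(0.6, threshold=0.8))   # backchannel
```

Under these assumptions, raising the threshold makes the bot harder to interrupt, and lowering it makes the bot yield more readily.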

See the [KrispVivaIPUserTurnStartStrategy reference](/api-reference/server/utilities/turn-management/user-turn-strategies#krispvivaipuserturnstartstrategy) for configuration options.

## Voice Activity Detection

`KrispVivaVadAnalyzer` detects speech in audio streams using Krisp's VAD model. It supports sample rates from 8kHz to 48kHz, making it suitable for a wide range of applications including telephony and high-quality audio.

Configure it as a VAD analyzer:

```python theme={null}
from pipecat.audio.vad.krisp_viva_vad import KrispVivaVadAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=KrispVivaVadAnalyzer(params=VADParams(stop_secs=0.2)),
    ),
)
```
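
As a rough guide to what `stop_secs=0.2` means across the supported sample rates, the arithmetic below converts a silence duration into a number of samples. This is a sketch, assuming `stop_secs` is the length of trailing silence required before speech is considered to have stopped. `silence_samples` is a hypothetical helper, not part of Pipecat or the Krisp SDK.

```python
def silence_samples(stop_secs: float, sample_rate: int) -> int:
    """Return how many samples of silence correspond to stop_secs."""
    if stop_secs < 0 or sample_rate <= 0:
        raise ValueError("stop_secs must be >= 0 and sample_rate must be > 0")
    return round(stop_secs * sample_rate)


if __name__ == "__main__":
    for rate in (8000, 16000, 48000):
        print(rate, silence_samples(0.2, rate))
    # 8000 1600
    # 16000 3200
    # 48000 9600
```

The duration is the same at every sample rate; only the number of samples the analyzer has to observe changes.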

See the [KrispVivaVadAnalyzer reference](/api-reference/server/utilities/audio/krisp-viva-vad-analyzer) for configuration options.
