Overview

TavusVideoService generates AI avatar video responses by sending audio to the Tavus API. It handles real-time audio streaming, conversation management, and video generation through the Tavus platform.

Installation

To use TavusVideoService, install the required dependencies:

pip install pipecat-ai[tavus]

You’ll need to set up the following environment variables:

  • TAVUS_API_KEY - Your Tavus API key
  • TAVUS_REPLICA_ID - Your Tavus replica identifier

You can obtain a Tavus API key by signing up at Tavus.
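
One way to read these variables at startup is sketched below. The helper name `load_tavus_config` is an assumption made for illustration and is not part of Pipecat; it uses only the standard library and fails early when a variable is missing.

```python
import os


def load_tavus_config(env=None):
    """Return (api_key, replica_id) read from the environment.

    load_tavus_config is a hypothetical helper, not a Pipecat API.
    Pass a dict as `env` to override os.environ (useful in tests).
    """
    env = os.environ if env is None else env
    required = ("TAVUS_API_KEY", "TAVUS_REPLICA_ID")
    # Treat unset and empty values the same way: both are unusable.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TAVUS_API_KEY"], env["TAVUS_REPLICA_ID"]
```

The returned values can then be passed as the api_key and replica_id constructor parameters described under Configuration.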

Configuration

Constructor Parameters

api_key
str
required

Your Tavus API key

replica_id
str
required

Tavus replica identifier

persona_id
str
default: "pipecat0"

Tavus persona identifier

The persona ID is optional. To use the pipeline's LLM output and TTS voice, leave persona_id unset so that it keeps its default value, pipecat0.

session
aiohttp.ClientSession
required

HTTP client session for API communication

Input Frames

Audio Input

TTSAudioRawFrame
Frame

Raw audio data for avatar speech

Control Frames

TTSStartedFrame
Frame

Signals start of speech synthesis

TTSStoppedFrame
Frame

Signals end of speech synthesis

StartInterruptionFrame
Frame

Signals conversation interruption

EndFrame
Frame

Signals end of conversation

CancelFrame
Frame

Signals conversation cancellation
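
To make the roles of these frames concrete, here is a minimal sketch of how a processor might react to each one. The frame classes below are stand-ins defined locally for illustration (the real ones live in Pipecat), and `AvatarFrameHandler` is a hypothetical class. Buffering audio until the end of an utterance is an assumption of this sketch, not a description of how TavusVideoService works internally.

```python
from dataclasses import dataclass


# Stand-in frame types for illustration only; not the Pipecat classes.
@dataclass
class TTSStartedFrame:
    pass


@dataclass
class TTSStoppedFrame:
    pass


@dataclass
class StartInterruptionFrame:
    pass


@dataclass
class EndFrame:
    pass


@dataclass
class CancelFrame:
    pass


@dataclass
class TTSAudioRawFrame:
    audio: bytes


class AvatarFrameHandler:
    """Hypothetical handler showing one possible reaction per frame type."""

    def __init__(self):
        self.buffer = bytearray()  # audio collected for the current utterance
        self.speaking = False      # True between TTS start and stop
        self.active = True         # False once the conversation has ended
        self.sent = []             # completed utterances handed to the avatar

    def handle(self, frame):
        if isinstance(frame, TTSStartedFrame):
            self.speaking = True
            self.buffer.clear()
        elif isinstance(frame, TTSAudioRawFrame):
            if self.speaking:
                self.buffer.extend(frame.audio)
        elif isinstance(frame, TTSStoppedFrame):
            self.speaking = False
            self.sent.append(bytes(self.buffer))
            self.buffer.clear()
        elif isinstance(frame, StartInterruptionFrame):
            # Drop any pending audio so the avatar stops talking.
            self.speaking = False
            self.buffer.clear()
        elif isinstance(frame, (EndFrame, CancelFrame)):
            self.active = False
```

In this sketch an interruption discards buffered audio rather than sending it, which mirrors the intent of StartInterruptionFrame: speech that the user has talked over should not reach the avatar.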

Usage Example

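import aiohttp

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.tavus import TavusVideoService
from pipecat.transports.services.daily import DailyParams, DailyTransport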

async def main():
    async with aiohttp.ClientSession() as session:
        # Configure service
        tavus = TavusVideoService(
            api_key="your-tavus-api-key",
            replica_id="your-replica-id",
            session=session
        )

        # Initialize conversation
        room_url = await tavus.initialize()

        # Get persona name
        persona_name = await tavus.get_persona_name()

        transport = DailyTransport(
            room_url=room_url, # Your Tavus room URL
            token=None,
            bot_name="Pipecat bot",
            params=DailyParams(
                vad_enabled=True,
                vad_analyzer=SileroVADAnalyzer(),
                vad_audio_passthrough=True,
            ),
        )

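        # Use in pipeline. stt, llm, tts, and context_aggregator are
        # assumed to be created elsewhere and are not shown here.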
        pipeline = Pipeline(
            [
                transport.input(),                # Transport user input
                stt,                              # STT
                context_aggregator.user(),        # User responses
                llm,                              # LLM
                tts,                              # TTS
                tavus,                            # Tavus output layer
                transport.output(),               # Transport bot output
                context_aggregator.assistant(),   # Assistant spoken responses
            ]
        )

API Methods

initialize

async def initialize(self) -> str:

Initializes a new conversation and returns the conversation URL.

get_persona_name

async def get_persona_name(self) -> str:

Retrieves the name of the configured persona.
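
The two methods are typically called together before the transport is created. The sketch below expresses that call order against a `typing.Protocol` that mirrors the two signatures documented above. `ConversationService`, `start_conversation`, and `FakeTavus` are hypothetical names introduced here so the example runs without network access; the URL and persona name are placeholder values.

```python
import asyncio
from typing import Protocol, Tuple


class ConversationService(Protocol):
    """Structural type matching the two documented async methods."""

    async def initialize(self) -> str: ...

    async def get_persona_name(self) -> str: ...


async def start_conversation(service: ConversationService) -> Tuple[str, str]:
    """Initialize the conversation first, then look up the persona name."""
    room_url = await service.initialize()
    persona_name = await service.get_persona_name()
    return room_url, persona_name


class FakeTavus:
    """Hypothetical stand-in used only to exercise the sketch offline."""

    async def initialize(self) -> str:
        return "https://example.invalid/room"

    async def get_persona_name(self) -> str:
        return "Example Persona"


if __name__ == "__main__":
    print(asyncio.run(start_conversation(FakeTavus())))
```

Because the helper depends only on the two method signatures, a real TavusVideoService instance could be passed in place of the fake, assuming its methods match the signatures shown in this section.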

Frame Flow

Metrics Support

The service collects processing metrics:

  • Processing duration
  • Time to First Byte (TTFB)
  • API response times
  • Audio processing metrics
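
For readers unfamiliar with these terms, the following sketch shows how processing duration and TTFB can be measured for a stream of audio chunks. It is a generic illustration using the standard library, not Pipecat's metrics implementation; `measure_stream` and its `clock` parameter are names assumed for this example.

```python
import time


def measure_stream(chunks, clock=time.perf_counter):
    """Consume an iterable of byte chunks and time it.

    Returns (ttfb, duration, total_bytes). ttfb is the delay until the
    first non-empty chunk arrives, or None if no data arrives at all.
    """
    start = clock()
    ttfb = None
    total = 0
    for chunk in chunks:
        if ttfb is None and chunk:
            ttfb = clock() - start
        total += len(chunk)
    return ttfb, clock() - start, total
```

Injecting the clock keeps the function deterministic under test; in normal use the default time.perf_counter is sufficient.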

Common Use Cases

  1. AI Video Avatars

    • Virtual assistants
    • Customer service
    • Educational content
  2. Interactive Presentations

    • Product demonstrations
    • Training materials
    • Marketing content
  3. Real-time Communication

    • Video conferencing
    • Virtual meetings
    • Interactive broadcasts

Notes

  • Handles real-time audio streaming
  • Supports conversation interruptions
  • Manages conversation lifecycle
  • Resamples audio automatically
  • Provides thread-safe processing
  • Integrates with WebRTC through Daily
  • Includes comprehensive error handling