Overview

TavusVideoService integrates with Tavus to generate AI-powered video avatars that speak your text-to-speech output in real time. The service takes TTS audio as input and produces synchronized video of a realistic avatar speaking it, which gives a conversational AI application a visual presence.

Installation

To use Tavus services, install the required dependency:
pip install "pipecat-ai[tavus]"
You’ll also need to set up your Tavus credentials as environment variables:
  • TAVUS_API_KEY - Your Tavus API key
  • TAVUS_REPLICA_ID - ID of your trained voice replica
Sign up for a Tavus account at Tavus Platform to get your API key and create voice replicas.
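
Because both variables are needed before the service can be constructed, it can help to check for them at startup. The following is a minimal sketch of such a check; `missing_tavus_env` is a hypothetical helper name and is not part of pipecat or Tavus.

```python
import os

# Variable names taken from the list above.
REQUIRED_TAVUS_VARS = ("TAVUS_API_KEY", "TAVUS_REPLICA_ID")


def missing_tavus_env(environ=None):
    """Return the required Tavus variables that are unset or empty.

    Hypothetical helper for illustration; not a pipecat function.
    Pass a dict as `environ` to check something other than os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_TAVUS_VARS if not env.get(name)]
```

At startup you might call `missing_tavus_env()` and stop with a clear message if the returned list is not empty, rather than letting the service fail later with a `None` API key.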

Frames

Input

  • TTSAudioRawFrame - Text-to-speech audio to be spoken by the avatar
  • StartInterruptionFrame - Signals conversation interruption
  • EndFrame - Signals end of conversation

Output

  • OutputImageRawFrame - Generated avatar video frames
  • OutputAudioRawFrame - Synchronized audio from the avatar
  • StartInterruptionFrame - Forwarded interruption signals

Service Features

  • Realistic Avatars: High-quality AI-generated talking heads
  • Real-time Generation: Low-latency video creation for live conversations
  • Audio Synchronization: Lip movements synchronized with the generated speech
  • Video Streaming: Optimized for real-time video transport

Usage Example

import os

import aiohttp

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.tavus.video import TavusVideoService
# Note: the Daily import path may differ between pipecat versions
from pipecat.transports.services.daily import DailyParams

async def main():
    async with aiohttp.ClientSession() as session:
        # Configure Tavus service
        tavus = TavusVideoService(
            api_key=os.getenv("TAVUS_API_KEY"),
            replica_id=os.getenv("TAVUS_REPLICA_ID"),
            persona_id="pipecat-stream",  # Default persona for TTS
            session=session,
        )

        # Configure transport for video; pass these params to the
        # transport you create before building the pipeline
        transport_params = DailyParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            video_out_enabled=True,         # Enable video output
            video_out_is_live=True,         # Real-time video streaming
            video_out_width=1280,
            video_out_height=720,
            vad_analyzer=SileroVADAnalyzer(),
        )

        # Create pipeline with video output.
        # transport, stt, llm, tts, and context_aggregator are assumed
        # to be created elsewhere in your application.
        pipeline = Pipeline([
            transport.input(),              # User input
            stt,                            # Speech-to-text
            context_aggregator.user(),      # User context
            llm,                            # Language model
            tts,                            # Text-to-speech
            tavus,                          # Avatar video generation
            transport.output(),             # Video/audio output
            context_aggregator.assistant(), # Assistant context
        ])

        # Run the pipeline
        # ...

Avatar Configuration

Voice Replicas

Tavus uses voice replicas to generate speech that matches a specific voice:
tavus = TavusVideoService(
    api_key=os.getenv("TAVUS_API_KEY"),
    replica_id="your-replica-id",  # Trained voice replica
    persona_id="pipecat-stream",   # Visual persona
    session=session,
)
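
If the same identifiers are used in several places, one option is to collect them in a small settings object. The sketch below is an illustration only: `TavusSettings` is a hypothetical class, and the `TAVUS_PERSONA_ID` variable is an assumption that is not defined by pipecat or Tavus. The field names mirror the constructor arguments shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TavusSettings:
    """Hypothetical container for the constructor arguments shown above."""

    api_key: str
    replica_id: str
    persona_id: str = "pipecat-stream"  # persona value used in this document

    @classmethod
    def from_env(cls, environ=None):
        """Build settings from environment variables (or a dict for testing)."""
        env = os.environ if environ is None else environ
        return cls(
            api_key=env["TAVUS_API_KEY"],        # raises KeyError if unset
            replica_id=env["TAVUS_REPLICA_ID"],  # raises KeyError if unset
            # TAVUS_PERSONA_ID is an assumed, optional variable name.
            persona_id=env.get("TAVUS_PERSONA_ID", "pipecat-stream"),
        )

    def as_kwargs(self):
        """Return keyword arguments for the service, excluding `session`."""
        return {
            "api_key": self.api_key,
            "replica_id": self.replica_id,
            "persona_id": self.persona_id,
        }
```

With this in place, the constructor call could be written as `TavusVideoService(**TavusSettings.from_env().as_kwargs(), session=session)`.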

Integration Patterns

With Daily Transport

Tavus works seamlessly with Daily for video conferencing applications:
# Daily transport with video capabilities
transport = DailyTransport(
    room_url=room_url,
    token=token,
    bot_name="AI Avatar",
    params=DailyParams(
        video_out_enabled=True,
        video_out_is_live=True,
        video_out_width=1280,
        video_out_height=720,
    ),
)

With WebRTC Transport

For peer-to-peer video communication:
# WebRTC transport with video support
transport = SmallWebRTCTransport(
    webrtc_connection=connection,
    params=TransportParams(
        video_out_enabled=True,
        video_out_is_live=True,
        video_out_width=1280,
        video_out_height=720,
    ),
)
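
Both transport snippets above repeat the same four video settings. One way to keep them in sync is to define them once and unpack them into whichever params class you use. This is a minimal sketch; `VIDEO_OUT_DEFAULTS` and `video_out_params` are hypothetical names, not part of pipecat.

```python
# Video output settings copied from the two transport snippets above.
VIDEO_OUT_DEFAULTS = {
    "video_out_enabled": True,
    "video_out_is_live": True,
    "video_out_width": 1280,
    "video_out_height": 720,
}


def video_out_params(**overrides):
    """Return a fresh copy of the shared settings with optional overrides.

    Hypothetical helper for illustration; not a pipecat function.
    """
    return {**VIDEO_OUT_DEFAULTS, **overrides}
```

The result can then be unpacked into either class, for example `DailyParams(**video_out_params(audio_out_enabled=True))` or `TransportParams(**video_out_params())`.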

Additional Notes

  • Latency Optimization: Designed for real-time conversation with minimal delay
  • Network Requirements: Video streaming requires sufficient bandwidth for quality delivery
  • Processing Requirements: Ensure sufficient server resources for real-time video processing and streaming
  • Session Management: Automatically handles avatar lifecycle and cleanup
  • Error Handling: Robust error recovery for uninterrupted conversations
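
To get a rough sense of the processing load mentioned above, the sketch below estimates the size of uncompressed video frames at the 1280x720 resolution used in this document. The 3 bytes per pixel (RGB) and the 30 fps frame rate are assumptions chosen for illustration, not values taken from Tavus or pipecat. The figures describe raw frames in memory, not the encoded bandwidth sent over the network.

```python
def raw_frame_bytes(width, height, bytes_per_pixel=3):
    """Size in bytes of one uncompressed frame (3 bytes per pixel assumes RGB)."""
    return width * height * bytes_per_pixel


def raw_video_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Bytes of uncompressed video produced per second at the given frame rate."""
    return raw_frame_bytes(width, height, bytes_per_pixel) * fps


frame = raw_frame_bytes(1280, 720)                       # 2,764,800 bytes
per_second = raw_video_bytes_per_second(1280, 720, 30)   # 82,944,000 bytes
print(f"{frame / 1e6:.2f} MB per frame, {per_second / 1e6:.1f} MB/s at 30 fps")
```
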
Tavus provides a powerful way to add visual presence to your conversational AI applications, creating more engaging and human-like interactions.