Overview

HeyGenVideoService integrates with HeyGen to create interactive AI-powered video avatars that respond naturally in real-time conversations. The service handles bidirectional audio/video streaming, avatar animations, voice activity detection, and conversation interruptions to deliver engaging conversational AI experiences with lifelike visual presence.

Installation

To use HeyGen services, install the required dependency:
pip install "pipecat-ai[heygen]"
You’ll also need to set up your HeyGen API key as an environment variable:
  • HEYGEN_API_KEY - Your HeyGen API key
Sign up for a HeyGen account at HeyGen Platform to get your API key and access interactive avatars.
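Before running anything, it can help to confirm the key is actually visible to your process; a minimal sketch:

import os

# Fail fast if the key is missing, rather than failing later at session creation.
if not os.getenv("HEYGEN_API_KEY"):
    raise RuntimeError("HEYGEN_API_KEY is not set")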

Frames

Input

  • TTSAudioRawFrame - Text-to-speech audio for avatar to speak
  • UserStartedSpeakingFrame - Triggers avatar listening animation
  • UserStoppedSpeakingFrame - Stops avatar listening state
  • EndFrame - Signals end of conversation

Output

  • OutputImageRawFrame - Generated avatar video frames
  • OutputAudioRawFrame - Avatar’s synchronized audio output
  • UserStartedSpeakingFrame - Forwarded user speech events
  • UserStoppedSpeakingFrame - Forwarded user speech events
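To watch this frame traffic yourself, you can drop a small pass-through processor into the pipeline after the HeyGen service. This is an illustrative sketch built on Pipecat's FrameProcessor base class; the class name is made up for the example:

from pipecat.frames.frames import OutputImageRawFrame, UserStartedSpeakingFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class AvatarFrameLogger(FrameProcessor):
    """Illustrative: logs avatar video frames and user speech events."""

    async def process_frame(self, frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, OutputImageRawFrame):
            print(f"avatar video frame: {frame.size}")
        elif isinstance(frame, UserStartedSpeakingFrame):
            print("user started speaking")
        # Always forward the frame so the rest of the pipeline keeps flowing.
        await self.push_frame(frame, direction)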

Service Features

  • Interactive Avatars: Real-time conversational avatars with natural expressions
  • Voice Activity Detection: Intelligent listening animations and interruption handling
  • Real-time Streaming: Low-latency bidirectional audio/video communication
  • Natural Conversations: Smooth interruption handling for fluid interactions
  • Avatar Animations: Contextual animations based on conversation state

Usage Example

import os
import aiohttp

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.heygen.api import NewSessionRequest
from pipecat.services.heygen.video import HeyGenVideoService
from pipecat.transports.services.daily import DailyParams, DailyTransport

async def main():
    async with aiohttp.ClientSession() as session:
        # Configure HeyGen service with a custom avatar
        heygen = HeyGenVideoService(
            api_key=os.getenv("HEYGEN_API_KEY"),
            session=session,
            session_request=NewSessionRequest(
                avatar_id="Shawn_Therapist_public"  # Or your custom avatar ID
            ),
        )

        # Configure the transport for live video output
        transport = DailyTransport(
            room_url,                           # Your Daily room URL
            token,                              # Your Daily token
            "AI Avatar",
            params=DailyParams(
                audio_in_enabled=True,
                audio_out_enabled=True,
                video_out_enabled=True,         # Enable video output
                video_out_is_live=True,         # Real-time video streaming
                video_out_width=1280,
                video_out_height=720,
                vad_analyzer=SileroVADAnalyzer(),
            ),
        )

        # stt, llm, tts, and context_aggregator are configured as in any
        # other Pipecat pipeline and are omitted here for brevity.

        # Create the pipeline with video output
        pipeline = Pipeline([
            transport.input(),              # User input
            stt,                            # Speech-to-text
            context_aggregator.user(),      # User context
            llm,                            # Language model
            tts,                            # Text-to-speech
            heygen,                         # Avatar video generation
            transport.output(),             # Video/audio output
            context_aggregator.assistant(), # Assistant context
        ])

        # Run the pipeline
        # ...
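The example stops short of actually running the pipeline. A minimal sketch of that last step inside main(), using Pipecat's standard PipelineTask/PipelineRunner pattern:

from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Wrap the pipeline in a task; allow_interruptions lets the user
# cut the avatar off mid-sentence.
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))

# The runner drives frames through the pipeline until the conversation ends.
runner = PipelineRunner()
await runner.run(task)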

Avatar Configuration

Avatar Selection

HeyGen offers a range of pre-built public avatars, or you can use a custom avatar of your own:
from pipecat.services.heygen.api import NewSessionRequest
from pipecat.services.heygen.video import HeyGenVideoService

# `session` is the same aiohttp.ClientSession used in the usage example above.
# Use a public avatar
heygen = HeyGenVideoService(
    api_key=os.getenv("HEYGEN_API_KEY"),
    session=session,
    session_request=NewSessionRequest(
        avatar_id="Shawn_Therapist_public"  # Public avatar
    ),
)

# Use your custom avatar
heygen = HeyGenVideoService(
    api_key=os.getenv("HEYGEN_API_KEY"),
    session=session,
    session_request=NewSessionRequest(
        avatar_id="your-custom-avatar-id"  # Your trained avatar
    ),
)

Default Configuration

If no session request is provided, the service uses the Shawn_Therapist_public avatar by default.
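For example, constructing the service without a session_request picks up that default:

# No session_request supplied: the default Shawn_Therapist_public avatar is used.
heygen = HeyGenVideoService(
    api_key=os.getenv("HEYGEN_API_KEY"),
    session=session,
)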

Integration Patterns

With Daily Transport

HeyGen works seamlessly with Daily for video conferencing applications:
# Daily transport with video capabilities
transport = DailyTransport(
    room_url=room_url,
    token=token,
    bot_name="AI Avatar",
    params=DailyParams(
        video_out_enabled=True,
        video_out_is_live=True,
        video_out_width=1280,
        video_out_height=720,
    ),
)
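A common follow-up is to start the conversation once the first participant joins the room. This sketch assumes the task and context_aggregator from the usage example above:

@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
    # Push the current context to the LLM so the avatar delivers an opening line.
    await task.queue_frames([context_aggregator.user().get_context_frame()])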

With WebRTC Transport

For peer-to-peer video communication:
# WebRTC transport with video support; `connection` is the WebRTC connection
# negotiated with the client
transport = SmallWebRTCTransport(
    webrtc_connection=connection,
    params=TransportParams(
        video_out_enabled=True,
        video_out_is_live=True,
        video_out_width=1280,
        video_out_height=720,
    ),
)

Additional Notes

  • Real-time Optimization: Designed for low-latency conversational interactions
  • Network Requirements: Video streaming requires sufficient bandwidth for quality delivery
  • Processing Requirements: Ensure adequate server resources for real-time video processing
  • Session Management: Automatically handles avatar lifecycle and conversation state
  • Audio Synchronization: Keeps the avatar’s lip movement synchronized with the generated speech
  • Error Handling: Robust error recovery for uninterrupted conversations
HeyGen’s interactive avatars give conversational AI pipelines an engaging, lifelike visual presence, with conversation flow management handled by the service.