
Overview

VonageFrameSerializer enables integration with the Vonage Video API Audio Connector WebSocket protocol, allowing Pipecat applications to process real-time audio streams from active Vonage video sessions.

Installation

The VonageFrameSerializer does not require any additional dependencies beyond the core Pipecat library:
pip install "pipecat-ai"

Prerequisites

Vonage Video API Account Setup

Before using VonageFrameSerializer, you need:
  1. Vonage (TokBox) Account: Sign up at Vonage Video API Console
  2. Vonage Video API Project: Create a project to obtain Project API Key and Project Secret
  3. Existing Vonage Video Session: A Vonage session must already exist before you connect. You can create one with the TokBox Playground or with any of the Vonage Video API SDKs.

Required Environment Variables

  • VONAGE_API_KEY: Your Vonage Video API project key
  • VONAGE_API_SECRET: Your Vonage Video API project secret
  • VONAGE_SESSION_ID: The ID of the existing routed Vonage Video session
  • WS_URI: Public WebSocket endpoint URI of the server application running Pipecat (e.g. via ngrok)
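
As a minimal sketch, the four variables above could be read and validated at startup so that a missing value fails early rather than at connection time. The helper name load_vonage_settings is hypothetical and is not part of Pipecat or any Vonage SDK.

```python
import os

# The variable names come from the list above; the helper itself is illustrative.
REQUIRED_VARS = (
    "VONAGE_API_KEY",
    "VONAGE_API_SECRET",
    "VONAGE_SESSION_ID",
    "WS_URI",
)


def load_vonage_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in tests without touching the real environment.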

Required Configuration

  • WebSocket Endpoint (/ws): A WebSocket server application (e.g. FastAPI) running Pipecat that accepts raw PCM audio frames.
  • Audio Connector /connect Request: Triggers Vonage to open a WebSocket connection to your server and begin streaming audio from the active session.
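
The /connect request can be pictured roughly as follows. This sketch only builds the URL and JSON body; it does not send anything and it omits authentication. The endpoint path, the field names (sessionId, token, websocket.uri, websocket.audioRate), and the need for a session token are assumptions based on the Vonage Video REST API and should be checked against the current Audio Connector documentation. build_connect_request is a hypothetical helper.

```python
import json


def build_connect_request(api_key, session_id, token, ws_uri, audio_rate=16000):
    """Return (url, json_body) for an Audio Connector /connect call without sending it."""
    if not ws_uri.startswith(("ws://", "wss://")):
        raise ValueError("ws_uri must start with ws:// or wss://")
    # Assumed endpoint shape; verify against the Vonage Video REST API reference.
    url = f"https://api.opentok.com/v2/project/{api_key}/connect"
    body = {
        "sessionId": session_id,
        "token": token,  # assumed: a valid token for the session
        "websocket": {
            "uri": ws_uri,  # public address of your /ws endpoint
            "audioRate": audio_rate,  # assumed field; sample rate in Hz
        },
    }
    return url, json.dumps(body)
```

In a real application the returned URL and body would be sent as an authenticated POST by an HTTP client or by a Vonage SDK, after which Vonage opens the WebSocket connection to WS_URI.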

Key Features

  • Bidirectional Audio: Convert between Pipecat and Vonage Audio Connector formats
  • Real-Time AI Pipelines: Stream live audio into Pipecat and process it through any real-time pipeline configuration supported by the framework
  • Session Control Events: Handle the JSON control events that the Vonage Audio Connector sends over the WebSocket connection
  • Linear PCM Audio: Handle raw 16-bit linear PCM audio streams used by the Vonage Video API Audio Connector
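
To illustrate the last two features, here is a minimal sketch of how a server might tell audio frames apart from JSON events and reason about 16-bit linear PCM chunk sizes. It is not the VonageFrameSerializer implementation. It assumes that audio arrives as binary WebSocket messages and events as JSON text, and the event name used in the usage note is illustrative only. Both function names are hypothetical.

```python
import json


def parse_message(message):
    """Classify one WebSocket message as ("audio", bytes) or ("event", dict)."""
    if isinstance(message, (bytes, bytearray)):
        # Assumed: binary messages carry raw 16-bit linear PCM samples.
        return "audio", bytes(message)
    # Assumed: text messages carry a JSON-encoded control event.
    return "event", json.loads(message)


def pcm16_duration_ms(chunk, sample_rate=16000, channels=1):
    """Return the playback duration in milliseconds of a 16-bit PCM chunk."""
    bytes_per_frame = 2 * channels  # 16-bit samples are 2 bytes each
    if len(chunk) % bytes_per_frame != 0:
        raise ValueError("chunk length is not a whole number of 16-bit frames")
    frames = len(chunk) // bytes_per_frame
    return 1000.0 * frames / sample_rate
```

For example, a 640-byte mono chunk at 16 kHz holds 320 samples, so pcm16_duration_ms(b"\x00" * 640) returns 20.0, and parse_message('{"event": "websocket:connected"}') returns the tuple ("event", {"event": "websocket:connected"}).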