Overview

GeminiLiveVertexLLMService enables natural, real-time conversations with Google's Gemini models through Vertex AI. It provides built-in audio transcription, voice activity detection, and context management, making it well suited to interactive AI experiences that combine audio, video, and text.
Want to start building? Check out our Gemini Live Guide for general concepts, then follow the Vertex AI-specific setup below.

Installation

To use Gemini Live Vertex AI services, install the required dependencies:
pip install "pipecat-ai[google]"

Prerequisites

Google Cloud Setup

Before using Gemini Live Vertex AI services, you need:
  1. Google Cloud Project: Set up a project in the Google Cloud Console
  2. Vertex AI API: Enable the Vertex AI API in your project
  3. Service Account: Create a service account with the roles/aiplatform.user and roles/ml.developer IAM roles
  4. Authentication: Set up service account credentials or Application Default Credentials (ADC); a quick verification sketch follows this list
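
Before wiring up a pipeline, it can help to confirm that credentials resolve. The short check below uses the google-auth library (pulled in by the google extra); google.auth.default and the cloud-platform scope are standard google-auth usage, not Pipecat-specific API.

import google.auth

# Resolve Application Default Credentials, or a service account key file if
# GOOGLE_APPLICATION_CREDENTIALS points at one.
credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
print(f"Authenticated against project: {project_id}")

If this raises DefaultCredentialsError, revisit step 4 above.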

Required Environment Variables

  • GOOGLE_VERTEX_TEST_CREDENTIALS: JSON string of service account credentials (optional if using ADC)
  • GOOGLE_CLOUD_PROJECT_ID: Your Google Cloud project ID
  • GOOGLE_CLOUD_LOCATION: Vertex AI region (e.g., “us-east4”)
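
With those variables set, constructing the service might look like the sketch below. The constructor parameter names (credentials, project_id, location) are assumptions inferred from the variables above, as is the import path; check the class reference for the authoritative signature.

import os

# Import path is an assumption; adjust to match your installed Pipecat version.
from pipecat.services.gemini_multimodal_live.gemini import GeminiLiveVertexLLMService

llm = GeminiLiveVertexLLMService(
    credentials=os.getenv("GOOGLE_VERTEX_TEST_CREDENTIALS"),  # optional when using ADC
    project_id=os.getenv("GOOGLE_CLOUD_PROJECT_ID"),
    location=os.getenv("GOOGLE_CLOUD_LOCATION", "us-east4"),
)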

Key Features

  • Enterprise Authentication: Secure service account-based authentication
  • Multimodal Processing: Handle audio, video, and text inputs simultaneously
  • Real-time Streaming: Low-latency audio and video processing
  • Voice Activity Detection: Automatic speech detection and turn management
  • Function Calling: Advanced tool integration and API calling capabilities (see the sketch after this list)
  • Context Management: Intelligent conversation history and system instruction handling
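
As a taste of the function-calling feature, the fragment below registers a tool handler on the llm service constructed earlier. register_function is the common registration pattern across Pipecat LLM services; the FunctionCallParams import path, the handler signature, and the get_weather tool itself reflect recent Pipecat releases and are hypothetical here, so treat this as a sketch rather than a verbatim recipe.

# Import path follows recent Pipecat releases; treat as an assumption.
from pipecat.services.llm_service import FunctionCallParams

async def get_weather(params: FunctionCallParams):
    # Hypothetical tool: return a canned result to the model.
    location = params.arguments.get("location", "unknown")
    await params.result_callback({"location": location, "conditions": "sunny"})

# Register the handler under the tool name the model will call.
llm.register_function("get_weather", get_weather)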