Overview

The OpenAIRealTimeWebRTCTransport is a fully functional Pipecat Transport that communicates directly with the OpenAI Realtime API over WebRTC for voice-to-voice interaction. It handles media device management, audio/video streams, and connection state management.
Transports of this type are designed primarily for development and testing. For production applications, you will need to build a server component with a server-friendly transport, such as the DailyTransport, to securely handle API keys.

Usage

Basic Setup

import { OpenAIRealTimeWebRTCTransport, OpenAIServiceOptions } from '@pipecat-ai/openai-realtime-webrtc-transport';
import { PipecatClient } from '@pipecat-ai/client-js';

const options: OpenAIServiceOptions = {
  api_key: 'YOUR_API_KEY',
  session_config: {
    instructions: 'You are a confused jellyfish.',
  },
  initial_messages: [{ role: "user", content: "Blub blub!" }],
};

let pcClient = new PipecatClient({
  transport: new OpenAIRealTimeWebRTCTransport(options),
  ...
});
pcClient.connect();
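
Because this is a voice-to-voice transport, you will typically want to play the bot's audio as soon as its track arrives. The sketch below shows one way to wire that up; it assumes the RTVIEvent names and the (track, participant) callback shape from @pipecat-ai/client-js, so check the client docs for the exact signatures.

import { RTVIEvent } from '@pipecat-ai/client-js';

// Play remote (bot) audio when its track starts.
pcClient.on(RTVIEvent.TrackStarted, (track, participant) => {
  if (participant?.local || track.kind !== 'audio') return; // skip the local mic
  const audioEl = document.createElement('audio');
  audioEl.autoplay = true;
  audioEl.srcObject = new MediaStream([track]);
  document.body.appendChild(audioEl);
});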

API Reference

Constructor Options

Below are the transport's type definitions for the constructor options, including the OpenAI session configuration. See the OpenAI Realtime API documentation for more details on each option and its defaults.
export type OpenAIFunctionTool = {
  type: "function";
  name: string;
  description: string;
  parameters: JSONSchema;
};

export type OpenAIServerVad = {
  type: "server_vad";
  create_response?: boolean;
  interrupt_response?: boolean;
  prefix_padding_ms?: number;
  silence_duration_ms?: number;
  threshold?: number;
};

export type OpenAISemanticVAD = {
  type: "semantic_vad";
  eagerness?: "low" | "medium" | "high" | "auto";
  create_response?: boolean; // defaults to true
  interrupt_response?: boolean; // defaults to true
};

export type OpenAISessionConfig = Partial<{
  modalities?: string;
  instructions?: string;
  voice?:
    | "alloy"
    | "ash"
    | "ballad"
    | "coral"
    | "echo"
    | "sage"
    | "shimmer"
    | "verse";
  input_audio_noise_reduction?: {
    type: "near_field" | "far_field";
  } | null; // defaults to null/off
  input_audio_transcription?: {
    model: "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe";
    language?: string;
    prompt?: string[] | string; // gpt-4o models take a string
  } | null; // we default this to gpt-4o-transcribe
  turn_detection?: OpenAIServerVad | OpenAISemanticVAD | null; // defaults to server_vad
  temperature?: number;
  max_tokens?: number | "inf";
  tools?: Array<OpenAIFunctionTool>;
}>;

export interface OpenAIServiceOptions {
  api_key: string;
  model?: string;
  initial_messages?: LLMContextMessage[];
  settings?: OpenAISessionConfig;
}
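
For example, a session configuration that enables semantic VAD, gpt-4o transcription, and a single function tool could look like the sketch below. The get_weather tool and its schema are illustrative, not part of the transport; pass the resulting object through the constructor options shown in Basic Setup.

const sessionConfig: OpenAISessionConfig = {
  instructions: 'You are a helpful, concise voice assistant.',
  voice: 'coral',
  input_audio_transcription: { model: 'gpt-4o-transcribe' },
  turn_detection: { type: 'semantic_vad', eagerness: 'medium' },
  tools: [
    {
      type: 'function',
      name: 'get_weather', // illustrative function tool
      description: 'Look up the current weather for a city.',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  ],
};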

TransportConnectionParams

The OpenAIRealTimeWebRTCTransport does not take connection parameters. It connects directly to the OpenAI Realtime API using the API key provided in the constructor options.

Events

The transport supports the standard PipecatClient events and callbacks. Check out the PipecatClient docs or the samples for more information.
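
For example, you can subscribe to connection and transcription events. The event names below come from @pipecat-ai/client-js; the payload shapes are a sketch, so verify them against the client docs.

import { RTVIEvent } from '@pipecat-ai/client-js';

pcClient.on(RTVIEvent.Connected, () => console.log('transport connected'));
pcClient.on(RTVIEvent.Disconnected, () => console.log('transport disconnected'));

// Transcription events require input_audio_transcription in the session config.
pcClient.on(RTVIEvent.UserTranscript, (data) => {
  if (data.final) console.log('user:', data.text);
});
pcClient.on(RTVIEvent.BotTranscript, (data) => console.log('bot:', data.text));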

More Information

Package

@pipecat-ai/openai-realtime-webrtc-transport