Build real-time voice and multimodal applications with Pipecat’s RTVI protocol
Pipecat’s RTVI (Real-Time Voice Interaction) protocol provides a standardized communication layer between clients and servers for building real-time voice and multimodal applications. It synchronizes user and bot interactions, transcriptions, LLM processing, and text-to-speech delivery.
Track when users and bots start/stop speaking for natural turn-taking
Handle real-time transcriptions from both users and bots
Manage LLM responses and function calls with proper client notifications
Control text-to-speech state and audio delivery
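As a rough illustration of the first capability, the sketch below tracks who is currently speaking from a stream of speaking-state messages. The message type strings follow the protocol's speaking-state names; the `TurnTracker` class and the plain-dict message shape are assumptions made for illustration and are not part of Pipecat's API.

```python
from dataclasses import dataclass, field


@dataclass
class TurnTracker:
    """Tracks which parties are currently speaking (hypothetical helper)."""

    speaking: set = field(default_factory=set)

    def handle(self, message: dict) -> None:
        # Expected type format: "<user|bot>-<started|stopped>-speaking".
        parts = message.get("type", "").split("-")
        if len(parts) != 3 or parts[2] != "speaking" or parts[0] not in ("user", "bot"):
            return  # Not a speaking-state message; ignore it.
        who, action = parts[0], parts[1]
        if action == "started":
            self.speaking.add(who)
        elif action == "stopped":
            self.speaking.discard(who)

    @property
    def interrupted(self) -> bool:
        # Both parties talking at once suggests the user barged in.
        return {"user", "bot"} <= self.speaking


tracker = TurnTracker()
for msg_type in ("bot-started-speaking", "user-started-speaking"):
    tracker.handle({"type": msg_type})
overlap = tracker.interrupted  # Both parties are speaking at this point.
tracker.handle({"type": "bot-stopped-speaking"})
```

A client could use a flag like `interrupted` to decide when to stop playback of the bot's audio, though how interruptions are actually handled depends on the transport and SDK in use.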
RTVI operates with two primary components:
RTVIProcessor - A frame processor residing in the pipeline that serves as the entry point for sending and receiving messages to/from the client.
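RTVIObserver - An observer that monitors pipeline events and translates them into client-compatible messages, handling:
Client handshake (client-ready message, bot-ready and initial configuration)
Speaking state (user/bot-started/stopped-speaking)
Transcriptions (user/bot-transcription)
LLM responses and function calls (bot-llm-text, llm-function-call)
Text-to-speech (bot-tts-text, bot-tts-audio)
To set up RTVI in your Pipecat application, add an RTVIProcessor to the pipeline and attach an RTVIObserver that reports pipeline events to the client.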
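To make the division of labor between the two components concrete, here is a minimal, library-free sketch. `ToyProcessor` and `ToyObserver` are hypothetical stand-ins, not Pipecat's RTVIProcessor and RTVIObserver, and the internal event names and JSON payload shape are likewise assumptions. Only the client message type strings come from the protocol, with the user/bot shorthand expanded.

```python
import json
from typing import Callable, Optional

# Hypothetical internal event names mapped to client message types.
EVENT_TO_MESSAGE = {
    "user_started_speaking": "user-started-speaking",
    "user_stopped_speaking": "user-stopped-speaking",
    "user_transcript": "user-transcription",
    "llm_text": "bot-llm-text",
    "tts_text": "bot-tts-text",
}


class ToyProcessor:
    """Stand-in for the pipeline entry point that sends messages to the client."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send

    def push_message(self, msg_type: str, data: Optional[dict] = None) -> None:
        payload = {"type": msg_type}
        if data is not None:
            payload["data"] = data
        self._send(json.dumps(payload))


class ToyObserver:
    """Stand-in for an observer that forwards only client-relevant events."""

    def __init__(self, processor: ToyProcessor) -> None:
        self._processor = processor

    def on_event(self, name: str, data: Optional[dict] = None) -> None:
        msg_type = EVENT_TO_MESSAGE.get(name)
        if msg_type is not None:  # Events without a mapping stay internal.
            self._processor.push_message(msg_type, data)


sent = []  # Collects outgoing messages in place of a real transport.
observer = ToyObserver(ToyProcessor(sent.append))
observer.on_event("user_started_speaking")
observer.on_event("user_transcript", {"text": "hello"})
observer.on_event("internal_metrics_tick")  # Dropped: no client message for it.
```

In this sketch the observer owns the translation table, so the rest of the pipeline needs no knowledge of the client message format, and the processor remains the single place where messages leave for the client.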
RTVIProcessor: Configure and manage RTVI services, actions, and client communication
RTVIObserver: Translate internal pipeline events to standardized client messages
RTVI is implemented in Pipecat client SDKs, providing a high-level API to interact with the protocol. Visit the Pipecat Client SDKs documentation:
Learn how to implement RTVI on the client side with our JavaScript, React, and mobile SDKs