Overview
Frames are the fundamental units of data in Pipecat. Every piece of information that moves through a pipeline — audio, text, images, control signals — is wrapped in a frame. Frame processors receive frames, act on them, and push new or modified frames along to the next processor. All frames inherit from the base Frame class and are Python dataclasses.
Frame Categories
Pipecat has three base frame types, each with different processing behavior:

| Base Type | Processing | Interruption Behavior |
|---|---|---|
| DataFrame | Queued, processed in order with non-SystemFrames | Cancelled on user interruption |
| ControlFrame | Queued, processed in order with non-SystemFrames | Cancelled on user interruption |
| SystemFrame | Higher priority, queued, processed in order with SystemFrames | Not cancelled on user interruption |
DataFrame
Data frames carry the main content flowing through a pipeline: audio chunks, text, images, and LLM messages. They are queued and processed in order with other DataFrames and ControlFrames. If a user interrupts (starts speaking while the bot is responding), any pending data frames are discarded so the new input can be handled immediately.
Examples: TextFrame, OutputAudioRawFrame, LLMMessagesAppendFrame, TTSSpeakFrame
ControlFrame
ControlFrames signal processing boundaries and configuration changes: response start/end markers, settings updates, and state transitions. They are queued and processed in order alongside DataFrames, and like DataFrames, any pending ControlFrames are discarded when a user interrupts unless combined with UninterruptibleFrame.
Examples: EndFrame, LLMFullResponseStartFrame, TTSStartedFrame, ServiceUpdateSettingsFrame
SystemFrame
SystemFrames are high-priority signals that must always be delivered: interruptions, user input, error notifications, and pipeline lifecycle events. They are queued and processed in order with other SystemFrames. Unlike DataFrames and ControlFrames, they are never discarded when a user interrupts.
Examples: StartFrame, CancelFrame, InterruptionFrame, UserStartedSpeakingFrame, InputAudioRawFrame
Frame Properties
Every frame has these properties set automatically:

- id: Unique identifier for the frame instance.
- name: Human-readable name combining the class name and an instance count (e.g., TextFrame#3). Useful for debugging.
- pts: Presentation timestamp in nanoseconds. Used for audio/video synchronization.
- metadata: Dictionary for arbitrary frame metadata.
- transport_source: Name of the transport source that created this frame.
- transport_destination: Name of the transport destination for this frame. Used when a transport supports multiple output tracks.
Frame Direction
Frames flow through the pipeline in one of two directions: downstream, from the input toward the output (the normal flow of data), or upstream, back toward the input.
Pushing Frames
Within a frame processor, call push_frame() to send a frame to the next processor:
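A runnable sketch of the pattern (stand-in classes are used here so the example is self-contained; in an application you would subclass pipecat's FrameProcessor and call its push_frame):

```python
import asyncio
from dataclasses import dataclass
from enum import Enum

# Stand-ins for pipecat's FrameDirection and TextFrame, so this sketch runs
# standalone. In a real app these come from pipecat.
class FrameDirection(Enum):
    DOWNSTREAM = 1
    UPSTREAM = 2

@dataclass
class TextFrame:
    text: str

class UppercaseProcessor:
    """Receives frames, transforms TextFrames, and pushes the result onward."""

    def __init__(self, next_processor):
        self._next = next_processor

    async def push_frame(self, frame, direction=FrameDirection.DOWNSTREAM):
        # In pipecat, push_frame hands the frame to the neighboring
        # processor in the given direction.
        await self._next.process_frame(frame, direction)

    async def process_frame(self, frame, direction):
        if isinstance(frame, TextFrame):
            frame = TextFrame(text=frame.text.upper())
        await self.push_frame(frame, direction)
```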
Broadcasting Frames
To send a frame in both directions simultaneously, use broadcast_frame():
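The real helper lives on pipecat's processors; as a self-contained sketch of the idea, assuming broadcasting creates one copy per direction and tags both with a shared sibling identifier:

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_ids = itertools.count(1)

# Stand-in frame with the fields relevant to broadcasting; pipecat's real
# base Frame carries id and broadcast_sibling_id.
@dataclass
class TextFrame:
    text: str
    id: int = field(default_factory=lambda: next(_ids))
    broadcast_sibling_id: Optional[int] = None

def broadcast_frame(frame_cls, **kwargs):
    """Create a downstream copy and an upstream copy linked by a sibling id."""
    downstream = frame_cls(**kwargs)
    upstream = frame_cls(**kwargs)
    sibling_id = next(_ids)
    downstream.broadcast_sibling_id = sibling_id
    upstream.broadcast_sibling_id = sibling_id
    return downstream, upstream
```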
The downstream and upstream copies are linked by a shared broadcast_sibling_id.
To broadcast an existing frame instance (when you are not the original creator of the frame), use broadcast_frame_instance():
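A self-contained sketch of the idea, assuming each directional copy reuses the original frame's fields while regenerating its identifier (stand-in classes, not pipecat's API):

```python
import itertools
from dataclasses import dataclass, field, replace

_ids = itertools.count(1)

@dataclass
class TextFrame:  # stand-in; pipecat's real frames carry an auto-assigned id
    text: str
    id: int = field(default_factory=lambda: next(_ids))

def broadcast_frame_instance(frame):
    """Copy an existing frame for both directions; each copy gets a fresh id."""
    downstream = replace(frame, id=next(_ids))
    upstream = replace(frame, id=next(_ids))
    return downstream, upstream
```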
Each copy shares the original frame's fields except id and name, which get fresh values.
Mixins
Mixins add cross-cutting behavior or shared data fields to frames without changing their base type.
UninterruptibleFrame
Occasionally a DataFrame or ControlFrame is too important to discard during an interruption. Adding the UninterruptibleFrame mixin protects it: the frame stays in internal queues and any task processing it will not be cancelled.
Examples: EndFrame, StopFrame, FunctionCallResultFrame, FunctionCallInProgressFrame
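A self-contained sketch of how a marker mixin like this works; TranscriptSaveFrame and CriticalResultFrame are hypothetical names, and the base classes here are stand-ins rather than pipecat's:

```python
from dataclasses import dataclass

# Stand-ins for pipecat's base classes. The mixin carries no behavior of its
# own; it simply marks frames so interruption handling can skip them.
class DataFrame: ...
class UninterruptibleFrame: ...

@dataclass
class TranscriptSaveFrame(DataFrame):  # hypothetical ordinary data frame
    text: str

@dataclass
class CriticalResultFrame(UninterruptibleFrame, DataFrame):  # hypothetical
    result: str

def clear_on_interruption(queue):
    """Drop pending frames, keeping only those marked uninterruptible."""
    return [f for f in queue if isinstance(f, UninterruptibleFrame)]
```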
AudioRawFrame
Carries raw audio fields shared by both input and output audio frames.

- audio: Raw audio bytes in PCM format.
- sample_rate: Audio sample rate in Hz (e.g., 16000).
- num_channels: Number of audio channels (e.g., 1 for mono).
- num_frames: Number of audio frames. Calculated automatically from the audio data.
ImageRawFrame
Carries raw image fields shared by both input and output image frames.

- image: Raw image bytes.
- size: Image dimensions as (width, height).
- format: Image format (e.g., "RGB", "RGBA").

Common Patterns
Pipecat prefers pushing frames over calling methods directly between processors. Routing data through the pipeline as frames ensures correct processing order, which is critical for real-time use cases. Most frames are produced and consumed by Pipecat’s built-in services. The patterns below cover the frames you’re most likely to push yourself in application code.
Starting a Conversation
Add an initial message to the context, then push LLMRunFrame to kick off processing:
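A self-contained sketch of the flow; FakeTask and start_conversation are illustrative stand-ins for a pipecat PipelineTask and your own setup code:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class LLMRunFrame:
    """Stand-in: in pipecat, pushing this tells the LLM service to generate
    a response from the current context."""

class FakeTask:
    """Stand-in for a pipecat PipelineTask; queue_frames enqueues in order."""
    def __init__(self):
        self.queued = []

    async def queue_frames(self, frames):
        self.queued.extend(frames)

async def start_conversation(task, messages):
    # Seed the context with an opening instruction, then kick off the LLM.
    messages.append({"role": "system", "content": "Say hello to the user."})
    await task.queue_frames([LLMRunFrame()])
```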
Injecting a Prompt
LLMMessagesAppendFrame adds messages to the context without replacing what’s already there. Set run_llm=True to trigger a response immediately:
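A sketch using a stand-in dataclass that mirrors the fields named above (messages plus the run_llm flag); the real class is imported from pipecat:

```python
from dataclasses import dataclass, field

@dataclass
class LLMMessagesAppendFrame:
    """Stand-in mirroring the documented fields of pipecat's frame."""
    messages: list = field(default_factory=list)
    run_llm: bool = False

# Append a user message and ask for an immediate response.
frame = LLMMessagesAppendFrame(
    messages=[{"role": "user", "content": "Summarize what we agreed on."}],
    run_llm=True,  # trigger an LLM response right after appending
)
```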
Speaking Without the LLM
TTSSpeakFrame sends text directly to the TTS service as a standalone utterance, bypassing the LLM entirely:
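A minimal stand-in sketch; in application code you would import the real TTSSpeakFrame from pipecat and push or queue it:

```python
from dataclasses import dataclass

@dataclass
class TTSSpeakFrame:
    """Stand-in: pipecat's frame carries the exact text for the TTS service
    to speak, without involving the LLM."""
    text: str

# In a processor you would `await self.push_frame(...)` with this frame;
# here we just construct it.
frame = TTSSpeakFrame("One moment while I look that up.")
```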
Ending a Conversation
Push EndTaskFrame upstream to gracefully shut down the pipeline. Pair it with a TTSSpeakFrame to say goodbye first:
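A self-contained sketch of the sequence; Recorder and say_goodbye are illustrative stand-ins, not pipecat classes:

```python
import asyncio
from dataclasses import dataclass
from enum import Enum

class FrameDirection(Enum):  # stand-in for pipecat's FrameDirection
    DOWNSTREAM = 1
    UPSTREAM = 2

@dataclass
class TTSSpeakFrame:
    text: str

@dataclass
class EndTaskFrame:
    """Stand-in: pushed upstream so the task can shut the pipeline down
    gracefully after queued output has played."""

class Recorder:
    """Stand-in processor that records what gets pushed and in which direction."""
    def __init__(self):
        self.pushed = []

    async def push_frame(self, frame, direction=FrameDirection.DOWNSTREAM):
        self.pushed.append((frame, direction))

async def say_goodbye(processor):
    # Speak the farewell downstream, then request shutdown upstream.
    await processor.push_frame(TTSSpeakFrame("Goodbye, talk soon!"))
    await processor.push_frame(EndTaskFrame(), FrameDirection.UPSTREAM)
```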
Changing Service Settings at Runtime
Push settings frames to adjust LLM, TTS, or STT configuration mid-conversation.
Updating Tools at Runtime
Add or replace available function-calling tools while the conversation is active.
Playing Sound Effects
Load audio files and push OutputAudioRawFrame directly from a custom processor:
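A self-contained sketch using Python's wave module and a stand-in frame class that mirrors the audio fields documented above; load_sound is a hypothetical helper:

```python
import io
import wave
from dataclasses import dataclass

@dataclass
class OutputAudioRawFrame:
    """Stand-in mirroring the documented AudioRawFrame fields."""
    audio: bytes
    sample_rate: int
    num_channels: int

def load_sound(wav_source):
    """Read a PCM .wav file (path or file-like object) into a frame ready to push."""
    with wave.open(wav_source, "rb") as w:
        return OutputAudioRawFrame(
            audio=w.readframes(w.getnframes()),
            sample_rate=w.getframerate(),
            num_channels=w.getnchannels(),
        )
```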
Reacting to LLM Response Boundaries
LLMFullResponseStartFrame and LLMFullResponseEndFrame bracket every LLM response. Custom processors can watch for these to trigger side effects:
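A self-contained sketch of such a watcher; ResponseCounter is a hypothetical processor, and the frame classes here are stand-ins for pipecat's:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class LLMFullResponseStartFrame:  # stand-in for the pipecat control frame
    pass

@dataclass
class LLMFullResponseEndFrame:  # stand-in for the pipecat control frame
    pass

class ResponseCounter:
    """Watches the bracketing frames to track when a response is in flight."""

    def __init__(self):
        self.in_response = False
        self.completed = 0

    async def process_frame(self, frame):
        if isinstance(frame, LLMFullResponseStartFrame):
            self.in_response = True
        elif isinstance(frame, LLMFullResponseEndFrame):
            self.in_response = False
            self.completed += 1
        # A real pipecat processor would also push the frame onward.
```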
Frame Type Reference
The individual reference pages below document every frame class, organized by function:
Data Frames
Audio, image, text, transcription, and transport message frames that carry
content through the pipeline.
Control Frames
Pipeline lifecycle, LLM response boundaries, TTS state, service settings,
and filter/mixer configuration.
System Frames
Interruptions, user/bot speaking state, VAD events, errors, metrics, and raw
input frames.
LLM Frames
LLM context frame, function calling helper dataclasses, and links to
LLM-related frames on other pages.