SystemFrames have higher priority than DataFrames and ControlFrames and are never cancelled during user interruptions. They are queued and processed in order with other SystemFrames. They carry signals that must always be delivered: pipeline startup and teardown, error notifications, user input, and speaking state changes. See the frames overview for base class details, mixin fields, and frame properties common to all frames.
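The priority rule can be pictured with a small stand-in model. The sketch below does not use the library's classes: `Frame`, `SystemFrame`, `DataFrame`, and `drain` are hypothetical names, and the two-queue layout is an assumption made only to illustrate that system frames are handled first while keeping their own arrival order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    name: str


class SystemFrame(Frame):
    """Stand-in for a high-priority frame (hypothetical, not the library class)."""


class DataFrame(Frame):
    """Stand-in for a normal-priority frame (hypothetical, not the library class)."""


def drain(frames):
    """Return frame names in processing order: system frames first, FIFO within each group."""
    system_queue = deque(f for f in frames if isinstance(f, SystemFrame))
    normal_queue = deque(f for f in frames if not isinstance(f, SystemFrame))
    order = []
    while system_queue:
        order.append(system_queue.popleft().name)
    while normal_queue:
        order.append(normal_queue.popleft().name)
    return order


arrivals = [
    DataFrame("audio-1"),
    SystemFrame("user-started-speaking"),
    DataFrame("audio-2"),
    SystemFrame("error"),
]
processing_order = drain(arrivals)
```

In this toy model the two system frames are handled before either data frame, and each group keeps its relative order.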

Pipeline Lifecycle

StartFrame

The first frame pushed into a pipeline, initializing all processors. Every processor receives this before any DataFrames or ControlFrames arrive.
audio_in_sample_rate
int
default:"16000"
Input audio sample rate in Hz.
audio_out_sample_rate
int
default:"24000"
Output audio sample rate in Hz.
allow_interruptions
bool
default:"False"
Whether user interruptions are allowed. Deprecated since 0.0.99: use interruption strategies instead.
enable_metrics
bool
default:"False"
Enable performance metrics collection from processors.
enable_tracing
bool
default:"False"
Enable tracing for pipeline execution.
enable_usage_metrics
bool
default:"False"
Enable usage metrics (token counts, API calls) from services.
interruption_strategies
List[BaseInterruptionStrategy]
default:"[]"
List of interruption strategies for the pipeline. Deprecated since 0.0.99.
report_only_initial_ttfb
bool
default:"False"
When True, only report time-to-first-byte for the initial response rather than every response.
tracing_context
Optional[TracingContext]
default:"None"
Optional tracing context for distributed tracing integration.

CancelFrame

Stops the pipeline immediately, skipping any queued non-SystemFrames. Use this when you need to abort without waiting for pending work to drain, for example when the user has left the session.
reason
Optional[Any]
default:"None"
Optional reason for the cancellation.

Errors

ErrorFrame

Carries an error notification, typically pushed upstream so earlier processors can react.
error
str
required
Human-readable error message.
fatal
bool
default:"False"
Whether this error is fatal and requires the bot to shut down.
processor
Optional[FrameProcessor]
default:"None"
The processor that raised the error.
exception
Optional[Exception]
default:"None"
The underlying exception, if one was caught.

FatalErrorFrame

An unrecoverable error requiring the bot to shut down. The fatal field is always True. Inherits from ErrorFrame.
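The relationship between the two error frames can be sketched with plain dataclasses. The classes below are stand-ins that mirror the fields documented above, not the library's own definitions; `ErrorFrameSketch`, `FatalErrorFrameSketch`, and `should_shut_down` are hypothetical names, and `processor` is typed as `object` so the example has no dependencies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ErrorFrameSketch:
    error: str
    fatal: bool = False
    processor: Optional[object] = None  # stand-in for Optional[FrameProcessor]
    exception: Optional[Exception] = None


@dataclass
class FatalErrorFrameSketch(ErrorFrameSketch):
    # init=False removes the flag from the constructor, so it is always True.
    fatal: bool = field(default=True, init=False)


def should_shut_down(frame: ErrorFrameSketch) -> bool:
    """A handler only needs to check the flag, whichever class carried it."""
    return frame.fatal


recoverable = ErrorFrameSketch(error="TTS request timed out")
unrecoverable = FatalErrorFrameSketch(error="Invalid API key")
```

Because the fatal variant inherits from the base class, a handler written against the base class handles both.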

Processor Pause/Resume (Urgent)

These are the SystemFrame variants of FrameProcessorPauseFrame and FrameProcessorResumeFrame. As SystemFrames, they flow through the high-priority input queue rather than the process queue, so they are not blocked by paused state or buffered frames. This makes FrameProcessorResumeUrgentFrame the correct way to resume a processor externally: the ControlFrame variant (FrameProcessorResumeFrame) would get stuck behind any DataFrames that queued up during the pause. See Control Frames for the full explanation.

FrameProcessorPauseUrgentFrame

Pauses a processor immediately, without waiting for queued frames to drain first.
processor
FrameProcessor
The processor to pause.

FrameProcessorResumeUrgentFrame

Resumes a paused processor immediately, releasing buffered frames. Use this instead of FrameProcessorResumeFrame when the processor may have frames queued up.
processor
FrameProcessor
The processor to resume.
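The reason the urgent variant is needed can be shown with a toy model of the two queues described in this section. This is a minimal sketch under assumptions: `PausableProcessorSketch` and its tuple-based frames are hypothetical and do not reflect the library's internals beyond what the text states (system frames bypass the process queue, and a paused process queue does not drain).

```python
from collections import deque


class PausableProcessorSketch:
    """Toy processor: system frames are handled at once, everything else is queued."""

    def __init__(self):
        self.paused = False
        self.process_queue = deque()  # data and control frames wait here
        self.handled = []

    def push(self, kind, payload):
        if kind == "system":
            # System frames skip the process queue entirely.
            self._handle(payload)
        else:
            self.process_queue.append(payload)
        self._drain()

    def _drain(self):
        # Nothing leaves the process queue while the processor is paused.
        while self.process_queue and not self.paused:
            self._handle(self.process_queue.popleft())

    def _handle(self, payload):
        if payload == "pause":
            self.paused = True
        elif payload == "resume":
            self.paused = False
        else:
            self.handled.append(payload)


processor = PausableProcessorSketch()
processor.push("system", "pause")    # urgent pause takes effect immediately
processor.push("data", "a")          # buffered while paused
processor.push("data", "b")          # buffered while paused
processor.push("control", "resume")  # queued behind "a" and "b", never reached

still_paused = processor.paused
handled_before = list(processor.handled)

processor.push("system", "resume")   # urgent resume bypasses the backlog
handled_after = list(processor.handled)
```

In the model, the control-frame resume sits behind the buffered data frames and the processor stays paused; only the system-priority resume releases the backlog.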

Interruptions

InterruptionFrame

Interrupts the pipeline, discarding pending DataFrames and ControlFrames. Typically triggered when the user starts speaking during a bot response.
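As a rough illustration of what an interruption discards, the sketch below filters a pending queue by frame category. The classes and the `interrupt` helper are hypothetical stand-ins, not the library's implementation; they only encode the rule stated above.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str


class SystemFrame(Frame):
    """Stand-in: survives interruptions."""


class DataFrame(Frame):
    """Stand-in: discarded on interruption."""


class ControlFrame(Frame):
    """Stand-in: discarded on interruption."""


def interrupt(pending):
    """Drop pending data and control frames; keep system frames in their original order."""
    return [frame for frame in pending if isinstance(frame, SystemFrame)]


pending = [
    DataFrame("tts-audio"),
    ControlFrame("llm-response-end"),
    SystemFrame("input-audio"),
    DataFrame("tts-text"),
]
survivors = [frame.name for frame in interrupt(pending)]
```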

User Speaking State

UserStartedSpeakingFrame

Indicates that a user turn has begun. By this point, transcriptions are usually already flowing through the pipeline.
emulated
bool
default:"False"
Whether this event was emulated rather than detected by VAD. Deprecated since 0.0.99.

UserStoppedSpeakingFrame

Marks the end of a user turn. The bot’s response is triggered separately by the turn detection system.
emulated
bool
default:"False"
Whether this event was emulated rather than detected by VAD. Deprecated since 0.0.99.

UserSpeakingFrame

Emitted by the VAD processor while the user is actively speaking. Useful for UI feedback or suppressing idle timeouts.

UserMuteStartedFrame

Broadcast when one or more user mute strategies activate. User mute temporarily suppresses user input while the bot is speaking to prevent interruptions. While muted, the LLMUserAggregator drops incoming user frames (InputAudioRawFrame, TranscriptionFrame, InterimTranscriptionFrame, UserStartedSpeakingFrame, UserStoppedSpeakingFrame, VAD signals, and InterruptionFrame). Lifecycle frames (StartFrame, EndFrame, CancelFrame) are never muted.

UserMuteStoppedFrame

Broadcast when all active user mute strategies deactivate, allowing user input to be processed again.
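The mute behavior amounts to a filter on frame types. The sketch below works on frame names as strings so that it runs without the library; `passes` is a hypothetical helper, and treating "VAD signals" as the two VAD frames documented on this page is an assumption.

```python
# Frame types the text lists as dropped while muted.
SUPPRESSED_WHILE_MUTED = frozenset({
    "InputAudioRawFrame",
    "TranscriptionFrame",
    "InterimTranscriptionFrame",
    "UserStartedSpeakingFrame",
    "UserStoppedSpeakingFrame",
    "VADUserStartedSpeakingFrame",  # assumption: counted among "VAD signals"
    "VADUserStoppedSpeakingFrame",  # assumption: counted among "VAD signals"
    "InterruptionFrame",
})

# Lifecycle frames are never muted.
LIFECYCLE_FRAMES = frozenset({"StartFrame", "EndFrame", "CancelFrame"})


def passes(frame_name: str, muted: bool) -> bool:
    """Return True if a frame with this type name should continue downstream."""
    if frame_name in LIFECYCLE_FRAMES:
        return True
    return not (muted and frame_name in SUPPRESSED_WHILE_MUTED)


incoming = ["StartFrame", "InputAudioRawFrame", "TranscriptionFrame", "CancelFrame"]
kept_while_muted = [name for name in incoming if passes(name, muted=True)]
kept_while_unmuted = [name for name in incoming if passes(name, muted=False)]
```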

VAD Events

These frames are emitted directly by the Voice Activity Detection (VAD) processor and carry timing metadata. Higher-level speaking-state frames (UserStartedSpeakingFrame, UserStoppedSpeakingFrame) are derived from these.

VADUserStartedSpeakingFrame

VAD confirmed that speech has started.
start_secs
float
default:"0.0"
Timestamp in seconds when speech onset was detected.
timestamp
float
default:"time.time()"
Wall-clock time when the frame was created.

VADUserStoppedSpeakingFrame

VAD confirmed that speech has ended.
stop_secs
float
default:"0.0"
Timestamp in seconds when speech ended.
timestamp
float
default:"time.time()"
Wall-clock time when the frame was created.

SpeechControlParamsFrame

Notifies processors that VAD or turn detection parameters have changed at runtime.
vad_params
Optional[VADParams]
default:"None"
Updated VAD parameters.
turn_params
Optional[BaseTurnParams]
default:"None"
Updated turn detection parameters.

Bot Speaking State

BotStartedSpeakingFrame

Emitted by the output transport when the bot begins speaking. Broadcast in both directions so processors on either side of the transport can react.

BotStoppedSpeakingFrame

Emitted by the output transport when the bot finishes speaking. Also broadcast in both directions.

BotSpeakingFrame

Emitted continuously while the bot is speaking. Processors can use this to suppress idle timeouts or drive visual indicators.

Connection Status

BotConnectedFrame

The bot has joined the transport room. Only relevant for SFU-based transports: Daily, LiveKit, HeyGen, and Tavus.

ClientConnectedFrame

A client or participant has connected to the transport.

Input Frames

Input frames carry raw data from transport sources into the pipeline. As SystemFrames, they are never discarded during interruptions, because incoming user data must always be processed.

InputAudioRawFrame

Raw audio received from the transport. Inherits the audio, sample_rate, num_channels, and num_frames fields from the AudioRawFrame mixin. Inherits from AudioRawFrame.

UserAudioRawFrame

Audio from a specific user in a multi-participant session. Inherits from InputAudioRawFrame.
user_id
str
default:"\"\""
Identifier for the user who produced this audio.

InputImageRawFrame

Raw image received from the transport. Inherits image, size, and format from the ImageRawFrame mixin. Inherits from ImageRawFrame.

UserImageRawFrame

An image from a specific user, optionally tied to a pending image request. Inherits from InputImageRawFrame.
user_id
str
default:"\"\""
Identifier for the user who produced this image.
text
Optional[str]
default:"None"
Optional text associated with the image.
append_to_context
Optional[bool]
default:"None"
Whether to append this image to the LLM context.
request
Optional[UserImageRequestFrame]
default:"None"
The original request frame that triggered this image capture.

InputTextRawFrame

Text received from the transport, such as a user typing in a chat interface. Inherits the text field from TextFrame. Inherits from TextFrame.

DTMF Input

InputDTMFFrame

A DTMF keypress received from the transport. Inherits the button field from the DTMFFrame mixin. Inherits from DTMFFrame.

OutputDTMFUrgentFrame

A DTMF keypress for immediate output, bypassing the normal frame queue. Inherits from DTMFFrame.

Transport Messages

InputTransportMessageFrame

A message received from an external transport. The message format is transport-specific.
message
Any
required
The transport message payload.

OutputTransportMessageUrgentFrame

An outbound transport message that bypasses the normal queue for immediate delivery.
message
Any
required
The transport message payload.

Function Calling

FunctionCallsStartedFrame

Signals that one or more function calls are about to begin executing.
function_calls
Sequence[FunctionCallFromLLM]
required
Sequence of function calls that will be executed.

FunctionCallCancelFrame

Signals that a function call was cancelled, typically due to user interruption when the function’s cancel_on_interruption flag is set.
function_name
str
required
Name of the function that was cancelled.
tool_call_id
str
required
Unique identifier for the cancelled function call.

User Interaction

UserImageRequestFrame

Requests an image from a specific user, typically to capture a camera frame for vision processing.
user_id
str
required
Identifier for the user to capture from.
text
Optional[str]
default:"None"
Optional text prompt associated with the image request.
append_to_context
Optional[bool]
default:"None"
Whether to append the resulting image to the LLM context.
video_source
Optional[str]
default:"None"
Specific video source to capture from.
function_name
Optional[str]
default:"None"
Function name if this request originated from a tool call.
tool_call_id
Optional[str]
default:"None"
Tool call identifier if this request originated from a tool call.
result_callback
Optional[Any]
default:"None"
Callback to invoke with the captured image result.

STTMuteFrame

Mutes or unmutes the STT service. While muted, incoming audio is not sent to the STT provider.
mute
bool
required
True to mute, False to unmute.

UserIdleTimeoutUpdateFrame

Updates the user idle timeout at runtime. Set to 0 to disable idle detection entirely.
timeout
float
required
New idle timeout in seconds. 0 disables detection.

Diagnostics

MetricsFrame

Performance metrics collected from processors. Emitted when metrics reporting is enabled via StartFrame.
data
List[MetricsData]
required
List of metrics data entries.

Service Metadata

ServiceMetadataFrame

Base metadata frame broadcast by services at startup, providing information about service capabilities and configuration.
service_name
str
required
Name of the service that emitted this metadata.

STTMetadataFrame

Metadata from an STT service, including latency characteristics used for turn detection tuning. Inherits from ServiceMetadataFrame.
ttfs_p99_latency
float
required
P99 latency in seconds for time-to-final-segment. Used by turn detectors to calibrate wait times.

RTVI

Frames for the Real-Time Voice Interface (RTVI) protocol, which bridges clients and the pipeline. These frames handle custom messaging between the client and server.

RTVIServerMessageFrame

Sends a server message to the connected client.
data
Any
required
The message data to send to the client.

RTVIClientMessageFrame

A message received from the client, expecting a server response via RTVIServerResponseFrame.
msg_id
str
required
Unique identifier for the client message.
type
str
required
The message type.
data
Optional[Any]
default:"None"
Optional message data from the client.

RTVIServerResponseFrame

Responds to an RTVIClientMessageFrame. Include the original client message frame to ensure the response is properly correlated. Set the error field to respond with an error instead of a normal response.
client_msg
RTVIClientMessageFrame
required
The original client message this response is for.
data
Optional[Any]
default:"None"
Response data to send to the client.
error
Optional[str]
default:"None"
Error message. When set, the client receives an error-response instead of a server-response.
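The request and response correlation can be sketched with stand-in dataclasses. Everything here is illustrative: `ClientMessageSketch`, `ServerResponseSketch`, and `to_wire` are hypothetical, and the dictionary layout is invented for the example. Only the field names and the two message type names (`server-response` and `error-response`) come from the descriptions above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ClientMessageSketch:
    msg_id: str
    type: str
    data: Optional[Any] = None


@dataclass
class ServerResponseSketch:
    client_msg: ClientMessageSketch
    data: Optional[Any] = None
    error: Optional[str] = None


def to_wire(response: ServerResponseSketch) -> dict:
    """Build an illustrative payload; the id ties the reply to the original request."""
    if response.error is not None:
        return {
            "type": "error-response",
            "id": response.client_msg.msg_id,
            "error": response.error,
        }
    return {
        "type": "server-response",
        "id": response.client_msg.msg_id,
        "data": response.data,
    }


request = ClientMessageSketch(msg_id="42", type="get-status")
ok = to_wire(ServerResponseSketch(client_msg=request, data={"status": "ready"}))
failed = to_wire(ServerResponseSketch(client_msg=request, error="unknown message type"))
```

Passing the original client message rather than a bare identifier keeps the correlation in one place, which is the point the description makes.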

Task Frames

Task frames provide a system-priority mechanism for requesting pipeline actions from outside the normal frame flow. They are converted into their corresponding standard frames when processed.

TaskSystemFrame

Base class for system-priority task frames.

CancelTaskFrame

Requests immediate pipeline cancellation. Converted to a CancelFrame when processed by the pipeline. Inherits from TaskSystemFrame.
reason
Optional[Any]
default:"None"
Optional reason for the cancellation request.

InterruptionTaskFrame

Requests a pipeline interruption. Converted to an InterruptionFrame when processed. Inherits from TaskSystemFrame.
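The conversion step described for task frames can be pictured as a simple mapping. The classes and `convert_task_frame` below are hypothetical stand-ins; in particular, carrying `reason` across from the task frame to the resulting cancel frame is an assumption based on both frames documenting that field.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CancelTaskSketch:
    reason: Optional[Any] = None


@dataclass
class InterruptionTaskSketch:
    pass


@dataclass
class CancelSketch:
    reason: Optional[Any] = None


@dataclass
class InterruptionSketch:
    pass


def convert_task_frame(frame):
    """Map a task frame to its standard counterpart; pass other frames through unchanged."""
    if isinstance(frame, CancelTaskSketch):
        return CancelSketch(reason=frame.reason)  # assumption: reason is carried over
    if isinstance(frame, InterruptionTaskSketch):
        return InterruptionSketch()
    return frame


cancel = convert_task_frame(CancelTaskSketch(reason="user left the session"))
interruption = convert_task_frame(InterruptionTaskSketch())
untouched = convert_task_frame("not a task frame")
```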