# Data Frames

> Reference for DataFrame types: audio, image, text, transcription, and transport messages

## Overview

DataFrames carry the main content flowing through a pipeline: audio chunks, text, images, transcriptions, and messages. They are queued and processed in order with other DataFrames and ControlFrames, and any pending DataFrames are discarded when a user interrupts.

See the [Frames overview](/api-reference/server/frames/overview) for base class details, mixin fields, and frame properties common to all frames.

## Audio Frames

These frames carry raw audio through the pipeline toward the output transport. Each inherits the `audio`, `sample_rate`, `num_channels`, and `num_frames` fields from the [`AudioRawFrame`](/api-reference/server/frames/overview#audiorawframe) mixin.

### OutputAudioRawFrame

A chunk of raw audio destined for the output transport. Use the inherited `transport_destination` field when your transport supports multiple audio tracks.

Inherits from `AudioRawFrame`.

### TTSAudioRawFrame

Audio generated by a TTS service, ready for playback.

Inherits from `OutputAudioRawFrame`.

- Identifier for the TTS context that generated this audio.

### SpeechOutputAudioRawFrame

Audio from a continuous speech stream. The stream may contain silence frames intermixed with speech, so downstream processors may need to distinguish between the two.

Inherits from `OutputAudioRawFrame`.

## Image Frames

Frames for carrying image data to the output transport. Each inherits `image`, `size`, and `format` from the [`ImageRawFrame`](/api-reference/server/frames/overview#imagerawframe) mixin.

### OutputImageRawFrame

An image for display by the output transport. Supports the `transport_destination` field for transports with multiple video tracks.

Inherits from `ImageRawFrame`.
The `sync_with_audio` field (default `False`) is set internally, not via the constructor. When `True`, the image is queued with audio frames so it displays only after all preceding audio has been sent. When `False`, the transport displays it immediately.

### URLImageRawFrame

An output image with an associated download URL, typically from a third-party image generation service.

Inherits from `OutputImageRawFrame`.

- URL where the image can be downloaded.

### AssistantImageRawFrame

An image generated by the assistant for both display and inclusion in LLM context. The superclass handles display; the additional fields here carry the original image data in a format suitable for direct use in LLM context messages.

Inherits from `OutputImageRawFrame`.

- Original image data for use in LLM context messages without further encoding.
- MIME type of the original image data.

### SpriteFrame

An animated sprite composed of multiple image frames. The transport plays the images at the framerate specified by the transport's `camera_out_framerate` parameter.

- Ordered list of image frames that make up the sprite animation.

## Text Frames

Text content at various stages of processing: raw text, LLM output, aggregated results, TTS input, and transcriptions.

### TextFrame

The fundamental text container. Emitted by LLM services, consumed by context aggregators, TTS services, and other processors.

- The text content.

Several non-constructor fields control downstream behavior:

- `skip_tts` (default `None`): when set, tells the TTS service to skip this text
- `includes_inter_frame_spaces` (default `False`): indicates whether leading/trailing spaces are already included
- `append_to_context` (default `True`): whether this text should be appended to the LLM context

### LLMTextFrame

Text generated by an LLM service. Behaves like a `TextFrame` with `includes_inter_frame_spaces` set to `True`, since LLM services include all necessary spacing.

Inherits from `TextFrame`.
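As a rough illustration of what `includes_inter_frame_spaces` signals, the sketch below joins a run of text chunks. `TextChunk` and `join_chunks` are hypothetical stand-ins written for this example, not Pipecat classes, and the joining rule is only an assumption about how a consumer might use the flag.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TextChunk:
    """Hypothetical stand-in for a text frame; not the Pipecat class."""
    text: str
    includes_inter_frame_spaces: bool = False


def join_chunks(chunks: Iterable[TextChunk]) -> str:
    """Concatenate chunks, inserting a space only for chunks without their own spacing."""
    out = ""
    for chunk in chunks:
        if chunk.includes_inter_frame_spaces:
            # The producer already embedded any needed spaces: append verbatim.
            out += chunk.text
        else:
            # Assumed rule: trim the chunk and separate it from prior text with one space.
            piece = chunk.text.strip()
            if out and piece and not out.endswith(" "):
                out += " "
            out += piece
    return out


# LLM-style chunks carry their own spacing; plain chunks do not.
llm_style = [TextChunk("Hello", True), TextChunk(" there", True), TextChunk("!", True)]
plain_style = [TextChunk("Hello"), TextChunk("there")]

print(join_chunks(llm_style))
print(join_chunks(plain_style))
```

Appending LLM-style chunks verbatim avoids stray spaces before punctuation, which is the practical reason a producer would set the flag.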
### LLMMarkerFrame

A sideband marker emitted by an LLM service: short, structured assistant output that is persisted to the conversation context but kept out of the standard text path (TTS, transcript). The primary use is the [filter incomplete user turns](/api-reference/server/utilities/turn-management/filter-incomplete-turns) protocol, where the LLM emits the turn-completion markers `✓` / `○` / `◐`.

Inherits from `DataFrame`.

- The marker payload (typically a short string such as a single character).
- If `True`, the marker is written to the context as its own standalone assistant message as soon as it's received. If `False`, it is appended to the running assistant aggregation and flushed to the context together with the following text as a single message (e.g. `"✓ "`).

### AggregatedTextFrame

Multiple text frames combined into a single frame for processing or output.

Inherits from `TextFrame`.

- Method used to aggregate the text frames.
- Identifier for the TTS context associated with this text.

### VisionTextFrame

Text output from a vision model. Functionally identical to `LLMTextFrame` but distinguished by type for routing purposes.

Inherits from `LLMTextFrame`.

### TTSTextFrame

Text that has been sent to a TTS service for synthesis.

Inherits from `AggregatedTextFrame`.

- Identifier for the TTS context that generated this text.

### Transcriptions

Frames produced by speech-to-text services at different stages of recognition. All inherit from `TextFrame`, so they flow through text aggregators and other `TextFrame` handlers.

#### TranscriptionFrame

A non-interim transcription result from an STT service: the service's best recognition of what the user said, as opposed to the streaming partial results in `InterimTranscriptionFrame`.

- Identifier for the user who spoke.
- When the transcription occurred.
- Detected or specified language of the speech.
- Raw result object from the STT service.
- Whether the STT service has explicitly committed this transcription via a finalize signal.
Some services (AssemblyAI, Deepgram, Soniox, Speechmatics) support this; others don't, so it defaults to `False`. Turn detection strategies can use this flag to trigger the bot's response immediately rather than waiting for a timeout.

#### InterimTranscriptionFrame

A partial, in-progress transcription. These frames update frequently while the user is still speaking, and are superseded by a `TranscriptionFrame` once the STT service produces its result.

- The partial transcription text.
- Identifier for the user who spoke.
- When the interim transcription occurred.
- Detected or specified language of the speech.
- Raw result object from the STT service.

#### TranslationFrame

A translated transcription, typically placed in the transport's receive queue when a participant speaks in a different language.

- Identifier for the user who spoke.
- When the translation occurred.
- Target language of the translation.

## TTS Frames

### TTSSpeakFrame

Sends text to the pipeline's TTS service as a standalone utterance, independent of any LLM response turn. The TTS service creates a fresh audio context for each `TTSSpeakFrame`, whereas `TextFrame`s produced during an LLM response are grouped under the same turn context.

- The text to be spoken.
- Whether to append the spoken text to the LLM context.

## Transport Message Frames

### OutputTransportMessageFrame

A transport-specific message payload for sending data through the output transport. The message format depends on the transport implementation.

- The transport message payload.

## DTMF Frames

### OutputDTMFFrame

A DTMF (Dual-Tone Multi-Frequency) keypress queued for output. Inherits the `button` field from the `DTMFFrame` mixin, which holds the keypad entry that was pressed.

Inherits from `DTMFFrame`.

- The DTMF keypad entry to send.

For transports that support multiple dial-out destinations, set the `transport_destination` field (inherited from `Frame`) to specify which destination receives the DTMF tone.
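To make the `button` and `transport_destination` pairing concrete, here is a minimal sketch that expands a digit string into one keypress record per character. `Key`, `DTMFOut`, and `dial_sequence` are hypothetical names invented for this example; they only mirror the two fields described above and are not Pipecat's own types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Key(Enum):
    """Hypothetical keypad enum for this sketch (not Pipecat's keypad type)."""
    ZERO = "0"
    ONE = "1"
    TWO = "2"
    THREE = "3"
    FOUR = "4"
    FIVE = "5"
    SIX = "6"
    SEVEN = "7"
    EIGHT = "8"
    NINE = "9"
    STAR = "*"
    POUND = "#"


@dataclass
class DTMFOut:
    """Hypothetical stand-in carrying the two fields described in the text."""
    button: Key
    transport_destination: Optional[str] = None


def dial_sequence(digits: str, destination: Optional[str] = None) -> List[DTMFOut]:
    """Build one keypress per character; Key(ch) raises ValueError for non-keypad characters."""
    return [DTMFOut(button=Key(ch), transport_destination=destination) for ch in digits]


presses = dial_sequence("42#", destination="line-1")
for press in presses:
    print(press.button.name, press.transport_destination)
```

Emitting one record per keypress keeps ordering explicit, which matters because DataFrames are processed in queue order.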
## LLM Context Management

Frames that modify or trigger processing of the LLM conversation context.

### LLMMessagesAppendFrame

Appends messages to the current conversation context without replacing existing ones.

- List of message dictionaries to append.
- Whether the LLM should process the updated context immediately. When `None`, the default behavior of the context aggregator applies.

### LLMMessagesUpdateFrame

Replaces the current context messages entirely with a new set.

- List of message dictionaries to replace the current context.
- Whether the LLM should process the updated context immediately. When `None`, the default behavior of the context aggregator applies.

### LLMRunFrame

Triggers LLM processing with the current context. Push this frame when you want the LLM to generate a response using whatever context has already been assembled.

### LLMContextAssistantTimestampFrame

Records when an assistant message was created. Used internally to track timing of assistant responses in the conversation context.

- Timestamp when the assistant message was created.

## LLM Thinking

### LLMThoughtTextFrame

A chunk of thought or reasoning text from the LLM. This is a `DataFrame`, not a `TextFrame` subclass, so TTS services and text aggregators will not process it.

- The text (or text chunk) of the thought.

## LLM Tool Configuration

Frames for configuring LLM function calling behavior and output settings at runtime.

### LLMSetToolsFrame

Changes the set of tools advertised to the LLM mid-conversation.

- The tools to advertise. May be a `ToolsSchema`, a plain list of direct functions and/or `FunctionSchema` objects, a list of provider-specific tool dicts, or `NOT_GIVEN` to clear all tools.

Direct functions and `FunctionSchema`s with a bundled `handler` are auto-registered with the LLM service, meaning no manual handler registration is needed. Any such tool dropped from the advertised set is automatically unregistered.
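The auto-registration rule for `LLMSetToolsFrame` can be pictured with a small registry. This is a sketch under stated assumptions, not the LLM service's implementation: `ToolRegistry` and its methods are hypothetical, tools are modeled as a plain name-to-handler dict, and an empty dict stands in for clearing all tools.

```python
from typing import Callable, Dict, Optional, Set

Handler = Callable[..., object]


class ToolRegistry:
    """Hypothetical registry mirroring the auto-register/unregister rule."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}
        self._auto: Set[str] = set()  # names registered via set_tools, not manually

    def register(self, name: str, handler: Handler) -> None:
        """Manual registration; never removed by set_tools."""
        self.handlers[name] = handler

    def set_tools(self, tools: Dict[str, Optional[Handler]]) -> None:
        """Advertise `tools`; a non-None value is a handler bundled with the tool."""
        advertised = set(tools)
        # Drop auto-registered handlers whose tool is no longer advertised.
        for name in self._auto - advertised:
            self.handlers.pop(name, None)
        # Register every handler bundled with an advertised tool.
        bundled = {n: h for n, h in tools.items() if h is not None}
        self.handlers.update(bundled)
        self._auto = (self._auto & advertised) | set(bundled)


def get_weather() -> str:
    return "sunny"


def get_time() -> str:
    return "noon"


registry = ToolRegistry()
registry.register("manual_tool", lambda: "manual")
registry.set_tools({"get_weather": get_weather, "manual_tool": None})
after_first = sorted(registry.handlers)
registry.set_tools({"get_time": get_time})
after_second = sorted(registry.handlers)
print(after_first, after_second)
```

Tracking auto-registered names separately is what lets the manually registered handler survive when the advertised set changes.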
### LLMSetToolChoiceFrame

Configures how the LLM selects tools during function calling.

- Tool choice setting: `"none"` disables tool use, `"auto"` lets the LLM decide, `"required"` forces a tool call, or a dict specifying a particular tool.

### LLMEnablePromptCachingFrame

Toggles prompt caching for LLMs that support it.

- Whether to enable prompt caching.

### LLMConfigureOutputFrame

Configures how the LLM produces output. Useful for scenarios where you want the LLM to generate tokens that update context but should not be spoken aloud.

- When `True`, LLM tokens are added to context but not passed to TTS.

## Function Call Results

### FunctionCallResultFrame

Contains the result of a completed function call execution. Inherits from `UninterruptibleFrame` to ensure the result always reaches the context aggregator.

- Name of the function that was executed.
- Unique identifier for the function call.
- Arguments that were passed to the function.
- The result returned by the function.
- Whether to run the LLM after this result. Overrides the default behavior.
- Additional properties for result handling.
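The "overrides the default behavior" wording suggests a three-state flag: unset, force on, or force off. The sketch below shows that pattern with a hypothetical `ResultStub` dataclass. Its field names are assumptions chosen to match the descriptions above, not confirmed Pipecat attribute names, and `should_run_llm` is an illustrative helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ResultStub:
    """Hypothetical stand-in for a function call result; field names are assumed."""
    function_name: str
    tool_call_id: str
    arguments: Dict[str, Any]
    result: Any
    run_llm: Optional[bool] = None  # None means "no override"


def should_run_llm(frame: ResultStub, default: bool) -> bool:
    """Use the frame's explicit override when set, otherwise fall back to the default."""
    return default if frame.run_llm is None else frame.run_llm


base = dict(function_name="get_weather", tool_call_id="call-1",
            arguments={"city": "Oslo"}, result={"temp_c": 4})

unset = ResultStub(**base)
forced_off = ResultStub(**base, run_llm=False)
forced_on = ResultStub(**base, run_llm=True)

print(should_run_llm(unset, default=True))
print(should_run_llm(forced_off, default=True))
print(should_run_llm(forced_on, default=False))
```

Comparing against `None` rather than relying on truthiness is the key detail: an explicit `False` must win over a `True` default.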