Audio Frames
Frame types for handling audio data in Pipecat
AudioRawFrame
Base class for all audio data frames. Contains raw audio samples and associated metadata.
Properties
audio
bytes, required
Raw audio data in PCM format
sample_rate
int, required
Audio sample rate in Hz (e.g., 16000, 44100)
num_channels
int, required
Number of audio channels (typically 1 for mono, 2 for stereo)
num_frames
int
Number of audio frames, calculated as len(audio) / (num_channels * 2). The factor of 2 is the sample width in bytes, so the formula assumes 16-bit PCM samples.
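The num_frames formula can be checked with a few lines of standalone Python. This is a minimal sketch, not Pipecat code: the helper names `count_frames` and `duration_seconds` and the `sample_width` parameter are hypothetical, and the default of 2 bytes per sample reflects the 16-bit assumption in the formula above. Integer division is used so the result is a whole number of frames.

```python
def count_frames(audio: bytes, num_channels: int, sample_width: int = 2) -> int:
    """Return the number of frames (one sample per channel) in a PCM buffer.

    Hypothetical helper; mirrors len(audio) / (num_channels * 2) for 16-bit PCM.
    """
    bytes_per_frame = num_channels * sample_width
    if bytes_per_frame <= 0:
        raise ValueError("num_channels and sample_width must be positive")
    return len(audio) // bytes_per_frame


def duration_seconds(audio: bytes, sample_rate: int, num_channels: int) -> float:
    """Return the playback duration of a 16-bit PCM buffer in seconds."""
    return count_frames(audio, num_channels) / sample_rate


# One second of 16 kHz mono silence: 16000 samples, 2 bytes each.
mono_second = bytes(16000 * 2)
mono_frames = count_frames(mono_second, num_channels=1)

# The same byte count interpreted as stereo holds half as many frames.
stereo_frames = count_frames(mono_second, num_channels=2)
```

Because a frame spans all channels, the same buffer yields half as many frames when read as stereo.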
Methods
InputAudioRawFrame
Audio frame specifically for input sources (e.g., microphone).
OutputAudioRawFrame
Audio frame for output playback.
TTSAudioRawFrame
Audio frame containing synthesized speech from TTS services.
Usage Examples
Creating Audio Frames
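The sketch below shows how frames with the properties documented above might be constructed. To stay runnable without Pipecat installed, it defines stand-in dataclasses that mirror the documented fields; these stand-ins are an assumption for illustration and are not the library's own definitions. In a real project the classes would be imported from Pipecat rather than redefined.

```python
from dataclasses import dataclass, field


@dataclass
class AudioRawFrame:
    """Stand-in mirroring the documented properties (not the Pipecat class)."""

    audio: bytes
    sample_rate: int
    num_channels: int
    num_frames: int = field(init=False)

    def __post_init__(self) -> None:
        # Documented formula, assuming 16-bit (2-byte) PCM samples.
        self.num_frames = len(self.audio) // (self.num_channels * 2)


@dataclass
class InputAudioRawFrame(AudioRawFrame):
    """Stand-in for audio arriving from an input source such as a microphone."""


@dataclass
class OutputAudioRawFrame(AudioRawFrame):
    """Stand-in for audio destined for output playback."""


@dataclass
class TTSAudioRawFrame(AudioRawFrame):
    """Stand-in for synthesized speech produced by a TTS service."""


# 20 ms of 16 kHz mono silence: 320 samples, 2 bytes each.
mic_frame = InputAudioRawFrame(audio=bytes(320 * 2), sample_rate=16000, num_channels=1)

# 20 ms of 24 kHz stereo silence: 480 frames, 2 channels, 2 bytes per sample.
speaker_frame = OutputAudioRawFrame(audio=bytes(480 * 2 * 2), sample_rate=24000, num_channels=2)

# 20 ms of 24 kHz mono silence standing in for synthesized speech.
tts_frame = TTSAudioRawFrame(audio=bytes(480 * 2), sample_rate=24000, num_channels=1)
```

In this sketch num_frames is derived from the audio buffer rather than passed in, which matches its description as a calculated value.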
Common Pipeline Usage
Frame Flow
Notes
- Audio data should be in PCM format
- Frame size is typically aligned with processing requirements (e.g., 20ms chunks)
- Sample rate should match service requirements (e.g., 16kHz for most STT services)
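To make the 20 ms chunk size concrete, the following sketch computes the byte size of a chunk and splits a PCM buffer accordingly. The function names `chunk_size_bytes` and `split_pcm` are hypothetical helpers, not part of Pipecat, and 16-bit samples are assumed.

```python
from typing import List


def chunk_size_bytes(sample_rate: int, num_channels: int,
                     chunk_ms: int = 20, sample_width: int = 2) -> int:
    """Return the number of bytes in one chunk of PCM audio."""
    frames_per_chunk = sample_rate * chunk_ms // 1000
    return frames_per_chunk * num_channels * sample_width


def split_pcm(audio: bytes, sample_rate: int, num_channels: int,
              chunk_ms: int = 20) -> List[bytes]:
    """Split a 16-bit PCM buffer into fixed-duration chunks.

    The final chunk may be shorter if the buffer is not an exact multiple.
    """
    size = chunk_size_bytes(sample_rate, num_channels, chunk_ms)
    return [audio[i:i + size] for i in range(0, len(audio), size)]


# 16 kHz mono, 20 ms: 320 samples * 2 bytes = 640 bytes per chunk.
size_16k_mono = chunk_size_bytes(16000, 1)

# One second of 16 kHz mono audio splits into 50 chunks of 20 ms.
chunks = split_pcm(bytes(16000 * 2), sample_rate=16000, num_channels=1)
```

Each chunk could then be wrapped in its own audio frame with the matching sample_rate and num_channels.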