Recording Conversation Audio
Learn how to record and save audio from conversations between users and your bot
Overview
Recording audio from conversations provides valuable data for analysis, debugging, and quality control. Pipecat’s AudioBufferProcessor makes it easy to capture high-quality audio recordings of both the user and bot during interactions.
How It Works
The AudioBufferProcessor captures audio by:
- Collecting audio frames from both the user (input) and bot (output)
- Emitting events with recorded audio data
- Providing options for composite or separate track recordings
Add the processor to your pipeline after transport.output() so that it captures both the user audio and the bot audio as it’s spoken.
Audio Recording Options
The AudioBufferProcessor offers several configuration options:
- Composite recording: Combined audio from both user and bot
- Track-level recording: Separate audio files for user and bot
- Turn-based recording: Individual audio clips for each speaking turn
- Mono or stereo output: Single channel mixing or two-channel separation
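To make the mono versus stereo distinction concrete, here is a minimal standard-library sketch of how two mono tracks could be combined. It assumes 16-bit PCM audio in native byte order and places the user on the left channel and the bot on the right. The helper names mix_mono and interleave_stereo are hypothetical, and this is an illustration of the idea rather than Pipecat's own mixing code.

```python
from array import array


def _padded(user: bytes, bot: bytes):
    """Decode two 16-bit PCM tracks and zero-pad the shorter one."""
    u, b = array("h", user), array("h", bot)
    n = max(len(u), len(b))
    u.extend([0] * (n - len(u)))
    b.extend([0] * (n - len(b)))
    return u, b


def mix_mono(user: bytes, bot: bytes) -> bytes:
    """Single-channel mix: average the two tracks sample by sample."""
    u, b = _padded(user, bot)
    return array("h", ((x + y) // 2 for x, y in zip(u, b))).tobytes()


def interleave_stereo(user: bytes, bot: bytes) -> bytes:
    """Two-channel output: user samples on the left, bot samples on the right."""
    u, b = _padded(user, bot)
    out = array("h")
    for left, right in zip(u, b):
        out.append(left)
        out.append(right)
    return out.tobytes()
```

A mono mix keeps files small, while the stereo layout keeps each speaker recoverable from a single file.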
Basic Implementation
Step 1: Create an Audio Buffer Processor
Initialize the audio buffer processor with your desired configuration:
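A configuration along these lines is one possibility. The parameter names (sample_rate, num_channels, buffer_size, enable_turn_audio) and the import path in the comments are assumptions drawn from the AudioBufferProcessor API reference, so check them against the version you have installed. The options are kept in a plain dictionary here so the snippet runs without Pipecat.

```python
# Keyword arguments for AudioBufferProcessor (names assumed from the API reference).
recorder_options = {
    "sample_rate": None,         # None: follow the transport's sample rate
    "num_channels": 1,           # 1 = mono mix, 2 = stereo (user and bot on separate channels)
    "buffer_size": 0,            # 0: assumed to emit audio only when recording stops
    "enable_turn_audio": False,  # True: also emit a clip for each speaking turn
}

# In an application with Pipecat installed (assumed import path):
# from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
# audiobuffer = AudioBufferProcessor(**recorder_options)
```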
Step 2: Add to Your Pipeline
Place the processor in your pipeline after all audio-producing components:
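The ordering can be sketched with placeholder labels. The stage names below (stt, llm, tts, context_aggregator) are assumptions modeled on a typical speech-to-text, LLM, and text-to-speech bot. In a real application you would pass the processor objects themselves, in this order, to Pipeline([...]).

```python
# Placeholder labels standing in for real processor objects.
pipeline_stages = [
    "transport.input()",              # user audio enters here
    "stt",
    "context_aggregator.user()",
    "llm",
    "tts",                            # bot audio is produced here
    "transport.output()",             # bot audio is sent to the user
    "audiobuffer",                    # recorder sits after the output
    "context_aggregator.assistant()",
]


def recorder_is_after_output(stages):
    """Return True when the recorder comes after the transport output."""
    return stages.index("audiobuffer") > stages.index("transport.output()")
```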
Step 3: Start Recording
Explicitly start recording when needed, typically when a session begins:
You must call start_recording() explicitly to begin capturing audio. The processor won’t record automatically when initialized.
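One way to tie recording to the session lifecycle is to start it when a client connects and stop it when the client leaves. In this sketch, make_recording_handlers is a hypothetical helper, and the event names on_client_connected and on_client_disconnected as well as stop_recording() are assumptions. Available events differ between transports, so adapt the names to the one you use.

```python
def make_recording_handlers(audiobuffer):
    """Build connect/disconnect handlers that start and stop recording."""

    async def on_client_connected(transport, client):
        # Recording only begins once this call is made.
        await audiobuffer.start_recording()

    async def on_client_disconnected(transport, client):
        # Assumed counterpart to start_recording().
        await audiobuffer.stop_recording()

    return on_client_connected, on_client_disconnected


# Wiring, assuming `audiobuffer` from Step 1 and a transport with these events:
# on_connect, on_disconnect = make_recording_handlers(audiobuffer)
# transport.add_event_handler("on_client_connected", on_connect)
# transport.add_event_handler("on_client_disconnected", on_disconnect)
```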
Step 4: Handle Audio Data
Register an event handler to process audio data:
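A handler for composite audio might look like the sketch below, which writes each delivery to a timestamped WAV file using only the standard library. The event name on_audio_data, the handler signature (buffer, audio, sample_rate, num_channels), and 16-bit PCM samples are assumptions to verify against the API reference. The save_wav helper and the recordings directory are hypothetical.

```python
import datetime
import os
import wave


def save_wav(path, audio, sample_rate, num_channels):
    """Write raw 16-bit PCM bytes to a WAV file, creating the folder if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(num_channels)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio (assumed)
        wf.setframerate(sample_rate)
        wf.writeframes(audio)


async def on_audio_data(buffer, audio, sample_rate, num_channels):
    """Save the composite recording; skip empty deliveries."""
    if not audio:
        return None
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    path = os.path.join("recordings", f"conversation_{timestamp}.wav")
    save_wav(path, audio, sample_rate, num_channels)
    return path


# Registration, assuming `audiobuffer` from Step 1:
# audiobuffer.add_event_handler("on_audio_data", on_audio_data)
```

The file write here is blocking. For long recordings, consider moving it off the event loop, for example with asyncio.to_thread.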
If recording separate tracks, you can use the on_track_audio_data event handler to save user and bot audio separately.
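For separate tracks, a handler could write one file per speaker. This sketch assumes the handler receives (buffer, user_audio, bot_audio, sample_rate, num_channels) and that both tracks are 16-bit PCM. The make_track_handler factory and the user.wav and bot.wav file names are hypothetical.

```python
import os
import wave


def make_track_handler(out_dir):
    """Build an on_track_audio_data handler that saves each track to out_dir."""

    async def on_track_audio_data(buffer, user_audio, bot_audio, sample_rate, num_channels):
        os.makedirs(out_dir, exist_ok=True)
        paths = {}
        for name, pcm in (("user", user_audio), ("bot", bot_audio)):
            if not pcm:
                continue  # nothing recorded for this speaker
            path = os.path.join(out_dir, f"{name}.wav")
            with wave.open(path, "wb") as wf:
                wf.setnchannels(num_channels)
                wf.setsampwidth(2)  # 16-bit samples (assumed)
                wf.setframerate(sample_rate)
                wf.writeframes(pcm)
            paths[name] = path
        return paths

    return on_track_audio_data


# Registration, assuming `audiobuffer` from Step 1:
# audiobuffer.add_event_handler("on_track_audio_data", make_track_handler("recordings"))
```

Because the file names are fixed, each event overwrites the previous files. Add a timestamp or counter to the names if the processor delivers audio in chunks.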
Next Steps
Try the Audio Recording Example
Explore a complete working example that demonstrates how to record and save both composite and track-level audio with Pipecat.
AudioBufferProcessor Reference
Read the complete API reference documentation for advanced configuration options and event handlers.
Consider implementing audio recording in your application for quality assurance, training data collection, or creating conversation archives. The recorded audio can be stored locally, uploaded to cloud storage, or processed in real time for further analysis.