Learn how to record and save audio from conversations between users and your bot
Recording audio from conversations provides valuable data for analysis, debugging, and quality control. You have two options for how to record with Pipecat:
Record without writing custom code by using your transport provider’s recording capabilities. In addition to saving you development time, some providers offer unique recording capabilities.
Pipecat’s AudioBufferProcessor makes it easy to capture high-quality audio recordings of both the user and the bot during interactions. Opt for this approach if you want more control over your recordings.
This guide focuses on how to record using the AudioBufferProcessor, including high-level guidance for setting up post-processing jobs for longer recordings.
The AudioBufferProcessor captures audio frames from both the user and the bot as the conversation runs and delivers the recorded data through event handlers.
Add the processor to your pipeline after transport.output() to capture both the user audio and the bot audio as it’s spoken.
The AudioBufferProcessor offers several configuration options, including the recording sample rate, the number of channels (a mono mix, or stereo with user and bot on separate channels), and a buffer size that controls how often recorded audio is delivered.
Initialize the audio buffer processor with your desired configuration:
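For example, a minimal setup might look like the following. The parameter values here are illustrative, not defaults; check the API reference for the exact options your Pipecat version supports.

```python
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor

# Illustrative configuration; adjust values for your use case.
audiobuffer = AudioBufferProcessor(
    sample_rate=24000,  # output sample rate in Hz
    num_channels=1,     # 1 = mono mix of user and bot; 2 = separate stereo channels
    buffer_size=0,      # 0 = deliver audio only when recording stops
)
```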
Place the processor in your pipeline after all audio-producing components:
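As a sketch, a pipeline with recording enabled might be shaped like this. The stt, llm, tts, and context_aggregator processors are placeholders for whatever services your bot uses; the key point is that audiobuffer sits after transport.output().

```python
from pipecat.pipeline.pipeline import Pipeline

# Illustrative pipeline; your processors will differ.
pipeline = Pipeline(
    [
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        audiobuffer,  # after transport.output() so bot speech is captured
        context_aggregator.assistant(),
    ]
)
```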
Explicitly start recording when needed, typically when a session begins:
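One common pattern is to start recording when a client connects and stop when they disconnect. The helper function below is a sketch; the exact event names depend on your transport.

```python
def wire_recording(transport, audiobuffer):
    """Start/stop recording on client connect/disconnect (hypothetical helper)."""

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        # Nothing is captured until start_recording() is called.
        await audiobuffer.start_recording()

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        # Stops capture and flushes any remaining buffered audio.
        await audiobuffer.stop_recording()
```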
You must call start_recording() explicitly to begin capturing audio. The processor won’t record automatically when initialized.
Register an event handler to process audio data:
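As a sketch, the handler below wraps the raw PCM audio delivered by on_audio_data in a WAV container using Python’s standard wave module. The registration function and file naming are illustrative.

```python
import datetime
import io
import wave


def pcm_to_wav(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Wrap raw 16-bit PCM audio in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(num_channels)
        wf.setsampwidth(2)  # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(audio)
    return buf.getvalue()


def register_save_handler(audiobuffer):
    """Save the merged conversation audio whenever the processor delivers it."""

    @audiobuffer.event_handler("on_audio_data")
    async def on_audio_data(buffer, audio, sample_rate, num_channels):
        filename = f"conversation_{datetime.datetime.now():%Y%m%d_%H%M%S}.wav"
        with open(filename, "wb") as f:
            f.write(pcm_to_wav(audio, sample_rate, num_channels))
```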
If recording separate tracks, you can use the on_track_audio_data event handler to save user and bot audio separately.
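For instance, a track-level handler might write each participant’s audio to its own mono WAV file. This sketch assumes the handler receives user and bot audio as separate byte strings; the registration wrapper, out_dir parameter, and file names are illustrative.

```python
import os
import wave


def register_track_handler(audiobuffer, out_dir="."):
    """Save user and bot audio as separate WAV files (illustrative helper)."""

    @audiobuffer.event_handler("on_track_audio_data")
    async def on_track_audio_data(buffer, user_audio, bot_audio, sample_rate, num_channels):
        for name, audio in (("user", user_audio), ("bot", bot_audio)):
            path = os.path.join(out_dir, f"{name}_track.wav")
            with wave.open(path, "wb") as wf:
                wf.setnchannels(num_channels)
                wf.setsampwidth(2)  # 16-bit PCM
                wf.setframerate(sample_rate)
                wf.writeframes(audio)
```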
For conversations that last a few minutes, it may be sufficient to buffer the audio in memory. However, for longer sessions, storing audio in memory poses two challenges: memory usage grows unbounded as the conversation continues, and an unexpected crash or disconnect loses everything that hasn’t been saved.
Instead, consider using a chunked approach to record audio in manageable segments. This allows you to periodically save audio data to disk or upload it to cloud storage, reducing memory usage and ensuring data persistence.
Set a reasonable buffer_size to trigger periodic uploads:
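For example, with buffer_size set so that on_audio_data fires every few seconds, each delivery can be written to its own numbered file. The helper below is a standard-library sketch; the buffer_size value in the comment (roughly 20 seconds of 16 kHz mono 16-bit audio) is illustrative.

```python
import os
import wave


class ChunkedRecorder:
    """Write each delivered audio chunk to its own numbered WAV file."""

    def __init__(self, out_dir: str):
        self.out_dir = out_dir
        self.chunk_index = 0
        os.makedirs(out_dir, exist_ok=True)

    def write_chunk(self, audio: bytes, sample_rate: int, num_channels: int) -> str:
        path = os.path.join(self.out_dir, f"chunk_{self.chunk_index:05d}.wav")
        with wave.open(path, "wb") as wf:
            wf.setnchannels(num_channels)
            wf.setsampwidth(2)  # 16-bit PCM
            wf.setframerate(sample_rate)
            wf.writeframes(audio)
        self.chunk_index += 1
        return path


# With AudioBufferProcessor(buffer_size=640000) -- about 20 seconds of 16 kHz
# mono 16-bit audio -- on_audio_data fires periodically; inside the handler,
# call recorder.write_chunk(audio, sample_rate, num_channels).
```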
For cloud storage, consider using multipart uploads to stream audio chunks. Conceptually: initiate a multipart upload when the session starts, upload each buffered chunk as a part as it arrives, and complete the upload when the session ends. This keeps memory usage bounded, persists audio as soon as it’s captured, and can produce a single object in storage without a separate stitching step.
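The steps above can be sketched as a small helper shaped around the S3 multipart-upload API. MultipartAudioUploader is a hypothetical class, not part of Pipecat; `client` can be any object with the S3 multipart methods (such as a boto3 S3 client). Note that S3 requires each part except the last to be at least 5 MB.

```python
class MultipartAudioUploader:
    """Stream audio chunks to object storage via a multipart upload (sketch)."""

    def __init__(self, client, bucket: str, key: str):
        self.client = client
        self.bucket = bucket
        self.key = key
        self.parts = []
        # Initiate the upload when the session starts.
        self.upload_id = client.create_multipart_upload(
            Bucket=bucket, Key=key)["UploadId"]

    def upload_chunk(self, audio: bytes) -> None:
        # S3 part numbers start at 1; parts except the last must be >= 5 MB.
        part_number = len(self.parts) + 1
        resp = self.client.upload_part(
            Bucket=self.bucket, Key=self.key, UploadId=self.upload_id,
            PartNumber=part_number, Body=audio)
        self.parts.append({"ETag": resp["ETag"], "PartNumber": part_number})

    def complete(self) -> None:
        # Stitch the parts into one object when the session ends.
        self.client.complete_multipart_upload(
            Bucket=self.bucket, Key=self.key, UploadId=self.upload_id,
            MultipartUpload={"Parts": self.parts})
```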
After uploading chunks as separate files, create the final audio file using tools like FFmpeg. FFmpeg’s concat demuxer joins the chunks without re-encoding: write a text file listing each chunk, then run ffmpeg with -f concat and -c copy. This step is a good candidate for automation, for example a post-session job that downloads the chunks, concatenates them, uploads the final recording, and cleans up the intermediate files.
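As a sketch, the helper below writes the concat list file and builds the ffmpeg command. Stream copy (-c copy) works when all chunks share the same sample rate, channel count, and sample format, which is the case when they come from one recording session; the function and file names are illustrative.

```python
import subprocess


def build_concat_command(chunk_paths, list_path, output_path):
    """Write an ffmpeg concat-demuxer list file and return the command to run."""
    with open(list_path, "w") as f:
        for path in chunk_paths:
            f.write(f"file '{path}'\n")
    # -c copy concatenates without re-encoding.
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", list_path, "-c", "copy", output_path,
    ]


# To actually run it (requires ffmpeg on PATH):
# subprocess.run(build_concat_command(chunks, "chunks.txt", "conversation.wav"), check=True)
```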
Explore a complete working example that demonstrates how to record and save both composite and track-level audio with Pipecat.
Read the complete API reference documentation for advanced configuration options and event handlers.
Consider implementing audio recording in your application for quality assurance, training data collection, or creating conversation archives. The recorded audio can be stored locally, uploaded to cloud storage, or processed in real-time for further analysis.