The AudioBufferProcessor allows you to record and buffer audio from both input (user) and output (bot) sources during conversations. It supports both mono and stereo recording with configurable sample rates.

Usage

To record audio, create an instance of AudioBufferProcessor and add it to your pipeline:

from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor

# Create an audio buffer processor
audiobuffer = AudioBufferProcessor(
    sample_rate=44100,  # Optional: desired output sample rate
    num_channels=2,     # 1 for mono, 2 for stereo
    buffer_size=0       # Size in bytes to trigger buffer callbacks
)

# Add to pipeline
pipeline = Pipeline([
    transport.input(),  # microphone
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    audiobuffer,  # used to buffer the audio in the pipeline
    context_aggregator.assistant(),
])

# Example: Save recorded audio to WAV file
async def save_audio(audio: bytes, sample_rate: int, num_channels: int):
    if len(audio) > 0:
        filename = f"conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav"
        with io.BytesIO() as buffer:
            with wave.open(buffer, "wb") as wf:
                wf.setsampwidth(2)
                wf.setnchannels(num_channels)
                wf.setframerate(sample_rate)
                wf.writeframes(audio)
            async with aiofiles.open(filename, "wb") as file:
                await file.write(buffer.getvalue())
        print(f"Merged audio saved to {filename}")

# Handle the recorded audio chunks
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
    await save_audio(audio, sample_rate, num_channels)

Recording Controls

Start Recording

Begin recording audio from the conversation:

await audiobuffer.start_recording()

Stop Recording

Stop the current recording session:

await audiobuffer.stop_recording()

Audio Event Handling

You can handle recorded audio by registering an event handler for the on_audio_data event:

@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio: bytes, sample_rate: int, num_channels: int):
    # Handle the recorded audio
    # audio: Raw audio bytes
    # sample_rate: Sample rate in Hz
    # num_channels: Number of audio channels (1 or 2)

Configuration Options

sample_rate
int

The desired output sample rate in Hz. If not specified, uses the transport’s sample rate.

num_channels
int
default:
"1"

Number of audio channels: - 1: Mono (mixed user and bot audio) - 2: Stereo (user audio in left channel, bot audio in right channel)

buffer_size
int
default:
"0"

Size in bytes that triggers the on_audio_data event: - 0: Only triggers when recording stops - >0: Triggers whenever buffer reaches this size

Properties

sample_rate
int

The current sample rate being used for recording

num_channels
int

The number of channels being recorded (1 or 2)