Aggregates DTMF (phone keypad) input into meaningful sequences for LLM processing
`DTMFAggregator` processes incoming DTMF (Dual-Tone Multi-Frequency) frames from phone keypad input and aggregates them into complete sequences that can be understood by LLM services. It buffers individual digit presses and flushes them as transcription frames when a termination digit is pressed, a timeout occurs, or an interruption happens.
This aggregator is essential for telephony applications where users interact via phone keypad buttons, converting raw DTMF input into structured text that LLMs can process alongside voice transcriptions.
The aggregator consumes `InputDTMFFrame` instances.

KeypadEntry | Value | Description |
---|---|---|
ZERO through NINE | "0" - "9" | Numeric digits |
STAR | "*" | Star/asterisk key |
POUND | "#" | Pound/hash key |
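
As a rough illustration of the table above, the keypad entries can be modeled as a string-valued enum. This is a standalone sketch that only mirrors the names and values listed here; the library's own `KeypadEntry` definition may differ in detail.

```python
from enum import Enum


class KeypadEntry(Enum):
    """Illustrative mirror of the keypad table (not the library's definition)."""

    ZERO = "0"
    ONE = "1"
    TWO = "2"
    THREE = "3"
    FOUR = "4"
    FIVE = "5"
    SIX = "6"
    SEVEN = "7"
    EIGHT = "8"
    NINE = "9"
    STAR = "*"
    POUND = "#"


# Look up entries by their character value, then join them back into a string.
pressed = [KeypadEntry(ch) for ch in "42#"]
sequence = "".join(key.value for key in pressed)
```

Looking entries up by value, as in `KeypadEntry("#")`, raises `ValueError` for any character outside the twelve keys, which is a convenient way to reject malformed input early.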
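The aggregator flushes its buffer when any of the following occurs:

- The termination digit (`#`) is pressed
- The timeout elapses with no further input
- `StartInterruptionFrame` is received
- `EndFrame` is received

User Input | Aggregation Trigger | Output TranscriptionFrame |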
---|---|---|
`1`, `2`, `3`, `#` | Termination digit | "DTMF: 123#" |
`*`, `0` | 2-second timeout | "DTMF: *0" |
`5`, interruption | `StartInterruptionFrame` | "DTMF: 5" |
`9`, `9`, `EndFrame` | Pipeline shutdown | "DTMF: 99" |
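
The four rows above can be reproduced with a small, self-contained simulation of the buffer-and-flush behavior. This is a minimal sketch, not the library's implementation: the class `DTMFBufferSketch`, its method names, and the explicit `now` timestamps are all hypothetical, and the 2-second timeout and `"DTMF: "` prefix are taken from the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DTMFBufferSketch:
    """Hypothetical stand-in for the aggregator's buffering logic."""

    timeout: float = 2.0            # seconds of silence before a flush (from the table)
    termination_digit: str = "#"    # key that flushes immediately
    prefix: str = "DTMF: "          # prefix seen in the example outputs
    _digits: List[str] = field(default_factory=list)
    _last_press: Optional[float] = None

    def flush(self) -> Optional[str]:
        """Emit the buffered digits as one string, or None if the buffer is empty."""
        if not self._digits:
            return None
        text = self.prefix + "".join(self._digits)
        self._digits.clear()
        self._last_press = None
        return text

    def tick(self, now: float) -> Optional[str]:
        """Flush if at least `timeout` seconds have passed since the last press."""
        if self._last_press is not None and now - self._last_press >= self.timeout:
            return self.flush()
        return None

    def press(self, digit: str, now: float) -> List[str]:
        """Record a key press at time `now`; return any sequences flushed by it."""
        flushed = []
        expired = self.tick(now)
        if expired is not None:
            flushed.append(expired)
        self._digits.append(digit)
        self._last_press = now
        if digit == self.termination_digit:
            flushed.append(self.flush())
        return flushed


buf = DTMFBufferSketch()

# Row 1: the termination digit flushes immediately and is included in the output.
terminated = []
for i, key in enumerate(["1", "2", "3", "#"]):
    terminated += buf.press(key, now=i * 0.5)

# Row 2: two seconds of silence after the last press triggers a flush.
buf.press("*", now=10.0)
buf.press("0", now=10.5)
timed_out = buf.tick(now=12.5)

# Row 3: an interruption flushes whatever is buffered.
buf.press("5", now=20.0)
interrupted = buf.flush()

# Row 4: pipeline shutdown flushes the remaining digits.
buf.press("9", now=30.0)
buf.press("9", now=30.2)
ended = buf.flush()
```

Passing timestamps explicitly keeps the sketch deterministic; a real pipeline component would more likely rely on an asynchronous timer than on manual `tick` calls.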
Use `#` for confirmation and `*` for cancel/back operations.
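
One way an application might act on this convention is to map each aggregated string to an action before handing it to the rest of the system. The sketch below is purely illustrative: the function name `interpret_sequence` and the action labels `"confirm"`, `"cancel"`, and `"partial"` are hypothetical and not part of any library API.

```python
from typing import Tuple


def interpret_sequence(text: str, prefix: str = "DTMF: ") -> Tuple[str, str]:
    """Map an aggregated DTMF string to a hypothetical (action, digits) pair."""
    digits = text[len(prefix):] if text.startswith(prefix) else text
    if "*" in digits:
        # Any star press is treated as cancel/back; entered digits are discarded.
        return ("cancel", "")
    if digits.endswith("#"):
        # A trailing pound confirms the digits entered before it.
        return ("confirm", digits[:-1])
    # No confirmation key: the entry ended by timeout, interruption, or shutdown.
    return ("partial", digits)


confirmed = interpret_sequence("DTMF: 123#")
cancelled = interpret_sequence("DTMF: *0")
partial = interpret_sequence("DTMF: 99")
```

Treating unconfirmed input as `"partial"` lets the caller decide whether to accept it or prompt the user again.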