Overview

STTMuteFilter is a general-purpose processor that combines STT muting and interruption control. While it is muting, user speech is neither transcribed nor allowed to interrupt the bot, so the bot can finish speaking without being cut off.

The processor supports multiple strategies for when to mute the STT service, making it flexible for different use cases.

Constructor Parameters

stt_service
STTService
required

The STT service to control

config
STTMuteConfig
required

Configuration object that defines the muting strategy and optional custom logic

Configuration

The processor is configured using STTMuteConfig, which determines when and how the STT service should be muted:

strategy
STTMuteStrategy

The muting strategy to use

should_mute_callback
Callable[[STTMuteFilter], Awaitable[bool]]
default: None

Optional callback for custom muting logic (required when strategy is CUSTOM)

Muting Strategies

STTMuteConfig accepts one of these STTMuteStrategy values:

FIRST_SPEECH
STTMuteStrategy

Mute only during the bot’s first speech (typically during introduction)

ALWAYS
STTMuteStrategy

Mute during all bot speech

CUSTOM
STTMuteStrategy

Use custom logic provided via callback to determine when to mute
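
The three strategies can be read as a small decision rule. The sketch below is an illustration only, not the library's implementation: MuteStrategy, should_mute, first_speech_done, and custom_check are hypothetical names, and the custom check is synchronous here for brevity, whereas the real callback is awaited.

```python
from enum import Enum, auto
from typing import Callable, Optional


class MuteStrategy(Enum):
    """Hypothetical stand-in for STTMuteStrategy (not the library's enum)."""
    FIRST_SPEECH = auto()
    ALWAYS = auto()
    CUSTOM = auto()


def should_mute(
    strategy: MuteStrategy,
    bot_is_speaking: bool,
    first_speech_done: bool,
    custom_check: Optional[Callable[[], bool]] = None,
) -> bool:
    """Return True if STT should be muted under the given strategy."""
    if strategy is MuteStrategy.ALWAYS:
        # Mute whenever the bot is speaking.
        return bot_is_speaking
    if strategy is MuteStrategy.FIRST_SPEECH:
        # Mute only while the bot delivers its first utterance.
        return bot_is_speaking and not first_speech_done
    if strategy is MuteStrategy.CUSTOM:
        # The decision is delegated entirely to user-supplied logic.
        if custom_check is None:
            raise ValueError("CUSTOM strategy requires a callback")
        return custom_check()
    return False
```

Under these assumptions, FIRST_SPEECH stops muting once the introduction has finished, while ALWAYS keeps muting on every later bot turn.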

Configuration Examples

# Mute during first speech only
config = STTMuteConfig(strategy=STTMuteStrategy.FIRST_SPEECH)

# Always mute during bot speech
config = STTMuteConfig(strategy=STTMuteStrategy.ALWAYS)

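# Custom muting logic (the callback must be async: it is awaited for its bool result)
from datetime import datetime

async def custom_mute_logic(processor: STTMuteFilter) -> bool:
    # Mute while the bot is speaking, but only before 17:00 local time
    return processor._bot_is_speaking and datetime.now().hour < 17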

config = STTMuteConfig(
    strategy=STTMuteStrategy.CUSTOM,
    should_mute_callback=custom_mute_logic
)

Input Frames

BotStartedSpeakingFrame
Frame

Indicates bot has started speaking

BotStoppedSpeakingFrame
Frame

Indicates bot has stopped speaking

StartInterruptionFrame
Frame

User interruption start event (suppressed when muted)

StopInterruptionFrame
Frame

User interruption stop event (suppressed when muted)

UserStartedSpeakingFrame
Frame

Indicates user has started speaking (suppressed when muted)

UserStoppedSpeakingFrame
Frame

Indicates user has stopped speaking (suppressed when muted)

Output Frames

STTMuteFrame
Frame

Control frame to mute/unmute the STT service

All input frames pass through unchanged, except that VAD-related frames (interruption and user-speaking events) are dropped while the filter is muted.
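
The pass-through rule can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the processor's actual code: the Frame dataclass and filter_frames function are hypothetical stand-ins that identify frames by name only, and "AudioFrame" in the usage note is a made-up example of a non-VAD frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Minimal stand-in for a pipeline frame; only the name matters here."""
    name: str


# Frame types listed above as suppressed while muted.
SUPPRESSED_WHEN_MUTED = {
    "StartInterruptionFrame",
    "StopInterruptionFrame",
    "UserStartedSpeakingFrame",
    "UserStoppedSpeakingFrame",
}


def filter_frames(frames: List[Frame], muted: bool) -> List[Frame]:
    """Drop VAD-related frames while muted; pass everything else through."""
    if not muted:
        return list(frames)
    return [f for f in frames if f.name not in SUPPRESSED_WHEN_MUTED]
```

For example, given UserStartedSpeakingFrame, a hypothetical AudioFrame, and StopInterruptionFrame while muted, only the AudioFrame would continue downstream in this model.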

Usage Examples

Basic Usage (Mute During First Speech)

import os

stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
stt_mute_filter = STTMuteFilter(
    stt_service=stt,
    config=STTMuteConfig(strategy=STTMuteStrategy.FIRST_SPEECH)
)

pipeline = Pipeline([
    transport.input(),
    stt_mute_filter,  # Add before STT service
    stt,
    # ... rest of pipeline
])

Always Mute During Bot Speech

stt_mute_filter = STTMuteFilter(
    stt_service=stt,
    config=STTMuteConfig(strategy=STTMuteStrategy.ALWAYS)
)

Custom Muting Logic

from datetime import datetime

async def custom_mute_logic(processor: STTMuteFilter) -> bool:
    # Example: mute during business hours only (09:00 to 16:59 local time)
    current_hour = datetime.now().hour
    return processor._bot_is_speaking and (9 <= current_hour < 17)

stt_mute_filter = STTMuteFilter(
    stt_service=stt,
    config=STTMuteConfig(
        strategy=STTMuteStrategy.CUSTOM,
        should_mute_callback=custom_mute_logic
    )
)
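
A custom callback can be exercised on its own before it is wired into a pipeline. The following is a minimal sketch, assuming the callback only reads the _bot_is_speaking attribute used in the example above: SimpleNamespace is a stub standing in for a real STTMuteFilter, and in_business_hours is a hypothetical helper introduced here so the time rule can be checked without depending on the clock.

```python
import asyncio
from datetime import datetime
from types import SimpleNamespace


def in_business_hours(hour: int) -> bool:
    """True for 09:00 through 16:59 local time."""
    return 9 <= hour < 17


async def custom_mute_logic(processor) -> bool:
    # Same rule as the callback above, with the hour check factored out.
    return processor._bot_is_speaking and in_business_hours(datetime.now().hour)


# Stub object: provides only the attribute the callback reads.
silent_bot = SimpleNamespace(_bot_is_speaking=False)

# When the bot is not speaking, the callback never asks to mute,
# whatever the current time is.
result = asyncio.run(custom_mute_logic(silent_bot))
```

Factoring the time rule into a plain function keeps the async callback thin and lets the boundary hours (9 and 17) be checked directly.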

Notes

  • Combines STT muting and interruption control into a single concept
  • Muting prevents both transcription and interruptions
  • The strategy is selected at initialization through STTMuteConfig
  • The CUSTOM strategy allows arbitrary muting logic through a callback
  • Place the filter before the STT service in the pipeline
  • Maintains conversation flow during bot speech
  • Efficient state tracking for minimal overhead