Overview

LocalCoreMLSmartTurnAnalyzer runs Smart Turn inference directly on your Mac using Apple’s CoreML framework. Because inference happens on-device, there is no external API dependency and no network round trip, which makes it a good fit for local development and for applications where network access is limited or latency is critical.

Installation

pip install "pipecat-ai[local-smart-turn]"

Requirements

  • Apple Silicon Mac (M1/M2/M3 series)
  • macOS 11.0 or later
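
The requirements above can be checked at startup before loading the model. The following is a minimal sketch using only the Python standard library; the helper names `is_apple_silicon_mac` and `macos_at_least` are illustrative and not part of Pipecat.

```python
import platform


def is_apple_silicon_mac(system=None, machine=None):
    """Return True on macOS with an arm64 (Apple Silicon) CPU.

    Note: a Python interpreter running under Rosetta reports "x86_64"
    even on Apple Silicon hardware, so this check can return False there.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def macos_at_least(major, minor=0, version=None):
    """Return True if the macOS version is at least major.minor."""
    version = platform.mac_ver()[0] if version is None else version
    if not version:
        # mac_ver() returns an empty string on non-macOS systems.
        return False
    parts = [int(p) for p in version.split(".")[:2]]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= (major, minor)


if __name__ == "__main__":
    print("Apple Silicon Mac:", is_apple_silicon_mac())
    print("macOS 11.0 or later:", macos_at_least(11, 0))
```

Both helpers accept explicit values so they can be exercised on any machine; called without arguments they inspect the current system.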

Local Model Setup

LocalCoreMLSmartTurnAnalyzer loads the CoreML model from a local directory. To set it up:

  1. Install Git LFS (Large File Storage):

    brew install git-lfs
    
  2. Initialize Git LFS:

    git lfs install
    
  3. Clone the Smart Turn model repository:

    git clone https://huggingface.co/pipecat-ai/smart-turn
    
  4. Set the LOCAL_SMART_TURN_MODEL_PATH environment variable to the path of the cloned repository:

    # Add to your .env file or environment
    export LOCAL_SMART_TURN_MODEL_PATH=/path/to/smart-turn
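
After completing the steps above, it can help to verify the setup before constructing the analyzer. The sketch below is one possible check, not part of Pipecat: `resolve_model_path` and `looks_like_lfs_pointer` are hypothetical helper names. It confirms that the environment variable points at an existing directory and flags files that are still Git LFS pointer stubs, which is what a clone contains when Git LFS was not installed or initialized first.

```python
import os
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def resolve_model_path(env=None, var="LOCAL_SMART_TURN_MODEL_PATH"):
    """Return the model directory named by the environment variable."""
    env = os.environ if env is None else env
    raw = env.get(var)
    if not raw:
        raise RuntimeError(f"{var} is not set")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"{var} does not point to a directory: {path}")
    return path


def looks_like_lfs_pointer(data):
    """Return True if the given bytes are the start of a Git LFS pointer file."""
    return data.startswith(LFS_POINTER_PREFIX)


def find_lfs_pointers(root):
    """List files under root that were not replaced by real LFS content."""
    pointers = []
    for file in Path(root).rglob("*"):
        if ".git" in file.parts or not file.is_file():
            continue
        with file.open("rb") as handle:
            if looks_like_lfs_pointer(handle.read(len(LFS_POINTER_PREFIX))):
                pointers.append(file)
    return pointers


if __name__ == "__main__":
    model_dir = resolve_model_path()
    stubs = find_lfs_pointers(model_dir)
    if stubs:
        print("Git LFS content missing; run `git lfs pull` in", model_dir)
    else:
        print("Model directory looks ready:", model_dir)
```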
    

Configuration

Constructor Parameters

smart_turn_model_path (str, required)

Path to the directory containing the Smart Turn model files

sample_rate (Optional[int], default: None)

Audio sample rate. If not provided, it is set by the transport.

params (SmartTurnParams, default: SmartTurnParams())

Configuration parameters for turn detection. See SmartTurnParams for details.

Example

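import os
from pipecat.audio.turn.smart_turn.local_coreml_smart_turn import LocalCoreMLSmartTurnAnalyzer
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.transports.base_transport import TransportParams
# Import path for SmallWebRTCTransport may differ between pipecat versions
from pipecat.transports.network.small_webrtc import SmallWebRTCTransport

# Get the path to the Smart Turn model (the cloned repository directory)
smart_turn_model_path = os.getenv("LOCAL_SMART_TURN_MODEL_PATH")
if not smart_turn_model_path:
    raise RuntimeError("LOCAL_SMART_TURN_MODEL_PATH is not set")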

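# Create transport with local Smart Turn detection.
# `webrtc_connection` is an existing WebRTC connection created elsewhere
# in your application (not shown here).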
transport = SmallWebRTCTransport(
    webrtc_connection=webrtc_connection,
    params=TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        vad_audio_passthrough=True,
        turn_analyzer=LocalCoreMLSmartTurnAnalyzer(
            smart_turn_model_path=smart_turn_model_path,
            params=SmartTurnParams(
                stop_secs=2.0,  # Shorter stop time when using Smart Turn
                pre_speech_ms=0.0,
                max_duration_secs=8.0
            )
        ),
    ),
)

Performance Considerations

  • Latency: Very low, since inference runs locally with no network round trip
  • Resource Usage: Uses local CPU/GPU resources
  • Reliability: No dependency on external services or network connectivity

Notes

  • Optimal for development environments and latency-sensitive applications
  • The CoreML model is optimized for Apple Silicon but will work on Intel Macs with reduced performance
  • First inference may be slower as the model is loaded and compiled
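
To see the effect of the slower first inference, you can time the first call separately from later ones. The sketch below is illustrative only: `FakeAnalyzer` is a hypothetical stand-in that simulates a one-time load cost, not Pipecat's API, and `time_calls` works with any callable you substitute for it. If the first-call cost matters in your application, one option is to issue a throwaway call at startup so the delay does not land on the first user turn.

```python
import time


class FakeAnalyzer:
    """Hypothetical stand-in that is slow only on its first call."""

    def __init__(self, load_secs=0.05):
        self._load_secs = load_secs
        self._loaded = False

    def predict(self, audio):
        if not self._loaded:
            # Simulate the one-time model load and compile step.
            time.sleep(self._load_secs)
            self._loaded = True
        return len(audio) > 0


def time_calls(fn, arg, repeats=5):
    """Call fn(arg) repeatedly; return (first duration, later durations)."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        durations.append(time.perf_counter() - start)
    return durations[0], durations[1:]


if __name__ == "__main__":
    analyzer = FakeAnalyzer()
    silence = bytes(3200)  # short buffer of zeroed audio bytes
    first, rest = time_calls(analyzer.predict, silence)
    print(f"first call: {first * 1000:.1f} ms")
    print(f"later calls (max): {max(rest) * 1000:.3f} ms")
```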