Overview

LMNT provides real-time text-to-speech synthesis through a WebSocket-based streaming API optimized for conversational AI. The service offers ultra-low latency with high-quality voice models and supports multiple languages with automatic interruption handling.

Installation

To use LMNT services, install the required dependencies:
pip install "pipecat-ai[lmnt]"
You’ll also need to set up your LMNT API key as an environment variable: LMNT_API_KEY.
Get your API key from the LMNT Console.
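After installing, export the key in your shell before starting your app (the value below is a placeholder; substitute your own key):

```shell
# Replace with your actual key from the LMNT Console
export LMNT_API_KEY="your-lmnt-api-key"
```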

Frames

Input

  • TextFrame - Text content to synthesize into speech
  • TTSSpeakFrame - Text that should be spoken immediately
  • TTSUpdateSettingsFrame - Runtime configuration updates
  • LLMFullResponseStartFrame / LLMFullResponseEndFrame - LLM response boundaries

Output

  • TTSStartedFrame - Signals start of synthesis
  • TTSAudioRawFrame - Generated audio data chunks (streaming PCM)
  • TTSStoppedFrame - Signals completion of synthesis
  • ErrorFrame - WebSocket or API errors
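The output frames above always arrive in a fixed order: one TTSStartedFrame, then a stream of TTSAudioRawFrame chunks, then one TTSStoppedFrame. The sketch below models that ordering with simplified stand-in classes; the real frame types live in pipecat.frames.frames and carry more fields than shown here.

```python
from dataclasses import dataclass

# Simplified stand-ins for the Pipecat frame types listed above.
@dataclass
class TTSStartedFrame:
    pass

@dataclass
class TTSAudioRawFrame:
    audio: bytes
    sample_rate: int
    num_channels: int

@dataclass
class TTSStoppedFrame:
    pass

def synthesize(text: str, sample_rate: int = 24000):
    """Model the frame order a streaming TTS service emits:
    one start frame, then audio chunks, then one stop frame."""
    yield TTSStartedFrame()
    # A real service streams PCM from the WebSocket; here we fake
    # small silent chunks, one per word, purely to show the ordering.
    for _ in text.split():
        yield TTSAudioRawFrame(audio=b"\x00" * 480,
                               sample_rate=sample_rate, num_channels=1)
    yield TTSStoppedFrame()

frames = list(synthesize("hello world"))
```

Downstream processors (e.g. the transport output) rely on this bracketing to know when playback for an utterance begins and ends.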

Language Support

The most commonly used supported languages:
  • Language.EN - English
  • Language.ES - Spanish
  • Language.FR - French
  • Language.DE - German
  • Language.ZH - Chinese
  • Language.JA - Japanese
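If your application receives full BCP-47 tags (e.g. "en-US") from user settings, you may want to normalize them to a base language before configuring the service. The helper below is a hypothetical sketch, not part of Pipecat or LMNT, and only covers the six languages listed above:

```python
# Hypothetical helper: normalize a BCP-47 tag (e.g. "en-US") to the
# two-letter base code matching the Language values listed above.
SUPPORTED = {"en", "es", "fr", "de", "zh", "ja"}

def to_base_language(tag: str, default: str = "en") -> str:
    base = tag.split("-")[0].lower()
    return base if base in SUPPORTED else default
```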

Usage Example

Basic Configuration

Initialize the LmntTTSService and use it in a pipeline:
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.lmnt.tts import LmntTTSService
from pipecat.transcriptions.language import Language

# Configure service
tts = LmntTTSService(
    api_key=os.getenv("LMNT_API_KEY"),
    voice_id="morgan",
    model="aurora",
    language=Language.EN,
    sample_rate=24000
)

# Use in pipeline
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])
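With sample_rate=24000 as configured above, each TTSAudioRawFrame carries raw PCM whose playback duration you can estimate from its byte length. The sketch below assumes 16-bit mono PCM, which is typical for streaming TTS audio:

```python
def chunk_duration_ms(audio: bytes, sample_rate: int = 24000,
                      sample_width: int = 2, channels: int = 1) -> float:
    """Playback duration of a raw PCM chunk in milliseconds.
    Assumes 16-bit (2-byte) mono samples unless told otherwise."""
    num_samples = len(audio) // (sample_width * channels)
    return num_samples / sample_rate * 1000

# 24000 samples of 16-bit mono audio = 48000 bytes = one second
print(chunk_duration_ms(b"\x00" * 48000))  # → 1000.0
```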

Dynamic Configuration

Update settings at runtime by pushing a TTSUpdateSettingsFrame to the LmntTTSService:
from pipecat.frames.frames import TTSUpdateSettingsFrame

await task.queue_frame(
    TTSUpdateSettingsFrame(settings={"voice": "your-new-voice-id"})
)
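Conceptually, a settings update overlays the keys you provide onto the service's current configuration, leaving everything else untouched. The sketch below illustrates that merge semantics in plain Python; it is an illustration of the idea, not the LmntTTSService implementation:

```python
def apply_settings(current: dict, update: dict) -> dict:
    """Return a new settings dict with update applied over current;
    keys the service does not know about are ignored."""
    merged = dict(current)
    for key, value in update.items():
        if key in merged:
            merged[key] = value
    return merged

settings = {"voice": "morgan", "language": "en", "sample_rate": 24000}
settings = apply_settings(settings, {"voice": "your-new-voice-id"})
```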

Metrics

The service provides real-time metrics:
  • Time to First Byte (TTFB) - Latency from text input to first audio
  • Processing Duration - Total synthesis time
  • Character Usage - Text processed for billing
Learn how to enable Metrics in your Pipeline.
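TTFB is simply the elapsed time between sending text to the service and receiving the first audio chunk back. The standalone timer below sketches that measurement with a monotonic clock; Pipecat collects this metric for you, so this is only an illustration of what the number means:

```python
import time

class TTFBTimer:
    """Measure time-to-first-byte: start when text is sent,
    record elapsed time on the first audio chunk received."""
    def __init__(self):
        self._start = None
        self.ttfb = None

    def start(self):
        self._start = time.monotonic()
        self.ttfb = None

    def on_audio_chunk(self):
        # Only the first chunk after start() sets the measurement.
        if self.ttfb is None and self._start is not None:
            self.ttfb = time.monotonic() - self._start

timer = TTFBTimer()
timer.start()
# ... synthesis happens; the first chunk arrives:
timer.on_audio_chunk()
timer.on_audio_chunk()  # later chunks do not change the measurement
```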

Additional Notes

  • WebSocket Streaming: Uses persistent WebSocket connection for ultra-low latency
  • Custom Voices: Supports custom voice creation through LMNT dashboard
  • Language Detection: Automatically handles language variants and fallbacks