Kokoro - Pipecat

Overview

KokoroTTSService provides local, offline text-to-speech synthesis using the kokoro-onnx engine. It runs entirely on the host machine with no external API calls or authentication required. Model files are automatically downloaded to ~/.cache/pipecat/kokoro-onnx/ on first use.

Kokoro TTS API Reference

Pipecat’s API methods for Kokoro TTS integration

Example Implementation

Complete example with interruption handling

kokoro-onnx Repository

Official kokoro-onnx project and documentation

Settings Update Example

Example showing runtime settings updates

Installation

To use Kokoro TTS, install the required dependencies:

uv add "pipecat-ai[kokoro]"

This installs kokoro-onnx>=0.5.0 and its dependencies.

Prerequisites

Local Setup

Kokoro runs locally and does not require an API key or external service. On first use, the service automatically downloads two model files to ~/.cache/pipecat/kokoro-onnx/:

kokoro-v1.0.onnx — the ONNX speech synthesis model
voices-v1.0.bin — the voice data file

You can also provide custom paths to pre-downloaded model files via the model_path and voices_path constructor parameters.

The initial model download may take a few minutes depending on your connection speed. Subsequent runs use the cached files.
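To see ahead of time whether the first run will trigger a download, you can check the cache directory yourself. The following is a minimal sketch using only the directory and file names given above; the helper name `missing_model_files` is hypothetical and is not part of Pipecat.

```python
from pathlib import Path

# Cache location and file names as stated in this document.
CACHE_DIR = Path.home() / ".cache" / "pipecat" / "kokoro-onnx"
MODEL_FILES = ("kokoro-v1.0.onnx", "voices-v1.0.bin")


def missing_model_files(cache_dir: Path = CACHE_DIR) -> list[str]:
    """Return the names of expected model files not present in cache_dir."""
    return [name for name in MODEL_FILES if not (cache_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Will be downloaded on first use:", ", ".join(missing))
    else:
        print("Both model files are already cached in", CACHE_DIR)
```

The same check can be pointed at a directory of pre-downloaded files before passing them as model_path and voices_path.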

Configuration

KokoroTTSService

model_path (str, default: None)

Path to a custom ONNX model file. When None, the model is automatically downloaded to ~/.cache/pipecat/kokoro-onnx/kokoro-v1.0.onnx.

voices_path (str, default: None)

Path to a custom voices binary file. When None, the file is automatically downloaded to ~/.cache/pipecat/kokoro-onnx/voices-v1.0.bin.

voice_id (str, default: None, deprecated)

Voice identifier for synthesis. Deprecated in v0.0.105. Use settings=KokoroTTSService.Settings(voice=...) instead.

params (InputParams, default: None, deprecated)

Deprecated in v0.0.105. Use settings=KokoroTTSService.Settings(...) instead.

settings (KokoroTTSService.Settings, default: None)

Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using KokoroTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `None` | Model identifier. (Inherited from base settings.) |
| `voice` | `str` | `None` | Voice identifier (e.g. `"af_heart"`). |
| `language` | `Language \| str` | `Language.EN` | Language for synthesis. See supported languages. |
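One way to picture a mid-conversation update is as a partial overlay: only the fields set in the update replace the current values. The sketch below illustrates that idea with a hypothetical `SketchSettings` dataclass that mirrors the three fields in the table above. It is not Pipecat's `KokoroTTSService.Settings` class, and whether Pipecat merges updates this way is an assumption.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in with the same three fields as the settings table."""

    model: Optional[str] = None
    voice: Optional[str] = None
    language: Optional[str] = None


def apply_update(current: SketchSettings, update: SketchSettings) -> SketchSettings:
    """Return a copy of current with every non-None field of update applied."""
    changes = {
        f.name: getattr(update, f.name)
        for f in fields(update)
        if getattr(update, f.name) is not None
    }
    return replace(current, **changes)


# Switch the language mid-conversation while keeping the configured voice.
start = SketchSettings(voice="af_heart", language="en")
updated = apply_update(start, SketchSettings(language="es"))
print(updated)
```

In this sketch the voice stays "af_heart" and only the language changes, which is the behavior you would typically want from a runtime settings update.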

Supported Languages

Kokoro supports the following languages:

| Language | Code |
|---|---|
| English (US) | `Language.EN_US` |
| English (UK) | `Language.EN_GB` |
| English (generic) | `Language.EN` |
| Spanish | `Language.ES` |
| French | `Language.FR` |
| Hindi | `Language.HI` |
| Italian | `Language.IT` |
| Japanese | `Language.JA` |
| Portuguese | `Language.PT` |
| Chinese | `Language.ZH` |
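If language choices come from user input or configuration, it can help to validate them against this list before building the service. The following is a small sketch that hard-codes the member names from the table above; the helper `is_supported` is hypothetical and does not query Pipecat or kokoro-onnx.

```python
# Language member names copied from the supported-languages table.
SUPPORTED_LANGUAGES = frozenset(
    {"EN_US", "EN_GB", "EN", "ES", "FR", "HI", "IT", "JA", "PT", "ZH"}
)


def is_supported(language_name: str) -> bool:
    """Accept names such as "ES", "Language.ES", or "en-us"."""
    name = language_name.strip()
    if name.startswith("Language."):
        name = name[len("Language."):]
    # Normalize case and separators so "en-us" matches "EN_US".
    return name.upper().replace("-", "_") in SUPPORTED_LANGUAGES


print(is_supported("Language.ES"))
print(is_supported("de"))
```

Because the set is copied by hand, it needs updating if the supported-language list changes.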

Usage

Basic Setup

from pipecat.services.kokoro import KokoroTTSService

tts = KokoroTTSService(
    settings=KokoroTTSService.Settings(
        voice="af_heart",
    ),
)

With Language Configuration

from pipecat.services.kokoro import KokoroTTSService
from pipecat.transcriptions.language import Language

tts = KokoroTTSService(
    settings=KokoroTTSService.Settings(
        voice="af_heart",
        language=Language.ES,
    ),
)

With Custom Model Paths

from pipecat.services.kokoro import KokoroTTSService

tts = KokoroTTSService(
    model_path="/path/to/kokoro-v1.0.onnx",
    voices_path="/path/to/voices-v1.0.bin",
    settings=KokoroTTSService.Settings(
        voice="af_heart",
    ),
)

The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

Fully local: Kokoro runs entirely on the host machine using ONNX Runtime. No API keys, network access, or external services are required after the initial model download.
Automatic model caching: Model files are downloaded once to ~/.cache/pipecat/kokoro-onnx/ and reused on subsequent runs. You can also pre-download models and specify custom paths.
Audio resampling: Kokoro’s native output is automatically resampled to match the pipeline’s configured sample rate.
Streaming output: The service uses kokoro-onnx’s async streaming API, delivering audio frames incrementally as they are generated.
Metrics support: The service supports TTFB (time to first byte) and usage metrics for performance monitoring.
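To make the audio resampling note concrete, the sketch below resamples a mono signal by linear interpolation. It only illustrates the idea and is not the resampler Pipecat uses. The 24 kHz source rate in the usage line is an assumption (the document does not state Kokoro's native rate), and production resamplers normally add anti-aliasing filtering that this sketch omits.

```python
def resample_linear(samples: list[float], src_rate: int, dst_rate: int) -> list[float]:
    """Resample mono audio by linear interpolation between neighboring samples."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples or src_rate == dst_rate:
        return list(samples)

    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# Assumed 24 kHz source resampled to a 16 kHz pipeline rate.
print(resample_linear([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 24000, 16000))
# → [0.0, 1.5, 3.0, 4.5]
```

Six input samples at 24 kHz become four output samples at 16 kHz, which preserves the duration of the audio while changing its sample rate.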
