Pipecat TTS Cache

Overview

TTSCacheMixin is a lightweight caching layer that transparently wraps an existing Pipecat TTS service to eliminate API costs for repeated phrases and reduce response latency for cached audio. It is a utility mixin rather than a TTS provider: it does not synthesize speech itself, but caches the audio produced by another TTS service (such as Cartesia, ElevenLabs, Deepgram, Google, or OpenAI) and replays it on subsequent requests. Audio can be cached in process with MemoryCacheBackend (LRU) or shared across instances with RedisCacheBackend.

Source Repository

Source code, examples, and issues for the TTS Cache integration

PyPI Package

The pipecat-tts-cache package on PyPI

Installation

This is a community-maintained package distributed separately from pipecat-ai:

# Standard installation (Memory backend only)
pip install pipecat-tts-cache

# Production installation (with Redis support)
pip install "pipecat-tts-cache[redis]"

How It Works

TTSCacheMixin is applied alongside an existing Pipecat TTS service class to produce a cached variant. It intercepts frames in the pipeline to transparently cache and replay audio:

Deterministic key generation: Before requesting audio, a cache key is generated from the normalized text, voice ID, model, sample rate, and settings. API keys are excluded from the key.
Cache check (run_tts): On a cache hit, the mixin immediately pushes the cached audio frames (and any word timestamps) to the pipeline. On a miss, it calls the wrapped parent TTS service.
Collection (push_frame): As the parent service produces audio, the mixin intercepts and aggregates the frames, then stores them in the cache backend for future use.
Interruption handling: When an InterruptionFrame is received, the mixin clears pending cache write tasks and resets its batch state so no partial audio is committed.
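The key-generation step above can be sketched as follows. This is a minimal illustration, not the package's actual scheme: the function name make_cache_key, the normalization rule (trim, collapse whitespace, lowercase), the JSON serialization, and the SHA-256 hash are all assumptions made for the example. The only properties taken from the description are that the key depends on text, voice ID, model, sample rate, and settings, and that API keys never enter it.

```python
import hashlib
import json


def make_cache_key(text, voice_id, model, sample_rate, settings, namespace=None):
    """Build a deterministic cache key (hypothetical helper, for illustration only)."""
    # Assumed normalization: trim, collapse runs of whitespace, lowercase.
    normalized = " ".join(text.split()).lower()
    payload = {
        "text": normalized,
        "voice_id": voice_id,
        "model": model,
        "sample_rate": sample_rate,
        "settings": settings,
        # Note: no API key or other credential is part of the payload.
    }
    # sort_keys makes the serialization independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()
    # Optional namespace prefix, mirroring the namespace option described later.
    return f"{namespace}:{digest}" if namespace else digest
```

Under these assumptions, two requests that differ only in spacing, letter case, or the order of settings map to the same key, while a different voice or model yields a different one.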

You create a cached service by subclassing the mixin together with any TTSService subclass:

from pipecat_tts_cache import TTSCacheMixin
from pipecat.services.google.tts import GoogleHttpTTSService

class CachedGoogleTTS(TTSCacheMixin, GoogleHttpTTSService):
    pass
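
To show why the mixin is listed first in the class definition, here is a toy model of the hit/miss flow that runs without Pipecat installed. FakeTTS, ToyCacheMixin, and collect are hypothetical stand-ins, and the sketch caches inside run_tts only; the real mixin also intercepts push_frame and works with Pipecat frames rather than raw bytes.

```python
import asyncio


class FakeTTS:
    """Stand-in for a real TTS service; counts how often synthesis runs."""

    def __init__(self, **kwargs):
        self.calls = 0

    async def run_tts(self, text):
        self.calls += 1
        for word in text.split():
            yield word.encode("utf-8")  # pretend each chunk is audio


class ToyCacheMixin:
    """Simplified, hypothetical mixin; not TTSCacheMixin's real code."""

    def __init__(self, *, cache=None, **kwargs):
        super().__init__(**kwargs)  # remaining kwargs go to the wrapped service
        self._cache = cache

    async def run_tts(self, text):
        if self._cache is not None and text in self._cache:
            for chunk in self._cache[text]:  # hit: replay stored chunks
                yield chunk
            return
        collected = []
        async for chunk in super().run_tts(text):  # miss: call the parent service
            collected.append(chunk)
            yield chunk
        if self._cache is not None:
            self._cache[text] = collected  # stored only after a complete run


class CachedFakeTTS(ToyCacheMixin, FakeTTS):
    pass


async def collect(tts, text):
    """Drain the generator into a list of chunks."""
    return [chunk async for chunk in tts.run_tts(text)]
```

Because the mixin comes first in the method resolution order, its run_tts runs before the parent's and reaches the parent through super(). In this toy version, a consumer that stops iterating early never reaches the store step, so partial audio is not cached.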

Configuration

TTSCacheMixin adds the following keyword arguments to the constructor of the wrapped TTS service. All other positional and keyword arguments are passed through to the parent class.

cache_backend (CacheBackend, default: None)

Cache backend instance (MemoryCacheBackend or RedisCacheBackend). If None, caching is disabled and calls pass straight through to the parent service.

cache_ttl (int, default: 86400)

Time-to-live for cache entries, in seconds. Defaults to 24 hours.

str, default: None

Optional namespace prefix applied to cache keys.

MemoryCacheBackend

In-memory LRU cache with TTL support, suitable for local development and single-process bots.

max_size (int, default: 1000)

Maximum number of cache entries to store before LRU eviction.
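
For readers unfamiliar with how LRU eviction and TTL expiry combine, the sketch below shows one common way to implement them with an OrderedDict. ToyLRUCache is a hypothetical class written for this example; it is synchronous and does not reflect MemoryCacheBackend's actual interface or internals. The injectable clock parameter is also an assumption, added so expiry can be exercised without sleeping.

```python
import time
from collections import OrderedDict


class ToyLRUCache:
    """Illustrative LRU cache with per-entry TTL (not the package's implementation)."""

    def __init__(self, max_size=1000, clock=time.monotonic):
        self._max_size = max_size
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop the entry and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value, ttl=86400):
        self._entries[key] = (self._clock() + ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

Reading an entry refreshes its position, so frequently replayed phrases survive eviction while rarely used ones are dropped first once max_size is reached.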

RedisCacheBackend

Distributed Redis cache that persists across restarts and can be shared across multiple bot instances. Requires the redis extra.

redis_url (str, default: "redis://localhost:6379/0")

Redis connection URL.

key_prefix (str, default: "pipecat:tts:cache:")

Prefix applied to all cache keys.

int, default: 10

Maximum number of Redis connections.

float, default: 5.0

Socket timeout in seconds.

dict

Additional keyword arguments forwarded to the underlying Redis client.

Usage

Basic in-memory cache

from pipecat_tts_cache import TTSCacheMixin, MemoryCacheBackend
from pipecat.services.google.tts import GoogleHttpTTSService

# 1. Create a cached class using the mixin
class CachedGoogleTTS(TTSCacheMixin, GoogleHttpTTSService):
    pass

# 2. Initialize with a memory backend
tts = CachedGoogleTTS(
    voice_id="en-US-Chirp3-HD-Charon",
    cache_backend=MemoryCacheBackend(max_size=1000),
    cache_ttl=86400,  # Cache for 24 hours
)

Distributed Redis cache

from pipecat_tts_cache import TTSCacheMixin, RedisCacheBackend
from pipecat.services.google.tts import GoogleHttpTTSService

class CachedGoogleTTS(TTSCacheMixin, GoogleHttpTTSService):
    pass

tts = CachedGoogleTTS(
    voice_id="en-US-Chirp3-HD-Charon",
    cache_backend=RedisCacheBackend(
        redis_url="redis://localhost:6379/0",
        key_prefix="pipecat:tts:",
    ),
    cache_ttl=604800,  # Cache for 1 week
)

Monitoring and maintenance

# Check performance
stats = await tts.get_cache_stats()
print(f"Hit Rate: {stats['hit_rate']:.1%}")
print(f"Total Saved Calls: {stats['hits']}")

# Clear all entries, or a specific namespace
await tts.clear_cache()
await tts.clear_cache(namespace="user_123")
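
As a rough guide to reading these numbers, a hit rate is conventionally the share of lookups served from the cache. The helper below is a hypothetical sketch of that arithmetic; the 'hits' and 'hit_rate' keys appear in the snippet above, while the 'misses' key and the function itself are assumptions, not part of the package.

```python
def summarize_stats(hits, misses):
    """Compute a hit rate from raw counters (hypothetical helper)."""
    total = hits + misses
    # Guard against division by zero before any lookups have happened.
    hit_rate = hits / total if total else 0.0
    return {"hits": hits, "misses": misses, "hit_rate": hit_rate}


stats = summarize_stats(hits=75, misses=25)
print(f"Hit Rate: {stats['hit_rate']:.1%}")  # prints "Hit Rate: 75.0%"
```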

Compatibility

The caching layer works with all Pipecat TTS services, applying a different caching strategy depending on the service architecture:

Service type	Caching strategy	Supported providers (examples)
`AudioContextWordTTS`	Batch caching — splits audio at word boundaries per sentence	Cartesia, Rime
`WordTTSService`	Full caching with preserved word-level timestamps	ElevenLabs, Hume
`TTSService`	Standard caching of the full audio response (no alignment data)	Google, OpenAI, Deepgram (HTTP)
`InterruptibleTTS`	Sentence caching — single-sentence responses only	Sarvam, Deepgram (WebSocket)

Tested with Pipecat v0.0.91+. Check the source repository for the latest tested version and changelog.
