# XTTS

> Text-to-speech service implementation using Coqui's XTTS streaming server

Coqui, the XTTS maintainer, has shut down. XTTS may not receive future updates or support.

## Overview

`XTTSService` provides multilingual voice synthesis with voice cloning capabilities through a locally hosted streaming server. The service supports real-time streaming and custom voice training using Coqui's XTTS-v2 model for cross-lingual text-to-speech.

## Installation

XTTS requires a running streaming server. Start the server using Docker:

```bash
docker run --gpus=all -e COQUI_TOS_AGREED=1 --rm -p 8000:80 \
  ghcr.io/coqui-ai/xtts-streaming-server:latest-cuda121
```

## Prerequisites

### XTTS Server Setup

Before using `XTTSService`, you need:

1. **Docker Environment**: Set up Docker with GPU support for optimal performance
2. **XTTS Server**: Run the XTTS streaming server container
3. **Voice Models**: Configure voice models and cloning samples as needed

### Required Configuration

* **Server URL**: Configure the XTTS server endpoint (default: `http://localhost:8000`)
* **Voice Selection**: Set up voice models or voice cloning samples

GPU acceleration is recommended. The server performs best with CUDA support.

## Configuration

### XTTSService

* `voice_id`: ID of the studio speaker to use for synthesis. *Deprecated in v0.0.105. Use `settings=XTTSService.Settings(voice=...)` instead.*
* `base_url`: Base URL of the XTTS streaming server (e.g. `http://localhost:8000`).
* `aiohttp_session`: An aiohttp session for HTTP requests to the XTTS server.
* `language`: Language for synthesis. Supports Czech, German, English, Spanish, French, Hindi, Hungarian, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Russian, Turkish, and Chinese. *Deprecated in v0.0.105. Use `settings=XTTSService.Settings(language=...)` instead.*
* `settings`: Runtime-configurable settings. See [Settings](#settings) below.
* `sample_rate`: Output audio sample rate in Hz. When `None`, uses the pipeline's configured sample rate. Audio is automatically resampled from XTTS's native 24kHz output.

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `XTTSService.Settings(...)`. These can be updated mid-conversation with `TTSUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter  | Type              | Default | Description                            |
| ---------- | ----------------- | ------- | -------------------------------------- |
| `model`    | `str`             | `None`  | Model identifier. *(Inherited.)*       |
| `voice`    | `str`             | `None`  | Voice identifier. *(Inherited.)*       |
| `language` | `Language \| str` | `None`  | Language for synthesis. *(Inherited.)* |

## Usage

### Basic Setup

```python
import asyncio

import aiohttp
from pipecat.services.xtts import XTTSService


async def main():
    # The session must stay open for as long as the service is in use.
    async with aiohttp.ClientSession() as session:
        tts = XTTSService(
            settings=XTTSService.Settings(
                voice="Ana Florence",  # must be one of the server's studio speakers
            ),
            base_url="http://localhost:8000",
            aiohttp_session=session,
        )
        # Build and run the pipeline with `tts` here.


asyncio.run(main())
```

### With Language Configuration

```python
import asyncio

import aiohttp
from pipecat.services.xtts import XTTSService
from pipecat.transcriptions.language import Language


async def main():
    # The session must stay open for as long as the service is in use.
    async with aiohttp.ClientSession() as session:
        tts = XTTSService(
            settings=XTTSService.Settings(
                voice="Ana Florence",
                language=Language.ES,  # synthesize Spanish
            ),
            base_url="http://localhost:8000",
            aiohttp_session=session,
        )
        # Build and run the pipeline with `tts` here.


asyncio.run(main())
```

The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use `Settings` / `settings=` instead. See the [Service Settings guide](/pipecat/fundamentals/service-settings) for migration details.
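The supported-language list in the Configuration section can be turned into a quick pre-flight check. The following is a minimal sketch and not part of Pipecat: `SUPPORTED_LANGUAGES` and `is_supported_language` are hypothetical names, and the codes are plain ISO 639-1 codes for the sixteen listed languages. The exact identifiers the XTTS server expects are not specified on this page, so treat the codes as an assumption.

```python
# Hypothetical helper: ISO 639-1 codes for the languages listed on this page.
# Assumption: the identifiers the XTTS server itself expects may differ.
SUPPORTED_LANGUAGES = {
    "cs": "Czech", "de": "German", "en": "English", "es": "Spanish",
    "fr": "French", "hi": "Hindi", "hu": "Hungarian", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "nl": "Dutch", "pl": "Polish",
    "pt": "Portuguese", "ru": "Russian", "tr": "Turkish", "zh": "Chinese",
}


def is_supported_language(code: str) -> bool:
    """Return True if the base language of `code` is in the supported list.

    Region suffixes such as "es-ES" or "pt_BR" are ignored.
    """
    base = code.strip().lower().replace("_", "-").split("-")[0]
    return base in SUPPORTED_LANGUAGES
```

Calling `is_supported_language("es-ES")` before building `XTTSService.Settings(language=...)` lets an application reject an unsupported language before any request is sent to the server.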
## Notes

* **Local server required**: XTTS requires a locally running streaming server (via Docker). The service connects to this server over HTTP.
* **Studio speakers**: On startup, the service fetches available "studio speakers" from the server's `/studio_speakers` endpoint. The configured voice (the `voice` setting, or the deprecated `voice_id` argument) must match one of these speakers.
* **Audio resampling**: XTTS natively outputs audio at 24kHz. The service automatically resamples to match the pipeline's configured sample rate.
* **GPU recommended**: The XTTS server performs best with CUDA-enabled GPU acceleration. CPU inference is significantly slower.
* **No API key required**: XTTS runs locally, so no external API credentials are needed.
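The studio-speaker requirement noted above can be checked outside the service, before a pipeline is started. This is a minimal sketch under stated assumptions: `fetch_studio_speakers` and `resolve_voice` are hypothetical names, and the `/studio_speakers` response is assumed to be a JSON object keyed by speaker name, which this page does not confirm.

```python
import json
import urllib.request


def fetch_studio_speakers(base_url: str = "http://localhost:8000") -> dict:
    """GET <base_url>/studio_speakers and decode the JSON body.

    Assumption: the body is a JSON object whose keys are speaker names.
    """
    url = base_url.rstrip("/") + "/studio_speakers"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def resolve_voice(speakers: dict, requested: str) -> str:
    """Return the speaker name matching `requested`.

    Matching ignores case and surrounding whitespace. Raises ValueError,
    listing the available names, when nothing matches.
    """
    wanted = requested.strip().casefold()
    for name in speakers:
        if name.casefold() == wanted:
            return name
    raise ValueError(
        f"Unknown voice {requested!r}; available: {sorted(speakers)}"
    )
```

With the server running, `resolve_voice(fetch_studio_speakers(), "Ana Florence")` would return the exact speaker name to pass as the `voice` setting, or raise an error that lists the valid choices.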