# XTTS-vLLM

> Streaming text-to-speech using a self-hosted XTTSv2 + vLLM server

> Community-maintained integration. Pipecat does not test or officially support this service. Please report issues and request changes on the [source repository](https://github.com/wuxuedaifu/pipecat-xtts-vllm).

## Overview

`XTTSVLLMTTSService` streams audio from a self-hosted [XTTSv2-vLLM streaming server](https://github.com/wuxuedaifu/xttsv2-vllm-streaming-server), which serves Coqui XTTSv2 with vLLM for real-time, low-latency synthesis (\~0.45s time-to-first-byte on the maintainer's test hardware).

It is a thin HTTP client: the heavy model server runs separately (as a Docker image, typically on a GPU host), and the service talks to it over an OpenAI-compatible streaming endpoint, outputting `TTSAudioRawFrame` audio into your Pipecat pipeline.

Voice-cloning conditioning is computed once from a short reference sample and cached for the lifetime of the service, so per-utterance requests stay fast.

Related resources:

* [Source repository](https://github.com/wuxuedaifu/pipecat-xtts-vllm): source code, examples, and issues for the XTTS-vLLM integration
* The `pipecat-xtts-vllm` package on PyPI
* [XTTSv2-vLLM streaming server](https://github.com/wuxuedaifu/xttsv2-vllm-streaming-server): the server this client connects to

## Installation

This is a community-maintained package distributed separately from `pipecat-ai`:

```bash
uv add pipecat-xtts-vllm
```

## Prerequisites

This service is a client for a self-hosted model server; there is no third-party account or API key.

1. **Run the model server.** Deploy the [XTTSv2-vLLM streaming server](https://github.com/wuxuedaifu/xttsv2-vllm-streaming-server) (Docker image, GPU recommended) and note its URL for `base_url`. See the server repository for deployment instructions.
2. **Provide a reference voice.** A \~6-second reference audio clip (as bytes) is used for voice cloning. Alternatively, supply precomputed `conditioning`.

The integration code is MIT-licensed, but the underlying XTTSv2 **model weights** are distributed under the Coqui Public Model License (non-commercial use only). Review the server repository for licensing details before production use.

## Configuration

* `base_url`: Base URL of the running XTTSv2-vLLM streaming server, e.g. `http://localhost:8000`.
* `reference_audio`: Reference voice sample (\~6 seconds) used to compute voice-cloning conditioning. Required unless `conditioning` is provided.
* `conditioning`: Optional precomputed conditioning (`gpt_cond_latent_b64` + `speaker_embeddings_b64`). If set, it takes precedence over `reference_audio` and skips the conditioning request.
* `language`: Language code for synthesis. XTTSv2 supports 17 languages: `en` (English), `es` (Spanish), `fr` (French), `de` (German), `it` (Italian), `pt` (Portuguese), `pl` (Polish), `tr` (Turkish), `ru` (Russian), `nl` (Dutch), `cs` (Czech), `ar` (Arabic), `zh-cn` (Chinese, Simplified), `hu` (Hungarian), `ko` (Korean), `ja` (Japanese), and `hi` (Hindi). Pass `auto` to let the server auto-detect the language.
* Token-delta streaming chunk size sent to the server.
* Speech speed multiplier.
* Output audio sample rate in Hz (XTTSv2 native is 24 kHz, 16-bit mono PCM).
* Optional shared aiohttp session used for requests. If not provided, the service creates and manages its own session.

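The configuration rules above (the 17 supported language codes plus `auto`, the \~6-second reference clip, and `conditioning` taking precedence over `reference_audio`) can be checked before the service is constructed. The following is a minimal sketch using only the Python standard library. The helper names `check_language`, `wav_duration_seconds`, and `voice_source` are hypothetical and are not part of `pipecat-xtts-vllm`, and the duration helper assumes the reference clip is a WAV file.

```python
import io
import wave
from typing import Optional

# Language codes listed in the Configuration section ("auto" is handled separately).
SUPPORTED_LANGUAGES = frozenset({
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "hu", "ko", "ja", "hi",
})


def check_language(code: str) -> str:
    """Return the code unchanged if it is documented as supported, else raise."""
    if code != "auto" and code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return code


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Duration of an in-memory WAV clip, computed from its header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def voice_source(reference_audio: Optional[bytes], conditioning: Optional[object]) -> str:
    """Mirror the documented precedence: conditioning wins over reference_audio."""
    if conditioning is not None:
        return "conditioning"
    if reference_audio is not None:
        return "reference_audio"
    raise ValueError("provide reference_audio or conditioning")
```

Running checks like these at startup surfaces a mistyped language code or a missing voice source before any request reaches the model server. The service may perform its own validation; this sketch only restates the rules documented on this page.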
## Usage

```python
from pathlib import Path

from pipecat.pipeline.pipeline import Pipeline
from pipecat_xtts_vllm import XTTSVLLMTTSService

tts = XTTSVLLMTTSService(
    base_url="http://localhost:8000",
    reference_audio=Path("reference.wav").read_bytes(),
    language="en",
)

# transport, stt, llm, and context_aggregator are assumed to be configured elsewhere in your app.
pipeline = Pipeline(
    [
        transport.input(),  # audio/user input
        stt,  # speech to text
        context_aggregator.user(),  # add user text to context
        llm,  # LLM generates response
        tts,  # XTTS-vLLM synthesis
        transport.output(),  # stream audio back to user
        context_aggregator.assistant(),  # store assistant response
    ]
)
```

To reuse precomputed conditioning instead of a reference clip, import `XTTSVLLMConditioning` alongside the service (`from pipecat_xtts_vllm import XTTSVLLMConditioning, XTTSVLLMTTSService`) and pass it via the `conditioning=` argument.

See the [foundational example](https://github.com/wuxuedaifu/pipecat-xtts-vllm/tree/main/examples/foundational) in the source repository for a complete, runnable script.

## Compatibility

Tested with `pipecat-ai` v1.4.0. Check the [source repository](https://github.com/wuxuedaifu/pipecat-xtts-vllm) for the latest tested version and changelog.
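
Precomputed conditioning, as described under Usage and Configuration, consists of two base64 fields: `gpt_cond_latent_b64` and `speaker_embeddings_b64`. If you already hold those two strings, one way to persist them between runs is a small JSON document. The sketch below is standard-library only and assumes the values are plain base64 text. `dump_conditioning` and `parse_conditioning` are hypothetical helper names, not functions from `pipecat-xtts-vllm`.

```python
import base64
import json
from typing import Dict

# Field names taken from the Configuration section of this page.
REQUIRED_FIELDS = ("gpt_cond_latent_b64", "speaker_embeddings_b64")


def dump_conditioning(gpt_cond_latent_b64: str, speaker_embeddings_b64: str) -> str:
    """Serialize the two base64 fields to a JSON string."""
    return json.dumps({
        "gpt_cond_latent_b64": gpt_cond_latent_b64,
        "speaker_embeddings_b64": speaker_embeddings_b64,
    })


def parse_conditioning(text: str) -> Dict[str, str]:
    """Parse the JSON string back, checking both fields exist and are valid base64."""
    payload = json.loads(text)
    fields = {}
    for name in REQUIRED_FIELDS:
        if name not in payload:
            raise ValueError(f"missing field: {name}")
        # Raises binascii.Error (a ValueError subclass) on malformed base64.
        base64.b64decode(payload[name], validate=True)
        fields[name] = payload[name]
    return fields
```

The string returned by `dump_conditioning` can be written to disk with `Path.write_text` and read back at startup. How the parsed fields are handed to `XTTSVLLMConditioning` depends on that class's constructor, which this page does not document; check the source repository before wiring it in.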