# xAI

> Speech-to-text service implementation using xAI's real-time WebSocket API

## Overview

`XAISTTService` provides real-time speech-to-text transcription using xAI's WebSocket STT API with support for interim results, configurable endpointing, multichannel audio, and speaker diarization.

The service streams raw audio (PCM, μ-law, or A-law) to xAI's endpoint and emits interim and final transcription frames based on the server's `is_final` and `speech_final` flags. The connection is persistent: audio is streamed continuously and the server automatically detects utterance boundaries.

## Installation

To use xAI STT services, install the required dependencies:

```bash
uv add "pipecat-ai[xai]"
```

## Prerequisites

### xAI Account Setup

Before using xAI STT services, you need:

1. **xAI Account**: Sign up at [xAI](https://x.ai/)
2. **API Key**: Generate an API key from your account dashboard
3. **Language Selection**: Choose from 16 supported languages

### Required Environment Variables

* `XAI_API_KEY`: Your xAI API key for authentication

## Configuration

### XAISTTService

The constructor accepts the following parameters:

* `api_key`: xAI API key for authentication (used as the Bearer token for the WebSocket handshake).
* WebSocket endpoint URL for xAI STT.
* `sample_rate`: Audio sample rate in Hz. Supported values: 8000, 16000, 22050, 24000, 44100, 48000.
* Audio encoding format. One of `"pcm"` (signed 16-bit LE), `"mulaw"`, or `"alaw"`.
* `settings`: Runtime-configurable settings for the STT service. See [Settings](#settings) below.
* P99 latency from speech end to final transcript, in seconds. Override for your deployment. See [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark).
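When choosing a sample rate and encoding, it can help to know how many bytes a chunk of audio of a given duration occupies. The sketch below is illustrative only: `bytes_per_chunk` is a hypothetical helper, not part of Pipecat or the xAI API, and the 20 ms default chunk length is an assumption. It relies on 16-bit PCM using 2 bytes per sample and μ-law or A-law using 1 byte per sample.

```python
# Hypothetical helper (not part of Pipecat): estimate raw audio chunk sizes
# for the sample rates and encodings listed in the configuration above.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 44100, 48000)

# 16-bit PCM uses 2 bytes per sample; mu-law and A-law use 1 byte per sample.
BYTES_PER_SAMPLE = {"pcm": 2, "mulaw": 1, "alaw": 1}


def bytes_per_chunk(
    sample_rate: int,
    encoding: str = "pcm",
    chunk_ms: int = 20,  # assumed chunk duration, chosen for illustration
    channels: int = 1,
) -> int:
    """Return the size in bytes of one audio chunk of `chunk_ms` milliseconds."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if encoding not in BYTES_PER_SAMPLE:
        raise ValueError(f"unsupported encoding: {encoding}")
    samples = sample_rate * chunk_ms // 1000
    return samples * BYTES_PER_SAMPLE[encoding] * channels


if __name__ == "__main__":
    print(bytes_per_chunk(16000, "pcm"))
    print(bytes_per_chunk(8000, "mulaw"))
```

For interleaved multichannel audio, pass the channel count so the estimate covers all channels in the chunk.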
### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `XAISTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter         | Type              | Default       | Description |
| ----------------- | ----------------- | ------------- | ----------- |
| `model`           | `str`             | `None`        | Not applicable for xAI STT. *(Inherited from base STT settings.)* |
| `language`        | `Language \| str` | `Language.EN` | Recognition language. Supports: AR, BN, DE, EN, ES, FR, HI, ID, IT, JA, KO, PT, RU, TR, VI, ZH. *(Inherited from base STT settings.)* |
| `interim_results` | `bool`            | `True`        | When True, partial transcripts are emitted approximately every 500 ms. |
| `endpointing`     | `int \| None`     | `None`        | Silence duration in milliseconds that triggers a speech-final event. Range 0-5000. Server default is 10 ms. |
| `multichannel`    | `bool \| None`    | `None`        | When True, transcribes each interleaved channel independently. Requires `channels` >= 2. |
| `channels`        | `int \| None`     | `None`        | Number of interleaved channels (2-8). Required when `multichannel` is True. |
| `diarize`         | `bool \| None`    | `None`        | When True, the server attaches a `speaker` field to each word identifying the detected speaker. |

## Usage

### Basic Setup

```python
import os

from pipecat.services.xai.stt import XAISTTService

stt = XAISTTService(
    api_key=os.getenv("XAI_API_KEY"),
)
```

### With Custom Settings

```python
import os

from pipecat.services.xai.stt import XAISTTService
from pipecat.transcriptions.language import Language

stt = XAISTTService(
    api_key=os.getenv("XAI_API_KEY"),
    sample_rate=24000,
    settings=XAISTTService.Settings(
        language=Language.ES,
        interim_results=True,
        endpointing=1000,
        diarize=True,
    ),
)
```

### With Multichannel Audio

```python
import os

from pipecat.services.xai.stt import XAISTTService

stt = XAISTTService(
    api_key=os.getenv("XAI_API_KEY"),
    settings=XAISTTService.Settings(
        multichannel=True,
        channels=2,
    ),
)
```

## Notes

* **Connection management**: The WebSocket connection is persistent and automatically reconnects if it drops mid-session. Audio is streamed continuously and the server emits `transcript.partial` events with `is_final` and `speech_final` flags to mark utterance boundaries.
* **Language support**: xAI STT accepts two-letter language codes. When set, the server applies Inverse Text Normalization for improved accuracy.
* **Audio encoding**: Supports PCM (signed 16-bit LE), μ-law, and A-law encoding formats. PCM is recommended for best quality.
* **Settings updates**: Changing settings requires reconnecting to the WebSocket. The service automatically handles disconnect and reconnect when settings are updated via `STTUpdateSettingsFrame`.
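Because a settings change triggers a reconnect, it may be worth checking values against the documented ranges before building or updating the service. The following is a minimal sketch under that assumption: `validate_stt_settings` is a hypothetical helper, not part of Pipecat, and Pipecat or the xAI server may apply their own validation. It only encodes the constraints stated in the Settings table (endpointing 0-5000 ms, channels 2-8, and `channels` required when `multichannel` is True).

```python
from typing import List, Optional


def validate_stt_settings(
    endpointing: Optional[int] = None,
    multichannel: Optional[bool] = None,
    channels: Optional[int] = None,
) -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems: List[str] = []

    # Endpointing is a silence duration in milliseconds, documented range 0-5000.
    if endpointing is not None and not 0 <= endpointing <= 5000:
        problems.append("endpointing must be between 0 and 5000 ms")

    # The documented channel count range is 2-8.
    if channels is not None and not 2 <= channels <= 8:
        problems.append("channels must be between 2 and 8")

    # Multichannel transcription needs an explicit channel count.
    if multichannel and channels is None:
        problems.append("channels is required when multichannel is True")

    return problems


if __name__ == "__main__":
    for problem in validate_stt_settings(endpointing=6000, multichannel=True):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every invalid value at once; values that pass can then be handed to `XAISTTService.Settings(...)` as in the examples above.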
## Event Handlers

xAI STT supports the standard [service connection events](/api-reference/server/events/service-events):

| Event             | Description                     |
| ----------------- | ------------------------------- |
| `on_connected`    | Connected to xAI WebSocket      |
| `on_disconnected` | Disconnected from xAI WebSocket |

```python
@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to xAI STT")


@stt.event_handler("on_disconnected")
async def on_disconnected(service):
    print("Disconnected from xAI STT")
```