# Groq (Whisper)

> Speech-to-text service implementation using Groq's Whisper API

## Overview

`GroqSTTService` provides high-accuracy speech recognition using Groq's hosted Whisper API, which offers very fast inference. It relies on Voice Activity Detection (VAD) to buffer each speech segment and sends the complete segment for transcription once the user stops speaking.

<CardGroup cols={2}>
  <Card title="Groq STT API Reference" icon="code" href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.groq.stt.html">
    Pipecat's API methods for Groq STT integration
  </Card>

  <Card title="Example Implementation" icon="play" href="https://github.com/pipecat-ai/pipecat/blob/main/examples/voice/voice-groq.py">
    Complete example with Groq ecosystem integration
  </Card>

  <Card title="Groq Documentation" icon="book" href="https://console.groq.com/docs/api-reference#audio-transcription">
    Official Groq STT documentation and features
  </Card>

  <Card title="Groq Console" icon="microphone" href="https://console.groq.com/keys">
    Access API keys and Whisper models
  </Card>
</CardGroup>

## Installation

To use Groq services, install the required dependency:

```bash theme={null}
uv add "pipecat-ai[groq]"
```

## Prerequisites

### Groq Account Setup

Before using Groq STT services, you need:

1. **Groq Account**: Sign up at [Groq Console](https://console.groq.com/)
2. **API Key**: Generate an API key from your console dashboard
3. **Model Access**: Ensure access to Whisper transcription models

### Required Environment Variables

* `GROQ_API_KEY`: Your Groq API key for authentication
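
A minimal sketch of failing fast when the key is missing, so that a misconfigured deployment surfaces at startup rather than at the first transcription request. The helper name `load_groq_api_key` is hypothetical and not part of Pipecat; only the `GROQ_API_KEY` variable name comes from this page.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Groq API key from the environment, or raise if it is unset.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    # Allow a custom mapping to be injected (handy for tests); default to os.environ.
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; export it before starting the bot.")
    return key
```

The returned value can then be passed as `api_key=` when constructing `GroqSTTService`.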

## Configuration

<ParamField path="model" type="str" default="whisper-large-v3-turbo" deprecated>
  Whisper model to use for transcription. *Deprecated in v0.0.105. Use
  `settings=GroqSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="api_key" type="str" default="None">
  Groq API key. If not provided, the `GROQ_API_KEY` environment variable is used.
</ParamField>

<ParamField path="base_url" type="str" default="https://api.groq.com/openai/v1">
  API base URL. Override for custom or proxied deployments.
</ParamField>

<ParamField path="language" type="Language" default="Language.EN" deprecated>
  Language of the audio input. *Deprecated in v0.0.105. Use
  `settings=GroqSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="prompt" type="str" default="None" deprecated>
  Optional text to guide the model's style or continue a previous segment.
  *Deprecated in v0.0.105. Use `settings=GroqSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="temperature" type="float" default="None" deprecated>
  Sampling temperature between 0 and 1. Lower values are more deterministic.
  When left as `None`, the effective default is 0.0. *Deprecated in v0.0.105. Use
  `settings=GroqSTTService.Settings(...)` instead.*
</ParamField>

<ParamField path="settings" type="GroqSTTService.Settings" default="None">
  Runtime-configurable settings for the STT service. See [Settings](#settings)
  below.
</ParamField>

<ParamField path="ttfs_p99_latency" type="float" default="GROQ_TTFS_P99">
  P99 latency from speech end to final transcript in seconds. Override for your
  deployment.
</ParamField>

<ParamField path="push_empty_transcripts" type="bool" default="False">
  If true, allow empty `TranscriptionFrame` frames to be pushed downstream
  instead of discarding them. This is intended for situations where VAD fires
  even though the user did not speak. In these cases, it is useful to know that
  nothing was transcribed so that the agent can resume speaking, instead of
  waiting longer for a transcription.
</ParamField>

### Settings

Runtime-configurable settings passed via the `settings` constructor argument using `GroqSTTService.Settings(...)`. These can be updated mid-conversation with `STTUpdateSettingsFrame`. See [Service Settings](/pipecat/fundamentals/service-settings) for details.

| Parameter     | Type              | Default                    | Description                                                              |
| ------------- | ----------------- | -------------------------- | ------------------------------------------------------------------------ |
| `model`       | `str`             | `"whisper-large-v3-turbo"` | Whisper model to use. *(Inherited from base STT settings.)*              |
| `language`    | `Language \| str` | `Language.EN`              | Language of the audio input. *(Inherited from base STT settings.)*       |
| `prompt`      | `str`             | `None`                     | Optional text to guide the model's style or continue a previous segment. |
| `temperature` | `float`           | `None`                     | Sampling temperature between 0 and 1.                                    |

## Usage

### Basic Setup

```python theme={null}
import os

from pipecat.services.groq.stt import GroqSTTService

# Read the API key from the environment rather than hard-coding it.
stt = GroqSTTService(
    api_key=os.getenv("GROQ_API_KEY"),
)
```

### With Custom Model and Language

```python theme={null}
import os

from pipecat.services.groq.stt import GroqSTTService
from pipecat.transcriptions.language import Language

stt = GroqSTTService(
    api_key=os.getenv("GROQ_API_KEY"),
    settings=GroqSTTService.Settings(
        model="whisper-large-v3-turbo",
        language=Language.ES,  # Transcribe Spanish audio
    ),
)
```

### With Prompt and Temperature

```python theme={null}
import os

from pipecat.services.groq.stt import GroqSTTService

stt = GroqSTTService(
    api_key=os.getenv("GROQ_API_KEY"),
    settings=GroqSTTService.Settings(
        # Context that nudges the model toward domain-specific vocabulary.
        prompt="This is a conversation about artificial intelligence and machine learning.",
        # Lower values are more deterministic.
        temperature=0.0,
    ),
)
```

<Tip>
  The `InputParams` / `params=` pattern is deprecated as of v0.0.105. Use
  `Settings` / `settings=` instead. See the [Service Settings
  guide](/pipecat/fundamentals/service-settings) for migration details.
</Tip>

## Notes

* **Segmented processing**: `GroqSTTService` inherits from `SegmentedSTTService` (via `BaseWhisperSTTService`), which buffers audio during speech (detected by VAD) and sends complete segments for transcription. As a result, it provides no interim results, only a final transcription after each speech segment.
* **Whisper API compatible**: Groq uses the OpenAI-compatible Whisper API format. The service sends audio in WAV format and receives JSON transcription responses.
* **Ultra-fast inference**: Groq's LPU (Language Processing Unit) infrastructure provides significantly faster inference than CPU/GPU-based Whisper deployments, making it suitable for real-time applications despite the segmented processing approach.
* **Prompt guidance**: Use the `prompt` parameter to provide context that helps the model with domain-specific terminology or to maintain consistency across segments.
* **Multilingual support**: Whisper supports 99+ languages. The default is `Language.EN` (English). Set `language=None` in settings to enable automatic language detection, which will transcribe whatever language the user speaks.
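
The segmented flow described in the notes above can be pictured with a small standard-library sketch: audio is buffered only while VAD reports speech, and a single WAV payload is produced when speech stops. This illustrates the idea and is not Pipecat's implementation; the `SegmentBuffer` class, its method names, and the 16 kHz mono 16-bit format are assumptions made for the example.

```python
import io
import wave
from typing import List, Optional


class SegmentBuffer:
    """Collects raw PCM while the user is speaking (hypothetical, for illustration)."""

    def __init__(self, sample_rate: int = 16000, channels: int = 1) -> None:
        self.sample_rate = sample_rate
        self.channels = channels
        self._chunks: List[bytes] = []
        self._speaking = False

    def speech_started(self) -> None:
        # VAD detected speech: start a fresh segment.
        self._chunks = []
        self._speaking = True

    def add_audio(self, pcm: bytes) -> None:
        # Audio arriving outside a speech segment is ignored in this sketch.
        if self._speaking:
            self._chunks.append(pcm)

    def speech_stopped(self) -> Optional[bytes]:
        # VAD detected the end of speech: emit the whole segment as one WAV payload.
        self._speaking = False
        pcm = b"".join(self._chunks)
        self._chunks = []
        if not pcm:
            return None
        out = io.BytesIO()
        with wave.open(out, "wb") as wav:
            wav.setnchannels(self.channels)
            wav.setsampwidth(2)  # 16-bit samples (assumed)
            wav.setframerate(self.sample_rate)
            wav.writeframes(pcm)
        return out.getvalue()
```

Because nothing is emitted until `speech_stopped()` runs, a consumer of this buffer sees one result per utterance, which mirrors why the service yields only final transcriptions.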
