Overview

GroqLLMService provides access to Groq’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

Installation

To use GroqLLMService, install the required dependencies:

pip install "pipecat-ai[groq]"

You’ll also need to set your Groq API key in the GROQ_API_KEY environment variable.
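As a minimal sketch of reading that variable at startup, the snippet below fails early with a clear message when the key is missing. The helper name `load_groq_api_key` is hypothetical and not part of Pipecat; only the `GROQ_API_KEY` variable name comes from the text above.

```python
import os


def load_groq_api_key(env=None):
    """Return the Groq API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; export it before starting the bot")
    return key
```

The returned value can then be passed as the `api_key` constructor parameter described below.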

Configuration

Constructor Parameters

api_key (str, required)

Your Groq API key

model (str, default: "llama-3.1-70b-versatile")

Model identifier

base_url (str, default: "https://api.groq.com/openai/v1")

Groq API endpoint

Input Parameters

Inherits OpenAI-compatible parameters:

frequency_penalty (Optional[float])

Penalizes tokens in proportion to how often they have already appeared; positive values reduce repetition. Range: [-2.0, 2.0]

max_tokens (Optional[int])

Maximum number of tokens to generate. Must be at least 1

presence_penalty (Optional[float])

Penalizes any token that has already appeared, regardless of how often; positive values reduce repetition. Range: [-2.0, 2.0]

temperature (Optional[float])

Controls randomness in the output; higher values are more random. Range: [0.0, 2.0]

top_p (Optional[float])

Controls diversity via nucleus sampling. Range: [0.0, 1.0]
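The documented ranges can be checked before a request is made. The following is a sketch only: `SamplingParams` is a hypothetical container written for this example, not the class Pipecat uses to carry these values, and the bounds are copied from the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional

# Ranges as documented above; None means "leave the service default".
_RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


@dataclass
class SamplingParams:
    """Hypothetical container; not the class Pipecat uses internally."""

    frequency_penalty: Optional[float] = None
    max_tokens: Optional[int] = None
    presence_penalty: Optional[float] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None

    def __post_init__(self):
        # Reject any explicitly set value that falls outside its documented range.
        for name, (low, high) in _RANGES.items():
            value = getattr(self, name)
            if value is not None and not (low <= value <= high):
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        if self.max_tokens is not None and self.max_tokens < 1:
            raise ValueError("max_tokens must be >= 1")

    def as_kwargs(self):
        """Return only the parameters that were explicitly set."""
        return {k: v for k, v in vars(self).items() if v is not None}
```

For example, `SamplingParams(temperature=0.7, max_tokens=256).as_kwargs()` keeps just those two keys, while `SamplingParams(top_p=1.5)` raises a `ValueError`.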

Usage Example

import os

from openai.types.chat import ChatCompletionToolParam
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.groq import GroqLLMService

# Configure service (reads the key from the GROQ_API_KEY environment variable)
llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.1-70b-versatile"
)

# Define tools for function calling
tools = [
    ChatCompletionToolParam(
        type="function",
        function={
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use"
                    }
                },
                "required": ["location", "format"]
            }
        }
    )
]

# Create context with system message and tools
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant in a voice conversation. Keep responses concise."
        }
    ],
    tools=tools
)

# Register function handlers
async def fetch_weather(function_name, tool_call_id, args, llm, context, result_callback):
    await result_callback({"conditions": "nice", "temperature": "75"})

# Passing None as the function name registers the handler for every function call
llm.register_function(None, fetch_weather)

# Create context aggregator for message handling
context_aggregator = llm.create_context_aggregator(context)

# Set up pipeline (transport and tts are assumed to be configured elsewhere)
pipeline = Pipeline([
    transport.input(),
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

# Create and configure task
task = PipelineTask(
    pipeline,
    PipelineParams(
        allow_interruptions=True,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)
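Because the handler in the example above is a plain coroutine that reports through `result_callback`, it can be exercised without a running pipeline. The sketch below repeats the stub handler and calls it with a recording callback; `call_handler` and the `"test-call-id"` value are hypothetical names introduced for this example.

```python
import asyncio


async def fetch_weather(function_name, tool_call_id, args, llm, context, result_callback):
    # Same stub handler as in the usage example above.
    await result_callback({"conditions": "nice", "temperature": "75"})


async def call_handler(handler, function_name, args):
    """Invoke a handler with a recording callback and return what it reported."""
    results = []

    async def record(result):
        results.append(result)

    # This stub ignores llm and context, so None stands in for both.
    await handler(function_name, "test-call-id", args, None, None, record)
    return results


if __name__ == "__main__":
    reported = asyncio.run(
        call_handler(
            fetch_weather,
            "get_current_weather",
            {"location": "San Francisco, CA", "format": "fahrenheit"},
        )
    )
    print(reported)
```

A handler that actually uses `llm` or `context` would need suitable stand-ins instead of `None`.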

Methods

See the LLM base class methods for additional functionality.

Function Calling

Supports OpenAI-compatible function calling:

# Define tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

# Configure context with tools
context = OpenAILLMContext(
    messages=[],
    tools=tools
)

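# Register function handler (same callback signature as in the usage example)
async def handle_weather(function_name, tool_call_id, args, llm, context, result_callback):
    await result_callback({"temperature": 72, "condition": "sunny"})

llm.register_function("get_weather", handle_weather)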
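Model-generated arguments do not always match the tool definition, so a handler may want to check them before acting. The sketch below assumes an OpenAI-style tool dictionary like the one above; `check_tool_args` is a hypothetical helper, and the `required` list is added here purely for illustration. It checks only required keys, unknown keys, and `enum` values, not full JSON Schema.

```python
def check_tool_args(tool, args):
    """Return a list of problems found in `args` for an OpenAI-style tool definition.

    Hypothetical helper: covers required keys, unknown keys, and enum values only.
    """
    schema = tool["function"].get("parameters", {})
    properties = schema.get("properties", {})
    problems = [
        f"missing required argument: {name}"
        for name in schema.get("required", [])
        if name not in args
    ]
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems


weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],  # assumption: added for illustration
        },
    },
}
```

An empty list means the arguments passed these checks; otherwise the handler can return the problems to the model through its result callback instead of failing.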

Available Models

Model Name                             Description
llama3-groq-70b-8192-tool-use-preview  Llama-3-Groq-70B-Tool-Use
llama3-groq-8b-8192-tool-use-preview   Llama-3-Groq-8B-Tool-Use
llama-3.2-90b-vision-preview           Llama 3.2 90B vision model
llama-3.2-11b-vision-preview           Llama 3.2 11B vision model
llama-3.1-70b-versatile                Llama 3.1 70B versatile model
llama-3.1-8b-instant                   Llama 3.1 8B instant model
mixtral-8x7b-chat                      Mixtral 8x7B chat model
gemma-7b-it                            Gemma 7B instruction model

See Groq’s docs for a complete list of supported models.

Frame Flow

Inherits the OpenAI LLM Service frame flow.

Metrics Support

The service collects standard LLM metrics:

  • Token usage (prompt and completion)
  • Processing duration
  • Time to First Byte (TTFB)
  • Function call metrics
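To make the timing metrics concrete, the sketch below measures time to first chunk and total processing duration around a streamed response. The service collects these figures itself; `measure_stream` and `fake_llm_stream` are hypothetical stand-ins, and the measurement is per chunk rather than per byte.

```python
import asyncio
import time


async def measure_stream(stream):
    """Consume an async iterator of text chunks and time it.

    Returns (text, ttfb_seconds, total_seconds). Hypothetical helper for illustration.
    """
    start = time.perf_counter()
    ttfb = None
    chunks = []
    async for chunk in stream:
        if ttfb is None:
            # Time until the first chunk arrived.
            ttfb = time.perf_counter() - start
        chunks.append(chunk)
    total = time.perf_counter() - start
    return "".join(chunks), ttfb, total


async def fake_llm_stream():
    """Stand-in for a streaming LLM response."""
    for piece in ("Hello", ", ", "world"):
        await asyncio.sleep(0.01)
        yield piece
```

Running `asyncio.run(measure_stream(fake_llm_stream()))` returns the joined text together with both timings; the first-chunk time is never larger than the total.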

Notes

  • OpenAI-compatible interface
  • Supports streaming responses
  • Handles function calling
  • Manages conversation context
  • Includes token usage tracking
  • Thread-safe processing
  • Automatic error handling