Groq

Overview

GroqLLMService provides access to Groq’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

API Reference

Complete API documentation and method details

Groq Docs

Official Groq API documentation and features

Example Code

Working example with function calling

Installation

To use Groq services, install the required dependency:

pip install "pipecat-ai[groq]"

You’ll also need to set up your Groq API key as an environment variable: GROQ_API_KEY.

Get your API key for free from Groq Console.

Frames

Input

OpenAILLMContextFrame - Conversation context and history
LLMMessagesFrame - Direct message list
VisionImageRawFrame - Images for vision processing (select models)
LLMUpdateSettingsFrame - Runtime parameter updates

Output

LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
LLMTextFrame - Streamed completion chunks
FunctionCallInProgressFrame / FunctionCallResultFrame - Function call lifecycle
ErrorFrame - API or processing errors

Function Calling

Function Calling Guide

Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.

Context Management

Context Management Guide

Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os
from pipecat.services.groq.llm import GroqLLMService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema

# Configure Groq service for speed
llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.3-70b-versatile",  # Fast, capable model
    params=GroqLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Define function for tool calling
weather_function = FunctionSchema(
    name="get_current_weather",
    description="Get current weather information",
    properties={
        "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
        },
        "format": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit to use"
        }
    },
    required=["location", "format"]
)

tools = ToolsSchema(standard_tools=[weather_function])

# Create context optimized for voice interaction
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a helpful assistant optimized for voice conversations.
            Keep responses concise and avoid special characters that don't work well in speech."""
        }
    ],
    tools=tools
)

# Create context aggregators with fast timeout for speed
from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams

context_aggregator = llm.create_context_aggregator(
    context,
    user_params=LLMUserAggregatorParams(aggregation_timeout=0.05)  # Fast aggregation
)

# Register function handler with feedback
async def fetch_weather(params):
    location = params.arguments["location"]
    await params.result_callback({"conditions": "sunny", "temperature": "75°F"})

llm.register_function("get_current_weather", fetch_weather)

# Optional: Add function call feedback for better UX
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
    await tts.queue_frame(TTSSpeakFrame("Let me check on that."))

# Use in pipeline with Groq STT for full Groq stack
pipeline = Pipeline([
    transport.input(),
    groq_stt,  # GroqSTTService for consistent ecosystem
    context_aggregator.user(),
    llm,
    tts,
    transport.output(),
    context_aggregator.assistant()
])

Metrics

Inherits all OpenAI metrics capabilities:

Time to First Byte (TTFB) - Ultra-low latency measurements
Processing Duration - Hardware-accelerated processing times
Token Usage - Prompt tokens, completion tokens, and totals

Learn how to enable Metrics in your Pipeline.

Additional Notes

OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
Real-time Optimized: Ideal for conversational AI and streaming applications
Open Source Models: Access to Llama, Mixtral, and other open-source models
Vision Support: Select models support image understanding capabilities
Free Tier: Generous free tier available for development and testing

API Reference

Services

Utilities

Frameworks

Pipeline

Overview

API Reference

Groq Docs

Example Code

Installation

Frames

Input

Output

Function Calling

Function Calling Guide

Context Management

Context Management Guide

Usage Example

Metrics

Additional Notes

API Reference

Services

Utilities

Frameworks

Pipeline

​Overview

API Reference

Groq Docs

Example Code

​Installation

​Frames

​Input

​Output

​Function Calling

Function Calling Guide

​Context Management

Context Management Guide

​Usage Example

​Metrics

​Additional Notes

Overview

Installation

Frames

Input

Output

Function Calling

Context Management

Usage Example

Metrics

Additional Notes