Groq
LLM service implementation using Groq’s API with an OpenAI-compatible interface
Overview
GroqLLMService provides access to Groq’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.
Installation
To use GroqLLMService, install the required dependencies and set your Groq API key in the GROQ_API_KEY environment variable.
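Reading the key at startup can be sketched as follows. This is a minimal illustration using only the standard library; the helper name `load_groq_api_key` is hypothetical and not part of any library.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Groq API key from GROQ_API_KEY, or raise if it is missing.

    Hypothetical helper for illustration; `env` defaults to the process
    environment and can be replaced with a plain dict in tests.
    """
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; export it before starting the service"
        )
    return api_key
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first request.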
Configuration
Constructor Parameters
- api_key: Your Groq API key
- model: Model identifier
- base_url: Groq API endpoint
Input Parameters
Inherits OpenAI-compatible parameters:
- frequency_penalty: Reduces the likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]
- max_tokens: Maximum number of tokens to generate. Must be greater than or equal to 1
- presence_penalty: Reduces the likelihood of repeating any token that has already appeared. Range: [-2.0, 2.0]
- temperature: Controls randomness in the output. Range: [0.0, 2.0]
- top_p: Controls diversity via nucleus sampling. Range: [0.0, 1.0]
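The ranges listed above can be checked before a request is sent. The sketch below is not part of the service; it assumes the parameters use the usual OpenAI names (`frequency_penalty`, `presence_penalty`, `temperature`, `top_p`, `max_tokens`), and `validate_input_params` is a hypothetical helper.

```python
# Documented inclusive ranges; the key names are assumed to follow OpenAI's naming.
PARAM_RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


def validate_input_params(params: dict) -> dict:
    """Raise ValueError if any value falls outside its documented range."""
    for name, value in params.items():
        if name == "max_tokens":
            # Must be an integer greater than or equal to 1 (bools are rejected).
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"max_tokens must be an integer >= 1, got {value!r}")
        elif name in PARAM_RANGES:
            low, high = PARAM_RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be in [{low}, {high}], got {value!r}")
        else:
            raise ValueError(f"unknown parameter: {name}")
    return params
```

For example, `validate_input_params({"temperature": 0.7, "max_tokens": 256})` returns the dict unchanged, while `{"top_p": 1.5}` raises `ValueError`.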
Usage Example
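A minimal sketch of constructing the service, under stated assumptions: that the class is provided by the Pipecat package at the import path shown, and that the constructor accepts `api_key` and `model` keywords matching the parameter descriptions above. Both may differ between library versions. The model name comes from the Available Models table, and `create_groq_llm` is a hypothetical helper.

```python
import os


def create_groq_llm(model: str = "llama-3.1-8b-instant"):
    """Build a GroqLLMService using the key stored in GROQ_API_KEY."""
    # Assumption: this import path matches the installed package version.
    # It is imported lazily so this module still loads where the package is absent.
    from pipecat.services.groq.llm import GroqLLMService

    # Assumption: the constructor accepts `api_key` and `model` keywords.
    return GroqLLMService(api_key=os.environ["GROQ_API_KEY"], model=model)
```

Calling `create_groq_llm()` requires the package to be installed and GROQ_API_KEY to be set; the returned service would then be placed in a pipeline like any other LLM service.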
Methods
See the LLM base class methods for additional functionality.
Function Calling
The service supports OpenAI-compatible function calling.
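In the OpenAI-compatible format, each function is described by a JSON-schema tool definition. The sketch below builds one such definition as a plain dict. The function name, its parameters, and the `make_function_tool` helper are hypothetical, and how the tool list is registered with the service is not shown here.

```python
import json


def make_function_tool(name, description, properties, required):
    """Return a tool definition in the OpenAI function-calling format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(required),
            },
        },
    }


# Hypothetical example function; names and fields are for illustration only.
weather_tool = make_function_tool(
    name="get_current_weather",
    description="Get the current weather for a location",
    properties={
        "location": {"type": "string", "description": "City name"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    required=["location"],
)

# The definition is plain JSON-serializable data.
tools_json = json.dumps([weather_tool])
```

When the model decides to call the function, it returns the function name together with a JSON string of arguments that conform to the `parameters` schema.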
Available Models
Model Name | Description |
---|---|
llama3-groq-70b-8192-tool-use-preview | Llama-3-Groq-70B-Tool-Use |
llama3-groq-8b-8192-tool-use-preview | Llama-3-Groq-8B-Tool-Use |
llama-3.2-90b-vision-preview | Llama 3.2 90B vision model |
llama-3.2-11b-vision-preview | Llama 3.2 11B vision model |
llama-3.1-70b-versatile | Llama 3.1 70B versatile model |
llama-3.1-8b-instant | Llama 3.1 8B instant model |
mixtral-8x7b-chat | Mixtral 8x7B chat model |
gemma-7b-it | Gemma 7B instruction model |
See Groq’s docs for a complete list of supported models.
Frame Flow
Inherits the OpenAI LLM Service frame flow.
Metrics Support
The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics
Notes
- OpenAI-compatible interface
- Supports streaming responses
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic error handling