Fireworks AI
LLM service implementation using Fireworks AI’s API with OpenAI-compatible interface
Overview
FireworksLLMService
provides access to Fireworks AI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService
and supports streaming responses, function calling, and context management.
Installation
To use FireworksLLMService
, install the required dependencies:
You’ll also need to set up your Fireworks API key as an environment variable: FIREWORKS_API_KEY
Configuration
Constructor Parameters
Your Fireworks AI API key
Model identifier
Fireworks AI API endpoint
Input Parameters
Inherits all input parameters from BaseOpenAILLMService:
Additional parameters to pass to the model
Reduces likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]
Maximum number of tokens to generate. Must be greater than or equal to 1
Reduces likelihood of repeating any tokens that have appeared. Range: [-2.0, 2.0]
Controls randomness in the output. Range: [0.0, 2.0]
Controls diversity via nucleus sampling. Range: [0.0, 1.0]
Input Frames
Contains OpenAI-specific conversation context
Contains conversation messages
Contains image for vision model processing
Updates model settings
Output Frames
Contains generated text chunks
Indicates start of function call
Contains function call results
Methods
See the LLM base class methods for additional functionality.
Usage Example
Function Calling
Supports OpenAI-compatible function calling with the firefunction-v2
model:
Available Models
Fireworks AI provides access to various models, notably:
Model Name | Description |
---|---|
accounts/fireworks/models/firefunction-v2 | Optimized for function calling |
accounts/fireworks/models/firefunction-v1 | Optimized for function calling |
accounts/fireworks/models/llama-v3p1-8b-instruct | Llama 3.1 8B Instruct |
accounts/fireworks/models/llama-v3p1-70b-instruct | Llama 3.1 70B Instruct |
accounts/fireworks/models/llama-v3p1-405b-instruct | Llama 3.1 405B Instruct |
accounts/fireworks/models/mixtral-8x22b-instruct | Mixtral MoE 8x22B Instruct |
See Fireworks’s console for a complete list of supported models.
Frame Flow
Inherits the BaseOpenAI LLM Service frame flow:
Metrics Support
The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics
Notes
- OpenAI-compatible interface
- Supports streaming responses
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic error handling