Overview

FireworksLLMService provides access to Fireworks AI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management.

Installation

To use FireworksLLMService, install the required dependencies:

pip install "pipecat-ai[fireworks]"

You’ll also need to store your Fireworks API key in the FIREWORKS_API_KEY environment variable.
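One way to read the key at startup is sketched below, using only the standard library. The helper name load_fireworks_api_key is hypothetical and not part of Pipecat; it simply fails early with a clear message when the variable is unset or blank.

```python
import os


def load_fireworks_api_key(env=os.environ):
    """Return the Fireworks API key, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    key = env.get("FIREWORKS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIREWORKS_API_KEY is not set")
    return key
```

The returned value can then be passed as the api_key constructor argument.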

Configuration

Constructor Parameters

api_key
str
required

Your Fireworks AI API key

model
str
default: "accounts/fireworks/models/firefunction-v2"

Model identifier

base_url
str
default: "https://api.fireworks.ai/inference/v1"

Fireworks AI API endpoint
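To make the role of the three constructor parameters concrete, the sketch below builds (but does not send) the kind of HTTP request an OpenAI-compatible endpoint expects. It assumes the standard /chat/completions path and Bearer authentication of the OpenAI API; build_chat_request is a hypothetical helper, and the service handles all of this for you in normal use.

```python
import json
import urllib.request

BASE_URL = "https://api.fireworks.ai/inference/v1"


def build_chat_request(api_key, model, messages, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # assumed OpenAI-style path
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```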

Input Parameters

Inherits all input parameters from BaseOpenAILLMService:

extra
Optional[Dict[str, Any]]

Additional parameters to pass to the model

frequency_penalty
Optional[float]

Reduces likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]

max_tokens
Optional[int]

Maximum number of tokens to generate. Must be greater than or equal to 1

presence_penalty
Optional[float]

Reduces likelihood of repeating any tokens that have appeared. Range: [-2.0, 2.0]

temperature
Optional[float]

Controls randomness in the output. Range: [0.0, 2.0]

top_p
Optional[float]

Controls diversity via nucleus sampling. Range: [0.0, 1.0]
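The documented ranges can be checked before values reach the service. The following is a minimal standalone sketch of such a check, not the library's own validation; RANGES and check_params are hypothetical names introduced here.

```python
# Documented ranges for the float parameters listed above
RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


def check_params(**params):
    """Raise ValueError if a supplied value is outside its documented range."""
    for name, value in params.items():
        if value is None:
            continue  # optional parameters may be left unset
        if name == "max_tokens":
            if value < 1:
                raise ValueError("max_tokens must be >= 1")
        elif name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be in [{low}, {high}]")
    return params
```

For example, check_params(temperature=0.7, max_tokens=1000) returns the parameters unchanged, while check_params(top_p=1.5) raises ValueError.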

Input Frames

OpenAILLMContextFrame
Frame

Contains OpenAI-specific conversation context

LLMMessagesFrame
Frame

Contains conversation messages

VisionImageRawFrame
Frame

Contains image for vision model processing

LLMUpdateSettingsFrame
Frame

Updates model settings

Output Frames

TextFrame
Frame

Contains generated text chunks

FunctionCallInProgressFrame
Frame

Indicates start of function call

FunctionCallResultFrame
Frame

Contains function call results

Methods

See the LLM base class methods for additional functionality.

Usage Example

from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.fireworks import FireworksLLMService

# Configure service
service = FireworksLLMService(
    api_key="your-fireworks-api-key",
    model="accounts/fireworks/models/firefunction-v2",
    params=FireworksLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000,
    ),
)

# Create context
context = OpenAILLMContext(
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is machine learning?"},
    ],
    tools=[],
)

# Pair the context with user and assistant aggregators
context_aggregator = service.create_context_aggregator(context)

# Use in pipeline (transport and tts are assumed to be configured elsewhere)
pipeline = Pipeline(
    [
        transport.input(),
        context_aggregator.user(),
        service,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ]
)

Function Calling

Supports OpenAI-compatible function calling with the firefunction-v2 model:

# Define tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

# Configure context with tools
context = OpenAILLMContext(
    messages=[],
    tools=tools
)

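# Register function handler
# (callback-style signature assumed here; it may differ between Pipecat versions)
async def handle_weather(function_name, tool_call_id, args, llm, context, result_callback):
    # args holds the decoded arguments, e.g. args["location"]
    await result_callback({"temperature": 72, "condition": "sunny"})

service.register_function("get_weather", handle_weather)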
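For orientation, the sketch below shows what function calling amounts to at the protocol level: the model returns a tool call naming a function and carrying its arguments as a JSON string, and the caller decodes them and invokes the matching handler. This is a standalone illustration, not Pipecat's internal implementation; HANDLERS, dispatch_tool_call, and the example values are hypothetical.

```python
import asyncio
import json


async def get_weather(location: str):
    # Placeholder result; a real handler would query a weather service
    return {"temperature": 72, "condition": "sunny"}


# Map function names (as declared in the tools list) to handlers
HANDLERS = {"get_weather": get_weather}


async def dispatch_tool_call(tool_call):
    """Look up the handler by name and call it with the decoded arguments."""
    fn = tool_call["function"]
    handler = HANDLERS[fn["name"]]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    return await handler(**args)


# Shape of a tool call in an OpenAI-compatible response (values are made up)
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'},
}
result = asyncio.run(dispatch_tool_call(example_call))
```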

Available Models

Fireworks AI provides access to various models, notably:

Model Name                                          Description
accounts/fireworks/models/firefunction-v2           Optimized for function calling
accounts/fireworks/models/firefunction-v1           Optimized for function calling
accounts/fireworks/models/llama-v3p1-8b-instruct    Llama 3.1 8B Instruct
accounts/fireworks/models/llama-v3p1-70b-instruct   Llama 3.1 70B Instruct
accounts/fireworks/models/llama-v3p1-405b-instruct  Llama 3.1 405B Instruct
accounts/fireworks/models/mixtral-8x22b-instruct    Mixtral MoE 8x22B Instruct

See the Fireworks AI console for a complete list of supported models.

Frame Flow

Inherits the BaseOpenAILLMService frame flow.

Metrics Support

The service collects standard LLM metrics:

  • Token usage (prompt and completion)
  • Processing duration
  • Time to First Byte (TTFB)
  • Function call metrics
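To illustrate what Time to First Byte and processing duration mean for a streamed response, here is a minimal sketch that times an async stream of text chunks. It is not how the service collects its metrics; fake_stream and measure are hypothetical stand-ins.

```python
import asyncio
import time


async def fake_stream():
    """Stand-in for a streaming LLM response (hypothetical)."""
    for chunk in ["Machine ", "learning ", "is..."]:
        await asyncio.sleep(0.01)
        yield chunk


async def measure(stream):
    """Return (ttfb_seconds, total_seconds, text) for an async chunk stream."""
    start = time.perf_counter()
    ttfb = None
    parts = []
    async for chunk in stream:
        if ttfb is None:
            ttfb = time.perf_counter() - start  # time until the first chunk
        parts.append(chunk)
    total = time.perf_counter() - start  # time until the stream is exhausted
    return ttfb, total, "".join(parts)


ttfb, total, text = asyncio.run(measure(fake_stream()))
```

TTFB is what a user perceives as responsiveness in a voice pipeline, which is why it is tracked separately from the total processing duration.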

Notes

  • OpenAI-compatible interface
  • Supports streaming responses
  • Handles function calling
  • Manages conversation context
  • Includes token usage tracking
  • Thread-safe processing
  • Automatic error handling