Fireworks AI
LLM service implementation using Fireworks AI’s API with OpenAI-compatible interface
Overview
FireworksLLMService
provides access to Fireworks AI’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService
and supports streaming responses, function calling, and context management.
API Reference
Complete API documentation and method details
Fireworks Docs
Official Fireworks AI API documentation and features
Example Code
Working example with function calling
Installation
To use Fireworks AI services, install the required dependency:
You’ll also need to set up your Fireworks API key as an environment variable: FIREWORKS_API_KEY
.
Get your API key from Fireworks AI Console.
Frames
Input
OpenAILLMContextFrame
- Conversation context and historyLLMMessagesFrame
- Direct message listVisionImageRawFrame
- Images for vision processingLLMUpdateSettingsFrame
- Runtime parameter updates
Output
LLMFullResponseStartFrame
/LLMFullResponseEndFrame
- Response boundariesLLMTextFrame
- Streamed completion chunksFunctionCallInProgressFrame
/FunctionCallResultFrame
- Function call lifecycleErrorFrame
- API or processing errors
Function Calling
Function Calling Guide
Learn how to implement function calling with standardized schemas, register handlers, manage context properly, and control execution flow in your conversational AI applications.
Context Management
Context Management Guide
Learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.
Usage Example
Metrics
Inherits all OpenAI metrics capabilities:
- Time to First Byte (TTFB) - Response latency measurement
- Processing Duration - Total request processing time
- Token Usage - Prompt tokens, completion tokens, and totals
Enable with:
Additional Notes
- OpenAI Compatibility: Full compatibility with OpenAI API features and parameters
- Function Calling: Specialized firefunction models optimized for tool use
- Cost Effective: Competitive pricing for open-source model inference