Together AI
LLM service implementation using Together AI’s API with OpenAI-compatible interface
Overview
TogetherLLMService provides access to Together AI's language models, including Meta's Llama 3.1 and 3.2 models, through an OpenAI-compatible interface. It inherits from OpenAILLMService and maintains compatibility with OpenAI's function calling format.
Installation
To use TogetherLLMService, install the required dependencies:
You’ll also need to set your Together AI API key in the TOGETHER_API_KEY environment variable.
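The original install snippet was not preserved here; a typical setup, assuming Pipecat's optional `together` extra, looks like:

```shell
# Install Pipecat with the Together AI optional dependencies
pip install "pipecat-ai[together]"

# Make your Together AI API key available to the service
export TOGETHER_API_KEY=your-together-api-key
```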
Configuration
Constructor Parameters
- api_key: Your Together AI API key
- base_url: Together AI API endpoint
- model: Model identifier
Input Parameters
Inherits all input parameters from OpenAILLMService:

- extra: Additional parameters to pass to the model
- frequency_penalty: Reduces likelihood of repeating tokens based on their frequency. Range: [-2.0, 2.0]
- max_completion_tokens: Maximum number of tokens in the completion. Must be greater than or equal to 1
- max_tokens: Maximum number of tokens to generate. Must be greater than or equal to 1
- presence_penalty: Reduces likelihood of repeating any tokens that have appeared. Range: [-2.0, 2.0]
- seed: Random seed for deterministic generation. Must be greater than or equal to 0
- temperature: Controls randomness in the output. Range: [0.0, 2.0]
- top_p: Controls diversity via nucleus sampling. Range: [0.0, 1.0]
Input Frames
- OpenAILLMContextFrame: Contains OpenAI-specific conversation context
- LLMMessagesFrame: Contains conversation messages
- VisionImageRawFrame: Contains image for vision model processing
- LLMUpdateSettingsFrame: Updates model settings
Output Frames
- TextFrame: Contains generated text chunks
- FunctionCallInProgressFrame: Indicates start of function call
- FunctionCallResultFrame: Contains function call results
Context Management
The Together service relies on the OpenAI base class for context management: maintaining conversation history, system prompts, and tool calls, and converting between OpenAI and Together message formats.
OpenAILLMContext
The base context manager for OpenAI conversations:
Context Aggregators
Context aggregators handle message format conversion and management. The service provides a method to create paired aggregators:
Creates user and assistant aggregators for handling message formatting.
Parameters
- context: The context object containing conversation history and settings
- assistant_expect_stripped_words: Controls text preprocessing for assistant responses
Usage Example
The context management system ensures proper message formatting and history tracking throughout the conversation.
Methods
See the LLM base class methods for additional functionality.
Usage Example
Function Calling
Supports OpenAI-compatible function calling:
Available Models
Together AI provides access to various models, including:
| Model Name | Description |
|---|---|
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | Llama 3.1 8B instruct model optimized for speed |
| meta-llama/Meta-Llama-3.1-70B-Instruct | Llama 3.1 70B instruct model |
| meta-llama/Meta-Llama-3.1-405B-Instruct | Llama 3.1 405B instruct model |
| meta-llama/Llama-3.2-3B-Instruct-Turbo | Llama 3.2 3B instruct model optimized for speed |
| meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo | Llama 3.2 11B vision & instruct model |
| meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo | Llama 3.2 90B vision & instruct model |
Frame Flow
Inherits the OpenAI LLM Service frame flow.
Metrics Support
The service collects the same metrics as OpenAILLMService:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
- Function call metrics
Notes
- OpenAI-compatible interface
- Supports streaming responses
- Handles function calling
- Manages conversation context
- Includes token usage tracking
- Thread-safe processing
- Automatic error handling
- Inherits OpenAI service features