Google Gemini
Large Language Model service implementation using Google’s Gemini API
Overview
GoogleLLMService
provides integration with Google’s Gemini models, supporting streaming responses, function calling, and multimodal inputs. It includes specialized context handling for Google’s message format while maintaining compatibility with OpenAI-style contexts.
Installation
To use GoogleLLMService
, install the required dependencies:
You’ll also need to set up your Google API key as an environment variable: GOOGLE_API_KEY
Configuration
Constructor Parameters
Google API key
Model identifier
Model configuration parameters
Input Parameters
Additional parameters to pass to the model
Maximum number of tokens to generate. Must be greater than or equal to 1
Controls randomness in the output. Range: [0.0, 2.0]
Controls diversity via nucleus sampling. Must be greater than or equal to 0
Controls diversity via nucleus sampling. Range: [0.0, 1.0]
Input Frames
Contains conversation context
Contains conversation messages
Contains image for vision processing
Updates model settings
Output Frames
Contains generated text
Signals start of response
Signals end of response
Context Management
The Google service uses specialized context management to handle conversations and message formatting. This includes managing the conversation history, system prompts, function calls, and converting between OpenAI and Google message formats.
GoogleLLMContext
The base context manager for Google conversations:
Context Aggregators
Context aggregators handle message format conversion and management. The service provides a method to create paired aggregators:
Usage Example
The context management system ensures proper message formatting and history tracking throughout the conversation while handling the conversion between OpenAI and Google message formats automatically.
Methods
See the LLM base class methods for additional functionality.
Usage Examples
Basic Usage
With Function Calling
Frame Flow
Metrics Support
The service collects various metrics:
- Token usage (prompt and completion)
- Processing time
- Time to first byte (TTFB)
Notes
- Supports streaming responses
- Handles function calling
- Provides OpenAI compatibility
- Manages conversation context
- Supports vision inputs
- Includes metrics collection
- Thread-safe processing