# Perplexity

LLM service implementation using Perplexity's API with an OpenAI-compatible interface.
## Overview

`PerplexityLLMService` provides access to Perplexity's language models through an OpenAI-compatible interface. It inherits from `OpenAILLMService` and supports streaming responses and context management, with special handling for Perplexity's incremental token reporting.
Unlike other LLM services, Perplexity does not support function calling. Instead, it offers built-in internet search without requiring special function calls.
## Installation

To use `PerplexityLLMService`, install the required dependencies:
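```bash
# Assumed package extra; check Pipecat's installation docs for the exact name.
pip install "pipecat-ai[perplexity]"
```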
You'll need to set up your Perplexity API key as an environment variable: `PERPLEXITY_API_KEY`.
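For example:

```bash
export PERPLEXITY_API_KEY=your_api_key_here
```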
## Configuration

### Constructor Parameters
- `api_key` (required): Your Perplexity API key
- `model`: Model identifier. Defaults to `"sonar"`.
- `base_url`: Perplexity API endpoint. Defaults to `"https://api.perplexity.ai"`.
### Input Parameters

Inherits OpenAI-compatible parameters:
- `frequency_penalty`: Reduces the likelihood of repeating tokens based on their frequency. Must be greater than 0.
- `max_tokens`: Maximum number of tokens to generate.
- `presence_penalty`: Reduces the likelihood of repeating any tokens that have already appeared. Range: [-2.0, 2.0].
- `temperature`: Controls randomness in the output. Range: [0.0, 2.0].
- `top_p`: Controls diversity via nucleus sampling. Range: [0.0, 1.0].
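As a sketch, these can be supplied through the `InputParams` model inherited from `OpenAILLMService` (import paths and field names follow recent Pipecat releases; verify against your installed version):

```python
import os

from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.perplexity.llm import PerplexityLLMService

# Illustrative tuning values; choose what suits your application.
llm = PerplexityLLMService(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    model="sonar",
    params=OpenAILLMService.InputParams(
        temperature=0.7,
        max_tokens=1000,
        top_p=0.9,
    ),
)
```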
## Usage Example
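A minimal sketch of wiring the service into a pipeline (import paths and aggregator APIs follow recent Pipecat releases and may differ in yours):

```python
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.perplexity.llm import PerplexityLLMService

# Configure the service; PERPLEXITY_API_KEY should be set in the environment.
llm = PerplexityLLMService(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    model="sonar",
)

# Create a conversation context and its user/assistant aggregators.
context = OpenAILLMContext(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What happened in the news today?"},
    ]
)
context_aggregator = llm.create_context_aggregator(context)

# Place the service between the aggregators; transport, STT, and TTS
# processors would surround these in a full voice pipeline.
pipeline = Pipeline([
    context_aggregator.user(),
    llm,
    context_aggregator.assistant(),
])
```

Note that the user question leans on Perplexity's built-in internet search, so no function-calling setup is needed.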
## Methods
See the LLM base class methods for additional functionality.
## Available Models
Perplexity provides access to various models:
| Model Name  | Description                              |
| ----------- | ---------------------------------------- |
| `sonar`     | Default model optimized for general chat |
| `sonar-pro` | Pro version of the sonar model           |
See Perplexity’s documentation for the most up-to-date list of supported models.
## Token Usage Handling

`PerplexityLLMService` includes special handling for token usage metrics:
- Accumulates incremental token updates from Perplexity
- Records prompt tokens on first appearance
- Tracks completion tokens as they increase
- Reports final totals at the end of processing
This ensures compatibility with OpenAI’s token reporting format while maintaining accurate metrics.
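Conceptually, the accumulation works something like the sketch below (illustrative only, not the service's actual code; the `usage` attribute follows the OpenAI streaming shape):

```python
def accumulate_usage(stream):
    """Accumulate Perplexity's incremental usage reports into final totals."""
    prompt_tokens = 0
    completion_tokens = 0
    prompt_recorded = False

    for chunk in stream:  # chunks from the OpenAI-compatible streaming API
        usage = getattr(chunk, "usage", None)
        if usage is None:
            continue
        # Prompt tokens arrive with the first usage report and stay constant.
        if not prompt_recorded and usage.prompt_tokens:
            prompt_tokens = usage.prompt_tokens
            prompt_recorded = True
        # Completion counts are running totals, so keep the largest value seen.
        completion_tokens = max(completion_tokens, usage.completion_tokens or 0)

    # Report once at the end, matching OpenAI's one-shot usage format.
    return prompt_tokens, completion_tokens
```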
## Frame Flow

Inherits the OpenAI LLM Service frame flow.
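In outline (frame names per the OpenAI service in recent Pipecat releases; see that service's documentation for the authoritative diagram): the service consumes `OpenAILLMContextFrame` or `LLMMessagesFrame` input, emits an `LLMFullResponseStartFrame`, streams text frames as tokens arrive, and closes with an `LLMFullResponseEndFrame`.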
## Metrics Support
The service collects standard LLM metrics:
- Token usage (prompt and completion)
- Processing duration
- Time to First Byte (TTFB)
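Collection is typically switched on at the task level; a sketch (flag names per recent Pipecat releases), reusing the `pipeline` from the usage example above:

```python
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Enable metrics on the task that runs the pipeline built earlier.
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,        # processing duration and TTFB
        enable_usage_metrics=True,  # prompt/completion token usage
    ),
)
```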
## Notes
- OpenAI-compatible interface
- Supports streaming responses
- Manages conversation context
- Custom token usage tracking for Perplexity’s incremental reporting
- Thread-safe processing
- Automatic error handling