Overview

PerplexityLLMService provides access to Perplexity’s language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses and context management, with special handling for Perplexity’s incremental token reporting.

Unlike most LLM services, Perplexity does not support function calling. Instead, it offers built-in internet search without requiring special function calls.

Installation

To use PerplexityLLMService, install the required dependencies:

pip install "pipecat-ai[perplexity]"

You’ll also need to set up your Perplexity API key as an environment variable: PERPLEXITY_API_KEY.

Get your API key from Perplexity API.
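
For example, you can fail fast at startup if the key is missing. A minimal sketch (the error message is illustrative):

import os

# Abort early with a clear error if the key isn't configured
api_key = os.getenv("PERPLEXITY_API_KEY")
if not api_key:
    raise RuntimeError("PERPLEXITY_API_KEY environment variable is not set")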

Frames

Input

  • OpenAILLMContextFrame - Conversation context and history
  • LLMMessagesFrame - Direct message list
  • LLMUpdateSettingsFrame - Runtime parameter updates
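
For example, settings can be adjusted mid-session by queueing an LLMUpdateSettingsFrame onto the running task. A minimal sketch, assuming a PipelineTask named task like the one created in the usage example below:

from pipecat.frames.frames import LLMUpdateSettingsFrame

# Inside an async handler: lower the temperature for subsequent turns
await task.queue_frame(LLMUpdateSettingsFrame(settings={"temperature": 0.3}))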

Output

  • LLMFullResponseStartFrame / LLMFullResponseEndFrame - Response boundaries
  • LLMTextFrame - Streamed completion chunks with citations
  • ErrorFrame - API or processing errors
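
To observe the streamed output, you can drop a small logging processor into the pipeline after the LLM. A sketch using Pipecat's FrameProcessor base class (TextLogger is a hypothetical name):

from pipecat.frames.frames import Frame, LLMTextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class TextLogger(FrameProcessor):
    """Logs streamed completion chunks as they pass downstream."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, LLMTextFrame):
            print(f"LLM chunk: {frame.text!r}")
        await self.push_frame(frame, direction)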

Context Management

See the Context Management Guide to learn how to manage conversation context, handle message history, and integrate context aggregators for consistent conversational experiences.

Usage Example

import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.perplexity.llm import PerplexityLLMService

# Configure Perplexity service
llm = PerplexityLLMService(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    model="sonar-pro",  # Pro model for enhanced capabilities
    params=PerplexityLLMService.InputParams(
        temperature=0.7,
        max_tokens=1000
    )
)

# Create context optimized for search and current information
context = OpenAILLMContext(
    messages=[
        {
            "role": "system",
            "content": """You are a knowledgeable assistant with access to real-time information.
            When answering questions, use your search capabilities to provide current, accurate information.
            Always cite your sources when possible. Keep responses concise for voice output."""
        }
    ]
)

# Create context aggregators
context_aggregator = llm.create_context_aggregator(context)

# Use in pipeline for information-rich conversations; `transport`, `stt`,
# and `tts` are assumed to be configured elsewhere
pipeline = Pipeline([
    transport.input(),
    stt,
    context_aggregator.user(),
    llm,  # Will automatically search and cite sources
    tts,
    transport.output(),
    context_aggregator.assistant()
])

# Enable metrics with special TTFB reporting for Perplexity
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
        report_only_initial_ttfb=True,  # Report only the initial TTFB measurement
    )
)

Metrics

The service provides specialized token tracking for Perplexity’s incremental reporting:

  • Time to First Byte (TTFB) - Response latency measurement
  • Processing Duration - Total request processing time
  • Token Usage - Accumulated prompt and completion tokens

Enable with:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,
    )
)
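
To inspect the reported values, you can watch for MetricsFrames in the pipeline. A sketch (MetricsLogger is a hypothetical name, and the metric data classes assume a recent Pipecat version):

from pipecat.frames.frames import Frame, MetricsFrame
from pipecat.metrics.metrics import LLMUsageMetricsData, TTFBMetricsData
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class MetricsLogger(FrameProcessor):
    """Prints TTFB and accumulated token usage as metrics frames pass by."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, MetricsFrame):
            for m in frame.data:
                if isinstance(m, TTFBMetricsData):
                    print(f"TTFB: {m.value:.3f}s")
                elif isinstance(m, LLMUsageMetricsData):
                    print(f"Token usage: {m.value}")
        await self.push_frame(frame, direction)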

Additional Notes

  • No Function Calling: Perplexity doesn’t support traditional function calling; built-in web search is provided instead
  • Real-time Data: Access to current information without complex function orchestration
  • Source Citations: Automatic citation of web sources in responses
  • OpenAI Compatible: Uses familiar OpenAI-style interface and parameters