What is Context in Pipecat?
In Pipecat, context refers to the conversation history that the LLM uses to generate responses. The context consists of a list of alternating user/assistant messages that represents the collective history of the entire conversation.

The system prompt is provided via system_instruction in the LLM service's Settings, not as a message in the context. See System Instruction vs Context System Messages below for details.
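Concretely, the messages in the context follow the OpenAI chat format. A sketch of what the list might hold after a short exchange (the message contents here are purely illustrative):

```python
# Illustrative snapshot of a context's message list after a short exchange.
# Pipecat builds and extends this list automatically as frames flow.
context_messages = [
    {"role": "user", "content": "What's the weather like today?"},
    {"role": "assistant", "content": "It's sunny and around 72 degrees."},
    {"role": "user", "content": "Should I bring a jacket?"},
]

# Roles alternate between user and assistant as the conversation grows.
roles = [m["role"] for m in context_messages]
print(roles)  # ['user', 'assistant', 'user']
```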
Since Pipecat is a real-time voice AI framework, context management happens automatically as the conversation flows, but you can also control it manually when needed.
How Context Updates During Conversations
Context updates happen automatically as frames flow through your pipeline:

User Messages:
- User speaks → InputAudioRawFrame → STT Service → TranscriptionFrame
- context_aggregator.user() receives the TranscriptionFrame and adds a user message to the context

Assistant Messages:
- LLM generates response → LLMTextFrame → TTS Service → TTSTextFrame
- context_aggregator.assistant() receives the TTSTextFrame and adds an assistant message to the context
- TranscriptionFrame: Contains user speech converted to text by the STT service
- LLMTextFrame: Contains LLM-generated responses
- TTSTextFrame: Contains bot responses converted to text by the TTS service (represents what was actually spoken)
The TTS service processes LLMTextFrames but outputs TTSTextFrames, which represent the actual spoken text returned by the TTS provider. This ensures the context matches what users actually hear.

Setting Up Context Management
Pipecat includes a context aggregator that creates and manages context for both user and assistant messages:

1. Create the Context and Context Aggregator
The context aggregator also supports configuring user turn strategies and user mute strategies via LLMUserAggregatorParams.

LLMContext is Pipecat's universal context container that stores conversation messages, tool definitions, and tool choice settings. It uses an OpenAI-compatible format that works across all LLM services through automatic adapter translation.
Key properties:
- messages: List of conversation messages (system, user, assistant, tool)
- tools: Optional ToolsSchema defining available functions
- tool_choice: Optional strategy for tool selection
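A minimal setup might look like the following sketch. The import paths and the LLMContextAggregatorPair name reflect recent Pipecat releases and may differ in your version; check your release's API reference:

```python
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
)

# Create the universal context, optionally seeded with initial messages
context = LLMContext(
    messages=[{"role": "user", "content": "Greet the user warmly."}]
)

# Create the paired user/assistant aggregators that share this context
context_aggregator = LLMContextAggregatorPair(context)
```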
2. Context with Function Calling
Context can also include tools (function definitions) that the LLM can call during conversations.

We'll cover function calling in detail in an upcoming section. The context aggregator handles function call storage automatically.
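As a hedged sketch using Pipecat's FunctionSchema and ToolsSchema (the weather function itself is illustrative, not part of Pipecat):

```python
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.processors.aggregators.llm_context import LLMContext

# Define a function the LLM may call (name and properties are illustrative)
weather_function = FunctionSchema(
    name="get_weather",
    description="Get the current weather for a location",
    properties={"location": {"type": "string"}},
    required=["location"],
)

# Include the tool definitions alongside the conversation messages
context = LLMContext(
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=ToolsSchema(standard_tools=[weather_function]),
)
```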
3. Add Context Aggregators to Your Pipeline
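A typical ordering, assuming the transport, stt, llm, and tts services and the context_aggregator pair were created earlier, might look like:

```python
from pipecat.pipeline.pipeline import Pipeline

pipeline = Pipeline([
    transport.input(),               # Audio in from the user
    stt,                             # Speech-to-text
    context_aggregator.user(),       # Adds user messages to context
    llm,                             # LLM inference
    tts,                             # Text-to-speech
    transport.output(),              # Audio out to the user
    context_aggregator.assistant(),  # Adds assistant messages to context
])
```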
System Instruction vs Context System Messages
There are two ways to provide a system prompt to your LLM:

Using system_instruction (recommended)
Set the system prompt via the LLM service’s Settings. The service automatically prepends it to the context messages on each request:
- Survives context updates: The system prompt is always prepended, even after LLMMessagesUpdateFrame replaces the context or context summarization compresses older messages.
- Shared context: Multiple LLM services can share a single LLMContext while each provides its own system instruction.
- Runtime updates: You can change the system prompt mid-conversation via LLMUpdateSettingsFrame.
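A sketch of this approach, assuming your service version exposes system_instruction as a constructor setting (the service class and parameter placement may vary; consult your service's reference docs):

```python
from pipecat.services.openai.llm import OpenAILLMService

llm = OpenAILLMService(
    api_key="...",  # your API key
    model="gpt-4o",
    # Prepended to the context messages on every request
    system_instruction="You are a concise, friendly voice assistant.",
)
```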
Using a context system message
You can also include a system message directly in the context messages:

Context Aggregator Placement
The placement of context aggregator instances in your pipeline is crucial for proper operation:

User Context Aggregator
Place the user context aggregator downstream from the STT service. Since the user's speech results in TranscriptionFrame objects pushed by the STT service, the user aggregator needs to be positioned to collect these frames.
Assistant Context Aggregator
Place the assistant context aggregator after transport.output(). This positioning is important because:
- The TTS service outputs TTSTextFrames in addition to audio
- The assistant aggregator must be downstream to collect those frames
- It ensures context updates happen word-by-word for specific services (e.g. Cartesia, ElevenLabs, and Rime)
- Your context stays updated at the word level in case an interruption occurs
Manual Context Control
You can programmatically add new messages to the context by pushing or queueing specific frames:

Adding Messages
- LLMMessagesAppendFrame: Appends a new message to the existing context
- LLMMessagesUpdateFrame: Completely replaces the existing context with new messages
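For instance, inside a coroutine you might append a message and trigger a response. This sketch assumes a running PipelineTask; the run_llm parameter is present in recent Pipecat releases but may vary by version:

```python
from pipecat.frames.frames import LLMMessagesAppendFrame

async def steer_conversation(task):
    # Append a message to the existing context; run_llm=True also
    # triggers the LLM to respond to the updated context.
    await task.queue_frames([
        LLMMessagesAppendFrame(
            messages=[{"role": "system", "content": "Steer the chat toward wrap-up."}],
            run_llm=True,
        )
    ])
```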
Retrieving Current Context
The context aggregator provides a context property for getting the current context:
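A short sketch; since both aggregators in the pair share the same underlying context object, either one can be used to read it:

```python
# Read the live context from either aggregator in the pair
context = context_aggregator.user().context
print(context.messages)  # current list of conversation messages
print(context.tools)     # tool definitions, if any were configured
```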
Context Summarization
In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages. Enable it by setting enable_context_summarization=True when creating your context aggregators:
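A sketch of enabling it, assuming the flag is passed when constructing the aggregator pair (exact parameter placement may differ across Pipecat versions):

```python
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
)

context = LLMContext()
context_aggregator = LLMContextAggregatorPair(
    context,
    # Automatically compress older history while keeping recent messages
    enable_context_summarization=True,
)
```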
Context Summarization Guide
Learn how to configure summarization triggers, customize behavior, and control
what gets preserved.
Triggering Bot Responses
You may want to manually trigger the bot to speak in two scenarios:

- Starting a pipeline where the bot should speak first
- After editing the context using LLMMessagesAppendFrame or LLMMessagesUpdateFrame
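For example, queueing an LLMRunFrame when a client connects makes the bot speak first. This sketch assumes a transport and a running PipelineTask named task; LLMRunFrame is present in recent Pipecat releases:

```python
from pipecat.frames.frames import LLMRunFrame

@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
    # Trigger the LLM to generate a response from the current context
    await task.queue_frames([LLMRunFrame()])
```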
Key Takeaways
- Context is conversation history - automatically maintained as users and bots exchange messages
- Use system_instruction for system prompts - set it in LLM Settings rather than as a context message for reliability across context updates and summarization
- Frame types matter - TranscriptionFrame for users, TTSTextFrame for assistants
- Placement matters - user aggregator after STT, assistant aggregator after transport output
- Tools are included - function definitions and results are stored in context
- Manual control available - use frames to append messages or trigger responses when needed
- Word-level precision - proper placement ensures context accuracy during interruptions
- Automatic summarization - enable context summarization to manage long conversations efficiently and reduce token costs
What’s Next
Now that you understand context management, let's explore how to configure the LLM services that process this context to generate intelligent responses.

LLM Inference
Learn how to configure language models in your voice AI pipeline