# Context Management

> A guide to working with Pipecat's Context and Context Aggregators

## What is Context in Pipecat?

In Pipecat, **context** refers to the conversation history that the LLM uses to generate responses. The context consists of user/assistant messages representing the conversation history, and can also include **developer messages** for task-specific instructions to the LLM.

```python theme={null}
# Example context structure
messages = [
    {"role": "developer", "content": "Keep responses under two sentences."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help you?"},
    # Context aggregators automatically add new messages here
]
```
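
To make the growth of this list concrete, here is a minimal sketch using plain Python dictionaries. The `add_turn` helper is hypothetical and only stands in for what the context aggregators do for you automatically; it is not a Pipecat function.

```python
# Plain-Python illustration; `add_turn` is a hypothetical helper, not a Pipecat API.
messages = [
    {"role": "developer", "content": "Keep responses under two sentences."},
]


def add_turn(messages: list[dict], role: str, content: str) -> None:
    """Append one conversation turn in the OpenAI-compatible message format."""
    messages.append({"role": role, "content": content})


add_turn(messages, "user", "Hello!")
add_turn(messages, "assistant", "Hi there! How can I help you?")

# Print the conversation in order, one turn per line
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

In a real pipeline you would not call anything like `add_turn` yourself; the point is only that the context is an ordered list that gains one entry per completed turn.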

The system prompt is typically set via `system_instruction` in the LLM service's [Settings](/pipecat/fundamentals/service-settings), not as a message in the context. See [System Instruction, Developer Messages, and System Messages](#system-instruction-developer-messages-and-system-messages) below for details.

Since Pipecat is a real-time voice AI framework, context management happens automatically as the conversation flows, but you can also control it manually when needed.

## How Context Updates During Conversations

Context updates happen automatically as frames flow through your pipeline:

**User Messages:**

1. User speaks → `InputAudioRawFrame` → STT Service → `TranscriptionFrame`
2. The user context aggregator receives the `TranscriptionFrame` and adds a user message to the context

**Assistant Messages:**

1. LLM generates response → `LLMTextFrame` → TTS Service → `TTSTextFrame`
2. The assistant context aggregator receives the `TTSTextFrame` and adds an assistant message to the context

**Frame types that update context:**

* **`TranscriptionFrame`**: Contains the user's speech, transcribed to text by the STT service
* **`LLMTextFrame`**: Contains the text generated by the LLM
* **`TTSTextFrame`**: Contains the text of the bot's response as returned by the TTS service (represents what was actually spoken)

<Note>
  The TTS service processes `LLMTextFrame`s but outputs `TTSTextFrame`s, which
  represent the actual spoken text returned by the TTS provider. This ensures
  context matches what users actually hear.
</Note>
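
The flow above can be sketched with stand-in classes. The example below does not use Pipecat: `FakeTranscription`, `FakeTTSText`, and `ToyAggregator` are hypothetical names that only mimic the idea that each aggregator turns the frames it receives into one context message.

```python
from dataclasses import dataclass, field


# Stand-ins for illustration only; these are NOT Pipecat's frame classes.
@dataclass
class FakeTranscription:
    text: str  # what the STT service heard


@dataclass
class FakeTTSText:
    text: str  # a piece of text the TTS service actually spoke


@dataclass
class ToyAggregator:
    """Collects text pieces and commits them to the shared context as one message."""

    role: str
    context: list
    _parts: list = field(default_factory=list)

    def receive(self, frame) -> None:
        self._parts.append(frame.text)

    def commit(self) -> None:
        if self._parts:
            self.context.append({"role": self.role, "content": " ".join(self._parts)})
            self._parts = []


context = []  # shared by both aggregators
user = ToyAggregator("user", context)
assistant = ToyAggregator("assistant", context)

user.receive(FakeTranscription("What's the weather?"))
user.commit()
for word in ["It", "is", "sunny."]:
    assistant.receive(FakeTTSText(word))
assistant.commit()
```

Both toy aggregators write to the same list, which mirrors how the user and assistant aggregators share a single context.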

## Setting Up Context Management

Pipecat includes a context aggregator pair that creates and manages context: one aggregator for user messages and one for assistant messages, both backed by the same context.

### 1. Create the Context and Context Aggregator

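```python theme={null}
import os

# Import paths may vary between Pipecat versions
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.services.openai.llm import OpenAILLMService

# Create LLM service with system instruction
llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAILLMService.Settings(
        model="gpt-4o",
        system_instruction="You are a helpful voice assistant.",
    ),
)

# Create context (no system message needed; system_instruction handles it)
context = LLMContext()

# Create the user and assistant context aggregators
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(context)
```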

<Note>
  The context aggregator also supports configuring [user turn
  strategies](/api-reference/server/utilities/turn-management/user-turn-strategies)
  and [user mute
  strategies](/api-reference/server/utilities/turn-management/user-mute-strategies)
  via `LLMUserAggregatorParams`.
</Note>

**About LLMContext:**

`LLMContext` is Pipecat's universal context container that stores conversation messages, tool definitions, and tool choice settings. It uses an OpenAI-compatible format that works across all LLM services through automatic adapter translation.

Key properties:

* **`messages`**: List of conversation messages (user, assistant, developer, tool)
* **`tools`**: Optional available functions — a list of direct functions and/or `FunctionSchema` objects (or a `ToolsSchema`)
* **`tool_choice`**: Optional strategy for tool selection

### 2. Context with Function Calling

Context can also include [tools](/pipecat/learn/function-calling#1-define-a-tool) (function definitions) that the LLM can call during conversations:

```python theme={null}
from pipecat.services.llm_service import FunctionCallParams

# A direct function: schema is auto-derived from the signature and docstring
async def get_current_weather(params: FunctionCallParams, location: str, format: str):
    """Get the current weather.

    Args:
        location: The city and state, e.g. "San Francisco, CA".
        format: The temperature unit to use. Must be either "celsius" or "fahrenheit".
    """
    await params.result_callback({"conditions": "sunny", "temperature": "75"})

# Create context with both messages and tools
# (`messages` is a list of message dicts like the one shown at the top of this page)
context = LLMContext(messages, tools=[get_current_weather])
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(context)
```

Function call results are also automatically stored in the context, maintaining a complete conversation history including tool interactions.
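
For orientation, this is roughly what a tool interaction looks like in the OpenAI-compatible message format. The call ID and the user question are invented for illustration, and the exact fields Pipecat stores may differ from this sketch.

```python
import json

# Illustrative only: the ID, question, and arguments below are made up.
tool_exchange = [
    {"role": "user", "content": "What's the weather in San Francisco?"},
    {
        # The assistant asks for a function to be called
        "role": "assistant",
        "tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": json.dumps(
                        {"location": "San Francisco, CA", "format": "fahrenheit"}
                    ),
                },
            }
        ],
    },
    {
        # The function result is stored as a "tool" message
        "role": "tool",
        "tool_call_id": "call_123",
        "content": json.dumps({"conditions": "sunny", "temperature": "75"}),
    },
]

# The tool result is linked to the request through the shared call ID.
request_id = tool_exchange[1]["tool_calls"][0]["id"]
result_id = tool_exchange[2]["tool_call_id"]
```

Keeping the request and its result together in the history lets the LLM refer back to earlier tool output on later turns.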

<Note>
  We'll cover function calling in detail in an upcoming section. The context
  aggregator handles function call storage automatically.
</Note>

### 3. Add Context Aggregators to Your Pipeline

```python theme={null}
pipeline = Pipeline([
    transport.input(),
    stt,
    user_aggregator,       # User context aggregator
    llm,
    tts,
    transport.output(),
    assistant_aggregator,  # Assistant context aggregator
])
```

## System Instruction, Developer Messages, and System Messages

Pipecat provides three ways to give instructions to your LLM, each suited for a different purpose.

### Using `system_instruction` (recommended)

Set the bot's personality and core behavior via the LLM service's Settings. The service automatically prepends this instruction to the context messages on each request:

```python theme={null}
llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAILLMService.Settings(
        model="gpt-4o",
        system_instruction="You are a helpful voice assistant.",
    ),
)

# Context only needs conversation messages
context = LLMContext()
```

This approach is recommended because:

* **Survives context updates**: The system prompt is always prepended, even after `LLMMessagesUpdateFrame` replaces the context or context summarization compresses older messages.
* **Shared context**: Multiple LLM services can share a single `LLMContext` while each provides its own system instruction.
* **Runtime updates**: You can change the system prompt mid-conversation via `LLMUpdateSettingsFrame`.
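
A minimal sketch of why a prepended instruction survives context replacement. The `build_request` function is hypothetical and only models the behavior described above, where the service combines its own instruction with whatever the context currently holds; Pipecat's actual implementation may differ.

```python
def build_request(system_instruction: str, context_messages: list[dict]) -> list[dict]:
    """Model of per-request assembly: instruction first, then the context."""
    return [{"role": "system", "content": system_instruction}, *context_messages]


instruction = "You are a helpful voice assistant."

context_messages = [{"role": "user", "content": "Hello!"}]
first = build_request(instruction, context_messages)

# Simulate a full context replacement: the stored messages change entirely...
context_messages = [{"role": "user", "content": "Let's start over."}]
second = build_request(instruction, context_messages)

# ...but the instruction is still the first message of the next request.
print(second[0]["content"])
```

Because the instruction lives outside the message list, nothing that rewrites the list can remove it.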

### Using developer messages in context

**Developer messages** (`"role": "developer"`) are task-specific instructions placed directly in the context. Use them for supplementary guidance that applies to a particular phase of the conversation rather than defining the bot's overall personality.

```python theme={null}
messages = [
    {"role": "developer", "content": "Keep responses under two sentences. Use metric units."},
]
context = LLMContext(messages)
```

Key characteristics:

* **Task-specific**: Use for instructions like response format constraints, domain rules, or workflow steps. The bot's personality belongs in `system_instruction`.
* **Part of normal context flow**: Developer messages participate in context like any other message. They are included in the summarization range and may be compressed or dropped during [context summarization](/pipecat/fundamentals/context-summarization).
* **Cross-provider translation**: Pipecat's adapters automatically convert developer messages for providers that don't support the role natively (for example, Anthropic receives them as user messages).
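
As a rough illustration of the cross-provider translation, the sketch below rewrites developer messages as user messages for a provider without a developer role. `translate_developer_messages` is a hypothetical function written for this example; Pipecat's adapters do this for you, and their real logic is not shown here.

```python
def translate_developer_messages(messages: list[dict]) -> list[dict]:
    """Return a copy in which "developer" messages carry the "user" role."""
    translated = []
    for message in messages:
        if message["role"] == "developer":
            translated.append({**message, "role": "user"})
        else:
            translated.append(dict(message))
    return translated


messages = [
    {"role": "developer", "content": "Keep responses under two sentences. Use metric units."},
    {"role": "user", "content": "How far is the Moon?"},
]
converted = translate_developer_messages(messages)
```

The function returns a new list, so the stored context keeps its original roles and can still be sent unchanged to a provider that supports developer messages.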

### Using a context system message (legacy)

You can also include a system message directly in the context messages:

```python theme={null}
messages = [
    {"role": "system", "content": "You are a helpful voice assistant."},
]
context = LLMContext(messages)
```

This works but has limitations: the system message can be lost during full context replacement, and it cannot be shared across multiple LLM services with different system prompts. Context summarization does preserve `messages[0]` when it's a system message, but `system_instruction` is still more reliable because it sits outside the context entirely.

For new projects, prefer `system_instruction` for personality and developer messages for task-specific instructions.

<Warning>
  If both `system_instruction` and a system message in the context are set,
  `system_instruction` takes precedence and a warning is logged. Avoid using
  both at the same time.
</Warning>

## Context Aggregator Placement

Each context aggregator only sees the frames that reach it, so where you place the aggregators in your pipeline is **crucial** for proper operation.

### User Context Aggregator

Place the user context aggregator **downstream from the STT service**. Since the user's speech results in `TranscriptionFrame` objects pushed by the STT service, the user aggregator needs to be positioned to collect these frames.

### Assistant Context Aggregator

Place the assistant context aggregator **after `transport.output()`**. This positioning is important because:

* The TTS service outputs `TTSTextFrame`s in addition to audio
* The assistant aggregator must be downstream to collect those frames
* It ensures context updates happen word-by-word for specific services (e.g. Cartesia, ElevenLabs, and Rime)
* Your context stays updated at the word level in case an interruption occurs

<Tip>
  Always place the assistant context aggregator **after** `transport.output()`
  to ensure proper word-level context updates during interruptions.
</Tip>
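
The benefit of word-level updates can be sketched without Pipecat. In the hypothetical example below, the bot's planned response is cut off after four words, and only the words already spoken are stored in the context. The function name and the interruption point are assumptions made for illustration.

```python
def spoken_before_interruption(planned_words: list[str], interrupted_after: int) -> str:
    """Return only the part of the response the user actually heard."""
    return " ".join(planned_words[:interrupted_after])


planned = "Sure, I can help with that. First, open the settings menu.".split()
heard = spoken_before_interruption(planned, interrupted_after=4)

context = [{"role": "user", "content": "How do I change my password?"}]
# Store what was spoken, not what the LLM originally generated
context.append({"role": "assistant", "content": heard})
```

If the full generated text were stored instead, the LLM would later assume the user had heard instructions that were never spoken.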

## Manual Context Control

You can programmatically add, replace, or edit messages in the context by pushing or queueing specific frames:

### Adding Messages

* **`LLMMessagesAppendFrame`**: Appends one or more new messages to the existing context
* **`LLMMessagesUpdateFrame`**: Completely replaces the existing context with new messages
* **`LLMMessagesTransformFrame`**: Edits the existing context in place using a transform function

```python theme={null}
# Add a new user message to context and trigger a response
new_message = {"role": "user", "content": "Tell me about your capabilities."}
await worker.queue_frames([
    LLMMessagesAppendFrame([new_message], run_llm=True), # Optionally trigger bot response, too
])
```

#### Adding a message silently

All three frames take a `run_llm` argument that controls whether the change also prompts a bot response. Pass `run_llm=True` to respond; the default (`None`, which behaves like `False`) updates the context silently. This is useful when you collect information in the background and don't want the bot to react every time:

```python theme={null}
# Add a message to context without triggering a bot response
note = {"role": "user", "content": "Caller's name is Maria. Account verified."}
await worker.queue_frames([
    LLMMessagesAppendFrame([note], run_llm=False), # Silent: no response
])
```

#### Editing or removing specific messages

To surgically edit or remove individual messages without rebuilding the whole list, push an `LLMMessagesTransformFrame`. It takes a function that receives the current list of messages and returns a modified list. Use it to drop stale instructions, remove an offensive turn, or rewrite content:

```python theme={null}
from pipecat.frames.frames import LLMMessagesTransformFrame

def remove_language_instructions(messages):
    # Drop any message that contains a per-turn language instruction
    return [m for m in messages if "LANGUAGE INSTRUCTION" not in str(m.get("content", ""))]

await worker.queue_frames([
    LLMMessagesTransformFrame(transform=remove_language_instructions, run_llm=False),
])
```
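
Because a transform is an ordinary function from a list of messages to a list of messages, you can try it on sample data before queueing the frame. The sample messages below are invented for illustration.

```python
def remove_language_instructions(messages):
    # Drop any message that contains a per-turn language instruction
    return [m for m in messages if "LANGUAGE INSTRUCTION" not in str(m.get("content", ""))]


sample = [
    {"role": "developer", "content": "LANGUAGE INSTRUCTION: reply in Spanish."},
    {"role": "user", "content": "Hola"},
    {"role": "assistant", "content": "¡Hola! ¿En qué puedo ayudarte?"},
]

cleaned = remove_language_instructions(sample)
print([m["role"] for m in cleaned])  # → ['user', 'assistant']
```

The function builds a new list and leaves `sample` untouched, which keeps the transform easy to test in isolation.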

<Note>
  To make the bot **speak** specific text and have it recorded in context (for
  example, a greeting the LLM did not generate), use [`TTSSpeakFrame` with
  `append_to_context=True`](/pipecat/learn/text-to-speech). That path is for
  driving speech output; the frames above are for editing the context directly.
</Note>

### Retrieving Current Context

Each context aggregator exposes a `context` property that returns the current context:

```python theme={null}
context = user_aggregator.context
```

## Context Summarization

In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages.

Enable it by setting `enable_auto_context_summarization=True` when creating your context aggregators (default: `False`):

```python theme={null}
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    assistant_params=LLMAssistantAggregatorParams(
        enable_auto_context_summarization=True,
    ),
)
```
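
Conceptually, summarization replaces older messages with a shorter summary and keeps the most recent ones verbatim. The sketch below illustrates that idea only: `summarize` is a placeholder that joins text rather than calling an LLM, and `keep_recent` is a hypothetical parameter, not a Pipecat setting.

```python
def summarize(messages: list[dict]) -> str:
    """Placeholder summarizer; a real one would ask an LLM for a summary."""
    return "Summary of earlier conversation: " + " / ".join(
        str(m["content"]) for m in messages
    )


def compress(messages: list[dict], keep_recent: int) -> list[dict]:
    """Replace all but the last `keep_recent` messages with one summary message."""
    split = len(messages) - keep_recent
    if split <= 0:
        return list(messages)  # nothing old enough to compress
    older, recent = messages[:split], messages[split:]
    # The role used for the summary message is an assumption of this sketch
    return [{"role": "user", "content": summarize(older)}, *recent]


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What's 2 + 2?"},
    {"role": "assistant", "content": "4."},
]
compressed = compress(history, keep_recent=2)
```

Here four messages become three: one summary followed by the two most recent turns, unchanged.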

<Card title="Context Summarization Guide" icon="arrow-right" href="/pipecat/fundamentals/context-summarization">
  Learn how to configure summarization triggers, customize behavior, and control
  what gets preserved.
</Card>

## Triggering Bot Responses

You may want to manually trigger the bot to speak in two scenarios:

1. **Starting a pipeline** where the bot should speak first
2. **After editing the context** using `LLMMessagesAppendFrame`, `LLMMessagesUpdateFrame`, or `LLMMessagesTransformFrame`

```python theme={null}
# Example: Bot speaks first when pipeline starts
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
    # Trigger a response
    await worker.queue_frames([LLMRunFrame()])
```

```python theme={null}
# Example: Bot speaks after context is edited
new_message = {"role": "user", "content": "Tell me a fun fact."}
await worker.queue_frames([
    LLMMessagesAppendFrame([new_message], run_llm=True), # Trigger bot response
])
```

This gives you fine-grained control over when and how the bot responds during the conversation flow.

## Key Takeaways

* **Context is conversation history** - automatically maintained as users and bots exchange messages
* **Use `system_instruction` for system prompts** - set it in LLM Settings rather than as a context message for reliability across context updates and summarization
* **Use developer messages for task instructions** - place task-specific guidance in context with `"role": "developer"` rather than overloading the system prompt
* **Frame types matter** - `TranscriptionFrame` for users, `TTSTextFrame` for assistants
* **Placement matters** - user aggregator after STT, assistant aggregator after transport output
* **Tools are included** - function definitions and results are stored in context
* **Manual control available** - use frames to append messages or trigger responses when needed
* **Word-level precision** - proper placement ensures context accuracy during interruptions
* **Automatic summarization** - enable context summarization to manage long conversations efficiently and reduce token costs

## What's Next

Now that you understand context management, let's explore how to configure the LLM services that process this context to generate intelligent responses.

<Card title="LLM Inference" icon="arrow-right" href="/pipecat/learn/llm">
  Learn how to configure language models in your voice AI pipeline
</Card>