> ## Documentation Index
> Fetch the complete documentation index at: https://docs.pipecat.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Context Summarization

> Reference for LLMAutoContextSummarizationConfig, LLMContextSummaryConfig, LLMContextSummarizer, and SummaryAppliedEvent

## Overview

Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on `LLMAssistantAggregatorParams`, configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.

For a walkthrough of how to enable and customize context summarization, see the [Context Summarization guide](/pipecat/fundamentals/context-summarization).

## LLMAssistantAggregatorParams

```python theme={null}
from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
```

The summarization-related fields on `LLMAssistantAggregatorParams`.

<ParamField path="enable_auto_context_summarization" type="bool" default="False">
  Enables automatic context summarization. When `False` (the default), the
  summarizer is still created internally so that on-demand summarization via
  `LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
  Set to `True` to enable automatic summarization when either
  `max_context_tokens` or `max_unsummarized_messages` is reached.
</ParamField>

<ParamField path="auto_context_summarization_config" type="LLMAutoContextSummarizationConfig | None" default="None">
  Configuration for automatic summarization thresholds and summary generation.
  When `None`, default `LLMAutoContextSummarizationConfig` values are used.
</ParamField>

## LLMAutoContextSummarizationConfig

```python theme={null}
from pipecat.utils.context.llm_context_summarization import LLMAutoContextSummarizationConfig
```

Controls when automatic context summarization triggers.

<ParamField path="max_context_tokens" type="int | None" default="8000">
  Maximum context size in estimated tokens before triggering summarization.
  Tokens are estimated using the heuristic of 1 token per 4 characters. Set to
  `None` to disable token-based triggering. At least one of `max_context_tokens`
  or `max_unsummarized_messages` must be set.
</ParamField>

<ParamField path="max_unsummarized_messages" type="int | None" default="20">
  Maximum number of new messages before triggering summarization, even if the
  token limit has not been reached. Set to `None` to disable message-count
  triggering. At least one of `max_context_tokens` or
  `max_unsummarized_messages` must be set.
</ParamField>

<ParamField path="summary_config" type="LLMContextSummaryConfig" default="LLMContextSummaryConfig()">
  Configuration for how summaries are generated. See below.
</ParamField>

## LLMContextSummaryConfig

```python theme={null}
from pipecat.utils.context.llm_context_summarization import LLMContextSummaryConfig
```

Controls how summaries are generated. Used as `summary_config` inside `LLMAutoContextSummarizationConfig`, or passed directly to `LLMSummarizeContextFrame` for on-demand summarization.

<ParamField path="target_context_tokens" type="int" default="6000">
  Target token count for the generated summary. Passed to the LLM as
  `max_tokens`. Auto-adjusted to 80% of `max_context_tokens` if it exceeds that
  value.
</ParamField>

<ParamField path="min_messages_after_summary" type="int" default="4">
  Number of recent messages to preserve uncompressed after each summarization.
</ParamField>

<ParamField path="summarization_prompt" type="str | None" default="None">
  Custom system prompt for the LLM when generating summaries. When `None`, uses
  a built-in default prompt.
</ParamField>

<ParamField path="summary_message_template" type="str" default="&#x22;Conversation summary: {summary}&#x22;">
  Template for formatting the summary when injected into context. Must contain `   {summary}` as a placeholder. Allows wrapping summaries in custom delimiters
  (e.g., XML tags) so system prompts can distinguish summaries from live
  conversation.
</ParamField>

<ParamField path="llm" type="LLMService | None" default="None">
  Dedicated LLM service for generating summaries. When set, summarization
  requests are sent to this service instead of the pipeline's primary LLM.
  Useful for routing summarization to a cheaper or faster model. When `None`,
  the pipeline LLM handles summarization.
</ParamField>

<ParamField path="summarization_timeout" type="float" default="120.0">
  Maximum time in seconds to wait for the LLM to generate a summary. If
  exceeded, summarization is aborted and future summarization attempts are
  unblocked.
</ParamField>

## LLMSummarizeContextFrame

```python theme={null}
from pipecat.frames.frames import LLMSummarizeContextFrame
```

Push this frame into the pipeline to trigger on-demand context summarization without waiting for automatic thresholds.

<ParamField path="config" type="LLMContextSummaryConfig | None" default="None">
  Per-request override for summary generation settings (prompt, token budget,
  messages to keep). When `None`, the summarizer's default
  `LLMContextSummaryConfig` is used.
</ParamField>

On-demand summarization works even when `enable_auto_context_summarization` is `False` — the summarizer is always created internally to handle manually pushed frames.

```python theme={null}
from pipecat.frames.frames import LLMSummarizeContextFrame

# Trigger with default settings
await llm.queue_frame(LLMSummarizeContextFrame())

# Trigger with per-request overrides
await llm.queue_frame(
    LLMSummarizeContextFrame(
        config=LLMContextSummaryConfig(
            target_context_tokens=2000,
            min_messages_after_summary=2,
        )
    )
)
```

<Note>
  If a summarization is already in progress, the manual request is ignored.
</Note>

## LLMContextSummarizer

```python theme={null}
from pipecat.processors.aggregators.llm_context_summarizer import LLMContextSummarizer
```

Monitors context size and orchestrates summarization. Created automatically by `LLMAssistantAggregator` when `enable_auto_context_summarization=True`.

### Event Handlers

| Event                | Parameters                   | Description                                                           |
| -------------------- | ---------------------------- | --------------------------------------------------------------------- |
| `on_summary_applied` | `event: SummaryAppliedEvent` | Emitted after a summary has been successfully applied to the context. |

#### on\_summary\_applied

The `on_summary_applied` event is exposed on both `LLMContextSummarizer` and `LLMAssistantAggregator`. Register handlers on the aggregator for cleaner access:

```python theme={null}
@assistant_aggregator.event_handler("on_summary_applied")
async def on_summary_applied(aggregator, summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages "
        f"({event.summarized_message_count} summarized, "
        f"{event.preserved_message_count} preserved)"
    )
```

You can also register handlers directly on the summarizer if you have access to it:

```python theme={null}
summarizer = assistant_aggregator._summarizer
@summarizer.event_handler("on_summary_applied")
async def on_summary_applied(summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages"
    )
```

## SummaryAppliedEvent

```python theme={null}
from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
```

Event data emitted when context summarization completes successfully.

<ParamField path="original_message_count" type="int">
  Number of messages in context before summarization.
</ParamField>

<ParamField path="new_message_count" type="int">
  Number of messages in context after summarization.
</ParamField>

<ParamField path="summarized_message_count" type="int">
  Number of messages that were compressed into the summary.
</ParamField>

<ParamField path="preserved_message_count" type="int">
  Number of messages preserved uncompressed (initial system message at
  `messages[0]` if present, plus recent messages).
</ParamField>

## Deprecated: LLMContextSummarizationConfig

```python theme={null}
from pipecat.utils.context.llm_context_summarization import LLMContextSummarizationConfig
```

<Warning>
  `LLMContextSummarizationConfig` is deprecated since v0.0.104. Use
  `LLMAutoContextSummarizationConfig` with a nested `LLMContextSummaryConfig`
  instead. The old class still works but emits a `DeprecationWarning`.
</Warning>

<Note>
  Both `max_context_tokens` and `max_unsummarized_messages` can now be set to
  `None` independently to disable that threshold. At least one must remain set.
</Note>

The old class flattened all parameters into a single object. Migrate by splitting trigger thresholds (`max_context_tokens`, `max_unsummarized_messages`) into `LLMAutoContextSummarizationConfig` and summary generation params into `LLMContextSummaryConfig`:

```python theme={null}
# Before (deprecated)
config = LLMContextSummarizationConfig(
    max_context_tokens=4000,
    target_context_tokens=3000,
    max_unsummarized_messages=10,
)

# After
config = LLMAutoContextSummarizationConfig(
    max_context_tokens=4000,
    max_unsummarized_messages=10,
    summary_config=LLMContextSummaryConfig(
        target_context_tokens=3000,
    ),
)
```

Similarly, the `LLMAssistantAggregatorParams` fields were renamed:

* `enable_context_summarization` → `enable_auto_context_summarization`
* `context_summarization_config` → `auto_context_summarization_config`

The old field names still work with a `DeprecationWarning`.
