**`api-reference/server/utilities/context-summarization.mdx`** (33 additions, 1 deletion)

## Overview

Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on `LLMAssistantAggregatorParams`, configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.

For a walkthrough of how to enable and customize context summarization, see the [Context Summarization guide](/pipecat/fundamentals/context-summarization).

## LLMAssistantAggregatorParams

```python
from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
```

`LLMAssistantAggregatorParams` exposes the following summarization-related fields.

<ParamField path="enable_auto_context_summarization" type="bool" default="False">
Enables automatic context summarization. When `False` (the default), the
summarizer is still created internally so that on-demand summarization via
`LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
Set to `True` to enable automatic summarization when either
`max_context_tokens` or `max_unsummarized_messages` is reached.
</ParamField>

<ParamField
path="auto_context_summarization_config"
type="Optional[LLMAutoContextSummarizationConfig]"
default="None"
>
Configuration for automatic summarization thresholds and summary generation.
When `None`, default `LLMAutoContextSummarizationConfig` values are used.
</ParamField>
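
Taken together, the two fields above can be set when constructing the aggregator params. A minimal config sketch (it assumes Pipecat is installed; leaving the config unset relies on the documented defaults):

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)

# Leaving auto_context_summarization_config unset (None) applies the
# default LLMAutoContextSummarizationConfig thresholds.
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
)
```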

<Note>
The older field names `enable_context_summarization` and
`context_summarization_config` are deprecated but still accepted. Passing
them emits a `DeprecationWarning` and the values are mapped to the new
fields. See the deprecation section at the bottom of this page.
</Note>
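
The mapping behaves roughly like the following illustrative sketch (not Pipecat's actual implementation; `resolve_params` is a hypothetical helper shown only to clarify the deprecation semantics):

```python
import warnings

# Hypothetical helper: maps deprecated keyword names to their
# replacements and emits a DeprecationWarning for each old name used.
_DEPRECATED = {
    "enable_context_summarization": "enable_auto_context_summarization",
    "context_summarization_config": "auto_context_summarization_config",
}

def resolve_params(kwargs: dict) -> dict:
    resolved = dict(kwargs)
    for old, new in _DEPRECATED.items():
        if old in resolved:
            warnings.warn(
                f"{old} is deprecated; use {new}", DeprecationWarning
            )
            # An explicitly passed new-style value wins over the old one.
            resolved.setdefault(new, resolved.pop(old))
    return resolved
```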

## LLMAutoContextSummarizationConfig

**`pipecat/fundamentals/context-summarization.mdx`** (23 additions, 1 deletion)

```python
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    # ...
)
```

Automatic summarization is **disabled by default** (`enable_auto_context_summarization=False`). When enabled with the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
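
The "whichever comes first" rule can be pictured with a small illustrative check. This is a sketch of the documented behavior, not Pipecat's internal code, and `should_summarize` is a hypothetical name:

```python
from typing import Optional

def should_summarize(
    estimated_tokens: int,
    unsummarized_messages: int,
    max_context_tokens: Optional[int] = 8000,
    max_unsummarized_messages: Optional[int] = 20,
) -> bool:
    """Return True when either documented threshold is reached."""
    if max_context_tokens is not None and estimated_tokens >= max_context_tokens:
        return True
    if max_unsummarized_messages is not None and unsummarized_messages >= max_unsummarized_messages:
        return True
    return False
```

With the defaults, either crossing 8000 estimated tokens or accumulating 20 new messages triggers a summarization pass.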

## Customizing Behavior

Context summarization intelligently preserves:
- **Function call sequences**: Incomplete function call/result pairs are not split during summarization
- **Developer messages are NOT preserved**: Developer messages (`"role": "developer"`) are included in the summarization range like any other message and may be compressed or dropped. If instructions need to survive summarization, use [`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality) instead.
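
The function-call rule can be illustrated with a sketch of how a proposed cut point might be adjusted so it never lands between a call and its result. This is illustrative only (OpenAI-style message shapes are assumed; the helper names are hypothetical):

```python
def splits_function_pair(messages: list, cut: int) -> bool:
    """True if cutting before index `cut` would separate an assistant
    function call from its result (OpenAI-style message shapes assumed)."""
    return (
        0 < cut < len(messages)
        and messages[cut - 1].get("role") == "assistant"
        and "tool_calls" in messages[cut - 1]
        and messages[cut].get("role") == "tool"
    )

def adjust_cut(messages: list, cut: int) -> int:
    """Move a proposed summarization cut point earlier until it no
    longer splits a function call/result pair."""
    while splits_function_pair(messages, cut):
        cut -= 1
    return cut
```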

## Limitations

<Warning>
Context summarization always generates a summary. Pipecat does not provide
a truncation-only mode that drops old messages without summarizing them. If
you want to bound context size, the available knobs to tune are
`max_context_tokens`, `max_unsummarized_messages`, `target_context_tokens`,
and `min_messages_after_summary`. Setting both `max_context_tokens` and
`max_unsummarized_messages` to `None` is not allowed, so summarization
cannot be reduced to pure truncation.
</Warning>
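
The "not both `None`" constraint can be expressed as a small validation sketch (illustrative only; `validate_thresholds` is a hypothetical name, and the real library raises its own error):

```python
from typing import Optional

def validate_thresholds(
    max_context_tokens: Optional[int],
    max_unsummarized_messages: Optional[int],
) -> None:
    """Reject configurations that would disable both triggers, since a
    pure-truncation mode (dropping messages without a summary) is not
    supported."""
    if max_context_tokens is None and max_unsummarized_messages is None:
        raise ValueError(
            "max_context_tokens and max_unsummarized_messages "
            "cannot both be None"
        )
```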

<Warning>
Only the message at `messages[0]` is preserved as the initial system
message. Preserving additional system messages (for example, the first two)
is not configurable. Mid-conversation system messages are treated as
regular messages and are included in the summarization range. If you need
persistent instructions that survive summarization, use
[`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality)
in LLM Settings instead of additional system-role messages.
</Warning>
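
The preservation rule can be pictured as follows (a sketch of the documented behavior; `summarizable_range` is a hypothetical name, not a Pipecat API):

```python
def summarizable_range(messages: list) -> list:
    """Only messages[0] is kept as the initial system message.
    Everything else, including mid-conversation system-role messages,
    is eligible for summarization."""
    return messages[1:]
```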

## Custom Summarization Prompts

You can override the default summarization prompt to control how the LLM generates summaries:
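
As a rough illustration of what a custom summarization prompt might look like (the prompt text below is an assumption, not Pipecat's default; see the API reference for the actual configuration field):

```python
# Example prompt text only; wire it into LLMContextSummaryConfig as
# described in the API reference.
custom_prompt = (
    "Summarize the conversation so far in under 150 words. "
    "Preserve user preferences, commitments, and open questions."
)
```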
**`pipecat/learn/context-management.mdx`** (2 additions, 2 deletions)

In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages.

Enable it by setting `enable_auto_context_summarization=True` when creating your context aggregators (default: `False`):

```python
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
context,
assistant_params=LLMAssistantAggregatorParams(
enable_auto_context_summarization=True,
),
)
```