Merged
41 changes: 33 additions & 8 deletions api-reference/server/utilities/context-summarization.mdx
@@ -5,10 +5,39 @@ description: "Reference for LLMAutoContextSummarizationConfig, LLMContextSummary

## Overview

- Context summarization automatically compresses older conversation history when token or message limits are reached. It is configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.
+ Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on `LLMAssistantAggregatorParams`, configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.

For a walkthrough of how to enable and customize context summarization, see the [Context Summarization guide](/pipecat/fundamentals/context-summarization).

## LLMAssistantAggregatorParams

```python
from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
```

The following summarization-related fields are available on `LLMAssistantAggregatorParams`.

<ParamField
path="enable_auto_context_summarization"
type="bool"
default="False"
>
Enables automatic context summarization. When `False` (the default), the
summarizer is still created internally so that on-demand summarization via
`LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
Set to `True` to enable automatic summarization when either
`max_context_tokens` or `max_unsummarized_messages` is reached.
</ParamField>

<ParamField
path="auto_context_summarization_config"
type="LLMAutoContextSummarizationConfig | None"
default="None"
>
Configuration for automatic summarization thresholds and summary generation.
When `None`, default `LLMAutoContextSummarizationConfig` values are used.
</ParamField>

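Taken together, the two fields above can be set when constructing the aggregator params. A minimal sketch, assuming only what is documented above and the import path shown earlier in this section:

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)

# Opt in to automatic summarization; leaving the config as None means
# default LLMAutoContextSummarizationConfig values are used.
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=None,
)
```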
## LLMAutoContextSummarizationConfig

```python
@@ -80,10 +109,10 @@ Controls how summaries are generated. Used as `summary_config` inside `LLMAutoCo
the pipeline LLM handles summarization.
</ParamField>

- <ParamField path="summarization_timeout" type="Optional[float]" default="120.0">
+ <ParamField path="summarization_timeout" type="float" default="120.0">
Maximum time in seconds to wait for the LLM to generate a summary. If
exceeded, summarization is aborted and future summarization attempts are
- unblocked. Set to `None` to disable the timeout.
+ unblocked.
</ParamField>

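As a sketch, the timeout can be set when constructing `LLMContextSummaryConfig`. The import path is an assumption (it is not shown in this excerpt); the same module as `LLMAssistantAggregatorParams` is used here for illustration:

```python
# Assumed import path, for illustration only.
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextSummaryConfig,
)

# Tighten the summary-generation timeout from the 120.0 s default.
summary_config = LLMContextSummaryConfig(summarization_timeout=60.0)
```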
## LLMSummarizeContextFrame
Expand All @@ -94,11 +123,7 @@ from pipecat.frames.frames import LLMSummarizeContextFrame

Push this frame into the pipeline to trigger on-demand context summarization without waiting for automatic thresholds.

- <ParamField
- path="config"
- type="Optional[LLMContextSummaryConfig]"
- default="None"
- >
+ <ParamField path="config" type="LLMContextSummaryConfig | None" default="None">
Per-request override for summary generation settings (prompt, token budget,
messages to keep). When `None`, the summarizer's default
`LLMContextSummaryConfig` is used.
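A rough sketch of queueing the frame, assuming a running `PipelineTask` named `task` (the `queue_frames` call follows the usual Pipecat task API, which this excerpt does not show):

```python
from pipecat.frames.frames import LLMSummarizeContextFrame

async def summarize_now(task):
    # With no config argument (config=None), the summarizer's default
    # LLMContextSummaryConfig is used.
    await task.queue_frames([LLMSummarizeContextFrame()])
```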
4 changes: 2 additions & 2 deletions pipecat/fundamentals/context-summarization.mdx
@@ -15,7 +15,7 @@ Context summarization automatically triggers when **either** condition is met:
- **Token limit reached**: Context size exceeds `max_context_tokens` (estimated using ~4 characters per token)
- **Message count reached**: Number of new messages exceeds `max_unsummarized_messages`

- You can disable either threshold by setting it to `None`, as long as at least one remains active.
+ You can disable either threshold by setting it to `None`, but at least one must remain active. Summarization always generates a summary and cannot be reduced to pure truncation.
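The either-condition check can be modeled in plain Python. This is an illustrative sketch of the trigger logic, not the library's implementation, using the ~4-characters-per-token estimate mentioned above:

```python
def should_summarize(context_text, new_message_count,
                     max_context_tokens=8000, max_unsummarized_messages=20):
    """Illustrative model of the auto-trigger check (not library code).

    Either threshold may be disabled by passing None, but at least one
    must remain active.
    """
    if max_context_tokens is None and max_unsummarized_messages is None:
        raise ValueError("at least one threshold must remain active")
    estimated_tokens = len(context_text) // 4  # ~4 characters per token
    over_tokens = (max_context_tokens is not None
                   and estimated_tokens > max_context_tokens)
    over_messages = (max_unsummarized_messages is not None
                     and new_message_count > max_unsummarized_messages)
    return over_tokens or over_messages  # either condition triggers

print(should_summarize("x" * 40000, 3))   # → True (10000 est. tokens > 8000)
print(should_summarize("hi", 25))         # → True (25 messages > 20)
print(should_summarize("hi", 3))          # → False (neither threshold reached)
```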

When triggered, the system:

@@ -49,7 +49,7 @@ user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
)
```

- With the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
+ Automatic summarization is **disabled by default** (`enable_auto_context_summarization=False`). When enabled with the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.

## Customizing Behavior

4 changes: 2 additions & 2 deletions pipecat/learn/context-management.mdx
@@ -259,13 +259,13 @@ context = context_aggregator.user().context

In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages.

- Enable it by setting `enable_context_summarization=True` when creating your context aggregators:
+ Enable it by setting `enable_auto_context_summarization=True` when creating your context aggregators (default: `False`):

```python
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    assistant_params=LLMAssistantAggregatorParams(
-        enable_context_summarization=True,
+        enable_auto_context_summarization=True,
    ),
)
```