From 6c149858800ba10e644c65db61b36d995cf454bd Mon Sep 17 00:00:00 2001
From: James Hush
Date: Fri, 17 Apr 2026 15:43:35 +0800
Subject: [PATCH 1/3] Document context summarization defaults, limitations,
 and aggregator params

Addresses three kapa.ai coverage gaps: the default value of
enable_auto_context_summarization, the absence of a truncation-only mode,
and the fact that only messages[0] is preserved as a system message.

- Add LLMAssistantAggregatorParams section to the reference page with
  enable_auto_context_summarization (default: False) and
  auto_context_summarization_config, plus a note on the deprecated field
  names.
- Add a Limitations section to the fundamentals page covering the
  no-truncation-only-mode constraint and the single-system-message
  preservation behavior.
- State the default for enable_auto_context_summarization explicitly in the
  fundamentals page.
- Update the context-management page to use the non-deprecated
  enable_auto_context_summarization field name.
---
 .../utilities/context-summarization.mdx        | 34 ++++++++++++++++++-
 .../fundamentals/context-summarization.mdx     | 24 ++++++++++++-
 pipecat/learn/context-management.mdx           |  4 +--
 3 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/api-reference/server/utilities/context-summarization.mdx b/api-reference/server/utilities/context-summarization.mdx
index da3ca980..e604778f 100644
--- a/api-reference/server/utilities/context-summarization.mdx
+++ b/api-reference/server/utilities/context-summarization.mdx
@@ -5,10 +5,42 @@ description: "Reference for LLMAutoContextSummarizationConfig, LLMContextSummary
 
 ## Overview
 
-Context summarization automatically compresses older conversation history when token or message limits are reached. It is configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.
+Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on `LLMAssistantAggregatorParams`, configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.
 
 For a walkthrough of how to enable and customize context summarization, see the [Context Summarization guide](/pipecat/fundamentals/context-summarization).
 
+## LLMAssistantAggregatorParams
+
+```python
+from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
+```
+
+The summarization-related fields on `LLMAssistantAggregatorParams`.
+
+<ParamField path="enable_auto_context_summarization" type="bool" default="False">
+  Enables automatic context summarization. When `False` (the default), the
+  summarizer is still created internally so that on-demand summarization via
+  `LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
+  Set to `True` to enable automatic summarization when either
+  `max_context_tokens` or `max_unsummarized_messages` is reached.
+</ParamField>
+
+<ParamField path="auto_context_summarization_config" type="Optional[LLMAutoContextSummarizationConfig]" default="None">
+  Configuration for automatic summarization thresholds and summary generation.
+  When `None`, default `LLMAutoContextSummarizationConfig` values are used.
+</ParamField>
+
+<Note>
+  The older field names `enable_context_summarization` and
+  `context_summarization_config` are deprecated but still accepted. Passing
+  them emits a `DeprecationWarning` and the values are mapped to the new
+  fields. See the deprecation section at the bottom of this page.
+</Note>
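+
+For example, to enable automatic summarization when creating your context
+aggregators (a minimal sketch that assumes an existing `context` object):
+
+```python
+user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
+    context,
+    assistant_params=LLMAssistantAggregatorParams(
+        enable_auto_context_summarization=True,
+    ),
+)
+```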
+
 ## LLMAutoContextSummarizationConfig
 
 ```python
diff --git a/pipecat/fundamentals/context-summarization.mdx b/pipecat/fundamentals/context-summarization.mdx
index 2ae58c16..5ac6b67b 100644
--- a/pipecat/fundamentals/context-summarization.mdx
+++ b/pipecat/fundamentals/context-summarization.mdx
@@ -49,7 +49,7 @@ user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
 )
 ```
 
-With the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
+Automatic summarization is **disabled by default** (`enable_auto_context_summarization=False`). When enabled with the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
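+
+As an illustration, enabling summarization and adjusting its thresholds might
+look like this (a sketch that assumes `LLMAutoContextSummarizationConfig` is
+importable from the same module as `LLMAssistantAggregatorParams`; check the
+API reference for the exact import path):
+
+```python
+assistant_params = LLMAssistantAggregatorParams(
+    enable_auto_context_summarization=True,
+    auto_context_summarization_config=LLMAutoContextSummarizationConfig(
+        max_context_tokens=12000,        # illustrative: trigger at ~12k estimated tokens
+        max_unsummarized_messages=None,  # disable the message-count trigger
+    ),
+)
+```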
 
 ## Customizing Behavior
 
@@ -88,6 +88,28 @@ Context summarization intelligently preserves:
 - **Function call sequences**: Incomplete function call/result pairs are not split during summarization
 - **Developer messages are NOT preserved**: Developer messages (`"role": "developer"`) are included in the summarization range like any other message and may be compressed or dropped. If instructions need to survive summarization, use [`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality) instead.
 
+## Limitations
+
+<Warning>
+  Context summarization always generates a summary. Pipecat does not provide
+  a truncation-only mode that drops old messages without summarizing them. If
+  you want to bound context size, the available knob is to tune
+  `max_context_tokens`, `max_unsummarized_messages`, `target_context_tokens`,
+  and `min_messages_after_summary`. Setting both `max_context_tokens` and
+  `max_unsummarized_messages` to `None` is not allowed, so summarization
+  cannot be reduced to pure truncation.
+</Warning>
+
+<Warning>
+  Only the message at `messages[0]` is preserved as the initial system
+  message. Preserving additional system messages (for example, the first two)
+  is not configurable. Mid-conversation system messages are treated as
+  regular messages and are included in the summarization range. If you need
+  persistent instructions that survive summarization, use
+  [`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality)
+  in LLM Settings instead of additional system-role messages.
+</Warning>
+
 ## Custom Summarization Prompts
 
 You can override the default summarization prompt to control how the LLM generates summaries:
diff --git a/pipecat/learn/context-management.mdx b/pipecat/learn/context-management.mdx
index 38820329..b5049705 100644
--- a/pipecat/learn/context-management.mdx
+++ b/pipecat/learn/context-management.mdx
@@ -259,13 +259,13 @@ context = context_aggregator.user().context
 
 In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages.
 
-Enable it by setting `enable_context_summarization=True` when creating your context aggregators:
+Enable it by setting `enable_auto_context_summarization=True` when creating your context aggregators (default: `False`):
 
 ```python
 user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
     context,
     assistant_params=LLMAssistantAggregatorParams(
-        enable_context_summarization=True,
+        enable_auto_context_summarization=True,
     ),
 )
 ```

From 19504bbd39cf87519c6d9fb9c608721f791beede Mon Sep 17 00:00:00 2001
From: James Hush
Date: Fri, 17 Apr 2026 15:52:00 +0800
Subject: [PATCH 2/3] Update pipecat/fundamentals/context-summarization.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 pipecat/fundamentals/context-summarization.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pipecat/fundamentals/context-summarization.mdx b/pipecat/fundamentals/context-summarization.mdx
index 5ac6b67b..d19dce15 100644
--- a/pipecat/fundamentals/context-summarization.mdx
+++ b/pipecat/fundamentals/context-summarization.mdx
@@ -93,7 +93,7 @@ Context summarization intelligently preserves:
   Context summarization always generates a summary. Pipecat does not provide
   a truncation-only mode that drops old messages without summarizing them. If
-  you want to bound context size, the available knob is to tune
+  you want to bound context size, the available knobs are to tune
   `max_context_tokens`, `max_unsummarized_messages`, `target_context_tokens`,
   and `min_messages_after_summary`. Setting both `max_context_tokens` and
   `max_unsummarized_messages` to `None` is not allowed, so summarization
   cannot be reduced to pure truncation.

From 89d6c0f07169e4e932ac6c83af638a897a734946 Mon Sep 17 00:00:00 2001
From: Mark Backman
Date: Fri, 17 Apr 2026 07:45:03 -0400
Subject: [PATCH 3/3] Context summarization fixes

---
 .../utilities/context-summarization.mdx        | 25 +++++++------------
 .../fundamentals/context-summarization.mdx     | 24 +-----------------
 2 files changed, 10 insertions(+), 39 deletions(-)

diff --git a/api-reference/server/utilities/context-summarization.mdx b/api-reference/server/utilities/context-summarization.mdx
index e604778f..c5ef0027 100644
--- a/api-reference/server/utilities/context-summarization.mdx
+++ b/api-reference/server/utilities/context-summarization.mdx
@@ -17,7 +17,11 @@ from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAg
 
 The summarization-related fields on `LLMAssistantAggregatorParams`.
 
-<ParamField path="enable_auto_context_summarization" type="bool" default="False">
+<ParamField
+  path="enable_auto_context_summarization"
+  type="bool"
+  default="False"
+>
   Enables automatic context summarization. When `False` (the default), the
   summarizer is still created internally so that on-demand summarization via
   `LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
@@ -27,20 +31,13 @@ The summarization-related fields on `LLMAssistantAggregatorParams`.
 <ParamField path="auto_context_summarization_config" type="Optional[LLMAutoContextSummarizationConfig]" default="None">
   Configuration for automatic summarization thresholds and summary generation.
   When `None`, default `LLMAutoContextSummarizationConfig` values are used.
 </ParamField>
-
-<Note>
-  The older field names `enable_context_summarization` and
-  `context_summarization_config` are deprecated but still accepted. Passing
-  them emits a `DeprecationWarning` and the values are mapped to the new
-  fields. See the deprecation section at the bottom of this page.
-</Note>
 
 ## LLMAutoContextSummarizationConfig
 
 ```python
@@ -112,10 +109,10 @@ Controls how summaries are generated. Used as `summary_config` inside `LLMAutoCo
   the pipeline LLM handles summarization.
 </ParamField>
 
-<ParamField path="summary_timeout_secs" type="Optional[float]">
+<ParamField path="summary_timeout_secs" type="float">
   Maximum time in seconds to wait for the LLM to generate a summary. If
   exceeded, summarization is aborted and future summarization attempts are
-  unblocked. Set to `None` to disable the timeout.
+  unblocked.
 </ParamField>
 
 ## LLMSummarizeContextFrame
@@ -123,15 +120,22 @@
 ```python
 from pipecat.frames.frames import LLMSummarizeContextFrame
 ```
 
 Push this frame into the pipeline to trigger on-demand context summarization without waiting for automatic thresholds.
 
-<ParamField
-  path="summary_config"
-  type="Optional[LLMContextSummaryConfig]"
-  default="None"
->
+<ParamField path="summary_config" type="Optional[LLMContextSummaryConfig]" default="None">
   Per-request override for summary generation settings (prompt, token budget,
   messages to keep). When `None`, the summarizer's default
   `LLMContextSummaryConfig` is used.
 </ParamField>
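+
+For example, to trigger an on-demand summary from application code (a sketch
+assuming a running `PipelineTask` named `task`):
+
+```python
+from pipecat.frames.frames import LLMSummarizeContextFrame
+
+# Queue the frame; the summarizer's default LLMContextSummaryConfig is
+# used because summary_config is None.
+await task.queue_frame(LLMSummarizeContextFrame())
+```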
diff --git a/pipecat/fundamentals/context-summarization.mdx b/pipecat/fundamentals/context-summarization.mdx
index d19dce15..1f6b56e7 100644
--- a/pipecat/fundamentals/context-summarization.mdx
+++ b/pipecat/fundamentals/context-summarization.mdx
@@ -15,7 +15,7 @@ Context summarization automatically triggers when **either** condition is met:
 - **Token limit reached**: Context size exceeds `max_context_tokens` (estimated using ~4 characters per token)
 - **Message count reached**: Number of new messages exceeds `max_unsummarized_messages`
 
-You can disable either threshold by setting it to `None`, as long as at least one remains active.
+You can disable either threshold by setting it to `None`, but at least one must remain active. Summarization always generates a summary and cannot be reduced to pure truncation.
 
 When triggered, the system:
 
@@ -88,28 +88,6 @@ Context summarization intelligently preserves:
 - **Function call sequences**: Incomplete function call/result pairs are not split during summarization
 - **Developer messages are NOT preserved**: Developer messages (`"role": "developer"`) are included in the summarization range like any other message and may be compressed or dropped. If instructions need to survive summarization, use [`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality) instead.
 
-## Limitations
-
-<Warning>
-  Context summarization always generates a summary. Pipecat does not provide
-  a truncation-only mode that drops old messages without summarizing them. If
-  you want to bound context size, the available knobs are to tune
-  `max_context_tokens`, `max_unsummarized_messages`, `target_context_tokens`,
-  and `min_messages_after_summary`. Setting both `max_context_tokens` and
-  `max_unsummarized_messages` to `None` is not allowed, so summarization
-  cannot be reduced to pure truncation.
-</Warning>
-
-<Warning>
-  Only the message at `messages[0]` is preserved as the initial system
-  message. Preserving additional system messages (for example, the first two)
-  is not configurable. Mid-conversation system messages are treated as
-  regular messages and are included in the summarization range. If you need
-  persistent instructions that survive summarization, use
-  [`system_instruction`](/pipecat/learn/context-management#using-system_instruction-recommended-for-personality)
-  in LLM Settings instead of additional system-role messages.
-</Warning>
-
 ## Custom Summarization Prompts
 
 You can override the default summarization prompt to control how the LLM generates summaries: