diff --git a/api-reference/server/utilities/context-summarization.mdx b/api-reference/server/utilities/context-summarization.mdx
index da3ca980..c5ef0027 100644
--- a/api-reference/server/utilities/context-summarization.mdx
+++ b/api-reference/server/utilities/context-summarization.mdx
@@ -5,10 +5,39 @@ description: "Reference for LLMAutoContextSummarizationConfig, LLMContextSummary
 
 ## Overview
 
-Context summarization automatically compresses older conversation history when token or message limits are reached. It is configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.
+Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on `LLMAssistantAggregatorParams`, configured via `LLMAutoContextSummarizationConfig` (auto-trigger thresholds) and `LLMContextSummaryConfig` (summary generation params), and managed by `LLMContextSummarizer`.
 
 For a walkthrough of how to enable and customize context summarization, see the [Context Summarization guide](/pipecat/fundamentals/context-summarization).
 
+## LLMAssistantAggregatorParams
+
+```python
+from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
+```
+
+The summarization-related fields on `LLMAssistantAggregatorParams`.
+
+  Enables automatic context summarization. When `False` (the default), the
+  summarizer is still created internally so that on-demand summarization via
+  `LLMSummarizeContextFrame` works, but automatic trigger checks are skipped.
+  Set to `True` to enable automatic summarization when either
+  `max_context_tokens` or `max_unsummarized_messages` is reached.
+
+  Configuration for automatic summarization thresholds and summary generation.
+  When `None`, default `LLMAutoContextSummarizationConfig` values are used.
+
 ## LLMAutoContextSummarizationConfig
 
 ```python
@@ -80,10 +109,10 @@ Controls how summaries are generated. Used as `summary_config` inside `LLMAutoCo
   the pipeline LLM handles summarization.
 
-
+
   Maximum time in seconds to wait for the LLM to generate a summary. If
   exceeded, summarization is aborted and future summarization attempts are
-  unblocked. Set to `None` to disable the timeout.
+  unblocked.
 
 ## LLMSummarizeContextFrame
 
 ```python
@@ -94,11 +123,7 @@ from pipecat.frames.frames import LLMSummarizeContextFrame
 ```
 
 Push this frame into the pipeline to trigger on-demand context summarization
 without waiting for automatic thresholds.
 
-
+
   Per-request override for summary generation settings (prompt, token budget,
   messages to keep). When `None`, the summarizer's default
   `LLMContextSummaryConfig` is used.
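The reference above notes that the summarizer is still created even when automatic summarization is disabled, so `LLMSummarizeContextFrame` can drive summarization on demand. Here is a minimal sketch of that path, assuming a running `PipelineTask` named `task`; the per-request override field is not shown in the hunk above, so it is omitted here.

```python
# Sketch: trigger context summarization on demand, without waiting for the
# automatic thresholds. Assumes an existing, running PipelineTask `task`.
from pipecat.frames.frames import LLMSummarizeContextFrame


async def summarize_now(task):
    # Works even when enable_auto_context_summarization is False: the
    # summarizer is created internally either way (see the reference above).
    await task.queue_frame(LLMSummarizeContextFrame())
```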
diff --git a/pipecat/fundamentals/context-summarization.mdx b/pipecat/fundamentals/context-summarization.mdx
index 2ae58c16..1f6b56e7 100644
--- a/pipecat/fundamentals/context-summarization.mdx
+++ b/pipecat/fundamentals/context-summarization.mdx
@@ -15,7 +15,7 @@ Context summarization automatically triggers when **either** condition is met:
 
 - **Token limit reached**: Context size exceeds `max_context_tokens` (estimated using ~4 characters per token)
 - **Message count reached**: Number of new messages exceeds `max_unsummarized_messages`
 
-You can disable either threshold by setting it to `None`, as long as at least one remains active.
+You can disable either threshold by setting it to `None`, but at least one must remain active.
 
 Summarization always generates a summary and cannot be reduced to pure truncation. When triggered, the system:
@@ -49,7 +49,7 @@ user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
 )
 ```
 
-With the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
+Automatic summarization is **disabled by default** (`enable_auto_context_summarization=False`). When enabled with the default configuration, summarization triggers at 8000 estimated tokens or after 20 new messages, whichever comes first.
 
 ## Customizing Behavior
 
diff --git a/pipecat/learn/context-management.mdx b/pipecat/learn/context-management.mdx
index 38820329..b5049705 100644
--- a/pipecat/learn/context-management.mdx
+++ b/pipecat/learn/context-management.mdx
@@ -259,13 +259,13 @@ context = context_aggregator.user().context
 
 In long-running conversations, context grows with every exchange, increasing token usage and potentially hitting context window limits. Pipecat includes built-in context summarization that automatically compresses older conversation history while preserving recent messages.
 
-Enable it by setting `enable_context_summarization=True` when creating your context aggregators:
+Enable it by setting `enable_auto_context_summarization=True` when creating your context aggregators (default: `False`):
 
 ```python
 user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
     context,
     assistant_params=LLMAssistantAggregatorParams(
-        enable_context_summarization=True,
+        enable_auto_context_summarization=True,
     ),
 )
 ```
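Taken together, here is a sketch of the renamed flag with custom thresholds. `enable_auto_context_summarization`, `max_context_tokens`, and `max_unsummarized_messages` appear in the diffs above; the import paths for `LLMContextAggregatorPair` and `LLMAutoContextSummarizationConfig`, and the `auto_context_summarization_config` field name, are assumptions (the hunks above do not show them), so treat them as hypothetical.

```python
# Sketch: enable automatic summarization with custom thresholds.
# The second import path and the `auto_context_summarization_config` field
# name are hypothetical; the other identifiers appear in the diffs above.
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
    LLMContextAggregatorPair,  # assumed to live alongside the params class
)
from pipecat.processors.aggregators.llm_context_summarizer import (  # hypothetical path
    LLMAutoContextSummarizationConfig,
)

summarization_config = LLMAutoContextSummarizationConfig(
    max_context_tokens=12000,        # None disables the token threshold
    max_unsummarized_messages=None,  # either threshold may be None, not both
)

# `context` is an existing LLM context object, as in the examples above.
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    assistant_params=LLMAssistantAggregatorParams(
        enable_auto_context_summarization=True,  # off by default
        auto_context_summarization_config=summarization_config,  # hypothetical name
    ),
)
```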