diff --git a/api-reference/pipecat-flows/overview.mdx b/api-reference/pipecat-flows/overview.mdx
index 991435dc..b34339c4 100644
--- a/api-reference/pipecat-flows/overview.mdx
+++ b/api-reference/pipecat-flows/overview.mdx
@@ -107,6 +107,10 @@ Pipecat Flows works with any LLM service that supports function calling. Pipecat
 Any service that extends Pipecat's `LLMService` base class is supported. This includes OpenAI-compatible services like Groq, Together, Cerebras, DeepSeek, and others.
+### Realtime (S2S) models
+
+Realtime speech-to-speech services such as Gemini Live and OpenAI Realtime are not currently supported. See [Using Flows with Realtime Models](/pipecat-flows/guides/realtime-models) for the recommended cascade configuration.
+
 ## Additional Notes
 - **State Management**: Use `flow_manager.state` dictionary for persistent conversation data
diff --git a/api-reference/server/services/s2s/aws.mdx b/api-reference/server/services/s2s/aws.mdx
index 0bb425aa..9092cc38 100644
--- a/api-reference/server/services/s2s/aws.mdx
+++ b/api-reference/server/services/s2s/aws.mdx
@@ -7,6 +7,12 @@ description: "Real-time speech-to-speech service implementation using AWS Nova S
 `AWSNovaSonicLLMService` enables natural, real-time conversations with AWS Nova Sonic. It provides built-in audio transcription, voice activity detection, and context management for creating interactive AI experiences with bidirectional audio streaming, text generation, and function calling capabilities.
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
 Want to start building?
 Check out our [Gemini Live Guide](/pipecat/features/gemini-live) for general concepts, then follow the
diff --git a/api-reference/server/services/s2s/gemini-live.mdx b/api-reference/server/services/s2s/gemini-live.mdx
index f72d4b7e..d5078e22 100644
--- a/api-reference/server/services/s2s/gemini-live.mdx
+++ b/api-reference/server/services/s2s/gemini-live.mdx
@@ -7,6 +7,12 @@ description: "A real-time, multimodal conversational AI service powered by Googl
 `GeminiLiveLLMService` enables natural, real-time conversations with Google's Gemini model. It provides built-in audio transcription, voice activity detection, and context management for creating interactive AI experiences with multimodal capabilities including audio, video, and text processing.
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
 Want to start building? Check out our [Gemini Live Guide](/pipecat/features/gemini-live).
diff --git a/api-reference/server/services/s2s/grok.mdx b/api-reference/server/services/s2s/grok.mdx
index f4841e81..037eb3de 100644
--- a/api-reference/server/services/s2s/grok.mdx
+++ b/api-reference/server/services/s2s/grok.mdx
@@ -7,6 +7,12 @@ description: "Real-time speech-to-speech service implementation using xAI's Grok
 `GrokRealtimeLLMService` provides real-time, multimodal conversation capabilities using xAI's Grok Voice Agent API. It supports speech-to-speech interactions with integrated LLM processing, function calling, and advanced conversation management with low-latency response times.
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ **Not compatible with Pipecat Flows.** Flows requires a cascade LLM service.
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
+ For a complete runnable walkthrough (nodes, functions, and a working end-to-end
+ example), see the [Flows Quickstart](/pipecat-flows/guides/quickstart).
+
+
+## If You Specifically Need Realtime S2S
+
+If speech-to-speech is a hard requirement, build with plain Pipecat (without Flows) and manage conversation state in your own code. The S2S service pages have everything you need to get started:
+
+
+
+ Realtime speech-to-speech with Google Gemini Live
+
+
+ Realtime speech-to-speech with OpenAI's Realtime API
+
+
diff --git a/pipecat-flows/introduction.mdx b/pipecat-flows/introduction.mdx
index cbc68a1a..28198dda 100644
--- a/pipecat-flows/introduction.mdx
+++ b/pipecat-flows/introduction.mdx
@@ -15,6 +15,12 @@ Pipecat Flows is best suited for use cases where:
 - **Your bot handles complex tasks** that can be broken down into smaller, manageable pieces
 - **You want to improve LLM accuracy** by focusing the model on one specific task at a time instead of managing multiple responsibilities simultaneously
+
+ Looking for Gemini Live, OpenAI Realtime, or another speech-to-speech model?
+ See [Using Flows with Realtime
+ Models](/pipecat-flows/guides/realtime-models).
+
+
 ## How Pipecat and Pipecat Flows Work Together
 **Pipecat** defines the core capabilities of your bot — the pipeline and processors that enable receiving audio, transcribing input, running LLM completions, converting responses to audio, and sending audio back to the user.