From 5db0d891632688423e1079e6e9b3f310d333b100 Mon Sep 17 00:00:00 2001
From: Yolanda Robla
Date: Tue, 14 Apr 2026 15:13:31 +0200
Subject: [PATCH] Explain proxyrunner session-aware backend pod routing

Users scaling backendReplicas > 1 had no explanation of why Redis is needed
or how the proxy runner routes sessions to backend pods.

- Add a "Session routing for backend replicas" subsection explaining that
  the proxy runner uses Redis to store session-to-pod mappings, what happens
  when a backend pod restarts, and why client-IP affinity alone is unreliable
  behind NAT
- Absorb the standalone SessionStorageWarning note into the new subsection
  so the backendReplicas implication is explicit
- Update Redis link to point directly to the horizontal scaling anchor

Closes #708

Co-Authored-By: Claude Sonnet 4.6
---
 docs/toolhive/guides-k8s/run-mcp-k8s.mdx | 41 ++++++++++++++++--------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
index e0154331..5254899a 100644
--- a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
+++ b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
@@ -453,17 +453,40 @@ The proxy runner handles authentication, MCP protocol framing, and session
 management; it is stateless with respect to tool execution. The backend runs
 the actual MCP server and executes tools.
 
+### Session routing for backend replicas
+
+MCP connections are stateful: once a client establishes a session with a
+specific backend pod, all subsequent requests in that session must reach the
+same pod. When `backendReplicas > 1`, the proxy runner uses Redis to store a
+session-to-pod mapping so every proxy runner replica knows which backend pod
+owns each session.
+
+Without Redis, the proxy runner falls back to Kubernetes client-IP session
+affinity on the backend Service, which is unreliable behind NAT or shared egress
+IPs.
+If a backend pod is restarted or replaced, its entry in the Redis routing
+table is invalidated and the next request reconnects to an available pod;
+sessions are not automatically migrated between pods.
+
+:::note
+
+The `SessionStorageWarning` condition fires only when `spec.replicas > 1`
+(multiple proxy runner pods). It does not fire when only `backendReplicas > 1`,
+but Redis session storage is still strongly recommended in that case to ensure
+reliable per-session pod routing.
+
+:::
+
 Common configurations:
 
 - **Scale only the proxy** (`replicas: N`, omit `backendReplicas`): useful
   when auth and connection overhead is the bottleneck with a single backend.
 - **Scale only the backend** (omit `replicas`, `backendReplicas: M`): useful
-  when tool execution is CPU/memory-bound and the proxy is not a bottleneck. The
-  backend Deployment uses client-IP session affinity to route repeated
-  connections to the same pod - subject to the same NAT limitations as
-  proxy-level affinity.
+  when tool execution is CPU/memory-bound and the proxy is not a bottleneck.
+  Configure Redis session storage so the proxy runner can route requests to the
+  correct backend pod.
 - **Scale both** (`replicas: N`, `backendReplicas: M`): full horizontal scale.
-  Redis session storage is required when `replicas > 1`.
+  Redis session storage is required for reliable operation when `replicas > 1`,
+  and strongly recommended when `backendReplicas > 1`.
 
 ```yaml title="MCPServer resource"
 spec:
@@ -484,14 +507,6 @@ When running multiple replicas, configure
 across pods. If you omit `replicas` or `backendReplicas`, the operator defers
 replica management to an HPA or other external controller.
 
-:::note
-
-The `SessionStorageWarning` condition fires only when `spec.replicas > 1`.
-Scaling only the backend (`backendReplicas > 1`) does not trigger a warning, but
-backend client-IP affinity is still unreliable behind NAT or shared egress IPs.
-
-:::
-
 :::note[Connection draining on scale-down]
 
 When a proxy runner pod is terminated (scale-in, rolling update, or node