41 changes: 28 additions & 13 deletions docs/toolhive/guides-k8s/run-mcp-k8s.mdx
@@ -453,17 +453,40 @@ The proxy runner handles authentication, MCP protocol framing, and session
management; it is stateless with respect to tool execution. The backend runs the
actual MCP server and executes tools.

### Session routing for backend replicas

MCP connections are stateful: once a client establishes a session with a
specific backend pod, all subsequent requests in that session must reach the
same pod. When `backendReplicas > 1` and Redis session storage is configured,
the proxy runner uses Redis to store a session-to-pod mapping so every proxy
runner replica knows which backend pod owns each session.

Without Redis session storage, the proxy runner relies on Kubernetes client-IP
session affinity on the backend Service, which is unreliable behind NAT or
shared egress IPs. When Redis routing is in use and a backend pod is restarted
or replaced, its entry in the routing table is invalidated and the next request
reconnects to an available pod — sessions are not automatically migrated
between pods.
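As a rough illustration of this routing rule, here is a toy model (not
ToolHive's implementation: the `SessionRouter` class and pod names are
hypothetical, and a plain dict stands in for the shared Redis store that all
proxy runner replicas would consult):

```python
class SessionRouter:
    """Maps MCP session IDs to backend pod addresses via a shared store."""

    def __init__(self, store):
        # `store` is any dict-like shared store; in a real deployment this
        # role is played by Redis so every proxy replica sees the same mapping.
        self.store = store

    def route(self, session_id, live_pods):
        pod = self.store.get(session_id)
        if pod in live_pods:
            return pod  # sticky: the whole session stays on one pod
        # New session, or the recorded pod was restarted/replaced: the stale
        # entry is discarded, an available pod is chosen, and the mapping is
        # updated. Session state is not migrated to the new pod.
        pod = live_pods[0]
        self.store[session_id] = pod
        return pod

router = SessionRouter(store={})
first = router.route("sess-1", ["pod-a", "pod-b"])
again = router.route("sess-1", ["pod-a", "pod-b"])
assert first == again  # all requests in the session reach the same pod

# pod-a is replaced: the session reconnects to an available pod.
moved = router.route("sess-1", ["pod-b", "pod-c"])
```

The point of the shared store is that any proxy replica can serve any request
in the session and still forward it to the owning backend pod.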
Review comment on lines +460 to +468 (Copilot AI, Apr 15, 2026): Redis-backed
routing applies only when Redis session storage is configured; otherwise the
proxy runner relies on Service-level client-IP affinity, with the limitations
described.

:::note

The `SessionStorageWarning` condition only fires when `spec.replicas > 1`
(multiple proxy runner pods). It does not fire when only `backendReplicas > 1`,
but Redis session storage is still strongly recommended in that case to ensure
reliable per-session pod routing.

:::
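To make the condition concrete, here is a hedged sketch of checking a
resource's status for it. Only the condition type comes from the docs; the
payload below follows the standard Kubernetes condition shape, and the
`reason` and `message` values are invented for illustration:

```python
# Hypothetical status payload; only the "SessionStorageWarning" type is
# documented. The reason/message strings are illustrative assumptions.
status = {
    "conditions": [
        {
            "type": "SessionStorageWarning",
            "status": "True",
            "reason": "NoSessionStorage",
            "message": "spec.replicas > 1 without Redis session storage",
        }
    ]
}

def has_warning(status, cond_type="SessionStorageWarning"):
    """Return True if the named condition is present with status "True"."""
    return any(
        c["type"] == cond_type and c["status"] == "True"
        for c in status.get("conditions", [])
    )

assert has_warning(status)
```

Remember that per the note above, scaling only the backend does not raise this
condition, so its absence alone does not mean the configuration is safe.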

Common configurations:

- **Scale only the proxy** (`replicas: N`, omit `backendReplicas`): useful when
auth and connection overhead is the bottleneck with a single backend.
- **Scale only the backend** (omit `replicas`, `backendReplicas: M`): useful
when tool execution is CPU/memory-bound and the proxy is not a bottleneck.
Configure Redis session storage so the proxy runner can route requests to the
correct backend pod.
- **Scale both** (`replicas: N`, `backendReplicas: M`): full horizontal scale.
Redis session storage is required for reliable operation when `replicas > 1`,
and strongly recommended when `backendReplicas > 1`.

```yaml title="MCPServer resource"
spec:
@@ -484,14 +507,6 @@ When running multiple replicas, configure
Redis session storage to share session state
across pods. If you omit `replicas` or `backendReplicas`, the operator defers
replica management to an HPA or other external controller.
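For example, external replica management might look like the following
hypothetical HPA sketch. The Deployment name `my-mcp-server-proxy` is an
assumption for illustration, not a name the operator is documented to
generate:

```yaml
# Hypothetical sketch: omit `replicas` in the MCPServer spec and let an HPA
# own the replica count of the proxy runner Deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-mcp-server-proxy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-mcp-server-proxy # assumed name of the proxy runner Deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Because `minReplicas` is greater than 1 here, Redis session storage would
still be needed for reliable routing, as described above.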


:::note[Connection draining on scale-down]

When a proxy runner pod is terminated (scale-in, rolling update, or node