36 commits
f4f2ef2
feat(ai): add memory types
AlemTuzlak May 10, 2026
fca4624
feat(ai): add memory helper functions
AlemTuzlak May 10, 2026
42904a2
feat(ai): expose @tanstack/ai/memory subpath
AlemTuzlak May 10, 2026
474eb4a
test(ai): add failing memory middleware tests
AlemTuzlak May 10, 2026
397098c
feat(ai): add memoryMiddleware
AlemTuzlak May 10, 2026
c60faa0
fix(ai): tighten memory middleware test types for noUncheckedIndexedA…
AlemTuzlak May 10, 2026
f9945a7
feat(ai-event-client): add memory devtools events
AlemTuzlak May 10, 2026
c88c65d
feat(ai): emit memory devtools events from middleware
AlemTuzlak May 10, 2026
ab7dc97
feat(ai-memory): scaffold new package
AlemTuzlak May 10, 2026
01ba8a8
test(ai-memory): add shared adapter contract suite
AlemTuzlak May 10, 2026
72cc2b6
feat(ai-memory): add inMemoryMemoryAdapter
AlemTuzlak May 10, 2026
40be462
fix(ai-memory): tighten in-memory adapter lint compliance
AlemTuzlak May 10, 2026
055cd50
feat(ai-memory): add redisMemoryAdapter
AlemTuzlak May 10, 2026
d6df979
docs(ai): add tanstack-ai-memory skill
AlemTuzlak May 10, 2026
e0913b2
docs(ai-memory): add in-memory adapter skill
AlemTuzlak May 10, 2026
32b15de
docs(ai-memory): add redis adapter skill
AlemTuzlak May 10, 2026
74e7136
docs: add memory middleware concept and quickstart pages
AlemTuzlak May 10, 2026
1dd988a
chore: changeset for memory middleware
AlemTuzlak May 10, 2026
c93e7f6
chore: final formatting
AlemTuzlak May 10, 2026
ecd38ac
fix(ai, ai-memory): clean up lint and knip findings
AlemTuzlak May 10, 2026
d1fb337
fix(ai, ai-memory): address whole-feature audit findings
AlemTuzlak May 10, 2026
6576f7c
ci: apply automated fixes
autofix-ci[bot] May 10, 2026
54bec71
fix(ai): address CR Round 1 core middleware findings
AlemTuzlak May 10, 2026
2c3588c
fix(ai-memory): redis adapter scope semantics
AlemTuzlak May 10, 2026
5600b3b
feat(ai-memory): nodeRedisAsRedisLike helper for node-redis v4+
AlemTuzlak May 10, 2026
2eb1425
test(ai, ai-memory): tighten flaky and vacuous CR assertions
AlemTuzlak May 10, 2026
ac100b8
chore(ai-memory): set initial version to 0.0.0 for first publish
AlemTuzlak May 10, 2026
9fcb483
fix(ai, ai-memory): address CR Round 2 bucket-a findings
AlemTuzlak May 10, 2026
64a3872
fix(ai): close error-path observability gaps in memory middleware
AlemTuzlak May 10, 2026
59ec97e
fix(ai, ai-memory): close remaining scope-value-validation gaps
AlemTuzlak May 10, 2026
8a8d599
fix(ai-memory): escape _ in scope values to prevent placeholder colli…
AlemTuzlak May 10, 2026
94359d1
chore: refresh pnpm-lock.yaml for ai-memory ioredis peer dep
AlemTuzlak May 10, 2026
d17ae31
ci: apply automated fixes
autofix-ci[bot] May 10, 2026
ed23b50
docs: consolidate memory pages into a top-level Memory section
AlemTuzlak May 10, 2026
224f805
fix(ai, ai-memory): address CodeRabbit code review feedback
AlemTuzlak May 10, 2026
b478e8e
docs, chore: address CodeRabbit polish feedback
AlemTuzlak May 10, 2026
24 changes: 24 additions & 0 deletions .changeset/memory-middleware.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
---
'@tanstack/ai': minor
'@tanstack/ai-event-client': minor
'@tanstack/ai-memory': minor
---

**Add server-side memory support via `memoryMiddleware`.**

A new `memoryMiddleware` from `@tanstack/ai/memory` retrieves relevant memories at chat init and persists user/assistant turns + tool results at finish. The middleware injects a rendered system prompt before the model call and runs persistence via `ctx.defer` so streaming is never blocked.

`@tanstack/ai`:

- New subpath `@tanstack/ai/memory` exporting `memoryMiddleware`, the `MemoryAdapter` / `MemoryRecord` / `MemoryScope` types, the `MemoryOp` union, helpers (`scopeMatches`, `cosine`, `lexicalOverlap`, `recencyScore`, `defaultRenderMemory`, `defaultScoreHit`, `isExpired`).
- Middleware extension hooks: `shouldRetrieve`, `rerank`, `shouldRemember`, `extractMemories`, `onToolResult`, `afterPersist`, plus app-level `events.*` callbacks and a `strict` mode.

`@tanstack/ai-event-client`:

- Five new events on `AIDevtoolsEventMap`: `memory:retrieve:started`, `memory:retrieve:completed`, `memory:persist:started`, `memory:persist:completed`, `memory:error`.

`@tanstack/ai-memory` (new package):

- `inMemoryMemoryAdapter()` — zero-dep adapter for dev/tests.
- `redisMemoryAdapter({ redis, prefix? })` — production adapter for plain Redis (`redis` listed as optional peer dependency).
- Both adapters pass a shared contract suite covering scope isolation, expiry, cursor pagination, kinds filtering, lexical-only ranking, semantic ranking with embeddings, and serialization round-trip (Redis).
18 changes: 18 additions & 0 deletions docs/config.json
@@ -164,6 +164,24 @@
}
]
},
{
"label": "Middlewares",
"children": [
{
"label": "Memory",
"to": "middlewares/memory"
}
]
},
{
"label": "Guides",
"children": [
{
"label": "Memory Quickstart",
"to": "guides/memory-quickstart"
}
]
},
{
"label": "Advanced",
"children": [
136 changes: 136 additions & 0 deletions docs/guides/memory-quickstart.md
@@ -0,0 +1,136 @@
---
title: Memory Quickstart
id: memory-quickstart
order: 1
description: "Add cross-session memory to a TanStack AI chat() call in five steps — install the package, pick an adapter, wire memoryMiddleware, optionally add an embedder, and derive scope server-side."
keywords:
- tanstack ai
- memory
- quickstart
- in-memory adapter
- redis adapter
- chat middleware
---

You have a working `chat()` call and you want it to remember context across turns or sessions. By the end of this guide, you'll have `memoryMiddleware` retrieving relevant records into the prompt and persisting new turns through a real adapter, with scope derived safely from your server-validated session.

> **Want the full contract first?** See the [Memory Middleware](../middlewares/memory) concept page for the adapter interface, hooks, and devtools events.

## Step 1 — Install the package

`@tanstack/ai` is already installed. Add the adapter package:

```bash
pnpm add @tanstack/ai-memory
```

`@tanstack/ai-memory` exports the built-in `inMemoryMemoryAdapter` and `redisMemoryAdapter`. The middleware itself (`memoryMiddleware`) and the type contract (`MemoryAdapter`, `MemoryScope`, `MemoryRecord`, ...) live on the `@tanstack/ai/memory` subpath of the core package — no extra install required for those.

## Step 2 — Pick an adapter

> **In-memory** — `inMemoryMemoryAdapter()` is zero-dependency and stores records in a `Map`. Use it for local development, Vitest / Playwright tests, and single-process demos. Records vanish on process restart.

> **Redis** — `redisMemoryAdapter({ redis })` persists across restarts and shares state across processes. Use it for production. Bring your own Redis client (`ioredis`, `redis`, Upstash, ...) — the adapter is BYO-client.

Custom adapters implement the `MemoryAdapter` interface from `@tanstack/ai/memory`.

## Step 3 — Wire `memoryMiddleware` into `chat()`

Start with the in-memory adapter — it's the fastest path to a working setup:

```ts
import { chat } from '@tanstack/ai'
import { openaiText } from '@tanstack/ai-openai'
import { memoryMiddleware } from '@tanstack/ai/memory'
import { inMemoryMemoryAdapter } from '@tanstack/ai-memory'

const memory = inMemoryMemoryAdapter()

const stream = chat({
adapter: openaiText('gpt-4o'),
messages,
middleware: [
memoryMiddleware({
adapter: memory,
scope: { tenantId: 'demo', userId: 'alice' },
}),
],
})
```

That's a working setup. Each turn, the middleware retrieves relevant records into the system prompt (lexical search by default), then deferred-persists the user message and the assistant response after the stream finishes.
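
The "lexical search by default" behavior can be pictured as simple token overlap between the query and each stored record. The sketch below is illustrative only — the actual `lexicalOverlap` helper shipped in `@tanstack/ai/memory` may tokenize and weight differently:

```typescript
// Illustrative token-overlap scorer — not the library's exact algorithm.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? [])
}

function lexicalScore(query: string, recordText: string): number {
  const q = tokenize(query)
  const r = tokenize(recordText)
  if (q.size === 0) return 0
  let shared = 0
  for (const token of q) if (r.has(token)) shared++
  return shared / q.size // fraction of query tokens found in the record
}

lexicalScore('favorite color', 'The user said their favorite color is teal') // → 1
lexicalScore('billing address', 'The user likes teal') // → 0
```

This is why lexical-only retrieval works well when queries share vocabulary with stored records, and degrades when they don't — the motivation for the optional embedder in Step 4.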

When you're ready to ship, swap the adapter and keep everything else the same:

```ts
import Redis from 'ioredis'
import { redisMemoryAdapter } from '@tanstack/ai-memory'

const redis = new Redis(process.env.REDIS_URL!)
const memory = redisMemoryAdapter({ redis })

memoryMiddleware({ adapter: memory, scope })
```

## Step 4 — Add an embedder (optional)

The middleware accepts an `embedder` for semantic search. **Add one when you need it; skip it when you don't:**

- **Skip** if your scopes are small (a few hundred records per user) — lexical scoring handles this fine and there is no embedding cost or latency.
- **Add** when scopes grow large or queries don't share keywords with stored records, and your adapter supports vector search (Redis with vector ops, hosted vector DBs, custom adapters).

```ts
import { memoryMiddleware } from '@tanstack/ai/memory'

memoryMiddleware({
adapter: memory,
scope,
embedder: {
async embed(text) {
// Use any embedding model — OpenAI, Cohere, a local model, etc.
const result = await embeddings.create({ input: text })
return result.data[0].embedding
},
},
})
```

The embedder is invoked on the retrieval path (to embed the query) and may be invoked again on the persist path (to embed assistant text or extracted facts). Implementations should be idempotent.
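
Because the embedder may run more than once per turn, a small cache makes repeat calls free. This is a sketch under the assumption that `embed` takes a string and returns a number array; `rawEmbed` is a hypothetical stand-in for your provider call:

```typescript
// Hypothetical provider call — replace with OpenAI, Cohere, a local model, etc.
async function rawEmbed(text: string): Promise<number[]> {
  // Stand-in: a deterministic fake embedding, for illustration only.
  return [text.length, text.split(/\s+/).length]
}

// Memoizing wrapper: identical inputs hit the cache, so re-invocation
// on the persist path costs nothing and stays idempotent.
function memoizedEmbedder() {
  const cache = new Map<string, Promise<number[]>>()
  return {
    embed(text: string): Promise<number[]> {
      let hit = cache.get(text)
      if (!hit) {
        hit = rawEmbed(text)
        cache.set(text, hit)
      }
      return hit
    },
  }
}
```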

## Step 5 — Derive scope server-side

`scope` is the isolation boundary. Static scopes are fine for fixtures, but in any real multi-tenant app you must derive scope per request from server-validated session data — never from the request body.

```ts
import { chat } from '@tanstack/ai'
import { memoryMiddleware } from '@tanstack/ai/memory'

type AppCtx = { session: { tenantId: string; userId: string; activeThreadId: string } }

const stream = chat({
adapter: openaiText('gpt-4o'),
messages,
context: { session }, // attached by your auth middleware, not from req.body
middleware: [
memoryMiddleware({
adapter: memory,
scope: (ctx) => {
const { session } = ctx.context as AppCtx
return {
tenantId: session.tenantId,
userId: session.userId,
threadId: session.activeThreadId,
}
},
}),
],
})
```

If you accept `userId` or `tenantId` from the client, one user can read or overwrite another user's memory. The function form of `scope` is the safer default — it executes per request and sees only what your server attached to the chat context.

## Where to go next

- [Memory Middleware](../middlewares/memory) — adapter contract, hooks reference, devtools events, failure modes
- [In-memory adapter skill](https://github.com/TanStack/ai) — `tanstack-ai-memory-in-memory` (when to use, capacity limits)
- [Redis adapter skill](https://github.com/TanStack/ai) — `tanstack-ai-memory-redis` (vector search, key layout, ops)
Contributor
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace placeholder repo-root links with direct targets.

Lines 135-136 currently send users to the repository root, not the adapter skill docs.

Proposed fix
-- [In-memory adapter skill](https://github.com/TanStack/ai) — `tanstack-ai-memory-in-memory` (when to use, capacity limits)
-- [Redis adapter skill](https://github.com/TanStack/ai) — `tanstack-ai-memory-redis` (vector search, key layout, ops)
+- [In-memory adapter skill](https://github.com/TanStack/ai/blob/main/packages/typescript/ai-memory/skills/tanstack-ai-memory-in-memory/SKILL.md) — `tanstack-ai-memory-in-memory` (when to use, capacity limits)
+- [Redis adapter skill](https://github.com/TanStack/ai/blob/main/packages/typescript/ai-memory/skills/tanstack-ai-memory-redis/SKILL.md) — `tanstack-ai-memory-redis` (vector search, key layout, ops)

As per coding guidelines, "Verify documentation links are valid via pnpm test:docs command."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/memory-quickstart.md` around lines 135 - 136, The two markdown
links titled "[In-memory adapter skill]" and "[Redis adapter skill]" currently
point to the repository root; update their hrefs to the specific adapter skill
documentation pages for the packages `tanstack-ai-memory-in-memory` and
`tanstack-ai-memory-redis` respectively so the links go directly to each
adapter's docs (replace the root URLs in those link entries with the correct doc
targets) and run the pnpm test:docs validation to confirm they are valid.

170 changes: 170 additions & 0 deletions docs/middlewares/memory.md
@@ -0,0 +1,170 @@
---
title: Memory Middleware
id: memory-middleware
order: 1
description: "Persist and recall context across turns and sessions in TanStack AI — the memoryMiddleware retrieves relevant records into the prompt, then deferred-persists user, assistant, and tool turns through a pluggable adapter."
keywords:
- tanstack ai
- memory
- long-term memory
- retrieval
- persistence
- middleware
- rag
- personalization
---

`memoryMiddleware` plugs server-side memory into a `chat()` run. It retrieves relevant records from a pluggable adapter into the system prompt before the model runs, then asynchronously persists what should be remembered after the run finishes. It is the right tool when you need recall **across turns or across sessions** — not for keeping recent messages in the same request.

> **Want a copy-paste setup before reading the contract?** See the [Memory Quickstart](../guides/memory-quickstart) guide.

## When to reach for it

| Need | Use this |
|------|----------|
| "Remember what the user told me last week" | Memory middleware + persistent adapter |
| "Each tenant or user has its own context" | Memory middleware with scoped adapter calls |
| "Cache expensive tool results across requests" | Memory middleware with `onToolResult` + `kind: 'tool-result'` |
| Keep the last N turns in the same request | Just pass them in `messages` — memory is overkill |

Memory is for cross-turn / cross-session recall. The `messages` array on `chat()` already covers within-turn history.

## Adapter contract

Adapters are thin storage layers: they persist, fetch, search, and isolate by scope — they do not decide what to remember or how to render hits. Every backend exposes a `name` plus the same seven methods:

| Method | Purpose |
|--------|---------|
| `name` | Stable identifier used in logs and devtools. |
| `add(records)` | Upsert one or many records by `id`. Same id replaces. |
| `get(id, scope)` | Fetch a single record. Returns `undefined` for missing, out-of-scope, or expired records. |
| `update(id, scope, patch)` | Patch a record in place. Preserves `id`/`scope`/`createdAt`, bumps `updatedAt`. |
| `search(query)` | Relevance-ranked search within a scope. Strategy (lexical / semantic / hybrid) is adapter-defined. |
| `list(scope, options)` | Non-relevance browsing — for inspectors, admin tools, exports. |
| `delete(ids, scope)` | Remove ids within a scope. Out-of-scope ids are silently skipped. |
| `clear(scope)` | Wipe everything matching a scope. Empty scope (`{}`) is treated as misuse. |

Three invariants every adapter MUST uphold: **scope isolation** (no cross-scope reads or writes), **expiry filtering** (records past their `expiresAt` are excluded from all reads), and **id uniqueness** across all scopes.

Built-in adapters live in `@tanstack/ai-memory`:

```ts
import { inMemoryMemoryAdapter, redisMemoryAdapter } from '@tanstack/ai-memory'
```

Custom adapters implement `MemoryAdapter` from `@tanstack/ai/memory`.
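
A compressed sketch of the shape a custom adapter takes, showing only `add`, `get`, and `delete` against local stand-in types (a real adapter implements the `MemoryAdapter` type from `@tanstack/ai/memory` with all seven methods; the `scopeMatches` below is an illustrative guess at the exported helper's semantics, not its actual implementation):

```typescript
// Local stand-in types — real code implements MemoryAdapter / MemoryRecord /
// MemoryScope from '@tanstack/ai/memory'.
type Scope = { tenantId?: string; userId?: string }
type Rec = { id: string; scope: Scope; text: string; expiresAt?: number }

// Scope check: every defined key in the query scope must match the record.
function scopeMatches(query: Scope, record: Scope): boolean {
  return Object.entries(query).every(
    ([k, v]) => v === undefined || record[k as keyof Scope] === v,
  )
}

// Minimal Map-backed store honoring the three invariants: scope isolation,
// expiry filtering, and id uniqueness (one global id keyspace).
function sketchAdapter(now: () => number = Date.now) {
  const store = new Map<string, Rec>()
  return {
    name: 'sketch',
    add(records: Array<Rec>) {
      for (const r of records) store.set(r.id, r) // same id replaces
    },
    get(id: string, scope: Scope): Rec | undefined {
      const r = store.get(id)
      if (!r) return undefined
      if (!scopeMatches(scope, r.scope)) return undefined // out of scope
      if (r.expiresAt !== undefined && r.expiresAt <= now()) return undefined
      return r
    },
    delete(ids: Array<string>, scope: Scope) {
      for (const id of ids) {
        const r = store.get(id)
        if (r && scopeMatches(scope, r.scope)) store.delete(id) // skip out-of-scope ids
      }
    },
  }
}
```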

## Scope and security

`MemoryScope` is the isolation boundary. Every key is optional and orthogonal — the adapter rejects cross-scope reads and writes:

```ts
import type { MemoryScope } from '@tanstack/ai/memory'

type MemoryScope = {
tenantId?: string
userId?: string
sessionId?: string
threadId?: string
namespace?: string
}
```

**Always derive scope server-side from trusted state.** Accepting `tenantId` or `userId` from the request body is how one user reads another user's memory. The function form on `scope` is the recommended pattern — it runs per request and has access to the validated chat context:

```ts
memoryMiddleware({
adapter,
scope: (ctx) => {
const session = (ctx.context as AppCtx).session // server-validated
return {
tenantId: session.tenantId,
userId: session.userId,
threadId: session.activeThreadId,
}
},
})
```

Pass the validated session through `chat({ context: { session } })`. The static form (`scope: { tenantId: 'acme' }`) is fine for single-tenant or test fixtures, but the function form is safer in any multi-tenant deployment.

## Retrieval flow

Retrieval runs once per `chat()` invocation, during the `init` phase:

1. `shouldRetrieve({ userText, scope })` — optional gate. Return `false` to skip retrieval entirely for this turn.
2. `adapter.search({ scope, text, embedding?, topK, minScore, kinds })` — the adapter decides whether to use the embedding (semantic), the text (lexical), or both (hybrid).
3. `rerank(hits, { scope, query, ctx })` — optional re-rank between search and render. Plug in MMR, RRF, or a cross-encoder.
4. `render(hits)` — formats the final hit set into a string injected into the prompt. Defaults to `defaultRenderMemory`.

An `embedder` is **optional**. Adapters that support semantic search (Redis with vector ops, hosted vector DBs) need one; lexical-only setups don't.
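
Semantic ranking ultimately reduces to comparing the query embedding against stored embeddings, typically with cosine similarity. The package exports a `cosine` helper; the function below is an illustrative stand-in, not its source:

```typescript
// Cosine similarity between two equal-length vectors: 1 means same
// direction, 0 means orthogonal, -1 means opposite.
function cosineSim(a: Array<number>, b: Array<number>): number {
  let dot = 0
  let na = 0
  let nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += (a[i] ?? 0) * (b[i] ?? 0)
    na += (a[i] ?? 0) ** 2
    nb += (b[i] ?? 0) ** 2
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb)
  return denom === 0 ? 0 : dot / denom
}

cosineSim([1, 0], [1, 0]) // → 1
cosineSim([1, 0], [0, 1]) // → 0
```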

## Persistence flow

Persistence is **deferred** via `ctx.defer` — it runs after the chat stream finishes and never blocks the response:

1. `shouldRemember({ message, responseText })` — optional gate on whether to write at all this turn.
2. The middleware persists user and assistant turns as `kind: 'message'`.
3. `extractMemories({ userText, responseText, scope, adapter })` — return a `MemoryOp[]` (mixed add/update/delete) or `MemoryRecord[]` (treated as all-add) to capture facts, preferences, or summaries.
4. For each completed tool call, `onToolResult({ toolName, toolCallId, args, result, scope, adapter })` — same return shape, typically used to persist results as `kind: 'tool-result'`.
5. `afterPersist({ newRecords, scope, adapter })` — fires after `adapter.add` commits, with newly-added records (not updates or deletes).

## Extension hooks

| Hook | Phase | Use for |
|------|-------|---------|
| `shouldRetrieve` | before search | Skip retrieval for cheap turns or content-gated requests |
| `rerank` | between search and render | MMR, RRF, recency boosts, cross-encoder rerankers |
| `shouldRemember` | before persist | Drop short, sensitive, or transient messages |
| `extractMemories` | after model finishes | Mem0-style consolidation — extract facts and preferences |
| `onToolResult` | per completed tool call | Persist tool outputs as `kind: 'tool-result'` |
| `afterPersist` | after `adapter.add` commits | Background work — summarization, eviction, indexing |

`extractMemories` and `onToolResult` may return `MemoryRecord[]` (shorthand: all-add) or `MemoryOp[]` (mixed `add` / `update` / `delete`).
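
A sketch of what an `extractMemories`-style hook returning the mixed-op shape might look like. The types here are local stand-ins (the real `MemoryOp` and `MemoryRecord` unions carry more fields), and the record ids are hypothetical:

```typescript
// Local stand-in shapes — the real MemoryOp / MemoryRecord unions come
// from '@tanstack/ai/memory'.
type Op =
  | { type: 'add'; record: { id: string; text: string; kind: string } }
  | { type: 'delete'; id: string }

// Pull a "my name is X" fact out of the user text and, as an example of a
// mixed batch, delete a hypothetical stale record alongside the add.
function extractFacts(userText: string): Array<Op> {
  const ops: Array<Op> = []
  const name = /my name is (\w+)/i.exec(userText)?.[1]
  if (name) {
    ops.push({
      type: 'add',
      record: { id: 'fact:name', text: `User's name is ${name}`, kind: 'fact' },
    })
    ops.push({ type: 'delete', id: 'fact:name-unconfirmed' }) // hypothetical stale id
  }
  return ops
}
```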

## Devtools events

The middleware emits five events on `aiEventClient` (from `@tanstack/ai-event-client`):

| Event | When |
|-------|------|
| `memory:retrieve:started` | Retrieval path begins (after `shouldRetrieve` returns true) |
| `memory:retrieve:completed` | Final hit set is ready (post-rerank, pre-render) |
| `memory:persist:started` | Persist path is about to call `adapter.add` |
| `memory:persist:completed` | `adapter.add` succeeded |
| `memory:error` | Retrieval, persistence, or extraction threw |

Hits and records carry a 200-character `preview` only — full text is never streamed by default, so devtools never leak full memory contents.

For application telemetry that should not depend on devtools being installed, use the `events.*` callbacks on `MemoryMiddlewareOptions` (`onRetrieveStart`, `onRetrieveEnd`, `onPersistStart`, `onPersistEnd`, `onError`).

## Failure modes

By default `strict: false` — retrieval and persistence failures emit `memory:error` (and call `events.onError`), but the chat run continues with degraded memory. Set `strict: true` when memory correctness is more important than uptime, for example in compliance-sensitive deployments or in tests where a missed write is worse than a failed turn.
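
The `strict` switch boils down to "rethrow, or report and continue". A sketch of the pattern under hypothetical names — the middleware's real internals differ:

```typescript
// Hypothetical wrapper showing the strict semantics: in strict mode a
// memory failure aborts the turn; otherwise it is reported and swallowed.
async function runMemoryStep<T>(
  step: () => Promise<T>,
  opts: { strict: boolean; onError: (err: unknown) => void },
): Promise<T | undefined> {
  try {
    return await step()
  } catch (err) {
    opts.onError(err) // always surfaced (memory:error / events.onError)
    if (opts.strict) throw err // strict: memory correctness over uptime
    return undefined // non-strict: degrade and let the chat continue
  }
}
```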

## TypeScript types

```ts
import type {
MemoryAdapter,
MemoryRecord,
MemoryRecordPatch,
MemoryScope,
MemoryQuery,
MemorySearchResult,
MemoryListOptions,
MemoryListResult,
MemoryHit,
MemoryKind,
MemoryRole,
MemoryEmbedder,
MemoryOp,
MemoryMiddlewareOptions,
} from '@tanstack/ai/memory'
```

## Next steps

- [Memory Quickstart](../guides/memory-quickstart) — wire the middleware into a real `chat()` call in five steps
- [Middleware](../advanced/middleware) — the underlying `chat()` middleware lifecycle and hooks
- [Observability](../advanced/observability) — subscribe to `memory:*` events for tracing
3 changes: 3 additions & 0 deletions knip.json
@@ -37,6 +37,9 @@
"packages/typescript/ai-client": {
"ignoreDependencies": ["@standard-schema/spec"]
},
"packages/typescript/ai-memory": {
"ignoreDependencies": ["redis"]
},
"packages/typescript/ai-react-ui": {
"ignoreDependencies": ["react-dom"]
},