# Memory
Long-term memory in gecx-chat lets the assistant remember facts about a user
across conversations. The feature is opt-in: when `ChatClientConfig.memory`
is omitted, no tools are registered, no interceptor is installed, and no
extractor runs — existing apps are unaffected.
This guide walks through enabling memory, choosing a backend, and wiring the React surface. See the reference for the full API.
## Quickstart
```ts
import {
  createChatClient,
  createLocalMemoryAdapter,
  createMemoryStorage,
} from 'gecx-chat';

const storage = createMemoryStorage();

const client = createChatClient({
  // ...auth, transport, etc.
  storage,
  memory: {
    adapter: createLocalMemoryAdapter({ storage }),
    // Defaults: read.mode = 'inject', write.tool = true
  },
});
```
On the React side:

```tsx
import { useMemory, MemoryList } from 'gecx-chat/react';

function MemoryPanel() {
  return <MemoryList />;
}
```
That's it. The SDK now:
- Registers `memory.save`, `memory.update`, and `memory.delete` as tools the model can call.
- Injects a system message with the user's saved facts into every outbound send.
- Exposes `useMemory()` and `<MemoryList>` for app-side display & control.
## Choosing an adapter
| Adapter | Lives where | Survives device change | Setup cost | Good for |
|---|---|---|---|---|
| `createLocalMemoryAdapter` | Client only (`StorageAdapter`) | No | None | Prototypes, single-device consumer apps |
| `createServerMemoryAdapter` | Customer-defined REST | Yes | Implement the endpoint | Strict privacy postures, headless agents |
| `createHybridMemoryAdapter` | Cache + remote | Yes | Implement the endpoint | Production B2C/B2B — recommended default |
The hybrid adapter wraps a remote source-of-truth with a local cache, doing optimistic writes and server-wins reconciliation:
```ts
const remote = createServerMemoryAdapter({ endpoint: '/api/memory' });
const cache = createLocalMemoryAdapter({ storage });
const adapter = createHybridMemoryAdapter({ remote, cache });

createChatClient({ memory: { adapter } });
```
If the remote save fails, the cache rolls back by default. Set
`onRemoteFailure: 'keep'` to keep the optimistic write — useful for an
offline-first posture where you'll resync later.
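Continuing the snippet above, a minimal sketch of that configuration; the guide names `onRemoteFailure` but not where it is passed, so placing it in the `createHybridMemoryAdapter` options is an assumption:

```ts
const adapter = createHybridMemoryAdapter({
  remote,
  cache,
  // Assumption: 'keep' preserves the optimistic cache write when the
  // remote save fails, leaving the entry to be resynced later.
  onRemoteFailure: 'keep',
});
```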
## Choosing a read mode
`MemoryConfig.read.mode` controls how saved memory reaches the model:
- `inject` (default) — Prepends a system message with the top-N most relevant entries on every send. Cheap, deterministic, debuggable. Hits a ceiling around 1–2 KB of text.
- `recall` — No automatic injection. Registers a `memory.recall` tool the model calls when it wants context. Token-efficient, but the model has to remember to ask.
- `hybrid` — Injects a small pinned summary AND exposes `memory.recall`. Best balance for production.
- `off` — Nothing flows to the model. Useful for write-only / export pipelines.
```ts
memory: {
  adapter,
  read: { mode: 'hybrid', inject: { maxEntries: 5, semantic: true } },
}
```
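For contrast, a recall-only configuration in the same shape; the model then pulls context through the `memory.recall` tool instead of receiving it automatically:

```ts
memory: {
  adapter,
  // No automatic injection; this mode registers the memory.recall tool
  // and leaves it to the model to ask for context.
  read: { mode: 'recall' },
}
```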
## Write paths
Memory has two write paths, both optional:
1. Tool-call (default ON when memory is configured) — the model invokes
   `memory.save({ text, key?, scope? })` when it judges a fact worth keeping.
   Same pattern as Anthropic's memory tool.
2. Auto-extraction (opt-in via `write.extractor`) — runs after each
   assistant turn over the last N turns. Two built-in extractors:
```ts
// Zero-LLM, regex-based:
import { createHeuristicExtractor, COMMON_HEURISTIC_PATTERNS } from 'gecx-chat';

memory: {
  adapter,
  write: { extractor: createHeuristicExtractor({ patterns: COMMON_HEURISTIC_PATTERNS }) },
}
```
```ts
// Second LLM pass — more thorough, costs another round-trip:
const { createLLMExtractor } = await import('gecx-chat/memory/extractors/llm');

memory: {
  adapter,
  write: { extractor: createLLMExtractor({ transport, model: 'gemini-flash' }) },
}
```
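If the built-in patterns miss a fact your domain cares about, you can extend them. The pattern shape below (a RegExp plus a formatter) is purely illustrative, an assumption rather than the real contract; check the extractor reference for the actual type:

```ts
// Hypothetical pattern shape — the real type may differ.
const allergyPattern = {
  match: /i'?m allergic to ([a-z]+)/i,
  save: (m: RegExpMatchArray) => `Allergic to ${m[1]}.`,
};

const extractor = createHeuristicExtractor({
  patterns: [...COMMON_HEURISTIC_PATTERNS, allergyPattern],
});
```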
Set `write.requireUserApproval: true` to gate extractor outputs behind a
user confirmation; the candidates surface as a `memory-approval` part and
persist only on `useMemory().approve(id)`.
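A sketch of the approval surface. `approve(id)` comes from the guide; the `pending` list of candidates is an assumed field on the hook, so verify the name against the `useMemory()` reference:

```tsx
import { useMemory } from 'gecx-chat/react';

function MemoryApprovals() {
  // Assumption: pending extractor candidates are exposed on the hook.
  const { pending, approve } = useMemory();
  return (
    <ul>
      {pending.map((candidate) => (
        <li key={candidate.id}>
          {candidate.text}
          <button onClick={() => approve(candidate.id)}>Keep</button>
        </li>
      ))}
    </ul>
  );
}
```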
## Scoping: identity vs conversation
Every memory has a scope:
- Identity-wide — applies across every conversation that user has. ("Prefers vegan recipes.")
- Conversation-scoped — applies only to one thread. ("This conversation is about an order RMA-1234.")
By default, `memory.save` saves identity-wide. Pass `scope: 'conversation'`
to scope to the active conversation.
At read time, conversation memories rank higher than identity memories so a narrower context wins when both match.
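To make the two scopes concrete, a sketch using the `memory.save` arguments shown earlier. The `store` handle is hypothetical; the guide documents the tool-call signature, not an app-side entry point:

```ts
// `store` is a hypothetical app-side handle mirroring the
// memory.save tool signature.

// Identity-wide (default): follows the user across conversations.
await store.save({ text: 'Prefers vegan recipes.' });

// Conversation-scoped: only ranks into this thread's context.
await store.save({
  text: 'This conversation is about order RMA-1234.',
  scope: 'conversation',
});
```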
## Conflict resolution
Memory is append-only: every `save()` inserts a new row. If an entry
declares a `key`, prior unarchived entries with the same `(scope, key)` get
`archivedAt` stamped — they're hidden from default lists but preserved for
audit. Reads naturally pick the newest non-archived entry per key.
This avoids the entire class of "in-place update over a race" bugs and gives you a free audit trail.
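Illustrated with the same hypothetical `store` handle from the scoping sketch:

```ts
await store.save({ key: 'diet', text: 'Vegetarian' });
await store.save({ key: 'diet', text: 'Vegan' });
// The first entry now carries an archivedAt stamp; default reads and
// lists return only the newer 'Vegan' entry, while the older row
// survives for audit.
```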
## Governance & user control
Memory is gated by three independent switches, evaluated top-down:
1. `DataGovernancePolicy.consent` (strongest) — withdrawing consent disables and unregisters the memory tools, throws `MEMORY_CONSENT_WITHDRAWN` on save/update/delete, and cancels in-flight extractions. Existing entries are preserved on disk until the standard governance "delete my data" flow clears them.
2. `temporaryConversation: true` on `client.createSession({ temporary: true })` — for that session's lifetime only: bypasses both write paths and the inject step. Mirrors ChatGPT's Temporary Chat.
3. `useMemory().setEnabled(false)` — user-controllable kill switch persisted to identity-scoped storage. No reads, no writes, no extraction. Existing entries preserved.
If multiple gates apply, the strongest wins. A governance-disabled state
cannot be re-enabled by `setEnabled(true)`.
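The two app-controllable gates, sketched with the calls named above and reusing `client` from the quickstart:

```tsx
import { useMemory } from 'gecx-chat/react';

// Per-session ephemeral gate: bypasses both write paths and the inject
// step for this session's lifetime.
const session = client.createSession({ temporary: true });

// User-facing kill switch, persisted to identity-scoped storage:
function MemoryToggle() {
  const { setEnabled } = useMemory();
  return <button onClick={() => setEnabled(false)}>Pause memory</button>;
}
```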
## Analytics
Memory operations emit `ProductAnalyticsEvent`s:
- `memory_saved`, `memory_deleted`, `memory_cleared`
- `memory_recalled` (with `count`, `mode`, `semantic`)
- `memory_extraction_proposed`, `memory_extraction_resolved`
- `memory_consent_changed`
These flow through the same analytics sinks as everything else. The
applied-ai-retail demo wires these into its `/analytics` dashboard.
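A sketch of tapping these events. The sink registration shape is an assumption (the guide doesn't show it); the event names come from the list above:

```ts
createChatClient({
  // Assumption: sinks are registered on ChatClientConfig and receive
  // an object carrying the event name plus its properties.
  analytics: {
    sink: (event) => {
      if (event.name.startsWith('memory_')) {
        console.debug('[memory]', event.name, event.props);
      }
    },
  },
  memory: { adapter },
});
```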
## Error handling
Memory errors are first-class `ChatSdkError` codes. See
error-codes.md for the full list.
Critically: a memory failure NEVER blocks a send. The interceptor catches
adapter errors silently and the conversation proceeds without injected
context.
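Sends never throw on memory failures, but direct store calls can. A sketch of handling the consent error from the governance section, assuming `ChatSdkError` is exported from the package root and that the adapter exposes a `save()`:

```ts
import { ChatSdkError } from 'gecx-chat';

try {
  await adapter.save({ text: 'Prefers dark mode.' });
} catch (err) {
  if (err instanceof ChatSdkError && err.code === 'MEMORY_CONSENT_WITHDRAWN') {
    // Consent was withdrawn: surface the governance settings instead
    // of retrying.
  }
  // Other memory codes: see error-codes.md.
}
```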
## When NOT to use memory
- Strict-PII contexts where any user-identifying fact would violate policy. Either disable memory or run the `local` adapter inside a retention-mode `session` storage so entries die with the session.
- One-shot use cases with no continuity expectation (CSAT polls, FAQ bots).
- Sessions where the user has consented to ephemeral mode — pass
  `temporary: true` to `createSession`.
## Where to go next
- Memory API reference — full type signatures for `MemoryStore`, adapters, extractors, and React hooks.
- Architecture → Optional subsystems — where memory sits in the runtime.
- The applied retail demo's server-side memory store at `apps/applied-ai-retail/src/lib/serverMemoryStore.ts` is a good cross-device continuity reference.
- The in-chat memory drawer at `apps/applied-ai-retail/src/components/chat/MemoryDrawer.tsx` shows the UX side.