This is Part 4 of the series.
- Part 1: Why Python Still Dominates in 2026
- Part 2: Build Your Own AI Chatbot — RAG From Scratch to Deployment
- Part 3: One AI Is No Longer Enough — LangGraph Multi-Agent Systems
- Part 4: AI That Finally Remembers — Complete LangGraph Memory Guide ← You are here
📌 Level: Intermediate (Parts 1–3 are sufficient background)
⏱️ Reading time: ~12 min / Hands-on time: ~2 hours
🛠️ End result: An AI assistant that remembers conversations and personalizes over time
In the last part we built a 3-agent AI team.
The researcher, writer, and fact-checker collaborated to produce solid output. But what happens when you open it the next day?
It starts completely fresh.
“I told you my name was Alex yesterday, didn’t I?” → “I’m sorry, I don’t have that information.”
“Write it in the style of that Python article we did last time.” → “What style would you like?”
From the user’s perspective, this is no better than a search box. A real assistant needs to remember.
Today we’re going to attach memory to LangGraph to build an AI that is genuinely yours.
📊 Table of Contents
- Why Memory Matters — The Limits of Amnesiac AI
- Two Types of LangGraph Memory — Short-term vs Long-term
- Short-Term Memory — MemorySaver (within-session recall)
- Mid-Term Memory — SQLiteSaver (local persistent storage)
- Long-Term Memory — PostgreSQL (production-grade)
- The Long-Term Memory Store — Learning User Preferences
- Multi-User Handling — Separating Users with thread_id
- Practical Pattern: Summary Compression for Cost Control
- Memory Design Checklist
1. Why Memory Matters
Feel the difference side by side.
AI without memory:
```
[Day 1] User: "I'm a backend developer and mainly use Python."
        AI:   "Got it!"

[Day 2] User: "Explain this at my level."
        AI:   "Could you tell me what field you're in and what language you use..."
```
AI with memory:
```
[Day 1] User: "I'm a backend developer and mainly use Python."
        AI:   "I'll remember that!"

[Day 2] User: "Explain this at my level."
        AI:   "Framing this for a Python backend developer — since you already
               know asyncio, let me use that as context..."
```
Do you feel the difference? The second AI behaves like an assistant for exactly one reason: it remembers.
LangGraph manages memory in two layers.
2. Two Types of LangGraph Memory
LangGraph’s official memory design is clear.
```
┌─────────────────────────────────────────────────────┐
│             LangGraph Memory Architecture           │
├──────────────────────────┬──────────────────────────┤
│ Short-term Memory        │ Long-term Memory         │
├──────────────────────────┼──────────────────────────┤
│ Within current thread    │ Across multiple sessions │
│ Managed automatically    │ Must be saved/retrieved  │
│ Handled by checkpointer  │ Handled by Store         │
│ Gone when session ends   │ Persists permanently     │
├──────────────────────────┼──────────────────────────┤
│ Examples:                │ Examples:                │
│ - Current conversation   │ - User's name            │
│ - Active task state      │ - Preferred writing style│
│ - Error history          │ - Past projects          │
└──────────────────────────┴──────────────────────────┘
```
- Checkpointer — handles short-term memory: saves the current conversation flow.
- Store — handles long-term memory: persists facts and preferences across sessions.
Production systems need both. Let’s build them step by step.
3. Short-Term Memory — MemorySaver (Within-Session Recall)
Start with the simplest option. MemorySaver is a checkpointer that stores in RAM. It disappears when the server restarts, but within a session it remembers everything.
```python
# memory_basic.py
import os
from dotenv import load_dotenv
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

load_dotenv()

# ── State Definition ────────────────────────────────
class ChatState(TypedDict):
    # add_messages: accumulates messages instead of overwriting
    messages: Annotated[list, add_messages]

# ── LLM ─────────────────────────────────────────────
llm = ChatAnthropic(
    model="claude-sonnet-4-20250514",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    max_tokens=1024
)

# ── Chatbot Node ────────────────────────────────────
def chatbot_node(state: ChatState) -> dict:
    system = SystemMessage(content="""You are a friendly AI assistant with excellent memory.
Always reference the conversation history and respond with appropriate context.""")
    response = llm.invoke([system] + state["messages"])
    return {"messages": [response]}

# ── Graph Assembly ──────────────────────────────────
builder = StateGraph(ChatState)
builder.add_node("chatbot", chatbot_node)
builder.set_entry_point("chatbot")
builder.add_edge("chatbot", END)

# Key line: attach MemorySaver
memory = MemorySaver()
graph = builder.compile(checkpointer=memory)

# ── Chat Function ───────────────────────────────────
def chat(user_input: str, thread_id: str = "default") -> str:
    """
    thread_id: identifies the conversation session.
    Same thread_id = the AI remembers the prior conversation.
    """
    config = {"configurable": {"thread_id": thread_id}}
    result = graph.invoke(
        {"messages": [HumanMessage(content=user_input)]},
        config=config
    )
    return result["messages"][-1].content

# ── Test ────────────────────────────────────────────
if __name__ == "__main__":
    tid = "user-test-001"
    print("=" * 50)
    r1 = chat("Hi! My name is Alex. I'm a Python developer.", tid)
    print(f"AI: {r1}\n")
    r2 = chat("What's my name again?", tid)
    print(f"AI: {r2}\n")
    r3 = chat("Do you remember my job?", tid)
    print(f"AI: {r3}\n")
```
Run this and the AI will remember the name and job within the same session.
⚠️ MemorySaver’s limitation Restarting the server wipes all conversations. Fine for development and testing, not viable for real services.
4. Mid-Term Memory — SQLiteSaver (Local Persistent Storage)
If memory needs to survive a server restart, use SQLite. Everything is stored in a single .db file — no extra installation required, and perfect for personal projects or single-server deployments.
```shell
pip install langgraph-checkpoint-sqlite
```
```python
# memory_sqlite.py
from langgraph.checkpoint.sqlite import SqliteSaver
# Other imports same as above...

# Replace MemorySaver with SqliteSaver
# ":memory:"        → in-memory (testing only)
# "chat_memory.db"  → writes to file (persistent)
with SqliteSaver.from_conn_string("chat_memory.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    config = {"configurable": {"thread_id": "user-alex"}}
    graph.invoke(
        {"messages": [HumanMessage("I mainly build FastAPI services with Python.")]},
        config=config
    )
    print("✅ Run 1 complete. Try stopping and restarting the process.")
```
Even after killing and restarting the process, the chat_memory.db file remains and the conversation can be resumed.
```python
# Running again later — the previous conversation is restored
with SqliteSaver.from_conn_string("chat_memory.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    config = {"configurable": {"thread_id": "user-alex"}}
    result = graph.invoke(
        {"messages": [HumanMessage("What framework did I say I mainly use?")]},
        config=config
    )
    print(result["messages"][-1].content)
    # → "You said you mainly use FastAPI!"
```
5. Long-Term Memory — PostgreSQL (Production-Grade)
For services that need simultaneous access from multiple servers, or that need to manage millions of conversations, use PostgreSQL. It is the backend LangGraph officially recommends for production.
```shell
pip install langgraph-checkpoint-postgres psycopg psycopg-pool
```
```python
# memory_postgres.py
import os
from contextlib import asynccontextmanager

from fastapi import FastAPI
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

DB_URI = os.getenv("DATABASE_URL")
# e.g. "postgresql://user:password@localhost:5432/mydb"

# ── Synchronous usage ───────────────────────────────
# Keep the compiled graph INSIDE the `with` block — the DB connection
# closes when the block exits, so don't return the graph out of it.
def run_with_postgres():
    with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
        checkpointer.setup()  # Auto-creates the required tables (safe to re-run)
        graph = builder.compile(checkpointer=checkpointer)
        # ... invoke the graph here ...

# ── Pattern for use with FastAPI ────────────────────
graph_instance = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Connect to the DB on startup, release on shutdown"""
    global graph_instance
    async with AsyncPostgresSaver.from_conn_string(DB_URI) as checkpointer:
        await checkpointer.setup()  # Init tables
        graph_instance = builder.compile(checkpointer=checkpointer)
        print("✅ PostgreSQL checkpointer connected")
        yield  # Server is running
    # Connection released automatically on shutdown

app = FastAPI(lifespan=lifespan)

@app.post("/chat/{user_id}")
async def chat_endpoint(user_id: str, message: str):
    config = {"configurable": {"thread_id": user_id}}
    result = await graph_instance.ainvoke(
        {"messages": [HumanMessage(content=message)]},
        config=config
    )
    return {"reply": result["messages"][-1].content}
```
| Environment | Recommended Checkpointer | Characteristics |
|---|---|---|
| Development / Testing | MemorySaver | No install needed, fast, resets on restart |
| Local service | SqliteSaver | File-based, persistent, single-server |
| Production | PostgresSaver | Multi-server, high availability, encryption support |
6. The Long-Term Memory Store — Learning User Preferences
If the checkpointer remembers “conversation flow,” the Store remembers “facts.”
It accumulates things like a user’s preferred writing style, frequently used language, and job role across multiple sessions. Without this, true personalization is impossible.
```python
# memory_store.py
from langgraph.store.memory import InMemoryStore

# Development: InMemoryStore
# Production: connect a persistent store (DB, Redis, etc.)
store = InMemoryStore()

def save_user_preference(user_id: str, key: str, value: str):
    """Save a user preference to the Store"""
    namespace = ("user_prefs", user_id)
    store.put(namespace, key, {"value": value})
    print(f"💾 Saved: [{user_id}] {key} = {value}")

def load_user_preferences(user_id: str) -> dict:
    """Load all preferences for a user"""
    namespace = ("user_prefs", user_id)
    items = store.search(namespace)
    return {item.key: item.value["value"] for item in items}

# ── Personalized chatbot node ───────────────────────
def smart_chatbot_node(state: ChatState, config: dict) -> dict:
    """
    Retrieves stored preferences via the user ID
    and injects them into the system prompt
    """
    user_id = config.get("configurable", {}).get("user_id", "anonymous")
    prefs = load_user_preferences(user_id)

    pref_text = ""
    if prefs:
        pref_text = "\n\n[What I know about this user]\n"
        for k, v in prefs.items():
            pref_text += f"- {k}: {v}\n"

    system = SystemMessage(content=f"""You are a personalized AI assistant.
Provide responses tailored to the user's background and preferences.{pref_text}""")
    response = llm.invoke([system] + state["messages"])

    # Auto-detect and save new preferences from the conversation
    last_msg = state["messages"][-1].content.lower()
    if "python" in last_msg and ("love" in last_msg or "prefer" in last_msg):
        save_user_preference(user_id, "preferred_language", "Python")
    if "developer" in last_msg or "engineer" in last_msg:
        save_user_preference(user_id, "role", "Developer")

    return {"messages": [response]}
```
Practical Usage Example
```python
# Pre-populate user info
save_user_preference("user-alex", "name", "Alex")
save_user_preference("user-alex", "role", "Python Backend Developer")
save_user_preference("user-alex", "preferred_style", "Technical and concise")
save_user_preference("user-alex", "experience", "3 years")

# Verify stored info
prefs = load_user_preferences("user-alex")
print(prefs)
# → {'name': 'Alex', 'role': 'Python Backend Developer', ...}

# From now on, every conversation with this user automatically reflects this info
config = {
    "configurable": {
        "thread_id": "session-001",
        "user_id": "user-alex"  # Used to look up the Store
    }
}
```
7. Multi-User Handling — Separating Users with thread_id
In a real service, multiple users connect simultaneously. thread_id must be designed correctly to prevent conversations from bleeding across users.
```python
# thread_id design patterns

# ✅ Good — unique per user and purpose
config_alex = {"configurable": {"thread_id": "user-alex-general"}}
config_minho = {"configurable": {"thread_id": "user-minho-general"}}

# One user can hold multiple threads
config_alex_work = {"configurable": {"thread_id": "user-alex-work-project-a"}}
config_alex_study = {"configurable": {"thread_id": "user-alex-study-python"}}

# ❌ Bad — all users share the same thread_id
# Conversations will be mixed together
config_bad = {"configurable": {"thread_id": "global"}}
```
```python
# Auto-generate a per-user thread_id in FastAPI
from fastapi import FastAPI, Header
from typing import Optional

app = FastAPI()

@app.post("/chat")
async def chat(
    message: str,
    session_id: str,                          # Client-managed session ID
    x_user_id: Optional[str] = Header(None)   # User ID from JWT
):
    # Combine user ID + session ID for guaranteed uniqueness
    thread_id = f"{x_user_id}-{session_id}"
    config = {"configurable": {
        "thread_id": thread_id,
        "user_id": x_user_id
    }}
    result = await graph_instance.ainvoke(
        {"messages": [HumanMessage(content=message)]},
        config=config
    )
    return {"reply": result["messages"][-1].content, "thread_id": thread_id}
```
8. Practical Pattern: Summary Compression for Cost Control
Memory introduces a problem: the longer the conversation, the higher the token cost.
A 100-turn conversation means sending 100 messages to the LLM every single time. That’s not sustainable.
The solution: compress old messages into a summary.
```python
# memory_summary.py — conversation summary pattern
# llm, ChatState, SystemMessage, HumanMessage as defined earlier
from langchain_core.messages import RemoveMessage

def summarize_if_too_long(state: ChatState) -> dict:
    """
    When the message count exceeds 20, compress older messages into a summary.
    Keeps the most recent 5 messages plus a summary of everything older.
    """
    messages = state["messages"]
    THRESHOLD = 20   # Trigger compression above this count
    KEEP_RECENT = 5  # Always keep the last N messages as-is

    if len(messages) <= THRESHOLD:
        return {}  # Still short — do nothing

    to_summarize = messages[:-KEEP_RECENT]

    # Generate the summary via the LLM
    summary_prompt = f"""Summarize the following conversation in 3–5 sentences.
Be sure to include: the user's name, job, preferences, and key topics discussed.

Conversation:
{chr(10).join([f"{m.type}: {m.content}" for m in to_summarize])}"""

    summary_response = llm.invoke([HumanMessage(content=summary_prompt)])
    summary_message = SystemMessage(
        content=f"[Previous Conversation Summary]\n{summary_response.content}"
    )

    print(f"🗜️ Memory compressed: {len(to_summarize)} messages → 1 summary")
    # add_messages appends by default, so the old messages must be
    # deleted explicitly with RemoveMessage; the recent ones stay put.
    return {"messages": [RemoveMessage(id=m.id) for m in to_summarize] + [summary_message]}
```
```python
# Add the summary node to the graph
builder = StateGraph(ChatState)
builder.add_node("chatbot", chatbot_node)
builder.add_node("summarize", summarize_if_too_long)
builder.set_entry_point("chatbot")
builder.add_edge("chatbot", "summarize")
builder.add_edge("summarize", END)
```
With this pattern, token costs stay roughly flat no matter how long the conversation grows.
| Approach | Context Size After 100 Turns | Cost |
|---|---|---|
| No summary | All 100 messages | 💸💸💸 |
| Summary compression | 1 summary + last 5 messages | 💸 |
9. Memory Design Checklist
Answer these questions before designing memory in your project.
Choosing the right backend
- [ ] Local dev / prototype → `MemorySaver`
- [ ] Single-server small service → `SqliteSaver`
- [ ] Multi-server / production → `PostgresSaver`
Designing thread_id
- [ ] Is every user’s ID unique?
- [ ] Are different conversation topics separated into distinct threads?
- [ ] For logged-in users, are you using a `user_id` + `session_id` combination?
Long-term memory Store
- [ ] Is there information that must persist across sessions? (name, role, preferences)
- [ ] Auto-detection vs manual save — which approach fits?
Cost optimization
- [ ] Is there logic to summarize long conversations?
- [ ] Is only the most recent N messages kept in active context?
Error handling
- [ ] Is there a fallback if the DB connection fails?
- [ ] Does the app degrade gracefully to `MemorySaver` when needed?
Wrapping Up — The Gap Between an AI That Remembers and One That Doesn’t
Before and after adding memory, an AI feels completely different.
An AI without memory is an intern who starts from zero every morning. An AI with memory is a colleague who knows you.
Here’s what we covered:
- MemorySaver → In-session memory. Development and testing only.
- SqliteSaver → File-based persistent memory. Local or small-scale services.
- PostgresSaver → DB-backed. The production standard.
- Store → Cross-session facts and preferences. True personalization.
- Summary pattern → Cost optimization for long conversations.
Combine these layers and you have an AI assistant that knows you, remembers you, and grows with you.
In Part 5 — the finale — we’ll bring everything together: packaging the full AI assistant service with Docker and deploying it to the cloud. Everything built across Parts 1–4 will come together in one complete system.
🔖 Other posts in this series
- Part 1: Why Python Still Dominates in 2026
- Part 2: Build Your Own AI Chatbot — RAG From Scratch to Deployment
- Part 3: One AI Is No Longer Enough — LangGraph Multi-Agent Systems
- Part 4: AI That Finally Remembers — Complete LangGraph Memory Guide ← You are here
- Part 5: Bringing It All Together — Docker Packaging & Cloud Deployment (coming soon)
Tags: #Python #LangGraph #AIMemory #Chatbot #LangChain #PostgreSQL #SQLite #AIDevelopment #DevTutorial #2026
Sources: LangGraph Official Docs · DigitalOcean LangGraph+Mem0 Tutorial · Markaicode LangGraph Memory Guide · LangChain Checkpoint Docs