Agent Integration

Patterns for wiring OpenMem into LLM-based agents.

tip

For Claude Code, the easiest path is the Claude Code plugin — it handles all the wiring for you via MCP and slash commands.

Basic pattern

Before each LLM call, recall relevant context and inject it into the prompt:

from openmem import MemoryEngine

engine = MemoryEngine("agent.db")

def agent_turn(user_message: str) -> str:
    # Recall relevant context
    results = engine.recall(user_message, top_k=5, token_budget=2000)
    context = "\n".join(f"- {r.memory.text}" for r in results)

    prompt = f"""Relevant context from previous work:
{context}

User: {user_message}"""

    response = call_llm(prompt)
    return response

Storing memories from conversations

Have your agent extract facts, decisions, and preferences as it works:

# Facts the agent discovers
engine.add(
    "The /api/users endpoint returns 500 on empty payload",
    type="incident",
    entities=["/api/users"],
)

# Decisions made during the session
engine.add(
    "We chose JWT with 24h expiry for auth tokens",
    type="decision",
    entities=["JWT", "auth"],
    confidence=0.9,
)

# User preferences
engine.add(
    "User prefers TypeScript over JavaScript",
    type="preference",
    entities=["TypeScript", "JavaScript"],
)

# Constraints
engine.add(
    "All API responses must include request_id",
    type="constraint",
    entities=["API", "request_id"],
)

Building a knowledge graph

Link related memories as context accumulates:

m1 = engine.add("Auth uses JWT tokens", type="decision", entities=["JWT", "auth"])
m2 = engine.add("Tokens expire after 24h", type="decision", entities=["JWT"])
m3 = engine.add("Refresh tokens stored in httpOnly cookies", type="decision", entities=["JWT", "cookies"])

engine.link(m1.id, m2.id, "supports")
engine.link(m1.id, m3.id, "supports")
engine.link(m2.id, m3.id, "depends_on")

A query about "authentication" will pull in all three via spreading activation.

Reinforcing useful memories

When a memory proves useful, reinforce it:

results = engine.recall("how does auth work?")
for r in results:
    if r.score > 0.5:
        engine.reinforce(r.memory.id)

Handling outdated information

When facts change, supersede instead of deleting:

old = engine.add("API rate limit is 100 req/min", type="fact", entities=["API", "rate-limit"])

# Later, the limit changes
new = engine.add("API rate limit increased to 500 req/min", type="fact", entities=["API", "rate-limit"])
engine.supersede(old.id, new.id)

The old memory stays in the graph (for context) but gets a 50% score penalty.

Session lifecycle

A recommended pattern for long-running agents:

engine = MemoryEngine("project.db")

# Start of session: run decay to age old memories
engine.decay_all()

# During session: add, recall, reinforce, link
# ...

# End of session: check stats
stats = engine.stats()
print(f"Memories: {stats['memory_count']}, Avg strength: {stats['avg_strength']:.2f}")

Multi-agent setup

Multiple agents can share a memory store by pointing to the same database file:

# Agent 1: code assistant
code_engine = MemoryEngine("shared.db")
code_engine.add("Codebase uses ESM modules", type="fact", entities=["ESM"])

# Agent 2: project manager
pm_engine = MemoryEngine("shared.db")
results = pm_engine.recall("module system")
# Finds the memory stored by Agent 1

warning

SQLite handles concurrent reads well, but concurrent writes from multiple processes may cause locking. For multi-process setups, WAL mode (enabled by default) helps, but handle SQLITE_BUSY errors with retries.

Token budget strategies

Choose a budget based on your model's context window:

# Conservative: small context, precise memories
results = engine.recall(query, top_k=3, token_budget=500)

# Generous: large context, more background
results = engine.recall(query, top_k=10, token_budget=4000)

# Adaptive: scale budget based on available context
available_tokens = model_context_limit - len(user_message) // 4 - system_prompt_tokens
results = engine.recall(query, top_k=10, token_budget=available_tokens)

Basic pattern​

Storing memories from conversations​

Building a knowledge graph​

Reinforcing useful memories​

Handling outdated information​

Session lifecycle​

Multi-agent setup​

Token budget strategies​