YourMemory: AI memory with biological decay hits 59% recall on LoCoMo-10

YourMemory implements persistent memory for AI agents using the Ebbinghaus forgetting curve: memories decay unless reinforced by recall, and unused memories are pruned once their strength falls below a threshold. Built as a local-first MCP server on DuckDB, it combines BM25, vector search, and a graph layer to solve the "logical neighbor" problem, where semantic search misses nodes that are relevant but not textually similar.
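The decay, reinforcement, and pruning cycle described above can be sketched in a few lines. This is a minimal illustration of an Ebbinghaus-style curve, R = exp(-t / S), and not YourMemory's actual code: the Memory fields, the doubling rule in reinforce, and the 0.1 pruning threshold are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    stability_days: float      # hypothetical: larger means slower forgetting
    last_recalled_day: float   # day of the most recent recall (or creation)


def retention(memory: Memory, today: float) -> float:
    """Ebbinghaus-style retention R = exp(-t / S), a value in (0, 1]."""
    elapsed = max(0.0, today - memory.last_recalled_day)
    return math.exp(-elapsed / memory.stability_days)


def reinforce(memory: Memory, today: float, boost: float = 2.0) -> None:
    """A recall resets the clock and slows future decay (assumed rule)."""
    memory.stability_days *= boost
    memory.last_recalled_day = today


def prune(memories: list[Memory], today: float, threshold: float = 0.1) -> list[Memory]:
    """Keep only memories whose retention is still at or above the threshold."""
    return [m for m in memories if retention(m, today) >= threshold]


if __name__ == "__main__":
    used = Memory("prefers tabs over spaces", stability_days=5.0, last_recalled_day=0.0)
    unused = Memory("one-off debugging note", stability_days=5.0, last_recalled_day=0.0)
    reinforce(used, today=3.0)  # recalled on day 3, so its stability doubles
    kept = prune([used, unused], today=14.0)
    print([m.content for m in kept])
```

In this toy run the memory recalled on day 3 survives pruning on day 14, while the one that was never recalled decays below the threshold and is dropped.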
Benchmarks
On the LoCoMo-10 benchmark (1,534 QA pairs across 10 multi-session conversations):
- YourMemory: 59% Recall@5 (95% CI: 56–61%)
- Zep Cloud: 28% (95% CI: 26–30%)
That is roughly twice Zep Cloud's recall. Stateless vector stores reportedly waste 84% more tokens.
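To make the reported intervals easier to interpret, here is a small sketch of how Recall@5 and a 95% confidence interval could be computed over 1,534 questions. The source does not say which interval method was used; the normal-approximation (Wald) formula below is an assumption, and the function names are hypothetical.

```python
import math


def recall_at_k(hits: list[bool]) -> float:
    """Fraction of questions whose answer appeared in the top-k results."""
    return sum(hits) / len(hits)


def normal_ci(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) interval for a proportion p over n trials."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    n_questions = 1534  # QA pairs reported for LoCoMo-10
    for name, p in [("YourMemory", 0.59), ("Zep Cloud", 0.28)]:
        low, high = normal_ci(p, n_questions)
        print(f"{name}: {p:.0%} (95% CI: {low:.1%} to {high:.1%})")
```

Under this assumption the half-width comes out at about 2.5 percentage points for YourMemory and about 2.2 for Zep Cloud, which is close to the intervals quoted above given that the published point estimates are rounded.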
Quick Start
Python 3.11–3.14. No Docker or external services needed.
pip install yourmemory
yourmemory-setup
Get your config path:
yourmemory-path
MCP Configuration
For Claude Code, add to ~/.claude/settings.json:
{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}
For Claude Desktop, add to the appropriate config file:
{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}
Cline, Cursor, OpenCode, and any other MCP-compatible client (Windsurf, Continue, Zed) can wire it in using the full path from yourmemory-path.
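If you would rather not hand-edit JSON, the snippet below sketches one way to merge the yourmemory entry into an existing client config without touching other settings. The helper add_mcp_server is hypothetical and not part of YourMemory; the demo writes to a scratch file, so the real config location for your client is something you need to supply yourself.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str) -> dict:
    """Add or overwrite one entry under "mcpServers", keeping other settings."""
    text = config_path.read_text().strip() if config_path.exists() else ""
    config = json.loads(text) if text else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Demo against a scratch file; point this at your client's config instead.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "settings.json"
        path.write_text('{"theme": "dark"}')
        merged = add_mcp_server(path, "yourmemory", "yourmemory")
        print(json.dumps(merged, indent=2))
```

For clients that need an absolute path, the same helper could be called with the path printed by yourmemory-path as the command value.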
Memory Workflow
Copy the sample instructions:
cp sample_CLAUDE.md CLAUDE.md
Then edit CLAUDE.md with your name and user ID. Claude follows a recall → store → update workflow on every task using three MCP tools:
- recall_memory(query): surfaces relevant memories at start of task
- store_memory(content, importance): embeds and stores with biological decay
- update_memory(id, new_content): re-embeds and replaces outdated info
Example: store_memory("Sachit prefers tabs over spaces in Python", importance=0.9, category="fact")
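The recall → store → update loop is easier to follow with a toy stand-in. The class below only mimics the three tool names and the parameters listed above; it is an in-memory illustration, not YourMemory's implementation, and its keyword-overlap lookup is an assumed placeholder for the real BM25, vector, and graph retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryStore:
    """In-memory stand-in for the three tools; NOT YourMemory's implementation."""
    items: dict[int, dict] = field(default_factory=dict)
    next_id: int = 1

    def store_memory(self, content: str, importance: float) -> int:
        memory_id = self.next_id
        self.items[memory_id] = {"content": content, "importance": importance}
        self.next_id += 1
        return memory_id

    def recall_memory(self, query: str) -> list[tuple[int, str]]:
        # Naive keyword overlap stands in for BM25 + vector + graph retrieval.
        words = set(query.lower().split())
        return [
            (memory_id, item["content"])
            for memory_id, item in self.items.items()
            if words & set(item["content"].lower().split())
        ]

    def update_memory(self, memory_id: int, new_content: str) -> None:
        self.items[memory_id]["content"] = new_content


if __name__ == "__main__":
    store = ToyMemoryStore()
    store.store_memory("Sachit prefers tabs over spaces in Python", importance=0.9)

    # 1. Recall at the start of a task.
    matches = store.recall_memory("python indentation tabs or spaces")
    # 2. Update the outdated memory, or store a new one if nothing matched.
    new_fact = "Sachit now prefers four spaces in Python"
    if matches:
        store.update_memory(matches[0][0], new_fact)
    else:
        store.store_memory(new_fact, importance=0.9)
    print(store.recall_memory("python spaces"))
```

Updating in place rather than storing a second, conflicting memory keeps a single record per fact, which is the point of having a separate update step.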
Who It's For
Developers building AI coding agents for long-lived projects, where the agent needs to remember user preferences and project context instead of relearning them from scratch each session.
📖 Read the full source: HN LLM Tools