agentmemory V4 achieves 96.2% on LongMemEval benchmark, outperforms commercial AI memory systems

✍️ OpenClawRadar📅 Published: March 27, 2026🔗 Source
agentmemory V4 achieves 96.2% on LongMemEval benchmark, outperforms commercial AI memory systems
Ad

agentmemory V4 is an open-source memory system for AI agents that just achieved a world record score of 96.2% on LongMemEval, the standard benchmark for long-term AI agent memory.

Benchmark Performance

The system outperformed several funded AI memory companies:

  • PwC Chronos: 95.6%
  • Mastra: 94.87%
  • OMEGA: 93.2% (raw)
  • Supermemory: 85.86%
  • Emergence AI: 86%
  • Zep: 71.2%

Development Details

Built solo in 16 days on a mid-range gaming PC (i3-12100F) with a total cost of $1,000. The system uses Claude Opus as a generator and GPT-4o as a judge, but the retrieval architecture is the core innovation.

Technical Architecture

The system combines multiple retrieval techniques in a single SQLite-backed system:

  • HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search
  • BM25 for traditional text retrieval
  • Cross-encoder for relevance scoring
  • Knowledge graph integration
  • Temporal grounding for time-aware memory retrieval

Availability

The system is open source under the MIT license and available at: github.com/JordanMcCann/agentmemory

📖 Read the full source: r/LocalLLaMA

Ad

👀 See Also