memv: Open-Source Memory System for AI Agents

memv is an open-source memory system for AI agents built around a different approach to knowledge extraction. Typical memory systems extract every fact from a conversation and lean on retrieval to sort out what matters. memv stores only prediction errors. Its predict-calibrate extraction first predicts what a new episode should contain based on existing knowledge, then compares that prediction with what actually happened. Only the facts the prediction missed are stored, so importance comes from surprise rather than from an up-front large language model (LLM) score.
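The predict-then-store-the-surprise idea can be sketched in a few lines. This is a toy illustration, not memv's implementation: `SurpriseMemory`, `ingest`, and `naive_predict` are hypothetical names, facts are plain strings, and the predictor is a hard-coded function standing in for what would presumably be an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SurpriseMemory:
    """Toy store that keeps only facts the predictor did not anticipate."""

    # Hypothetical stand-in for an LLM call: maps known facts to expected facts.
    predict: Callable[[set[str]], set[str]]
    facts: set[str] = field(default_factory=set)

    def ingest(self, observed: set[str]) -> set[str]:
        # 1. Predict what the episode should contain from existing knowledge.
        expected = self.predict(self.facts)
        # 2. Calibrate: the prediction error is whatever was observed but
        #    neither predicted nor already stored.
        surprising = observed - expected - self.facts
        # 3. Store only the surprising facts.
        self.facts |= surprising
        return surprising


def naive_predict(known: set[str]) -> set[str]:
    # Assumption for illustration: known facts are expected to recur,
    # plus one hard-coded inference.
    expected = set(known)
    if "works at Anthropic" in known:
        expected.add("works in AI")
    return expected


memory = SurpriseMemory(predict=naive_predict)
first = memory.ingest({"works at Anthropic", "role: researcher"})
second = memory.ingest({"works at Anthropic", "works in AI", "owns a dog"})
print(first)
print(second)
```

With an empty store nothing can be predicted, so both facts from the first episode are kept. In the second episode only "owns a dog" is stored, because the other two facts were either already known or predicted from existing knowledge.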
Key Details
- Bi-temporal Model: Each fact carries both an event time (when it was true) and a transaction time (when it was recorded), allowing queries like "what did we know about this user in January?"
- Hybrid Retrieval: Combines vector similarity (sqlite-vec) with BM25 text search (FTS5), merging the two result lists with Reciprocal Rank Fusion.
- Contradiction Handling: New facts automatically invalidate the older facts they contradict, while the full history is preserved.
- SQLite Default: No external storage services are required by default, so there is no need for Postgres, Redis, or Pinecone.
- Framework Agnostic: Works with LangGraph, CrewAI, AutoGen, LlamaIndex, or plain Python.
- MIT Licensed: Targets Python 3.13+ and exposes an asynchronous API.
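For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal standalone sketch of the general technique. It does not use memv's code: the function name and document IDs are made up, and `k = 60` is the constant commonly used in the literature, not a value confirmed for memv. Each document scores the sum of `1 / (k + rank)` over every ranked list it appears in.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list sorted by RRF score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; items near the top of a list contribute more.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a vector search and a BM25 search.
vector_hits = ["fact-a", "fact-b", "fact-c"]
bm25_hits = ["fact-b", "fact-d", "fact-a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF works on ranks rather than raw scores, it can merge cosine similarities and BM25 scores without normalizing them onto a shared scale.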
A sample setup using memv:
    import asyncio

    from memv import Memory
    from memv.embeddings import OpenAIEmbedAdapter
    from memv.llm import PydanticAIAdapter

    memory = Memory(
        db_path="memory.db",
        embedding_client=OpenAIEmbedAdapter(),
        llm_client=PydanticAIAdapter("openai:gpt-4o-mini"),
    )


    async def main():
        async with memory:
            # Record one user/assistant exchange.
            await memory.add_exchange(
                user_id="user-123",
                user_message="I just started at Anthropic as a researcher.",
                assistant_message="Congrats! What's your focus area?",
            )
            # Process the recorded exchanges for this user.
            await memory.process("user-123")
            # Query what has been stored about the user.
            result = await memory.retrieve("What does the user do?", user_id="user-123")
            print(result)


    asyncio.run(main())
The project is currently at an early stage (v0.1.0), and feedback is encouraged, especially concerning the extraction approach and potential useful integrations.
📖 Read the full source: r/LocalLLaMA