Engram v1.0.0: Persistent Memory for Local LLMs via Knowledge Graph

What Engram Does
Engram gives local LLMs persistent memory across sessions by storing what they learn in a knowledge graph. Where a vector database only retrieves text that is similar to a query, Engram stores explicit relationships between entities and can run inference over them.
Core Features
- Knowledge graph with typed entities, relationships, and properties
- Hybrid search combining BM25 with vector similarity, using embeddings from Ollama, OpenAI, or a local ONNX model
- Confidence lifecycle where facts strengthen with confirmation, weaken with time, and are corrected on contradiction
- Inference engine with forward/backward chaining that derives new facts from rules
- Built-in MCP server that works with Claude Code, Cursor, and Windsurf out of the box
- HTTP REST API with 25+ endpoints on port 3030
- Built-in web UI for graph exploration, search, and natural language queries
- Peer-to-peer mesh sync between instances with ed25519 authentication
- CORS enabled for any frontend integration
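To make the inference feature in the list above more concrete, here is a minimal sketch of forward chaining over subject/relation/object triples. It is a generic illustration of the technique, not Engram's implementation: the `Rule` type, the `?x` variable syntax, and the example facts and relation names are all assumptions made for this sketch.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)


class Rule(NamedTuple):
    """Hypothetical rule shape: if all premises match, derive the conclusion."""
    premises: List[Fact]   # terms starting with "?" are variables
    conclusion: Fact


def unify(pattern: Fact, fact: Fact, bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend bindings so pattern matches fact, or return None on a mismatch."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if result.setdefault(p, f) != f:
                return None  # variable already bound to a different value
        elif p != f:
            return None      # constant terms differ
    return result


def match_all(premises: List[Fact], facts: Set[Fact],
              bindings: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every variable binding that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        extended = unify(premises[0], fact, bindings)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)


def forward_chain(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Apply the rules repeatedly until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Collect all matches first so the set is not mutated mid-iteration.
            for b in list(match_all(rule.premises, known, {})):
                derived = tuple(b.get(term, term) for term in rule.conclusion)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


# Illustrative data only; these relation names are not taken from Engram.
facts = {("checkout", "uses", "orders-db"), ("orders-db", "runs_on", "PostgreSQL")}
rules = [Rule(
    premises=[("?svc", "uses", "?db"), ("?db", "runs_on", "?engine")],
    conclusion=("?svc", "depends_on", "?engine"),
)]
print(sorted(forward_chain(facts, rules) - facts))
# → [('checkout', 'depends_on', 'PostgreSQL')]
```

Backward chaining would start from a goal such as `("checkout", "depends_on", "?x")` and work back through the rules to find supporting facts. The fixed-point loop here is the simplest form of the forward direction.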
Technical Details
The entire system runs as an 8.3 MB binary with zero external dependencies. All data lives in a single .brain file that can be copied to back up or moved to migrate. No cloud, Docker, Python, or external database is required.
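Because a backup is just a file copy, it is easy to script. The sketch below makes a timestamped copy of a .brain file using only the Python standard library. The function name, the backup directory layout, and the timestamp format are assumptions for illustration. It also assumes that no Engram process is writing to the file during the copy, which the source does not address.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_brain(brain_path: str, backup_dir: str) -> Path:
    """Copy a .brain file into backup_dir under a timestamped name."""
    source = Path(brain_path)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Example result: knowledge-20250101-120000.brain
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"

    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(source, target)
    return target
```

Migrating to another machine follows the same pattern: move the file and point `engram` at its new path.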
MCP Integration
MCP configuration is simple:
{
  "mcpServers": {
    "engram": {
      "command": "engram",
      "args": ["mcp", "/path/to/knowledge.brain"]
    }
  }
}
The MCP server exposes these tools: engram_store, engram_relate, engram_query, engram_search, engram_prove, and engram_explain.
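If the MCP client already has a configuration file with other servers in it, the entry shown above can be merged in rather than pasted by hand. The following is a small sketch of that merge. The location of the configuration file differs per client and is not given in the source, so it is passed in as a parameter, and the helper name `add_engram_server` is hypothetical.

```python
import json
from pathlib import Path


def add_engram_server(config_path: str, brain_path: str) -> dict:
    """Add or replace the "engram" entry under mcpServers, keeping other servers."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Same shape as the configuration shown in the article.
    servers = config.setdefault("mcpServers", {})
    servers["engram"] = {"command": "engram", "args": ["mcp", brain_path]}

    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling it as `add_engram_server("path/to/client-config.json", "/path/to/knowledge.brain")` leaves any existing entries under `mcpServers` untouched.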
Quick Start Commands
engram create my.brain
engram store "PostgreSQL" my.brain
engram serve my.brain
After running engram serve, the web UI is available at http://localhost:3030.
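The same three commands can be driven from a script, for example in a setup step. This sketch uses only the subcommands listed above and passes no other flags. It assumes that the `engram` binary is on the PATH and that `engram serve` runs in the foreground until stopped, which the source does not state. The helper names are hypothetical.

```python
import subprocess
import urllib.error
import urllib.request
from typing import List

WEB_UI_URL = "http://localhost:3030"  # address given in the article


def quick_start_commands(brain: str, first_fact: str) -> List[List[str]]:
    """Build the argument lists for create, store, and serve."""
    return [
        ["engram", "create", brain],
        ["engram", "store", first_fact, brain],
        ["engram", "serve", brain],
    ]


def run_quick_start(brain: str = "my.brain", first_fact: str = "PostgreSQL") -> subprocess.Popen:
    """Create the brain, store one fact, then start the server in the background."""
    create, store, serve = quick_start_commands(brain, first_fact)
    subprocess.run(create, check=True)  # raises CalledProcessError on failure
    subprocess.run(store, check=True)
    # Assumption: serve blocks, so it is started without waiting for it to exit.
    return subprocess.Popen(serve)


def web_ui_is_up(url: str = WEB_UI_URL, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP requests at the web UI address."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or similar
```

The caller is responsible for stopping the returned process, for instance with `server.terminate()`, once the server is no longer needed.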
Availability
Engram is free for personal, research, and education use, with a commercial license available. The source and releases are on GitHub.
📖 Read the full source: r/LocalLLaMA