Memento Vault: Local Tool for Persistent Context in Claude Code Sessions

Memento Vault tackles Claude Code's loss of context between sessions: it automatically captures and retrieves relevant information, with no manual maintenance.
How It Works
The tool uses hooks that plug into Claude Code's lifecycle:
- When a session ends: A hook reads the transcript, scores it, and decides what to keep. Substantial sessions get atomic notes written to a local git repo. Each note contains one idea with frontmatter including certainty scores and tags, plus wikilinks to related notes. Trivial sessions get a one-liner in a daily log.
- When a session starts: It injects a briefing showing your project's recent sessions and the most relevant vault notes for what you're about to work on.
- On every prompt: It searches your vault and surfaces matching notes before Claude processes your input.
- On every file read: It injects context about code areas you've touched before.
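The session-end step above can be sketched roughly as follows. This is a minimal illustration, not Memento Vault's actual code: the `Note` class, the `score_transcript` heuristic, the 0.4 threshold, and the `notes/` and `daily/` folder names are all hypothetical, chosen only to show the "score, then write a note or a log line" decision.

```python
from dataclasses import dataclass, field
from datetime import date
from pathlib import Path


@dataclass
class Note:
    """Hypothetical atomic note: one idea plus frontmatter and wikilinks."""
    title: str
    body: str
    certainty: float                                   # assumed 0..1 score
    tags: list[str] = field(default_factory=list)
    links: list[str] = field(default_factory=list)     # titles of related notes

    def to_markdown(self) -> str:
        """Render frontmatter, body, and wikilinks as one markdown document."""
        lines = [
            "---",
            f"title: {self.title}",
            f"certainty: {self.certainty:.2f}",
            f"tags: [{', '.join(self.tags)}]",
            "---",
            "",
            self.body,
        ]
        if self.links:
            lines += ["", "Related: " + ", ".join(f"[[{t}]]" for t in self.links)]
        return "\n".join(lines) + "\n"


def score_transcript(transcript: str) -> float:
    """Toy heuristic (an assumption): length and decision keywords raise the score."""
    words = transcript.split()
    length_score = min(len(words) / 200, 1.0)
    keywords = ("because", "decided", "fixed", "root cause")
    keyword_score = sum(k in transcript.lower() for k in keywords) / len(keywords)
    return 0.5 * length_score + 0.5 * keyword_score


def on_session_end(transcript: str, vault: Path, threshold: float = 0.4) -> Path:
    """Write an atomic note for a substantial session, else append a daily-log line."""
    score = score_transcript(transcript)
    if score >= threshold:
        note = Note("session-summary", transcript[:280], certainty=score, tags=["session"])
        path = vault / "notes" / f"{note.title}.md"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(note.to_markdown(), encoding="utf-8")
    else:
        path = vault / "daily" / f"{date.today().isoformat()}.md"
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as log:
            log.write(f"- {transcript[:80]}\n")
    return path
```

Calling `on_session_end(transcript, Path("~/vault").expanduser())` returns the path of the file it wrote. A real hook would presumably split a transcript into several single-idea notes and then commit them to the git repo; this sketch writes one file and leaves the commit out.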
Technical Details
All retrieval uses local BM25 plus vector search, with no LLM calls. Latency averages 472 ms per prompt, and the system costs nothing to run. Context overhead is approximately 149 input units per session. On LongMemEval (500 questions), retrieval quality reaches NDCG@10 = 0.892.
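To make the "BM25 + vector search" idea concrete, here is a small self-contained sketch of hybrid retrieval. Several parts are assumptions rather than anything the source describes: the character-trigram `embed` function is only a stand-in for a real embedding model, and reciprocal rank fusion is one common way to merge two rankings, not necessarily the one Memento Vault uses.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = (sum(len(t) for t in tokenized) / len(tokenized)) or 1.0
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def embed(text: str) -> Counter:
    """Stand-in embedding (assumption): a sparse vector of character-trigram counts."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(scores: list[float]) -> list[int]:
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query: str, docs: list[str], top_k: int = 3, rrf_k: int = 60) -> list[str]:
    """Merge the BM25 and vector rankings with reciprocal rank fusion."""
    lexical = bm25_scores(query, docs)
    query_vec = embed(query)
    semantic = [cosine(query_vec, embed(d)) for d in docs]
    fused = [0.0] * len(docs)
    for ranking in (rank(lexical), rank(semantic)):
        for position, doc_id in enumerate(ranking):
            fused[doc_id] += 1.0 / (rrf_k + position + 1)
    return [docs[i] for i in rank(fused)[:top_k]]
```

Both rankers run locally on plain text, which fits the claim that no LLM call is needed at query time. The function expects a non-empty `docs` list.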
A background consolidation layer called Inception clusters notes by embedding similarity and writes pattern notes after sessions, identifying recurring issues across projects.
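As a rough picture of what clustering notes by embedding similarity could look like, the sketch below groups notes greedily by cosine similarity to each cluster's first member. The algorithm, the 0.8 threshold, and the function names are illustrative assumptions; the source does not say how Inception actually clusters.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def cluster_notes(embeddings: dict[str, list[float]], threshold: float = 0.8) -> list[list[str]]:
    """Greedy single pass: a note joins the first cluster whose seed note is similar enough."""
    clusters: list[list[str]] = []
    for name, vec in embeddings.items():
        for cluster in clusters:
            seed_vec = embeddings[cluster[0]]
            if cosine(vec, seed_vec) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster matched, so this note seeds a new one.
            clusters.append([name])
    return clusters


def recurring_patterns(clusters: list[list[str]], min_size: int = 2) -> list[list[str]]:
    """Keep only clusters large enough to suggest a recurring theme."""
    return [c for c in clusters if len(c) >= min_size]
```

Each cluster returned by `recurring_patterns` would then be a candidate for one pattern note that links back to its member notes.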
The entire system uses markdown files in a git repo, browsable in Obsidian, searchable with grep, and diffable with git log. There's no database, Docker, or cloud dependency.
Installation
git clone https://github.com/sandsower/memento-vault.git
cd memento-vault
./install.sh --experimental
Requirements: Python 3 and Claude Code. QMD optionally adds semantic search. Works on Linux and macOS.
The project includes 271 tests and is MIT licensed.
📖 Read the full source: r/ClaudeAI