Recall: Local Project Memory for Claude Code, Zero Tokens

Recall is a fully-local project memory plugin for Claude Code that solves the cold-start problem without spending any model tokens on summarization. It captures session transcripts into .recall/history.md and condenses them into a compact context.md (~1–2K tokens) using a classical Python summarizer — not an LLM call.

How It Works

During a session: The Stop / SessionEnd hooks append new activity incrementally to .recall/history.md — only new turns, fully local.
At session start: The SessionStart hook surfaces context.md and prompts Claude to confirm: resume from saved context? and keep logging this session?

Key Advantages

Zero token spend on memory: Summarization is done locally by a deterministic algorithm, not by an API call. No API key or external model required.
Privacy: Transcripts (code, paths, secrets) never leave your machine. Most memory tools pipe context to a model endpoint; Recall doesn't.
Low friction: No pip install, no local model to run, no key configure — works offline immediately on plugin load.

Output Files

Two files in .recall/:

history.md — append-only log of prompts, replies, files touched, commands run.
context.md — overwritten summary containing: goal, summary, next steps/open threads, files touched, where you left off.

Comparison with Built-in Claude Code Memory

Feature	CLAUDE.md	--continue / --resume	Recall
What	Hand-written notes & rules	Reloads a prior conversation	Auto-captured session log + local summary
Upkeep	Manual	None (you pick session)	None — written as you work
Holds	Instructions to follow	Full prior transcript	Goal, files, commands, where you left off, next steps
Cost to resume	Small	Large (replays full transcript)	~1–2K tokens (compact digest)
Form	Markdown you edit	Local session state	Plaintext in `.recall/` — diffable & shareable
Claude treats it as	Instructions	The conversation	Fenced untrusted reference data

In short: CLAUDE.md is how I want you to work; Recall is here’s what we did last time and where we stopped — produced offline with zero model tokens spent.

📖 Read the full source: HN LLM Tools