Fullerenes: Open-source persistent memory layer for coding agents cuts tokens by 64% on SWE-bench

Fullerenes is an open-source persistent memory layer for AI coding agents. Instead of re-reading files every session, it builds a local knowledge graph from your repo using Tree-sitter and exposes it over MCP (Model Context Protocol). Agents query the graph for functions, classes, imports, and call relationships rather than reading raw files, which sharply reduces token consumption (64% to 96.6% in the author's early benchmarks).
How it works
Run npx fullerenes init in your repo. It walks the codebase with Tree-sitter, extracts every function, class, import, and call relationship, and stores it in a local SQLite graph. Agents connect via MCP and ask targeted questions.
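To make the idea of a queryable code graph concrete, here is a minimal in-memory sketch. It is not Fullerenes' actual schema or API: the post says the graph lives in local SQLite, and every type and name below (SymbolNode, GraphEdge, CodeGraph, lookupFunction) is a hypothetical stand-in chosen for illustration.

```typescript
// Illustrative only: these types are assumptions, not Fullerenes' real schema.
type SymbolKind = "function" | "class" | "import";

interface SymbolNode {
  name: string;
  kind: SymbolKind;
  file: string;
  signature?: string;
}

interface GraphEdge {
  from: string; // symbol that references another symbol
  to: string;   // symbol being referenced
  kind: "calls" | "imports";
}

class CodeGraph {
  private nodes = new Map<string, SymbolNode>();
  private edges: GraphEdge[] = [];

  addNode(node: SymbolNode): void {
    this.nodes.set(node.name, node);
  }

  addEdge(edge: GraphEdge): void {
    this.edges.push(edge);
  }

  // A "targeted question": fetch one function and its callers without reading files.
  lookupFunction(name: string): { node: SymbolNode; callers: string[] } | undefined {
    const node = this.nodes.get(name);
    if (!node || node.kind !== "function") return undefined;
    const callers = this.edges
      .filter((e) => e.kind === "calls" && e.to === name)
      .map((e) => e.from);
    return { node, callers };
  }
}

// Usage with made-up symbols.
const demoGraph = new CodeGraph();
demoGraph.addNode({ name: "parseConfig", kind: "function", file: "src/config.ts" });
demoGraph.addNode({ name: "main", kind: "function", file: "src/index.ts" });
demoGraph.addEdge({ from: "main", to: "parseConfig", kind: "calls" });
const demoHit = demoGraph.lookupFunction("parseConfig");
console.log(demoHit?.callers);
```

The point of the structure is that an answer such as "who calls parseConfig" costs a few dozen tokens, whereas reading both source files would cost far more.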
The design draws on retrieval research: Repoformer (retrieve only when needed), HippoRAG and G-Retriever (graph beats flat chunks), and LLMLingua (aggressive context compression). The goal is better signal per token, not more context.
Unique MCP tools
Two standout tools:
- predict_impact({ functionName: "x" }): Before editing, the agent asks what else will break. Traverses the edge graph and returns direct + transitive dependents with a risk score. Blast radius before the first keystroke.
- get_function({ name: "x", includeBody: true }): Signature, body, and callers in one MCP call. No follow-up read_file needed.
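The traversal behind a tool like predict_impact can be sketched as a breadth-first search over reversed call edges. This is an assumption about the general technique, not the project's code: the CallEdge and ImpactReport shapes, the predictImpact helper, and especially the risk formula are hypothetical, since the post does not say how the score is computed.

```typescript
// Hypothetical sketch, not Fullerenes' implementation.
type CallEdge = { caller: string; callee: string };

interface ImpactReport {
  direct: string[];     // functions that call the target directly
  transitive: string[]; // functions that reach the target through other callers
  riskScore: number;    // assumed heuristic in [0, 1]
}

function predictImpact(edges: CallEdge[], functionName: string): ImpactReport {
  // Build a reverse adjacency map: callee -> list of callers.
  const callersOf = new Map<string, string[]>();
  for (const { caller, callee } of edges) {
    const list = callersOf.get(callee) ?? [];
    list.push(caller);
    callersOf.set(callee, list);
  }

  // Breadth-first search upward from the target, recording the hop distance.
  const depth = new Map<string, number>([[functionName, 0]]);
  const queue: string[] = [functionName];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const caller of callersOf.get(current) ?? []) {
      if (!depth.has(caller)) { // the visited check also guards against call cycles
        depth.set(caller, depth.get(current)! + 1);
        queue.push(caller);
      }
    }
  }
  depth.delete(functionName);

  const entries = Array.from(depth.entries());
  const direct = entries.filter(([, d]) => d === 1).map(([name]) => name);
  const transitive = entries.filter(([, d]) => d > 1).map(([name]) => name);

  // Assumed scoring: direct dependents weigh 1, transitive 0.5, capped at 1.
  const riskScore = Math.min(1, (direct.length + 0.5 * transitive.length) / 10);
  return { direct, transitive, riskScore };
}

// Usage with made-up functions: a and b call x, c calls a, d calls c.
const sampleEdges: CallEdge[] = [
  { caller: "a", callee: "x" },
  { caller: "b", callee: "x" },
  { caller: "c", callee: "a" },
  { caller: "d", callee: "c" },
];
console.log(predictImpact(sampleEdges, "x"));
```

Separating direct from transitive dependents lets an agent check the immediate callers first and treat the rest as lower-priority review targets.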
Benchmark results
- SWE-bench Verified (1 instance so far): Codex baseline 91,949 tokens → with Fullerenes 32,945 tokens. 64% reduction.
- Internal (5 questions on this repo): Raw files 2,452 tokens avg → Fullerenes 137 tokens avg. 94.4% reduction.
- External (Gemini CLI on a Python project): Raw files 27,292 tokens → Fullerenes AGENTS.md 919 tokens. 96.6% reduction.
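The percentages above follow from the usual relative-reduction formula, (baseline - reduced) / baseline. A small check using the reported token counts (the percentReduction helper is just an illustrative name):

```typescript
// Relative reduction in percent, rounded to one decimal place.
function percentReduction(baseline: number, reduced: number): number {
  if (baseline <= 0) throw new RangeError("baseline must be positive");
  return Math.round(((baseline - reduced) / baseline) * 1000) / 10;
}

console.log(percentReduction(91949, 32945)); // 64.2 (SWE-bench Verified instance)
console.log(percentReduction(2452, 137));    // 94.4 (internal questions)
console.log(percentReduction(27292, 919));   // 96.6 (Gemini CLI, Python project)
```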
Limitations
Tree-sitter is structural, not semantic, so the graph will miss edges created by dynamic dispatch and metaprogramming. LSP integration is on the roadmap. One SWE-bench instance is not a broad result, and more are being run.
Local & open source
Everything runs locally: SQLite, no server, no API key, pure npm (no Python), works offline, MIT license. The project has just launched: 589 npm downloads in 40 hours before the Reddit post, and 14 stars.
github.com/codebreaker77/Fullerenes
npmjs.com/package/fullerenes
Three questions the author is asking the community: Does graph-based retrieval change your agent workflows, or is long context winning? What MCP tools should be added beyond the current 8? Does the SWE-bench methodology look sound?
📖 Read the full source: r/ClaudeAI