Reduce AI Coding Session Costs by 90% with Graph-Based Code Indexing

A Reddit user reports spending $2-6 per query on Claude Code because the model re-reads dozens of files in every session. Prompt caching helps only partly: about 70% of tokens were served from cache at a 90% discount, but the cache resets with each session, so every new session pays to read the files again. The user's fix is a local server that indexes the codebase into a graph database and exposes it through the Model Context Protocol (MCP), so the model queries the graph instead of reading raw files.
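
As a rough check on the caching figures above, the following sketch computes what fraction of the uncached input price is still paid when 70% of tokens come from cache at a 90% discount. The function name is illustrative, and the arithmetic is a simplification: it ignores output tokens and any extra charge for writing to the cache.

```python
def effective_input_cost_fraction(cached_share: float, cache_discount: float) -> float:
    """Fraction of the full (uncached) input price actually paid.

    cached_share: share of input tokens served from cache (0.0 to 1.0)
    cache_discount: discount applied to cached tokens (0.0 to 1.0)
    """
    uncached_part = 1.0 - cached_share
    cached_part = cached_share * (1.0 - cache_discount)
    return uncached_part + cached_part


# Figures quoted in the post: 70% of tokens from cache, 90% discount.
fraction = effective_input_cost_fraction(0.70, 0.90)
print(f"{fraction:.2f}")      # prints 0.37
print(f"{1 - fraction:.0%}")  # prints 63%
```

Under these assumptions, caching trims input cost by roughly 63% within a session, which is why re-reading the same files at the start of every session still adds up.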
How It Works
- Instead of AST parsing or vector embeddings, the tool uses an LLM to generate a purpose, summary, and business context for each file, plus links to its functions, classes, and imports.
- The graph is exposed through an MCP server; Claude queries the graph for targeted lookups (2-4 nodes per question) instead of dumping the entire repo into context.
- According to the post, session costs dropped from dollars to cents. The author says the approach works equally well with open-source models such as DeepSeek-V4 and Kimi-2.6, because retrieval, not model size, does the heavy lifting.
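
The points above can be illustrated with a minimal in-memory sketch: each file becomes a node carrying an LLM-written purpose, summary, and business context plus its functions, classes, and imports, and a lookup returns only a handful of nodes. This is not the project's actual schema or query logic, which the post does not describe; `FileNode`, `CodeGraph`, `lookup`, and the sample files are all hypothetical, and the analyses are hard-coded here rather than generated by a model.

```python
from dataclasses import dataclass, field


@dataclass
class FileNode:
    """One indexed file. All field names are hypothetical."""
    path: str
    purpose: str
    summary: str
    business_context: str
    functions: list[str] = field(default_factory=list)
    classes: list[str] = field(default_factory=list)
    imports: list[str] = field(default_factory=list)  # paths of imported files


class CodeGraph:
    """Toy stand-in for the graph database, keyed by file path."""

    def __init__(self) -> None:
        self.nodes: dict[str, FileNode] = {}

    def add(self, node: FileNode) -> None:
        self.nodes[node.path] = node

    def lookup(self, keyword: str, max_nodes: int = 4) -> list[FileNode]:
        """Return matching files followed by their direct imports, capped at max_nodes."""
        needle = keyword.lower()
        result: list[FileNode] = []
        seen: set[str] = set()
        for node in self.nodes.values():
            text = f"{node.purpose} {node.summary} {node.business_context}".lower()
            symbols = node.functions + node.classes
            if needle not in text and not any(needle in s.lower() for s in symbols):
                continue
            # Follow import edges one hop so the caller gets nearby context.
            for path in [node.path, *node.imports]:
                if path in self.nodes and path not in seen:
                    seen.add(path)
                    result.append(self.nodes[path])
        return result[:max_nodes]


graph = CodeGraph()
graph.add(FileNode("billing/invoice.py", "Builds invoices", "Computes totals and tax",
                   "Monthly billing", functions=["build_invoice"],
                   imports=["billing/tax.py"]))
graph.add(FileNode("billing/tax.py", "Tax rules", "Looks up tax rates by region",
                   "Compliance", functions=["tax_rate"]))
graph.add(FileNode("auth/login.py", "Login flow", "Validates credentials",
                   "Account security", functions=["login"]))

for node in graph.lookup("invoice"):
    print(node.path, "-", node.summary)
```

Here a question about invoices returns two short summaries (`billing/invoice.py` and its import `billing/tax.py`) rather than the contents of every file. In the setup the post describes, a lookup like this would sit behind an MCP tool so the model can call it on demand.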
Setup Details
Everything runs locally and single-tenant, with no cloud dependency. The project is open source on GitHub: github.com/ByteBell/bytebell-oss. The user stresses that the graph consists of LLM-generated file analyses rather than AST parses or vector embeddings.
Who This Is For
Developers using Claude Code (or any AI agent billed per token) on large codebases who want to cut costs by keeping structural context available across sessions.
📖 Read the full source: r/ClaudeAI