Reduce AI Coding Session Costs by 90% with Graph-Based Code Indexing

✍️ OpenClawRadar · 📅 Published: May 10, 2026

A Reddit user reports spending $2-6 per query on Claude Code because the model re-reads dozens of files every session. Prompt caching only partly helps: about 70% of tokens came from cache at a 90% discount, but the cache resets with each new session, so the savings do not carry over. The fix: a local server that indexes the codebase into a graph database, which the model queries via the Model Context Protocol (MCP) instead of reading raw files.
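To put the caching figures above in perspective, here is a rough back-of-the-envelope calculation. It is a simplified sketch, not the poster's own math: it assumes the 70% and 90% figures apply to input tokens only, and it ignores cache-write surcharges and output tokens. The function name is made up for illustration.

```python
def effective_cost_fraction(cached_share: float, cache_discount: float) -> float:
    """Fraction of the full (uncached) input price still paid.

    Simplifying assumption: only input tokens are counted, with no
    cache-write surcharge.
    """
    uncached_share = 1.0 - cached_share
    return uncached_share + cached_share * (1.0 - cache_discount)


# Figures quoted in the post: 70% of tokens from cache, 90% discount on those.
fraction = effective_cost_fraction(cached_share=0.70, cache_discount=0.90)
print(f"{fraction:.0%} of the uncached input cost remains")  # 37% ...
```

Under these assumptions, caching alone still leaves roughly a third of the input bill, and that bill is paid again whenever the cache resets. That is why the post focuses on sending fewer tokens in the first place.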

How It Works

  • Instead of AST parsing or vector embeddings, the tool uses an LLM to generate a purpose, summary, and business context for each file, plus links to its functions, classes, and imports.
  • The graph is exposed through an MCP server; Claude queries the graph for targeted lookups (2-4 nodes per question) instead of dumping the entire repo into context.
  • According to the post, session costs dropped from dollars to cents. The approach reportedly works equally well with open-source models like DeepSeek-V4 and Kimi-2.6 because retrieval (not model size) does the heavy lifting.
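The idea in the list above can be sketched as a small in-memory graph. This is an illustration only, not the project's actual code: the class names, field names, sample files, and especially the keyword-overlap scoring are assumptions, since the post does not say how queries are matched to nodes. In the real tool the lookup would be exposed through an MCP server, and the per-file descriptions would come from an LLM rather than being written by hand.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FileNode:
    """One graph node per file (field names are hypothetical)."""
    path: str
    purpose: str
    summary: str
    business_context: str
    symbols: list[str] = field(default_factory=list)  # functions and classes
    imports: list[str] = field(default_factory=list)  # paths of imported files


def tokenize(text: str) -> set[str]:
    """Lowercase alphabetic tokens; 'build_invoice' yields 'build' and 'invoice'."""
    return set(re.findall(r"[a-z]+", text.lower()))


class CodeGraph:
    def __init__(self) -> None:
        self.nodes: dict[str, FileNode] = {}

    def add(self, node: FileNode) -> None:
        self.nodes[node.path] = node

    def lookup(self, question: str, limit: int = 4) -> list[FileNode]:
        """Return at most `limit` nodes, ranked by keyword overlap (assumed scoring)."""
        terms = tokenize(question)
        scored = []
        for node in self.nodes.values():
            text = " ".join([node.purpose, node.summary,
                             node.business_context, *node.symbols])
            score = len(terms & tokenize(text))
            if score:
                scored.append((score, node.path))
        scored.sort(key=lambda item: (-item[0], item[1]))  # best score, then path
        return [self.nodes[path] for _, path in scored[:limit]]

    def neighbors(self, path: str) -> list[FileNode]:
        """Follow import edges to files that are also in the graph."""
        return [self.nodes[p] for p in self.nodes[path].imports if p in self.nodes]


# Hand-written sample nodes; in the described tool an LLM would generate these.
graph = CodeGraph()
graph.add(FileNode("billing/invoice.py", "Build customer invoices",
                   "Computes line totals and tax", "Monthly billing",
                   ["build_invoice", "apply_tax"], ["billing/rates.py"]))
graph.add(FileNode("billing/rates.py", "Tax rate tables",
                   "Looks up tax rates by region", "Compliance", ["rate_for"]))
graph.add(FileNode("auth/session.py", "Manage login sessions",
                   "Issues and validates tokens", "Security", ["create_session"]))

hits = graph.lookup("where is invoice tax computed")
print([node.path for node in hits])
```

With this sample data the query returns only the two billing files. The model would then receive their short descriptions instead of the contents of every file in the repository.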

Setup Details

Everything runs locally in a single-tenant setup with no cloud dependency. The project is open source on GitHub: github.com/ByteBell/bytebell-oss. The user notes that the graph is built from LLM-generated file analyses rather than AST parsing or vector embeddings.

Who This Is For

Developers using Claude Code (or any AI agent billed per token) on large codebases who want to cut costs by keeping structural context available across sessions.

📖 Read the full source: r/ClaudeAI
