Single-call MCP pipeline reduces Claude Code token usage by 74%

✍️ OpenClawRadar | 📅 Published: March 1, 2026

A developer has shared their experience building a context engine, implemented as an MCP server, that gives Claude Code a dependency graph of a codebase so it can read only the relevant code instead of entire files. The token savings come from serving dependency graphs and code skeletons rather than raw files.

Original problem and initial solution

Claude Code typically reads entire files and dumps everything into context, consuming tokens rapidly. The initial approach involved serving only relevant code via MCP using dependency graphs and skeletons instead of raw files, which alone reduced token usage by 65%.
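
As a rough illustration of "reading only relevant code", the sketch below walks a dependency graph outward from one symbol and collects the nodes within a fixed number of hops. The graph shape, the relevantNodes function, the hop limit, and the symbol names are all assumptions made for illustration; the source does not describe the tool's actual data model.

```typescript
// Hypothetical dependency graph: each node maps to the nodes it depends on.
type DepGraph = Map<string, string[]>;

// Breadth-first walk from `start`, returning every node reachable within
// `maxHops` edges (including `start` itself), in visit order.
function relevantNodes(graph: DepGraph, start: string, maxHops: number): string[] {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const dep of graph.get(node) ?? []) {
        if (!seen.has(dep)) {
          seen.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}

// Example graph with made-up symbol names.
const exampleGraph: DepGraph = new Map<string, string[]>([
  ["validateJwt", ["decodeToken", "loadPublicKey"]],
  ["decodeToken", ["base64UrlDecode"]],
  ["loadPublicKey", []],
  ["base64UrlDecode", []],
  ["renderDashboard", ["fetchStats"]],
  ["fetchStats", []],
]);

// Only the symbols near validateJwt are selected; renderDashboard is never read.
const selected = relevantNodes(exampleGraph, "validateJwt", 1);
```

Anything outside the selected set would never be loaded into context, which is where the savings in this kind of design would come from.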

Identified inefficiency and solution

Users pointed out that the MCP workflow itself was wasteful: an agent would call get_context_capsule and read the result, then call get_impact_graph and read that result, then call search_memory and read that one too. That meant three round trips, with overlapping results piling up in context.

The run_pipeline fix

The developer shipped a single-call MCP tool called run_pipeline that replaces the multi-step workflow. The tool auto-detects intent (debug/modify/refactor/explore) and runs the appropriate combination of context search, impact analysis, and memory recall server-side.

run_pipeline({
  task: "fix JWT validation bug",
  preset: "auto",
  max_tokens: 10000,
  observation: "JWT uses Ed25519" // save insight in same call
})
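
The source does not say how intent is auto-detected. One plausible approach is a keyword heuristic over the task string, sketched below. The keyword lists, the detectIntent function, and the mapping from intent to steps are hypothetical; only the four intent names and the three kinds of step come from the article.

```typescript
type Intent = "debug" | "modify" | "refactor" | "explore";
type Step = "context_search" | "impact_analysis" | "memory_recall";

// Hypothetical keyword lists; a real detector may work very differently.
const INTENT_KEYWORDS: Array<[Intent, string[]]> = [
  ["debug", ["fix", "bug", "error", "crash", "failing"]],
  ["refactor", ["refactor", "rename", "extract", "simplify"]],
  ["modify", ["add", "implement", "change", "update"]],
];

// Hypothetical mapping from intent to the server-side steps to run.
const STEPS_BY_INTENT: Record<Intent, Step[]> = {
  debug: ["context_search", "impact_analysis", "memory_recall"],
  modify: ["context_search", "impact_analysis"],
  refactor: ["context_search", "impact_analysis"],
  explore: ["context_search", "memory_recall"],
};

// Return the first intent with a keyword that appears as a whole word
// in the task; fall back to "explore" when nothing matches.
function detectIntent(task: string): Intent {
  const words = task.toLowerCase().split(/[^a-z0-9]+/);
  for (const [intent, keywords] of INTENT_KEYWORDS) {
    if (keywords.some((k) => words.indexOf(k) !== -1)) return intent;
  }
  return "explore";
}

const intent = detectIntent("fix JWT validation bug");
const steps = STEPS_BY_INTENT[intent];
```

Matching on whole words rather than substrings avoids, for example, treating "prefix" as a request to fix something.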

A single run_pipeline call replaces 3-4 individual calls. Results are deduplicated and merged within a token budget before they reach the context window, which yields approximately 60% fewer context tokens than calling the tools individually. The observation parameter lets an agent save what it has learned in the same call, without a separate save_observation step. Memory is linked to code graph nodes, so when the underlying code changes, the affected observations are automatically flagged as stale.
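
The server-side merge can be pictured as deduplicating results by graph node and then filling the token budget with the highest-ranked items. The sketch below assumes each result carries a node id, a relevance score, and some text, and estimates tokens at roughly four characters each. None of these details come from the source, and mergeWithinBudget is a hypothetical name.

```typescript
interface Snippet {
  nodeId: string; // hypothetical id of a code graph node
  score: number;  // hypothetical relevance score, higher is better
  text: string;
}

// Crude token estimate (assumption: about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Merge several result sets: keep the best-scoring snippet per node,
// then greedily add snippets in score order while they fit the budget.
function mergeWithinBudget(resultSets: Snippet[][], maxTokens: number): Snippet[] {
  const best = new Map<string, Snippet>();
  for (const set of resultSets) {
    for (const s of set) {
      const prev = best.get(s.nodeId);
      if (prev === undefined || s.score > prev.score) best.set(s.nodeId, s);
    }
  }
  const ranked = Array.from(best.values()).sort((a, b) => b.score - a.score);
  const merged: Snippet[] = [];
  let used = 0;
  for (const s of ranked) {
    const cost = estimateTokens(s.text);
    if (used + cost > maxTokens) continue; // skip snippets that do not fit
    merged.push(s);
    used += cost;
  }
  return merged;
}
```

With separate tool calls, the same node can land in context once per tool. Merging first means each node is paid for at most once, and the budget caps the total.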

Additional features shipped

  • Passive observation pipeline: file watcher → blake3 hash diff → AST-level structural diffs → auto-correlation with tool calls → zero-config observations
  • CLI that works without VS Code: npm install -g vexp-cli
  • Git hooks that don't overwrite existing ones (marker-delimited blocks)
  • Token savings display in VS Code sidebar showing actual numbers with a 24-hour rolling window
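
The marker-delimited approach to Git hooks in the list above can be sketched as a pure string transformation: the tool owns only the text between two marker lines and leaves the rest of the hook untouched. The marker strings, the upsertManagedBlock function, and the echo placeholder command are hypothetical, not the format the tool actually writes.

```typescript
// Hypothetical marker lines delimiting the tool-managed region of a hook.
const BEGIN = "# >>> managed block (hypothetical marker) >>>";
const END = "# <<< managed block (hypothetical marker) <<<";

// Insert or update the managed block in an existing hook script,
// preserving everything outside the markers.
function upsertManagedBlock(hook: string, body: string): string {
  const block = [BEGIN, body, END].join("\n");
  const start = hook.indexOf(BEGIN);
  const end = hook.indexOf(END);
  if (start !== -1 && end > start) {
    // Replace only the text between (and including) the markers.
    return hook.slice(0, start) + block + hook.slice(end + END.length);
  }
  if (hook.length === 0) return "#!/bin/sh\n" + block + "\n";
  const sep = hook.charAt(hook.length - 1) === "\n" ? "" : "\n";
  return hook + sep + block + "\n";
}

// An existing user hook keeps its own commands; the block is appended once.
const existingHook = "#!/bin/sh\nnpm test\n";
const installed = upsertManagedBlock(existingHook, "echo placeholder-v1");
const reinstalled = upsertManagedBlock(installed, "echo placeholder-v1");
```

Because a repeated install replaces the block rather than appending a second copy, the operation is idempotent, and uninstalling would only need to delete the marked region.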

Availability

The tool has a free tier offering 2,000 nodes, basic pipeline functionality, and full session memory. No account or API key is required, and it makes zero network calls. The core architecture consists of a Rust graph engine and tree-sitter parsers built by the developer, with Claude Code assisting on the MCP protocol layer, SQLite schema migrations, and agent instruction templates.

📖 Read the full source: r/ClaudeAI
