Single-call MCP pipeline cuts Claude Code tokens 74%

A developer has built a context engine, exposed as an MCP server, that gives Claude Code a dependency graph of the codebase so it can read only the relevant code instead of entire files. Serving dependency graphs and code skeletons rather than raw files cuts token usage substantially.

Original problem and initial solution

Claude Code typically reads entire files and dumps everything into context, consuming tokens rapidly. The initial approach involved serving only relevant code via MCP using dependency graphs and skeletons instead of raw files, which alone reduced token usage by 65%.
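To make the idea concrete, here is a minimal sketch of how a dependency graph can narrow what gets served. The post does not show the engine's data structures, so the DepGraph shape, the relevantFiles helper, and the file names are all hypothetical; the real engine is written in Rust and works on tree-sitter parses.

```typescript
// Hypothetical graph shape: file path -> files it imports directly.
type DepGraph = Map<string, string[]>;

// Breadth-first walk: return the entry file plus everything reachable
// within maxDepth import hops. Anything else is never read.
function relevantFiles(graph: DepGraph, entry: string, maxDepth: number): string[] {
  const seen = new Set<string>([entry]);
  let frontier = [entry];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dep of graph.get(file) ?? []) {
        if (!seen.has(dep)) {
          seen.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}

// Illustrative project layout (made up for this example).
const graph: DepGraph = new Map<string, string[]>([
  ["auth/jwt.ts", ["auth/keys.ts", "util/time.ts"]],
  ["auth/keys.ts", ["util/fs.ts"]],
  ["util/time.ts", []],
  ["util/fs.ts", []],
  ["billing/invoice.ts", ["util/time.ts"]],
]);

console.log(relevantFiles(graph, "auth/jwt.ts", 1));
```

With a depth of 1 the walk returns the JWT module and its two direct imports, and billing/invoice.ts is never touched. A skeleton step would then shrink each selected file further, for example by keeping signatures and dropping function bodies.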

Inefficiency in the MCP workflow

Users pointed out that the MCP workflow itself was wasteful. An agent would call get_context_capsule and read the result, then call get_impact_graph and read that result, then call search_memory and read that one too. That meant three round trips, with overlapping results all landing in context.

The run_pipeline fix

The developer shipped a single-call MCP tool called run_pipeline that replaces the multi-step workflow. The tool auto-detects intent (debug/modify/refactor/explore) and runs the appropriate combination of context search, impact analysis, and memory recall server-side.

run_pipeline({
  task: "fix JWT validation bug",
  preset: "auto",
  max_tokens: 10000,
  observation: "JWT uses Ed25519" // save insight in same call
})

This single call replaces 3-4 individual calls. Results are deduplicated and merged within a token budget before they reach the context window, which yields approximately 60% fewer context tokens than calling the tools individually. The observation parameter lets an agent save what it has learned in the same call, without a separate save_observation step. Memory is linked to code graph nodes, so when the code changes, the affected observations are automatically flagged as stale.
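The deduplicate-and-merge step can be sketched roughly as follows. This is an illustration under assumptions, not the tool's implementation: the Snippet shape, the relevance scores, the greedy packing strategy, and the four-characters-per-token estimate are all hypothetical, since the post does not describe how results are ranked or counted.

```typescript
// Hypothetical result shape; the real output format is not shown in the post.
interface Snippet {
  nodeId: string; // graph node the snippet is attached to
  text: string;
  score: number;  // relevance, higher is better (assumed)
}

// Crude estimate of about 4 characters per token; an assumption, not a real tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function mergeWithinBudget(stages: Snippet[][], maxTokens: number): Snippet[] {
  // Deduplicate across stages by node id, keeping the highest-scoring copy.
  const best = new Map<string, Snippet>();
  for (const stage of stages) {
    for (const s of stage) {
      const prev = best.get(s.nodeId);
      if (!prev || s.score > prev.score) best.set(s.nodeId, s);
    }
  }
  // Greedily pack by descending score, skipping anything that would overflow.
  const ranked = Array.from(best.values()).sort((a, b) => b.score - a.score);
  const out: Snippet[] = [];
  let used = 0;
  for (const s of ranked) {
    const cost = estimateTokens(s.text);
    if (used + cost > maxTokens) continue;
    out.push(s);
    used += cost;
  }
  return out;
}

// Made-up stage outputs: the same node shows up in two stages.
const contextHits: Snippet[] = [
  { nodeId: "jwt.verify", text: "x".repeat(40), score: 0.9 },
  { nodeId: "keys.load", text: "x".repeat(40), score: 0.6 },
];
const impactHits: Snippet[] = [
  { nodeId: "jwt.verify", text: "x".repeat(40), score: 0.7 },
  { nodeId: "session.refresh", text: "x".repeat(80), score: 0.5 },
];
const memoryHits: Snippet[] = [
  { nodeId: "note:ed25519", text: "JWT uses Ed25519", score: 0.8 },
];

const merged = mergeWithinBudget([contextHits, impactHits, memoryHits], 25);
console.log(merged.map((s) => s.nodeId));
```

Here jwt.verify appears only once in the output even though two stages returned it, and session.refresh is dropped because it would exceed the budget. Doing this server-side is what keeps the overlap out of the context window in the first place.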

Additional features shipped

Passive observation pipeline: file watcher → blake3 hash diff → AST-level structural diffs → auto-correlation with tool calls → zero-config observations
CLI that works without VS Code: npm install -g vexp-cli
Git hooks that don't overwrite existing ones (marker-delimited blocks)
Token savings display in VS Code sidebar showing actual numbers with a 24-hour rolling window
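The marker-delimited approach to Git hooks can be illustrated with a short sketch. The marker strings, the upsertHookBlock helper, and the hook body below are hypothetical, since the post does not show the actual markers or commands; the point is only that the tool rewrites its own block and leaves the rest of the hook alone.

```typescript
// Hypothetical markers; the tool's real marker text is not given in the post.
const BEGIN = "# >>> vexp hook >>>";
const END = "# <<< vexp hook <<<";

// Insert or replace the managed block inside an existing hook script.
function upsertHookBlock(existing: string, body: string): string {
  const block = `${BEGIN}\n${body}\n${END}`;
  const start = existing.indexOf(BEGIN);
  const end = existing.indexOf(END);
  if (start !== -1 && end > start) {
    // A managed block already exists: replace only that span.
    return existing.slice(0, start) + block + existing.slice(end + END.length);
  }
  // No hook yet: create a minimal shell script around the block.
  if (existing.length === 0) return `#!/bin/sh\n${block}\n`;
  // Otherwise append after the user's own commands.
  const sep = existing.endsWith("\n") ? "" : "\n";
  return `${existing}${sep}${block}\n`;
}

// A user's existing hook, plus a placeholder command for the managed block.
const userHook = "#!/bin/sh\nnpm test\n";
const installed = upsertHookBlock(userHook, "echo indexing");
const reinstalled = upsertHookBlock(installed, "echo indexing");
console.log(installed === reinstalled);
```

Running the install twice produces the same file, and the user's own npm test line survives both passes. Removing the hook later is the same operation in reverse: delete everything between the two markers.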

Availability

The tool has a free tier that covers 2,000 nodes, basic pipeline functionality, and full session memory. No account or API key is required, and it makes zero network calls. The core architecture consists of a Rust graph engine and tree-sitter parsers built by the developer, with Claude Code assisting on the MCP protocol layer, SQLite schema migrations, and agent instruction templates.

📖 Read the full source: r/ClaudeAI