Claude Code token audit reveals hidden costs from default tool loading

By OpenClawRadar. Published: April 15, 2026

Token waste investigation reveals significant overhead

A developer audited 926 Claude Code sessions after noticing rapid token consumption following Anthropic's rate limit changes. The audit found that every Claude Code session starts with a base payload of approximately 45,000 tokens before any user input. This payload includes system prompts, tool definitions, agent descriptions, memory files, skill descriptions, and MCP schemas.

On the standard 200k context window, this 45k starting load represents over 20% of the available context consumed before any conversation begins. Since Claude Code operates as a stateless loop, this entire context gets rebuilt and resent with every single turn, making the starting overhead a recurring cost.
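The arithmetic behind that share, and the way it compounds across turns, can be sketched in a few lines of Python. The 45,000 and 200,000 figures come from the audit; the 10-turn session is only an illustrative assumption.

```python
# Figures reported in the audit.
CONTEXT_WINDOW_TOKENS = 200_000
BASE_PAYLOAD_TOKENS = 45_000


def overhead_share(base_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window used before any user input."""
    return base_tokens / window_tokens


def resent_overhead(base_tokens: int, turns: int) -> int:
    """Total base-payload tokens sent when the payload is resent every turn."""
    return base_tokens * turns


share = overhead_share(BASE_PAYLOAD_TOKENS, CONTEXT_WINDOW_TOKENS)
print(f"Share of window used at start: {share:.1%}")

# Hypothetical 10-turn session, chosen only for illustration.
print(f"Overhead resent over 10 turns: {resent_overhead(BASE_PAYLOAD_TOKENS, 10):,}")
```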

Default tool loading consumes significant tokens

The audit found that 20,000 tokens of the starting context came from system tool schema definitions. By default, Claude Code loads the full JSON schema for every available tool into context at session start, regardless of whether those tools will be used.

The developer discovered a setting called ENABLE_TOOL_SEARCH that turns on deferred tool loading. When enabled, only 6 primary tools are loaded initially and the rest are lazy-loaded on demand, instead of every tool schema being loaded upfront.

Configuration change yields immediate savings

To enable deferred tool loading, add this to your settings.json:

{
  "env": {
    "ENABLE_TOOL_SEARCH": "true"
  }
}

According to the developer, this single configuration change reduced starting context from 45,000 to 20,000 tokens, with system tool overhead dropping from 20,000 to 6,000 tokens. The tool schema reduction alone saves 14,000 tokens on every turn of every session.
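If a settings.json already holds other keys, the new entry has to be merged in rather than pasted over the file. The following is a minimal sketch of one way to do that. The helper names `enable_tool_search` and `update_settings_file` are hypothetical, and the location of settings.json is left to the caller because the source does not specify it.

```python
import json
from pathlib import Path


def enable_tool_search(settings: dict) -> dict:
    """Return a copy of `settings` with env.ENABLE_TOOL_SEARCH set to "true".

    Existing top-level keys and other env entries are preserved.
    """
    updated = dict(settings)
    env = dict(updated.get("env", {}))
    env["ENABLE_TOOL_SEARCH"] = "true"
    updated["env"] = env
    return updated


def update_settings_file(path: Path) -> None:
    """Merge the setting into the JSON file at `path`, creating it if absent."""
    current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = enable_tool_search(current)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Keeping the merge in a pure function makes it easy to check the result on a plain dictionary before touching any file on disk.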

Cost implications of default settings

The developer calculated the impact of this one setting across their usage. With sessions averaging 22 turns, the 14,000 extra tokens per turn amounted to 308,000 unnecessary tokens per session. Across 858 sessions, this totaled 264 million tokens.

At cache-read pricing ($0.50/MTok), this represents $132 in unnecessary costs. However, because over half of the turns hit expired caches, which trigger full input pricing at $5/MTok, the actual cost was estimated at between $132 and $1,300 from this single default setting.
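The estimate can be reproduced with a short Python sketch using the figures quoted above. The `blended_cost_usd` helper and its `expired_fraction` parameter are hypothetical additions: the source only says "over half" of turns hit expired caches, so any specific fraction is an assumption.

```python
# Figures quoted in the article.
EXTRA_TOKENS_PER_TURN = 14_000
AVG_TURNS_PER_SESSION = 22
SESSIONS = 858
CACHE_READ_USD_PER_MTOK = 0.50
FULL_INPUT_USD_PER_MTOK = 5.00


def wasted_tokens(extra_per_turn: int, turns: int, sessions: int) -> int:
    """Extra tokens sent across all sessions."""
    return extra_per_turn * turns * sessions


def cost_usd(tokens: int, usd_per_mtok: float) -> float:
    """Cost of `tokens` at a price quoted per million tokens."""
    return tokens / 1_000_000 * usd_per_mtok


def blended_cost_usd(tokens: int, expired_fraction: float) -> float:
    """Cost when `expired_fraction` of tokens are billed at full input price.

    `expired_fraction` is an assumed value, not a figure from the audit.
    """
    cached = cost_usd(tokens, CACHE_READ_USD_PER_MTOK)
    full = cost_usd(tokens, FULL_INPUT_USD_PER_MTOK)
    return (1 - expired_fraction) * cached + expired_fraction * full


total = wasted_tokens(EXTRA_TOKENS_PER_TURN, AVG_TURNS_PER_SESSION, SESSIONS)
print(f"Total extra tokens: {total:,}")
print(f"Lower bound: ${cost_usd(total, CACHE_READ_USD_PER_MTOK):,.2f}")
print(f"Upper bound: ${cost_usd(total, FULL_INPUT_USD_PER_MTOK):,.2f}")
```

The two bounds correspond to every turn hitting the cache and every turn missing it; the real figure falls somewhere between them depending on how many turns hit expired caches.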

Additional optimization strategies

The developer also implemented other optimizations that reduced starting context by 4,000-5,000 tokens:

  • Trimming and reworking CLAUDE.md and memory files
  • Consolidating skill descriptions
  • Turning off unused MCP servers
  • Tightening schema injections from memory hooks

Claude Code stores conversations as JSONL files locally under ~/.claude/projects/, though there's no built-in way to get detailed breakdowns by session, cost per project, or expense categories. The built-in /insights command was found to be insufficient for diagnosing waste.
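A per-session breakdown can be approximated by reading those JSONL files directly. The sketch below assumes, without confirmation from the source, that each line is a JSON object and that assistant records carry a `message.usage` object with Anthropic API style counters such as `input_tokens` and `cache_read_input_tokens`. The function names are hypothetical, and lines that do not match the assumed shape are skipped.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed usage counters; the actual transcript schema may differ.
USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def sum_usage(lines) -> dict:
    """Sum the assumed usage counters over an iterable of JSONL lines."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        for field in USAGE_FIELDS:
            value = usage.get(field)
            if isinstance(value, int):
                totals[field] += value
    return dict(totals)


def sum_usage_by_file(root: Path) -> dict:
    """Map each .jsonl file under `root` to its summed usage counters."""
    result = {}
    for path in sorted(root.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            result[str(path)] = sum_usage(handle)
    return result
```

Calling `sum_usage_by_file(Path.home() / ".claude" / "projects")` would give one total per conversation file, which can then be grouped by project directory or priced with the per-MTok rates discussed earlier.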

📖 Read the full source: r/ClaudeAI
